I have the status of a particular application being reported as a string over SNMP.
When everything is working as expected, that monitoring reports an empty string; when some of the data sources have problems, it reports back a string with the names of the affected data sources.
I want to display that string as an alert/info on Zenoss' event console whenever it is not empty.
The problem is that whenever I add a Data Source of type SNMP to a Monitoring Template, it assumes a numeric value.
The idea was to use a StatusThreshold just to see when the value changes.
With that setup, no events appear on the Event Console.
From my understanding, the thresholds are evaluated against the value in the RRD database, which is numeric only, so the string gets consistently turned into 'NaN'.
Zenoss' Monitoring Templates interface doesn't show an intuitive way to deal with strings. How can that be done?
One possible solution turns out to be very simple, but searching the internet yields a lot of similar questions that are either not answered, ended up taking a different approach, or were a different problem to begin with.
Zenoss offers the ability to integrate with Nagios plugins, which might not draw your attention if you don't know how this integration works.
Basically, it means you can call an arbitrary command as long as it prints a properly formatted string on STDOUT and returns exit code 0 for success (clear event) or 1, 2 or 3 for error (generate event).
Armed with that information, you can write a simple script that reads your SNMP data using snmpget, parses it and prints the message you want to appear on the event.
A simple and generic Perl example could be:
Save that as $ZENHOME/libexec/string_monitor on your Zenoss server and make it executable:
chmod +x $ZENHOME/libexec/string_monitor
Then on Monitoring Templates, add a Data Source of type COMMAND and fill the Command Template field like this:
string_monitor ${here/manageIp} <OID> "Error reported on:"
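A script accepting those three arguments (host, OID, message prefix) could be sketched roughly as follows. This is a POSIX shell stand-in, not the Perl script mentioned above. It assumes SNMP v2c, a community string of "public" (overridable through a hypothetical SNMP_COMMUNITY environment variable), and net-snmp's snmpget on the collector. The function names and the choice of exit code 2 for a non-empty string are illustrative only.

```shell
#!/bin/sh
# string_monitor: sketch of a Nagios-style check for a string-valued SNMP OID.
# Usage: string_monitor <host> <oid> [message prefix]
# Assumptions (not from the original post): SNMP v2c, community "public".

COMMUNITY="${SNMP_COMMUNITY:-public}"   # hypothetical environment override

# Trim surrounding whitespace and the double quotes snmpget may print
# around string values.
clean_value() {
    printf '%s' "$1" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
                           -e 's/^"//' -e 's/"$//'
}

# Print the event line and return the matching exit code.
evaluate() {
    prefix="$1"
    value="$2"
    if [ -z "$value" ]; then
        echo "OK"
        return 0            # empty string: clear the event
    fi
    echo "$prefix $value"
    return 2                # non-empty string: generate an event
}

main() {
    host="$1"
    oid="$2"
    prefix="${3:-Error reported on:}"
    # -Oqv asks snmpget to print only the value, without OID or type.
    raw=$(snmpget -v 2c -c "$COMMUNITY" -Oqv "$host" "$oid" 2>&1) || {
        echo "SNMP query failed: $raw"
        return 3            # could not determine the status
    }
    evaluate "$prefix" "$(clean_value "$raw")"
}

# Run only when invoked with at least a host and an OID.
if [ "$#" -ge 2 ]; then
    main "$@"
    exit $?
fi
```

Keeping the decision in evaluate, separate from the SNMP call, makes it possible to check the messages and exit codes by hand before any device is involved.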
A few things to remember if you're not familiar with Zenoss:
If you have a setup with a Zenoss server and many collectors, the script will be called from the collector your device is assigned to, not from your main server, so remember to have the script available on all collectors. If string_monitor is called on a collector where it is not present, you will end up with an event on the Event Console that reads "Code: 2 - Msg: Misuse of shell builtins". That is very misleading, as it is easy to infer there is a problem with your script when the actual error was that $ZENHOME/libexec/string_monitor was not found. :-)
This can be particularly confusing if you use the "Test" button when setting up the Data Source: in that case the request is made from the main server instead, so it appears to work, and the above error then shows up on the console.
Also, to speed up testing, you might want to ssh into the collector, switch to the zenoss user with sudo -i -uzenoss and run:
zencommand run -d your.targetdevice.com -v 10
This should trigger a full run on that device, and your event should appear on the Event Console (assuming the string wasn't empty).
After you get it working, it is also possible to pass actual numeric data from your script into Zenoss; the format for the printed line is:
failing, good and warning will be passed as data points on that data source, and you can use MinMax Thresholds or plot graphs on them.
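As an illustration, the sketch below assumes the usual Nagios plugin convention: the message, then a pipe, then space-separated label=value pairs. The helper name perfdata_line and the counts are made up for the example. As far as I know, the data points defined on the data source need to carry the same names as the labels.

```shell
#!/bin/sh
# Build one output line: "<message> | failing=N good=N warning=N".
# Assumption: Nagios-style performance data after the pipe character.
perfdata_line() {
    message="$1"
    failing="$2"
    good="$3"
    warning="$4"
    printf '%s | failing=%s good=%s warning=%s\n' \
        "$message" "$failing" "$good" "$warning"
}

# Example call (hypothetical counts):
#   perfdata_line "Error reported on: db1, db2" 2 5 1
```

The exit code still decides whether an event is generated or cleared; the values after the pipe only feed the data points.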
For a simple start, I recommend getting it working with a minimal/dummy script and expanding from there:
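A dummy along these lines may be enough to confirm the plumbing before any SNMP is involved. It is only a sketch: the function name dummy_check and the message text are invented, and it always reports an error (exit code 1) so that an event should show up on every run.

```shell
#!/bin/sh
# Dummy check: always reports a fixed problem for the given host.
# Usage: string_monitor <host>
dummy_check() {
    echo "Dummy error for $1"
    return 1                # non-zero: generate an event
}

# Run only when a host argument is supplied.
if [ "$#" -ge 1 ]; then
    dummy_check "$1"
    exit $?
fi
```

With a Command Template such as string_monitor ${here/manageIp}, the event text should then contain the device address, which also shows that the template expression was expanded on the collector.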