I'd like to use Nagios to monitor the redundant PSUs in my servers (running Debian Wheezy).
I've run the sensors-detect script in the lm-sensors package, and the only thing it can find is:
Driver `ipmisensors':
* ISA bus, address 0xca2
Chip `IPMI BMC KCS' (confidence: 8)
I then installed freeipmi-tools, and I find that I can get some useful output from ipmi-sensors:
$ sudo ipmi-sensors --group='Power Supply'
5: Power Supply 1 (Power Supply): [Presence detected]
6: Power Supply 2 (Power Supply): [Presence detected]
7: Power Supplies (Power Supply): [Fully Redundant]
I can write a Nagios plugin to run ipmi-sensors locally, parse its output, and alert if it changes, but I'm reluctant to rely on the output format staying the same, and I can't figure out how to get more machine-readable output.
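For reference, a minimal sketch of the kind of plugin described above. It assumes the output keeps exactly the line format shown in the sample, that the Nagios user is allowed to run ipmi-sensors (for example through sudo), and that only the two states seen in the sample count as healthy. The function names and the status messages are illustrative, not part of any existing plugin.

```python
import re
import subprocess

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Matches lines such as "5: Power Supply 1 (Power Supply): [Presence detected]".
LINE_RE = re.compile(r"^\s*(\d+):\s*(.+?)\s*\(([^)]*)\):\s*\[(.*)\]\s*$")

# Assumption: only the states seen in the sample output are treated as healthy.
HEALTHY_STATES = {"Presence detected", "Fully Redundant"}


def parse_sensors(text):
    """Return {sensor name: state} for every line that matches LINE_RE."""
    readings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            readings[match.group(2)] = match.group(4)
    return readings


def evaluate(readings):
    """Map parsed readings to a (Nagios exit code, status message) pair."""
    if not readings:
        return UNKNOWN, "PSU UNKNOWN - no power supply sensors parsed"
    bad = {name: state for name, state in readings.items()
           if state not in HEALTHY_STATES}
    if bad:
        detail = ", ".join("%s: %s" % item for item in sorted(bad.items()))
        return CRITICAL, "PSU CRITICAL - " + detail
    return OK, "PSU OK - %d sensors healthy" % len(readings)


def run_check():
    """Run ipmi-sensors with the option used above and evaluate its output."""
    try:
        output = subprocess.check_output(
            ["ipmi-sensors", "--group=Power Supply"], universal_newlines=True)
    except (OSError, subprocess.CalledProcessError) as exc:
        return UNKNOWN, "PSU UNKNOWN - ipmi-sensors failed: %s" % exc
    return evaluate(parse_sensors(output))


# Demonstration on the sample output, without touching the BMC.
SAMPLE = """\
5: Power Supply 1 (Power Supply): [Presence detected]
6: Power Supply 2 (Power Supply): [Presence detected]
7: Power Supplies (Power Supply): [Fully Redundant]
"""
code, message = evaluate(parse_sensors(SAMPLE))
print(message)
```

A real plugin would call run_check(), print the message, and exit with the returned code. An empty parse result maps to UNKNOWN rather than OK, so a change in the output format should at least be noticed instead of silently passing.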
I've looked at check_ipmi_sensor, but it seems only to operate where the IPMI device is available on the network; mine is not.
Is there a better way than parsing the output of ipmi-sensors?
There are several other plugins for IPMI listed in Nagios Exchange. This is (sometimes) a better place to start looking than Google.
For example:
ipmitool
free-ipmi
There is no reason to parse the IPMI data. It takes a CPU thread to read and a thread to parse, and if you are scaling to data-center-size systems with thousands of servers, that's a lot of threads. Instead, use an API, either Java (Vrx or Hemi) or a C library (ipmitool or freeipmi), to access the IPMI data directly. Data centers (40k servers) can read 6 million IPMI sensors per minute, and thread creation becomes the limiting factor.
The advantage of an API is that IPMB bus write errors, such as the bus being busy or having a permanent hardware error, are reported, so you can decide whether to retry retrieving the data.
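As a rough illustration of that retry decision, here is a minimal Python sketch. It does not use any real IPMI library: BusBusyError, BusHardwareError, and the read_sensor callable are hypothetical stand-ins for whatever errors and read call your chosen API exposes.

```python
import time


class BusBusyError(Exception):
    """Hypothetical transient error, e.g. the IPMB bus is busy."""


class BusHardwareError(Exception):
    """Hypothetical permanent error; retrying will not help."""


def read_with_retry(read_sensor, attempts=3, delay=0.5, sleep=time.sleep):
    """Call read_sensor(), retrying only when the error is transient.

    read_sensor is any zero-argument callable that returns a reading or
    raises one of the hypothetical errors above.
    """
    for attempt in range(1, attempts + 1):
        try:
            return read_sensor()
        except BusBusyError:
            if attempt == attempts:
                raise  # give up after the last attempt
            sleep(delay * attempt)  # simple linear back-off
        # BusHardwareError is deliberately not caught: it propagates at once.
```

The point of the sketch is only that a reported error type lets the caller distinguish "try again" from "give up", which is hard to do reliably when scraping command output.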