Ideally with as simple an install as possible, and without having to reboot the servers. Mostly for DL380 G5s,
if it helps.
This depends slightly on the operating systems you're running on the servers, but in general, it is possible to obtain alerts from HP ProLiant servers and Smart Array RAID controllers.
The full driver and software support listing for your DL380 G5 systems is listed here.
SNMP plus a monitoring solution is the best approach, but you can augment that with some of HP's tools. HP offers HP Systems Insight Manager, which is available for download and also comes with the servers. It is ideal for collections of servers. If you're looking for one-off alerts without building a management or monitoring infrastructure, you can simply install the HP Management Agents (aka the ProLiant Support Pack).
For standalone Linux systems, I have the agents send traps via email. I usually configure the support pack with the defaults or a custom bundle, then edit /opt/hp/hp-snmp-agents/cma.conf and change the trapemail line to point to the recipient address.
If you're running Linux and don't want to install the full HP management suite, you can develop a script around the cciss_vol_status utility to query controller/disk status. Also see: Installing HP Agents on OpenFiler
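For the cma.conf edit mentioned above, the trapemail line generally takes a mail command followed by the recipient. A rough example, assuming /bin/mail is the mailer on your system; the address is a placeholder, and the agents typically need a restart to pick up the change:

```
# /opt/hp/hp-snmp-agents/cma.conf
# admin@example.com is a placeholder recipient
trapemail /bin/mail -s 'HP Insight Management Agents Trap Alarm' admin@example.com
```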
Check out HP Insight Manager
https://www.hpe.com/us/en/product-catalog/detail/pip.489496.html#
I believe it should work with your servers.
I used the lightweight program that @ewwite mentioned in his answer: cciss_vol_status
If you follow the accompanying INSTALL instructions, the program is placed in /usr/local/bin/cciss_vol_status. I use a wrapper script that greps the output of cciss_vol_status and sends an email if any array has a status of FAILED.
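A minimal sketch of that kind of wrapper is below. It is an illustration rather than the exact script I run: the device path /dev/cciss/c0d0, the recipient, and the variable and function names (CCISS_BIN, DEVICES, MAILTO, MAIL_CMD, check_arrays) are all assumptions to adjust for your system.

```shell
#!/bin/sh
# Sketch: mail the cciss_vol_status output when any array reports FAILED.
# All defaults below are assumptions; override them via the environment.
CCISS_BIN="${CCISS_BIN:-/usr/local/bin/cciss_vol_status}"
DEVICES="${DEVICES:-/dev/cciss/c0d0}"   # space-separated list of controllers
MAILTO="${MAILTO:-root}"
MAIL_CMD="${MAIL_CMD:-mail}"

check_arrays() {
    # $DEVICES is left unquoted on purpose so several devices can be listed.
    # The exit code is ignored; only the status text is inspected.
    status=$("$CCISS_BIN" $DEVICES 2>&1) || true

    if printf '%s\n' "$status" | grep -q 'FAILED'; then
        # Send the full status output so the mail shows which array failed.
        printf '%s\n' "$status" |
            "$MAIL_CMD" -s "RAID array FAILED on $(hostname)" "$MAILTO"
        return 1
    fi
    return 0
}

# Run the check only if the status tool can be found.
if command -v "$CCISS_BIN" >/dev/null 2>&1; then
    check_arrays
else
    echo "cciss_vol_status not found at $CCISS_BIN" >&2
fi
```

The function returns non-zero when a failure was found, so the script's exit status can also be used by other monitoring tools. Matching only on the word FAILED keeps the check simple; other non-OK states would need additional patterns.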
Call the wrapper script from cron. I run the check every two minutes.
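A crontab entry along these lines runs a check every two minutes. The path /usr/local/bin/check_raid.sh is a hypothetical location for the wrapper; use wherever you saved yours:

```
# m   h  dom mon dow  command
*/2   *  *   *   *    /usr/local/bin/check_raid.sh
```

Keep in mind that a check this frequent will send a mail every two minutes for as long as the array stays in the FAILED state.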
We do use HP Systems Insight Manager to check whether our HP servers are up and running, but nothing beyond that. I found the Linux agent to be overkill for us, since we have other monitoring solutions in place, so the script above serves its specific purpose well.
UPDATE
Just a troubleshooting tip in case you run into this. This script proved helpful this morning when I got an email about a failed array.
The device went read-only and was not visible in /proc/partitions. I rebooted the server and saw some messages on boot. I selected F2, and the RAID was fine and mounted on boot.
Install smartmontools. Its smartd daemon can mail you when a drive starts reporting SMART problems, often before the drive actually fails.
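As a rough sketch, assuming the older cciss driver (where the first controller appears as /dev/cciss/c0d0), smartd.conf entries like these monitor the first two physical disks behind the Smart Array controller and mail warnings. The address is a placeholder, and the device node will differ on systems using a different driver:

```
# /etc/smartd.conf (location may vary by distribution)
# -d cciss,N selects physical disk N behind the controller
/dev/cciss/c0d0 -d cciss,0 -a -m admin@example.com
/dev/cciss/c0d0 -d cciss,1 -a -m admin@example.com
```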