GlusterFS, while being a nice distributed filesystem, provides almost no way to monitor its integrity. Servers can come and go, bricks might get stale or fail, and I am afraid I will only find out about it when it is probably too late.
Recently we had a strange failure where everything appeared to be working, but one brick had fallen out of the volume (found by pure coincidence).
Is there a simple and reliable way (cron script?) that will let me know about the health status of my GlusterFS 3.2 volume?
This has been a request to the GlusterFS developers for a while now, and there is no out-of-the-box solution you can use. However, with a few scripts it's not impossible.
Pretty much the entire Gluster system is managed by a single gluster command, and with a few options you can write your own health monitoring scripts. See here for listing info on bricks and volumes -- http://gluster.org/community/documentation/index.php/Gluster_3.2:_Displaying_Volume_Information
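As a rough illustration of such a script, here is a minimal Python sketch that parses the text printed by gluster volume info and flags volumes that are not started or that list an unexpected number of bricks. The sample output, the function names (parse_volume_info, find_problems, run_check) and the expected_bricks mapping are all assumptions made for this example, not part of Gluster; check the parsing against what your own version actually prints.

```python
import subprocess

# Illustrative sample only: an ASSUMED shape of `gluster volume info` output.
# Compare with the real output of your Gluster version before relying on it.
SAMPLE_OUTPUT = """\
Volume Name: test-volume
Type: Distribute
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: server1:/exp1
Brick2: server2:/exp2
"""


def parse_volume_info(text):
    """Parse `gluster volume info` text into {volume_name: {field: value}}."""
    volumes = {}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Volume Name":
            current = volumes.setdefault(value, {"bricks": []})
        elif current is None:
            continue  # field seen before any volume header
        elif key.startswith("Brick") and key[5:].isdigit():
            current["bricks"].append(value)  # e.g. "Brick1: server1:/exp1"
        else:
            current[key] = value
    return volumes


def find_problems(volumes, expected_bricks):
    """Return human-readable problems; expected_bricks maps volume name -> brick count."""
    problems = []
    for name, count in sorted(expected_bricks.items()):
        info = volumes.get(name)
        if info is None:
            problems.append(f"{name}: volume not found")
            continue
        if info.get("Status") != "Started":
            problems.append(f"{name}: status is {info.get('Status')}")
        if len(info["bricks"]) != count:
            problems.append(f"{name}: {len(info['bricks'])} bricks listed, expected {count}")
    return problems


def run_check(expected_bricks):
    """Run the gluster CLI; return a Nagios-style code (0 OK, 2 CRITICAL, 3 UNKNOWN)."""
    try:
        result = subprocess.run(["gluster", "volume", "info"],
                                capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError) as exc:
        print(f"UNKNOWN: could not run gluster: {exc}")
        return 3
    problems = find_problems(parse_volume_info(result.stdout), expected_bricks)
    if problems:
        print("CRITICAL: " + "; ".join(problems))
        return 2
    print("OK: all volumes started with the expected brick count")
    return 0
```

From a cron job or Nagios wrapper you would call something like sys.exit(run_check({"test-volume": 2})). Note that volume info describes the volume's configuration, so this sketch will probably not notice a brick whose process has died while still being listed; as far as I know, per-brick online state is only reported by gluster volume status, which arrived in 3.3.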
To monitor performance, look at this link -- http://gluster.org/community/documentation/index.php/Gluster_3.2:_Monitoring_your_GlusterFS_Workload
UPDATE: Do consider upgrading to http://gluster.org/community/documentation/index.php/About_GlusterFS_3.3
You are generally better off on the latest release, since it tends to have more bug fixes and is better supported. Of course, run your own tests before moving to a newer release -- http://vbellur.wordpress.com/2012/05/31/upgrading-to-glusterfs-3-3/ :)
There is an admin guide with a specific section on monitoring your GlusterFS 3.3 installation in Chapter 10 -- http://www.gluster.org/wp-content/uploads/2012/05/Gluster_File_System-3.3.0-Administration_Guide-en-US.pdf
See here for another nagios script -- http://code.google.com/p/glusterfs-status/
Please check the attached script at https://www.gluster.org/pipermail/gluster-users/2012-June/010709.html for gluster 3.3; it's probably easily adaptable to gluster 3.2.
There is a Nagios plugin available for monitoring, though you may have to edit it for your version.
@Arie Skliarouk, your check_gluster.sh has a typo: on the last line, you grep for exitst instead of exist. I went ahead and rewrote it to be a bit more compact, and to remove the requirement for a temporary file.
I was able to configure Nagios monitoring for GlusterFS as described here:
http://gopukrish.wordpress.com/2014/11/16/monitor-glusterfs-using-nagios-plugin/