I just installed Nagios on a central machine and NRPE on 10 remote Linux machines and have started monitoring them. It works great: I can get CPU load, current users, processes, MySQL, etc. However, I can't find a way to monitor memory usage using the core plugins. What am I missing? Do I need an external plugin to do this?
I suggest you monitor swap usage instead. Check out the check_swap plugin; it is included by default (at least in Debian).
Yes, I also wondered why they didn't include a plugin to check memory when they included check_swap and check_disk. In any case, a check_mem.pl plugin is available from exchange.nagios.org: http://exchange.nagios.org/directory/Plugins/Operating-Systems/Solaris/check_mem-2Epl/details
This one works quite nicely. It's a Perl script, so no installation is needed. The link, taken from the plugin page above, is http://sysadminsjourney.com/content/2009/06/04/new-and-improved-checkmempl-nagios-plugin/ and the documentation there is simple and good.
WARNING - 9.9% (406520 kB) free!|TOTAL=4113824KB;;;; USED=3707304KB;;;; FREE=406520KB;;;; CACHES=816947KB;;;;
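To show how a line like the one above is put together, here is a minimal shell sketch. It is not the check_mem.pl code: the function name mem_status, the 10%/5% thresholds, and the exact OK/CRITICAL wording are assumptions for illustration, and the CACHES field is left out. It computes the free percentage from total and free kilobytes and returns the usual Nagios exit codes (0 OK, 1 WARNING, 2 CRITICAL).

```shell
#!/bin/sh
# Hypothetical helper (not part of check_mem.pl): classify free memory.
# Arguments: total_kb free_kb warn_pct crit_pct
# warn_pct/crit_pct are the minimum percentages of memory that should be free.
mem_status() {
    total_kb=$1; free_kb=$2; warn_pct=$3; crit_pct=$4
    # Free memory in tenths of a percent, rounded to the nearest tenth.
    tenths=$(( (free_kb * 1000 + total_kb / 2) / total_kb ))
    pct="$(( tenths / 10 )).$(( tenths % 10 ))"
    used_kb=$(( total_kb - free_kb ))
    # Performance data after the pipe, in the same layout as the sample output.
    perf="TOTAL=${total_kb}KB;;;; USED=${used_kb}KB;;;; FREE=${free_kb}KB;;;;"
    if [ "$tenths" -lt $(( crit_pct * 10 )) ]; then
        echo "CRITICAL - ${pct}% (${free_kb} kB) free!|${perf}"
        return 2
    elif [ "$tenths" -lt $(( warn_pct * 10 )) ]; then
        echo "WARNING - ${pct}% (${free_kb} kB) free!|${perf}"
        return 1
    fi
    echo "OK - ${pct}% (${free_kb} kB) free.|${perf}"
    return 0
}

# Example use on Linux, reading the values from /proc/meminfo:
#   total=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
#   free=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)
#   mem_status "$total" "$free" 10 5
```

With the numbers from the sample output (4113824 kB total, 406520 kB free) and a 10% warning threshold, the function reports WARNING at 9.9% free.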
So, the configuration steps are simple:
Working with Nagios Server and NRPE Client to automatically restart services on failure January 13, 2015
Executing Remote Commands using Nagios and NRPE
1. Installing NRPE on remote host:
rpm -Uvh http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm
yum --enablerepo=epel -y install nrpe nagios-plugins
yum --enablerepo=epel -y list nagios-plugins*
vim /etc/nagios/nrpe.cfg
service nrpe start
chkconfig nrpe on
2. To execute commands on remote nagios client follow below steps:
vim /etc/nagios/nrpe.cfg
a. Allow command arguments by changing dont_blame_nrpe from 0 to 1:
dont_blame_nrpe=1
b. Define the command in NRPE. Add the following line below the commands section of nrpe.cfg:
command[runcmd]=sudo service $ARG1$ restart
vim /etc/sudoers
nagios ALL = NOPASSWD: /sbin/service
nrpe ALL = NOPASSWD: /sbin/service
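Because dont_blame_nrpe=1 lets the Nagios server pass an arbitrary argument into the runcmd command, it may be worth checking the service name before restarting anything. The following is only a sketch of that idea, not part of the original setup: the names ALLOWED_SERVICES, is_allowed_service and safe_restart, and the list of services, are all hypothetical. It could live in a small wrapper script that the NRPE command calls in place of invoking sudo service directly.

```shell
#!/bin/sh
# Hypothetical whitelist of services that may be restarted (assumption).
ALLOWED_SERVICES="httpd ntpd mysqld"

# Return 0 if the given name matches a whitelist entry exactly, 1 otherwise.
is_allowed_service() {
    for svc in $ALLOWED_SERVICES; do
        [ "$svc" = "$1" ] && return 0
    done
    return 1
}

# Restart the service only if it is whitelisted; otherwise report UNKNOWN (exit code 3).
safe_restart() {
    if is_allowed_service "$1"; then
        sudo /sbin/service "$1" restart
    else
        echo "UNKNOWN - service '$1' is not on the allowed list"
        return 3
    fi
}

# In a wrapper script the last line would be something like:
#   safe_restart "$1"
```

An exact string match means that input such as "httpd; rm -rf /" is rejected rather than handed to sudo.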
Testing the runcmd command from the Nagios server:
The service restart uses check_nrpe to send the command to the service's host. On the Nagios server command line, enter the following command:
/usr/local/nagios/libexec/check_nrpe -H 192.168.5.180 -p 5666 -c runcmd -a httpd
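check_nrpe exits with the status of the remote command, and Nagios reads that status as a plugin state, so a failed restart can show up as WARNING or CRITICAL. A small sketch for reading the result when testing by hand; the helper name nagios_state_name is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: translate a plugin exit code into its Nagios state name.
nagios_state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "OUT-OF-RANGE" ;;   # not one of the four standard plugin states
    esac
}

# Example use right after the test command:
#   /usr/local/nagios/libexec/check_nrpe -H 192.168.5.180 -p 5666 -c runcmd -a httpd
#   nagios_state_name "$?"
```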
Executing remote commands using Nagios NRPE & Event Handler
Configuring the Nagios client/remote machine
1. Modify nrpe.cfg: change dont_blame_nrpe=0 to dont_blame_nrpe=1
2. Add the following custom command to nrpe.cfg:
command[event-ntp]=/usr/lib64/nagios/plugins/event-ntp $ARG1$ $ARG2$ $ARG3$
3. Create an event handler:
vim /usr/lib64/nagios/plugins/event-ntp
Give execution permission and change owner and group:
chmod +x /usr/lib64/nagios/plugins/event-ntp
chown nagios:nagios /usr/lib64/nagios/plugins/event-ntp
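The contents of the event-ntp script are not shown here. As a rough sketch only, and not the author's script, a handler could take the three arguments passed by NRPE (service state, state type, attempt number) and restart ntpd on the third SOFT CRITICAL attempt or on any HARD CRITICAL state. The function names decide_action and handle_event and the restart policy are assumptions; sending the email is left out.

```shell
#!/bin/sh
# Sketch of an event-ntp handler (assumption: the real script is not shown).
# Arguments: $1 = service state, $2 = state type (SOFT/HARD), $3 = attempt number.

# Print "restart" when the service should be restarted, "none" otherwise.
decide_action() {
    state=$1; state_type=$2; attempt=$3
    case "$state" in
        CRITICAL)
            case "$state_type" in
                # Act on the third soft failure, before the state turns HARD.
                SOFT) [ "$attempt" -eq 3 ] && { echo "restart"; return 0; } ;;
                # Fall back to restarting once the state is HARD.
                HARD) echo "restart"; return 0 ;;
            esac
            ;;
    esac
    echo "none"
}

# Restart ntpd through sudo when decide_action says so.
handle_event() {
    if [ "$(decide_action "$1" "$2" "$3")" = "restart" ]; then
        sudo /sbin/service ntpd restart
    fi
}

# Only act when called with exactly the three expected arguments.
if [ "$#" -eq 3 ]; then
    handle_event "$@"
fi
```

Keeping the decision in its own function makes it easy to try the policy by hand, for example decide_action CRITICAL SOFT 3, without touching the service.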
The tricky part, and probably not the most secure way to do this, is modifying the sudoers file to allow the nagios user to execute system commands:
visudo
Add the following:
User_Alias NAGIOS = nagios,nagcmd
Cmnd_Alias NAGIOSCOMMANDS = /sbin/service
Defaults:NAGIOS !requiretty
NAGIOS ALL=(ALL) NOPASSWD: NAGIOSCOMMANDS
Configuring Event Handler on the Nagios Server
The first thing you need to do is create an event-ntp command in the commands.cfg file:
define command{
        command_name    event-ntp
        command_line    /usr/local/nagios/libexec/check_nrpe -H $HOSTNAME$ -c event-ntp -a $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
        }
define service{
        use                     local-service
        host_name               neeraj-test
        service_description     Time Sync Check
        event_handler           event-ntp
        check_command           check_nrpe!check_ntp
        }
If all goes well, you should receive an email from your client with the output of ntpq -p, and the ntpd service should be restarted. If you run into problems, such as not receiving the email or the script not executing, set debug=1 in nrpe.cfg, restart nrpe, run the event-ntp test again, and check your logs. As you can see, it's not too difficult to execute event-handler scripts, and it saves the administrator time when Nagios can do the legwork on critical system/host alerts.