The EPEL package for Nagios on Red Hat Enterprise Linux 6 was updated a few weeks ago from version 3.5.1 to version 4.3.4. The update was done with a simple yum update command, and nothing indicated a major version change.
Although Nagios seemed to work fine after the update (all services are properly visible in the web interface), not much actually works: service checks are not performed and mails are not sent either. Dozens of error messages are visible in /var/log/messages:
Jan 26 15:58:55 srv1 nagios: Unable to send check for host 'srv3' to worker (ret=-2)
Jan 26 15:58:58 srv1 nagios: Unable to run check for service 'Total Processes' on host 'srv4'
Jan 26 15:59:05 srv1 nagios: Unable to run check for service 'Lab Home Partition' on host 'srv1'
Furthermore, trying to restart nagios ends in an error that was not present before the update: No usable PID found in /var/run/nagios/nagios.pid. This part of the issue seems to have a solution here: Nagios Woudn't Start, now won't Stop!
After noticing that the update created a /etc/nagios/nagios.cfg.rpmnew file, I ran a diff against the original config file from the 3.5.1 RPM to see what the differences were, and changed the actual config file accordingly. The changes mainly concern the location of some files used at runtime (here are the values from the new version):
object_cache_file=/var/spool/nagios/objects.cache
precached_object_file=/var/spool/nagios/objects.precache
lock_file=/var/run/nagios/nagios.pid
temp_file=/var/spool/nagios/nagios.tmp
check_result_path=/var/spool/nagios/checkresults
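A comparison like the one described above can be sketched as follows. This is only an illustration: active_directives and new_or_changed are hypothetical helper names, and the sketch assumes one directive per line, as in nagios.cfg.

```shell
#!/bin/sh
# Print the active directives of a config file: comment lines and blank
# lines are dropped, the rest is sorted so that comm can compare it.
active_directives() {
    grep -Ev '^[[:space:]]*(#|$)' "$1" | LC_ALL=C sort
}

# Print the directives that appear only in the second file, i.e. the
# settings that are new or have a different value in the packaged config.
new_or_changed() {
    old_sorted=$(mktemp)
    new_sorted=$(mktemp)
    active_directives "$1" > "$old_sorted"
    active_directives "$2" > "$new_sorted"
    LC_ALL=C comm -13 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}

# Example call (paths as mentioned in the question):
# new_or_changed /etc/nagios/nagios.cfg /etc/nagios/nagios.cfg.rpmnew
```

Ignoring comments keeps the output limited to settings that actually change behaviour, which is easier to review than a raw diff of two heavily commented files.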
This solves the stop/restart issue mentioned above; however, it breaks the web interface, which now displays Error: Could not read object configuration data! The service checks still don't run either.
Error messages are also present in /var/log/audit/audit.log, indicating that the issue is probably related to SELinux (the system is running in enforcing mode):
type=AVC msg=audit(1516991640.421:263116): avc: denied { getattr } for pid=29_exec_t:s0 tclass=file
type=SYSCALL msg=audit(1516991640.421:263116): arch=c000003e syscall=4 success=n fsgid=494 tty=(none) ses=4000 comm="check_procs" exe="/usr/lib64/nagios/plugins
Indeed, temporarily setting SELinux to permissive mode completely solves the issue; however, this is not a real solution. How can I properly update the SELinux settings while keeping SELinux in enforcing mode?
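To get an overview of which programs are being blocked, the denials in the audit log can be summarized with a small sketch like this one. denied_commands is a hypothetical helper name, and the sketch assumes that each AVC record carries a comm="..." field; reading the audit log normally requires root.

```shell
#!/bin/sh
# Count AVC records per command name found in an audit log file.
# Only lines starting with type=AVC are considered; the comm="..." field
# of each such line is extracted and the occurrences are counted.
denied_commands() {
    grep '^type=AVC' "$1" | grep -o 'comm="[^"]*"' | sort | uniq -c | sort -rn
}

# Example call (log path as mentioned in the question):
# denied_commands /var/log/audit/audit.log
```

If most of the entries name Nagios plugins such as check_procs, that supports the suspicion that the policy for the new version is missing rather than a single file being mislabeled.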
The required SELinux policy is available in the nagios-selinux package, which is also on EPEL. Unfortunately, the update does not automatically install it when switching from Nagios 3.5.1 to Nagios 4.3.4, so it must be installed manually.
Of course, the changes to the configuration files (importing the new paths from the .rpmnew config file) are also necessary for Nagios to work properly.
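For reference, the manual installation could look roughly like the sketch below. It assumes that EPEL is already enabled and that the commands are run as root; the restart afterwards is an assumption rather than something confirmed above. The run function is a hypothetical dry-run wrapper added for this example: it only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper (not part of any package): prints each
# command, and only executes it when RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Install the SELinux policy package for Nagios from EPEL.
run yum install -y nagios-selinux

# Restart Nagios afterwards (assumed step; RHEL 6 uses SysV init scripts).
run service nagios restart
```

Running the script as is shows what would be executed; running it with RUN=1 in a root shell performs the installation and the restart.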