Is it possible to have Nagios check whether a host is down before a service failure notification is sent? If a host is down or rebooted we get a lot of service notifications, but we only need the one host-down notification.
It's a really annoying issue because we linked Nagios to our ticket system.
Update:
I'm not sure what happened. We have two Nagios environments; I just inherited this one from another department, and this was one of their major complaints about it (it was also new to me, since my own environment has more checks and never had this issue).
After cleaning up the (hardly functioning) environment and hooking it into a helpdesk tool (OTRS), I didn't see this behaviour, so I suspect the messages were just in the minds of my coworkers (since Nagios was mailing several times a minute!).
It's now green after a few weeks of hard work, and the department is very happy with it...
Sorry that I didn't close this issue before, and thank you for your time!
Another update: finally figured it out (I think). The Nagios agent (Opsview) crashed, so Nagios sends out the "connection refused by host" messages. I think that was what was bothering the IT department.
Something must be misconfigured somewhere, or the host is coming back up before the service checks fail. Even the URL that Khaled posted says host checks are done on demand when a service changes state.
This basically says that Nagios runs host checks at set intervals and also when a service changes state. When a service breaks (goes into a WARNING/CRITICAL state), a host check is executed, and if the host is seen as down, it should suppress the service notifications, assuming you have it configured that way. Can you show us your service and host definitions? Mask any hostnames/addresses to protect the innocent if you want.
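For reference, these are the parts of the definitions worth looking at. This is only a minimal sketch: it assumes the generic-host and generic-service templates from the sample configuration supply the remaining required directives (contacts, periods, intervals), that the stock check-host-alive and check_http commands are defined, and the host name and address are made up. The main thing to verify is that the host has a check_command at all, because a host without one is never actively checked and Nagios will likely just assume it is up.

```
define host {
    use                     generic-host        ; assumed sample-config template
    host_name               web01               ; hypothetical name
    alias                   Web server 01
    address                 192.0.2.10          ; hypothetical address
    check_command           check-host-alive    ; without this the host is never actively checked
    max_check_attempts      3
}

define service {
    use                     generic-service     ; assumed sample-config template
    host_name               web01
    service_description     HTTP
    check_command           check_http
    max_check_attempts      4
}
```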
As a side note, I've been using Nagios for years, and never had a service alert when a host is in a down state, unless I specifically configured it to do so.
Old post, but probably worth mentioning the common enough case with hosts that are permanently configured to reject pings etc, but need to have some visible services monitored.
In these cases, as mentioned in "Nagios Hosts Down but services up", a dummy check can be used to ignore the host and use services instead.
What I tend to do (Nagios 3.x) to get a meaningful host check in these cases is change the host check command to use check_tcp on a port where I know there's a monitored service, commonly port 80, and change check-host-alive to call this with an appropriate port:
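Something along these lines. It is a sketch only, assuming the standard check_tcp plugin is installed under $USER1$; check-host-alive-tcp is a made-up command name, and the port is passed in as the first argument:

```
# Hypothetical command name; wraps the standard check_tcp plugin.
# $ARG1$ is the TCP port to probe (e.g. 80).
define command {
    command_name    check-host-alive-tcp
    command_line    $USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$
}
```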
and configure the host with
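For example (again a sketch: the host name, address and template are placeholders, and check-host-alive-tcp stands for a hypothetical command that runs check_tcp against the port given after the `!`):

```
define host {
    use             generic-host                ; assumed template supplying contacts, periods, etc.
    host_name       web01                       ; hypothetical name
    alias           Web server 01
    address         192.0.2.10                  ; hypothetical address
    check_command   check-host-alive-tcp!80     ; host counts as up if port 80 accepts connections
}
```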
and rely on service dependencies to decide whether to check other services, and to link to your ticket system (via a custom service notification command). At least you know if the host check is down, something's wrong.
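A service dependency for this kind of setup might look like the sketch below. It assumes a service called HTTP on port 80 is the one the host check mirrors, and Disk Usage is some other (made-up) service on the same host. With these criteria, Disk Usage should be neither actively checked nor notified on while HTTP is in a WARNING, UNKNOWN or CRITICAL state:

```
define servicedependency {
    host_name                       web01        ; host of the master service (hypothetical)
    service_description             HTTP         ; master service
    dependent_host_name             web01
    dependent_service_description   Disk Usage   ; hypothetical dependent service
    execution_failure_criteria      w,u,c        ; skip checks of the dependent service in these master states
    notification_failure_criteria   w,u,c        ; suppress its notifications in these master states
}
```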
I don't think it is possible. Nagios does its regular scheduling for service and host checks. Also, it checks host status when a service status changes. You can have a look at this page.
I think you need to implement this mechanism yourself if you need it. For example, you can receive and store the service status change. Then, you can send the notification only if the host status does not change (as a result of another check or after some timeout).
This should be possible with dependencies:
https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/3/en/dependencies.html
Here's a snippet from the site: