I am trying to detect disk thrashing by monitoring the si and so columns from the vmstat command. I monitor other services with Nagios, and their service checks run every 5 minutes. For this thrashing service, I want Nagios to check it every 20 minutes; if the returned status is not OK (i.e. WARNING or CRITICAL), the thrashing service should be checked every 3 minutes until the status returns to OK. The check interval for all other services should remain unchanged.
I am new to Nagios and any help on this would be really appreciated.
Assuming that the interval_length directive is left at its default of 60 (so that intervals are measured in minutes), define a separate template for this special service in /usr/local/nagios/etc/objects/templates.cfg.
Pay attention to these directives:
normal_check_interval: set this to 20 so that the service is checked every 20 minutes under normal conditions.
retry_check_interval: the number of minutes to wait before scheduling a re-check when the service has changed to a non-OK state (3 in your case). Note that once the service has been retried max_check_attempts times without a change in its status, it reverts to being scheduled at the normal_check_interval rate.
Then use this template for your service.
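As a rough sketch of the template and a service that uses it, something like the following could work. The template name thrashing-service, the command name check_thrashing, the host localhost, and the max_check_attempts value are illustrative assumptions, not values from your setup. The parent template generic-service is assumed to be the one shipped in the sample templates.cfg.

```
# In templates.cfg. Assumes interval_length=60 in nagios.cfg,
# so the intervals below are in minutes.
define service{
        name                    thrashing-service   ; hypothetical template name
        use                     generic-service     ; assumed stock template from the sample config
        normal_check_interval   20                  ; check every 20 minutes while the state is OK
        retry_check_interval    3                   ; re-check every 3 minutes after a non-OK result
        max_check_attempts      4                   ; illustrative value: retries before reverting to the normal rate
        register                0                   ; template only, not a real service
        }

# In the file that holds your service definitions.
define service{
        use                     thrashing-service   ; inherit the intervals defined above
        host_name               localhost           ; assumption: replace with your host
        service_description     Disk Thrashing      ; hypothetical description
        check_command           check_thrashing     ; hypothetical command wrapping your vmstat check
        }
```

Because the 3-minute retries stop once max_check_attempts is reached, a larger value keeps the short interval going for longer, at the cost of delaying the point where the problem becomes a hard state.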
You may also need to define a service escalation to change the notification_interval based on the service state. An escalation restricted to the WARNING, UNKNOWN, and CRITICAL states is only used while the service is in one of those states, and setting its notification_interval to 10 gives you a new notification interval of 10 minutes.
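A minimal sketch of such an escalation, under the same assumptions as before: the host localhost and the description Disk Thrashing are hypothetical, and the contact group admins is assumed to be the one from the sample contacts.cfg.

```
define serviceescalation{
        host_name               localhost           ; assumption: replace with your host
        service_description     Disk Thrashing      ; must match the service definition
        first_notification      1                   ; apply from the first notification onward
        last_notification       0                   ; 0 = keep using this escalation indefinitely
        notification_interval   10                  ; re-notify every 10 minutes (with interval_length=60)
        escalation_options      w,u,c               ; only in WARNING, UNKNOWN, or CRITICAL state
        contact_groups          admins              ; assumed contact group from the sample config
        }
```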