Question

Kev

Asked: 2011-09-07 18:07:39 +0800 CST

Dynamically setting a new test interval for Nagios checks

A Nagios notification interval must be greater than or equal to the check interval, because this prevents Nagios from sending false-alarm notifications should a service return to an UP status between checks. I understand the reasoning behind that.

We have a number of checks that run every 30 minutes. This means that once the retries are used up, a failing check produces only one notification each time the service is checked, i.e. one every 30 minutes.

What I need is to keep pestering the duty admin pager every two minutes after a check has gone HARD DOWN/CRITICAL. I can't do this because the next notification will only go out on the next check, i.e. in another 30 minutes.

A feature we had on our old monitoring system was to set a new lower check interval as soon as the check had gone HARD DOWN/CRITICAL. This meant we could keep rechecking every two minutes (and sending alerts) until the alert was acknowledged by a human or changed its status to UP, after which the check interval would revert to 30 minutes.

Is there a way to facilitate this on Nagios?

I've had some thoughts about writing an event handler which will reschedule a check for two minutes in the future after a check has gone HARD DOWN/CRITICAL (by directly sending a command to Nagios).

I'm wondering if anyone else has had to do a similar thing?

I'm running Nagios Core 3.2.3.

1 Answer

quanta · Answer 1 · 2011-09-07T18:24:24+08:00

Best Answer

You can do it with the external commands CHANGE_NORMAL_SVC_CHECK_INTERVAL (for services) and CHANGE_NORMAL_HOST_CHECK_INTERVAL (for hosts).

Add an event handler for your service:

define service {
    host_name              ...
    service_description    ...
    check_command          ...
    contact_groups         ...
    event_handler          change_check_interval
}

Define the change_check_interval command in commands.cfg:

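define command {
    command_name    change_check_interval
    command_line    $USER1$/eventhandlers/change_check_interval.sh $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ "$HOSTNAME$" "$SERVICEDESC$"
}

The contents of change_check_interval.sh:

#!/bin/bash
# Event handler: shorten the check interval once a service goes HARD CRITICAL.
# Arguments: $1 = $SERVICESTATE$, $2 = $SERVICESTATETYPE$, $3 = $SERVICEATTEMPT$,
#            $4 = $HOSTNAME$, $5 = $SERVICEDESC$

state="$1"
state_type="$2"
host="$4"
service="$5"

now=$(date +%s)
commandfile='/usr/local/nagios/var/rw/nagios.cmd'

case "$state" in
    OK)
        # Recovered: restore the normal interval (assumed to be the
        # 30 minutes mentioned in the question; adjust to your setup).
        /bin/printf "[%lu] CHANGE_NORMAL_SVC_CHECK_INTERVAL;%s;%s;30\n" "$now" "$host" "$service" > "$commandfile"
        ;;
    WARNING|UNKNOWN)
        ;;
    CRITICAL)
        # Only act on HARD states, i.e. after the retries are used up.
        if [ "$state_type" = "HARD" ]; then
            /bin/printf "[%lu] CHANGE_NORMAL_SVC_CHECK_INTERVAL;%s;%s;2\n" "$now" "$host" "$service" > "$commandfile"
        fi
        ;;
esac

exit 0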
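
Changing the interval may not move a check that is already queued for 30 minutes out, so the idea from the question (rescheduling the check through an external command) can be combined with it. The following is only a sketch: format_cmd and escalate are hypothetical helper names, not part of Nagios, and the arithmetic assumes the default interval_length of 60 seconds. It prints a CHANGE_NORMAL_SVC_CHECK_INTERVAL line followed by a SCHEDULE_FORCED_SVC_CHECK line for the same service.

```shell
#!/bin/bash
# Sketch: build Nagios external command lines for one service.
# format_cmd and escalate are hypothetical helpers, not Nagios tools.

# format_cmd TIMESTAMP COMMAND ARG... prints "[TIMESTAMP] COMMAND;ARG;ARG..."
format_cmd() {
    local ts="$1" name="$2"
    shift 2
    local IFS=';'   # "$*" joins the remaining arguments with ';'
    printf '[%s] %s;%s\n' "$ts" "$name" "$*"
}

# escalate HOST SERVICE INTERVAL [NOW]
# Prints one command that sets the check interval to INTERVAL and one that
# forces the next check INTERVAL minutes after NOW (assumes interval_length=60).
escalate() {
    local host="$1" service="$2" interval="$3"
    local now="${4:-$(date +%s)}"
    format_cmd "$now" CHANGE_NORMAL_SVC_CHECK_INTERVAL "$host" "$service" "$interval"
    format_cmd "$now" SCHEDULE_FORCED_SVC_CHECK "$host" "$service" "$((now + interval * 60))"
}

# Example, using the command file path from the answer:
# escalate host1 service1 2 > /usr/local/nagios/var/rw/nagios.cmd
```

Printing the commands instead of writing them directly makes it possible to inspect the output before redirecting it into the command file.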

Make sure that external commands are enabled in nagios.cfg:

check_external_commands=1
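
A quick way to confirm this before relying on the handler might look like the sketch below. external_commands_enabled is a hypothetical helper name, and the nagios.cfg location in the commented example is an assumption based on the /usr/local/nagios prefix used above.

```shell
#!/bin/bash
# Sketch: succeed if the given Nagios main config file enables external commands.
# external_commands_enabled is a hypothetical helper, not a Nagios tool.
external_commands_enabled() {
    grep -Eq '^check_external_commands=1[[:space:]]*$' "$1"
}

# Example (config path is an assumption; command file path is from the answer):
# external_commands_enabled /usr/local/nagios/etc/nagios.cfg && echo "external commands enabled"
# test -p /usr/local/nagios/var/rw/nagios.cmd && echo "command file is a named pipe"
```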
