In our Nagios setup we're using templates and object inheritance for services and hosts.
#Le Hosts
define host{
use linux-nrpe,linux-dc3,linux-cassandra
host_name tigris
alias tigris
address 192.168.4.72
}
define host{
use linux-nrpe,linux-dc3,linux-cassandra
host_name euphrates
alias euphrates
address 192.168.4.177
}
#Le Templates
define host{
name linux-nrpe
use all-hosts
hostgroups linux-nrpe
contact_groups rhands,usergroup1,opcomms
register 0
}
#Le Services
define service{
hostgroup_name linux-nrpe
use high-priority-service,graphed-service
service_description Load
check_command check_by_nrpe!check_load!5,5,6!9,9,9
contact_groups rhands,usergroup1,opcomms
}
[...etc...]
The problem with this setup is that every server in the linux-nrpe group triggers alerts when its load hits whatever is defined in the service. Our workhorse servers might run 24/7 at a load of 20, while our DB servers sit quite happily at ~1 unless something goes wrong, so the system either sends out too many alerts or we end up ignoring or disabling alerts for some things. Writing individual service definitions for each server (there are lots of them) would take ages. What we'd really like to do is something like:
define host{
name linux-nrpe
use all-hosts
hostgroups linux-nrpe
contact_groups rhands,usergroup1,opcomms
register 0
perf_load 2,2,3 5,5,6
perf_mem 95% 97%
[...more...]
}
define service{
hostgroup_name linux-nrpe
use high-priority-service,graphed-service
service_description Load
check_command check_by_nrpe!check_load!$perf_load$
contact_groups rhands,usergroup1,opcomms
}
I looked through the docs and couldn't see anything, unless I'm missing something. Any ideas?
We have a very similar solution running in our Nagios monitoring. Custom host and service variables must start with an underscore when you define them. When you reference them, you prefix the name with _HOST or _SERVICE and write the whole name in uppercase, without the leading underscore.
Therefore your perf_load and perf_mem custom variables have to be defined in the host definition as _perf_load and _perf_mem, and referenced as $_HOSTPERF_LOAD$ and $_HOSTPERF_MEM$.
A snippet from a running config of our Nagios:
You can find more details in the Nagios documentation.
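Applied to the configuration in the question, a minimal sketch might look like the following. The variable names _perf_load_warn and _perf_load_crit and the override thresholds for tigris are illustrative assumptions, not taken from a real config. Splitting warning and critical into two variables keeps them aligned with the two arguments the original check_by_nrpe call already passes.

```cfg
# Template: default thresholds inherited by every host that uses linux-nrpe
define host{
    name                 linux-nrpe
    use                  all-hosts
    hostgroups           linux-nrpe
    contact_groups       rhands,usergroup1,opcomms
    register             0
    _perf_load_warn      5,5,6       ; hypothetical custom variable
    _perf_load_crit      9,9,9       ; hypothetical custom variable
}

# Host: values set directly on the host override the inherited ones
define host{
    use                  linux-nrpe,linux-dc3,linux-cassandra
    host_name            tigris
    alias                tigris
    address              192.168.4.72
    _perf_load_warn      20,20,20    ; illustrative values for a busy host
    _perf_load_crit      30,30,30
}

# Service: one definition for the whole hostgroup, thresholds come from the host
define service{
    hostgroup_name       linux-nrpe
    use                  high-priority-service,graphed-service
    service_description  Load
    check_command        check_by_nrpe!check_load!$_HOSTPERF_LOAD_WARN$!$_HOSTPERF_LOAD_CRIT$
    contact_groups       rhands,usergroup1,opcomms
}
```

Hosts that do not set the variables themselves fall back to the template values, so only the outliers need an explicit override.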
For reference, this also works fine in Icinga.
You can also define the thresholds in the NRPE config, on the hosts themselves. This isn't practical if you have more than a few dozen hosts, unless you have some sort of configuration management (something like Puppet, or even just git/hg/svn/whatever) and use includes in nrpe.cfg.
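As a rough sketch of that approach, assuming the plugin and config paths below (they vary by distribution) and illustrative thresholds, the per-host file could be the only thing that differs between servers:

```cfg
# /etc/nagios/nrpe.cfg (path is an assumption)
# Load every .cfg file from a directory that is kept under config management.
include_dir=/etc/nagios/nrpe.d/

# /etc/nagios/nrpe.d/load.cfg on a workhorse server (hypothetical file name)
# The thresholds are fixed here, so the Nagios server passes no arguments.
command[check_load]=/usr/lib/nagios/plugins/check_load -w 20,20,20 -c 30,30,30
```

With the thresholds fixed on the host, the service definition on the Nagios server would call check_load without passing warning and critical values, which assumes the check_by_nrpe command is defined to allow that.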
Lairsdragon's suggestion is much better, though. The one thing I would add is:
It can be helpful to name custom object variables with two leading underscores (__FOO), so they can be referenced as $_HOST_FOO$, which keeps the _HOST prefix visually separate from the variable name.
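A minimal sketch of that naming, using a hypothetical variable __load_warn on one of the hosts from the question:

```cfg
define host{
    use          linux-nrpe,linux-dc3,linux-cassandra
    host_name    tigris
    alias        tigris
    address      192.168.4.72
    __load_warn  20,20,20    ; hypothetical name; referenced as $_HOST_LOAD_WARN$
}
```

Only the first underscore is stripped when the macro name is built, so the second one ends up between HOST and the variable name.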