I run a fairly large Nagios configuration (about 4000 services) without any dependencies. This results in a huge mess of notifications when something goes wrong.
I have looked for best practices on Nagios dependencies, but all I find on the web is a basic explanation with a single example. What I need is deeper information: best practices on how to manage such a config file.
Example: on a cluster of 100 servers, each running Apache, I monitor the number of Apache processes and the listening TCP port 80. I want one check to depend on the other on the same host, but dependent_hostgroup_name won't do the trick, because it makes every "check process" service depend on every "check_http" service in the group.
My questions: How do you manage your dependencies? Do you use scripts to generate them?
Agreed that it's pretty hard to do without scripting.
For every service check command, I have defined (in a database table) what it typically depends on, which saves me from having to configure every service dependency manually. Host dependencies I do by hand, but a script that does MAC address discovery on switches would help automate that.
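A minimal sketch of that per-command rule idea, in Python. Everything here is an assumption for illustration: the DEPENDS_ON dict stands in for my database table, the service descriptions are made up, and it assumes at most one service per check command on each host. It emits one servicedependency per host, so a check only depends on its parent on the same host.

```python
# Hypothetical rule table: check command -> command it typically depends on.
# In a real setup this would come from the configuration database.
DEPENDS_ON = {
    "check_http_content": "check_http",
    "check_http": "check_ping",
    "check_cisco_ifstate": "check_snmp_ok",
    "check_snmp_ok": "check_ping",
}


def build_dependencies(services):
    """Return (host, parent_description, dependent_description) tuples.

    services is a list of (host, service_description, check_command).
    Assumes at most one service per check command on each host.
    """
    by_host_cmd = {(host, cmd): desc for host, desc, cmd in services}
    deps = []
    for host, desc, cmd in services:
        parent_cmd = DEPENDS_ON.get(cmd)
        parent_desc = by_host_cmd.get((host, parent_cmd))
        # Skip services with no rule, or whose parent is not on this host.
        if parent_desc is not None:
            deps.append((host, parent_desc, desc))
    return deps


def render(dep):
    """Render one dependency as a Nagios servicedependency definition."""
    host, parent, child = dep
    return (
        "define servicedependency {\n"
        f"    host_name                      {host}\n"
        f"    service_description            {parent}\n"
        f"    dependent_host_name            {host}\n"
        f"    dependent_service_description  {child}\n"
        "    notification_failure_criteria  w,u,c\n"
        "}\n"
    )


if __name__ == "__main__":
    # Made-up inventory for two hosts.
    inventory = [
        ("web01", "PING", "check_ping"),
        ("web01", "HTTP", "check_http"),
        ("web01", "HTTP content", "check_http_content"),
        ("web02", "HTTP", "check_http"),
    ]
    for dep in build_dependencies(inventory):
        print(render(dep))
```

Here web02 gets no dependency because it has no check_ping service; whether to warn about that or ignore it silently is a design choice for your generator.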
Examples:
"check_http_content" would depend on a "check_http", which would depend on a "check_ping".
"check_cisco_ifstate" would depend on a "check_snmp_ok", which would depend on a "check_ping".
If you build your config from a database using a script, this isn't too hard to implement. Otherwise, you would want to write a parser to go through your config file, and insert the dependencies based on the rules.
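For the parser route, a rough sketch of the first step might look like the following. It assumes services are written as plain "define service { ... }" blocks with explicit host_name, service_description and check_command directives; it does not resolve templates ("use"), hostgroup_name, or multi-file includes, all of which a real parser would need to handle. The function name and sample config are made up.

```python
import re

# Matches "define service { ... }" but not "define servicedependency { ... }".
SERVICE_BLOCK = re.compile(r"define\s+service\s*\{(.*?)\}", re.DOTALL)


def parse_services(text):
    """Return (host, service_description, check_command) tuples.

    Only handles explicit host_name directives; templates and
    hostgroups are deliberately ignored in this sketch.
    """
    services = []
    for match in SERVICE_BLOCK.finditer(text):
        fields = {}
        for line in match.group(1).splitlines():
            line = line.split(";", 1)[0].strip()  # drop trailing ";" comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            if len(parts) == 2:
                fields[parts[0]] = parts[1].strip()
        if fields.get("register") == "0":
            continue  # template definition, not a real service
        host = fields.get("host_name")
        desc = fields.get("service_description")
        cmd = fields.get("check_command")
        if host and desc and cmd:
            command = cmd.split("!", 1)[0]  # strip "!arg1!arg2" arguments
            for name in host.split(","):
                services.append((name.strip(), desc, command))
    return services


SAMPLE = """
define service {
    host_name               web01,web02
    service_description     HTTP
    check_command           check_http!-p 80   ; port check
}
define service {
    name                    generic-service
    register                0
}
define servicedependency {
    host_name               web01
}
"""

if __name__ == "__main__":
    for service in parse_services(SAMPLE):
        print(service)
```

The resulting tuples are what you would match against your dependency rules before writing the generated servicedependency definitions to a separate file.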
I can't imagine running any sizable Nagios implementation without a configuration database that you build your configs from. It lets you add your own abstractions where Nagios lacks them, and it makes life simpler in many other ways.