I need to set up a monitoring and alerting system for one of our business teams. They're smart people, but not engineers or sysadmins. I basically need to do the same thing that happens for system monitoring, but with application-specific metrics only. I'll be writing all of the monitoring scripts myself.
I'm familiar with Nagios, and it's what we use in-house for our system-level stuff, but it's not the right fit for this problem. My needs:
- A clean and simple dashboard
- Alerting capabilities
- Graphs
- Obviously, the ability to write my own monitors
In other words, what's like Nagios but dumbed down?
I would have a look at Zenoss. I've used Nagios for years to monitor and alert on issues, but we recently switched to Zenoss because we wanted easier management and integrated RRD graphing. Zenoss has a nice web-based interface that isn't just a dashboard for dealing with events; it is also used for almost all configuration, device management, and alerting rules. It's much easier to set up than Nagios.
As a bonus, Zenoss supports using Nagios plugins and scripts to monitor and alert on things, so if your team already has experience writing Nagios monitoring scripts you can continue to leverage that knowledge.
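For what it's worth, here is a rough sketch of what a Nagios-style check for an application metric could look like. It follows the standard plugin convention (exit code 0/1/2/3 for OK/WARNING/CRITICAL/UNKNOWN, one status line, optional perfdata after the pipe). The metric name `pending_orders`, the thresholds, and the `get_pending_orders()` helper are made up for illustration; swap in whatever your application actually exposes.

```python
#!/usr/bin/env python3
"""Sketch of a Nagios-style check for an application metric."""

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a metric value to an exit code (higher values are worse)."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_output(name, value, warn, crit):
    """Return (exit_code, status line with perfdata after the pipe)."""
    code = evaluate(value, warn, crit)
    line = f"{LABELS[code]} - {name} is {value} | {name}={value};{warn};{crit}"
    return code, line


def get_pending_orders():
    """Hypothetical placeholder: replace with a real query against your app."""
    return 42


def main():
    """Print the status line and return the exit code for the monitor."""
    try:
        code, line = format_output("pending_orders", get_pending_orders(), 100, 500)
    except Exception as exc:  # any failure to read the metric is UNKNOWN
        code, line = UNKNOWN, f"UNKNOWN - {exc}"
    print(line)
    return code

# When installed as a plugin, finish the script with:
#   import sys; sys.exit(main())
```

The monitoring system only looks at the exit code and the first line of output, so the same script should work unchanged under Nagios or anything else that accepts Nagios plugins.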
+1 for Zenoss, but it depends on what version you use. There is the free community version, and their "enterprisey" offering, which I assume has shinier bells and louder whistles.
The community version does not do a very good job of talking to Windows boxes if you have a more recently configured Windows domain authentication policy or security guys breathing down your neck (hello, NTLMv2). So, for the community version, you get to have fun configuring Samba yourself to force that kind of authentication if you want native polling of Windows servers/clients without installing anything on the target (it basically does remote WMI calls with an admin account of your choice). That agentless polling is why Zenoss appealed to my co-workers, since it was supposed to be shiny, fun, and no-hassle, but the Samba setup confused them because they had never played much with Linux. I ran out of time to tinker, so I never got that part to work with the community version. Just my two cents...
We wrapped Nagios in a "cleaner" interface for the suits. Nagios calls a custom script, which executes the relevant probe and returns the results. If a status changes, another script is called, which takes the appropriate actions (emails, SMS, etc.).
Everything about the two scripts is configurable via a web interface we rigged up in PHP, operating on a DB. It doesn't do visualization, but the alerts work fine.
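To give an idea of the pattern (this is not our actual code), here is a minimal sketch of the "run the probe, act only on a status change" logic. It assumes the probe is an external command that reports its status through its exit code, and it keeps the last known status in a JSON file rather than a DB. The names `run_probe`, `check_and_notify`, and the `notify` callback are hypothetical.

```python
import json
import subprocess
from pathlib import Path


def run_probe(command):
    """Run a probe command; return (exit_code, first line of its output)."""
    result = subprocess.run(command, capture_output=True, text=True)
    lines = result.stdout.strip().splitlines()
    return result.returncode, (lines[0] if lines else "")


def check_and_notify(name, command, state_file, notify):
    """Run the probe and call notify() only when its status has changed."""
    path = Path(state_file)
    # Assumption: last known statuses live in a small JSON file.
    states = json.loads(path.read_text()) if path.exists() else {}
    code, message = run_probe(command)
    previous = states.get(name)
    # The first run only records the status; later runs compare against it.
    if previous is not None and previous != code:
        notify(name, previous, code, message)
    states[name] = code
    path.write_text(json.dumps(states))
    return code
```

The `notify` callback is where the email or SMS would go. Note that this sketch stays silent on the very first run even if the probe fails; whether that is what you want is a judgment call.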
Try Zabbix. It's easy to configure, though not as powerful as Zenoss or Nagios. It has graphs, several alerting options, etc.
Try PRTG Network Monitor; it should be a very easy point-and-click install. It even has a Windows GUI, which non-sysadmin people will love.
It can monitor as much as Nagios can. There is a 30-day free trial and also a free version with a limited number of monitoring nodes.