I'm looking at setting up a tool for collecting usage data and KPIs from multiple systems on various platforms. We'd like to regularly report on key indicators of system usage and health. It doesn't need to be real-time monitoring, just monthly indicators of performance and usage.
The systems that would feed into it vary from Solaris boxes running a large ERP app to IIS running our intranet. We'll agree on 2-5 KPIs for each, then write some sort of script to extract the data from each system. The data would vary from usage by user name and usage by app to performance data such as response time from each site.
Are there any off-the-shelf applications out there for storing and reporting varied metrics?
In my case it needs to be free or cheap; otherwise we'll just create and maintain a small DB ourselves.
Have a look at Polymon. http://polymon.codeplex.org
From what you're describing, it's exactly what you're after.
And free.
"Are there any off-the-shelf applications out there for storing and reporting varied metrics?"
Your operating system? :)
Does the metrics data consist of simple numbers, the semantics of which are understood by your scripts? SNMP can pull a variety of data from cross-platform systems but you have to specify exec calls for anything not in the usual MIB.
As you say, I think the custom approach is the one that will work best for you: just some scripts and a database.
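To make the "scripts and a database" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the function names (open_store, record, monthly_report) and the sample system and KPI names are all assumptions for illustration, not something from the question; the point is only that one narrow table can hold quite varied monthly metrics.

```python
import sqlite3

# Hypothetical schema: one row per system, KPI, reporting period and
# optional label (e.g. a user name or an app name).
SCHEMA = """
CREATE TABLE IF NOT EXISTS metric (
    system TEXT NOT NULL,
    kpi    TEXT NOT NULL,
    period TEXT NOT NULL,            -- e.g. '2024-01'
    label  TEXT NOT NULL DEFAULT '', -- user name, app name, or empty
    value  REAL NOT NULL,
    PRIMARY KEY (system, kpi, period, label)
)
"""


def open_store(path=":memory:"):
    """Open (or create) the metrics database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record(conn, system, kpi, period, value, label=""):
    """Store one data point; re-running an extract overwrites the old value."""
    conn.execute(
        "INSERT OR REPLACE INTO metric (system, kpi, period, label, value) "
        "VALUES (?, ?, ?, ?, ?)",
        (system, kpi, period, label, value),
    )
    conn.commit()


def monthly_report(conn, period):
    """Return all data points for one period, sorted for display."""
    rows = conn.execute(
        "SELECT system, kpi, label, value FROM metric "
        "WHERE period = ? ORDER BY system, kpi, label",
        (period,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    store = open_store()
    record(store, "erp", "logins", "2024-01", 120, label="alice")
    record(store, "intranet", "avg_response_ms", "2024-01", 350.5)
    for row in monthly_report(store, "2024-01"):
        print(row)
```

Each per-system extraction script would then only need to call record() once per KPI, and the reporting side is a plain SQL query.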
There are a number of different commercial tools for monitoring; HP OpenView Operations, specifically the performance monitor, comes to mind for your purpose, but they are all very expensive. I think you should start by not saying what you want but rather what you are hoping to achieve: "We'd like to regularly report on key indicators on system usage and health". Are you looking at the hardware health of servers in your environment? HP SIM or another SNMP-based tool is going to be appropriate here. Are you looking for system vital stats like CPU usage, hard disk space, network usage? For Linux you want sar or collectd. For Windows, I think you can get these stats over SNMP.
It may be more appropriate, depending on your environment, to concentrate on monitoring application performance and health rather than the underlying OS. CPU and network spikes are crude measures, and in a complex environment they are not necessarily something you care about. Measure your transactions per second first so you know whether you even have an issue to investigate.
It could be appropriate to look at log monitoring tools like Splunk, as your systems will often tell you if something is amiss. Again, it depends on what you are really trying to achieve.
OVO is probably the tool that would do what you want, but it is expensive. For open source tools look at Cacti, Nagios and collectd.
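As a rough illustration of measuring transactions per second, the sketch below assumes you can already extract one timestamp per completed transaction from an application log. The function name and the input format (a list of datetime objects) are assumptions; how you get the timestamps depends entirely on the application.

```python
from collections import Counter
from datetime import datetime


def transactions_per_second(timestamps):
    """Return (average_tps, peak_tps) for a list of datetime objects.

    average_tps is the transaction count divided by the elapsed seconds
    between the first and last transaction (at least one second).
    peak_tps is the largest number of transactions in any single
    wall-clock second.
    """
    if not timestamps:
        return 0.0, 0
    # Bucket transactions by the second in which they occurred.
    per_second = Counter(ts.replace(microsecond=0) for ts in timestamps)
    span = (max(timestamps) - min(timestamps)).total_seconds()
    average = len(timestamps) / max(span, 1.0)
    peak = max(per_second.values())
    return average, peak


if __name__ == "__main__":
    sample = [
        datetime(2024, 1, 15, 10, 0, 0),
        datetime(2024, 1, 15, 10, 0, 0, 500000),
        datetime(2024, 1, 15, 10, 0, 1),
        datetime(2024, 1, 15, 10, 0, 4),
    ]
    print(transactions_per_second(sample))
```

For a monthly report, the average and peak per day are probably enough to tell whether there is anything worth investigating.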