I have nagios up and running with nrpe, but I'm relatively new to configuring it myself.
Is there any way to get the raw numbers for the checks, instead of just ok/not ok?
For example, if I want to monitor memory usage of a host over a process that runs for a few hours and see how it fluctuates, can nagios do that, or will it only tell me if it trips some threshold?
I believe what you're looking for is an RRDtool-based add-on to collect the data for you. I use check_mk, which is a collection of extensions for Nagios, but there are a ton of other options.
Generally, nagios notifies you of ok/not okay. I think it's safe to say that most people use nagios to let them know whether something odd is going on in their environment. It does display the numbers for the current state, but that doesn't sound like what you're asking for.
I've grepped values out of nagios.log before. It's not pretty, but it's doable, and if this is a one-shot that might be your best bet. (Example: I was once asked to pull out the history of Exchange eating all its storage over a period of time.)
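For the one-shot case, a rough Python sketch of that grep might look like the following. It assumes the usual `[epoch] SERVICE ALERT: host;service;state;type;attempt;output` line layout; the host and service names, the sample lines, and the `first_number` helper are made up for illustration, and your plugin output may well need a different pattern.

```python
import re
from datetime import datetime, timezone

# Assumed layout: [epoch] SERVICE ALERT: host;service;state;type;attempt;output
LINE_RE = re.compile(
    r"^\[(?P<ts>\d+)\] (?:SERVICE ALERT|CURRENT SERVICE STATE): "
    r"(?P<host>[^;]+);(?P<service>[^;]+);(?P<state>[^;]+);"
    r"(?P<state_type>[^;]+);(?P<attempt>\d+);(?P<output>.*)$"
)

def parse_service_lines(lines, host, service):
    """Yield (datetime, state, output) for log entries of one service."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # host alerts, notifications, etc. are skipped
        if m.group("host") != host or m.group("service") != service:
            continue
        when = datetime.fromtimestamp(int(m.group("ts")), tz=timezone.utc)
        yield when, m.group("state"), m.group("output")

def first_number(text):
    """Hypothetical helper: first numeric value in the plugin output, if any."""
    m = re.search(r"-?\d+(?:\.\d+)?", text)
    return float(m.group()) if m else None

# Made-up sample lines; in practice, iterate over open("nagios.log") instead.
SAMPLE = [
    "[1300000000] SERVICE ALERT: mail01;Disk;WARNING;HARD;3;DISK WARNING - free space: /data 812 MB (9%)",
    "[1300003600] SERVICE ALERT: web01;Load;OK;HARD;1;OK - load average: 0.10",
    "[1300007200] SERVICE ALERT: mail01;Disk;CRITICAL;HARD;3;DISK CRITICAL - free space: /data 204 MB (2%)",
]

for when, state, output in parse_service_lines(SAMPLE, "mail01", "Disk"):
    print(when.isoformat(), state, first_number(output))
```

Keep in mind the log generally records state changes rather than every check result, so what you get back is a sparse history, not a smooth series.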
If you're instead looking for Nagios to check for fluctuations all the time, you might be able to write some kind of custom script for NRPE to run that defines ok/not okay as "fluctuating too much."
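As a minimal sketch of such a script, assuming a Linux host whose /proc/meminfo has a MemAvailable field: it keeps the last few memory readings in a state file and reports on the spread between the highest and lowest. The state file path, window size, thresholds, and function names are all placeholders I picked for illustration. The only parts that follow the standard plugin convention are the exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and the `| label=value` perfdata tail on the output line.

```python
import json
import os

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard plugin exit codes
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

def used_percent(meminfo_text):
    """Percentage of memory in use, computed from the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])  # first token is the number
    total = fields["MemTotal"]
    available = fields["MemAvailable"]  # assumption: kernel exposes this field
    return 100.0 * (total - available) / total

def evaluate(samples, warn_swing, crit_swing):
    """Map the spread (max - min) of recent samples to a plugin status."""
    if len(samples) < 2:
        return UNKNOWN, "not enough samples yet"
    swing = max(samples) - min(samples)
    if swing >= crit_swing:
        code = CRITICAL
    elif swing >= warn_swing:
        code = WARNING
    else:
        code = OK
    message = "swing %.1f%% over %d samples | mem_used=%.1f%% swing=%.1f%%" % (
        swing, len(samples), samples[-1], swing)
    return code, message

def main(state_path="/tmp/check_mem_swing.json", window=12, warn=10.0, crit=25.0):
    """Take one reading, update the history file, print status, return exit code."""
    with open("/proc/meminfo") as fh:
        current = used_percent(fh.read())
    samples = []
    if os.path.exists(state_path):
        with open(state_path) as fh:
            samples = json.load(fh)
    samples = (samples + [current])[-window:]  # keep only the newest readings
    with open(state_path, "w") as fh:
        json.dump(samples, fh)
    code, message = evaluate(samples, warn, crit)
    print("MEM SWING %s - %s" % (LABELS[code], message))
    return code
```

To use it as a plugin you would end the file with `import sys; sys.exit(main())` and point an NRPE command at it. The window is counted in checks, so how much wall-clock time it covers depends on your check interval.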
I hope that helps.
There are various RRD (a special type of database for collecting this kind of time-series data) add-ons for Nagios.
However, Nagios is a poor choice for this kind of performance monitoring. It really functions best as an alert system, and most sites use something else, like Ganglia or Cacti, for ongoing performance tracking.
Nagios is the smoke alarm, ganglia is the thermometer.
There is a third field in every check in which you can put any kind of data you want, but that requires searching the nagios.log file.
Nagios is best for monitoring services so that you get notified if a service or box is down. I would suggest Munin for graphing system resources. Munin can also alert you when a resource value goes beyond a threshold. I am using Munin (as a secondary monitoring tool) with Amazon SNS to get alerts.
Out of the box, so to speak, Nagios does nothing with the actual data that's returned; you use an add-on that hooks into Nagios to process the data. One of the more popular add-ons for graphing this perfdata is pnp4nagios. It integrates well with the web UI, is easy to set up, and is packaged in several Linux distros too.
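To give an idea of what that perfdata is: by plugin convention, everything after the `|` in a check's output is a space-separated list of `label=value[unit];warn;crit;min;max` entries. Here is a small Python sketch that pulls the values out of one such line. The sample output string and the `parse_perfdata` name are made up, and it only handles simple single-line output, so treat it as an illustration of the format rather than of how any particular add-on parses it.

```python
import re

# label (optionally single-quoted), "=", numeric value, optional unit.
# The ;warn;crit;min;max tail is ignored in this simplified sketch.
PERF_RE = re.compile(
    r"(?:'(?P<quoted>[^']+)'|(?P<plain>[^\s=']+))="
    r"(?P<value>-?\d+(?:\.\d+)?)(?P<uom>[^;\s]*)"
)

def parse_perfdata(plugin_output):
    """Return {label: (value, unit)} from the part after the '|'."""
    _, sep, perf = plugin_output.partition("|")
    if not sep:
        return {}  # the plugin emitted no perfdata
    metrics = {}
    for m in PERF_RE.finditer(perf):
        label = m.group("quoted") or m.group("plain")
        metrics[label] = (float(m.group("value")), m.group("uom"))
    return metrics

# Made-up example of a plugin's output line.
line = "MEMORY OK - 62% used | mem_used=62%;80;90;0;100 swap=128MB;;;0;2048"
print(parse_perfdata(line))
```

A graphing add-on does essentially this on every check result and then stores the numbers in a time-series database for you.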
For other options, see the graphing/trending add-on category over at Nagios Exchange.
Nagios is basically an alert system, as the others have said, i.e. there is no historical reporting system built in.
However, there are many add-ons that will do the job for you. I believe NagiosGraph is the most common one, and it is simple enough to install and use.
You can have daily, weekly, monthly and yearly views of your metrics, and you can also create your own graphs for your plugins by editing the metric mapping (the nagiosgraph map file).