I know there are several tools I can use to find out what's causing heavy network and CPU usage right now, but every so often on my server I'll check the logs and notice that there were periods of very high network/CPU activity. The most recent occurrence was on a particular day last week.
How can I "look back" and find out who or what is using those resources, without "catching them in the act"?
I'm using Ubuntu 10.04.
Absent full audit logging (every process run and the resources it consumed), you really can't. The best you can do is review all scheduled tasks (cron jobs, at jobs) and all the external influences you can catalog (scheduled jobs on other systems, an unusual request for a report coming from The Big Boss, etc.) to make an educated guess.
The best way to find out what's causing load spikes is real-time monitoring/alerting: a system that tells you "Right now we have a problem" so that you can log in and determine the cause.
In addition to resource monitoring that looks for periods of high CPU utilization for the system as a whole, you can set up monitoring that looks for extended periods of high CPU utilization by individual processes. I have something along these lines set up for my web and SQL boxes, both Linux and Windows.
Occasionally I see something spike and use 100% of a single core. On a quad-core system that is only about 25% of total CPU, so it wouldn't trigger an alert based on overall CPU usage, but it is enough to warrant a look.
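As a rough illustration of per-process alerting, here is a minimal Python sketch. It assumes you already collect periodic samples of per-process CPU usage (for example from top or ps output) as dictionaries mapping PID to percent of one core; the function names, the 90% threshold, and the three-sample window are all hypothetical choices, not part of any particular monitoring product.

```python
def system_percent(sample, cores):
    """Overall CPU utilization (0-100) for one sample of {pid: percent_of_one_core}."""
    return sum(sample.values()) / cores


def sustained_hogs(samples, threshold=90.0, min_consecutive=3):
    """Return {pid: longest_streak} for processes that stayed at or above
    `threshold` percent of one core for at least `min_consecutive`
    consecutive samples.

    `samples` is a chronological list of {pid: cpu_percent} dicts.
    """
    current = {}  # streak length per pid as of the latest sample
    longest = {}  # longest streak seen per pid
    for sample in samples:
        # A pid that dropped below the threshold (or exited) loses its streak.
        current = {pid: current.get(pid, 0) + 1
                   for pid, cpu in sample.items() if cpu >= threshold}
        for pid, streak in current.items():
            longest[pid] = max(longest.get(pid, 0), streak)
    return {pid: s for pid, s in longest.items() if s >= min_consecutive}
```

With one process pinned at 100% of a core and everything else nearly idle on four cores, system_percent reports roughly 25%, which a system-wide alert would likely ignore, while sustained_hogs still flags the process.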
There are a number of tools that can track cumulative CPU utilization by process and/or record that value over time. If you want fine-grained detail of what's happening:
Now you have a network 'black box' that will record all events for the past X number of minutes/hours (depending on capture file size).
Seeing the full packet dump will show you exactly what is occurring and which endpoint is asking for it. This works well for chronic, hard-to-pin-down issues, such as when a user reports occasional failures and the log shows nothing.
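The setup steps for this answer are not shown here, so the following is only one possible way to get such a "black box", assuming tcpdump is the capture tool. Its -C option starts a new file after roughly the given number of millions of bytes, and -W limits the number of files so the oldest is overwritten. The interface name, file prefix, and sizes below are placeholder assumptions.

```python
import subprocess


def ring_capture_cmd(interface="eth0", prefix="/var/tmp/blackbox.pcap",
                     file_mb=100, file_count=10):
    """Build a tcpdump command line that writes a rotating set of capture files.

    All default values are illustrative; adjust them to your disk space.
    """
    return [
        "tcpdump",
        "-i", interface,        # interface to listen on
        "-w", prefix,           # write raw packets to files with this prefix
        "-C", str(file_mb),     # rotate after about this many million bytes
        "-W", str(file_count),  # keep this many files, then overwrite the oldest
    ]


if __name__ == "__main__":
    # Needs root (or capture privileges) and runs until interrupted.
    subprocess.call(ring_capture_cmd())
```

With these defaults the capture occupies at most about 1 GB (10 files of roughly 100 MB each); how much time that covers depends on your traffic volume.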
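Ubuntu offers the "sysstat" package (install it if it is not already present, and make sure collection is enabled in /etc/default/sysstat). The package sets up a cron job that automatically saves system metrics (CPU, memory, disk, ...) to daily files under "/var/log/sysstat"; the collector scripts themselves live in "/usr/lib/sysstat". You then just need to read the historical data with sar -u|-d|-? -f filename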
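To look at a specific past day, you need the data file for that day of the month. The sketch below builds the matching sar command, assuming the Debian/Ubuntu default of daily files named saDD under /var/log/sysstat; the helper name and its parameters are hypothetical, while -u, -f, -s and -e are standard sar options.

```python
import datetime
import subprocess


def sar_cmd(day, report="-u", start=None, end=None,
            data_dir="/var/log/sysstat"):
    """Build a sar command that reads the saved data file for `day`.

    `day` is a datetime.date; `start` and `end` are optional "HH:MM:SS"
    strings that narrow the report to a time window.
    """
    data_file = "{}/sa{:02d}".format(data_dir, day.day)  # e.g. sa07
    cmd = ["sar", report, "-f", data_file]
    if start:
        cmd += ["-s", start]
    if end:
        cmd += ["-e", end]
    return cmd


if __name__ == "__main__":
    # CPU report for the same weekday one week ago, if the file still exists.
    last_week = datetime.date.today() - datetime.timedelta(days=7)
    subprocess.call(sar_cmd(last_week))
```

Keep in mind that sysstat only retains a limited number of days of history, so an older file may already have been rotated away.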