On two different servers (with Ubuntu 12.04LTS AMD64) I have seen the following behaviour:
top - 10:50:05 up 305 days, 21:17, 1 user, load average: 1.94, 2.52, 2.97
Tasks: 141 total, 2 running, 139 sleeping, 0 stopped, 0 zombie
Cpu(s): 41.5%us, 6.5%sy, 0.0%ni, 51.8%id, 0.0%wa, 0.2%hi, 0.1%si, 0.0%st
Mem: 8178432k total, 5753740k used, 2424692k free, 159480k buffers
Swap: 15625208k total, 0k used, 15625208k free, 4905292k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 23928 2072 1216 S 0 0.0 0:56.42 init
2 root 20 0 0 0 0 S 0 0.0 0:00.01 kthreadd
3 root RT 0 0 0 0 S 0 0.0 0:01.23 migration/0
4 root 20 0 0 0 0 S 0 0.0 2:39.82 ksoftirqd/0
5 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/0
6 root RT 0 0 0 0 S 0 0.0 0:02.99 migration/1
7 root 20 0 0 0 0 S 0 0.0 2:32.15 ksoftirqd/1
8 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/1
9 root RT 0 0 0 0 S 0 0.0 0:11.67 migration/2
10 root 20 0 0 0 0 S 0 0.0 29:00.34 ksoftirqd/2
The server is working fine, but top shows all processes as using 0% CPU. A reboot fixed this on an earlier machine, but I haven't yet tried it on this one.
I have run top several times, so I am sure that I haven't accidentally pressed '<' or '>' to sort by a different column. Sorting the process list by each of the available columns still shows 0% CPU for all displayed processes.
What is going on? Is this a kernel bug?
Update: If I use top -p <PID> for a known, busy process, top still displays 0% CPU for that process.
Update2: My point is that ALL processes are reporting 0% CPU usage ALL of the time.
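One way to cross-check top is to read the per-process CPU counters directly from /proc and compute the percentage by hand. The following is only a minimal sketch, assuming a Linux /proc filesystem with the field layout described in proc(5); the function names parse_cpu_ticks and cpu_percent are made up for this illustration and are not part of any tool.

```python
import os
import time


def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or ')', so split on the last ')' in the line.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so fields 14 and 15 (utime, stime)
    # sit at indices 11 and 12.
    return int(rest[11]) + int(rest[12])


def cpu_percent(pid, interval=1.0):
    """Sample a process twice; return CPU usage as a percentage of one core."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    path = "/proc/%d/stat" % pid
    with open(path) as f:
        before = parse_cpu_ticks(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_ticks(f.read())
    return 100.0 * (after - before) / (ticks_per_second * interval)
```

Calling cpu_percent(<PID>) for the known busy process shows whether the kernel counters themselves are advancing. If they are while top still prints 0, the problem is more likely in how top computes or displays the value than in the kernel's accounting.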
Have a look at this article: http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages
Typically, load with little to no CPU usage indicates I/O to disk/network. Load isn't a bad thing in itself, but keeping an eye on the trends of your 1, 5, and 15 minute averages will help you tell a real issue from a normal trend.
Maybe check what your disks are doing with iostat.
Load isn't CPU usage. Load is the number of runnable processes, and on Linux it also counts processes in uninterruptible sleep (D state, usually waiting on I/O). Seeing a load of almost 2 with no CPU usage means that some processes are probably doing a lot of I/O, or may even be stuck. Check with ps whether you have processes in D state, for instance.
(I had a mailserver with load 2200 last week; its storage had failed. Everything else worked normally though :))
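The D-state check can also be done without ps by scanning /proc directly. This is a rough sketch, assuming the Linux /proc/<pid>/stat layout from proc(5); parse_state and uninterruptible_processes are hypothetical helper names chosen for this example.

```python
import os


def parse_state(stat_line):
    """Return (pid, command, state) from a /proc/<pid>/stat line."""
    # The command is enclosed in parentheses and may contain spaces or ')',
    # so use the first '(' and the last ')' as its boundaries.
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren])
    command = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, command, state


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, command) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, command, state = parse_state(f.read())
        except (IOError, OSError):
            # The process exited between listdir() and open().
            continue
        if state == "D":
            found.append((pid, command))
    return found
```

Running uninterruptible_processes() a few times in a row is more informative than a single call, since a process that is only briefly waiting on I/O will come and go, while a stuck one will show up every time.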