I have a VPS running CentOS 5.7, and it is behaving very strangely. The VPS is hosted on a 2-core machine.
For a 2-core machine the load average is very high, as the output of the top command shows:
top - 04:04:40 up 1 day, 22:43, 1 user, load average: 6.23, 5.19, 4.72
Tasks: 59 total, 1 running, 58 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.4%us, 3.4%sy, 0.0%ni, 85.4%id, 5.8%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1376256k total, 755908k used, 620348k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 0k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 15 0 2172 664 572 S 0.0 0.0 0:02.59 init
1135 root 18 -4 2276 556 344 S 0.0 0.0 0:00.00 udevd
1231 root 19 0 32716 564 460 S 0.0 0.0 0:00.00 brcm_iscsiuio
1542 root 16 0 1828 580 488 S 0.0 0.0 0:03.24 syslogd
1599 named 23 0 50596 3984 2024 S 0.0 0.3 0:01.26 named
1615 root 18 0 7228 1044 644 S 0.0 0.1 0:00.00 sshd
1626 root 15 0 2848 844 676 S 0.0 0.1 0:00.00 xinetd
1638 root 18 0 3728 1316 1144 S 0.0 0.1 0:00.00 mysqld_safe
1662 mysql 15 0 252m 99m 4876 S 0.0 7.4 9:21.01 mysqld
1738 postgres 15 0 20348 3412 2900 S 0.0 0.2 0:00.26 postmaster
1740 postgres 15 0 10128 904 388 S 0.0 0.1 0:01.42 postmaster
1742 postgres 15 0 20348 984 468 S 0.0 0.1 0:05.20 postmaster
1743 postgres 18 0 11128 812 292 S 0.0 0.1 0:00.13 postmaster
1744 postgres 15 0 10308 1060 440 S 0.0 0.1 0:00.00 postmaster
1757 mailnull 15 0 9524 2328 1836 S 0.0 0.2 0:00.99 exim
1786 root 18 0 2172 720 552 S 0.0 0.1 0:02.58 dovecot
1787 root 18 0 2648 1040 832 S 0.0 0.1 0:02.04 dovecot-auth
As you can see, the load is about 6 (on a 2-core machine), yet the CPU and memory consumption of all the listed processes added together is minimal!
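To make the comparison with the core count concrete, here is a minimal Python sketch that pulls the three load averages out of the top header quoted above and divides them by the number of cores. The function name load_per_core is purely illustrative, and the core count of 2 comes from the question.

```python
import re

def load_per_core(header: str, cores: int) -> tuple:
    """Return the 1-, 5- and 15-minute load averages divided by the core count."""
    match = re.search(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", header)
    if match is None:
        raise ValueError("no load average found in header line")
    one, five, fifteen = (float(value) for value in match.groups())
    return (one / cores, five / cores, fifteen / cores)

# Header text taken from the top output in the question.
header = "04:04:40 up 1 day, 22:43, 1 user, load average: 6.23, 5.19, 4.72"
print(load_per_core(header, cores=2))
```

A value above 1.0 per core means that, on average, more tasks were counted towards the load than there were cores to run them.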
I thought this might be an I/O wait issue, so I ran iostat -cx 30 to check:
avg-cpu: %user %nice %system %iowait %steal %idle
5.43 0.02 3.36 5.80 0.00 85.39
avg-cpu: %user %nice %system %iowait %steal %idle
3.79 0.00 0.33 2.09 0.00 93.79
avg-cpu: %user %nice %system %iowait %steal %idle
3.61 0.00 0.30 5.67 0.00 90.42
avg-cpu: %user %nice %system %iowait %steal %idle
1.91 0.00 0.22 1.04 0.00 96.83
avg-cpu: %user %nice %system %iowait %steal %idle
3.47 0.00 0.28 0.75 0.00 95.49
avg-cpu: %user %nice %system %iowait %steal %idle
3.93 0.00 0.44 2.62 0.00 93.01
As you can see, %iowait is never more than about 5%, meaning the CPU spends at most about 5% of its time waiting for I/O. That suggests the disk is not busy, so the high load average cannot be caused by processes waiting for the disk, right?
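One detail worth keeping in mind when reading these figures: the first report iostat prints covers the whole time since boot, and only the later reports cover the 30-second interval. A rough Python sketch (the function name iowait_samples is made up for illustration) that extracts %iowait from the output above and averages only the interval samples:

```python
def iowait_samples(report: str) -> list:
    """Extract the %iowait value from every avg-cpu block in iostat output."""
    samples = []
    lines = report.splitlines()
    for index, line in enumerate(lines):
        if line.strip().startswith("avg-cpu:") and index + 1 < len(lines):
            columns = line.split()[1:]  # drop the "avg-cpu:" label
            values = lines[index + 1].split()
            samples.append(float(values[columns.index("%iowait")]))
    return samples

# The iostat output quoted in the question.
report = """\
avg-cpu: %user %nice %system %iowait %steal %idle
5.43 0.02 3.36 5.80 0.00 85.39
avg-cpu: %user %nice %system %iowait %steal %idle
3.79 0.00 0.33 2.09 0.00 93.79
avg-cpu: %user %nice %system %iowait %steal %idle
3.61 0.00 0.30 5.67 0.00 90.42
avg-cpu: %user %nice %system %iowait %steal %idle
1.91 0.00 0.22 1.04 0.00 96.83
avg-cpu: %user %nice %system %iowait %steal %idle
3.47 0.00 0.28 0.75 0.00 95.49
avg-cpu: %user %nice %system %iowait %steal %idle
3.93 0.00 0.44 2.62 0.00 93.01
"""

samples = iowait_samples(report)
interval_only = samples[1:]  # skip the since-boot report
print(sum(interval_only) / len(interval_only))
```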
Finally, to further confirm my point, I ran vmstat:
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 751928      0      0    0    0   120    99    0  105  5  3 85  6  0
As you can see, the number of running processes is minimal, and the b column is 0, indicating that no processes are in uninterruptible sleep. Furthermore, the bi column (blocks read from a block device) is only 120, which is not high, right? The si column (memory swapped in from disk) is 0. Finally, under the cpu header, the wa column shows that the CPU spends only 6% of its time waiting for I/O to complete.
All of this rules out I/O as the bottleneck.
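A single vmstat line is only one sample, so processes that block briefly can be missed. As a cross-check, the state of every process can be read from /proc/<pid>/stat, where the field after the parenthesised command name is a one-letter state (R for running, D for uninterruptible sleep, S for sleeping). Below is a small Python sketch; the helper names and the sample lines are hypothetical, not taken from this server.

```python
from collections import Counter

def process_state(stat_line: str) -> str:
    """Return the one-letter state from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split after the last ')'.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]

def count_states(stat_lines) -> Counter:
    """Count how many processes are in each state."""
    return Counter(process_state(line) for line in stat_lines)

# Hypothetical, shortened sample lines in the /proc/<pid>/stat layout.
sample_lines = [
    "1 (init) S 0 1 1 0 -1 4194560",
    "1662 (mysqld) D 1638 1638 1638 0 -1 4194304",
    "2001 (my (odd) name) R 1 2001 2001 0 -1 4194304",
]
counts = count_states(sample_lines)
print(counts["R"] + counts["D"])  # tasks that Linux counts towards the load

# On a live Linux system the lines could be gathered with, for example:
#   import glob
#   lines = [open(path).read() for path in glob.glob("/proc/[0-9]*/stat")]
# (a process can exit between the glob and the open, so real code
# would need to handle the resulting error).
```

Running such a count repeatedly for a while gives a better picture than one snapshot.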
So, the conclusion is that the load average is very high and it is degrading the performance of my website, yet it is not caused by either of the following:
- High CPU or memory usage by my processes
- I/O operations
What can cause the high load average?
The load average is the average number of processes that are ready to run or, on Linux, blocked in uninterruptible sleep (usually waiting for disk I/O). Processes in ordinary interruptible sleep do not add to it.
The numbers are certainly weird: with a load average of 6 I'd expect much higher CPU utilization than 5 to 6%. But then again, the load is not steady: the 1-minute average (6.23) is above the 5- and 15-minute averages (5.19 and 4.72), so it has been climbing recently. Perhaps there was a spike? Anything special about the workload?
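To see how a spike shows up in the three figures, it helps to know that Linux updates each load average roughly every 5 seconds as an exponentially damped moving average of the number of active tasks. The following Python sketch simulates that update in floating point (the kernel uses fixed-point arithmetic, which is ignored here); the workload of 6 active tasks for 10 minutes followed by 5 idle minutes is invented for illustration.

```python
import math

SAMPLE_INTERVAL = 5.0  # seconds between updates, as in the Linux kernel

def update(load: float, active: int, window: float) -> float:
    """Apply one exponentially damped update for an averaging window in seconds."""
    decay = math.exp(-SAMPLE_INTERVAL / window)
    return load * decay + active * (1.0 - decay)

def simulate(active_counts, windows=(60.0, 300.0, 900.0)):
    """Feed a sequence of active-task counts through the 1-, 5- and 15-minute averages."""
    loads = [0.0 for _ in windows]
    for active in active_counts:
        loads = [update(load, active, window) for load, window in zip(loads, windows)]
    return loads

# Hypothetical workload: 6 active tasks for 10 minutes, then 5 minutes idle.
burst = [6] * 120 + [0] * 60
one, five, fifteen = simulate(burst)
print(f"{one:.2f} {five:.2f} {fifteen:.2f}")
```

In this simulation the 1-minute figure ends up the smallest of the three once the burst is over; the reverse ordering would point to load that is still building.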
Install sysstat, learn how to use it (it isn't simple, mind you), and milk it for insight...