tl;dr: First CPU core is consistently saturated, all other cores are consistently under-loaded.
A VM, inside Ubuntu-based Xen XCP:
    $ uname -a
    Linux MYHOST 2.6.38-15-virtual #59-Ubuntu SMP Fri Apr 27 16:40:18 UTC 2012 i686 i686 i386 GNU/Linux
    $ lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 11.04
    Release:        11.04
    Codename:       natty
This VM has 8 CPU cores.
There are 10 single-threaded worker processes running on this VM, connected via a FastCGI interface to the nginx server (listening on a local network port).
Under synthetic load from ab (ApacheBench), only the first of the eight cores is ever loaded to 100% (as seen in htop). It stays under very high load more or less constantly, while the load on all other cores jumps around anywhere from 0 to 100%, more or less at random.
Here is what I typically see under load in htop:

    1 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||99.3%]  Tasks: 70, 35 thr; 11 running
    2 [||||||||||||||| 15.0%]  Load average: 3.86 1.05 0.39
    3 [||||||||||||||||||||||||||||||||||| 36.7%]  Uptime: 22 days, 06:31:57
    4 [|||||||||||||||| 15.7%]
    5 [||||||||||||||||||||| 22.4%]
    6 [||||||||||||||||||| 19.9%]
    7 [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 71.2%]
    8 [|||||||||||||||||||||||||||||| 31.3%]
    Mem[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||394/4028MB]
    Swp[ 0/5362MB]

      PID USER     PRI NI  VIRT   RES  SHR S CPU% MEM%    TIME+ Command
    26213 www-data  20  0 49748 26952 2448 R 29.0  0.7 10:42.61 /usr/bin/luajit2
    26227 www-data  20  0 50172 27412 2452 R 27.0  0.7 10:43.53 /usr/bin/luajit2
    26221 www-data  20  0 50736 27948 2452 R 27.0  0.7 10:39.02 /usr/bin/luajit2
    26234 www-data  20  0 50128 27232 2452 R 27.0  0.7 10:36.36 /usr/bin/luajit2
    26218 www-data  20  0 50232 27376 2452 R 26.0  0.7 10:39.32 /usr/bin/luajit2
    26214 www-data  20  0 51268 28496 2452 R 26.0  0.7 10:58.15 /usr/bin/luajit2
    26232 www-data  20  0 50420 27588 2452 R 25.0  0.7 10:39.21 /usr/bin/luajit2
    26217 www-data  20  0 50236 27348 2452 R 25.0  0.7 10:34.44 /usr/bin/luajit2
    26219 www-data  20  0 50748 27960 2448 R 23.0  0.7 10:45.30 /usr/bin/luajit2
    26239 www-data  20  0 49772 27188 2452 R 22.0  0.7 10:39.39 /usr/bin/luajit2
    26368 www-data  20  0 10856  3796  968 S 15.0  0.1  1:12.62 nginx: worker process
    26369 www-data  20  0 10652  3504  968 S  2.0  0.1  1:12.75 nginx: worker process
    26372 www-data  20  0 10520  3504  968 S  0.0  0.1  1:18.64 nginx: worker process
    ...
During the load test all worker processes are in the R (running) state. The load test runs for about 10-15 minutes, and throughput is about 700-900 hits/second. Traffic is, of course, generated from external machines.
It looks like this imbalance in CPU core load is the main performance bottleneck; if all cores were loaded evenly, performance could be higher.
Any clues on how to troubleshoot this issue?
Please tell me if I can provide more information.
It looks like CPU0 receives every eth1 interrupt, and there are a lot of them.
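One way to check this is to read the per-CPU counters in /proc/interrupts. The following is only a sketch: it assumes the NIC's line is labelled eth1, as in the observation above, and the function name irq_counts is made up for illustration.

```shell
# Print per-CPU interrupt counts for every IRQ line that mentions a device.
# Usage: irq_counts DEVICE [FILE]   (FILE defaults to /proc/interrupts)
irq_counts() {
    dev="$1"
    file="${2:-/proc/interrupts}"
    awk -v dev="$dev" '
        NR == 1 { ncpu = NF; next }      # header row: CPU0 CPU1 ...
        index($0, dev) {                 # literal substring match on the device name
            irq = $1
            sub(/:$/, "", irq)           # "16:" -> "16"
            for (i = 1; i <= ncpu; i++)
                printf "irq=%s cpu=%d count=%s\n", irq, i - 1, $(i + 1)
        }
    ' "$file"
}

# Example (not run automatically):
#   irq_counts eth1
```

If nearly all of the counts sit in the CPU0 column, the IRQ's affinity can in principle be changed by writing a hexadecimal CPU bitmask to /proc/irq/<N>/smp_affinity as root. Whether a Xen guest honours that for its event channels is something to test rather than assume.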
Umm,
Why have you not mentioned taskset? taskset -p pid will retrieve the affinity...
Add -c to taskset to specify a CPU list: in this case, anything but 0.
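As a rough sketch of that suggestion, the workers could be moved off CPU 0 like this. The helper names cpulist_without_cpu0 and pin_by_name are hypothetical, and it assumes the workers show up under the process name luajit2, as in the htop listing in the question.

```shell
# Build a taskset CPU list that excludes CPU 0, e.g. "1-7" for 8 CPUs.
cpulist_without_cpu0() {
    ncpu="$1"
    if [ "$ncpu" -le 1 ]; then
        echo "need at least 2 CPUs" >&2
        return 1
    fi
    if [ "$ncpu" -eq 2 ]; then
        echo "1"
    else
        echo "1-$((ncpu - 1))"
    fi
}

# Show the current affinity of every process with the given name,
# then restrict each one to the given CPU list.
pin_by_name() {
    name="$1"
    cpus="$2"
    for pid in $(pgrep -x "$name"); do
        taskset -p "$pid"              # print the current affinity mask
        taskset -cp "$cpus" "$pid"     # set the new CPU list
    done
}

# Example (not run automatically):
#   pin_by_name luajit2 "$(cpulist_without_cpu0 "$(nproc)")"
```

This only keeps the workers away from the core that is handling the interrupts; it does not spread the interrupts themselves across cores.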