As far as I can tell, the load average on my server (Ubuntu Linux 8.04.1) is way too high, and in practice the machine slows down or stops serving during peak hours.
It's a fairly stock LAMP stack powering a single site (image hosting) that obviously serves a lot of content (images) from disk, but the images need to go through PHP to be served. Aside from the general advice to use a cache/proxy approach for this, I'm lost as to why it's apparently using less than half of the available resources (4 GB RAM, it's a Linode 4096).
I'm quite a noob at Linux, so please ask for whatever might be useful. This is a portion of htop (MySQL shows 98.9% CPU usage there, but that was momentary; it sits at 0.x% almost all the time):
1 [||||||||||||||||||||||||||||||||||| 69.0%] Tasks: 355 total, 6 running
2 [||||||||||||||||||||||| 44.8%] Load average: 18.32 15.02 11.58
3 [|||||||||||||||||||||||||||||||||||| 71.9%] Uptime: 01:10:22
4 [||||||||||||||||||||||||||||| 57.9%]
Mem[||||||||||||||||||||||||||||||||||||||2190/4096MB]
Swp[| 0/127MB]
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
2345 mysql 18 0 177M 72640 5140 S 98.9 1.7 7:47.58 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql
9350 www-data 16 0 48940 24304 4376 R 13.7 0.6 0:01.05 /usr/sbin/apache2 -k start
9301 mysql 15 0 177M 72640 5140 S 10.0 1.7 0:00.17 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql
9186 mysql 17 0 177M 72640 5140 S 10.0 1.7 0:00.22 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql
9150 www-data 15 0 58400 33900 4476 S 8.1 0.8 0:02.03 /usr/sbin/apache2 -k start
9077 mysql 15 0 177M 72640 5140 S 8.1 1.7 0:00.39 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql
9270 mysql 15 0 177M 72640 5140 S 7.5 1.7 0:00.12 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql
9037 mysql 16 0 177M 72640 5140 S 7.5 1.7 0:00.45 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql
9333 www-data 15 0 35724 11260 4560 S 6.2 0.3 0:03.88 /usr/sbin/apache2 -k start
This is the current apache2.conf, though I've tried lots of combinations and asked here in the past:
Timeout 90
KeepAlive On
MaxKeepAliveRequests 150
KeepAliveTimeout 3
<IfModule mpm_prefork_module>
StartServers 1
MinSpareServers 1
MaxSpareServers 5
MaxClients 275
ServerLimit 275
MaxRequestsPerChild 1250
</IfModule>
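For what it's worth, here's the rough RAM math I've been using to sanity-check MaxClients (just a sketch; the ~3 GB figure assumes leaving headroom for MySQL and the OS):

# average resident size of the current Apache workers, and roughly how
# many of them would fit in ~3 GB
ps -C apache2 -o rss= | awk '{sum+=$1; n++}
    END {avg=sum/n/1024; printf "avg RSS %.1f MB -> ~%d workers in 3 GB\n",
         avg, 3072/avg}'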
UPDATE: As asked, this is a portion of top:
top - 15:07:31 up 1:46, 2 users, load average: 12.83, 10.64, 10.14
Tasks: 223 total, 17 running, 206 sleeping, 0 stopped, 0 zombie
Cpu(s): 84.3%us, 8.8%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 1.0%si, 5.9%st
Mem: 4194528k total, 3555696k used, 638832k free, 27748k buffers
Swap: 131064k total, 588k used, 130476k free, 1458672k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2345 mysql 17 0 180m 76m 5140 S 55 1.9 13:09.79 mysqld
12479 www-data 18 0 73224 47m 4552 S 48 1.2 0:03.74 apache2
12294 www-data 17 0 71788 46m 4472 R 39 1.1 0:05.78 apache2
12382 www-data 17 0 73744 48m 4460 R 33 1.2 0:03.19 apache2
UPDATE: As suggested (by Christopher Karel, thanks), here are the active processes (output from ps -efl | cut -c3- | egrep -v "^S"). On average, it shows 1-5 apache2 processes. Does this make sense given my current apache2.conf and load average?
T root 12519 12508 0 75 0 - 612 finish 15:07 pts/1 00:00:00 top
R www-data 18677 2774 1 76 0 - 17130 - 16:23 ? 00:00:04 /usr/sbin/apache2 -k start
R www-data 18965 2774 2 76 0 - 13397 - 16:26 ? 00:00:04 /usr/sbin/apache2 -k start
R www-data 19047 2774 2 76 0 - 11613 - 16:28 ? 00:00:00 /usr/sbin/apache2 -k start
R www-data 19088 2774 55 76 0 - 10482 - 16:29 ? 00:00:00 /usr/sbin/apache2 -k start
R www-data 19091 2774 0 81 0 - 8579 - 16:29 ? 00:00:00 /usr/sbin/apache2 -k start
R www-data 19092 2774 0 81 0 - 8355 - 16:29 ? 00:00:00 /usr/sbin/apache2 -k start
R www-data 19093 2774 0 82 0 - 8322 - 16:29 ? 00:00:00 /usr/sbin/apache2 -k start
R root 19094 18557 0 77 0 - 593 - 16:29 pts/2 00:00:00 ps -efl
R root 19095 18557 0 78 0 - 729 - 16:29 pts/2 00:00:00 -bash
R root 19096 18557 0 78 0 - 729 - 16:29 pts/2 00:00:00 -bash
You might want to enable Apache's mod_status (http://httpd.apache.org/docs/2.0/mod/mod_status.html) so you can see exactly what's happening inside your webserver. Specifically, you'll get numbers on per-request CPU consumption.
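On Ubuntu that's roughly a2enmod status plus a snippet like the following in the Apache config, then a reload (a typical example; tighten the Allow line to wherever you'll browse from):

ExtendedStatus On
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>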
A few snapshots from vmstat/iostat wouldn't hurt, either.
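Something like this, run during a bad patch, is usually enough (iostat comes from the sysstat package):

# the first sample of each is an average since boot, so watch the later ones
vmstat 5 5
iostat -x 5 3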
Also, are you using MyISAM or InnoDB tables? When you get one of these load spikes, what do you get from "SHOW FULL PROCESSLIST\G" in MySQL? I have a feeling you're getting lock/query contention in MySQL which is blowing up the length of your kernel run queue.
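A quick way to grab that from the shell during a spike (assuming root credentials; repeat it a few times and look for queries stuck in states like "Locked"):

mysql -u root -p -e 'SHOW FULL PROCESSLIST\G'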
Any process not in state S (sleep) will be counted as an active process. This includes those in the R (running) state and the D (blocking) state, the latter usually occurring when a process is waiting for I/O from a disk or network device. You may also have zombie processes hanging around running up the load average.
To find a list of those specifically, try the following command:
ps -efl | cut -c3- | egrep -v "^S"
You don't have a lot of iowait time listed, so it might turn out to be zombies. The 100% CPU usage from mysqld might also explain your intermittent hangups. (Maybe it only gets pegged some of the time?) The load average might be a red herring, or at least not the root cause of your problem.
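To pick out just the Z (zombie) and D (uninterruptible I/O wait) states, a variation like this should do it:

# second column is the state; Z and D processes count toward load
# without necessarily burning CPU
ps -eo pid,stat,comm | awk '$2 ~ /^[ZD]/'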
Also, it appears your machine is using 3.5 GB of its 4 GB of RAM.
I don't have a full solution for you, but I have some guesses. free -m can give you a bit better view of what's getting used.
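On procps of this vintage, the row to read is "-/+ buffers/cache":

free -m
# the "-/+ buffers/cache" row excludes reclaimable disk cache; the "used"
# figure on that row is what applications are actually holding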
However, it is not obvious to me from the above data that there is much wrong with your setup. While you can probably wring a bit more performance out of the box, you may simply be approaching its limit.
Update:
Clarification re: comment below.
A typical network-oriented TCP server consists of a daemon that has a listening socket and a number of open connections to clients. Each of these sockets has a process waiting on it (one process may wait on numerous sockets). Those processes will be in the sleeping state and will be woken up by the OS when some data arrives. If the server is efficient (say, a static web server), you may never catch it running, as it takes only some 100 microseconds to wake up, serve some data, and go back to sleep.
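You can see this on your own box; a sketch like the following should show most apache2 workers parked in S between requests:

# refresh once a second; STAT is the state, WCHAN the kernel wait channel
watch -n1 'ps -C apache2 -o pid,stat,wchan:20,cmd'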
Update 2:
A modern OS allocates free memory to new disk buffers until it runs out of memory, and then reuses the least-used buffers. Thus, memory will always be full. Furthermore, there are several ways in which two processes may report the same page of memory as part of their size. The upshot of this is that (a) a modern OS is always "out of memory", and (b) it is difficult to tell exactly how memory is used. The best simple indication is to strive for buffer and cached numbers that are a large fraction of physical memory. On this box, more than 30% of memory is cached disk data.
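A one-liner along these lines shows that fraction directly:

# share of physical RAM currently holding cached file data
awk '/^MemTotal/ {t=$2} /^Cached/ {c=$2}
     END {printf "cached: %.0f%% of RAM\n", 100*c/t}' /proc/meminfo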
I had this same problem. mytop showed lots of queries in the queue. I added indexes to my tables and the problem went away.
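Roughly what that looks like, for anyone in the same spot (mydb, images, and user_id are made-up names):

# "type: ALL" / "key: NULL" in EXPLAIN output means a full table scan;
# index the column being filtered on
mysql mydb -e 'EXPLAIN SELECT * FROM images WHERE user_id = 42\G'
mysql mydb -e 'ALTER TABLE images ADD INDEX idx_user_id (user_id)'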
If you are serving mainly images (static files), it would probably be better to switch to NGINX, and if you use PHP to resize pics you should probably use memcached (NGINX can serve directly from memcached; you set that up in NGINX's config file). That would have a huge impact. Apache is not good at serving static files (I don't think it's good for much of anything nowadays).
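A bare-bones sketch of the static half (the server name and paths are placeholders):

server {
    listen 80;
    server_name img.example.com;

    # hand images straight off the disk, no PHP in the path
    location /images/ {
        root /var/www;
        expires 30d;
    }
}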