I'm trying to manage a server on Amazon for a network of sites that receives about 100 million pageviews per month. Unfortunately, nobody out of my team of 5 developers has much server admin experience.
Right now we have the MaxClients set to 1400. Currently our traffic is about average, and we have 1150 total Apache processes running, which use about 2% CPU each! Out of those 1150, 800 of them are currently sleeping, but still taking up CPU. I'm sure there are ways to optimize this. I have a few thoughts:
- It appears Apache is creating a new process for every single connection. Is this normal?
- Is there a way to more quickly kill the sleeping processes?
- Should we turn KeepAlive on? Each page loads about 15-20 medium-sized graphics and a lot of javascript/css.
So, here's our Apache setup. We do plan on contracting a server admin asap, but I would really appreciate some advice until we can find someone.
Timeout 25
KeepAlive Off
MaxKeepAliveRequests 200
KeepAliveTimeout 5
<IfModule prefork.c>
StartServers 100
MinSpareServers 20
MaxSpareServers 50
ServerLimit 1400
MaxClients 1400
MaxRequestsPerChild 5000
</IfModule>
<IfModule worker.c>
StartServers 4
MaxClients 400
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
</IfModule>
Full top output:
top - 23:44:36 up 1 day, 6:43, 4 users, load average: 379.14, 379.17, 377.22
Tasks: 1153 total, 379 running, 774 sleeping, 0 stopped, 0 zombie
Cpu(s): 71.9%us, 26.2%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 1.9%si, 0.0%st
Mem: 70343000k total, 23768448k used, 46574552k free, 527376k buffers
Swap: 0k total, 0k used, 0k free, 10054596k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1756 mysql 20 0 10.2g 1.8g 5256 S 19.8 2.7 904:41.13 mysqld
21515 apache 20 0 396m 18m 4512 R 2.1 0.0 0:34.42 httpd
21524 apache 20 0 396m 18m 4032 R 2.1 0.0 0:32.63 httpd
21544 apache 20 0 394m 16m 4084 R 2.1 0.0 0:36.38 httpd
21643 apache 20 0 396m 18m 4360 R 2.1 0.0 0:34.20 httpd
21817 apache 20 0 396m 17m 4064 R 2.1 0.0 0:38.22 httpd
22134 apache 20 0 395m 17m 4584 R 2.1 0.0 0:35.62 httpd
22211 apache 20 0 397m 18m 4104 R 2.1 0.0 0:29.91 httpd
22267 apache 20 0 396m 18m 4636 R 2.1 0.0 0:35.29 httpd
22334 apache 20 0 397m 18m 4096 R 2.1 0.0 0:34.86 httpd
22549 apache 20 0 395m 17m 4056 R 2.1 0.0 0:31.01 httpd
22612 apache 20 0 397m 19m 4152 R 2.1 0.0 0:34.34 httpd
22721 apache 20 0 396m 18m 4060 R 2.1 0.0 0:32.76 httpd
22932 apache 20 0 396m 17m 4020 R 2.1 0.0 0:37.34 httpd
22933 apache 20 0 396m 18m 4060 R 2.1 0.0 0:34.77 httpd
22949 apache 20 0 396m 18m 4060 R 2.1 0.0 0:34.61 httpd
22956 apache 20 0 402m 24m 4072 R 2.1 0.0 0:41.45 httpd
There are entire books written on this topic, but to keep it simple:
Run the database in a separate tier
Your database and web server workloads are entirely different, and on one machine they compete for the same resources. It's best to keep them separate; this will also help you scale out in the future.
Isolate static and dynamic content
Consider serving static content from a faster web server such as nginx. If you can, ditch Apache entirely and run nginx everywhere.
KeepAlive will definitely help
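A lot of your resources are being burned by the setup and teardown of connections. With KeepAlive off, each of the 15-20 graphics, scripts and stylesheets on a page needs its own new connection.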
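As a rough sketch of that change against the configuration in the question: the MaxKeepAliveRequests value is kept from the question, while the shorter KeepAliveTimeout is an assumed starting point, not a measured recommendation. Under prefork an idle keep-alive connection holds a whole process until it times out, so a short timeout is the usual compromise.

```apache
# Sketch only: tune these values against your own measurements.
# Reuse one connection for several requests from the same client.
KeepAlive On
# Cap on requests served per connection (value taken from the question).
MaxKeepAliveRequests 200
# Seconds to wait for the next request before closing the connection.
# Assumed value: kept low because prefork ties up a process while waiting.
KeepAliveTimeout 2
```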
WebPageTest
For more good advice, I highly recommend http://www.webpagetest.org/. It will show you why the site is taking a long time to load and offers a number of best-practice tips for fixing performance: enabling gzip compression, minifying JavaScript and CSS, and so on. Give it a try.
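For the gzip tip, a minimal Apache sketch might look like the following. It assumes mod_deflate is loaded (and, on Apache 2.4, mod_filter, which provides AddOutputFilterByType there); the list of MIME types is illustrative, not exhaustive.

```apache
# Sketch only: compress text-based responses before sending them.
<IfModule mod_deflate.c>
    # Images are already compressed, so only text formats are listed.
    AddOutputFilterByType DEFLATE text/html text/plain text/css
    AddOutputFilterByType DEFLATE application/javascript text/javascript
</IfModule>
```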
It looks to me like you're using the prefork MPM. Most of this answer assumes as much.
Is one process per connection normal? For prefork, yes.
Are you sure these processes are not doing anything? With your MaxSpareServers setting you should have at most 50 idle processes. Enabling mod_status and setting ExtendedStatus On will let you view the Apache scoreboard and see what each process is doing.
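A minimal sketch of that mod_status setup, assuming the module is already loaded (the LoadModule line and its path vary by distro). The /server-status path is the conventional choice, and the access rule shown is Apache 2.4 syntax, with the 2.2 equivalent in comments.

```apache
# Sketch only: ExtendedStatus belongs in the main server config, not inside <Location>.
ExtendedStatus On

<Location /server-status>
    SetHandler server-status
    # Apache 2.4: allow requests from the server itself only.
    Require local
    # Apache 2.2 equivalent:
    #   Order deny,allow
    #   Deny from all
    #   Allow from 127.0.0.1
</Location>
```

Requesting /server-status?auto from the server itself returns a machine-readable version that includes the BusyWorkers and IdleWorkers counts and the scoreboard string, which is convenient for graphing.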
Turning on KeepAlive is a good idea. It lets clients send several requests over a single connection, so Apache processes are reused more efficiently.
As with most tuning, measure first to create your baseline, then change one thing and re-measure to determine what effect your change had. Using (and graphing) mod_status is handy for this.
You may be able to use the worker MPM, which tends to help with performance. However, some libraries (notably some PHP libraries) do not operate well with the worker MPM. YMMV.
To determine which MPM you're using, run: apache2 -V (or httpd -V, depending on the distro)