I rent a Gentoo server with the usual LAMP stack (prefork Apache MPM) + suPHP.
From time to time, my server runs out of memory and slows down to a crawl (responds to pings, but it's practically impossible to log in, and keystrokes sent through SSH can take minutes to be echoed back, much less processed). Lots of oom_killer stuff in the system logs, too.
This is what I see in top during one of these moments:

    top - 16:45:05 up 22 days,  8:08,  3 users,  load average: 104.26, 103.87, 93.3
    Tasks: 393 total,   1 running, 388 sleeping,   0 stopped,   4 zombie
    Cpu(s):  4.6%us,  9.3%sy,  0.8%ni,  0.0%id, 84.8%wa,  0.0%hi,  0.5%si,  0.0%st
    Mem:   2042128k total,  1634392k used,   407736k free,     1792k buffers
    Swap:        0k total,        0k used,        0k free,    27724k cached

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
     3125 apache    20   0  288m 105m 1368 S    0  5.3   0:01.00 apache2
     2886 apache    20   0  285m 102m 1368 S    0  5.1   0:02.44 apache2
     3048 apache    20   0  279m  96m 1192 D    0  4.8   0:01.58 apache2
     3037 apache    20   0  278m  95m 1076 S    0  4.8   0:02.23 apache2
     3014 apache    20   0  278m  94m 1204 D    0  4.8   0:01.81 apache2
     2859 apache    20   0  274m  91m 1368 S    0  4.6   0:00.63 apache2
     3016 apache    20   0  269m  86m 1368 S    0  4.3   0:01.49 apache2
     2887 apache    20   0  269m  86m 1192 D    0  4.3   0:01.06 apache2
     2753 apache    20   0  269m  86m 1368 S    0  4.3   0:01.09 apache2
     3036 apache    20   0  266m  83m 1372 S    1  4.2   0:01.10 apache2
     3006 apache    20   0  266m  83m 1368 S    0  4.2   0:01.98 apache2
     3007 apache    20   0  265m  82m 1372 S    0  4.1   0:02.00 apache2
     3064 apache    20   0  264m  81m 1368 S    0  4.1   0:00.57 apache2
     3045 apache    20   0  263m  80m 1368 S    0  4.1   0:00.60 apache2
     2888 apache    20   0  263m  79m  416 S    0  4.0   0:01.09 apache2
     2862 apache    20   0  260m  77m 1368 S    1  3.9   0:01.95 apache2
     2891 apache    20   0  259m  76m 1332 D    0  3.9   0:01.98 apache2
     3046 apache    20   0  258m  75m 1080 S    0  3.8   0:01.20 apache2
     2873 apache    20   0  255m  72m 1380 S    0  3.6   0:01.51 apache2
     2987 apache    20   0  252m  69m 1368 S    0  3.5   0:01.04 apache2
     2666 apache    20   0  250m  67m 1368 S    0  3.4   0:00.72 apache2
     2903 apache    20   0  248m  66m 1368 S    0  3.3   0:01.02 apache2
     3013 apache    20   0  247m  63m  416 S    0  3.2   0:01.02 apache2
Note that PHP is running in CGI mode, so this is just Apache without any PHP modules.
Frankly, I don't understand what could make it this slow other than running out of RAM, yet top claims there are 400 MB of free RAM. The "84.8%wa" figure also indicates that the system is waiting on I/O operations (paging?).
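To make the numbers in that top header easier to reason about, here is a minimal Python sketch that only redoes the arithmetic on the figures quoted above. The variable and function names are mine, and the interpretation is a hedged reading rather than a diagnosis: a very small buffers + cached share is consistent with the kernel having already evicted almost all file-backed pages, which would fit the high I/O wait.

```python
# Figures copied from the "Mem:" and "Swap:" lines of the top output (KiB).
total_k = 2042128
used_k = 1634392
free_k = 407736
buffers_k = 1792
cached_k = 27724


def page_cache_share(total, buffers, cached):
    """Return the fraction of RAM currently holding buffers and page cache."""
    return (buffers + cached) / total


share = page_cache_share(total_k, buffers_k, cached_k)

# used + free accounts for all of RAM, so "free" is not an extra reserve.
print(f"used + free      = {used_k + free_k} KiB (total is {total_k} KiB)")
# Only a sliver of RAM is left for caching executables and libraries.
print(f"buffers + cached = {buffers_k + cached_k} KiB ({share:.1%} of RAM)")
```

If that reading is right, every process that gets scheduled has to re-read its code from disk, which would explain why even sshd and bash respond so slowly.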
Things I tried:
- Disabling swap, in the hope that anything that starts eating memory rampantly would just crash instead of grinding the server to a halt. This didn't work; it probably just began paging out memory-mapped files (executables and shared objects)
- Setting oom_adj of the root Apache process to 15
- Tweaking the MPM settings:

    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients           50
    MaxRequestsPerChild  10
    MaxMemFree         1024

For now I've reduced MaxClients to 25, but now page requests take several seconds to process, and some kid with FlashGet can theoretically clog all Apache processes and effectively make all websites inaccessible :/
Questions:
- Can anyone suggest some Apache configuration tweaks that could radically improve my situation?
- Is it possible to tell Linux not to swap/page out sshd, bash, and everything else I need in order to ssh in and kill runaway processes?
- If the answer to the above question is 'no', could someone please explain to me how it is that, in this day and age, modern operating systems are so horrendously flawed? Sounds like an epic fail in OS design to me :(
Your Apache processes each use about 80 MB of memory (RES column). This is huge: I have a web server here with Apache 2.2 used for running CGIs, and each process uses about 5 MB. Now, if you have 50 Apache processes, each using 80 MB, that adds up to roughly 4 GB on a 2 GB machine, so you're swapping like hell. Once we find out why you are consuming so much memory, we'll likely resolve the issue.
Can you post your Apache configuration file?
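The arithmetic behind this answer can be sketched in a few lines of Python. This is only a rough budget under stated assumptions: the 80 MB and 5 MB per-process figures and the 2042128k total come from this thread, while the 300 MB reserve for the kernel, sshd, and the suPHP CGI processes is a number I made up for illustration. RES also counts some shared pages, so the real limit may be somewhat higher.

```python
def max_workers(total_mb, reserved_mb, per_worker_mb):
    """Largest worker count whose combined RES fits in RAM after a reserve."""
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    return max(0, (total_mb - reserved_mb) // per_worker_mb)


total_mb = 2042128 // 1024  # "Mem: 2042128k total" from the question
reserved_mb = 300           # assumption: headroom for everything except Apache
per_worker_mb = 80          # approximate RES per apache2 process in the question

# Memory that MaxClients 50 could demand if every worker reached 80 MB.
demand_mb = 50 * per_worker_mb
print(f"MaxClients 50 could need {demand_mb} MB; the machine has {total_mb} MB")

# Worker count that fits at the observed size, and at the 5 MB per process
# reported for the CGI-only server in this answer.
print("workers that fit at 80 MB each:", max_workers(total_mb, reserved_mb, per_worker_mb))
print("workers that fit at 5 MB each: ", max_workers(total_mb, reserved_mb, 5))
```

Under these assumptions, capping MaxClients only treats the symptom; getting the per-process size down is what makes a usable number of workers fit.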