Ping a Specific Port

Question

Alexey

Asked: 2014-04-10 11:52:08 +0800 CST2014-04-10 11:52:08 +0800 CST 2014-04-10 11:52:08 +0800 CST

Apache 2.4.9 100% CPU load of random workers

772

I own a t1.micro EC2 instance. After release upgrade of Ubuntu to saucy I got Apache 2.4.6 and I started noticing 100% CPU load like this

top - 19:37:58 up  2:55,  2 users,  load average: 3.90, 2.90, 1.82
Tasks:  95 total,   4 running,  91 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3.9 us,  7.2 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si, 88.9 st
KiB Mem:    629976 total,   588412 used,    41564 free,    39412 buffers
KiB Swap:  2097144 total,        0 used,  2097144 free,   326932 cached

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
 6688 www-data  20   0  114m  17m  12m R  34.3  2.9   3:07.74 apache2
 6719 www-data  20   0  113m  10m 6052 R  33.3  1.8   3:16.99 apache2
 6721 www-data  20   0  113m 9.8m 5004 R  32.0  1.6   3:04.13 apache2

I decided to install another Apache version from this ppa, so I got 2.4.9 now, but the problem is still there.

What I tried: 1. removed all modules and added only necessary ones. At the moment the loaded ones are:

lrwxrwxrwx  1 root root   36 Apr  9 19:16 access_compat.load -> ../mods-available/access_compat.load
lrwxrwxrwx  1 root root   28 Apr  9 19:17 alias.conf -> ../mods-available/alias.conf
lrwxrwxrwx  1 root root   28 Apr  9 19:17 alias.load -> ../mods-available/alias.load
lrwxrwxrwx  1 root root   33 Apr  9 19:18 authz_core.load -> ../mods-available/authz_core.load
lrwxrwxrwx  1 root root   38 Apr  9 19:18 authz_groupfile.load -> ../mods-available/authz_groupfile.load
lrwxrwxrwx  1 root root   33 Apr  9 19:18 authz_host.load -> ../mods-available/authz_host.load
lrwxrwxrwx  1 root root   33 Apr  9 19:18 authz_user.load -> ../mods-available/authz_user.load
lrwxrwxrwx  1 root root   26 Apr  9 19:18 dir.conf -> ../mods-available/dir.conf
lrwxrwxrwx  1 root root   26 Apr  9 19:18 dir.load -> ../mods-available/dir.load
lrwxrwxrwx  1 root root   27 Apr  9 19:19 mime.conf -> ../mods-available/mime.conf
lrwxrwxrwx  1 root root   27 Apr  9 19:19 mime.load -> ../mods-available/mime.load
lrwxrwxrwx  1 root root   34 Apr  9 19:11 mpm_prefork.conf -> ../mods-available/mpm_prefork.conf
lrwxrwxrwx  1 root root   34 Apr  9 19:11 mpm_prefork.load -> ../mods-available/mpm_prefork.load
lrwxrwxrwx  1 root root   27 Apr  9 14:39 php5.conf -> ../mods-available/php5.conf
lrwxrwxrwx  1 root root   27 Apr  9 14:39 php5.load -> ../mods-available/php5.load
lrwxrwxrwx  1 root root   30 Apr  9 19:19 rewrite.load -> ../mods-available/rewrite.load
lrwxrwxrwx  1 root root   29 Apr  9 19:20 status.conf -> ../mods-available/status.conf
lrwxrwxrwx  1 root root   29 Apr  9 19:20 status.load -> ../mods-available/status.load

reinstalling Apache to a newer version (see above, I'm on 2.4.9 now and still it is there)
stopping and starting the instance (hoping it would get onto another hardware) - no luck
The site I'm testing with is WP-based with W3TC plugin installed, I disabled minification of W3TC static files. I decide to do so because I noticed that it stucks sometimes on minified files in status module.
upgrading WP to the latest version (3.8.2) - nothing changed
Looking now at status module output, I'm seeing that there is no pattern as to what requests are stuck - absolute random. For example: http://pastebin.com/JxLbbzCB - see that POST /wp-admin/admin-ajax.php has been stuck for 62 seconds in W (sending reply), 5-0 request is also in W status for 250 seconds.

All these "W" workers load CPU by 100% and last really long causing EC2 instance to start CPU throttle.

UPDATE: here is strace info for one of hanged processes:

select(17, [16], [16], NULL, {1, 0})    = 1 (out [16], left {0, 999998})
send(16, "", 0, MSG_NOSIGNAL)           = 0
select(17, [16], [16], NULL, {1, 0})    = 1 (out [16], left {0, 999998})
send(16, "", 0, MSG_NOSIGNAL)           = 0
select(17, [16], [16], NULL, {1, 0})    = 1 (out [16], left {0, 999998})
send(16, "", 0, MSG_NOSIGNAL)           = 0
select(17, [16], [16], NULL, {1, 0})    = 1 (out [16], left {0, 999998})
send(16, "", 0, MSG_NOSIGNAL)           = 0
select(17, [16], [16], NULL, {1, 0})    = 1 (out [16], left {0, 999998})
send(16, "", 0, MSG_NOSIGNAL)           = 0
select(17, [16], [16], NULL, {1, 0})    = 1 (out [16], left {0, 999997})
send(16, "", 0, MSG_NOSIGNAL)           = 0
select(17, [16], [16], NULL, {1, 0})    = 1 (out [16], left {0, 999998})
send(16, "", 0, MSG_NOSIGNAL)           = 0
select(17, [16], [16], NULL, {1, 0})    = 1 (out [16], left {0, 999998})
send(16, "", 0, MSG_NOSIGNAL)           = 0
select(17, [16], [16], NULL, {1, 0})    = 1 (out [16], left {0, 999998})

and it lasts while the process is not killed

UPDATE: output of ps auxf|grep apache2 :

root@domU-12-31-39-02-26-E9:~# ps auxf|grep apache2
root      2761  0.0  0.1   4168   852 pts/1    S+   19:52   0:00                      \_ grep --color=auto apache2
root      2549  0.0  2.3 115720 15104 ?        Ss   19:50   0:00 /usr/sbin/apache2 -k start
www-data  2554  0.0  0.8 115800  5660 ?        S    19:50   0:00  \_ /usr/sbin/apache2 -k start
www-data  2555  0.2  3.6 117944 22872 ?        S    19:50   0:00  \_ /usr/sbin/apache2 -k start
www-data  2556  0.2  3.7 119252 23360 ?        S    19:50   0:00  \_ /usr/sbin/apache2 -k start
www-data  2557  9.9  1.2 115932  8068 ?        R    19:50   0:14  \_ /usr/sbin/apache2 -k start
www-data  2558  9.7  1.2 115932  8068 ?        R    19:50   0:14  \_ /usr/sbin/apache2 -k start
www-data  2562  0.0  0.8 115800  5660 ?        S    19:50   0:00  \_ /usr/sbin/apache2 -k start
www-data  2564  0.0  0.8 115800  5656 ?        S    19:50   0:00  \_ /usr/sbin/apache2 -k start
www-data  2566 20.7  1.2 115932  8044 ?        R    19:50   0:28  \_ /usr/sbin/apache2 -k start
www-data  2567 35.9  1.2 115932  8072 ?        R    19:50   0:49  \_ /usr/sbin/apache2 -k start
www-data  2568 10.8  1.2 115932  8080 ?        R    19:50   0:14  \_ /usr/sbin/apache2 -k start
www-data  2571  0.0  0.8 115800  5644 ?        S    19:51   0:00  \_ /usr/sbin/apache2 -k start
www-data  2572  0.0  0.8 115800  5644 ?        S    19:51   0:00  \_ /usr/sbin/apache2 -k start
www-data  2573  0.0  0.8 115800  5644 ?        S    19:51   0:00  \_ /usr/sbin/apache2 -k start
www-data  2574  0.0  0.7 115752  4900 ?        S    19:51   0:00  \_ /usr/sbin/apache2 -k start
root@domU-12-31-39-02-26-E9:~#

Output of tail /var/log/apache2/error.log (error logs configured for other vhosts are empty at the moment):

[Mon Apr 21 19:50:22.201343 2014] [:notice] [pid 2552] FastCGI: process manager initialized (pid 2552)
[Mon Apr 21 19:50:22.692477 2014] [mpm_prefork:notice] [pid 2549] AH00163: Apache/2.4.9 (Ubuntu) mod_fastcgi/mod_fastcgi-SNAP-0910052141 PHP/5.5.3-1ubuntu2.3 configured -- resuming normal operations
[Mon Apr 21 19:50:22.692580 2014] [core:notice] [pid 2549] AH00094: Command line: '/usr/sbin/apache2'

free -m

             total       used       free     shared    buffers     cached
Mem:           615        596         18          0        217        142
-/+ buffers/cache:        236        378
Swap:         2047         22       2025

df -h

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        30G   22G  6.1G  79% /
devtmpfs        299M   12K  299M   1% /dev
none             62M  208K   62M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            308M     0  308M   0% /run/shm
none            100M     0  100M   0% /run/user

2 Answers

Voted

taddy hoops · Answer 1 · 2014-04-16T20:34:53+08:00

taddy hoops

2014-04-16T20:34:53+08:002014-04-16T20:34:53+08:00

Heavy usage Apache 2.4 on a t1.mirco instance is like humpty dumpty sitting on a couple toothpicks. He's too big and heavy! So of course he's squashed the toothpicks to pieces and rolled down the hill.

You have at least two basic solutions to the root cause of this problem:

Get more server resources, and less steal time. Either with AWS, or go to Digital Ocean for value.
Stop using Apache 2 as the client facing daemon. Either migrate to something more lightweight like Lighttpd, Nginx, or implement a caching proxy layer like Varnish.

3

Alec Istomin · Answer 2 · 2014-04-17T16:58:41+08:00

Alec Istomin

2014-04-17T16:58:41+08:002014-04-17T16:58:41+08:00

Most likely the problem is memory shortage for apache. Apache doesn't have enough memory to to complete properly and thus get into unpredictable state resulting in the CPU loop.

I recommend to reduce the memory footprint for your VM (or move to a larger VM). Find the right setting for MaxClients by experimenting. Start with minimum setting and if this fixes CPU issue, increase MaxClients and MaxSpareServers until you find the balance.

Start with this (assuming you use prefork) and see if the issue persists:

ListenBacklog 10

<IfModule prefork.c>
StartServers         1
MinSpareServers      1
MaxSpareServers      1
MaxClients           1
MaxRequestsPerChild  1000
</IfModule>

-2

Apache 2.4.9 100% CPU load of random workers

Can you pass user/pass for HTTP Basic Authentication in URL parameters?

Ping a Specific Port

Check if port is open or closed on a Linux server?

How to automate SSH login with password?

How do I tell Git for Windows where to find my private RSA key?

What's the default superuser username/password for postgres after a new install?

What port does SFTP use?

Command line to list users in a Windows Active Directory group?

What is a Pem file and how does it differ from other OpenSSL Generated Key File Formats?

How to determine if a bash variable is empty?