I have an Ubuntu Apache/PHP server handling approximately 100 hits/sec, with a PHP cron job running in the background.
Occasionally one of the Apache processes shows high CPU load that stays high regardless of traffic or cron activity. It seems to me that it's stuck in some kind of loop.
Below you will find the top and strace info.
How can I find where the bad code is and what causes this?
top - 14:45:24 up 3 days, 3:38, 1 user, load average: 5.10, 5.88, 5.85
Tasks: 163 total, 5 running, 158 sleeping, 0 stopped, 0 zombie
Cpu(s): 47.8%us, 18.5%sy, 0.0%ni, 10.2%id, 0.0%wa, 0.0%hi, 1.8%si, 21.6%st
Mem: 7885012k total, 3858484k used, 4026528k free, 177444k buffers
Swap: 0k total, 0k used, 0k free, 1037868k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10736 www-data 20 0 769m 559m 478m R 69 7.3 29:08.30 apache2
10844 www-data 20 0 824m 601m 492m S 17 7.8 4:37.90 apache2
1016 root 20 0 242m 25m 4628 S 6 0.3 162:07.93 scalarizr
9030 www-data 20 0 879m 619m 492m S 4 8.0 5:06.82 apache2
20216 www-data 20 0 747m 228m 170m S 4 3.0 0:01.94 apache2
10807 www-data 20 0 814m 584m 492m S 3 7.6 4:54.10 apache2
10455 www-data 20 0 831m 574m 492m S 3 7.5 4:32.65 apache2
10495 www-data 20 0 849m 592m 492m S 3 7.7 4:41.10 apache2
10884 www-data 20 0 840m 581m 492m S 3 7.6 4:25.06 apache2
^CProcess 10736 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
74.55 0.148052 1 109755 gettimeofday
25.36 0.050370 0 164634 clock_gettime
0.09 0.000178 0 54878 poll
------ ----------- ----------- --------- --------- ----------------
100.00 0.198600 329267 total
root@ec2-67-202-54-36:~# ^C
I would recommend enabling Apache mod_status and turning ExtendedStatus On. Slicehost has an excellent article on how to accomplish this (I would use the "elinks" package vs. "lynx", but that is a personal preference). When you view the Apache server-status URL, there will be PID, VHost and Request columns. They should go a long way towards pinpointing the URI being requested, which you can then trace back to the specific code being run.
Here is a customized version of the Slicehost article to enable mod_status:
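A minimal sketch of such a configuration, assuming a stock Ubuntu layout with Apache 2.2 (on Apache 2.4 the Order/Deny/Allow lines would become "Require local"). The helper name write_status_conf and the file path in the usage comment are illustrative assumptions, not taken from the Slicehost article.

```shell
# Hypothetical helper: write a mod_status config to the file given as $1.
write_status_conf() {
  cat > "$1" <<'EOF'
ExtendedStatus On
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>
EOF
}

# Typical use on the server (assumed path, run as root):
#   write_status_conf /etc/apache2/conf.d/server-status.conf
#   a2enmod status && apache2ctl configtest && service apache2 reload
```

Restricting access to 127.0.0.1 keeps the status page, which lists client IPs and request URIs, off the public site.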
Then to view the server-status:
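One way to do this from the server itself is to dump the page with elinks and keep only the row for the busy process. This is a sketch: rows_for_pid is a made-up helper, and it assumes the PID is the second column of the ExtendedStatus table, as in the usual Srv / PID / Acc / M / CPU ... layout.

```shell
# Hypothetical helper: read a text dump of /server-status on stdin and
# print only the rows whose second column (the PID) matches $1.
rows_for_pid() {
  awk -v pid="$1" '$2 == pid'
}

# Usage on the server, with the PID taken from top:
#   elinks -dump http://localhost/server-status | rows_for_pid 10736
```

The Request column of the matching row shows which URI that process is serving.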
Holy Mahoney! Your Apache seems to consume way too much memory. What the heck is it running? Are there tons of Apache modules loaded? Do you have mod_security with not-so-memory-friendly rules in use? Is your site running something from hell like Magento? Also, something makes your PHP script really curious about the current time. :D
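The remark about the current time comes from the strace summary in the question, where gettimeofday and clock_gettime dominate. A small sketch for ranking syscalls by share of calls from saved "strace -c" output follows. The helper name syscall_share is made up for illustration, and it assumes the summary layout shown in the question (call count in the fourth column, syscall name last).

```shell
# Hypothetical helper: read an `strace -c` summary on stdin and print each
# syscall with its percentage of all calls, busiest first.
syscall_share() {
  awk '$1 ~ /^[0-9.]+$/ && $NF != "total" { calls[$NF] = $4; sum += $4 }
       END { for (s in calls) printf "%s %.1f\n", s, 100 * calls[s] / sum }' |
    sort -k2,2nr
}

# Usage: save the summary printed by `strace -c -p 10736` to a file, then
#   syscall_share < strace-summary.txt
```

Fed the summary from the question, this gives about 50% clock_gettime, 33% gettimeofday and 17% poll, which suggests a polling loop that keeps checking the clock.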
Anyway, for PHP profiling you can use Xdebug, and for analyzing the results you can use, for example, KCacheGrind, which shows them in a graphical, easy-to-read form.
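A minimal sketch of turning on the profiler, assuming Xdebug 2.x (Xdebug 3 uses different setting names) and a Debian-style PHP 5 layout. The helper name and the ini path in the usage comment are illustrative assumptions.

```shell
# Hypothetical helper: append Xdebug 2 profiler settings to the ini file in $1.
# With the trigger setting, only requests that ask for it are profiled.
enable_xdebug_profiler() {
  cat >> "$1" <<'EOF'
xdebug.profiler_enable_trigger = 1
xdebug.profiler_output_dir = /tmp
EOF
}

# Usage (assumed path, run as root, then reload Apache):
#   enable_xdebug_profiler /etc/php5/apache2/conf.d/xdebug.ini
# Request a page with ?XDEBUG_PROFILE=1 appended, then open the resulting
# /tmp/cachegrind.out.* file in KCacheGrind.
```

Using the trigger rather than profiling every request matters on a server doing around 100 hits/sec, since profiling adds noticeable overhead.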
For real-time performance analysis, modern Linux distros have the perf command. Its "perf top" mode is like the traditional top, but you can drill down into a single process and, if you're willing, see what's going on at the assembly level.
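A rough sketch of pointing perf at the busy Apache child. The wrapper name perf_hotspots and the /tmp/perf.data path are assumptions for illustration; the PID is the one from the top output in the question.

```shell
# Hypothetical wrapper: sample one process for N seconds (default 30),
# then print the start of a text report of the hottest symbols.
perf_hotspots() {
  pid=$1
  secs=${2:-30}
  perf record -g -p "$pid" -o /tmp/perf.data -- sleep "$secs" &&
    perf report -i /tmp/perf.data --stdio | head -n 40
}

# Usage (as root): perf_hotspots 10736 30
# For a live, top-like view instead: perf top -p 10736
```

Note that perf sees native symbols, so with mod_php it will show functions inside PHP and its extensions rather than the names of your PHP functions.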
There are several options you can use for isolating your issue.
Obviously you want to check your logs and identify if there are any issues that are being reported.
Pressing c while top is running shows the full command line of each process, which gives you additional data on the processes that are using so much CPU.
As dialtOne mentioned, you can enable mod_status to get additional details as well.
I am not sure how you are using PHP, but installing memcache and APC on the machines will provide some additional resource savings. With memcache you will need to change your code to check its cache first before doing a database lookup. This can save a great deal of overhead on heavily accessed sites with many recurring database lookups.
Adjusting the memory settings for PHP and for whatever database you use could also help control load. These live in your php.ini and in your database's config file.
If you are using database calls you can look for slow queries.
Other options are increasing the number of child processes and the number of requests handled per child. Both can be configured in httpd.conf (apache2.conf on Ubuntu).
A load of 5 isn't the worst, depending on how many processors you have. On some of our larger web servers I have seen really high loads with the site still being delivered fine. It really comes down to how much energy you want to spend tweaking the site.
Best of luck!
You should use a debugger, for example Xdebug, to step through your program and find the infinite loop.