I have trouble understanding what may be causing the occasional hangs the server suffers during sudden high load spikes. I am not a system administrator (I'm a PHP programmer), but since the official sysadmin is putting in little effort, I've been asked to find a solution myself.
The server runs Debian Lenny and serves, via Apache, a WordPress + vBulletin based site with 40-60k visits/day. Having done all the application-side optimization I could, we have reached a point where the site runs smoothly for weeks, then trips on something that makes the server load jump to 80+. Stopping and restarting Apache helps, but the load usually calms down by itself if given enough time. It can "crash" twice in a day, or show no problems for weeks. It seems to be totally random.
One weird specific thing happened though. I was warned of strange behavior, and after inspection I found the .htaccess file changed to redirect traffic coming from search engines to some external site. I checked the code and every plugin (all up to date) and finally tried the "hard way": chowning .htaccess to root.root. The weird part is that when another issue came up, I found that file changed back to be owned by the user assigned to the website virtualhost. I understand there is no way for this to happen just via some web exploit, or am I mistaken?
How can I find the cause of these high load spikes?
What can explain a root.root file changing ownership, other than someone with root access doing it?
Could these two things be linked to some kind of attack?
Regarding the Apache issue, one possible cause is that your MaxClients/ServerLimit settings are too high. When you get a spike in traffic, Apache spawns enough processes to use up all your RAM and the system starts swapping, which will very quickly kill your performance. Next time you have the issue, check the output of top/free and see if any swap is being used. If it is, try reducing the MaxClients/ServerLimit values.
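To make the swap check concrete, here is a minimal sketch rather than a definitive tuning method. The function names swap_used_kb and max_clients_estimate are made up for illustration, and the idea of dividing spare RAM by the size of one Apache child is a common rule of thumb, not something measured on this server.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of any tool).

# swap_used_kb: reads /proc/meminfo-style text on stdin and prints
# the amount of swap currently in use, in kB (SwapTotal - SwapFree).
swap_used_kb() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END { print total - free }'
}

# max_clients_estimate RAM_KB RESERVED_KB PER_CHILD_KB
# Rough ceiling for MaxClients: RAM left after reserving some for the
# OS, database, etc., divided by the average size of one Apache child.
max_clients_estimate() {
    echo $(( ($1 - $2) / $3 ))
}
```

During a spike, `swap_used_kb < /proc/meminfo` printing a large or growing number would support the swapping theory. As an example with assumed figures, `max_clients_estimate 4096000 1024000 30720` (4 GB of RAM, 1 GB reserved, 30 MB per child) prints 100. Per-child size taken from RSS counts shared pages more than once, so the estimate tends to err on the low, safe side.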
I also had an issue with Apache 1.3 where some connections would never close, and after a few days 90% of the connections would be doing nothing, leaving no clients free to handle new incoming connections. I solved it by simply restarting Apache each day. From the sound of it, your issues come and go too quickly and too irregularly for this to be a likely cause.
It sounds to me as if someone may have compromised your server, more because of the redirection of search engine traffic to a different site than because of the file ownership issue. I'm afraid that some web exploits can give an attacker root access to your system.
I would download and run a rootkit detector such as Rootkit Hunter. If you have been rooted, you'll probably want to get someone experienced to help you to fix it.
It may be an attack, but then why change only the .htaccess file? There are probably more things to check. Maybe a PHP script is generating the file automatically. What are the permissions on the file?
Regarding the load, it may be a lot of things, especially given its random nature. Watch the logs and use tools (webalizer, for example) to see if the load spikes coincide with heavy access times or with access to a specific resource that may be the cause of it.
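As a rough sketch of that log check without a full statistics tool, the two functions below summarize an access log by minute and by requested path. The names requests_per_minute and top_paths are hypothetical, and the sketch assumes the log uses Apache's common or combined format, where the fourth whitespace-separated field is the timestamp and the seventh is the request path.

```shell
#!/bin/sh
# Hypothetical helpers; both read an access log on stdin.
# Assumes lines like:
# 1.2.3.4 - - [10/Oct/2010:13:55:36 +0200] "GET /x HTTP/1.1" 200 10

# requests_per_minute: prints "count dd/Mon/yyyy:hh:mm", busiest first.
requests_per_minute() {
    # Field 4 is "[10/Oct/2010:13:55:36"; skip the "[" and keep 17
    # characters, which drops the seconds.
    awk '{ print substr($4, 2, 17) }' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'
}

# top_paths: prints "count path", most requested first.
top_paths() {
    awk '{ print $7 }' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'
}
```

Something like `requests_per_minute < access.log | head` shows whether the minutes around a load spike also had unusually many requests, and `top_paths` run over the same time window (for instance after a `grep` on the timestamp) shows whether one resource dominates. The log file name is an assumption; use whatever path your virtualhost logs to.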
Check your server with chkrootkit and similar tools to check for compromise. If it happened, you may need/want to reinstall your server from scratch.
On its own this wouldn't account for the permissions of a root:root owned file changing - but a web compromise might enable an attacker to deploy malware on the system which could compromise the system in other ways. And there are other ways to compromise a system.
If you're confident that your code has not been modified since you last checked, then it's time to eliminate everything else. Using rootkit checkers is a good idea on a system you believe to be secure, but if you've got doubts you should reformat/reinstall.
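One way to back up that confidence is to look at what changed recently and at what the .htaccess file actually matches on. This is only a sketch: the function names are hypothetical, the pattern covers just the common case of rewrite conditions keyed on the referer or user agent, and an attacker with root access can fake modification times, so a clean result proves nothing.

```shell
#!/bin/sh
# Hypothetical helpers for a first-pass inspection of a web root.

# referer_rules FILE: print (with line numbers) rewrite conditions that
# test the referer or user agent, the usual way a malicious .htaccess
# singles out visitors arriving from search engines.
referer_rules() {
    grep -n -i -E 'RewriteCond[[:space:]]+%\{HTTP_(REFERER|USER_AGENT)\}' "$1"
}

# recently_changed DIR DAYS: list regular files under DIR whose
# contents were modified less than DAYS days ago.
recently_changed() {
    find "$1" -type f -mtime "-$2"
}
```

For example, `recently_changed /path/to/docroot 7` (the path is a placeholder) lists files touched in the last week, and `referer_rules /path/to/docroot/.htaccess` shows any conditional redirect rules. Comparing the file list against a known-good copy of the code is more reliable than timestamps alone.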
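On the webserver, run
# ps -uapache -o wchan=WIDE-WCHAN-COLUMN,cmd
and look for lines similar to "flock_lock_file_w /usr/sbin/httpd", that is, with flock_lock_file_w in the first field. (Adjust the user and binary names to your distribution; on Debian, Apache normally runs as www-data.) What is your session.save_handler?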
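With dozens of Apache processes, the ps output is easier to read as a tally of what the processes are waiting on. The function below is a small sketch with a hypothetical name, wchan_histogram; it assumes input in the shape produced by the ps command above, with one header line and the wait channel in the first column.

```shell
#!/bin/sh
# Hypothetical helper: summarize the first column of
# "ps -o wchan=...,cmd" output read from stdin.

# wchan_histogram: prints "count wchan", most common first.
wchan_histogram() {
    # NR > 1 skips the header line printed by ps.
    awk 'NR > 1 { print $1 }' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'
}
```

Piping the ps command into `wchan_histogram` during a spike shows at a glance whether most processes are stuck in flock_lock_file_w. If they are, file locking is a plausible bottleneck, and PHP's file-based session handler (session.save_handler = files), which locks each session file while a request uses it, would be one candidate worth checking.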