My dedicated server has had a problem since I rebooted it. The load average is very high, as shown in this output from the top command:
top - 23:40:41 up 50 min, 3 users, load average: 236.24, 146.96, 124.29
Tasks: 556 total, 1 running, 555 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.2%us, 0.2%sy, 0.0%ni, 0.0%id, 98.6%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 16230212k total, 2994040k used, 13236172k free, 26404k buffers
Swap: 2097144k total, 0k used, 2097144k free,
I tried to stop httpd. It reported 'OK', but when I run "service httpd status" it still shows as running.
There are many httpd-related processes when I run "ps -ef | grep httpd":
apache 7984 7209 0 23:42 ? 00:00:00 /usr/sbin/httpd -k start -DSSL
apache 7985 7209 0 23:42 ? 00:00:00 /usr/sbin/httpd -k s
I have no idea what these are, but they keep respawning every second (the PIDs climb very fast).
When I dug into the system log (/var/log/messages), I found some entries that may be related to the hard disk. I'm not quite sure; are they?
Nov 10 00:16:13 host kernel: ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Nov 10 00:16:13 host kernel: ata1.00: irq_stat 0x40000008
Nov 10 00:16:13 host kernel: ata1.00: failed command: READ FPDMA QUEUED
Nov 10 00:16:13 host kernel: ata1.00: cmd 60/08:00:f0:e1:4a/00:00:6b:00:00/40 tag 0 ncq 4096 in
Nov 10 00:16:13 host kernel: res 41/40:08:f0:e1:4a/00:00:6b:00:00/00 Emask 0x409 (media error) <F>
Nov 10 00:16:13 host kernel: ata1.00: status: { DRDY ERR }
Nov 10 00:16:13 host kernel: ata1.00: error: { UNC }
Nov 10 00:16:13 host kernel: ata1.00: configured for UDMA/133
Nov 10 00:16:13 host kernel: ata1: EH complete
Please advise me on what I should do next to get my server back to normal.
Best Regards,
I assume you're running Red Hat (because of 'httpd'). It looks like your hard drive is failing. I suggest you install smartmontools and run a SMART check on your drive(s) to confirm.
Then check your first drive, for example.
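As a rough sketch of that check, assuming the smartmontools package is installed (on Red Hat style systems, typically via `yum install smartmontools`) and that the first drive is /dev/sda, something like the following could be used. The function name `check_drive` and the device name are illustrative assumptions, not anything specific to your machine:

```shell
# Run SMART checks on one drive.
# Defaults to /dev/sda, which is only an assumed name for the first drive.
check_drive() {
    dev="${1:-/dev/sda}"
    # Overall health self-assessment reported by the drive
    smartctl -H "$dev"
    # Full report: identity, attributes, error log, self-test log
    smartctl -a "$dev"
}
```

Run it as root, for example `check_drive /dev/sda`, and substitute the device that matches ata1 on your system.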
This will output a whole boatload of information. Pay attention near the bottom, where you may see something like
This means your drive is failing and you should back up and replace the drive ASAP. If you post the output, we can take a more detailed look.
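While waiting on the SMART output, it may also help to see how often the kernel has logged these read failures. A minimal sketch, assuming the log lives at /var/log/messages and that each failure carries the "(media error)" marker seen in your excerpt (the function name `count_media_errors` is made up for illustration):

```shell
# Count kernel log lines flagged as media errors.
# Defaults to /var/log/messages; pass another file to override.
count_media_errors() {
    log="${1:-/var/log/messages}"
    # grep -c prints the number of matching lines (0 if there are none)
    grep -c 'media error' "$log"
}
```

A count that keeps growing between runs would suggest the drive is still hitting unreadable sectors.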
Your hard drive has failed. Replace the defective hard drive.
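The 98.6%wa figure in the top output is consistent with this: on Linux, processes blocked in uninterruptible disk I/O count toward the load average, which would likely also explain why httpd does not stop cleanly. A small sketch for listing such processes, assuming the procps version of `ps` (the function name `list_blocked` is illustrative):

```shell
# List processes in uninterruptible sleep (state D), usually waiting on disk I/O.
list_blocked() {
    # First column is the process state; keep only rows whose state starts with D
    ps -eo state,pid,comm | awk '$1 ~ /^D/'
}
```

If most of the httpd processes show up here, they are probably stuck on reads from the failing disk rather than misbehaving on their own.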