I have a Linux server I've just set up (Debian squeeze, kernel 2.6.32-5-amd64), and over the past week it has rebooted three times, twice in one day. There was no power outage that I am aware of (and it's running on a UPS), and there are no errors in syslog, apart from a few expected ones at boot about clearing out entries in the ext4 journal after the unclean shutdown.
What steps can I take to determine the cause of the reboots? Is there a way to get it to hang instead of rebooting, so I can copy stack traces or something off the screen? Any way to increase debug messages, or get it to dump things to disk, or something?
That may be a hardware problem; the most common causes are failing RAM and overheating. You could install mbmon to monitor motherboard and CPU temperature, and run memtest86+ to check your RAM and CPU cache.
There is a chance it is a kernel panic, in which case a kernel 'oops' message is sent to the console before the reboot. The kernel can be configured either to reboot on panic or to stay halted. Check:
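A minimal sketch of that check, assuming the setting meant here is the kernel.panic sysctl, which Linux exposes as /proc/sys/kernel/panic (the path is my assumption, not something stated above). The commented lines illustrate the three ways of changing the value and all need root:

```shell
# Assumed location of the panic timeout (kernel.panic sysctl).
panic_file=/proc/sys/kernel/panic

# 0 means "stay halted after a panic"; a positive N means "reboot after N seconds".
if [ -r "$panic_file" ]; then
    panic_timeout=$(cat "$panic_file")
else
    panic_timeout=unknown
fi
echo "kernel.panic = $panic_timeout"

# To keep the machine halted after a panic (as root), any one of:
#   echo 0 > /proc/sys/kernel/panic               # lasts until the next boot
#   sysctl -w kernel.panic=0                      # same effect, via sysctl
#   echo 'kernel.panic = 0' >> /etc/sysctl.conf   # applied again on each boot
```

With the value at 0, a panic should leave the machine stopped, so whatever was printed on the console can be copied down.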
If it is non-zero, try setting it to 0 (you can do that by writing directly to the file, via /etc/sysctl.conf, which is usually parsed on boot, or with the sysctl utility); this should stop the rebooting. If it is already 0, then the reboots are probably not caused by kernel panics.
Check the output of last. Look for reboot entries. Try to correlate them with who was logged in, if anyone, and who has superuser privileges. If it was not a user, you may have power or heat issues, or some type of kernel panic. Try to rule those out one by one.
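A rough sketch of that check, assuming the stock last utility and an intact wtmp log. The -x flag also lists shutdown records, so a reboot entry with no shutdown entry just before it points to a crash or power loss rather than someone rebooting the machine on purpose:

```shell
# Collect recent reboot and shutdown records, newest first.
if command -v last >/dev/null 2>&1; then
    reboot_log=$(last -x reboot shutdown 2>/dev/null | head -n 20)
else
    reboot_log=""   # last is not installed on this system
fi
printf '%s\n' "$reboot_log"
```

Running plain last around the same timestamps then shows which users, if any, were logged in when each reboot happened.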