First instance: we had a CentOS 5.4 (64-bit) server with plenty of resources, installed Hudson (http://wiki.hudson-ci.org/display/HUDSON/Meet+Hudson), and everything was hunky-dory. Several days or weeks later (can't remember which), the entire server started freezing at random, requiring a hard reboot. Nothing was running on it other than what Hudson required.
New gig: a freshly installed CentOS 5.5 (64-bit) server. Within a month or so, the freezing has started again, for no apparent reason.
We have identical servers running all over the place, serving everything from Tomcat to JBoss to basic Apache stuff, all without ever freezing or crashing.
It seems Hudson is the problem - we just can't figure out what it does differently from typical configs.
So, two questions:
- Any Hudson experts out there want to chime in?
- Troubleshooting: Which logs should we be looking at? Where might we find an entry that says "X caused the system to crash" or something similar?