I have an Ubuntu server that I suspect failed due to overheating. I'm doing a post-mortem and I'm not exactly sure what to look for to confirm my hunch.
Any thoughts on what logged information would indicate a failure from overheating?
If you did not have sensord from the lm-sensors package running, you probably would never know for sure. Maybe you could try looking at side-channel data like the SMART attribute logging done by smartd (smartmontools): it logs attribute changes to syslog, and those may include disk temperature, which in turn would allow an educated guess about the system's temperature.
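As a rough sketch of that idea, the Python below pulls smartd temperature-attribute changes out of syslog lines. It assumes the common smartd message shape ("SMART Usage Attribute: 194 Temperature_Celsius changed from X to Y") and that the drive reports temperature under attribute 190 or 194; both vary by drive and configuration. The function name and regex are my own, not part of smartmontools.

```python
import re

# Assumed shape of a smartd attribute-change line in syslog, e.g.:
#   ... smartd[1234]: Device: /dev/sda [SAT], SMART Usage Attribute:
#   194 Temperature_Celsius changed from 112 to 111
# Attributes 190 and 194 are the usual temperature attributes, but not
# every drive exposes them.
SMARTD_TEMP_RE = re.compile(
    r"Device: (?P<device>\S+).*?SMART Usage Attribute: "
    r"(?P<attr_id>190|194) (?P<name>\S+) "
    r"changed from (?P<old>\d+) to (?P<new>\d+)"
)


def find_temperature_changes(lines):
    """Return (device, attribute name, old value, new value) per matching line."""
    results = []
    for line in lines:
        match = SMARTD_TEMP_RE.search(line)
        if match:
            results.append(
                (
                    match["device"],
                    match["name"],
                    int(match["old"]),
                    int(match["new"]),
                )
            )
    return results
```

To use it, open the log (typically /var/log/syslog on Ubuntu, plus any rotated copies) and pass the file object to find_temperature_changes. Keep in mind that smartd may be logging normalized attribute values rather than degrees Celsius, so look at the trend leading up to the failure rather than the absolute numbers.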