I have a server that has locked up twice recently.
What I want to do is catch any kernel panic/oops written to the serial console. I can SSH in and start the VSP, and I'm running my SSH session in bash under script, so anything written to the console should be logged by script.
The problem is that iLO times out and closes the SSH session.
A quick look in the web interface shows there is an iLO idle timeout, but it can only be set to a maximum of 120 minutes, not disabled.
Is there any way to stop iLO closing the SSH session so I can log what's put out over the virtual serial port for an extended period, and hopefully catch whatever panic/oops is spat out at these seemingly random lockups?
Are you seeing anything in the system's IML logs? Do you have the management agents installed for the hardware?
What specific OS and model of ProLiant server are you using? If you have the agents installed, you should be getting some information relevant to the lockups or possibly tripping the ASR watchdog timer.
For the VSP, have you made the necessary changes to the kernel command line and GRUB to output to the console?
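As a rough sketch of what that change usually looks like on Linux, the kernel parameters below send kernel messages to both the local display and a serial port. The port and speed are assumptions: ttyS1 at 115200 baud presumes the VSP is mapped to COM2, which depends on how the virtual serial port is configured in the server's BIOS setup, so check that before copying this.

```
# Assumed values: ttyS1 (COM2) at 115200 baud, 8 data bits, no parity.
# Append to the kernel line in your GRUB configuration.
console=tty0 console=ttyS1,115200n8
```

The kernel prints its messages to each console listed, so a panic or oops should reach the serial port as well as the screen.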
For iLO, the model and firmware revision make a difference. Current iLO 2 and iLO 3 firmware allows an Infinite idle timeout under Access Options.
If you can use PuTTY, maybe you can prevent a timeout from occurring...
I'm not sure if the iLO's SSH implementation supports it, but using the OpenSSH client and setting ServerAliveInterval may help. It defaults to 0 (disabled). I use it to keep my SSH connections active so they aren't dropped by NAT table timeouts. In ~/.ssh/config, put:
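The exact values are a matter of taste. As a minimal sketch, assuming a hypothetical host alias `ilo` and address `ilo.example.com` (substitute your own), something like this sends a keepalive every 60 seconds:

```
# Hypothetical host alias and address; replace with your iLO's details.
Host ilo
    HostName ilo.example.com
    # Send a keepalive through the encrypted channel after 60 seconds of silence.
    ServerAliveInterval 60
    # Give up after 3 unanswered keepalives.
    ServerAliveCountMax 3
```

With that in place, `ssh ilo` picks up the settings. Whether iLO counts these keepalives as activity for its own idle timer is not something I can confirm, so test it against your firmware.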