We've had a CentOS server running in EC2 for a few months. It had been running smoothly until today, when we got an alarm that the server was unavailable (the HTTP service couldn't be reached). I tried to SSH into the box, but that timed out as well. I logged into the EC2 console, which showed the instance as running, and there was nothing in the system log. One odd thing I noticed: even though we have an Elastic IP attached to it (it shows in the Elastic IP management area), the instance details don't show an EIP associated with the instance.
I looked through the message log, and the last entry around the time of our alert was dhclient renewing its lease. I'm guessing there may have been some sort of networking issue.
How might I check if that was the problem, or if there were any other issues that may have caused our instance to stop responding?
In short, you can't. If you have Gold support, you can open a ticket, and sometimes they will give you a little more information; otherwise, all you can do is terminate the instance and start a new one.
We experience failures like this from time to time with EC2. We just keep instances on standby ready to take over should this arise.
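A minimal sketch of that kind of standby takeover, in Python, assuming a simple "N consecutive failed probes" rule. The names `FailoverWatchdog` and `http_probe`, the threshold of 3, and the `failover` callback are all hypothetical and only for illustration; the callback is where you would do the actual switch (for example, re-associating the Elastic IP with the standby instance through the EC2 API).

```python
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:
        # Covers connection errors, timeouts and HTTP error statuses
        # (URLError and HTTPError are both subclasses of OSError).
        return False


class FailoverWatchdog:
    """Call `failover` once after `threshold` consecutive failed probes."""

    def __init__(self, probe: Callable[[], bool],
                 failover: Callable[[], None], threshold: int = 3):
        self.probe = probe
        self.failover = failover      # hypothetical hook: e.g. move the EIP to the standby
        self.threshold = threshold
        self.failures = 0
        self.failed_over = False

    def tick(self) -> bool:
        """Run one health check; return True once failover has been triggered."""
        if self.failed_over:
            return True               # already switched, nothing more to do
        try:
            healthy = self.probe()
        except Exception:
            healthy = False           # a crashing probe counts as a failed check
        if healthy:
            self.failures = 0         # any success resets the streak
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.failover()
                self.failed_over = True
        return self.failed_over
```

You would call `tick()` on a timer from a machine other than the one being watched, with something like `probe=lambda: http_probe("http://example.com/health")` (a placeholder URL). Requiring several consecutive failures avoids failing over on a single dropped request.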
Instead of keeping an instance on standby, wouldn't an appropriate autoscale metric achieve the same result?