I have a GCE instance which has been running for several years. During the night, the instance was restarted with the following logs:
2022-02-13 04:46:36.370 CET compute.instances.hostError Instance terminated by Compute Engine.
2022-02-13 04:47:08.279 CET compute.instances.automaticRestart Instance automatically restarted by Compute Engine.
However, the instance did not come back up.
I can connect to the serial console where I see this:
serialport: Connected to ***.europe-west1-b.*** port 1 (
[ TIME ] Timed out waiting for device ***
[DEPEND] Dependency failed for File… ***.
[DEPEND] Dependency failed for /data.
[DEPEND] Dependency failed for Local File Systems.
[ OK ] Stopped Dispatch Password …ts to Console Directory Watch.
[ OK ] Stopped Forward Password R…uests to Wall Directory Watch.
[ OK ] Reached target Timers.
Starting Raise network interfaces...
[ OK ] Closed Syslog Socket.
[ OK ] Reached target Login Prompts.
[ OK ] Reached target Paths.
[ OK ] Reached target Sockets.
[ OK ] Started Emergency Shell.
[ OK ] Reached target Emergency Mode.
Starting Create Volatile Files and Directories...
[ OK ] Finished Create Volatile Files and Directories.
Starting Network Time Synchronization...
Starting Update UTMP about System Boot/Shutdown...
[ OK ] Finished Update UTMP about System Boot/Shutdown.
Starting Update UTMP about System Runlevel Changes...
[ OK ] Finished Update UTMP about System Runlevel Changes.
[ OK ] Started Network Time Synchronization.
[ OK ] Reached target System Time Set.
[ OK ] Reached target System Time Synchronized.
Stopping Network Time Synchronization...
[ OK ] Stopped Network Time Synchronization.
Starting Network Time Synchronization...
[ OK ] Started Network Time Synchronization.
[ OK ] Finished Raise network interfaces.
[ OK ] Reached target Network.
[ OK ] Reached target Network is Online.
You are in emergency mode. After logging in, type "journalctl -xb" to view
system logs, "systemctl reboot" to r
Cannot open access to console, the root account is locked.
See sulogin(8) man page for more details.
Press Enter to continue.
It seems that one of the disks cannot be connected, but what can I do about it now? The disk appears to be available as usual in Compute Engine.
I am afraid there is nothing you can do with the affected VM.
The Host Events documentation and the FAQ explain this:
Even though a VM instance is in the "Cloud", it still runs on a physical machine. Unfortunately, the host of this instance had a hardware or software failure, and there is nothing you can do about that.
GCP has a feature called live migration, which is meant to prevent this kind of situation.
Possible Workaround
As you mention that the disks are persistent and still visible in GCP, you could try attaching them to another VM. A how-to guide can be found in the Creating and attaching a disk documentation.
I finally found the strange reason for this error: the original /etc/fstab refers to a device path, but there is no such device at that path. I solved this by using /dev/sdb instead, but I guess this is not the best solution. I wonder how it can happen that a device suddenly disappears completely and ends up killing the machine.
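For anyone hitting the same thing: device names such as /dev/sdb are not guaranteed to stay the same across reboots, so an fstab entry that points at a plain device path can stop resolving after a restart. A commonly suggested alternative is to reference the disk by UUID or by a /dev/disk/by-id/ path and to add the nofail option, so a missing data disk does not drop the boot into emergency mode. As a rough sketch only (the function name and the sample entries are hypothetical, not taken from my actual fstab), a small Python check like this can list fstab entries whose device path does not exist:

```python
import os
from typing import Callable, List, Tuple


def missing_fstab_devices(
    fstab_text: str,
    exists: Callable[[str], bool] = os.path.exists,
) -> List[Tuple[str, str]]:
    """Return (device, mount_point) pairs whose device path does not exist.

    Only entries that name a device by path (starting with "/dev/") are
    checked. UUID=, LABEL= and pseudo filesystems are skipped.
    """
    missing = []
    for raw_line in fstab_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        device, mount_point = fields[0], fields[1]
        if device.startswith("/dev/") and not exists(device):
            missing.append((device, mount_point))
    return missing


# Typical use on the machine itself (or on a rescue VM with the boot disk
# mounted, adjusting the path accordingly):
#
#     with open("/etc/fstab") as fh:
#         for device, mount_point in missing_fstab_devices(fh.read()):
#             print(f"{mount_point}: device {device} not found")
```

The exists parameter is there only so the function can be tried against sample text without touching real devices. It does not repair anything; it only points at the entries that would make the boot wait for a device that is not there.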