This is the third time this has happened in the last 4-5 months (on 16.04 and now on 18.04 LTS). I run engineering simulations on my Ubuntu machine, and normally everything is fine. Occasionally, when I'm running these simulations and leave the PC on for more than 2-3 days without interacting with it, the machine freezes.
The machine becomes unresponsive to the keyboard and mouse. At first I thought the system had suspended, but it does not respond to mouse clicks or key presses at all (I realised this from the Num Lock LED behaviour). So I had to do a hard reset. Why does this happen? Any clues? Where should I look?
I ran tail -500 /var/log/syslog
and noticed nothing unusual, although the log apparently only has entries for the current day (from 00:05 onward). The relevant entries are:
Sep 25 02:46:08 mae-hen-8163-db org.gnome.Shell.desktop[2028]: GetDntDataTotal sql exec success
Sep 25 02:46:08 mae-hen-8163-db org.gnome.Shell.desktop[2028]: GetDntDataQueue sql exec success
Sep 25 02:46:14 mae-hen-8163-db org.gnome.Shell.desktop[2028]: SchedulePriorTracktype called
Sep 25 02:46:14 mae-hen-8163-db org.gnome.Shell.desktop[2028]: GetDntDataPriorQueue sql exec success
Sep 25 02:46:47 mae-hen-8163-db org.gnome.Shell.desktop[2028]: SchedulePriorTracktype called
Sep 25 02:46:47 mae-hen-8163-db org.gnome.Shell.desktop[2028]: GetDntDataPriorQueue sql exec success
\00\00\00\00\00\00\00\00......long run of \00 shortened......\00\00\00\00\00\00\00\00Sep 25 10:29:23 mae-hen-8163-db systemd[1]: Started Uncomplicated firewall.
Sep 25 10:29:23 mae-hen-8163-db systemd[1]: Started udev Coldplug all Devices.
Sep 25 10:29:23 mae-hen-8163-db systemd-modules-load[354]: Inserted module 'lp'
Sep 25 10:29:23 mae-hen-8163-db systemd-modules-load[354]: Inserted module 'ppdev'
Sep 25 10:29:23 mae-hen-8163-db systemd-modules-load[354]: Inserted module 'parport_pc'
Sep 25 10:29:23 mae-hen-8163-db systemd[1]: Started Load Kernel Modules.
Sep 25 10:29:23 mae-hen-8163-db systemd[1]: Mounting FUSE Control File System...
Sep 25 10:29:23 mae-hen-8163-db systemd[1]: Mounting Kernel Configuration File System...
......Some entries...
Sep 25 10:29:23 mae-hen-8163-db systemd[1]: Started Remount Root and Kernel File Systems.
Sep 25 10:29:23 mae-hen-8163-db kernel: [ 0.000000] Linux version 4.15.0-34-generic (buildd@lgw01-amd64-047) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #37-Ubuntu SMP Mon Aug 27 15:21:48 UTC 2018 (Ubuntu 4.15.0-34.37-generic 4.15.18)
The relevant line is the one with the zeros (\00). That is where I did the hard reset, and the boot sequence follows it. There are no entries between 02:46:47 and the reboot at 10:29:23. What does that gap indicate?
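To check the whole file for gaps like this rather than eyeballing the tail, something along these lines might work. It is only a sketch: the names parse_stamp and find_gaps and the 30-minute threshold are my own choices, and the year is an assumption (syslog lines carry no year, so 2018 is taken from the dmesg output), which also means a log spanning New Year would be misread.

```python
import re
from datetime import datetime, timedelta

# Classic syslog prefix, e.g. "Sep 25 02:46:47 host ...". It carries no year.
STAMP = re.compile(r"([A-Z][a-z]{2}) +(\d{1,2}) (\d{2}):(\d{2}):(\d{2}) ")


def parse_stamp(line, year=2018):
    """Return the first syslog timestamp found in the line, or None."""
    m = STAMP.search(line)  # search, not match: NUL bytes may precede the stamp
    if m is None:
        return None
    text = "{} {} {} {}:{}:{}".format(year, *m.groups())
    try:
        return datetime.strptime(text, "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def find_gaps(lines, threshold=timedelta(minutes=30), year=2018):
    """Yield (last_time_before, first_time_after, saw_nul) for each silent
    stretch longer than threshold. saw_nul is True when NUL bytes (raw, or
    shown as the text \\00) appeared since the previous timestamped line."""
    previous = None
    saw_nul = False
    for line in lines:
        if "\x00" in line or "\\00" in line:
            saw_nul = True
        current = parse_stamp(line, year)
        if current is None:
            continue  # untimestamped line: keep the NUL flag for the next one
        if previous is not None and current - previous > threshold:
            yield previous, current, saw_nul
        previous = current
        saw_nul = False
```

It could be fed an open file, for example `for gap in find_gaps(open("/var/log/syslog", errors="replace")): print(gap)`, which may need sudo depending on the file permissions. A gap reported together with NUL bytes would at least mark the point where logging stopped before a reset.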
I also checked with sudo dmesg --ctime
and got the following output (I do not understand any of it):
[Tue Sep 25 10:28:38 2018] efifb: scrolling: redraw
[Tue Sep 25 10:28:38 2018] efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0
[Tue Sep 25 10:28:38 2018] Console: switching to colour frame buffer device 240x67
[Tue Sep 25 10:28:38 2018] fb0: EFI VGA frame buffer device
[Tue Sep 25 10:28:38 2018] intel_idle: MWAIT substates: 0x142120
[Tue Sep 25 10:28:38 2018] intel_idle: v0.4.1 model 0x5E
[Tue Sep 25 10:28:38 2018] intel_idle: lapic_timer_reliable_states 0xffffffff
[Tue Sep 25 10:28:38 2018] input: Sleep Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0E:00/input/input0
[Tue Sep 25 10:28:38 2018] ACPI: Sleep Button [SLPB]
[Tue Sep 25 10:28:38 2018] input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input1
[Tue Sep 25 10:28:38 2018] ACPI: Power Button [PWRB]
[Tue Sep 25 10:28:38 2018] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input2
[Tue Sep 25 10:28:38 2018] ACPI: Power Button [PWRF]
[Tue Sep 25 10:28:38 2018] (NULL device *): hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
[Tue Sep 25 10:28:38 2018] thermal LNXTHERM:00: registered as thermal_zone0
[Tue Sep 25 10:28:38 2018] ACPI: Thermal Zone [TZ00] (28 C)
[Tue Sep 25 10:28:38 2018] thermal LNXTHERM:01: registered as thermal_zone1
[Tue Sep 25 10:28:38 2018] ACPI: Thermal Zone [TZ01] (30 C)
[Tue Sep 25 10:28:38 2018] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
[Tue Sep 25 10:28:38 2018] 00:01: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[Tue Sep 25 10:28:38 2018] serial 0000:00:16.3: enabling device (0000 -> 0003)
[Tue Sep 25 10:28:38 2018] 0000:00:16.3: ttyS4 at I/O 0xf0a0 (irq = 20, base_baud = 115200) is a 16550A
[Tue Sep 25 10:28:38 2018] Linux agpgart interface v0.103
[Tue Sep 25 10:28:38 2018] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
[Tue Sep 25 10:28:38 2018] tpm tpm0: A TPM error (2314) occurred continue selftest
[Tue Sep 25 10:28:38 2018] tpm tpm0: A TPM error (2314) occurred continue selftest
[Tue Sep 25 10:28:39 2018] tpm tpm0: A TPM error (2314) occurred continue selftest
[Tue Sep 25 10:28:39 2018] tpm tpm0: A TPM error (2314) occurred continue selftest
[Tue Sep 25 10:28:39 2018] tpm tpm0: A TPM error (2314) occurred continue selftest
[Tue Sep 25 10:28:39 2018] tpm tpm0: A TPM error (2314) occurred continue selftest
[Tue Sep 25 10:28:39 2018] clocksource: Switched to clocksource tsc
[Tue Sep 25 10:28:40 2018] tpm tpm0: A TPM error (2314) occurred continue selftest
[Tue Sep 25 10:28:41 2018] tpm tpm0: TPM self test failed
[Tue Sep 25 10:28:41 2018] loop: module loaded
[Tue Sep 25 10:28:41 2018] libphy: Fixed MDIO Bus: probed
[Tue Sep 25 10:28:41 2018] tun: Universal TUN/TAP device driver, 1.6
[Tue Sep 25 10:28:41 2018] PPP generic driver version 2.4.2
[Tue Sep 25 10:28:41 2018] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
Is the above log relevant? Does it say anything about the freeze? Can someone please help me find a permanent solution to this issue?
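To narrow down which dmesg lines might matter, a rough filter like the one below could pull out only the lines that mention an error or a failure. This is just a sketch: the function name suspicious_lines and the keyword list are my own choices, and a hit does not by itself mean the line has anything to do with the freeze.

```python
# Keywords are an assumption; extend the tuple as needed.
KEYWORDS = ("error", "fail")


def suspicious_lines(lines, keywords=KEYWORDS):
    """Return the lines that contain any keyword, compared case-insensitively."""
    hits = []
    for line in lines:
        lowered = line.lower()
        if any(word in lowered for word in keywords):
            hits.append(line.rstrip("\n"))
    return hits
```

Run over the excerpt above (for example by saving the dmesg output to a file and passing the open file to suspicious_lines), it would keep only the tpm tpm0 lines.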
Thanks