Issue: My EC2 instance (used to host a Tomcat-based web application) becomes unresponsive. It stops serving web pages and does not let me log in through SSH. It starts working again if I reboot it from the AWS console.
Version: Ubuntu 18.04.1 LTS
Log analysis:
I found the following in syslog:
systemd-networkd[621]: Failed to save lease data /run/systemd/netif/leases/2: No space left on device
systemd-networkd[621]: eth0: Failed to save link data to /run/systemd/netif/links/2: No space left on device
systemd-timesyncd[519]: Network configuration changed, trying to establish connection.
systemd-timesyncd[519]: Synchronized to time server 91.189.94.4:123 (ntp.ubuntu.com).
systemd-resolved[651]: Failed to write private resolv.conf contents: No space left on device
systemd-networkd[621]: Failed to save lease data /run/systemd/netif/leases/2: No space left on device
systemd-networkd[621]: eth0: Failed to save link data to /run/systemd/netif/links/2: No space left on device
systemd-networkd[621]: eth0: Could not set DHCPv4 address: Connection timed out
systemd-timesyncd[519]: Network configuration changed, trying to establish connection.
systemd-networkd[621]: eth0: Failed
systemd-timesyncd[519]: Synchronized to time server 91.189.94.4:123 (ntp.ubuntu.com).
systemd-resolved[651]: Failed to write private resolv.conf contents: No space left on device
systemd-networkd[621]: Failed to save network state to /run/systemd/netif/state: No space left on device
Cause:
The logs above suggest that /run runs out of space and systemd-networkd is unable to perform write operations.
Here is the mount info for /run. It currently shows just 1% usage because I recorded it after the instance had been rebooted and was working again.
Filesystem      Size  Used Avail Use% Mounted on
udev            3.9G     0  3.9G   0% /dev
tmpfs           798M  760K  797M   1% /run
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           798M     0  798M   0% /run/user/1000
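The df output above shows only byte usage, but "No space left on device" can also be reported when a filesystem has run out of inodes, and tmpfs has an inode limit as well as a size limit. A minimal sketch of how both could be checked while the problem is occurring, using Python's standard os.statvfs (the function name usage_report is hypothetical, chosen only for this illustration):

```python
import os


def usage_report(path="/run"):
    """Return the percentage of blocks and inodes in use on the filesystem holding `path`."""
    st = os.statvfs(path)

    def pct(total, free):
        # Some filesystems report 0 total inodes; avoid dividing by zero.
        return 0.0 if total == 0 else 100.0 * (total - free) / total

    return {
        "blocks_used_pct": pct(st.f_blocks, st.f_bfree),
        "inodes_used_pct": pct(st.f_files, st.f_ffree),
    }


if __name__ == "__main__":
    for name, value in usage_report("/run").items():
        print(f"{name}: {value:.1f}%")
```

If the inode percentage is near 100 while the block percentage is low, the cause is likely a very large number of small files rather than a few big ones.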
Research:
I searched the web and found related issues, but none of them offered a concrete solution.
Queries:
- What is the reason it runs out of space? Is the system generating temporary files that never get cleaned up?
- Currently 798M of space is allocated to /run. Is allocating more space a solution? If so, how much space should I allocate to it?
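For the first query, one way to narrow it down might be to list the largest files under /run periodically before the instance stops responding. The following is only a sketch under that assumption: the names largest_files and _same_dev are hypothetical, it reports apparent file sizes (st_size), and it skips subdirectories that are separate mounts, such as /run/lock and /run/user/1000 in the df output above.

```python
import os


def _same_dev(path, dev):
    """True if `path` lives on the filesystem identified by device number `dev`."""
    try:
        return os.lstat(path).st_dev == dev
    except OSError:
        return False


def largest_files(root="/run", limit=10):
    """Return up to `limit` (size_in_bytes, path) pairs under `root`, largest first."""
    root_dev = os.lstat(root).st_dev
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune subdirectories that are mounted from a different filesystem.
        dirnames[:] = [
            d for d in dirnames if _same_dev(os.path.join(dirpath, d), root_dev)
        ]
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                size = os.lstat(full).st_size
            except OSError:
                continue  # file disappeared or is not readable
            found.append((size, full))
    found.sort(reverse=True)
    return found[:limit]


if __name__ == "__main__":
    for size, path in largest_files("/run"):
        print(f"{size:>12}  {path}")
```

Running this as root gives the most complete picture, since some directories under /run are not readable by ordinary users.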