I recently created a new EC2 instance from a snapshot of our production EC2 instance.
The machine starts up fine and I can SSH in; however, I cannot access it via anything else - no WWW, nothing.
Upon further inspection of the machine, primarily the network stack, I see this:
/etc/udev/rules.d/70-persistent-net.rules
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="06:68:f3:22:91:f2", NAME="ens5"
ifconfig
ens5: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9001
inet 172.31.12.146 netmask 255.255.240.0 broadcast 172.31.15.255
inet6 fe80::468:f3ff:fe22:91f2 prefixlen 64 scopeid 0x20<link>
ether 06:68:f3:22:91:f2 txqueuelen 1000 (Ethernet)
RX packets 492 bytes 81928 (80.0 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 474 bytes 76982 (75.1 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 6 bytes 416 (416.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 6 bytes 416 (416.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Note the ens5 on the first line of the ifconfig output.
[/etc] # service network restart
Restarting network (via systemctl): Job for network.service failed because the control process exited with error code. See "systemctl status network.service" and "journalctl -xe" for details.
[FAILED]
[/etc] # systemctl status network.service
● network.service - LSB: Bring up/down networking
Loaded: loaded (/etc/rc.d/init.d/network; bad; vendor preset: disabled)
Active: failed (Result: exit-code) since Tue 2018-10-16 11:13:34 EDT; 1min 4s ago
Docs: man:systemd-sysv-generator(8)
Process: 2223 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=1/FAILURE)
CGroup: /system.slice/network.service
└─857 /sbin/dhclient -1 -q -lf /var/lib/dhclient/dhclient--ens5.lease -pf /var/run/dhclient-ens5.pid -H ip-172-31-12-146 ens5
Oct 16 11:13:34 ip-172-31-12-146.us-west-1.compute.internal network[2223]: RTNETLINK answers: File exists
Oct 16 11:13:34 ip-172-31-12-146.us-west-1.compute.internal network[2223]: RTNETLINK answers: File exists
Oct 16 11:13:34 ip-172-31-12-146.us-west-1.compute.internal network[2223]: RTNETLINK answers: File exists
Oct 16 11:13:34 ip-172-31-12-146.us-west-1.compute.internal network[2223]: RTNETLINK answers: File exists
Oct 16 11:13:34 ip-172-31-12-146.us-west-1.compute.internal network[2223]: RTNETLINK answers: File exists
Oct 16 11:13:34 ip-172-31-12-146.us-west-1.compute.internal network[2223]: RTNETLINK answers: File exists
Oct 16 11:13:34 ip-172-31-12-146.us-west-1.compute.internal systemd[1]: network.service: control process exited, code=exited status=1
Oct 16 11:13:34 ip-172-31-12-146.us-west-1.compute.internal systemd[1]: Failed to start LSB: Bring up/down networking.
Oct 16 11:13:34 ip-172-31-12-146.us-west-1.compute.internal systemd[1]: Unit network.service entered failed state.
Oct 16 11:13:34 ip-172-31-12-146.us-west-1.compute.internal systemd[1]: network.service failed.
The network service cannot find eth0, nor can it restart the network stack. I have tried rebooting the machine and stopping and starting it, with no luck. What am I missing?
Did you change from an older instance type to T3 / M5 / C5? These have different hardware and use different network device names (ens5 instead of eth0).
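You can usually confirm this on the new instance by checking which driver backs the interface - on the Nitro-based types (T3 / M5 / C5) it is typically the ENA driver rather than the Xen-based drivers used by older generations:

ethtool -i ens5    # the "driver:" line typically reads ena on T3/M5/C5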
One option is to reconfigure the network stack to reflect the new device names - that may be quite an undertaking unless you’re a skilled Linux admin and know what you are doing.
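As a rough sketch of what that involves on a RHEL/CentOS-style image like this one (the old file name ifcfg-eth0 is an assumption - check what is actually present in /etc/sysconfig/network-scripts):

cd /etc/sysconfig/network-scripts
cp ifcfg-eth0 ifcfg-ens5                         # copy the old interface config under the new name
sed -i 's/^DEVICE=.*/DEVICE=ens5/' ifcfg-ens5    # point it at the new device
systemctl restart network                        # then retry the restart that failed above

Any other place that still references eth0 (static routes, firewall rules, custom scripts) would need the same treatment.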
Or, more easily, change the instance type back to the one you made the snapshot from. That should return the device names to what they used to be.
You can change the size, e.g. from large to medium, but keep the family - if it was T2, use T2 again.
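If you prefer the CLI over the console, it looks roughly like this (the instance ID and type here are placeholders - substitute your own, and the instance must be stopped before its type can be changed):

aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 --instance-type "{\"Value\": \"t2.large\"}"
aws ec2 start-instances --instance-ids i-0123456789abcdef0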
I also suggest restoring the snapshot to a fresh instance - the current one has probably tried to accommodate the new device names and may be in an inconsistent state. Better to start again from the Prod snapshot.
Hope that helps :)
You can modify the device name in the udev rule you posted. Edit that line and rename ens5 to eth0.
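For example, using the MAC address from the rule you already have, the edited line would look something like this (and the matching ifcfg file, assumed here to be ifcfg-eth0, would then line up with it again):

# /etc/udev/rules.d/70-persistent-net.rules
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="06:68:f3:22:91:f2", NAME="eth0"

Then reboot so the rename is applied when the device is created.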