I have just installed Ubuntu 24.04 on an Intel Optane P4801X drive and am trying to see how fast it boots from this disk. It does come up very quickly, but the NetworkManager unit often does not start. It is enabled but dead, and journalctl -b 0 -u NetworkManager.service shows nothing, i.e. the unit never ran during the boot. When I start it myself after the boot, it comes up fine:
systemctl start NetworkManager
The logs show that the networkd-dispatcher start job was deleted to break an ordering cycle:
$ journalctl -b 0 | grep network
Oct 20 14:21:15 evergreens kernel: drop_monitor: Initializing network drop monitor service
Oct 20 14:21:15 evergreens systemd[1]: multi-user.target: Found ordering cycle on networkd-dispatcher.service/start
Oct 20 14:21:15 evergreens systemd[1]: multi-user.target: Job networkd-dispatcher.service/start deleted to break ordering cycle starting with multi-user.target/start
Oct 20 14:21:15 evergreens systemd[1]: Reached target network.target - Network.
Oct 20 14:21:15 evergreens systemd[1]: Reached target network-online.target - Network is Online.
...
I suspect the same thing caused NetworkManager to never run either.
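To see exactly which jobs were dropped during a boot, rather than grepping for "network", a small filter like the one below might help. This is only a sketch: the function name deleted_jobs is made up for illustration, and it assumes the journal lines keep the "Job X/start deleted to break ordering cycle" wording shown above.

```shell
# Print the unique unit names whose jobs were deleted to break an ordering cycle.
# Reads journal (or systemd-analyze verify) text on stdin.
# deleted_jobs is a hypothetical helper name, not a systemd tool.
deleted_jobs() {
    sed -nE 's#.*Job ([^/]+)/[^ ]+ deleted to break ordering cycle.*#\1#p' |
        sort -u
}

# Intended use on the affected machine (not run here):
#   journalctl -b 0 | deleted_jobs
```

If NetworkManager.service appears in that list for a boot where it stayed dead, that would confirm the suspicion.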
systemd-analyze verify does not report any problem in either unit:
$ sudo systemd-analyze verify networkd-dispatcher.service
$ sudo systemd-analyze verify NetworkManager.service
$
But the log actually shows many more ordering cycles, and they also show up in systemd-analyze verify multi-user.target:
$ journalctl -b 0 | grep "break.*cycle"
...
$ sudo systemd-analyze verify multi-user.target
...
For example, the network-related ones:
$ sudo systemd-analyze verify multi-user.target 2>&1 | grep -i netwo
multi-user.target: Found dependency on network.target/start
ubuntu-advantage.service: Found ordering cycle on network.target/start
ubuntu-advantage.service: Job network.target/start deleted to break ordering cycle starting with ubuntu-advantage.service/start
multi-user.target: Found ordering cycle on networkd-dispatcher.service/start
multi-user.target: Job networkd-dispatcher.service/start deleted to break ordering cycle starting with multi-user.target/start
multi-user.target: Found ordering cycle on NetworkManager.service/start
multi-user.target: Job NetworkManager.service/start deleted to break ordering cycle starting with multi-user.target/start
However, sudo systemd-analyze verify multi-user.target seems to print something different almost every time I run it. Is that possible? Sometimes it prints many more cycles, sometimes only one unit.
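Since the output changes from run to run, one idea is to collect several runs and count how often each unit is named in the "Found ordering cycle on" and "Found dependency on" lines. A unit that is named in every run is a likely suspect. The sketch below assumes the message format shown above; tally_cycle_units is a made-up helper name.

```shell
# Count how often each unit is named in "Found ordering cycle on ..." or
# "Found dependency on ..." lines read from stdin, most frequent first.
# tally_cycle_units is a hypothetical helper name.
tally_cycle_units() {
    sed -nE 's#.*Found (ordering cycle|dependency) on ([^/]+)/.*#\2#p' |
        sort | uniq -c | sort -rn
}

# Intended use on the affected machine (not run here): combine five runs.
#   for i in 1 2 3 4 5; do
#       sudo systemd-analyze verify multi-user.target 2>&1
#   done | tally_cycle_units
```

Units at the top of the tally are the ones worth inspecting first.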
I tried to plot the dependencies of multi-user.target following the answer on Unix & Linux Stack Exchange:
$ sudo systemd-analyze verify multi-user.target 2>&1 |\
perl -lne 'print $1 if m{Found.*?on\s+([^/]+)}' |\
xargs --no-run-if-empty systemd-analyze dot | dot -Tsvg > cycle.svg
I could not see a cycle there. Also, the graph is big and hard to read. I tried to "zoom in" on a few units, but I don't see a cycle there either:
$ echo multi-user.target networkd-dispatcher.service basic.target |\
xargs --no-run-if-empty systemd-analyze dot |\
dot -Tsvg > cycle.svg
Since systemd-analyze verify prints different things almost every time it runs, these graphs are probably not trustworthy anyway.
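Instead of looking for the cycle by eye in the SVG, it might be possible to let tsort find it: systemd-analyze dot --order emits only the ordering edges, and tsort complains on stderr when its input contains a loop. The sketch below assumes the edge lines look like "a"->"b" [color=...]; and that tsort prefixes its messages with "tsort: " as GNU coreutils does. find_order_loops is a made-up helper name.

```shell
# Report units that tsort considers part of a loop.
# Reads `systemd-analyze dot --order` style text on stdin.
# find_order_loops is a hypothetical helper name.
find_order_loops() {
    # Turn each edge line  "a"->"b" [...];  into the pair  a b
    sed -n 's/^[[:space:]]*"\([^"]*\)"->"\([^"]*\)".*/\1 \2/p' |
        # Keep only tsort's diagnostics (stderr); drop the sorted list (stdout).
        tsort 2>&1 >/dev/null |
        sed -n 's/^tsort: //p'
}

# Intended use on the affected machine (not run here):
#   systemd-analyze dot --order | find_order_loops
```

An empty result means tsort found no loop in the ordering edges it was given. Whether this matches the cycles systemd itself reports is something I have not verified.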
Looking at individual units, I don't find an issue. The NetworkManager dependencies seem fine:
$ cat /usr/lib/systemd/system/NetworkManager.service
[Unit]
Description=Network Manager
Documentation=man:NetworkManager(8)
Wants=network.target
After=network-pre.target dbus.service
Before=network.target
BindsTo=dbus.service
...
[Install]
WantedBy=multi-user.target
Also=NetworkManager-dispatcher.service
I had a similar issue when booting from a regular Corsair NVMe drive. But I never had it when the boot from the Corsair NVMe was unusually slow because of one specific unit. (That unit mounted an HDD listed in /etc/fstab. The entry is not in fstab anymore, so it no longer slows down the boot.)
So I think this problem happens only when the boot sequence is fast, although I do not understand why that would be the case. Why would a problem in the dependency graph show up only on a fast boot?
Could someone suggest how to track down ordering cycles?
What's going on with systemd-analyze verify multi-user.target printing different things each time? Is this real, known behavior? How does it traverse the dependency graph? Is the traversal somehow random? Could this be related to whatever makes systemd delete jobs during the boot, even though the same units run fine later?
I looked into systemd-analyze verify more. It indeed printed a random number of ordering cycles every time I ran it! (See the corresponding question on Unix & Linux Stack Exchange.) It turned out that a custom mount unit was causing it somehow. I just noticed that this mount unit shows up in every block of "Found ordering cycle", like this:
When the mount unit is disabled, systemd-analyze verify finds no ordering cycles and the boot runs correctly.
So the boot failure must be caused by these randomly occurring ordering cycles, which make systemd randomly delete some of the units, including NetworkManager.
I do not see why the unit causes cycles, and especially why it is random. It is a simple mount that attaches a bunch of backup HDDs after multi-user.target, like so:
So, following advice from Reddit, I disabled this mount unit and added and enabled an automount unit that triggers the mount: