My Ubuntu 16.04 boot time is consistently long (several minutes). When I run
systemd-analyze blame
the first line is
2min 687ms open-iscsi.service
The box is indeed set up with an iSCSI drive (a separate NAS box), but I keep this drive off 99% of the time, which could explain the long boot time.
My question is: how do I set up my system so that the unsuccessful wait for the iSCSI connection does not slow down the boot process?
Things I tried:
I tried setting replacement_timeout to -1 in /etc/iscsi/iscsid.conf:
node.session.timeo.replacement_timeout = -1
but it did not prevent the 2-minute wait.
EDIT: additional info
1/ There is no entry in /etc/fstab for that disk.
2/ The service status gives some info:
$ service open-iscsi status
● open-iscsi.service - Login to default iSCSI targets
Loaded: loaded (/lib/systemd/system/open-iscsi.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since mer. 2017-12-20 13:35:25 CET; 5h 23min ago
Docs: man:iscsiadm(8)
man:iscsid(8)
Process: 1511 ExecStart=/sbin/iscsiadm -m node --loginall=automatic (code=exited, status=8)
Process: 1500 ExecStartPre=/bin/systemctl --quiet is-active iscsid.service (code=exited, status=0/SUCCESS)
Main PID: 1511 (code=exited, status=8)
déc. 20 13:35:25 hippo-camp iscsiadm[1511]: iscsiadm: initiator reported error (8 - connection timed out)
déc. 20 13:35:25 hippo-camp iscsiadm[1511]: iscsiadm: Could not login to [iface: default, target: whitebox.target-2, portal: 192.168.0.99,3260].
déc. 20 13:35:25 hippo-camp iscsiadm[1511]: iscsiadm: initiator reported error (8 - connection timed out)
déc. 20 13:35:25 hippo-camp iscsiadm[1511]: iscsiadm: Could not log into all portals
déc. 20 13:35:25 hippo-camp iscsiadm[1511]: Logging in to [iface: default, target: whitebox.target--1, portal: 192.168.0.99,3260] (multiple)
déc. 20 13:35:25 hippo-camp iscsiadm[1511]: Logging in to [iface: default, target: whitebox.target-2, portal: 192.168.0.99,3260] (multiple)
déc. 20 13:35:25 hippo-camp systemd[1]: open-iscsi.service: Main process exited, code=exited, status=8/n/a
déc. 20 13:35:25 hippo-camp systemd[1]: Failed to start Login to default iSCSI targets.
déc. 20 13:35:25 hippo-camp systemd[1]: open-iscsi.service: Unit entered failed state.
déc. 20 13:35:25 hippo-camp systemd[1]: open-iscsi.service: Failed with result 'exit-code'.
WORKAROUND: A workaround is to disable the iSCSI service as follows:
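Roughly, assuming the stock unit name on Ubuntu 16.04 (the drive then has to be logged in manually when it is needed):
# Stop the automatic iSCSI login from running at boot
sudo systemctl disable open-iscsi.service
# Re-enable it later with: sudo systemctl enable open-iscsi.service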
A more satisfying solution would be to make the iSCSI connection non-blocking for the boot process.
Intro:
I had (4) Raspberry Pis running Ubuntu 20.04 LTS as Docker hosts. One of the (4) hosts- "Host 1" in this answer- was taking almost 2 minutes longer to boot than the other (3) in the cohort due to open-iscsi. I tried restarting services and rebooting for a quick win, but the problem persisted.
Short Answer:
The open-iscsi cache was causing the problem. I blew it away and this resolved the issue: the service then rose up very quickly, like the other (3) hosts.
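Roughly, the commands look like this (the node database paths are the Ubuntu/Debian defaults and the portal IP is a placeholder, not my exact commands):
# Remove the cached node and discovery records
sudo rm -rf /etc/iscsi/nodes/* /etc/iscsi/send_targets/*

# Re-discover the target so fresh records are written, then reboot
# (<portal-ip> is your NAS/target address)
sudo iscsiadm -m discovery -t sendtargets -p <portal-ip>
sudo reboot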
Longer Answer:
To isolate the fault I performed the following troubleshooting:
Comparative Analysis:
Service Dependency Ordering:
Since there were (4) hosts, comparative analysis using systemd-analyze plot - which gives a comprehensive, very granular look at when, where and for how long each service rises up compared to the other services - would yield the answer. I generated a plot on each host and SCP'ed the resulting files to my Macbook:
I then right-clicked each of the (4) files, chose "open with" and selected my web browser, so I had a tab for each file to compare. All the services were rising up in the same order, so I could exclude an issue like open-iscsi starting before the networking was fully up. Since all hosts were loading the same services in the same order, I could rule out service ordering dependencies as the culprit.
Open-iSCSI Config Files:
I next compared the open-iscsi config files for each of the (4) hosts. These only varied with respect to the CHAP authentication settings; other than that, there was exact parity between all the hosts' open-iscsi config files, so I could exclude inconsistent open-iscsi configuration as being causal.
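The comparison was along these lines (user and host names are placeholders):
# Pull each host's iscsid.conf locally, then diff them pairwise
scp ubuntu@host1:/etc/iscsi/iscsid.conf host1-iscsid.conf
scp ubuntu@host2:/etc/iscsi/iscsid.conf host2-iscsid.conf
diff host1-iscsid.conf host2-iscsid.conf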
The Cache:
I had changed passwords after configuring Host 1 and suspected this could be giving me the grief. Blowing away the connections cache was a no-cost experiment, so I removed the cached records (using the commands shown in the Short Answer above) and rebooted.
After the reboot I did another systemd-analyze plot for Host 1. This revealed that the problem was resolved and that the cause was duff CHAP or other connection information in the open-iscsi cache.
Conclusion:
If you're nervous about deleting anything, you can just move the cache aside instead of deleting it:
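For example (paths are the Ubuntu defaults; the .bak names are just my own choice):
# Move the cached records aside instead of deleting them
sudo mv /etc/iscsi/nodes /etc/iscsi/nodes.bak
sudo mv /etc/iscsi/send_targets /etc/iscsi/send_targets.bak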
But otherwise, I'd suggest just blowing away the cache rather than going through the lengthy troubleshooting I did; I hope this saves other folks from wasting their cycles. It's an easy problem to resolve once you know how things are broken ;-)