I’m running a Debian Squeeze AMD64 server. Target runlevel after boot is runlevel 2, which includes rsyslogd, cron, sshd and some other stuff, but not dovecot, postfix, apache2, etc. The system fails to reach runlevel 2 with several symptoms:
- The system hangs while trying to start rsyslogd
- Booting into runlevel 1 works, then login from the console works
- Starting rsyslogd from runlevel 1 via /etc/init.d/rsyslog hangs
- Starting runlevel 2 with rsyslogd disabled works
- But then, logging in via console fails: I get the motd, and then nothing
- Starting sshd from runlevel 1 succeeds
- But then, I cannot log in via ssh. Sometimes password ssh login gives me the motd and then nothing, sometimes not even this. Trying to offer a public key seems to annoy the sshd enough to not talk to me any further.
- When rebooting from runlevel 1, the server hangs while trying to stop apache2 (which is not running, so this really should be trivial). Trying to stop apache2 while logged in at runlevel 1 hangs as well.
And that’s just the stuff which fails all the time. RAM has been tested, dmesg shows no problems. I have no clue.
Update: (shortened) output from rsyslogd -c4 -d called in runlevel 1
rsyslogd 4.6.4 startup, compatibility mode 4, module path ''
caller requested object 'net', not found (iRet -3003)
Requested to load module 'lmnet'
loading module '/user/lib/rsyslog/lmnet.so'
module of type 2 being loaded
conf.c requested ref for 'lmnet', refcount 1
rsylog runtime initialized, version 4.6.4, current users 1
syslogd.c requested ref for 'lmnet', refcount now 2
At that point I can kill rsyslogd with Ctrl+C. /var/log shows none of the configured log files, though.
Update2: Thanks to @DerfK I still have no clue, but at least I narrowed down the problem. I’m now testing with /etc/init.d/apache2 stop
(without an apache2 running, of course) which hangs as well and looks like an even more obvious failure.
After some testing I found out that a file with one single line:
/usr/sbin/apache2ctl configtest > /dev/null 2>&1
hangs, while the same line executed in an interactive shell works. I was not able to reduce this line any further, i.e. every single part (the stream redirections and the command itself) is necessary to reproduce the hang. @DerfK also pointed me to strace,
which gave a shallow hint about what kind of hang we have here:
- wait4(-1 for the init scripts
- futex(0xsomepointer, FUTEX_WAIT_PRIVATE, 2, NULL for the rsyslogd/apache2 binaries called by the init scripts
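The reproduction described above can be sketched as a small shell helper. This is only a sketch: the file name `hangtest.sh` and the function name `hangs_within` are hypothetical, and it assumes `timeout` from GNU coreutils is installed. It reports whether a command fails to finish in time, not why it hangs.

```shell
#!/bin/sh
# Return 0 if the given command is still running after SECS seconds
# (timeout then kills it and exits with status 124), non-zero otherwise.
hangs_within() {
    secs=$1
    shift
    timeout "$secs" "$@" >/dev/null 2>&1
    [ $? -eq 124 ]
}

# Example (hypothetical file name): put the failing line into a script
#   echo '/usr/sbin/apache2ctl configtest > /dev/null 2>&1' > hangtest.sh
# and check whether it finishes within 10 seconds:
#   hangs_within 10 sh hangtest.sh && echo "hangs"
# To see which system call it blocks in, trace it including child processes:
#   strace -f -o hangtest.trace sh hangtest.sh
```

Wrapping the call in `timeout` keeps the test from blocking the console, which matters when the only access is a runlevel 1 shell.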
The system was installed as a Debian Lenny by my hoster in autumn 2011, I upgraded it to Squeeze immediately and kept it up to date with Squeeze, which then used to be testing. There were no big changes, though. I guess I never tried to reboot the system before.
Update3: I found the problem. My /etc/nsswitch.conf specified ldap as a backup source for hosts lookups, and LDAP is not available at that point in the boot. Relying solely on DNS fixes my boot problems.
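For anyone wanting to check for the same cause, here is a minimal sketch that inspects the `hosts:` line of `/etc/nsswitch.conf`. The function names `hosts_sources` and `hosts_uses_ldap` are made up for illustration, and the sketch assumes the usual one-line `hosts: source1 source2 ...` layout.

```shell
#!/bin/sh
# Print the lookup sources configured on the "hosts:" line of an
# nsswitch.conf-style file (default: /etc/nsswitch.conf).
hosts_sources() {
    sed -n 's/^hosts:[[:space:]]*//p' "${1:-/etc/nsswitch.conf}"
}

# Succeed if "ldap" is one of the configured hosts sources.
hosts_uses_ldap() {
    hosts_sources "$1" | grep -qw ldap
}

# Example:
#   hosts_uses_ldap && echo "hosts lookups may block while LDAP is unreachable"
```

If `ldap` shows up there, removing it from the `hosts:` line (leaving, for example, only local files and DNS) is the change described in the update above.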
This sounds to me like some basic network service isn't being started. Compare the contents of /etc/rc2.d with /etc/rc3.d to see if runlevel 3 starts anything runlevel 2 doesn't (normally it does, but usually it's not something fundamental).

Debian Squeeze does concurrent startup by default. This means multiple init scripts are running at the same time on boot. You can try disabling this so that only one script runs at a time, which helps to find out exactly which step it is failing on. Since the init scripts will then run in the same order every time, it should fail on the same one every time unless it is a much more serious problem.
To disable concurrent booting add
CONCURRENCY=none
to /etc/default/rcS. Remove the line to restore the default.
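If you prefer to script the change, something along these lines should work. It is a rough sketch: the function name `disable_concurrency` is made up, and it assumes GNU sed (for `-i`), as shipped with Debian.

```shell
#!/bin/sh
# Set CONCURRENCY=none in an rcS-style defaults file: replace an
# existing CONCURRENCY= line if there is one, otherwise append it.
disable_concurrency() {
    file=$1
    if grep -q '^CONCURRENCY=' "$file"; then
        sed -i 's/^CONCURRENCY=.*/CONCURRENCY=none/' "$file"
    else
        echo 'CONCURRENCY=none' >> "$file"
    fi
}

# Example: try it on a copy first, then apply it to the real file as root.
#   cp /etc/default/rcS /tmp/rcS.test && disable_concurrency /tmp/rcS.test
```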