UPDATE: It seems mountall is hanging inside the routine emit_event(), which it calls after / is remounted to emit an event to that effect. Inside emit_event, it calls ply_boot_client_flush(), then constructs the env array, calls upstart_emit_event(), then dbus_pending_call_block(). And there it hangs. So any ideas why dbus_pending_call_block would hang indefinitely? Broken plymouth? dbus? upstart? Any suggestions for fixes or further diagnostics?
Rebooting my Ubuntu 10.04 LTS, 64-bit AMD machine hangs every time. The drive access light is off, but the Alt-SysRq keys do work. The hardware is a Lenovo W700ds laptop. I apologize in advance, because I have very little information about the system available, and I am limited in what I can do with it (because it will not boot). I can boot from the 10.04 CD and use it like a rescue disk. I can fsck, mount, read and write my partitions - they are fine. I have already tried reformatting my swap with mkswap. I have 4 ext4 partitions on my system: sda1 is /, sda2 is /usr, sda3 is /home, and a 4th that I use for data storage, sdb1 (it is the entire disk and mounts at /hdb, a mountpoint I created). There is also sda4, which is swap. Right now I am writing this from a browser I have opened in the 'rescue session' from the 10.04 LTS install CD.
I would GREATLY appreciate suggestions/comments on what I could do to diagnose what is hanging, why, and how to fix it. I've done a web search already, but found nothing new along these lines (some 1 to 1.5 year old bug reports with similar symptoms, but their fixes did not work).
I installed 10.04 on a new disk around the first of July, then used aptitude to bring everything up to date. Since then I've been installing LOTS of packages (I'll attach the dpkg log below). With sda being 750GB (/ 20GB, /usr 80GB) I had lots of space to install packages that I 'might someday use'. I wonder if it's one of these packages that has screwed up my system? I installed kernel 2.6.32-32-generic and rebooted, but have installed many more packages since.
I reboot this machine as rarely as possible - preferring to hibernate it while going from place to place. Lately though, I noticed some strange behavior when resuming from hibernation: the system would bring up the gnome screen saver asking for a password to unlock - well, it would not recognize my password! I had to alt-F1, log in as root, and kill the screen saver. Then all would be fine, or so it seemed. Also, on resume I would frequently see blinking colorful garbage on the screen for a short while. It would go away, so I didn't try to find the cause.
Another possibly relevant point is that I needed to use "nomodeset" when installing 10.04. When bringing up the rescue shell from that same CD, if I use only nomodeset it will eventually hang with a flashing NumLock LED or Caps Lock LED (crash?), but if I also use "noapic nolapic acpi=off" then it comes up ok. I've tried these options on my installed system to see if they cure the boot hang - they do not.
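To narrow down which packages went in after the last good reboot, the dpkg log could be filtered for everything installed or upgraded after the 2.6.32-32 kernel. Below is a rough Python sketch of that idea, not a tested tool. It assumes the usual dpkg.log line layout (date, time, action, package, old version, new version). The sample log lines, timestamps, version strings and the example-pkg names are made up purely for illustration, and the function name changed_since is my own.

```python
from datetime import datetime

# Made-up sample in dpkg.log layout; timestamps, versions and the
# example-pkg-* names are placeholders, not taken from the real log.
SAMPLE_LOG = """\
2011-07-20 09:15:02 install linux-image-2.6.32-32-generic <none> 0.0-sample
2011-07-20 09:15:40 status installed linux-image-2.6.32-32-generic 0.0-sample
2011-07-22 18:01:11 install example-pkg-a <none> 1.0-1
2011-07-22 18:01:15 status installed example-pkg-a 1.0-1
2011-07-25 21:30:00 upgrade example-pkg-b 2.0-1 2.0-2
2011-07-25 21:30:09 status installed example-pkg-b 2.0-2
"""


def changed_since(log_text, marker_pkg):
    """Return (timestamp, action, package) for every install/upgrade
    logged after the most recent install/upgrade of marker_pkg."""
    entries = []
    for line in log_text.splitlines():
        fields = line.split()
        # Keep only "DATE TIME install|upgrade PACKAGE OLD NEW" lines.
        if len(fields) < 6 or fields[2] not in ("install", "upgrade"):
            continue
        when = datetime.strptime(fields[0] + " " + fields[1],
                                 "%Y-%m-%d %H:%M:%S")
        entries.append((when, fields[2], fields[3]))
    marker_times = [when for when, _, pkg in entries if pkg == marker_pkg]
    if not marker_times:
        return []  # marker package never appears in this log
    cutoff = max(marker_times)
    return [entry for entry in entries if entry[0] > cutoff]


if __name__ == "__main__":
    for when, action, pkg in changed_since(
            SAMPLE_LOG, "linux-image-2.6.32-32-generic"):
        print(when, action, pkg)
```

From the rescue CD, the same function could be fed the contents of var/log/dpkg.log read from wherever the installed root partition (sda1) is mounted, instead of SAMPLE_LOG.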
This is a machine I use for work as well as for nearly everything else, so getting it to boot again is a TOP priority. /home is intact, which is good. But I'm about at my wits' end in trying to diagnose (much less fix) the cause of the hung boot.
I boot the system, and it starts running the mountall config script in /etc/init/mountall.conf. I see output from mountall running fsck - 4 lines that say: fsck from util-linux-ng 2.17.2 (that's one per ext4 partition). Then there are 4 more lines from fsck reporting that the partitions were found to be "clean". And that is it - everything just stops. The drive activity LED goes off. I can use the Alt-SysRq keys, but they have so far not proven useful. I saw a bug report where one user used Alt-SysRq-i to kill processes and it dropped him into a shell. For me, it does say it has killed processes (udev, udev-bridge and plymouth; it says it's respawning udev, etc), but I do not get any shell.
I have been trying to determine what exactly is hanging. To this end, I've tinkered with /etc/init/mountall.conf. I have added echo lines, and I have added the -v (verbose) option to the exec of mountall. No echo lines after the exec of mountall are shown, so this may mean mountall is hanging. Or, the last of the output may simply not be displayed - in which case mountall may have exited and something else may be hanging. I note that Alt-SysRq-i does not say mountall is killed. I've tried to narrow down what the system might be hanging on by commenting out sda3 (/home), swap and sdb1 (/hdb) in fstab, but it still hangs.
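When commenting entries in and out of fstab from the rescue CD, it is easy to lose track of what is still active. The Python sketch below just lists the entries that are not commented out, as a quick sanity check before each reboot attempt. The sample fstab is an illustration built from the partition layout described above; the device names, mount options and the function name active_fstab_entries are assumptions, and the real file may well use UUID= entries instead.

```python
def active_fstab_entries(fstab_text):
    """Return (device, mountpoint, fstype) for each fstab line that is
    neither blank nor commented out."""
    entries = []
    for line in fstab_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and commented-out entries
        fields = stripped.split()
        if len(fields) >= 3:
            entries.append((fields[0], fields[1], fields[2]))
    return entries


# Illustrative fstab only: options and device names are assumed.
SAMPLE_FSTAB = """\
# <file system> <mount point> <type> <options> <dump> <pass>
proc        /proc   proc    nodev,noexec,nosuid 0 0
/dev/sda1   /       ext4    errors=remount-ro   0 1
/dev/sda2   /usr    ext4    defaults            0 2
#/dev/sda3  /home   ext4    defaults            0 2
#/dev/sda4  none    swap    sw                  0 0
#/dev/sdb1  /hdb    ext4    defaults            0 2
"""

if __name__ == "__main__":
    for device, mountpoint, fstype in active_fstab_entries(SAMPLE_FSTAB):
        print(device, mountpoint, fstype)
```

With /home, swap and /hdb commented out as in this sample, only proc, / and /usr remain in the file.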
There is a lot I can do myself, but I feel like I'm in over my head here. I would like to, for example, get the source code for mountall, add printed flags, recompile and put it on my system - to narrow down A) whether mountall is actually hanging, and B) what it is hanging on. BUT, I cannot boot my machine to a shell to compile in - and the rescue disk environment is only 2.6.32-28-generic #55 - so it would not match my system. I'd like to remove or reinstall packages, but again, I cannot boot my machine to do this.
(my dpkg log file is several MBs, so I will attach it in a following dialog box)
Thanks, Greg