I've got a Mac fileserver that isn't acting right (and was acting strangely yesterday as well). I set up an audible ping to it, ssh'ed into it and, as root, issued the reboot command. It did nothing! I ssh'ed in again and issued another reboot command, and still nothing.
It is bizarre seeing 'reboot' listed as one of the tasks in top's listing.
Then, as root, I issued a
shutdown -r now
It alerted all users that the system was going down ... and it didn't go down. (I can't establish a new ssh connection, but I did leave myself one open with root access.)
I've never seen anything like this. What could prevent a system from rebooting, and more to the point, without accessing the box physically (I can, it is just at another location), how can I bring the box down?
I notice now that top says:
Processes: 25 total, 2 running, 4 stuck, 19 sleeping... 88 threads
I've never seen stuck processes either. (And one of my friends was just telling me that only on Unix can you have sleeping zombie children.)
Update:
From this thread (esp. post #9), I take it that ps and top will show a 'U' for stuck ("uninterruptible") processes.
bash-3.2# ps ax | grep U
48 ?? Us 0:08.23 /usr/sbin/update
10180 ?? U 0:32.95 /System/Library/CoreServices/RemoteManagement/ARDAgent.app/Contents/Support/build_hd_index
17119 s000 U+ 0:00.07 reboot
17052 s001 U+ 0:00.09 reboot
17261 s002 R+ 0:00.00 grep U
Issuing kill -9 [pid] has no effect.
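A plain grep U also matches any line with a capital U anywhere in it (including the grep itself), so filtering on the state column is a little more precise. The following is a minimal sketch, assuming ps is asked for exactly three columns (pid, stat, command); the function name stuck_procs is made up for illustration.

```shell
# Keep only the rows whose STAT column (field 2) starts with "U",
# the uninterruptible-wait state that ps reports on macOS.
# Expects `ps axo pid,stat,command` style input on stdin.
stuck_procs() {
    # NR > 1 skips the header row printed by ps.
    awk 'NR > 1 && $2 ~ /^U/ { print }'
}

# Typical use (not run here):
#   ps axo pid,stat,command | stuck_procs
```

On Linux the equivalent state letter is D rather than U, so the pattern would need adjusting there.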
Well, in my particular instance, this server (an Apple Xserve) was not talking to the attached RAID unit. I finally rebooted the server, the RAID unit, and the server again, and things worked.
From my research, it appears that tasks can get into the stuck or uninterruptible state, and even SIGKILL will not faze them. I believe that one process was waiting for the RAID volumes to mount, and the other processes (particularly the 'reboot' commands) were waiting for it.
As a general catchall the following items will interrupt a reboot:
It won't help you if you don't already have it set up, but for the future: with IPMI, either behind a console server or on its own, you can issue power commands to the machine that are equivalent to a hard reset.
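As a rough illustration of what that looks like, here is a sketch using ipmitool over the network. The host name and user name are placeholders (assumptions, not from this setup), and it assumes the machine's management controller is reachable over LAN and already has credentials configured. The -E flag makes ipmitool read the password from the IPMI_PASSWORD environment variable instead of the command line.

```shell
# Placeholder management-controller address and user; substitute your own.
BMC_HOST="${BMC_HOST:-bmc.example.com}"
BMC_USER="${BMC_USER:-admin}"

# Send a chassis power command over IPMI-over-LAN.
# $1 is one of: status, reset, cycle, off, on
# Set DRY_RUN=1 to print the command instead of running it.
ipmi_power() {
    ${DRY_RUN:+echo} ipmitool -I lanplus -H "$BMC_HOST" -U "$BMC_USER" -E \
        chassis power "$1"
}

# Typical use (not run here):
#   export IPMI_PASSWORD='...'
#   ipmi_power status
#   ipmi_power reset    # hard reset, like pressing the reset button
```

A reset issued this way bypasses the operating system entirely, so stuck processes cannot block it, but unwritten data is lost just as with a physical reset.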