I'm running a Windows Server 2003 guest instance in Xen 3.x. This DomU runs fine for a day or two, then stops responding — I don't get any network response, and I can no longer connect to Xen's VNC console for this DomU.
xm list shows this:
Name                           ID   Mem VCPUs State    Time(s)
Domain-0                        0  6508     8 r-----  1161159.4
[A working Linux DomU]          1   512     1 -b----    68711.1
[The hung Windows DomU]         5   512     1 ------    67234.2
[Another working Linux DomU]    3   512     1 -b----   163036.4
(What does the ------ mean? The xm manual explains what each of the six states means, but not what it means when none of them is set.)
If I xm destroy and then xm create the Windows DomU again, it boots right back up (with the Windows alert "The previous system shutdown at [...] was unexpected."), and then stops responding after another day or two. Aside from that alert, there's nothing relevant in the Windows event log. Also, I'm using Munin to monitor disk, network, process count, CPU usage, and memory usage; the Munin graphs don't indicate any resource exhaustion or other suspicious activity prior to the hang.
I checked /var/log/xen/*.log, but no log messages are generated at the time the server stops responding.
How should I proceed in troubleshooting this?
The ------ means it's in a nothing state. As in, it's not running, blocked, paused, shutdown, crashed, or dying. So, "runnable but not running," as if it were in the run queue, but not at the front of it.

As to troubleshooting, what have you tried so far, and what Xen performance monitoring tools or scripts are you running? Kinda hard to suggest where to go when we don't know where you've been. If you haven't already, I'd definitely start with logging and performance monitoring on the Dom0 side to see if you can correlate the onset of the no-flags state with any indicators.
It might also be worth having a look at the Windows event logs or doing some performance logging inside Windows - I doubt they'll say anything of note, but it could be that something inside the guest OS is triggering this behavior, and if so, you'd want to look at the guest OS to track down what.
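For the host-side logging, a minimal sketch along these lines might be enough to pin down when the DomU drops into the no-flags state. It only assumes that xm list prints the columns shown in the question; the function names, the log path, and the 60-second interval are made up for illustration, and older Dom0 Python versions may need subprocess.Popen instead of check_output.

```python
import subprocess
import time

# State flag letters from the xm manual.
STATE_FLAGS = {
    "r": "running",
    "b": "blocked",
    "p": "paused",
    "s": "shutdown",
    "c": "crashed",
    "d": "dying",
}


def parse_xm_list(output):
    """Return {name: (state, cpu_seconds)} parsed from `xm list` text."""
    domains = {}
    for line in output.splitlines():
        # Split from the right so a name containing spaces stays in one piece.
        parts = line.rsplit(None, 5)
        if len(parts) != 6:
            continue
        name, _dom_id, _mem, _vcpus, state, cpu_time = parts
        try:
            domains[name] = (state, float(cpu_time))
        except ValueError:
            continue  # header row: "Time(s)" is not a number
    return domains


def describe_state(state):
    """Turn a flag string such as '-b----' into readable text."""
    names = [STATE_FLAGS[ch] for ch in state if ch in STATE_FLAGS]
    return ", ".join(names) if names else "no flags set"


def poll(interval=60, logfile="/var/log/xen-domstate.log"):
    """Append each domain's state and CPU-time delta to a log (hypothetical path)."""
    previous = {}
    while True:
        out = subprocess.check_output(["xm", "list"]).decode()
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        with open(logfile, "a") as log:
            for name, (state, cpu) in parse_xm_list(out).items():
                delta = cpu - previous.get(name, cpu)
                log.write("%s %s state=%s (%s) cpu_delta=%.1f\n"
                          % (stamp, name, state, describe_state(state), delta))
                previous[name] = cpu
        time.sleep(interval)
```

Run poll() as root on Dom0 (from cron's @reboot or a screen session, say). After the next hang, the log should show roughly when the flags went to ------ and whether the Time(s) counter kept climbing afterwards. If it did, the guest is probably spinning rather than stalled, which would be worth lining up against the Munin graphs.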