We've had a rather annoying incident yesterday.
During a routine snapshot backup, a machine failed with the following error:
Backup virtual machine
Cannot complete the operation. See the event log for details.
Incremental Forever - Incremental
Followed by:
Create virtual machine snapshot
An error occurred while saving the snapshot: msg.snapshot.error-QUIESCINGERROR.
Eight hours later, the next backup and snapshot kicked in, and succeeded.
However, the machine itself was completely unresponsive, and the SQL Server that was installed on it was throwing I/O errors.
On the machine itself the following warning occurred every 30 seconds for the next 12 hours:
Reset to device, \Device\RaidPort0, was issued.
The SQL Server, when attempting to execute any query on the databases gave the following error:
Time-out occurred while waiting for buffer latch type 2 for page
And in the SQL Logs we could find the following error:
SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file
We ended up trying to remount the databases, restart the services (SQL Server and Virtual Disk services). But in the end the only solution was a restart of the server.
What happens during the VSphere Snapshot process that could cause this chain of events?
If this is related to the VSphere Snapshot, why would a reboot fix it?
0 Answers