We have an overworked server currently running a single SQL Server 2000 instance on physical hardware, and about 40 different apps interact with it on a daily basis. Last year, the RAID controller failed and we had no spare, so IT Support hurriedly migrated it overnight to a copy running on a VMware Server. While it was on that server everything ran much quicker, because the virtual machine's spec was a big improvement on the old hardware. However, the biggest app using it had occasional serious errors which never occurred on physical hardware.
Specifically, several times a week it would disconnect batches of users - anywhere from just ten to hundreds at once, and all at the same time. It didn't affect any particular users or PCs or offices - all were affected equally.
The only common thing was the app, which is a VB6 app using ADO 2.8 to connect. The other apps connecting to that virtualised instance of SQL Server seemingly had no problems, although they were (and are) responsible for only a tiny fraction of the work involving this server.
The upshot is that after about two weeks of loving the speed and hating the random mass disconnections (which we were never able to find a cause for), we sadly took the decision to return to physical hardware and the disconnections vanished.
Now we've reached the point where the old server just can't handle all that's being asked of it, and we're intending to migrate everything to two or more other servers. The snag is that there's a good chance they'll have to be virtual ones again. Given what happened last time, I'm trying to find out what possible reasons there could be for these mass disconnections. We were running VMware ESX, but the network is Novell-based. Also, the server had a linked server set up to connect to an Informix server using a known-to-be-buggy ODBC driver, and that linked server is used throughout the day.
Any ideas on the cause(s)?
Check your error logs first: the SQL Server error log, the Windows event logs on the VM, and whatever the VMware host logs. It SOUNDS like it might be a big I/O freeze - something got swapped out (or the VM started swapping) that probably shouldn't have, and it took so long to swap it back in under load that things just crapped out.
Are your VMs swapping? That's death to performance on any virtualization platform.
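If you want evidence for or against the freeze theory, one rough way is to run a heartbeat inside the VM and look for beats that arrive late. The sketch below is only an illustration: the names `find_stalls` and `run_heartbeat` and the interval and tolerance values are my own assumptions, and it only measures time as seen from inside the guest, so treat any result as a hint rather than proof.

```python
import time


def find_stalls(timestamps, interval, tolerance):
    """Return (start, gap) pairs where two consecutive heartbeats are
    further apart than interval + tolerance seconds."""
    stalls = []
    for earlier, later in zip(timestamps, timestamps[1:]):
        gap = later - earlier
        if gap > interval + tolerance:
            stalls.append((earlier, gap))
    return stalls


def run_heartbeat(interval=1.0, tolerance=0.5, beats=60):
    """Record a timestamp every `interval` seconds, then report late beats.

    The defaults are illustrative guesses, not tuned values.
    """
    stamps = []
    for _ in range(beats):
        stamps.append(time.monotonic())
        time.sleep(interval)
    return find_stalls(stamps, interval, tolerance)


if __name__ == "__main__":
    for start, gap in run_heartbeat():
        print(f"stall: heartbeat at {start:.1f}s was followed by a {gap:.1f}s gap")
```

If the stalls it reports line up with the times users get kicked out, that points at the VM itself rather than at the app or the network.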
I found this fix, which may address your problem. You would have to apply it on the SQL VM. The network adapter has an option called "task offload", and having it enabled may be what is causing the disconnections. Here is the post that describes the fix.
[http://forum.wegotserved.com/index.php?/topic/11433-help-my-network-connection-keeps-dropping-out/]
Reg, on 22 January 2010 - 12:08 AM, said: Well I may have fixed my network connection problem. Based on something I read in the Backup forum, I disabled "task offload" located in the advanced settings of my network controller card. Since that time (about a week ago), I've been able to successfully backup my Win 7 computer without losing the network connection. I have no idea what "task offload" means.
I did exactly the same and started a manual backup for one of the Windows 7 clients. The backup has just successfully finished without any problem and I am very happy.
I didn't know either what this "Task Offload" was all about, so I looked it up on the internet and found an explanation in Wikipedia. It seems to be a protocol that breaks down large chunks of data into smaller fragments before they can be sent through the network... http://en.wikipedia....segment_offload Maybe some of you understand this better than I do, as I am not much of a computer or network whizz.
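For anyone wanting to check the setting on the SQL VM before changing anything, here is a minimal sketch. It is an assumption on my part that the global `DisableTaskOffload` value under the Tcpip `Parameters` registry key is the relevant switch; the quoted post changed the per-adapter option in the network card's advanced settings, which is a separate place. The function names are made up for illustration, and the script only reads the value, it does not modify it.

```python
import sys

# Assumed location of the global TCP/IP task offload switch on Windows.
TCPIP_PARAMS = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
VALUE_NAME = "DisableTaskOffload"


def describe_task_offload(value):
    """Interpret the DWORD: None means the value is absent.

    To my understanding, an absent value or 0 leaves task offload on,
    and any non-zero value turns it off for the TCP/IP stack.
    """
    if value is None or value == 0:
        return "task offload enabled"
    return "task offload disabled"


def read_disable_task_offload():
    """Return the DWORD from the registry, or None if it is not set."""
    import winreg  # Windows-only module, imported lazily on purpose

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TCPIP_PARAMS) as key:
        try:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
        except FileNotFoundError:
            return None
    return value


if __name__ == "__main__":
    if sys.platform != "win32":
        print("This check only makes sense on a Windows machine.")
    else:
        print(describe_task_offload(read_disable_task_offload()))
```

Whether turning offload off actually cures the mass disconnections is something you would have to test on the virtual server under load; the quoted post only shows it helped with a different symptom.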