I have a production server that is responsible for controlling a factory. The server runs a number of control applications and a SQL server.
The problem I have is that one of the applications that communicates with a PLC is reporting communication problems at seemingly random intervals.
Using Resource Monitor, I have noticed that network activity drops sharply whenever this problem occurs. My VNC connection is not interrupted, and the server responds to pings from other computers during the blip. However, other computers on the network that run applications connected to the SQL server freeze until the network traffic recovers.
Screenshot of the Resource Monitor network graphs at the time of a blip. The first arrow is when we start experiencing communication problems and the second arrow is when things return to normal:
I have analysed the SQL server at these times: there are no resource waits, and the number of batches processed per second is also low. I also ran a trace on the SQL server during a blip, but it did not reveal anything significant.
Just before the network activity drops, there is no other indication that this is about to happen. CPU usage is low and memory usage remains at about 70%.
Could this be caused by external factors affecting the network, or maybe by something wrong with the network card?
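Since pings keep working while the SQL clients freeze, it may help to log exactly when TCP connections to the SQL server start failing, so the timestamps can be matched against the Resource Monitor graphs. The following is only a rough sketch of such a probe: the host name `factory-sql01` is hypothetical, port 1433 is assumed because it is the SQL Server default, and the function names `probe` and `monitor` are made up for illustration.

```python
import socket
import time
from datetime import datetime


def probe(host, port, timeout=2.0):
    """Return the TCP connect time in seconds, or None if the connect fails."""
    start = time.monotonic()
    try:
        # create_connection performs the full TCP handshake, then we close.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        return None


def monitor(host, port, interval=1.0, count=None):
    """Print one timestamped result per interval; runs forever if count is None."""
    done = 0
    while count is None or done < count:
        elapsed = probe(host, port)
        status = "FAIL" if elapsed is None else f"{elapsed * 1000:.1f} ms"
        stamp = datetime.now().isoformat(timespec="seconds")
        print(f"{stamp} {host}:{port} {status}")
        done += 1
        time.sleep(interval)


# Example (hypothetical host name, assumed default SQL Server port):
# monitor("factory-sql01", 1433)
```

A successful connect only proves that the TCP handshake completes, so an established session could still stall while this probe reports success. It is a way to get timestamps, not a diagnosis.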
Edit (additional information):
This is a performance monitor for packets sent and received at the time of the blip:
Try installing Wireshark (www.wireshark.org); it lets you see what happens to each packet. I had a similar issue with my Exchange 2010 server, and after analysing the packets I discovered that the problem was IP fragmentation, which I resolved by reducing the MTU on the server's NIC. That might be the reason for the packet drops here as well.
Check this link: http://www.networkworld.com/community/blog/mtu-size-issues
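To make the fragmentation arithmetic concrete, here is a small sketch that works out how many IPv4 fragments a payload needs for a given MTU, and the largest ping payload that fits in one packet. It assumes a 20-byte IPv4 header with no options and an 8-byte ICMP header; the function names are made up for illustration. On Windows, the result can be compared against `ping -f -l <size> <host>`, which sets the Don't Fragment flag.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def fragment_count(ip_payload_len, mtu):
    """Minimum number of IPv4 packets needed to carry ip_payload_len bytes."""
    max_data = mtu - IPV4_HEADER          # data capacity of a single packet
    if max_data < 8:
        raise ValueError("MTU too small to carry any fragment data")
    if ip_payload_len <= max_data:
        return 1                          # fits without fragmentation
    # Every fragment except the last must carry a multiple of 8 bytes,
    # because fragment offsets are expressed in 8-byte units.
    per_fragment = max_data // 8 * 8
    # The last fragment may carry up to max_data bytes.
    return 1 + math.ceil((ip_payload_len - max_data) / per_fragment)


def max_unfragmented_ping_payload(mtu):
    """Largest ICMP echo data size that fits in one packet at this MTU."""
    return mtu - IPV4_HEADER - ICMP_HEADER


print(max_unfragmented_ping_payload(1500))  # 1472
print(fragment_count(1480, 1500))           # 1
print(fragment_count(1481, 1500))           # 2
```

With the standard Ethernet MTU of 1500, a ping of 1472 bytes should pass with the Don't Fragment flag set. If it only passes at a smaller size, some hop on the path has a lower MTU.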