We are experiencing difficulty with a Windows Server 2003 that is connected to our network via a VPN.
Everything seems to work fine with this server except file share downloads to local machines.
The local machines are running Windows XP. Remote Desktop connections to the same server work great. Uploads to file shares also work acceptably. (This is surprising because the network between us is actually rated higher for download than upload.)
When dragging and dropping files to the local system, there is a delay of more than 10 seconds before any progress bar activity. Downloading files via the command prompt has the same characteristics. The delay is incurred for each file to be transferred, not once for the entire connection. I have tried "ping address -f -l 1472" to verify that this is not a "black hole router" problem http://support.microsoft.com/kb/314825. The delay is the same regardless of whether a mapped drive or a UNC path is used, or whether the connection is made by specifying the IP address instead of the host name. Disabling "NetBIOS over TCP/IP" also did not help. The registry does not have any advanced TCP/IP settings (such as MTU) altered from their defaults. I tried reducing MTU in the registry and that didn't help either.
Any ideas? Also, workarounds that involve adjusting the local XP machine configuration instead of the LAN/WAN or server configuration would be greatly appreciated, if possible.
Sniff the traffic between the client and the server and see what's happening. There's no better way to get to the bottom of a protocol problem than to see what the computers are saying to each other.
The SMB protocol is a total dog when running over a high-latency network. I also suspect that you're seeing "progress bar" activity sooner on uploads because of implementation artifacts of the Windows shell, not because data is actually being transferred sooner. My guess is that the same kinds of things are happening on every file transfer. A per-file latency lends some credence to such a conclusion.
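To put rough numbers on why per-file latency hurts a chatty protocol, here is a minimal sketch of a back-of-the-envelope model. It assumes every file costs a fixed number of request/response round trips before and during the copy; the function name and the round-trip count are hypothetical illustration values, not measured SMB behavior.

```python
def estimate_transfer_seconds(file_sizes, rtt_s, bandwidth_bytes_per_s,
                              round_trips_per_file):
    """Rough model: each file pays a fixed latency cost plus raw transfer time.

    round_trips_per_file is an assumption for illustration; a real count
    would have to come from a packet capture.
    """
    total = 0.0
    for size in file_sizes:
        latency_cost = round_trips_per_file * rtt_s      # waiting on the wire
        transfer_cost = size / bandwidth_bytes_per_s     # actually moving data
        total += latency_cost + transfer_cost
    return total


if __name__ == "__main__":
    # Ten 1 KB files, 100 ms round trip, 1 MB/s link, 20 round trips per file.
    small_files = [1000] * 10
    print(estimate_transfer_seconds(small_files, 0.1, 1_000_000, 20))
```

In this model the latency term dominates for small files, which would match a delay that repeats per file rather than once per connection.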
In a past life, we fixed file-transfer issues (specifically over VPN) by reducing the MTU. Ping tests using packet sizes around the MTU thresholds were not useful; it's true that you rule out black-holes, but if you had a black-hole you probably wouldn't be able to do anything useful through the VPN.
Our problems were specifically with SMB file transfers, I believe due to packet fragmentation. Reducing the MTU to the 1400-1420 range helped significantly. Remember, VPN encapsulation adds more headers to each packet (it's been a while so I forget the specifics and I'm too lazy to google it tonight): a 1472-byte ping payload plus its ICMP and IP headers already makes a 1500-byte packet, so adding ESP+AH on top pushes it past 1500 (assuming you're using IPSec/ESP+AH). From my experience, excessive fragmentation is not a black-or-white problem; just because certain tests pass at certain times doesn't mean you can rule it out later.
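The arithmetic behind the 1472-byte ping and the 1400-1420 range can be sketched as follows. The 20-byte IPv4 header and 8-byte ICMP header are standard sizes (without IP options); the 80-byte tunnel overhead is an assumed illustration value, since the real figure depends on the VPN mode and ciphers in use.

```python
IP_HEADER = 20     # IPv4 header without options
ICMP_HEADER = 8    # ICMP echo header
ASSUMED_TUNNEL_OVERHEAD = 80  # hypothetical; depends on the VPN configuration


def ping_payload_for_mtu(mtu):
    """Largest 'ping -l' payload that fits in one unfragmented packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def largest_inner_mtu(link_mtu, tunnel_overhead):
    """Largest inner packet that still fits the link after encapsulation."""
    return link_mtu - tunnel_overhead


def needs_fragmentation(inner_packet_bytes, tunnel_overhead, link_mtu=1500):
    """True if the encapsulated packet would exceed the link MTU."""
    return inner_packet_bytes + tunnel_overhead > link_mtu


if __name__ == "__main__":
    print(ping_payload_for_mtu(1500))                        # payload for a full-size packet
    print(largest_inner_mtu(1500, ASSUMED_TUNNEL_OVERHEAD))  # candidate MTU to configure
    print(needs_fragmentation(1500, ASSUMED_TUNNEL_OVERHEAD))
```

With the assumed overhead, a full 1500-byte inner packet no longer fits, while an MTU in the low 1400s leaves room for the tunnel headers.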
Since the server is sending the data in this case, you might see if you can set the MTU on the server. It's also the common point that all SMB clients connect to, so changing the MTU once there might eliminate the need to set the MTU on all the clients (as Path MTU is negotiated between end points).
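A simplified sketch of why a server-side change can be enough: each TCP endpoint advertises a maximum segment size derived from its own MTU, and neither side sends segments larger than what the peer advertised. The model below ignores TCP options and Path MTU Discovery, and the function names are hypothetical.

```python
IP_HEADER = 20   # IPv4 header without options
TCP_HEADER = 20  # TCP header without options


def advertised_mss(mtu):
    """MSS an endpoint would advertise for a given interface MTU."""
    return mtu - IP_HEADER - TCP_HEADER


def effective_mss(client_mtu, server_mtu):
    """Segment size used on the connection: the smaller of the two sides."""
    return min(advertised_mss(client_mtu), advertised_mss(server_mtu))


if __name__ == "__main__":
    # Clients left at the default 1500, server lowered to 1400.
    print(effective_mss(1500, 1400))
```

Under these assumptions, lowering only the server's MTU caps the segment size for every client connection without touching the XP machines.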
Also, to those telling him to run Wireshark -- what specifically would you tell him to look for? I'm not sure that this is a 'protocol error' -- SMB is probably resetting its connection because of the underlying latency and/or packet drops, so technically SMB is doing its job (it's just a fragile protocol) -- and I'm not sure that tcpdump/sniffer/Wireshark is going to point them to a solution. I love tracing network conversations as much as the next CCNP, but in the wrong hands a scalpel is useless/dangerous. He already said he's not a network admin...