Here's a curious one for the gurus:
Setup:
Source Machine: Windows Server 2003 R2 machine with local hard drive. VHD file of 40GB. 1 x 1Gbps network card, Cat6 cable, switch.
Target Machine: Windows Server 2008 R2 machine with iSCSI connection to iSCSI target on separate machine (1TB, RAID5). 1 x 1Gbps network card, Cat6 cable, connected to same switch as for Source Machine. Second 1Gbps network card, Cat6 cable, connected via isolated switch to the iSCSI target.
Switches are Netgear JGS524 model (web managed).
If I copy from the Win2003R2 machine to Win2008R2 machine local drive I get 40GB in 45 minutes, 36 seconds.
If I copy from the Win2008R2 machine to the iSCSI target (local drive to iSCSI target) I get 40GB in 37 minutes 56 seconds.
If I copy from the Win2003R2 machine to the iSCSI target via the Win2008R2 machine I get 40GB in 3 hours, 50 minutes, 24 seconds.
All copies were done via the following command issued on the Win2008R2 box:
XCOPY <source> <target> /J
XCOPY /J - Copies using unbuffered I/O. Recommended for very large files.
So, what's the bit I'm missing here? Why does a back-to-back copy take 1 hour, 23 minutes, 32 seconds in total when a "straight through" copy takes almost 3 times as long?
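For anyone who wants to check the arithmetic, here is a rough sketch that converts the three timings above to seconds and approximate throughput. It assumes "40GB" means 40 GiB; the names are only illustrative.

```python
def seconds(h=0, m=0, s=0):
    """Convert hours, minutes and seconds to total seconds."""
    return h * 3600 + m * 60 + s

# Assumption: the 40GB VHD is 40 GiB, expressed here in MiB.
SIZE_MB = 40 * 1024

# Timings as reported for the three copies.
copies = {
    "2003R2 -> 2008R2 local": seconds(m=45, s=36),
    "2008R2 local -> iSCSI": seconds(m=37, s=56),
    "2003R2 -> iSCSI via 2008R2": seconds(h=3, m=50, s=24),
}

for name, secs in copies.items():
    print(f"{name}: {secs} s, about {SIZE_MB / secs:.1f} MB/s")

# Two-step (back-to-back) total versus the single straight-through copy.
back_to_back = copies["2003R2 -> 2008R2 local"] + copies["2008R2 local -> iSCSI"]
ratio = copies["2003R2 -> iSCSI via 2008R2"] / back_to_back
print(f"back-to-back: {back_to_back} s, straight through is {ratio:.2f}x slower")
```

The two-step total comes to 5012 seconds, which matches the 1 hour, 23 minutes, 32 seconds quoted above.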
Switches show no errors, network hovers around the 3% utilisation mark for the duration of the copy (whereas the "back-to-back" copies are around the 25% utilisation mark).
What have I missed?
Could the 'unbuffered' copy be the problem? It is possible that Windows does some tricks that speed up the copy when the source or target is a local disk, but reverts to a safer behaviour when both ends are network devices.
I have played with disk testing in Unix, and the OSes can play lots of tricks with the disk subsystem. Good luck.
What about the SMB protocol? Win2k8R2 uses SMB 2.x, while older Windows versions only have SMB 1.0, which is not as fast. And is there a virus scanner active? On the other hand, direct access to the iSCSI device uses a different protocol with minimal overhead, and certainly no virus scanner.
Firstly, you are copying from 2003R2 to 2008R2 in both instances. Since 2003 is involved, the transfer can only use SMB1, which doesn't do much in the way of simultaneous requests. The data will traverse the network in chunks of around 64KB, and each 64KB chunk has to be acknowledged by the server as written before the 2003R2 box sends the next one.
Now, if the 2008R2 box has to send off the iSCSI request and receive an acknowledgment before it returns the reply to the 2003R2 box, then this can slow down the process. Some back-of-the-envelope calculations suggest that for 64KB chunks you'd need about 22ms between the request from 2003R2 to write a chunk and the response saying that it has been written. That seems a bit long, but is not outside the realm of possibility with all the steps involved.
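The back-of-the-envelope figure can be reproduced roughly as follows. This is only a sketch: it assumes 40 GiB of data, exactly 64 KiB per chunk, and strictly one chunk in flight at a time, none of which has been verified against the real traffic.

```python
# Assumptions: binary units, fixed 64 KiB chunks, one outstanding request.
TOTAL_BYTES = 40 * 1024**3
CHUNK_BYTES = 64 * 1024

def ms_per_chunk(elapsed_seconds):
    """Average time per chunk if chunks are written strictly one after another."""
    chunks = TOTAL_BYTES // CHUNK_BYTES
    return elapsed_seconds / chunks * 1000

chunks = TOTAL_BYTES // CHUNK_BYTES

# Reported timings: 3 h 50 min 24 s via iSCSI, 45 min 36 s to the local drive.
via_iscsi_ms = ms_per_chunk(3 * 3600 + 50 * 60 + 24)
to_local_ms = ms_per_chunk(45 * 60 + 36)

print(f"{chunks} chunks")
print(f"via iSCSI: {via_iscsi_ms:.1f} ms per chunk")
print(f"to local:  {to_local_ms:.1f} ms per chunk")
print(f"extra per chunk: {via_iscsi_ms - to_local_ms:.1f} ms")
```

With binary units this comes out at a little over 21ms per chunk; decimal units push it closer to 22ms. Either way it is the same ballpark, and the gap between the two per-chunk figures is the extra delay the iSCSI hop would have to account for.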
I'm not sure if that is your problem, but if you are interested, you can use Wireshark to look at the network traffic and verify what block sizes and delays are involved.
Another, less exciting, possibility is that you have your server set to full duplex and your switch set to auto-negotiate. This combination doesn't work: the switch ends up thinking the connection is half duplex and occasionally drops packets. The dropped packets would be much worse when the server is both sending and receiving large amounts of data at the same time, as would be the case with your second copy process.