I'm looking for help with what I'm sure is an age-old question. I've found myself in a situation of yearning to understand network throughput more clearly, but I can't seem to find information that makes it "click."
We have a few servers distributed geographically, running various versions of Windows. Always using one host (a desktop) as the source, when copying data from that host to other servers across the country, we see high variance in speed. In some cases we can copy data at 12 MB/s consistently; in others we're seeing 0.8 MB/s. It should be noted that after testing 8 destinations, we always seem to land at either 0.6-0.8 MB/s or 11-12 MB/s. In the building we're primarily concerned with, we have an OC-3 connection to our ISP.
I know there are a lot of variables at play, but I guess I was hoping the experts here could help answer a few basic questions to help bolster my understanding.
1.) For older machines running Windows XP, Server 2003, etc., with a 100 Mbps Ethernet card and 72 ms typical latency, does 0.8 MB/s sound at all reasonable? Or do you think that's slow enough to indicate a problem?
2.) The classic "mathematical fastest speed" of "throughput = TCP window / latency," is, in our case, calculated to 0.8 MB/s (64Kb / 72 ms). My understanding is that is an upper bounds; that you would never expect to reach (due to overhead) let alone surpass that speed. In some cases though, we're seeing speeds of 12.3 MB/s. There are Steelhead accelerators scattered around the network, could those account for such a higher transfer rate?
3.) It's been suggested that the use of SMB1 vs. SMB2 could explain the differences in speed. Indeed, packet captures show both being used, depending on the OS versions in play, as we would expect. I understand what determines whether SMB2 gets used, but I'm curious what kind of performance gain you can expect with SMB2.
My problem simply seems to be a lack of experience and, more importantly, perspective in terms of what are and are not reasonable network speeds. Could anyone help impart some context/perspective?
The mathematical formula you are referring to is actually the way to determine the most efficient transmit window size settings for TCP, not the actual bandwidth available. TCP uses a mechanism called sliding windows that allows for adjustment of transmit speeds based on network conditions. The idea is that a TCP transmitter will send more and more data without requiring an acknowledgement from the receiver. If there's a loss of data then the amount of data sent between acknowledgements decreases, thus also decreasing the effective bandwidth.
The formula in question actually determines the ideal sizing of that TCP transmit window based on the bandwidth and round-trip latency between a given pair of hosts. The idea is to have a window sized such that the amount of data 'in flight' corresponds to what's known as the bandwidth-delay product. For example, if you have 50 megabits per second (6.25 megabytes per second) and an average round-trip latency of 100 ms, then you'd have 6.25 MB/s * 0.1 s = 625 kilobytes of data in flight. This would be the value that TCP would negotiate (if configured correctly). As the latency and bandwidth characteristics of your links vary, so too does the ideal window size.
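To make the arithmetic concrete, here's a quick back-of-the-envelope sketch in Python, using the figures from your question and my example above (nothing here is measured, it's just the formula both ways):

```
# Per-session ceiling: throughput <= window / RTT
window_bytes = 64 * 1024        # classic 64 KB TCP window
rtt = 0.072                     # 72 ms round trip

ceiling = window_bytes / rtt
print(f"Per-session ceiling: {ceiling / 1e6:.2f} MB/s")
# ~0.91 MB/s -- same ballpark as the 0.8 MB/s you observed,
# once protocol overhead is taken off the top.

# Ideal window for a given link: the bandwidth-delay product
bandwidth_bps = 50e6            # 50 Mbit/s link
rtt2 = 0.100                    # 100 ms round trip
bdp_bytes = (bandwidth_bps / 8) * rtt2
print(f"Ideal window (BDP): {bdp_bytes / 1e3:.0f} kB")
# 625 kB, as in the example above
```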
What you need is a bandwidth measurement tool like iperf (free) running on both the source and your various destinations (`iperf -s` on one end, `iperf -c <server>` on the other). This should give you an idea of the actual throughput possible (independent of other apps) while also providing some insight into latency. Running an extended ping between hosts will also give a general idea of latency characteristics. Once you have this data, you'll have a better idea of what you should be seeing as far as throughput goes.
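If you can't get iperf installed on some of the older boxes, a crude stand-in can be scripted. This is just a minimal sender/receiver sketch (port 5001 is an arbitrary choice, and the timing is rough; it's no substitute for iperf's statistics):

```
# Minimal one-shot throughput test.
# Run "python tptest.py server" on one host,
# "python tptest.py client <server_ip>" on the other.
import socket, sys, time

PORT = 5001                 # arbitrary placeholder port
CHUNK = 64 * 1024
TOTAL = 50 * 1024 * 1024    # send 50 MB

def server():
    with socket.socket() as s:
        s.bind(("", PORT))
        s.listen(1)
        conn, addr = s.accept()
        with conn:
            start, received = time.time(), 0
            while True:
                data = conn.recv(CHUNK)
                if not data:        # client closed: transfer done
                    break
                received += len(data)
            elapsed = time.time() - start
            print(f"{received / elapsed / 1e6:.2f} MB/s from {addr[0]}")

def client(host):
    with socket.socket() as s:
        s.connect((host, PORT))
        payload = b"\0" * CHUNK
        for _ in range(TOTAL // CHUNK):
            s.sendall(payload)

if __name__ == "__main__":
    server() if sys.argv[1] == "server" else client(sys.argv[2])
```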
BTW - any kind of WAN optimizer will often incorporate data compression, TCP optimization, caching, etc. While handy, it can obscure the nature of the underlying links. Once you have an idea of the raw bandwidth/delay (and, potentially, packet loss), you can take a closer look to make sure your various hosts are set up to take proper advantage of the available bandwidth.
Try "ping -l 8092" or FTP or HTTP to check if it is SMB issue.
First of all: what media do you use to connect the computers? What is "100 Mbps" - Ethernet? You can't use plain Ethernet between geographically distributed computers, right?
In case of "vpn over Internet" routers between your computers may use different links: one is fast, other is not. They may choose link based on many parameters.
Please describe your network.
It could also be an MTU issue: different links may have different MTUs.
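If you want to chase the MTU theory, you can probe the path MTU from a Windows host with don't-fragment pings. Here's a rough sketch wrapping the stock Windows ping (`-f` = don't fragment, `-l` = payload size; the host name is a placeholder for one of your slow destinations):

```
# Rough path-MTU probe using the Windows ping utility.
# Largest payload that succeeds + 28 bytes (IP + ICMP headers)
# approximates the path MTU.
import subprocess

HOST = "remote-server"    # placeholder: one of the slow destinations

def ping_df(size):
    result = subprocess.run(
        ["ping", "-f", "-l", str(size), "-n", "1", HOST],
        capture_output=True, text=True)
    return "TTL=" in result.stdout   # successful replies include TTL=

# Binary search for the largest payload that fits unfragmented.
lo, hi = 0, 1472    # 1472 + 28 = 1500, the standard Ethernet MTU
while lo < hi:
    mid = (lo + hi + 1) // 2
    lo, hi = (mid, hi) if ping_df(mid) else (lo, mid - 1)

print(f"Largest unfragmented payload: {lo} bytes "
      f"(path MTU ~= {lo + 28})")
```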
A lot of commenters and users have offered great advice here. Some of it hit pretty close to what I was looking for, but I was also fortunate enough to meet with a network veteran from our company who helped clarify things. I thought I would post my findings/understanding here for the benefit of others. Please feel free to correct me if any of this seems off:
1.) The maximum throughput of a single TCP session with 72 ms latency and a 64 KB window is right around 0.8 MB/s, making that speed reasonable for single-threaded, single-session copies like the ones we performed with robocopy.
2.) The speed difference appears to come down to the effectiveness of the transfer method. In our case, we were using Robocopy and Repliweb. I discovered Robocopy uses a single TCP session, whereas Repliweb can open multiple sessions to send data through (there's a rough illustration of the math after this list).
3.) Research from Microsoft's website does show SMB2 as having a considerable performance gain over SMB1. There have, however, been problems in some cases with how the OS negotiates which protocol to use, so one should be aware of both a.) the cases in which SMB2 may be used, and b.) whether SMB2 is actually being utilized, based on network captures.
Currently, it looks like Wireshark can determine whether the SMB2 protocol is in use (the `smb2` display filter will isolate it); a scripted version of that check is below as well.
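On point 2, here's a rough illustration of why multiple sessions help, using the same window/RTT figures as above. This ignores congestion, protocol overhead, and anything the Steelheads may be doing, so it's a ceiling, not a prediction:

```
# Aggregate ceiling for N parallel TCP sessions, same window and RTT.
window_bytes, rtt = 64 * 1024, 0.072
for sessions in (1, 4, 14):
    mbps = sessions * window_bytes / rtt / 1e6
    print(f"{sessions:2d} session(s): ~{mbps:.1f} MB/s")
# 1 -> ~0.9, 4 -> ~3.6, 14 -> ~12.7
# i.e. a dozen-plus parallel sessions could plausibly reach
# the ~12 MB/s range we observed.
```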
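And on point 3, if you'd rather script the capture check than eyeball it in the Wireshark GUI, here's a minimal sketch using the third-party pyshark library (it wraps tshark, so both need to be installed; "transfer.pcap" is a placeholder filename):

```
# Count SMB1 vs. SMB2 packets in a saved capture.
import pyshark

for proto in ("smb", "smb2"):   # Wireshark display filters
    cap = pyshark.FileCapture("transfer.pcap", display_filter=proto)
    count = sum(1 for _ in cap)
    cap.close()
    print(f"{proto}: {count} packets")
```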
I hope this helps. Again, my understanding is fairly rudimentary here, feel free to expand.