Is it reasonable to expect decent performance from a SQL server connected via fiber private WAN - 5000 km's away. The carrier is telling us that this network is the 'best' possible, and will introduce less than 50ms latency guaranteed.
Our database is currently connected via gigabit ethernet to our other machines within the same datacenter. Getting a 5000km 'LAN Extension' between the two DC's is appealing to us. But when we connect via the Internet and VPN, we find the performance very slow doing large volumes of small transactions. We don't have any WAN experience, so not sure what most companies do in these cases. Can we simulate 5000kms of 'latency'?
5000Kms at the speed of light would be ~17ms latency. Down optical fibre, more like 25ms. And that's a best case server-to-server fibre with no delays for media conversion, switching, routing, server response, etc.
Minimum is 25ms, guaranteed maximum 50ms...
yes there are tools to simulate a slow link, e.g.
http://jagt.github.io/clumsy/ - a Windows program for network traffic changing
http://wanem.sourceforge.net/ - a Linux LiveCD for WAN emulation
and lots of suggestions https://stackoverflow.com/questions/130354/how-do-i-simulate-a-low-bandwidth-high-latency-environment and https://stackoverflow.com/questions/1094760/network-tools-that-simulate-slow-network-connection
Compare sys.dm_os_wait_stats polls between a period of simulated high network latency and an identical period of normal network latency (for the same client load and the same database statistics). For example, the following script can be used to compare [wait_rate (ms/Sec)] for the 'ASYNC_NETWORK_IO' [wait_type]. Differences measured in X to low XX % are likely to under the influence of background noise (i.e. they're meaningless). Differences measured in XXX% are likely to be relevant. To give you a qualitative understanding about the relative importance of the wait_rates you will see: If your SQL Server were to have 500 active worker threads, and if all 500 of those threads were waiting upon just one wait_type (for some given period of time), then the [wait_rate (ms/Sec)] for that one [wait_type] would be 500,000 ms/sec. I typically ignore [wait_rate (ms/Sec)]s that are less than 1000 ms/Sec.
The following script was written to collect data for a 20 second period of time. This can be adjusted (per your testing needs). The script does not aggregate wait_types into categories (for example, this script does not lump all PAGEIO% waits together).