I'm working with a pair of co-located CentOS Linux servers sitting behind a Sonicwall PRO 2040 Enhanced firewall running in transparent bridge mode.
These servers are having a strange problem downloading files more than a few megabytes in size. For example, if I try to wget or FTP a copy of the Linux kernel from kernel.org, the first ~1-2MB will download at 600+K/s, and then throughput will drop off a cliff to 1K/s.
I've reviewed all the firewall configuration settings for anything suspicious, but found nothing. More interestingly, I performed the same download with a Windows server sitting behind the same firewall, and it sailed right through at 600+K/s the whole way.
Has anyone seen this? Where should I start looking to troubleshoot this problem?
We are experiencing the same problem. Anything larger than what can be transferred in the initial download burst (~3.7 MB for us) trickles off to ~1-4 KB/s, regardless of the bandwidth available.
It seems to be a problem specific to, and common with, the SonicWall PRO 2040 firewall: https://discussions.apple.com/message/12250946?messageID=12250946
The root of the problem is the firewall. The best long-term fix is to find a setting on the firewall that allows the TCP Window Scaling option and makes it handle the initiating machine's TCP window scale factor correctly when the connection is set up.
Though this article refers to routers, the same logic applies to what's happening with the SonicWall Pro 2040 Firewall, http://lwn.net/Articles/92727/:
Similar to what was mentioned above, there are workarounds for individual machines: http://prowiki.isc.upenn.edu/wiki/TCP_tuning_for_broken_firewalls. With the RFC 1323 TCP extensions turned off, the firewall is never given the opportunity to pass along a TCP window scale factor of 0. Instead it passes along that the RFC 1323 extensions are not enabled, and the connection presumably uses the largest window TCP allows without them, which is 64 KB.
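To make the failure mode concrete, here is a small arithmetic sketch of what a mangled scale factor does to the usable window. The raw window value and the shift of 7 are illustrative assumptions, not values captured from the SonicWall.

```shell
#!/bin/sh
# Effective receive window = raw 16-bit window field << scale factor.
# Usage: effective_window RAW_WINDOW SCALE_SHIFT
effective_window() {
    echo $(( $1 << $2 ))
}

# Illustrative numbers (assumed, not measured): the receiver negotiated a
# shift of 7 and advertises a raw window field of 46.
echo "receiver means:  $(effective_window 46 7) bytes"
# If a middlebox rewrote the scale factor to 0, the sender reads the same
# field unscaled.
echo "sender believes: $(effective_window 46 0) bytes"
# Without RFC 1323 scaling, the ceiling is the 16-bit field itself.
echo "unscaled max:    $(effective_window 65535 0) bytes"
```

A sender that believes the window is only a few dozen bytes can keep almost nothing in flight, which would be consistent with the transfer collapsing to a trickle once the initial burst is over.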
Commands we've used on our various machines as a temporary workaround:
Ubuntu 10.10:
Change takes effect immediately:
Permanent change, after next reboot:
Mac OS X:
Change takes effect immediately:
Permanent change, after next reboot:
Win7:
See available options:
Disable Command (Persistent):
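As a rough reference for the platforms listed above, the sketch below prints (but does not run) a command that is commonly used to turn off window scaling on each one. Only the Linux key is confirmed elsewhere in this thread. The Mac OS X sysctl name and the Windows netsh line are assumptions based on the stock tools on those systems, and disable_scaling_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Print (do not run) the command that disables TCP window scaling / RFC 1323
# on a given platform. Run the printed command as root or Administrator.
# Only the Linux key is confirmed in this thread; the Mac OS X and Windows
# lines are assumptions about the stock tools on those systems.
disable_scaling_cmd() {
    case "$1" in
        linux)   echo "sysctl -w net.ipv4.tcp_window_scaling=0" ;;
        macosx)  echo "sysctl -w net.inet.tcp.rfc1323=0" ;;
        windows) echo "netsh interface tcp set global autotuninglevel=disabled" ;;
        *)       echo "unknown platform: $1" >&2; return 1 ;;
    esac
}

for os in linux macosx windows; do
    printf '%-9s %s\n' "$os:" "$(disable_scaling_cmd "$os")"
done
```

On Linux and Mac OS X the sysctl change lasts until reboot; the permanent variant would normally be the same key and value in /etc/sysctl.conf. On Windows 7, netsh interface tcp show global should list the current settings before anything is changed.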
In response to why the Windows Server was not having any problems, I found this article - http://msdn.microsoft.com/en-us/library/ms819736.aspx
Those firewalls will bog down if you have Intrusion Prevention and/or Antivirus turned on, especially if you have TCP Stream selected as one of the types to scan. It will try to build the whole file in its memory to scan it...
Temporarily disable those features and see if your performance climbs back up. If so, then look at adding your servers to the exception list so you don't have to drop your pants for the whole network.
Do you see the problems downloading to the Linux server from within the network? If not, then it must be something to do with the combination of Linux and the firewall. On the firewall, can you watch CPU usage or look for warnings? What about resetting the firewall?
Maybe after the first MB or so an adjustment is made by Linux automatically to the TCP options (or maybe Layer 2), and the firewall doesn't like this? Looking at the various network options in /proc might give you an idea. Also, a packet dump on Linux might show some change in what is going on when the slowdown happens.
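A quick way to look at the relevant /proc settings is sketched below. The list of knobs is a guess at the ones most likely to matter here, and show_tcp_opts is a hypothetical helper name.

```shell
#!/bin/sh
# Print the TCP knobs most relevant to window behaviour.
# The optional directory argument only exists so the function can be pointed
# at a copy of the files; by default it reads the live kernel settings.
show_tcp_opts() {
    dir="${1:-/proc/sys/net/ipv4}"
    for name in tcp_window_scaling tcp_timestamps tcp_sack tcp_rmem tcp_wmem; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s = (not available)\n' "$name"
        fi
    done
}

show_tcp_opts

# For the packet dump, capturing only the handshake is usually enough to see
# which options each side offered (run as root; eth0 is an assumed name):
#   tcpdump -n -i eth0 'tcp[tcpflags] & tcp-syn != 0'
```

In the tcpdump output, the wscale value in the SYN and SYN-ACK options is the thing to compare on both sides of the firewall.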
Though I haven't found the root cause of this, I did find a quick workaround that lets me get file transfers through:
sysctl -w net.ipv4.tcp_window_scaling=0
The kernel default for TCP window scaling is on, but that command lets me temporarily disable it. I haven't persisted the setting permanently via sysctl.conf because I'm not sure about its overall performance effects, but it works in a pinch and then I can flip it back to 1 when I'm done.
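If the workaround did need to survive a reboot, persisting it could look roughly like this. It is a sketch that assumes the stock /etc/sysctl.conf location, and persist_no_scaling is a hypothetical helper name.

```shell
#!/bin/sh
# Append the window-scaling override to a sysctl config file unless the key
# is already set there. Defaults to /etc/sysctl.conf (needs root).
persist_no_scaling() {
    conf="${1:-/etc/sysctl.conf}"
    if grep -Eq '^[[:space:]]*net\.ipv4\.tcp_window_scaling[[:space:]]*=' "$conf" 2>/dev/null; then
        echo "already set in $conf, leaving it alone"
    else
        echo "net.ipv4.tcp_window_scaling = 0" >> "$conf"
        echo "added to $conf"
    fi
}

# Typical use, as root:
#   persist_no_scaling      # edit /etc/sysctl.conf
#   sysctl -p               # load it now instead of waiting for a reboot
# To undo, remove the line and run: sysctl -w net.ipv4.tcp_window_scaling=1
```

The trade-off is that without scaling each connection is limited to a 64 KB window, which caps throughput on high-latency links, so it is probably best kept only until the firewall itself is fixed.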
Try changing the TCP windows on the SonicWall.
There's a lot of initial diagnostics left to perform here.
Errors in /var/log/messages?
Errors in dmesg?
Packet loss evidenced in /sbin/ifconfig?
Issues with link negotiation?
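The checks above can be sketched as a few commands. The counter check reads /proc/net/dev, which carries the same error and drop counters that ifconfig reports; nic_errors is a hypothetical helper name, and eth0 is an assumed interface name.

```shell
#!/bin/sh
# List interfaces whose error or drop counters are non-zero, reading a file
# in /proc/net/dev format (the same counters ifconfig reports).
nic_errors() {
    awk 'NR > 2 {
        sub(/:/, " ")    # split "eth0:123" into "eth0 123"
        if ($4 + $5 + $12 + $13 > 0)
            printf "%s rx_errs=%s rx_drop=%s tx_errs=%s tx_drop=%s\n",
                   $1, $4, $5, $12, $13
    }' "${1:-/proc/net/dev}"
}

if [ -r /proc/net/dev ]; then
    nic_errors
fi

# Log and link side of the same checks (paths and names vary by system):
#   dmesg | grep -i -E 'eth|link|duplex'
#   grep -i -E 'eth|link' /var/log/messages
#   ethtool eth0      # negotiated speed and duplex
```

No output from nic_errors means no interface has logged errors or drops since boot, which would point away from a cabling or duplex problem.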
Are there any differences, physical or not, between the Windows box and Linux box?
Edit 1
Can you reproduce the performance using different protocols and sites?