I am a developer in the good-and-bad situation of designing a network service that will be hit very hard by iPhone clients. The iPhone app has had over 10 million downloads in the past year, and now I'm bringing the users online to interact with each other.
I would like to tune the TCP implementation for the servers that will host my TCP-based network service. The per-request size sent will be "small" (say < 256 bytes). OK, you figured it out, it's a game server (shocker!).
FYI, I am not interested in UDP (or a reliable layer atop UDP as seen in ENet and RakNet for instance) for this particular service as the games are not Quake-like; all packets must be reliably received, and that's what TCP was designed for. Thus, the connections between the iPhone client and the service will be "long-lived" (as much as possible -- tunnels and elevators be damned!).
FYI, I'm running the service on a 100Mbps uplink on servers that run Linux 2.6.18-164.9.1.el5.
My goals are to simultaneously:
- keep latency as low as possible; and
- minimize the amount of memory used per connected client.
There are a large number of TCP-related knobs to tweak! After some basic research it seems that most people recommend leaving the settings as is. However, there are a number of settings that seem like they should be tweaked for particular cases. That's a little vague I know, and that's why I'm asking for help.
Things that might be worth tuning for small requests/responses on flaky networks, while keeping memory use as low as possible:
- memory available to the TCP/IP implementation
- setting the "nodelay" option (disable Nagle algorithm since this is a semi-real-time game server)
- congestion control algorithms
- etc. (what else?)
Consider TCP congestion control algorithms:
- reno: Traditional TCP used by almost all other OSes
- cubic: CUBIC-TCP
- bic: BIC-TCP
- htcp: Hamilton TCP
- vegas: TCP Vegas
- westwood: optimized for lossy networks
My servers default to bic, whose "goal is to design a protocol that can scale its performance up to several tens of gigabits per second over high-speed long distance networks while maintaining strong fairness, stability and TCP friendliness."
Just from the tiny description, Westwood sounds more apropos since it "is intended to better handle large bandwidth-delay product paths (large pipes), with potential packet loss due to transmission or other errors (leaky pipes), and with dynamic load (dynamic pipes)".
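For reference, on Linux the default and the available algorithms can be read from /proc/sys/net/ipv4, and an algorithm can be requested for a single socket with the TCP_CONGESTION option. Here is a minimal Python sketch, assuming a Linux host. The helper names are made up for illustration, and both helpers fall back quietly where the /proc files or the socket option are missing:

```python
import socket
from pathlib import Path
from typing import List

# Linux exposes the TCP sysctls as files here; the directory is absent elsewhere.
SYSCTL_DIR = Path("/proc/sys/net/ipv4")


def read_sysctl_words(name: str) -> List[str]:
    """Return the whitespace-separated values of a sysctl, or [] if unreadable."""
    try:
        return (SYSCTL_DIR / name).read_text().split()
    except OSError:
        return []


def try_set_congestion_control(sock: socket.socket, algorithm: str) -> bool:
    """Ask the kernel to use `algorithm` for this socket only; report success."""
    option = getattr(socket, "TCP_CONGESTION", None)  # only defined on Linux builds
    if option is None:
        return False
    try:
        sock.setsockopt(socket.IPPROTO_TCP, option, algorithm.encode("ascii"))
    except OSError:
        # Unknown algorithm, module not loaded, or not permitted for this process.
        return False
    return True


default_algorithm = read_sysctl_words("tcp_congestion_control")
available_algorithms = read_sysctl_words("tcp_available_congestion_control")

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
westwood_selected = try_set_congestion_control(sock, "westwood")
sock.close()

print("default:", default_algorithm)
print("available:", available_algorithms)
print("westwood selected for this socket:", westwood_selected)
```

Setting the algorithm per socket makes it possible to compare candidates side by side without changing the system-wide default. Whether "westwood" can actually be selected depends on which modules the kernel has loaded and allows.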
Am I getting in too deep here or is this par for the course?
What types of things do you guys tune TCP/IP for generally? How? What rules of thumb are there to know?
What words of wisdom do you have for my particular case?
Thanks a lot!
So, as you've found out, TCP congestion control is a pretty complicated area.
For this particular case, because the requests are small, you want to keep connections open as long as possible. One connection per request costs about five packets every time, whereas with long-lived connections you can get the average down to a little more than two packets per request.
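To make the amortisation concrete, here is a small Python sketch. The split into two packets per exchange plus three extra packets per connection is my assumption, chosen only so that the totals match the figures above; the real count depends on how ACKs and FINs get combined.

```python
# Assumed breakdown of the "five packets" figure (not stated in the original):
PACKETS_PER_EXCHANGE = 2   # one request packet and one response packet
CONNECTION_OVERHEAD = 3    # extra packets spent on setup and teardown


def average_packets_per_request(requests_per_connection: int) -> float:
    """Average packets per request when the overhead is shared by N requests."""
    if requests_per_connection < 1:
        raise ValueError("a connection must carry at least one request")
    return PACKETS_PER_EXCHANGE + CONNECTION_OVERHEAD / requests_per_connection


for n in (1, 10, 100):
    print(n, "requests per connection:", average_packets_per_request(n))
```

With one request per connection this gives the full five packets, and the average approaches two as more requests share the same connection.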
NODELAY is the right thing for a game server: you want your 256 bytes delivered right away, and that is less than a full segment, so Nagle's algorithm may hold it back while earlier data is still unacknowledged unless you set TCP_NODELAY.
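Setting the option is a one-liner on the socket. A minimal Python sketch (the function name is just for illustration; the same option exists in the C sockets API):

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 asks the kernel to send small writes without waiting
    # to coalesce them with later data.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_low_latency_socket()
# Read the option back to confirm the kernel accepted it.
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
print("TCP_NODELAY enabled:", nodelay_enabled)
```

On the server side the option should be set on each accepted connection as well, and with Nagle off it pays to write each message in a single send call so it is not split into several tiny packets.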
If your servers have loads of memory, the memory options are no big deal; recent kernels get the defaults right.
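If you do want a feel for the numbers, the relevant Linux settings are net.ipv4.tcp_rmem and net.ipv4.tcp_wmem, each a "min default max" triple in bytes. Here is a rough Python sketch of a worst-case estimate. The helper names are hypothetical and the triples are illustrative values, not read from your servers:

```python
from typing import NamedTuple


class TcpBufferLimits(NamedTuple):
    minimum: int
    default: int
    maximum: int


def parse_buffer_limits(text: str) -> TcpBufferLimits:
    """Parse a 'min default max' triple as found in tcp_rmem or tcp_wmem."""
    fields = text.split()
    if len(fields) != 3:
        raise ValueError("expected three integers, got %r" % text)
    return TcpBufferLimits(*(int(field) for field in fields))


def buffer_ceiling_bytes(connections: int,
                         rmem: TcpBufferLimits,
                         wmem: TcpBufferLimits) -> int:
    """Upper bound if every connection filled default-sized buffers at once."""
    return connections * (rmem.default + wmem.default)


# Illustrative values only; read the real ones from /proc/sys/net/ipv4/.
rmem = parse_buffer_limits("4096 87380 4194304")
wmem = parse_buffer_limits("4096 16384 4194304")
ceiling = buffer_ceiling_bytes(10_000, rmem, wmem)
print("ceiling for 10,000 connections: %.2f GB" % (ceiling / 1e9))
```

This is a ceiling rather than a standing cost: buffer memory is consumed as data is queued, so mostly idle connections carrying sub-256-byte messages should sit far below it.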
As for congestion control algorithms, you spotted Westwood. The other option is CUBIC. You can just pick one, or you can do some research and benchmark them. That could be quite a bit of work, but for 10M clients it's worth it. So, I'd be looking into running a simulation using a traffic generator on a Mac or three (since they have the same TCP implementation as the phone), a Linux box in between acting as a router (more about this shortly), and one of your servers, to see how it goes.
Now, that middle Linux box should run ns-3 so you can simulate a more complicated path than just an Ethernet switch. You then capture some packet traces on the sending end of the TCP connections, and analyse them with tcptrace or the tcptrace graphing modes of Wireshark. The tcptrace documentation is a good introduction to analysing TCP congestion behaviour.
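As a starting point for the traffic generator, here is a minimal Python sketch that sends fixed-size messages over one long-lived connection and records application-level round-trip times. The loopback echo server is only a stand-in for the real service, and the function names, the 200-byte payload and the message count are all assumptions for illustration:

```python
import socket
import threading
import time
from typing import List


def serve_one_echo_client(listener: socket.socket) -> None:
    """Accept a single client and echo its data back until it disconnects."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def measure_round_trips(host: str, port: int,
                        payload_size: int = 200, count: int = 50) -> List[float]:
    """Send `count` messages on one connection; return each round trip in seconds."""
    payload = b"x" * payload_size
    samples = []
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            start = time.perf_counter()
            sock.sendall(payload)
            remaining = payload_size
            while remaining:  # TCP is a byte stream, so read until the echo is complete
                chunk = sock.recv(remaining)
                if not chunk:
                    raise ConnectionError("server closed the connection early")
                remaining -= len(chunk)
            samples.append(time.perf_counter() - start)
    return samples


def run_loopback_demo(count: int = 50) -> List[float]:
    """Run the generator against a throwaway echo server on 127.0.0.1."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_one_echo_client,
                              args=(listener,), daemon=True)
    server.start()
    try:
        return measure_round_trips("127.0.0.1", port, count=count)
    finally:
        server.join(timeout=5)
        listener.close()


samples = run_loopback_demo()
median = sorted(samples)[len(samples) // 2]
print("messages: %d, median round trip: %.6f s" % (len(samples), median))
```

Loopback numbers say nothing about a mobile network. The point is to aim the client at the real server through the simulated path and compare the distributions per congestion control algorithm, alongside the packet traces.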