I'm building an analytics package, and the project requirements state that I need to support 1 billion hits per day. Yep, "billion". In other words, roughly 12,000 hits per second sustained, preferably with some room to burst. I know I'll need multiple servers for this, but I'm trying to get maximum performance out of each node before "throwing more hardware at it".
Right now, I have the hit-tracking portion completed and well optimized. I pretty much just save the requests straight into Redis (for later processing with Hadoop). The application is Python/Django with gunicorn as the gateway.
My 2GB Ubuntu 10.04 Rackspace server (not a production machine) can serve about 1200 static files per second (benchmarked using Apache AB against a single static asset). To compare, if I swap out the static file link with my tracking link, I still get about 600 requests per second -- I think this means my tracker is well optimized, because it's only a factor of 2 slower than serving the same static asset repeatedly.
However, when I benchmark with millions of hits, I notice a few things --
- No disk usage -- this is expected, because I've turned off all Nginx logs, and my custom code doesn't do anything but save the request details into Redis.
- Non-constant memory usage -- presumably due to Redis's memory management, my memory usage gradually climbs up and then drops back down, but it has never once been my bottleneck.
- System load hovers around 2-4, the system is still responsive during even my heaviest benchmarks, and I can still manually view http://mysite.com/tracking/pixel with little visible delay while my (other) server performs 600 requests per second.
- If I run a short test, say 50,000 hits (takes about 2m), I get a steady, reliable 600 requests per second. If I run a longer test (tried up to 3.5m so far), my r/s degrades to about 250.
My questions --
a. Does it look like I'm maxing out this server yet? Is 1,200 static files/s comparable to the nginx performance others have experienced?
b. Are there common nginx tunings for such high-volume applications? I have nginx worker_processes set to 64 and gunicorn workers set to 8, but tweaking these values doesn't seem to help or harm me much.
c. Are there any linux-level settings that could be limiting my incoming connections?
d. What could cause my performance to degrade to 250 r/s on long-running tests? Again, the memory is not maxing out during these tests, and HDD use is nil.
Thanks in advance, all :)
EDIT: Here is my nginx config -- http://pastie.org/1450749 -- it's mostly vanilla, with obvious fat trimmed out.
You're abusing nginx's worker_processes. There is absolutely no need to run that many workers. You should run as many workers as you have CPUs and call it a day. If you're running gunicorn on the same server, you should probably limit nginx to two workers. Otherwise, you're just going to thrash the CPUs with all the context switching required to manage all of those processes.
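For illustration, here's a minimal sketch of the relevant top-level directive, assuming a box with a handful of cores that also runs gunicorn (the exact number is a placeholder, not a recommendation for your hardware):

    # top of nginx.conf -- keep nginx to a couple of workers when gunicorn
    # shares the machine; otherwise, set this to the number of CPU cores
    worker_processes  2;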
I have used nginx to serve 5K requests a second for static content. You can increase the number of worker_connections, which is currently set to 1024.
The max_clients calculation would be as follows.
The worker_connections and worker_processes values from the main section allow you to calculate the max_clients value:
max_clients = worker_processes * worker_connections
In a reverse proxy situation, max_clients becomes
max_clients = worker_processes * worker_connections/4
http://wiki.nginx.org/EventsModule#worker_connections
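As a rough example (the 4096 figure is an assumed value, not a recommendation for your traffic), with 2 worker processes the formulas above work out like this:

    events {
        # per-worker connection limit, raised from the 1024 default
        worker_connections  4096;
    }

    # with worker_processes = 2:
    #   max_clients = 2 * 4096     = 8192  (serving clients directly)
    #   max_clients = 2 * 4096 / 4 = 2048  (reverse proxying, e.g. to gunicorn)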
Calculating the max worker connections is easy once you know the capacity of your setup: total capacity divided by the number of cores is the max worker connections. There are multiple ways to calculate total capacity.
If the above method doesn't work for you, then try the methods below. I'm making broad assumptions that ignore RAM and I/O; those will also factor in, but these will give you starting points, and you can make adjustments from there.
Assuming bandwidth is the bottleneck, take the average object size that nginx is serving and divide your bandwidth by it; that gives you the maximum supported QPS.
In the second assumption, CPU is the bottleneck. In this case, measure the request time, divide 1 by it, and multiply by the number of cores in your system. This gives the number of requests per second nginx can handle.
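To make that concrete, here's a back-of-the-envelope example with made-up numbers (100 Mbit/s of bandwidth, ~10 KB average response, 4 cores, ~5 ms per request) -- substitute your own measurements:

    # bandwidth-bound:  100 Mbit/s ≈ 12.5 MB/s; 12.5 MB/s / 10 KB ≈ 1,250 req/s
    # CPU-bound:        (1 / 0.005 s) * 4 cores = 800 req/s
    # total capacity  = min(1,250, 800) = 800 req/s
    # max worker connections ≈ capacity / cores = 800 / 4 = 200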