I want to make a server for my static content.
I need to serve some 3-10 MB files, a lot of them. (I will also put some .js, .css and images from my websites on this server.)
I thought of nginx and G-WAN ( http://trustleap.com/ ).
What I don't know is what resources are needed for serving static content. How much RAM is used per file transfer?
If I go with a 256 MB (or 512 MB) VPS with a good port and plenty of bandwidth, how many hits per second will I be able to serve (3-10 MB files)? I know "it depends", but please give me a rough estimate based on experience or theory.
There are not a lot of files, they are just downloaded often. Should I consider caching, or would that only eat the memory I need for serving hits?
If you're using nginx, then you're talking just a few KB of overhead per active connection. If you're using something like Apache, you'll have one thread per connection, which means hundreds of KB or even megabytes per connection.
However, nginx does not support asynchronous disk IO on Linux (because async disk IO on Linux is basically horribly broken by design). So you will have to run many nginx worker processes, as every disk read could potentially block a whole worker process. If you're using FreeBSD, this isn't a problem, and nginx will work wonders with asynchronous disk and network IO. But you might want to stick with Apache if you're using Linux for this project.
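If you do go with nginx on Linux, a minimal sketch of the relevant configuration might look like the following (the worker count of 8 and the /var/www/static root are placeholder assumptions, not recommendations; tune them for your VPS):

    # nginx.conf - minimal static-file sketch, values are illustrative only
    worker_processes  8;              # more workers than cores, so one blocking
                                      # disk read does not stall every request

    events {
        worker_connections  1024;     # per-worker connection cap
    }

    http {
        include      mime.types;
        sendfile     on;              # kernel copies file data straight to the socket
        tcp_nopush   on;              # send full packets when streaming large files

        server {
            listen  80;
            root    /var/www/static;  # hypothetical document root
        }
    }

With sendfile on, a worker mostly just shepherds connections, which is why the per-connection overhead stays in the few-kilobyte range mentioned above.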
But really, the most important thing is disk cache rather than the web server you choose. You want lots of free RAM so that the OS will cache those files and make reads really fast. If the "hot set" is more than say 8 GB, consider getting less RAM and an inexpensive SSD instead, as the cost/benefit ratio will likely be better.
Finally, consider using a CDN to offload this, and getting a really cheap server. Serving static files is what they do, and they do it very fast and very cheaply. SimpleCDN has the lowest prices, but MaxCDN, Rackspace, Amazon, etc. all are big players at the low end of the CDN space.
If the OS can cache the hot part of the content in RAM, it will not touch the disk and will serve things really quickly. Hundreds of requests per second should be possible on a VPS; you will most likely saturate the network well before you run into CPU limits.
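To put rough numbers on that: a 100 Mbit/s port moves at most about 12 MB/s, so for 3-10 MB files that is only on the order of 1-4 completed downloads per second sustained; a 1 Gbit/s port moves roughly 120 MB/s, i.e. about 12-40 downloads per second. The few kilobytes of per-connection overhead are negligible next to that bandwidth ceiling.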
If the content does not fit in RAM, then disk IO (seeks, throughput, filesystem fragmentation) comes into play and the equation changes.
The web server adds some memory overhead per client, but nginx can keep that to a few kilobytes per connection.
Hope these pointers can help you.
First, for the same number of workers, G-WAN v4.7+ uses far less RAM than Nginx at startup:
G-WAN uses threads (typically one per core) while Nginx uses processes (typically one per core), and processes carry more overhead and require synchronization via shared memory. Both use the "asynchronous" model of event handling.
Note that here G-WAN can automatically grow to more than 1 million concurrent connections, while Nginx is limited by its worker_connections setting (set to only 4096 in the ab.c test above).
The short story is that G-WAN v4.7+ (where in-memory caching is disabled by default) consumes much less RAM than Nginx for all file sizes, while serving more requests per second.
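For reference, that limit is just a setting in the nginx events block; the 4096 figure from the ab.c test corresponds to something like:

    events {
        worker_connections  4096;   # per-worker connection cap enforced by nginx
    }

and it can be raised, subject to the process's file-descriptor limit.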
The long story is that while Nginx consumes more and more memory even with new HTTP keep-alive requests, G-WAN's memory usage stays stable across keep-alive requests, and it grows far less than Nginx's for non-keep-alive requests.
Our weighttp wrapper, ab.c, measures the memory consumption of both the server application and the system for the duration of the test, and it shows that Nginx puts a heavier load on the system in terms of memory consumption.
This is due to the way each web server handles requests and allocates memory.
Both servers (Nginx and G-WAN) use sendfile(), so the kernel (rather than the application) allocates the resources for I/O.
The web servers still allocate resources, but that is for maintaining the context of each connection rather than for buffering disk I/O.
Therefore, the memory consumption depends on the size of the file chunks sent at each sendfile() call rather than directly on the total file size.
The total file size has an influence in the long run at high concurrency, but that is due to the number of chunks that need to be cached by the kernel.
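To make the chunk-size point concrete on the nginx side, the amount handed to the kernel per call is itself a tunable rather than a property of the file; a sketch, with 512k as an arbitrary example value, inside the http, server or location block:

    sendfile            on;
    sendfile_max_chunk  512k;   # upper bound on the data passed per sendfile() call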
If you have any more questions, drop us a line at G-WAN. We have invested heavily in CDN-like applications.