I am curious about server configurations for serving only static files from a single server.
Is it possible to build a server that serves only static files and handles millions of concurrent connections? What would be the best HTTPD for this?
The server will only serve static files from directories and will not run any service besides an HTTPD, and of course no PHP.
Millions of concurrent connections? Unless you are hosting video streams or other large files, HTTP requests usually finish so quickly that even busy sites don't see that many connections open at the same time. If you do, you almost certainly won't be running just one server. With that kind of load, the network traffic alone would be so high that I don't think a single server would be your best bet.
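To put rough numbers on that, here is a minimal sketch using Little's law (average connections in flight = request rate × time each request stays open). The function name and the traffic figures are made up for illustration and are not measurements from any real site.

```python
def estimate_concurrency(requests_per_second, seconds_per_request):
    """Little's law: average number of connections open at any instant."""
    return requests_per_second * seconds_per_request


# Hypothetical figures, chosen only to show the difference in scale.
small_files = estimate_concurrency(10_000, 0.125)  # short static requests
long_streams = estimate_concurrency(10_000, 120)   # long-lived transfers

print("small static files:", small_files)
print("long streams:", long_streams)
```

With the same request rate, it is the time each connection stays open that pushes the count from thousands toward millions.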
But let's imagine you really do run the service on one server with millions of concurrent connections: nginx or lighttpd would then be your best bet. Next, you would probably need to adjust many kernel parameters such as fs.open. You would probably also need to compile your own kernel.
Here are slides about how HEANET scaled their Apache 2.x to 20 0000+ concurrent sessions. Note that even that required quite a lot of tinkering.
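One limit that commonly comes into play is the number of open file descriptors a process is allowed, since every connection holds at least one. The sketch below only reads the current per-process limit with Python's standard `resource` module (Unix only) and compares it to a target. The helper name and the `reserved` allowance are assumptions for illustration, and system-wide kernel limits are a separate matter that this does not touch.

```python
import resource  # standard library, Unix only


def descriptor_shortfall(target_connections, soft_limit, reserved=64):
    """Extra descriptors needed beyond soft_limit, or 0 if the limit is enough.

    `reserved` is a hypothetical allowance for log files, the listening
    socket and so on; it assumes one descriptor per connection.
    """
    needed = target_connections + reserved
    return max(0, needed - soft_limit)


soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft, "hard limit:", hard)
if soft != resource.RLIM_INFINITY:
    print("shortfall for 1,000,000 connections:",
          descriptor_shortfall(1_000_000, soft))
```

A static file server may also hold a descriptor for the file being sent, so treat one per connection as a lower bound.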
I expect you're being a tad optimistic with your traffic estimates, but your best bet for serving static assets at large scale will be nginx. Note that with that many concurrent connections you'll have to tweak some kernel parameters.
"Millions" of concurrent connections might be a little difficult to achieve, but most web servers fall into one of these architecture types: pre-fork (one single-threaded process per connection), threaded (one process with many threads, one thread per connection), and event-driven (one process with one thread handling many connections). There are of course hybrids of these, such as Apache's mpm_worker, which is a hybrid of pre-fork and threaded.
In general, pre-fork handles the fewest connections, because dedicating a whole process to each connection is expensive and consumes a lot of resources. Threaded is a little better, but thousands or millions of threads carry a lot of overhead as well. Event-driven systems typically run one process with one thread and use async/non-blocking I/O to achieve very high concurrency with minimal resource overhead.
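To make the event-driven model concrete, here is a toy sketch using Python's standard `selectors` module: a single thread runs one loop, and each connection is just another registered socket rather than a process or thread of its own. It is not how nginx or lighttpd are implemented. The fixed response, the function names and the `stop` event are assumptions for illustration, and a real server would parse requests and buffer partial writes.

```python
import selectors
import socket

RESPONSE = b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok"


def make_listener(host="127.0.0.1", port=0):
    """Create a non-blocking listening socket (port 0 lets the OS pick one)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    sock.setblocking(False)
    return sock


def serve(listener, stop):
    """One thread, one loop; `stop` is a threading.Event used to end the loop."""
    sel = selectors.DefaultSelector()
    sel.register(listener, selectors.EVENT_READ)
    while not stop.is_set():
        for key, _events in sel.select(timeout=0.1):
            if key.fileobj is listener:
                try:
                    conn, _addr = listener.accept()
                except BlockingIOError:
                    continue  # the pending connection went away
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
            else:
                conn = key.fileobj
                try:
                    if conn.recv(4096):
                        conn.send(RESPONSE)  # toy: assumes it fits in one send
                except BlockingIOError:
                    continue  # not actually ready; wait for the next event
                except ConnectionError:
                    pass  # peer reset the connection; just clean up
                sel.unregister(conn)
                conn.close()
    sel.close()
```

To try it, create a listener with `make_listener()`, run `serve(listener, stop)` in a thread with a `threading.Event` as `stop`, and connect to the port reported by `listener.getsockname()`.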
You will probably want to stick to the event-driven family to get close to your "millions of concurrent connections" goal. Some event-driven servers are limited to one CPU. If you are on a multi-CPU machine, you will want to run one instance per CPU (some web servers take care of this for you, while others require you to script it and manage it yourself).