I am trying to understand how massive sites like Facebook or Wikipedia work, out of intellectual curiosity. I have read about various techniques for building scalable sites, but I am still puzzled about one particular detail.
The part that confuses me is that ultimately, the DNS will map the entire domain to a single IP address, or a handful of IP addresses in the case of round-robin DNS.
For example, wikipedia.org has only a single A record. So, people from all over the world visiting Wikipedia have to send a request to the one IP address specified in DNS.
What is the piece of hardware that listens on the IP address for a massive site, and how can it possibly handle all the load coming from the requests of users all over the world?
Edit 1: Thanks for all the responses! Anycast seems like a feasible answer... Does anyone know of a way to check whether a particular IP address is anycast-routed, so that I could verify that this really is the trick used in practice by large sites?
Edit 2: After more reading on the topic, it appears that anycast is not typically used for dynamic web content. Anycast is usually used for UDP (e.g., DNS lookups), or sometimes for static content.
One interesting thing to note is that Facebook uses profile.ak.fbcdn.net to host static content like style sheets and JavaScript libraries. Each time I ping this name, I get a response from a different IP address. However, I can't tell whether this is anycast in action or a completely different technique.
Back to my original question: as far as I can tell, even a large site will have a single expensive piece of load-balancing hardware listening on its handful of public IP addresses.
It isn't necessarily a single piece of hardware doing this but a complete system that has been designed to scale. This encompasses not only the hardware but, more importantly, the application design, the database design (relational or otherwise), the networking, the storage, and how they all fit together.
A good starting point for your curiosity about how some of the large sites scale is High Scalability - Start Here, along with the High Scalability write-ups on the Wikimedia architecture, Facebook, and Twitter as examples.
Regarding your question about DNS, single IP addresses, and round robin: these types of sites will often use load balancing as a way of presenting a single IP address. This can be done either with specialised hardware load balancers or with software running on general-purpose servers. The incoming requests to the IP managed by the load balancer are then distributed across a series of servers, transparently to the end user.
For a good explanation on this topic, including a comparison of hardware and software load balancers/proxies and how they compare to DNS round robin, have a read of Load Balancing Web Applications.
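To make that concrete, here is a minimal sketch of the software variant in Python. The two backend addresses are hypothetical, and a real balancer (LVS, HAProxy, nginx, or a hardware appliance) would add health checks, connection handling, and support for more than GET, none of which this toy does:

    import itertools
    import urllib.request
    from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

    # Hypothetical backend pool; a real balancer would also health-check these.
    BACKENDS = itertools.cycle(["http://10.0.0.11:8080", "http://10.0.0.12:8080"])

    class RoundRobinProxy(BaseHTTPRequestHandler):
        def do_GET(self):
            backend = next(BACKENDS)  # pick the next backend in rotation
            with urllib.request.urlopen(backend + self.path) as upstream:
                body = upstream.read()
            self.send_response(upstream.status)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    # Clients only ever see the balancer's single address; the pool behind
    # it can grow or shrink without anyone outside noticing.
    ThreadingHTTPServer(("", 8000), RoundRobinProxy).serve_forever()

The round-robin rotation here is the same idea a hardware balancer implements, just at far higher connection rates.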
Anycast can also be used for TCP connections, assuming the connections are short-lived so the routes do not change during the connection lifetime. This is a good assumption with HTTP connections (especially if Connection: Keep-Alive is kept to a short timeout or disabled).
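As a small illustration of the short-lived pattern, an HTTP client can opt out of keep-alive explicitly. A minimal Python sketch, with example.com standing in for any anycast-served site:

    import urllib.request

    # "Connection: close" asks the server to drop the TCP connection after this
    # one response, so every request rides a fresh, short-lived connection and
    # a route change between requests cannot break anything mid-stream.
    req = urllib.request.Request("http://example.com/",
                                 headers={"Connection": "close"})
    with urllib.request.urlopen(req) as resp:
        print(resp.status, len(resp.read()), "bytes")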
Many CDNs (CacheFly, MaxCDN, and probably many others) actually use anycast for TCP connections (HTTP), and not just DNS. When you resolve a hostname on CacheFly, you get the same IP address around the world; it is simply routed to the "closest" CacheFly cluster. "Closest" here is in terms of BGP path length and metrics, which is usually a better way to measure network latency than simple geographic distance.
In the case of Wikipedia specifically: http://www.datacenterknowledge.com/archives/2008/06/24/a-look-inside-wikipedias-infrastructure/
The easiest way to verify whether an IP address is anycast-routed is to run a traceroute from different locations. You can try the following: go to traceroute.org, pick a location, and run a traceroute to 8.8.8.8 (Google Public DNS, which uses anycast). You should see that a traceroute from a server in Australia to 8.8.8.8 stays within Australia.
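A rough complementary check is latency: one physical host cannot show a few milliseconds of round-trip time from several continents at once. Here is a small Python sketch to run from different vantage points; it times a TCP handshake to port 53, which Google Public DNS answers alongside UDP:

    import socket
    import time

    # Time the TCP handshake to the anycast address of Google Public DNS.
    start = time.monotonic()
    socket.create_connection(("8.8.8.8", 53), timeout=3).close()
    rtt_ms = (time.monotonic() - start) * 1000

    # Single-digit values from both Sydney and London would be physically
    # impossible for one box; each vantage point is reaching a nearby cluster.
    print(f"TCP connect time: {rtt_ms:.1f} ms")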
Instead of ping, try a hostname lookup, e.g.: http://network-tools.com/default.asp?prog=dnsrec&host=profile.ak.fbcdn.net
You will see the list of IP addresses behind that name. These IP addresses are used in a round-robin fashion, which is why ping shows you a different one each time.
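The same lookup is easy to script if you want to watch it yourself; a quick Python sketch (the records returned for this name will vary by location and over time):

    import socket

    # One lookup returns every A record currently published for the name,
    # not just the single address that a given ping happens to use.
    name, aliases, ips = socket.gethostbyname_ex("profile.ak.fbcdn.net")
    print("canonical name:", name)  # CDN names are usually CNAMEs to an edge cluster
    for ip in ips:
        print("A record:", ip)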
Igor, your question is great, and like so many innocent questions, there are many, many answers, all at different levels of detail.
The piece of hardware is a web server. Obviously ;-)
The piece of hardware is actually a cluster of load balancers, all of which pull from shared storage, so they are identically configured and serve identical material.
The piece of hardware is actually one of several clusters of load balancers, geographically dispersed, and you were directed to the one closest to you, a decision made by the DNS server.
Google released a bit of information about their homegrown hardware architecture last year, and it makes for a good read.
A single IP address does not necessarily mean a single server: http://en.wikipedia.org/wiki/Anycast
Larger sites use several different techniques together. The websites you mentioned all have servers in almost every country. Based on the IP address of the website visitor, the DNS server gives back the IP address of the cluster nearest to that visitor. Akamai provides such a service (click on the picture on their website for more information).
Those "clusters" in each datacenter in turn consist of several different machines (database servers, web servers, load balancers, etc.). Depending on what you are serving with your website, you might also have some dedicated servers for static content, and so on.
Massive sites like Facebook or Wikipedia rely on several different technologies to achieve scalability.
One of those technologies is DNS. DNS is configured to load-balance with round robin, and the DNS configuration is often smart enough to figure out where your request is coming from and return the address of the site closest to you. So if you do a dig you will see multiple records, but if you do a ping you will always get back the same address.
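That dig-versus-ping contrast is easy to reproduce in code. A small Python sketch, reusing profile.ak.fbcdn.net from earlier in the thread as a stand-in for any name with several A records:

    import socket

    host = "profile.ak.fbcdn.net"  # any geo/round-robin name works here

    # The dig-style view: every A record the resolver returns for the name.
    print("records:", socket.gethostbyname_ex(host)[2])

    # The ping-style view: one address chosen per lookup; repeat a few times
    # to see whether your resolver keeps handing back the same nearby cluster.
    for _ in range(3):
        print("picked: ", socket.gethostbyname(host))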
At the site, the first piece of hardware you hit is a reverse proxy or a load-balancer pool. The pool is set up so that all machines answer on the same IP; the balancer then pins each session to a particular backend (for example via a cookie it injects), so all further requests from that client go through the same node.
The load balancers employed by large sites are not large, expensive pieces of equipment; they are commodity servers running LVS (Linux Virtual Server). http://www.linuxvirtualserver.org/
Massive sites like Google almost certainly design their own hardware. Large sites would probably use a multi-layer switch to load balance connections to multiple actual servers. http://en.wikipedia.org/wiki/Multilayer_switch