I found this explanation of how a CDN works, but there is one thing I don't really understand. Let's assume I set up multiple DNS servers at my location and they use the nameserver domains dns1.example.com, dns2.example.com and dns3.example.com. These DNS servers are able to return a server IP depending on the visitor's location (ping, geo database, browser language or whatever). Now I update the nameserver settings for my domain www.example.org at the registry.
Now, the very first request for www.example.org with an expired TTL tries to resolve the domain. It asks:
- the local hosts file/DNS cache; if the TTL has expired:
- the internet provider's DNS; if the TTL has expired:
- the root DNS; if the TTL has expired:
- my local dns1.example.com
But if I understand it correctly, the new IP is then added to all these nameserver caches until the TTL expires again. So how is it possible to send different IPs to visitors depending on their location?
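To make the caching step concrete, here is a minimal sketch of a per-resolver TTL cache. The class names, the fake clock and the placeholder IPs (192.0.2.10, 203.0.113.10) are assumptions for illustration, not part of any real DNS software. It only shows that each resolver keeps its own cache, so a cached answer is shared among that resolver's clients and not among all resolvers.

```python
import time


class ResolverCache:
    """Toy TTL cache for ONE recursive resolver (hypothetical, not a real DNS library)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (ip, expiry timestamp)

    def put(self, name, ip, ttl):
        # Remember the answer until `ttl` seconds from now.
        self._entries[name] = (ip, self._clock() + ttl)

    def get(self, name):
        # Return the cached IP, or None if nothing is cached or the TTL has expired.
        entry = self._entries.get(name)
        if entry is None:
            return None
        ip, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]
            return None
        return ip


class FakeClock:
    """Manually advanced clock so the example does not have to sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
isp_a = ResolverCache(clock)  # e.g. a resolver run by one ISP
isp_b = ResolverCache(clock)  # a resolver run by another ISP

# Each resolver caches whatever the authoritative server told it.
isp_a.put("www.example.org", "192.0.2.10", ttl=300)
isp_b.put("www.example.org", "203.0.113.10", ttl=300)

print(isp_a.get("www.example.org"))  # 192.0.2.10
print(isp_b.get("www.example.org"))  # 203.0.113.10

clock.now = 300  # the TTL has elapsed, so the entry is gone
print(isp_a.get("www.example.org"))  # None
```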
In this answer theandym said the request is "forwarded", but I don't think this is how a CDN works, because "forwarding" means lengthening the transmission path, resulting in a longer loading time. Or does a CDN require a zero TTL for the domain?
Update1
Through this question I found Google's document describing how they optimized CDN performance. It did not explain how the CDN works in general, but there were interesting explanations like the following:
Thereafter, whenever a client attempts to fetch content hosted on the CDN, the client is redirected to the node determined to have the least latency to its prefix. This redirection however is based on the prefix corresponding to the IP address of the DNS nameserver that resolves the URL of the content on the client’s behalf, which is typically co-located with the client.
This means Google first measures the latency of all IP prefixes and builds a DNS resolution table (?) for all available prefixes. And if a visitor has the IP 198.51.100.231, the Google server IP that is set for the prefix 198.51.100.0 is used. But again: how does Google's DNS know which IP the visitor is using? Most visitors resolve Google's domain through their internet provider, so the resolution is done by those external DNS servers, isn't it?
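A rough sketch of such a prefix table, assuming made-up latency measurements, a /24 prefix length and placeholder node IPs (none of these come from Google's document). Note that, according to the quoted passage, the address being looked up is that of the resolving nameserver, not of the visitor itself.

```python
import ipaddress

# Hypothetical latency measurements in milliseconds: prefix -> {node IP: latency}.
MEASURED_LATENCY_MS = {
    "198.51.100.0/24": {"192.0.2.10": 12.0, "192.0.2.20": 95.0},
    "203.0.113.0/24": {"192.0.2.10": 110.0, "192.0.2.20": 8.0},
}


def build_table(measurements):
    """Map each prefix to the node with the lowest measured latency."""
    return {
        ipaddress.ip_network(prefix): min(nodes, key=nodes.get)
        for prefix, nodes in measurements.items()
    }


def node_for(table, address, default=None):
    """Return the node assigned to the most specific prefix containing `address`."""
    ip = ipaddress.ip_address(address)
    matches = [net for net in table if ip in net]
    if not matches:
        return default
    return table[max(matches, key=lambda net: net.prefixlen)]


table = build_table(MEASURED_LATENCY_MS)
print(node_for(table, "198.51.100.231"))  # 192.0.2.10
```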
As an additional example: if I start a DNS resolution for the domain facebook.com with different online tools (hosted in different countries), it is resolved to different IPs with different reverse hostnames, like:
- 31.13.92.36 Reverse: edge-star-mini-shv-01-frt3.facebook.com
- 31.13.76.68 Reverse: edge-star-mini-shv-01-sea1.facebook.com
- 31.13.69.228 Reverse: edge-star-mini-shv-01-iad3.facebook.com
- 157.240.2.35 Reverse: edge-star-mini-shv-01-ort2.facebook.com
After that I thought it could depend on the location of the DNS server used by the visitor, but I tried my own (Deutsche Telekom, Germany), Google's (8.8.8.8) and a major one from France (Orange), and they all returned the IP 31.13.92.36 for facebook.com.
Ok, it seems I can now give a rough answer to my own question. Anurag Bhatia says that there are two methods by which a CDN can work:
DNS
Let's say we have a server with the IP 1.2.3.4 located in the USA and a cache server with the IP 2.3.4.5 located in Germany. Now a visitor tries to resolve the domain example.org. If he did not change his network settings, he uses the DNS server of his internet service provider (ISP). This ISP now asks dns1.example.com (the nameserver of the domain) for the IP. The answer depends on the location of the ISP: if it is located in Germany, dns1.example.com returns 2.3.4.5, and if it is located in the USA, it returns 1.2.3.4.
But there might be a disadvantage with this method: every time a user changes his network settings and uses a DNS provider that does not support the EDNS0 client-subnet extension (see IETF draft), for example a corporation's central DNS server, dns1.example.com will again answer with the IP nearest to that DNS server's location, but this time the visitor is most likely in a different location, causing a higher latency.
DNS providers that support the EDNS0 client-subnet extension pass information about the user's network to the authoritative DNS server, so the authoritative DNS server can respond with the IP nearest to the location of the user.
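The decision that dns1.example.com makes could be sketched roughly like this. The region table, the prefixes and the function names are assumptions for illustration; only the two server IPs come from the example above. A real authoritative server would use a geo database and actual DNS message parsing.

```python
import ipaddress
from typing import Optional

# Hypothetical geo table: network prefix -> region.
REGION_OF_PREFIX = {
    ipaddress.ip_network("198.51.100.0/24"): "DE",
    ipaddress.ip_network("203.0.113.0/24"): "US",
}
# Server IPs from the example above.
SERVER_FOR_REGION = {"US": "1.2.3.4", "DE": "2.3.4.5"}
DEFAULT_REGION = "US"


def region_of(address: str) -> str:
    """Look up the region of an IP address, falling back to a default."""
    ip = ipaddress.ip_address(address)
    for net, region in REGION_OF_PREFIX.items():
        if ip in net:
            return region
    return DEFAULT_REGION


def answer(resolver_ip: str, client_subnet: Optional[str] = None) -> str:
    """Pick the server IP to return for example.org.

    Without a client subnet, only the resolver's address is known.
    With one (as sent by an EDNS0 client-subnet capable resolver),
    the visitor's own network is used instead.
    """
    if client_subnet is not None:
        net = ipaddress.ip_network(client_subnet, strict=False)
        located = str(net.network_address)
    else:
        located = resolver_ip
    return SERVER_FOR_REGION[region_of(located)]


# A German visitor behind a resolver located in the USA:
print(answer("203.0.113.53"))                     # 1.2.3.4 (based on the resolver)
print(answer("203.0.113.53", "198.51.100.0/24"))  # 2.3.4.5 (based on the visitor's subnet)
```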
Anycast
I don't really understand Anycast because of BGP, etc., but I think the further explanation of Anurag Bhatia gives an idea of how it could work:
Anycast also has a disadvantage: routing is flexible. While the target node might be located in network A at the start of a TCP session, it may change to network B during the session. Therefore Anycast is used in practice mainly for UDP, which is a session-less protocol.
Most CDNs use DNS for HTTP/HTTPS traffic and Anycast for DNS requests.
Your application will point to the CDN (for assets, images, APIs etc.). Then the CDN will either use its cache or fetch data/files from your servers. In your example, you will point to cdn.example.com and the CDN will route it to dns1.example.com. cdn.example.com will fetch data from the nearest location on the anycast network, so IPs can be different.
Source:
https://www.youtube.com/watch?v=JX2qrdp0WT4
https://www.akamai.com/blog/developers/how-cdn-can-make-your-apis-more-powerful