I'm having a hard time understanding how a governing body assigns IP addresses, companies use BGP to advertise those IPs, and how the internet works. Then, where the hell does DNS come in?
Can anyone suggest a good read of how this stuff actually works? I suppose I have several questions. The first is, does ARIN (or any other governing body) actually matter? If they weren't around, would there be chaos? When they assign a block, they don't LITERALLY assign it? You have to use BGP to advertise, correct? I have always been used to a closed hosting environment (dedicated/shared) where you have routed IPs.
Then, how does DNS come in to play? With my registrar I am able to register a DNS server (eNom) - what does that actually mean? I've installed Bind and made all of that work, and I run my own DNS servers, but who are they registering that DNS server with? I just don't get it.
I feel like this is something I should know and I don't, and I'm getting really frustrated. It's like.. simple.. how does the internet work? From assigning IPs, to companies routing them, and DNS.
I guess I have an example - I have this IP space, let's say 158.124.0.0/16 (example). The company has 158.124.0.0/17 internet facing. (First of all, why do companies get blocks of IPs assigned and then not use them? Why don't they use reserved internal space like 10.x and 192.168.x?). So, that's where I'm at. What would I do to actually get these IPs on the Internet and available? Let's say I have a data center in Chicago and one in New York. I'm not able to upload a picture, but I can link one here: http://begolli.com/wp-content/gallery/tech/internetworkings.png
I'm just trying to understand how from when the IP block is assigned, to a company using BGP (attaining a public AS #?), and then how DNS comes in to play?
What would something look like from my picture? I've tried to put together a scenario, not sure if I did a good job.
Leased IP Blocks
IPs are assigned in blocks by IANA to the Regional Internet Registries (RIRs). The five RIRs are ARIN, RIPE NCC, APNIC, LACNIC, and AFRINIC. The RIRs then lease out smaller blocks of IPs to individual companies (usually ISPs). There are requirements (including fees and proof of use) for getting an allocation, and failing to maintain these means losing the lease.
Once a company has leased one or more blocks from the RIR, they need some way of telling the rest of the world where to find a particular IP (or set thereof: subnets). This is where BGP comes into play. BGP uses a large network concept called an Autonomous System (AS). The AS knows how to route within itself. When routing to another network, it only knows about AS gateways and where the "next hop" toward those external addresses is. AS numbers are managed by IANA as well.
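To make the "next hop" idea concrete, here is a minimal sketch of how a router picks a route: it chooses the most specific prefix that contains the destination. The table contents, the AS number (64500, from the range reserved for documentation), and the site labels are all invented for illustration; real routers learn these entries from BGP rather than from a hand-written dictionary.

```python
import ipaddress

# Hypothetical routing table: prefix -> next-hop label (all values invented).
ROUTES = {
    ipaddress.ip_network("158.124.0.0/16"): "AS64500 via Chicago",
    ipaddress.ip_network("158.124.128.0/17"): "AS64500 via New York",
    ipaddress.ip_network("0.0.0.0/0"): "default upstream",
}

def next_hop(address):
    """Return the next hop for the most specific prefix containing the address."""
    ip = ipaddress.ip_address(address)
    matches = [net for net in ROUTES if ip in net]
    # Longest-prefix match: the largest prefix length is the most specific route.
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]
```

With this table, an address in the upper half of the /16 follows the more specific /17 entry, and anything outside the /16 falls through to the default route.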
Within an AS, even one as large as an ISP, they might use several routing protocols (RIP, OSPF, BGP, EIGRP, and IS-IS come to mind) to route traffic internally. It's also possible to use static routing tables, but that is impractical in most applications. Internal routing protocols are a huge topic, so I'll simplify by saying there are other questions on Server Fault that can do those topics more justice than I can here.
DNS
Humans don't remember numbers well, so we invented host names. Skipping the history, we use the Domain Name System (DNS) to keep track of what hostname points to what IP address. There is a central registry for these, also managed by IANA, and they determine what Top Level Domains (TLDs) (e.g. ".com" or ".net") go in the Root Zone, which is served by the Root Servers. Administration of each TLD is delegated to a registry operator, which only accepts updates from accredited Registrars.
You can use a Registrar to "purchase" a domain name, which is a subdomain of a TLD. This registration essentially creates that subdomain and assigns you control over its Name Server (NS) and Glue (A) records. You point these to a DNS server that hosts your domain. When a client wants to resolve your IP from a domain name, the client contacts their DNS server, which does a recursive lookup: it starts with a root server, follows the referrals down to your DNS server, and eventually gets the relevant information.
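The lookup described above can be sketched with a toy, in-memory delegation tree. This is not how a real resolver is implemented (there is no network traffic, caching, or glue handling here); the server names and the 192.0.2.10 address are assumptions made up for the example.

```python
# Toy delegation data; every name and address here is invented for illustration.
# Each "server" maps a name to either a referral ("NS") or a final answer ("A").
SERVERS = {
    "root": {"com.": ("NS", "com-tld")},
    "com-tld": {"yoursite.com.": ("NS", "yoursite-ns")},
    "yoursite-ns": {"www.yoursite.com.": ("A", "192.0.2.10")},
}

def resolve(name):
    """Walk from the root toward the authoritative server, following referrals."""
    server = "root"
    visited = [server]
    while True:
        table = SERVERS[server]
        # Which entries on this server cover the name being asked about?
        known = [zone for zone in table if name == zone or name.endswith("." + zone)]
        if not known:
            raise LookupError(f"{server} knows nothing about {name}")
        zone = max(known, key=len)  # most specific entry wins
        kind, value = table[zone]
        if kind == "A":
            if zone != name:
                raise LookupError(f"no exact record for {name}")
            return value, visited
        # Referral: ask the next server down the hierarchy.
        server = value
        visited.append(server)
```

Calling resolve("www.yoursite.com.") visits the root, then the com servers, then the domain's own name server, which is the same chain the registrar sets up when it records your NS and glue records.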
Everyone Agrees
As for the "governing bodies": everyone just agrees to use them. There are no (or very few) laws requiring anyone to cooperate at all. The Internet works because people choose to cooperate. The governing bodies provide a means of easy cooperation. All the various RFCs, "Standards", and such - nobody is being forced to use them. But we understand that society is built on cooperation, and it's in our own self interests to do so.
The efficiency bred by cooperation is the same reason BGP is popular: everyone basically agrees to use it. In the days of ARPANET they started with hand-configured route tables, then gradually progressed to a more comprehensive system as the Internet grew in complexity, but everyone just "agreed" to use whatever the new standard was. Similarly, name resolution started with host files that networks would distribute, and eventually grew into the DNS system we know today. ("Agreed" is in quotes because many times a minority set a requirement for a new standard and nobody else had a better alternative, so it was accepted.)
Trust
This level of cooperation requires trusting IANA, a lot. As you've seen, they manage the cores of most of the various systems. IANA is currently a US Government sponsored non-profit corporation (similar to the US Post Office); it is not part of the government, though only barely removed. In past years there was concern that the US Government might exercise some control over IANA as a "weapon" against other world governments or civilians (particularly through laws like SOPA and PIPA, which were not passed, but may be the basis for future laws).
Currently IANA has taken it upon themselves to raise funding (despite being a non-profit company) through the creation of new TLDs. The "xxx" TLD was viewed by some as an extortionist-style fundraising campaign, as a large percentage of registrants were "defending" their name. IANA has also taken applications for privately owned TLDs (at $180,000 each). They suspended the application process after being inundated with applications, nearly half being from Amazon alone. Many of these applications resulted in new gTLDs.
All advertisements to the public internet, the DFZ (Default-Free Zone), are done via BGP (Border Gateway Protocol). How ISPs do internal routing varies a lot. Most use BGP internally as well, both between their own routers (often in conjunction with an IGP such as OSPF) and with clients. If you don't have your own AS number, you can use a private AS to peer with your ISP; when they announce your address space to the DFZ, they simply remove the private AS from the as-path. For smaller non-redundant links, you can use static routing on the PE (provider edge router) instead. The actual "assignment" is just an entry in the database of your registry, the whois database; RIPE, ARIN, etc. run their own databases for this purpose.
Try running the command
whois 158.124.0.0/16
on a Linux box. The same goes for DNS: the reverse DNS server is specified in the whois records.
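The private-AS handling mentioned above (the ISP removing your private AS from the as-path before announcing to the DFZ) can be sketched roughly like this. The private ranges are the ones reserved for private use (64512-65534 for 16-bit ASNs, 4200000000-4294967294 for 32-bit ASNs); the function names and the example AS numbers are hypothetical, and real routers do this with vendor-specific configuration rather than code like this.

```python
def is_private_asn(asn):
    """True if the AS number falls in a range reserved for private use."""
    return 64512 <= asn <= 65534 or 4200000000 <= asn <= 4294967294

def announce_upstream(isp_asn, received_path):
    """Build the as-path the ISP would send to the DFZ for a customer route.

    Private AS numbers are dropped, then the ISP prepends its own AS.
    """
    public_part = [asn for asn in received_path if not is_private_asn(asn)]
    return [isp_asn] + public_part
```

For example, a route received from a customer using private AS 65001 would leave an ISP with AS 64500 carrying only the ISP's AS in the path, so the rest of the internet sees the prefix as originating from the ISP.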
This is a very old question, but I had many of the same questions in figuring out how the Internet works. Like the other answers, the networking books give an overview of BGP and DNS but still left me confused. For example, a.root-servers.net through m.root-servers.net are given as the root servers, but how does a DNS service know where to find those servers if it can't use DNS to look them up?
The basics of IP, subnetting, DNS, etc. are assumed to be known by this answer. I am addressing "gaps" I, and probably the questioner, have on how the Internet works. By no means am I an expert, but this is my understanding of the gaps.
IP Addresses
The first thing to note is that when the Internet started out as ARPANET, everybody knew everybody and routing tables for IP addresses were hand-coded. I assume the assignment process for IPs was done over the phone. As the Internet became too big for that, BGP was used by multiple networks (ASes) to advertise that they had public IPs or could get to a public IP through their AS to another AS. The trust was there that an AS wouldn't advertise an IP it didn't have.
Today, there's not as much trust. Instead, ISPs can download the IP allocations for each AS from IANA and the regional authorities. These downloads are now authenticated through public key cryptography. So when IANA "assigns an IP address," they are changing their record (or really the regional authority changes its record). All other ASes can download and authenticate those records.
These records are important because ISPs can't take the word of other ISPs that they have the IP addresses. The ISPs can compare a BGP advertisement with the authenticated IP records. If a BGP advertisement shows as its last AS (the origin) an AS other than the one in IANA's and the RIRs' authenticated records, the ISP ignores the advertisement and does not change its own routing.
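A rough sketch of that comparison, assuming the authenticated records have already been downloaded and verified: the check looks at the last AS in the advertised path and compares it with the AS on record for the covering prefix. The record table, the function name, and the "valid"/"invalid"/"unknown" labels are assumptions for illustration, and the AS numbers come from the documentation range.

```python
import ipaddress

# Hypothetical, already-authenticated records: prefix -> AS allowed to originate it.
RECORDS = {
    ipaddress.ip_network("158.124.0.0/16"): 64500,
}

def check_origin(prefix, as_path):
    """Compare the origin (last AS in the path) with the authenticated record."""
    net = ipaddress.ip_network(prefix)
    origin = as_path[-1]
    for recorded_net, recorded_asn in RECORDS.items():
        if net.subnet_of(recorded_net):
            return "valid" if origin == recorded_asn else "invalid"
    # No record covers this prefix, so the check cannot say either way.
    return "unknown"
```

An advertisement judged "invalid" here is the kind that would be ignored rather than installed in the routing table.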
More commonly, a rogue ISP or AS can advertise a route through their AS that they don't have. Say AS1 has an IP registered and AS5 currently uses AS5 -> AS4 -> AS3 -> AS1 -> IP. AS2 advertises to AS5 a route of AS5 -> AS2 -> AS1 -> IP, except AS2 doesn't actually have a connection with AS1. It can just drop the packets, maybe to frustrate AS1's hosting customers. Or AS2 could be a small company network multihomed to AS5 and AS1, whose misconfigured router advertises a path through the small company network. Nearly all ISPs throw away such advertisements from their BGP customers and only pass on advertisements that terminate at the customer.
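The customer filter described in the last sentence can be sketched as a one-line rule: accept a customer's advertisement only if every AS in the path is the customer's own, so the path terminates at the customer instead of offering transit through it. The function name and AS numbers are hypothetical, and real ISPs usually combine this with prefix filters.

```python
def accept_from_customer(customer_asn, as_path):
    """Accept only paths that terminate at the customer (no transit through it).

    Repeated copies of the customer's own AS (prepending) are still allowed.
    """
    return bool(as_path) and all(asn == customer_asn for asn in as_path)
```

Under this rule, the misconfigured small network above would have its own prefixes accepted, but any path it relays from its other provider would be dropped.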
More likely, you have the case of Pakistan trying to shut off YouTube in Pakistan through such IP hijacking, and shutting off YouTube outside of Pakistan too, since ASes outside of Pakistan assumed the BGP advertisements were correct.
In the end, there isn't a perfect defense against such IP hijacking. In most countries, like the US, such abuse of BGP can be punished as breach of contract, and other ISPs will shut off peering connections with that AS if they have to. An ISP could also disregard the whole IANA and RIR apparatus and redirect the IP addresses to their own servers. That won't work for any https sites though, assuming the ISP doesn't have the private keys for any CAs. There is very little to gain from it economically. It only happens with authoritarian governments, such as Egypt recently shutting off all BGP advertisements to their ISPs from outside the country.
DNS Servers
DNS is somewhat simpler once the IP tables are correct. The root servers are all hardcoded IP addresses in the DNS server code. a.root-servers.net is 198.41.0.4, and the IP address is anycast within one AS. In the case of a.root-servers.net, the AS is Verisign and there are five different sites. In the US, the two sites are New York and LA. Anycasting is like if you had an address of 123 Main Street and you said "It doesn't matter what town you are in, go to 123 Main Street and you'll find one of my businesses." Both 123 Main Street in NY and LA will give the same answer for all top-level domains. The AS, in this case Verisign, figures out internally which server has the fewest hops through OSPF, internal BGP, and other routing protocols. So a router in Denver may go to LA while a router in Chicago goes to New York. The same routing process can be used for anycast hosts because the hosts don't offer to route traffic.
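As a small illustration of the anycast idea, the sketch below picks whichever site is the fewest hops away from a given router. The hop counts are invented for the example; in practice the choice falls out of the routing protocols themselves rather than a lookup table.

```python
# Hypothetical hop counts from each router to each anycast site (invented values).
HOPS = {
    "New York": {"Chicago": 2, "Denver": 5},
    "Los Angeles": {"Chicago": 6, "Denver": 3},
}

def nearest_site(router):
    """Return the anycast site with the fewest hops from the given router."""
    return min(HOPS, key=lambda site: HOPS[site][router])
```

With these numbers, the Denver router ends up at the Los Angeles site and the Chicago router at the New York site, even though both are sending to the same IP address.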
One of the root servers gives the IP addresses of the name servers for the com top-level domain. Those servers then give the name servers for yoursite.com. The registrars really have a contract with whoever runs the top-level domain. So if the top-level domain currently doesn't have a record for yoursite.com, the registrar has access to add one, along with an entry on the whois server. Then, with the access the registrar gave you to yoursite.com's DNS records, you change the records in their DNS server to point to your IP address.
Because DNS depends on multiple IP addresses going to the right place, you have the same issue as before with ASes authenticating the IP registry and then the BGP advertisements. That is the key piece for an http website. Https has the added protection of certificates. An ISP could reroute requests to its own root servers and top-level domain servers to give its own IP for, say, citibank.com, but it would gain nothing: the user would be sent to a different IP address, and that server won't have Citibank's private key.
And no, I'm not kidding (I got started with this book 15 years ago, but it's still very relevant): http://www.amazon.com/Internet-Dummies-John-R-Levine/dp/0764506749
Then, come back here with the BGP questions =)