This very well may not be the correct forum to post this sort of thing on, but it's a question that's baffled me for years.
I'm very well aware of how servers themselves work and am fairly well versed in most things computer.
Take serverfault.com for example.
I type http://www.serverfault.com/ into my browser's address bar. At that point my computer performs a reverse DNS lookup to get the IP address.
I know that the DNS information is located on a DNS server, but how does my computer know where to immediately look for that? Are there DNS servers at fixed IP addresses that my computer automatically goes looking for?
Second, once my computer has the IP address of serverfault.com, it goes to (in my case) Comcast to start making its way to serverfault's servers. How does that process work? When running a trace route to serverfault.com, it makes about 16 hops until it finds exactly what I'm looking for.
Obviously it can't make a direct connection to serverfault.com because that would take a direct physical route to the server, but what controls the request through the network? What makes it take the route that I see in the trace route?
I know this question is insanely open, but if I could at least get some links to outside sources or know what to search Google for, that would be immensely helpful.
(I'm late to this party but let's see what I can do w/ it anyway.)
You're asking about two distinctly different things-- name resolution via DNS and IP routing. Let's tackle them individually.
The DNS Part
Typically, client computers are informed of the IP addresses of the DNS servers they are to use by way of data in "options" given to them by their DHCP server (the server that provides a "lease" of an unused IP address for the client to use).
Some computers don't use DHCP to obtain an IP address, but rather have their IP address statically-assigned. In cases where computers have statically assigned IP addresses the DNS server IP addresses are also statically assigned in the computers' configuration.
(There are more esoteric methods for getting DNS server information to clients, but the two above cover well over 90% of cases.)
It sounds like your computer is on a home network and is probably getting its IP address assigned from a DHCP server running on either a router in your home network or a DHCP server at your ISP.
In the case of a home router, it will have obtained an IP address from your ISP's DHCP server and, in the course of obtaining that IP address, will have learned the IP address(es) of the DNS servers your ISP intends you to use. Some home routers will provide the ISP DNS server addresses to the clients of the router's DHCP server. Still other home routers will run a "mini" DNS server themselves and will direct DHCP clients to their own "mini" DNS server. Typically this "mini" DNS server will just forward requests on to the ISP's DNS servers.
If your computer is connected directly to the ISP's network w/o a router then, very likely, your computer is being provided the IP addresses of the ISP's DNS servers by the ISP's DHCP server.
IP datagrams contain the destination IP address and not a human-readable name. In order to "talk to" a remote server your computer needs the IP address of that remote server. The process of "resolving" a human-readable name into an IP address (suitable for inclusion in IP datagrams as a destination address) is called a forward DNS lookup.
I won't belabor a full description of recursive forward DNS resolution here, but, basically, your client computer sends a request to its DNS server (the one it learned about from DHCP or that is statically configured) for the name "www.serverfault.com". That request will make its way, eventually, to a DNS server at your ISP. The DNS server at your ISP will make a request to one of a list of well-known "root DNS servers". The answer that the root DNS server returns will, in turn, guide a request by the ISP's DNS server to a ".com" DNS server, then to a "serverfault.com" DNS server. Ultimately, an answer will be returned by the ISP's DNS server to your computer (possibly via DNS server in your home router, as mentioned above).
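To make the chain of referrals a bit more concrete, here's a toy sketch in Python. It doesn't speak the real DNS protocol at all; the server names, the `ZONES` table, and the address 203.0.113.10 are all made up for illustration. It only mimics the "ask the root, get pointed at .com, get pointed at serverfault.com" sequence that the ISP's DNS server walks through.

```python
# Toy model of the referral chain a recursive resolver follows.
# Server names and the final address are hypothetical, not real DNS data.
ZONES = {
    "root": {"referral": {"com.": "com-server"}},
    "com-server": {"referral": {"serverfault.com.": "sf-server"}},
    "sf-server": {"answer": {"www.serverfault.com.": "203.0.113.10"}},
}


def resolve(name, server="root"):
    """Follow referrals from the root until some server answers."""
    path = []
    while True:
        path.append(server)
        zone = ZONES[server]
        answers = zone.get("answer", {})
        if name in answers:
            return answers[name], path
        # Follow the referral whose zone is a parent of the queried name.
        for suffix, next_server in zone.get("referral", {}).items():
            if name == suffix or name.endswith("." + suffix):
                server = next_server
                break
        else:
            raise LookupError(f"{server} has no data for {name}")


address, servers_asked = resolve("www.serverfault.com.")
print(address, servers_asked)
```

A real resolver also caches every answer and referral it sees, which is why most lookups never have to go all the way back to the root.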
I'd encourage you to look into some more technical descriptions of how the DNS protocol works if you're interested in details.
That gets us through the DNS-related part of your question. Now, let's move on to the IP routing part of the question.
The IP Routing Part
The result of all those DNS queries will be an IP address (or multiple IP addresses, to be technical). Your browser will initiate a TCP connection to one of the addresses returned by the DNS query. This will result in your computer sending an IP datagram (destined for an IP address returned by our earlier DNS query) to the "default gateway" known by your computer. That "default gateway" is nothing more than the IP address of another computer (typically a router) that your computer "hands off" packets to for delivery to the Internet. Assuming you're using Ethernet, the specifics of how your computer "hands off" an IP datagram involve the ARP protocol and specifics that are probably a bit too deep for this answer.
You might ask: How does your computer know what the IP address of its default gateway is?
Similarly to the way in which computers receive their DNS server addresses from DHCP, computers have their "default gateway" provided to them by an "option" received when an IP address is "leased" to them from DHCP. If a computer has a statically-assigned IP address then, typically, its "default gateway" will also be statically assigned.
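For the curious, the DHCP "options" mentioned above are just a run of code/length/value triples inside the DHCP reply. Below is a rough sketch of decoding them. The option codes (1 = subnet mask, 3 = router, 6 = DNS servers, 255 = end) come from RFC 2132; the function name `parse_dhcp_options` and the sample bytes are my own invention, and a real DHCP client handles many more options than this.

```python
import ipaddress

# Option codes from RFC 2132: 1 = subnet mask, 3 = router (default gateway),
# 6 = DNS servers. The addresses in the sample below are made-up values.
OPTION_NAMES = {1: "subnet_mask", 3: "routers", 6: "dns_servers"}


def parse_dhcp_options(raw):
    """Decode the code/length/value options field of a DHCP reply."""
    found = {}
    i = 0
    while i < len(raw):
        code = raw[i]
        if code == 255:      # end marker
            break
        if code == 0:        # padding byte, has no length field
            i += 1
            continue
        length = raw[i + 1]
        value = raw[i + 2 : i + 2 + length]
        if code in OPTION_NAMES:
            # Each of these three options is a list of 4-byte IPv4 addresses.
            found[OPTION_NAMES[code]] = [
                str(ipaddress.IPv4Address(value[j : j + 4]))
                for j in range(0, length, 4)
            ]
        i += 2 + length
    return found


sample = bytes([
    1, 4, 255, 255, 255, 0,                  # subnet mask
    3, 4, 192, 168, 1, 1,                    # default gateway
    6, 8, 192, 168, 1, 1, 198, 51, 100, 53,  # two DNS servers
    255,                                     # end of options
])
print(parse_dhcp_options(sample))
```

So a single DHCP exchange can hand your computer its address, its subnet mask, its default gateway, and its DNS servers in one go.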
Fundamentally, IP routing is a game of "handing off" packets from one computer (routers) to another until the packet gets to its destination (or "dies trying" if the packet gets forwarded too many times). Each router has a number of network interfaces that connect it to other routers. When it receives a packet a router "decides" which network interface would be the "best" for that packet to leave through and, after making that "routing decision", it hands the packet off to another router via the chosen network interface. That process repeats until your packet reaches its destination.
My epic subnetting answer discusses the basics of static IP routing. In static IP routing, each router has a statically-assigned list of destination networks and understands the "adjacency" of networks to the router's network interfaces. In Real Life(TM), static IP routing isn't used within large networks because it's too cumbersome to maintain and it doesn't take into account routing around congestion or failed links.
The "traceroute" you perform shows you the results of routing decisions made by routers along each "hop" of your packet's path. These routers use dynamic routing protocols, like Border Gateway Protocol (BGP) or Open Shortest Path First (OSPF), to make decisions about how to route your packet to another router. These dynamic routing protocols can take into account factors such as link congestion or availability, relative "distance" your packet would travel along each prospective path, and possibly other factors (including "political" factors such as peering agreements) to determine where your packet goes.
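As a small illustration of the "relative distance" idea: OSPF is a link-state protocol, meaning each router builds a map of the links it knows about and runs a shortest-path computation (Dijkstra's algorithm) over the link costs. The sketch below shows only that computation on a hypothetical four-router topology; the router names and costs are invented, and it leaves out everything else OSPF does (neighbor discovery, flooding link-state updates, areas). BGP works differently and layers policy on top, so don't read this as a model of BGP.

```python
import heapq

# Hypothetical topology: link costs between routers (lower is better).
LINKS = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}


def shortest_path(links, source, target):
    """Dijkstra's algorithm: return (total_cost, list_of_routers)."""
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in links[node].items():
            if neighbor not in visited:
                heapq.heappush(
                    queue, (cost + link_cost, neighbor, path + [neighbor])
                )
    raise ValueError(f"no path from {source} to {target}")


print(shortest_path(LINKS, "A", "D"))

# Simulate the B-C link failing: the computed path changes accordingly.
degraded = {router: dict(neighbors) for router, neighbors in LINKS.items()}
del degraded["B"]["C"]
del degraded["C"]["B"]
print(shortest_path(degraded, "A", "D"))
```

The second call is the interesting one: once the failed link is removed from the map, the same computation picks a different path with no human intervention, which is exactly what static routing can't do.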
The specifics of how individual dynamic routing protocols work are well beyond this answer. Fortunately, the architecture of the Internet is such that endpoints (like your computer, or the servers at Serverfault.com) don't need to know anything about routing of packets inside the "cloud". As long as all the routers inside the network play by the proper rules, packets will be delivered (though IP allows for out-of-order delivery and loss of packets-- higher-level protocols take care of handling these occurrences). Better still, new dynamic routing protocols can be devised and implemented inside the "cloud" and nothing needs to change for all the end-points to take advantage of improved routing.
Actually, it does a forward lookup. A reverse DNS lookup is when you want to map an IP address to a domain name.
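If it helps to see the difference, a reverse lookup is really just a query for a PTR record under a specially formed name built from the IP address. Here's a quick sketch using Python's standard `ipaddress` module; 203.0.113.10 is a documentation address standing in for the real one.

```python
import ipaddress

# 203.0.113.10 is a documentation address used here as a stand-in.
address = ipaddress.ip_address("203.0.113.10")

# A reverse lookup asks for the PTR record of this specially formed name:
# the octets are reversed and placed under the in-addr.arpa domain.
reverse_name = address.reverse_pointer
print(reverse_name)
```

With network access, the standard library's `socket.gethostbyname` and `socket.gethostbyaddr` perform the forward and reverse lookups respectively.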
Every domain has its nameserver records. These nameserver records point to the authoritative DNS servers for the domain. Your PC finds this by going to its own pre-configured DNS servers. You can confirm this on Windows:
Yes. Sort of. There are root servers, but your local machine is unlikely to query them directly. If you're on a corporate network, the corporate DNS server will most likely use the root servers; otherwise your ISP's DNS server may use them.
Your computer is configured with a Default Gateway (visible via ipconfig).
This default gateway is your window to the world. This default gateway has its OWN default gateway (your ISP), and that gateway is generally connected to the rest of the internet via a protocol called BGP. BGP is a way of announcing all the separate hops and routes to a specific destination (the ServerFault IP address). Your packet hops to the next router, which then recalculates the best method of getting there, hops again, and so on and so forth until it gets to the end. It's possible for two sequential packets to reach the destination out of order because of this, so TCP has controls built in for this.
The BGP protocol maintains the health of each subsequent link it has, and when the health of that link changes (too busy, or goes down, whatever) then it chooses another hop to go to.
Your computer can locate a DNS server because it's told the address of one (or preferably more) DNS servers in its networking configuration. That info is either obtained dynamically (from another protocol called DHCP) or it's configured statically for a specific network interface.
Traffic to a specific IP address that is not on your "physical" LAN is routed via (you guessed it) IP routers that have partial and complete tables that describe the interconnection of all the networks that form the Internet.
When performing a traceroute, all the hops between you and the desired host represent one of (potentially) many possible routes to that host.
If you look at the IP configuration of your computer, you will find that it has a list of DNS servers to ask. This is the link. Like the router to use, the DNS server is a configured value that normally comes via DHCP. It is part of the IP configuration.
The provider's DNS server will then normally go to a list of known DNS servers that manage the complete internet (called ROOT servers). Basically, they are the KNOWN AUTHORITIES for the "zone" ".", which is the top-level zone. Technically, ".com" is a subdomain of ".". As I already said, the list of root servers is "known".
More information at http://www.root-servers.org/
The route is determined by the connections configured between all the ASes (Autonomous Systems) that the providers are assigned; this information is exchanged between them through the BGP protocol. Basically, a provider gets addresses from a central registry for an AS, and it publishes the other systems known at the edges of its AS to other ASes. They exchange via BGP how to route to other networks.
More information at http://en.wikipedia.org/wiki/Border_Gateway_Protocol
The Domain Name System (DNS)
First of all, the process used to get from www.serverfault.com to ww.xx.yy.zz is called a forward DNS lookup. A reverse lookup is the process used to get a DNS name from an IP address.
That said, your computer goes to the DNS server set in its network settings (operated by your ISP/workplace and called a "recursor"). If the computer automatically obtains an IP address via DHCP (i.e. you didn't manually set it), the DNS server address is usually obtained via DHCP as well.
Although caching is involved, the recursor handles your query by first starting at one of thirteen IP addresses of DNS "root servers," although in reality, more than 13 root DNS servers exist, sharing the IP addresses using a mechanism called "anycast."
The root server, of course, does not hold the DNS entries for all domain names. Instead, it just refers all requests for a .com domain name to one of the .com DNS servers. The .com DNS server, which holds the name server records for serverfault.com, refers the request to serverfault.com's DNS server, which contains the record we are looking for: the IP address of www.serverfault.com (called an A record). This all happens within the ISP's recursor (where the results of successful lookups can be saved for later use) to keep the process running efficiently and smoothly; it only returns the final answer.
IP Routing
Also in the computer's network settings is the IP address of a "default gateway," which is the address of the firewall or router on your LAN. All packets sent by your computer are directed toward that gateway if they are not to a computer directly connected to the LAN, as determined by the set subnet mask.
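The "is it on my LAN or not?" decision is simple enough to sketch with Python's standard `ipaddress` module. The addresses, the subnet mask, and the `next_hop` helper below are hypothetical home-network values chosen for illustration; your own settings will differ.

```python
import ipaddress

# Hypothetical home-network settings; substitute your own values.
interface = ipaddress.ip_interface("192.168.1.50/255.255.255.0")
default_gateway = ipaddress.ip_address("192.168.1.1")


def next_hop(destination):
    """Return the address the packet is handed to first."""
    destination = ipaddress.ip_address(destination)
    if destination in interface.network:
        return destination       # same LAN: deliver directly
    return default_gateway       # everything else goes to the gateway


print(next_hop("192.168.1.77"))   # another machine on the LAN
print(next_hop("203.0.113.10"))   # somewhere out on the internet
```

The subnet mask is what defines `interface.network` here, so changing the mask changes which destinations count as "local".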
The router holds a "routing table" containing a list of IP address ranges and where to send packets destined for them. Routing tables are often configured automatically. Within your ISP's network, any of several protocols may be used to do so, including Routing Information Protocol (RIP). Once the packet leaves your own ISP's network (sent toward the open internet through an upstream provider, which is the ISP's ISP), a protocol called Border Gateway Protocol (BGP) is used. Packets continue to be forwarded among routers until they reach their final destination. Packets travel back to your computer in the same manner.
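A routing table lookup can be sketched in a few lines. When several ranges match a destination, routers use the most specific one (the longest prefix), and the 0.0.0.0/0 entry acts as the catch-all default route. The table contents and next-hop names below are invented for illustration, and real routers use much faster data structures than a linear scan.

```python
import ipaddress

# Hypothetical routing table: (destination range, where to send the packet).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "upstream-provider"),
    (ipaddress.ip_network("10.0.0.0/8"), "internal-core"),
    (ipaddress.ip_network("10.20.0.0/16"), "branch-office"),
]


def lookup(destination):
    """Pick the matching route with the longest (most specific) prefix."""
    destination = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if destination in net]
    net, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop


for target in ("10.20.3.4", "10.99.0.1", "203.0.113.10"):
    print(target, "->", lookup(target))
```

Protocols like RIP and BGP are, in effect, different ways of filling in and updating a table like `ROUTES` automatically.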
How does traceroute work? It manipulates a field in the packet called "Time to Live," which in normal use is intended to prevent packets from looping endlessly should there be an error in router configuration. It is defined as the number of routers a packet is allowed to travel through. Traceroute increases the TTL by 1 each time and listens for the error messages generated when that limit is exceeded (in the form of ICMP packets). It records the IP address of the router that generated the error and optionally performs a reverse DNS lookup to get a DNS name, which may or may not be successful.
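Here's a small simulation of that TTL trick. It sends no real packets: `PATH` is a made-up list of router addresses ending at the destination, and `send_probe` stands in for sending a packet and waiting for either an ICMP "time exceeded" error or a reply from the destination.

```python
# Hypothetical path; real routers and addresses will differ.
PATH = ["192.168.1.1", "198.51.100.1", "198.51.100.254", "203.0.113.10"]


def send_probe(ttl):
    """Simulate one probe: each router decrements TTL and drops it at zero."""
    for router in PATH[:-1]:
        ttl -= 1
        if ttl == 0:
            # This router discards the packet and reports the error.
            return ("time-exceeded", router)
    return ("reached", PATH[-1])


def traceroute(max_hops=30):
    """Raise the TTL by one per probe until the destination answers."""
    hops = []
    for ttl in range(1, max_hops + 1):
        kind, address = send_probe(ttl)
        hops.append(address)
        if kind == "reached":
            break
    return hops


for number, address in enumerate(traceroute(), start=1):
    print(number, address)
```

Each line of real traceroute output corresponds to one iteration of that loop, which is why the list of hops you see is simply the sequence of routers that reported an expired TTL.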