Hosts – domain names and IPs
In my description of the host component of a URL, I specified that the host could be either a domain name or an IP address. As I mentioned before, an IP address is the underlying numeric address used by routing hardware and software to navigate to a resource on a network. It's the unique ID assigned to a specific device's network interface at a specific location on the network. A domain name, however, is the human-readable string of words and alphanumeric characters used to make addressing easier. It is more consistent, more easily remembered, and less prone to error than a raw IP address. What's interesting, however, is that a domain name and its IP address are largely interchangeable for addressing purposes. In most contexts in which you can use one, you can substitute the other, although some services (virtual hosts that share a single IP address, for example, or TLS certificates issued for a name) depend on the name itself.
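To make the distinction concrete, here is a minimal sketch of how a program might decide whether the host portion of a URL is an IP address or a domain name. It assumes Python and its standard `ipaddress` module; the function name `classify_host` is hypothetical and chosen only for illustration.

```python
import ipaddress


def classify_host(host):
    """Return 'ipv4', 'ipv6', or 'domain' for the host part of a URL.

    Hypothetical helper: anything that does not parse as an IP address
    is simply assumed to be a domain name (no syntax validation is done).
    """
    # URLs wrap IPv6 literals in square brackets, for example [2001:db8::1].
    if host.startswith("[") and host.endswith("]"):
        candidate = host[1:-1]
    else:
        candidate = host

    try:
        address = ipaddress.ip_address(candidate)
    except ValueError:
        # Not a numeric address, so treat it as a name that needs resolving.
        return "domain"

    return "ipv4" if address.version == 4 else "ipv6"
```

Calling `classify_host("en.wikipedia.org")` falls through to the `"domain"` branch, which is exactly the case that requires a DNS lookup before a connection can be made.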
Given that IP addresses can be used directly by the network layer to route traffic, and don't need to be looked up before a request can be serviced by any node in the routing process, we'll ignore them for now. We'll explore the syntax, limitations, and advantages gained by using the IP address of a device later on in this book. For now, though, we're more concerned with how we can find the IP address in the first place. That's why, for this chapter at least, we're only concerning ourselves with domain names and how they're resolved by the DNS.
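As a rough illustration of finding the IP address in the first place, the sketch below asks the operating system's resolver for the addresses behind a host. It assumes Python and the standard `socket.getaddrinfo` call; the wrapper name `resolve_host` is hypothetical, and the results for any real domain name will depend on your network and its DNS configuration.

```python
import socket


def resolve_host(host):
    """Return the distinct IP addresses the system resolver reports for host.

    Hypothetical wrapper around socket.getaddrinfo. Raises socket.gaierror
    if the name cannot be resolved.
    """
    # Restricting to SOCK_STREAM avoids one duplicate entry per socket type.
    results = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)

    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in results:
        ip = sockaddr[0]  # sockaddr is (ip, port) or (ip, port, flow, scope)
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Passing a domain name such as `"en.wikipedia.org"` triggers a lookup, while passing a numeric address such as `"127.0.0.1"` returns that address unchanged, since there is nothing left to resolve.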
I'd bet that among everyone reading this book, there isn't a single person who knows anyone who hasn't typed google.com or en.wikipedia.org into their browser's address bar. Our use of domain names is ubiquitous, and yet most of us have no idea how, exactly, they are created or used. Even for me, it wasn't until I was explicitly tasked with writing software for resolving those domain names on an internal network that I finally took the time to understand what made that system work. At that time, I learned how the web of DNS servers facilitates network usage by human users. I've only mentioned the DNS in passing so far, so it's time to consider more deeply just what it is, and how we can use it.
The DNS is a distributed, hierarchical network of name servers. Each authoritative server hosts a directory of the domain names it can resolve itself, along with the name servers responsible for its sub-domains. Any domain name that has been registered with an accredited domain name registrar, that meets the syntax standards of a domain name, and that hasn't already been registered by someone else, is considered valid. Valid domain names are added to the distributed registry hosted by the authoritative servers. Before your computer can interact with any other network node using a valid, registered domain name, a lookup for that name will have to interact with one or more of these name servers.
Each server will inspect the domain name given, and look it up in its own directory of name-to-IP-address mappings. Naturally, the server will first determine whether the given name can be resolved by that server, or at least by one of its subordinate servers. If it can, the server doesn't forward your original request anywhere; it returns the IP address to which the name maps (or a referral to the subordinate server that holds the mapping), and your computer then sends its request to that address. If the resolver you're using cannot answer from its own records or cache, the lookup typically starts at a root name server and follows referrals down the hierarchy, from the top-level domain to progressively more specific zones. This process continues until the name is resolved, or until a server reports that the name doesn't exist.
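To make the idea of a hierarchy of name servers more tangible, here is a toy, in-memory model written in Python. It is only a sketch under stated assumptions: the `NameServer` class, the `resolve` function, the zone names, and the addresses are all made up for illustration, and no real DNS traffic is involved. In this model, a lookup starts at a root server and follows referrals toward more specific zones, which is how iterative resolution is usually described.

```python
class NameServer:
    """A hypothetical name server holding records and child delegations."""

    def __init__(self, zone, records=None, delegations=None):
        self.zone = zone
        self.records = records or {}          # full name -> IP address
        self.delegations = delegations or {}  # child zone -> NameServer

    def query(self, name):
        """Return ('answer', ip), ('referral', server), or ('nxdomain', None)."""
        if name in self.records:
            return ("answer", self.records[name])

        # Find delegated zones that the name falls under, preferring the
        # most specific (longest) matching zone.
        matches = [
            zone for zone in self.delegations
            if name == zone or name.endswith("." + zone)
        ]
        if matches:
            return ("referral", self.delegations[max(matches, key=len)])

        return ("nxdomain", None)


def resolve(name, root, max_hops=10):
    """Follow referrals from the root; return (ip_or_None, zones_visited)."""
    server = root
    visited = []
    for _ in range(max_hops):
        visited.append(server.zone)
        kind, value = server.query(name)
        if kind == "answer":
            return value, visited
        if kind == "referral":
            server = value  # ask the more specific server next
            continue
        return None, visited  # the name does not exist in this hierarchy
    raise RuntimeError("too many referrals")


# Illustrative hierarchy: root -> org -> example.org
example_zone = NameServer("example.org", records={"www.example.org": "192.0.2.10"})
org_zone = NameServer("org", delegations={"example.org": example_zone})
root_zone = NameServer(".", delegations={"org": org_zone})
```

Resolving `"www.example.org"` against `root_zone` visits the root, then `org`, then `example.org` before an address is returned, while a name under an undelegated top-level domain stops at the root with no answer. Real resolvers add caching, timeouts, and multiple record types on top of this basic walk.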