If there is a limit on the number of ports one machine can have and a socket can only bind to an unused port number, how do servers experiencing extremely high amounts (more than the max port number) of requests handle this? Is it just done by making the system distributed, i.e., many servers on many machines?
You misunderstand port numbers: a server listens only on one port and can have large numbers of open sockets from clients connecting to that one port.
On the TCP level the tuple (source ip, source port, destination ip, destination port) must be unique for each simultaneous connection. That means a single client cannot open more than 65535 simultaneous connections to a single server. But a server can (theoretically) serve 65535 simultaneous connections per client.
So in practice the server is only limited by how much CPU power, memory etc. it has to serve requests, not by the number of TCP connections to the server.
You are mistaken - the socket's uniqueness is determined by five factors:
When offering network services, 1. and 2. typically are static (e.g. IP 10.0.0.1, port 80) but unless you are expecting thousands of connections from a single client (or a single NAT gateway), you are not going to push the boundaries for the possible combinations of 3. and 4. before you run out of local resources.
So although practically a client will not use a port already in use for a connection to open a connection to a different destination IP address, port number depletion is going to be the least of your problems for nearly any application - be it on the server or client side.
The problem is a very real one with NAT gateways (routers) serving clients with a high number of open outbound connections (e.g. torrents) - there you will see port number depletion after the port pool available for NAT has been emptied. In this case the NAT gateway is unable to create any additional associations, thus effectively cutting clients off the internet.
The question was how to handle large (>64k) connection counts. The two most common methods are:
Adding more servers, which increases the number of src/dst addresses and port number tuples. There are multiple ways to share load across multiple servers; DNS round robin is one; there are others
Deploy "carrier-grade NAT" (which a friend derisively and correctly in my view refers to as "crummier-grade NAT"). This is essentially a NAT of a NAT. This has very bad implications for applications, but it's what some large providers do when they run out of IPv4 space and/or port numbers, and/or they don't want to move to IPv6.