I often see web app architectures with an SLB (server load balancer) / reverse proxy in front of a group of app servers.
What happens when the number of connections to the SLB requires more resources than a single SLB can handle effectively? For a concrete yet over-the-top example, consider 2 million persistent HTTP connections, and assume that is more than a single SLB can handle.
What is the recommended configuration for scaling out an SLB?
Is it typical to create a group/cluster of SLBs? If so, how is client load spread across the SLBs in that group?