I am trying to decide between a layer 4 and a layer 7 load balancing solution for my datacenter. Unfortunately (for my sanity, that is), my use case is simple enough that both would work well: it avoids most of the weaknesses of each without really using the strengths of either. Whatever solution we end up using has to have high availability and high throughput. But we are only planning to use it to load balance over a cluster of web servers, none of which have any requirements for "sticky" session management (cookie or IP) or complex rewrite rules - or, for that matter, any rewrite rules at all.
The load balancers will be connected to two switches, each with an independent connection up to the datacenter aggregation layer, merged together using Rapid Spanning Tree and whatever proprietary protocol the switches use for virtualizing. The load balancers will also be cross-linked to each other over a crossover cable. All of the servers in the cluster are connected to both switches. All that the load balancers have to do is distribute the traffic across them.
Since it's just HTTP, I can use a layer 7 load balancing solution like HAProxy or nginx. But I could also use the LVS project with ldirectord or keepalived or whatever.
I've tried to break up the pros and cons as I see them, but it just ends up in a wash. What would you recommend and why? Am I missing something?
One useful benefit of "L7" balancers like HAProxy is being able to use cookies to keep the same browser hitting the same backend server. This makes debugging client hits much easier.
L4 balancing may bounce a single user around several backend servers (which in certain cases may be advantageous, but in a debugging/profiling sense, using "L7" is much more valuable).
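To make the difference concrete, here is a minimal Python sketch of the two selection strategies. It is not how HAProxy or LVS are implemented; the backend names, the SERVERID cookie name, and the helper functions are all made up for illustration, and the "L4" side assumes plain round robin with no persistence configured.

```python
import itertools

BACKENDS = ["web1", "web2", "web3"]  # hypothetical backend names


def make_round_robin(backends):
    """Return a picker that hands out backends in rotation."""
    cycle = itertools.cycle(backends)
    return lambda: next(cycle)


def pick_with_cookie(cookies, backends, fallback):
    """HTTP-level choice: reuse the backend named in the (hypothetical)
    SERVERID cookie if it is known, otherwise pick one and ask the
    caller to set the cookie. Returns (backend, cookie_value_or_None)."""
    pinned = cookies.get("SERVERID")
    if pinned in backends:
        return pinned, None
    chosen = fallback()
    return chosen, chosen


def simulate(requests):
    """Send `requests` hits from one browser through both strategies."""
    # TCP-level: every new connection simply goes to the next backend.
    rr = make_round_robin(BACKENDS)
    tcp_hits = [rr() for _ in range(requests)]

    # HTTP-level: the first hit sets the cookie, later hits follow it.
    rr = make_round_robin(BACKENDS)
    cookies = {}
    http_hits = []
    for _ in range(requests):
        backend, set_cookie = pick_with_cookie(cookies, BACKENDS, rr)
        if set_cookie is not None:
            cookies["SERVERID"] = set_cookie
        http_hits.append(backend)
    return tcp_hits, http_hits


if __name__ == "__main__":
    tcp_hits, http_hits = simulate(4)
    print("TCP round robin:", tcp_hits)
    print("HTTP with cookie:", http_hits)
```

With the cookie in place, every hit from that browser lands in the same backend's logs, which is the debugging benefit described above.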
EDIT: There's also a potential speed advantage to HTTP balancing. With keep-alives, clients can establish a single TCP session to your balancer and then send many requests without needing to re-establish new TCP sessions (3-way handshake). Similarly, many LBs maintain keep-alive sessions to the backend systems, removing the need to do the same handshake on the backend.
Strict TCP load balancing may not accomplish both of these as easily.
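If you want to see the handshake savings for yourself, the following Python sketch starts a throwaway HTTP/1.1 server on localhost and counts how many TCP connections it accepts for the same number of requests, with and without connection reuse. It only illustrates the client-to-server side of keep-alive and uses the standard library; the class and function names are mine, and it says nothing about how any particular balancer pools its backend connections.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingServer(ThreadingHTTPServer):
    """HTTP server that counts accepted TCP connections
    (one accept corresponds to one completed 3-way handshake)."""
    daemon_threads = True

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.accepted = 0

    def get_request(self):
        conn = super().get_request()
        self.accepted += 1
        return conn


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # needed for persistent connections

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def count_connections(requests, keep_alive):
    """Send `requests` GETs and return how many TCP connections were used."""
    server = CountingServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        conn = http.client.HTTPConnection(host, port)
        for _ in range(requests):
            conn.request("GET", "/")
            conn.getresponse().read()
            if not keep_alive:
                conn.close()  # the next request() reconnects automatically
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return server.accepted


if __name__ == "__main__":
    print("with keep-alive:   ", count_connections(5, keep_alive=True))
    print("without keep-alive:", count_connections(5, keep_alive=False))
```

Each connection avoided is one round trip saved before the first byte of the request can be sent, which matters more the further your clients are from the balancer.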
/* FWIW: I wouldn't say "L7" or "L4", I would say HTTP or TCP. But I'm a stickler for avoiding using the OSI to describe things it doesn't match up with well. */
I think fundamentally, if you're not sure what to deploy, go with what feels simple and natural to you. Test it (with ApacheBench, for example) and make sure it works for your needs. To me, HTTP LB is more natural.
Given the lack of advantage to you from doing L7 balancing, I'd settle on L4 balancing instead. I'm a big fan of keeping it as simple as possible, without sacrificing too much.
L7 requires the balancers to inspect the HTTP headers in the packets going through them in order to route appropriately, which adds overhead and a marginal increase in latency for the end user. It seems a pointless expense to me if you'll gain nothing by it.
Some DNS providers have simple failover functionality. You've told us what your requirements are not rather than what they are, but if all you need is round robin with failover when something is down, then you could use e.g. zoneedit.com's Failover. Depending on your HA needs, that may be good enough, and you get to skip a whole tier in your architecture.
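Conceptually, round robin with failover at the DNS level boils down to something like the Python sketch below. This is only my illustration of the idea, not how zoneedit.com or any other provider actually implements it; the addresses come from the documentation range, the health map stands in for whatever monitoring the provider runs, and falling back to the full record set when everything looks down is an assumption on my part.

```python
def answer(records, healthy, query_number):
    """Return the A records for one query: healthy addresses only,
    rotated by one position per query (round robin)."""
    live = [ip for ip in records if healthy.get(ip, False)]
    if not live:
        # Assumption: serve the full set rather than an empty answer.
        return list(records)
    shift = query_number % len(live)
    return live[shift:] + live[:shift]


if __name__ == "__main__":
    records = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]  # example addresses
    health = {ip: True for ip in records}
    for q in range(3):
        print("all up:", answer(records, health, q))
    health["192.0.2.11"] = False  # monitoring marks one server down
    for q in range(3):
        print("one down:", answer(records, health, q))
```

Keep in mind that resolvers and browsers cache answers for the record's TTL, so a failed server can keep receiving traffic until those cached answers expire.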