I'm about to deploy two distinct systems: a MySQL cluster and a cluster of Jetty web servers.
1. SOFTWARE
I can get more Linux servers for the HA/LB, but which software should I use? I'm on CentOS.
I've heard of HAProxy, LVS and UltraMonkey.
I've read much of their documentation, but it's not clear whether they support both MySQL and HTTP/S.
I want one solution that takes care of both, for simplicity (instead of dealing with two).
- The solution should be open-source, like the three above.
- SSL (MySQL & HTTPS) should be possible.
- "Sticky Sessions" are NOT a requirement, as I intend to have Jetty's sessions shared in the MySQL database.
2. PERFORMANCE
I've read that the load-balancing can be done in three ways:
- Direct Routing << LAN switch dependent?
- IP-IP Encapsulation (Tunnelling) << sounds best. not bound to LAN, as opposed to "Direct Routing"
- Network Address Translation << extremely limiting (replies from app service go through the LB)
The bottom line regarding performance, and what worries me, is that all the data between internet users and my cluster could end up going through one server (the balancer), which would make the balancer a bottleneck.
If it's connected with a 100 Mbit/s link, then the whole system is limited to that.
Is it possible to avoid that, yet still get balancing and high availability? What is the "cost" of that? Do I need a special switch on the network or not?
Good that you do not have to rely on the LB (load balancer) for the session information.
Your need for speed leads, IMHO, to LVS and the "direct routing" approach. You CAN do direct routing without using the ip_forward mechanism. I set this up by using a dedicated LVS network from the LBs to the real servers.
Now for the "need for speed": with direct routing the LB takes the incoming requests, changes their destination MAC and puts them on the wire to the RS (real server). The RS has to have the logical IP configured on the LVS network (but must not answer ARP requests for that IP). The RS serves the request and answers DIRECTLY to the client - the reply does not go back through the LB, which minimizes the load on the LB.
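To make the direct-routing step concrete, here is a toy Python model of it. This is not LVS code, just a sketch: the addresses, the MACs, the round-robin choice and all class and function names are made up for illustration.

```python
from dataclasses import dataclass, replace
from itertools import cycle

# All addresses below are hypothetical documentation values.
VIP = "203.0.113.10"               # logical (virtual) IP shared by LB and real servers
GATEWAY_MAC = "02:00:00:00:00:fe"  # next hop the real servers use for their replies


@dataclass(frozen=True)
class Frame:
    src_ip: str
    dst_ip: str
    dst_mac: str


class DirectRoutingBalancer:
    """Toy model: hands each request to the next real server, round-robin."""

    def __init__(self, real_server_macs):
        self._macs = cycle(real_server_macs)

    def forward(self, frame: Frame) -> Frame:
        # Only the destination MAC is rewritten; both IP addresses stay untouched.
        return replace(frame, dst_mac=next(self._macs))


def real_server_reply(request: Frame) -> Frame:
    # The real server holds the VIP as well, so it answers from the VIP and
    # sends the reply towards its own gateway, bypassing the balancer.
    return Frame(src_ip=request.dst_ip, dst_ip=request.src_ip, dst_mac=GATEWAY_MAC)
```

The point to notice is that the request still carries the VIP as its destination IP when it reaches the real server, which is why the real server must hold that IP yet stay silent on ARP for it.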
Additionally, your incoming traffic is most probably much lower than the outgoing traffic - so that suits your needs.
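A rough back-of-the-envelope calculation shows why that asymmetry matters. The sketch below only looks at the balancer's own link and assumes it is full duplex. The function name and the reply-to-request ratio of 10 are assumptions for illustration, not measurements.

```python
def reply_capacity_mbit(lb_link_mbit: float, reply_per_request: float, mode: str) -> float:
    """Upper bound on reply traffic, looking only at the balancer's full-duplex link."""
    # Requests always enter through the balancer, so they can use at most the
    # link rate; the replies they trigger are reply_per_request times larger.
    limit_from_requests = lb_link_mbit * reply_per_request
    if mode == "dr":
        # Direct routing: replies never touch the balancer.
        return limit_from_requests
    if mode == "nat":
        # NAT: replies also leave through the balancer's link.
        return min(lb_link_mbit, limit_from_requests)
    raise ValueError(f"unknown mode: {mode!r}")


print(reply_capacity_mbit(100, 10, "nat"))  # 100
print(reply_capacity_mbit(100, 10, "dr"))   # 1000
```

So with a 100 Mbit/s balancer link and replies ten times the size of requests, NAT caps the replies at 100 Mbit/s, while direct routing leaves room for up to 1000 Mbit/s, provided the real servers' own uplinks can carry it.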
The last weak point is the availability of the LB itself. You can cluster the LB with another machine that takes over the logical IP (and LVS). Since session stickiness is no problem in your setup, that's all you have to do.
I found lvs-kiss to be a good way to dynamically reconfigure LVS - but there are other working solutions for this.
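One common reason to reconfigure LVS dynamically is to take real servers that stop responding out of the pool. The following is a generic Python sketch of that idea only. It is not how lvs-kiss is implemented, and the function names and the plain TCP connect check are my own assumptions.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_servers(servers, probe=tcp_probe):
    """Keep only the (host, port) pairs that currently pass the probe."""
    return [(host, port) for host, port in servers if probe(host, port)]
```

The resulting list would then be handed to whatever updates the LVS table. The probe is passed in as a parameter so a stricter check (an HTTP request or a MySQL query, say) can be swapped in.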
BTW - most of the servers where we use LVS are CentOS 5.
Update 2011-11-11: heartbeat-ldirectord is an add-on for heartbeat, and with it you do not need lvs-kiss - useful if you want to cluster anyway.
Load balancing:
For HA I suggest looking at Pacemaker. It's a robust, scalable and flexible solution for clustering and high availability.