I am trying to get my head around load balancing as a way to ensure availability and redundancy, keeping users happy when things go wrong, rather than load balancing for the sake of offering blistering speed to millions of users.
We're on a budget and trying to stick to the stuff where there's plenty of knowledge available, so running Apache on Ubuntu VPSes seems like the strategy until some famous search engine acquires us (Saturday irony included, please note).
At least to me, it's a complete jungle of different solutions out there. Apache's own mod_proxy and HAProxy are two that we found with a quick Google search, but having zero experience with load balancing, I have no idea what would be appropriate for our situation, or what we should look for when choosing a solution to solve our availability concerns.
What is the best option for us? What should we do to keep availability high while staying within our budget?
HAProxy is a good solution. The config is fairly straightforward.
You'll need another VPS instance to sit in front of at least two other VPSes, so for load balancing / failover you need a minimum of three VPSes.
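A minimal sketch of what the front VPS's haproxy.cfg might look like. The backend addresses (10.0.0.11 and 10.0.0.12) are placeholders for your two web VPSes:

```
# /etc/haproxy/haproxy.cfg -- minimal sketch, addresses are placeholders
global
    maxconn 2048

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend www
    bind *:80
    default_backend web_nodes

backend web_nodes
    balance roundrobin
    option httpchk GET /
    server web1 10.0.0.11:80 check
    server web2 10.0.0.12:80 check
```

The `check` keyword makes HAProxy health-check each backend and stop sending traffic to a node that goes down, which is the failover part you're after.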
A few other things to think about:
SSL termination. If you use HTTPS, the connection should terminate at the load balancer; behind the load balancer, traffic can travel over an unencrypted connection.
File storage. If a user uploads an image, where does it go? Does it just sit on one machine? You need some way to share files instantly between machines - you could use Amazon's S3 service to store all your static files, or you could have another VPS act as a file server, but I would recommend S3 because it's redundant and insanely cheap.
Session info. Each machine behind the load balancer needs to be able to access the user's session info, because you never know which machine they will hit.
DB. Do you have a separate DB server? If you only have one machine right now, how will you make sure your new machine has access to the DB server? And if it's a separate VPS DB server, how redundant is that? It doesn't make sense to have highly available web front ends and a single point of failure in one DB server; now you need to consider DB replication and slave promotion as well.
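To tie the SSL-termination and session points back to HAProxy: a hedged sketch of how both can be handled at the balancer, assuming a reasonably recent HAProxy (1.5+, which added native SSL). The certificate path, cookie name, and addresses are made up:

```
# Terminate TLS at the balancer; backends see plain HTTP
frontend www_ssl
    bind *:443 ssl crt /etc/haproxy/certs/site.pem
    default_backend web_nodes

backend web_nodes
    balance roundrobin
    # Insert a cookie so a returning user sticks to the same backend --
    # an alternative (or complement) to sharing session storage
    cookie SRV insert indirect nocache
    server web1 10.0.0.11:80 check cookie web1
    server web2 10.0.0.12:80 check cookie web2
```

Cookie stickiness papers over the session problem but doesn't solve it: if web1 dies, its sessions are still gone unless the session store itself is shared.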
So I've been in your shoes; that's the trouble with taking a website that does a few hundred hits a day to a real operation. It gets complex quickly. Hope that gave you some food for thought :)
The solution I use, and one that can be easily implemented with VPSes, is the following:
This arch has the following advantages, in my biased opinion:
In your case, having physically separated VPSes is a good idea, but it makes IP sharing more difficult. The objective is a fault-resistant, redundant system, and some load balancing/HA configurations end up messing that up by adding a single point of failure (like a single load balancer that receives all traffic).
I also know you asked about Apache, but these days we have specific tools better suited to the job (like nginx and varnish). Leave Apache to run the applications on the backend and serve them using other tools (not that Apache can't do good load balancing or reverse proxying; it's just a question of offloading different parts of the job to more services, so each one can do its share well).
My vote is for Linux Virtual Server as the load balancer. This makes the LVS director a single point of failure as well as a bottleneck, but both can be mitigated by running a second, hot-standby director that takes over (via heartbeat) if the first fails.
Cost can be kept down by having the first director be on the same machine as the first LVS node, and the second director on the same machine as the second LVS node. Third and subsequent nodes are pure nodes, with no LVS or HA implications.
This also leaves you free to run any web server software you like, as the redirection's taking place below the application layer.
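For reference, a sketch of what the director-side LVS config looks like with ipvsadm. The virtual IP (203.0.113.10) and real-server addresses are placeholders, and this assumes direct-routing mode (must be run as root on the director):

```shell
# Create the virtual service on the VIP with round-robin scheduling
ipvsadm -A -t 203.0.113.10:80 -s rr

# Add the real servers in direct-routing (gatewaying) mode
ipvsadm -a -t 203.0.113.10:80 -r 10.0.0.11:80 -g
ipvsadm -a -t 203.0.113.10:80 -r 10.0.0.12:80 -g
```

In practice you'd usually let a tool like ldirectord or keepalived manage these entries for you, since they also health-check the real servers and remove dead ones.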
How about this chain?
round robin DNS > HAProxy on both machines > nginx to separate out static files > Apache
Possibly also use ucarp or heartbeat to ensure HAProxy always answers. Stunnel would sit in front of HAProxy if you need SSL too.
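The nginx step in that chain (serving static files itself and proxying everything else to Apache) might look like this sketch; the paths, ports, and hostname are assumptions:

```
# nginx server block: serve static files directly, proxy the rest to Apache
server {
    listen 8080;
    server_name example.com;

    # static content served straight off disk, with client-side caching
    location /static/ {
        root /var/www;
        expires 7d;
    }

    # everything else goes to Apache on a local port
    location / {
        proxy_pass http://127.0.0.1:8081;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

This keeps Apache's heavyweight worker processes for dynamic requests only, which is the main point of putting nginx in front of it.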
You may want to consider using proper clustering software: Red Hat's (or CentOS's) Cluster Suite, or Oracle's ClusterWare. These can be used to set up active-passive clusters, to restart services, and to fail over between nodes when there are serious issues. This is essentially what you're looking for.
All of these cluster solutions are included in the respective OS licenses, so you're probably cool on cost. They do require some manner of shared storage -- either an NFS mount, or physical disk accessed by both nodes with a clustered file system. An example of the latter would be SAN disks with multiple host access allowed, formatted with OCFS2 or GFS. I believe you can use VMWare shared disks for this.
The cluster software is used to define 'services' that run on nodes all the time, or only when that node is 'active'. The nodes communicate via heartbeats, and also monitor those services. They can restart them if they notice failures, and reboot if they can't be fixed.
You would basically configure a single 'shared' IP address that traffic would be directed to. Then apache, and any other necessary services, can be defined as well, and only run on the active server. Shared disk would be used for all your web content, any uploaded files, and your apache configuration directories. (with httpd.conf, etc)
In my experience, this works incredibly well.
--Christopher Karel
Optimal load balancing can be very expensive and complicated. Basic load balancing should just ensure that each server is servicing roughly the same number of hits at any time.
The simplest load-balancing method is to provide multiple A records in DNS. By default, resolvers will hand out the addresses in round-robin fashion, which results in users being distributed relatively evenly across the servers. This works well for stateless sites. A somewhat more complex method is required when you have a stateful site.
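In a BIND-style zone file, round-robin DNS is nothing more than multiple A records for the same name (addresses here are placeholders):

```
; both servers answer for www; resolvers rotate through the answers
www     IN  A   203.0.113.10
www     IN  A   203.0.113.11
```

Keep the TTL on these records short if you plan to pull a dead server's address out of rotation, since clients cache the answers.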
To handle stateful requirements, you can use redirects. Give each web server an alternate address such as www1, www2, www3, etc. Redirect the initial www connection to the host's alternate address. You may end up with bookmark issues this way, but they should be evenly dispersed across the servers.
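One hedged way to do that initial redirect with Apache's mod_rewrite; the hostnames are illustrative, and each server would carry a variant of this rule pointing at its own alternate name:

```
# On the server known as www2: redirect initial hits on the shared
# www name to this host's own alternate name, pinning the session here
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www2.example.com$1 [R=302,L]
```

Requests that arrive directly at www2.example.com don't match the condition, so only the first (shared-name) hit gets redirected.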
Alternatively, using a different path to indicate which server is handling the stateful session would allow proxying sessions that have switched hosts back to the original server. This may be a problem when a session for a failed server arrives at the server that has taken over from it; however, barring clustering software, that state will be missing anyway. Due to browser caching, you may not experience a lot of sessions changing servers.
Failover can be handled by configuring a server to take over the IP address of a failed server. This will minimize the downtime if a server fails. Without clustering software, stateful sessions will still be lost if a server fails.
Without failover users will experience a delay until their browser fails over to the next IP address.
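A sketch of the IP-takeover piece using keepalived's VRRP; the interface name, virtual IP, and password are placeholders, and the standby machine would run the same config with `state BACKUP` and a lower priority:

```
# keepalived.conf on the primary; 203.0.113.10 is the shared (virtual) IP
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass changeme
    }
    virtual_ipaddress {
        203.0.113.10
    }
}
```

When the primary stops sending VRRP advertisements, the backup claims the virtual IP within a few seconds, which is the "take over the IP address of a failed server" behavior described above.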
Using RESTful services rather than stateful sessions should do away with clustering issues on the front end. Clustering issues on the storage side would still apply.
Even with load balancers in front of the servers, you will likely have round-robin DNS in front of them. This will ensure all your load balancers get utilized. They will add another layer to your design, with additional complexity and another point of failure. However, they can provide some security features.
The best solution will depend on the relevant requirements.
Implementing image servers to serve up content like images, CSS files, and other static content can ease the load on the application servers.
I generally use a pair of identical OpenBSD machines:
OpenBSD is light, stable, and quite secure - Perfect for network services.
To start, I recommend a layer 7 setup, as it avoids complicating the firewall (PF) setup. Here is an example /etc/relayd.conf file that shows the setup of a simple relay load balancer with monitoring of the backend webservers:
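A minimal sketch of such a file; the listen address and backend addresses are placeholders:

```
# /etc/relayd.conf -- minimal sketch, addresses are placeholders
table <webhosts> { 10.0.0.11 10.0.0.12 }

http protocol "httpproxy" {
    # pass the real client address through to the backends
    match request header set "X-Forwarded-For" value "$REMOTE_ADDR"
}

relay "wwwrelay" {
    listen on 203.0.113.10 port 80
    protocol "httpproxy"
    # health-check each backend: fetch "/" and expect a 200
    forward to <webhosts> port 80 check http "/" code 200
}
```

relayd marks a host in the table down as soon as the HTTP check fails and stops relaying to it, so backend failures are handled automatically.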
Have you given EC2 with Cloud Foundry, or maybe Elastic Beanstalk, or just plain old AWS Auto Scaling a thought? I have been using that and it scales pretty well, and being elastic it can scale up/down without any human intervention.
Given that you say you have zero experience with load balancing, I would suggest these options as they require minimal brain "frying" to get up and running.
It might be a better use of your time.