We have 3 Ruby on Rails applications (A, B and C) installed on a number of application servers. Our front end is HAProxy; the backend is Apache + Phusion Passenger. Originally we had all 3 Rails apps installed on each application server, but this setup was slow because HAProxy "doesn't know" whether a given Rails application is "hot" on a given backend server.
Each Passenger instance is configured to run up to 8 Rails application instances.
Consider the following scenario (simplified):
- 8 simultaneous requests for app A come in and HAProxy dispatches all of them to the first application server, because the rest are "very busy" with other requests.
- Passenger starts 8 instances of app A on this server.
- Another request comes in for app B, which also gets dispatched to the first application server, since other app servers are still too busy.
- Now Passenger has to shut down one of the instances of app A and create one instance of app B.
In the grand scheme of things, when there are a ton of requests per minute, all 3 Rails apps start and stop often on each app server, which is slow.
In a perfect world, applications start once and process a lot of requests without having to shut down and relaunch. That's why we had to divide our app servers between the 3 Rails apps:
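To make the cost of that churn concrete, here is a toy simulation of the scenario above. It is only a sketch: the eviction rule (drop any idle instance of another app when the pool is full) is an assumption for illustration, not a description of how Passenger actually picks a victim, and `simulate` is a made-up helper.

```python
from collections import Counter

POOL_SIZE = 8  # max app instances per server, as in the setup described above


def simulate(batches, pool_size=POOL_SIZE):
    """Count cold starts on one server for a series of request batches.

    Each batch is a list of app names that must be served simultaneously.
    Assumption: when the pool is full, an idle instance of another app is
    shut down to make room (a simplification of real pool behaviour).
    """
    hot = Counter()  # app -> number of running instances
    cold_starts = 0
    for batch in batches:
        demand = Counter(batch)
        assert sum(demand.values()) <= pool_size, "batch exceeds pool size"
        for app, needed in demand.items():
            while hot[app] < needed:
                if sum(hot.values()) >= pool_size:
                    # Pool is full: evict an instance that this batch does not need.
                    victim = next(a for a in hot if hot[a] > demand[a])
                    hot[victim] -= 1
                hot[app] += 1
                cold_starts += 1
    return cold_starts


# Six batches on one server: alternating "all A" and "one B plus seven A"
mixed = [["A"] * 8, ["B"] + ["A"] * 7] * 3
# Six batches on a server that only ever sees app A
dedicated = [["A"] * 8] * 6

print("mixed server cold starts:", simulate(mixed))
print("dedicated server cold starts:", simulate(dedicated))
```

In this model the dedicated server pays for its 8 starts once, while the mixed server pays for an extra start on every batch after the first, which is the start/stop pattern described above.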
- App A runs on 13 servers.
- App B runs on 5 servers.
- App C runs on 2 servers.
The question: is there load-balancing software that is "aware" of the backend and that knows and uses the following information to balance the load:
- How many instances of each application each backend server currently has active/hot?
- How many of those instances are currently processing requests?
- What is the current average number of requests for a given application per minute/hour?
- Is there a need to "ramp down" one application and "ramp up" another one?
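As a rough illustration of the first two points in that list, a backend-aware dispatch rule could look like the sketch below. Everything here is hypothetical (the `Backend` class, `pick_backend`, and the ranking order are invented for the example); it does not correspond to a feature of HAProxy or Passenger.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    """Hypothetical view of one app server as the balancer would track it."""
    name: str
    capacity: int = 8                          # max app instances on the server
    hot: dict = field(default_factory=dict)    # app -> running instances
    busy: dict = field(default_factory=dict)   # app -> instances handling a request

    def idle(self, app):
        return self.hot.get(app, 0) - self.busy.get(app, 0)

    def free_slots(self):
        return self.capacity - sum(self.hot.values())


def pick_backend(backends, app):
    """Prefer an idle hot instance, then a free slot, then the least busy server."""
    def cost(b):
        if b.idle(app) > 0:
            return (0, -b.idle(app))           # no startup needed
        if b.free_slots() > 0:
            return (1, -b.free_slots())        # startup, but nothing is evicted
        return (2, sum(b.busy.values()))       # startup plus an eviction
    return min(backends, key=cost)


servers = [
    Backend("s1", hot={"A": 8}, busy={"A": 8}),
    Backend("s2", hot={"A": 4, "B": 2}, busy={"A": 4, "B": 1}),
    Backend("s3", hot={"C": 2}, busy={}),
]
print(pick_backend(servers, "B").name)  # s2: it has an idle instance of B
print(pick_backend(servers, "A").name)  # s3: no idle A anywhere, s3 has the most free slots
```

The last two points (request rates and ramping apps up or down) would need history over time, which this per-request rule does not attempt.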
The idea is to have a number of "homogeneous" (identical) application servers with all the apps installed, so that we can add new servers to increase the overall capacity for all apps. The capacity for a given app would then be up to the "very smart" load balancer, which could control per-app capacity without having to start and stop apps very often.
I don't know of one.
I'm facing a similar problem. At this time the best solution appears to be building a management layer that tracks the load on the app servers and adjusts the load balancer's configuration based on what it sees. This would be an entirely custom-built solution, though, and we haven't started writing it yet.
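For what it's worth, one possible shape for such a layer is sketched below. It assumes one HAProxy backend per app (named `app_a`, `app_b`, ... here, which is a made-up convention) with every app server listed in each, and it turns the tracked hot-instance counts into `set weight` commands for HAProxy's runtime API. Collecting the counts from Passenger and sending the commands over the stats socket are left out; `weights_for` and `haproxy_commands` are hypothetical helpers.

```python
def weights_for(app, hot_by_server, max_weight=100, floor=1):
    """Map each server to a weight proportional to its hot instances of `app`.

    `floor` keeps cold servers reachable as overflow instead of removing them.
    """
    counts = {server: apps.get(app, 0) for server, apps in hot_by_server.items()}
    top = max(counts.values(), default=0)
    if top == 0:
        # Nobody has the app hot yet: spread requests evenly.
        return {server: max_weight for server in counts}
    return {
        server: max(floor, round(max_weight * count / top))
        for server, count in counts.items()
    }


def haproxy_commands(app, weights):
    """Render runtime API commands; the backend name `app_<app>` is an assumption."""
    return [
        f"set weight app_{app}/{server} {weight}"
        for server, weight in sorted(weights.items())
    ]


# Hot instance counts as the management layer might have collected them
hot_by_server = {
    "web01": {"a": 8},
    "web02": {"a": 2, "b": 4},
    "web03": {},
}

for line in haproxy_commands("a", weights_for("a", hot_by_server)):
    print(line)
```

Weighting by hot instances only steers traffic toward warm servers; deciding when to ramp an app up or down on a server would still need request-rate tracking on top of this.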