Just been looking into Elastic Load Balancers. As I understand it, they just do round robin, evenly distributing connections across the servers behind them. So what happens if you have differently sized instances behind an ELB? Does it send more connections to the larger instance, or does it keep distributing connections evenly, which would mean you really shouldn't mix instance sizes?
Kind of, but not quite, I think. Unfortunately, the Amazon ELB routing documentation is close to nonexistent, so one needs to assemble some pieces to draw a conclusion. Here is the only fragment from the Elastic Load Balancing Developer Guide I'm aware of; see section Sticky Sessions in Overview of Elastic Load Balancing:
Now what does smallest load mean exactly? Again, the only explanation I'm aware of is the somewhat vague AWS team response from 2009 to ELB Strategy:
This makes a lot of sense given their system architecture and the use cases they address, but it obviously doesn't provide the transparency and/or control over routing you may want or need for advanced HA scenarios.
Please note that, depending on interpretation, this may or may not be contradicted a bit by a more recent AWS team response to Elastic Load Balancing - Load distribution policies:
Health Checks
Of course, the above is complemented by the properly documented, transparent and controllable health checks, which give you some leverage to (potentially temporarily) remove instances from routing in the first place, as summarized in the aforementioned AWS team response to ELB Strategy as well:
Conclusion
While certainly unusual, I don't see why ELB shouldn't work with a pool of different Amazon EC2 instance types as well. I haven't tried this myself, though, and would recommend both Monitoring Your Load Balancer Using CloudWatch and monitoring your individual EC2 instances, then correlating the results to gain insight into, and confidence in, such a setup.
Based on the statements made up until now, the distribution algorithm is extremely simple.
The front end of ELB typically consists of more than one ELB instance, and the distribution is round-robin.
The algorithm for the back end (your instances) is claimed to be:
This would imply that if a larger instance has fewer outstanding requests, more traffic would be routed to it. There's no way to guarantee this, though.
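To make the difference concrete, here is a rough toy simulation comparing plain round robin with a "fewest outstanding requests" policy on one small and one large backend. It is only a sketch under stated assumptions, not ELB's actual algorithm: the `Backend` class, the capacities (1 vs. 3 requests per tick), and the arrival rate are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    capacity: int         # requests finished per tick (hypothetical unit)
    outstanding: int = 0  # requests accepted but not yet finished
    served: int = 0       # total requests routed to this backend


def pick_round_robin(backends, counter):
    # Ignores load entirely; just cycles through the pool.
    return backends[counter % len(backends)]


def pick_least_outstanding(backends, counter):
    # Picks the backend with the fewest unfinished requests
    # (ties go to the first backend in the list).
    return min(backends, key=lambda b: b.outstanding)


def simulate(pick, ticks=1000, arrivals_per_tick=4):
    # Assumed pool: one small and one large instance.
    backends = [Backend("small", capacity=1), Backend("large", capacity=3)]
    counter = 0
    for _ in range(ticks):
        # Route this tick's arrivals one by one.
        for _ in range(arrivals_per_tick):
            chosen = pick(backends, counter)
            chosen.outstanding += 1
            chosen.served += 1
            counter += 1
        # Each backend then finishes up to `capacity` requests.
        for b in backends:
            b.outstanding = max(0, b.outstanding - b.capacity)
    return {b.name: (b.served, b.outstanding) for b in backends}


if __name__ == "__main__":
    for policy in (pick_round_robin, pick_least_outstanding):
        print(policy.__name__, simulate(policy))
```

In this model, round robin hands both instances the same number of requests, so the small one accumulates an ever-growing backlog, while the least-outstanding policy shifts most of the traffic to the large instance and keeps the small one's backlog bounded. Whether the real ELB behaves like this depends on what it actually measures as "load", which is exactly the part that isn't documented.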