Amazon recently introduced Target Tracking Policies for EC2 Auto Scaling.
In my production service, I am using two separate auto-scaling groups to support hybrid auto-scaling with a mixture of Spot and On-Demand instances. What I want is that my CPU usage should not exceed 70%, and it should use Spot instances whenever possible but fall back to On-Demand instances if necessary.
First, I set both Auto-Scaling groups (Spot and On-Demand) to use Target Tracking for 70% CPU load and set the minimum size of both groups to one. The traffic on my service is quite predictable (no sudden spikes, more traffic during the day, little traffic at night).
At one point, there were two On-Demand and two Spot instances running. The system had just scaled down because the CPU load of the five servers became very low (around 35%). With the four servers, the CPU load went up and after a few minutes briefly crossed the 70% mark (maybe there was a very minor traffic boost at that time).
The system decided conservatively to scale up again, but as both auto-scaling groups made the decision independently at the same time, two instances were started (one Spot and one On-Demand instance). At this point, there were now six servers running. After a while it scaled down again and finally reached a setup of running four instances.
To avoid that effect, I now changed the setup as follows:
- On-Demand: Target 70% CPU usage, one server minimum
- Spot: Target 65% CPU usage, one server minimum
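For reference, here is a minimal sketch of how these two policies could be defined with boto3. The group names `web-on-demand` and `web-spot` and the policy naming scheme are hypothetical; the minimum size of one is a property of each group itself, not of the policy, so it is not shown here.

```python
def target_tracking_params(group_name, target_cpu):
    """Build keyword arguments for the Auto Scaling PutScalingPolicy call."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"cpu-target-{int(target_cpu)}",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                # Average CPU utilization across the instances in the group
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": float(target_cpu),
        },
    }


# Hypothetical group names; the targets are the ones listed above.
POLICIES = [
    target_tracking_params("web-on-demand", 70),
    target_tracking_params("web-spot", 65),
]


def apply_policies(policies=POLICIES):
    """Send each policy to AWS. Requires boto3 and valid credentials."""
    import boto3  # imported here so the sketch loads without boto3 installed

    client = boto3.client("autoscaling")
    for params in policies:
        client.put_scaling_policy(**params)
```

Calling `apply_policies()` would create or update one target tracking policy per group.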
My assumption is that this should help to prevent the scenario I described. I would expect the On-Demand group to scale down earlier than the Spot group (which is desirable anyway, as On-Demand instances are more costly). And I expect the Spot group to scale up sooner, which should protect against unnecessary upscaling of the On-Demand group.
That is my expectation, but I did not find much detail in the documentation to confirm it. Can someone shed some light on how the new Target Tracking scaling works in detail, and how to apply it to a hybrid setup with Spot and On-Demand instances?
Questions:
- If I set the target to 70% CPU utilization, when will it decide to scale up and when to scale down?
- If I have two Auto-Scaling groups, one with a 70% CPU utilization target and the other with 65%, when will it decide to scale up or down? Will it always prefer to scale down the 70% group? Will it always prefer to scale up the 65% group?
- What happens if the prices in the Spot market suddenly rise to exceed my bid limit? Will the On-Demand auto-scaling group take over?
- Is my understanding correct that manually defining the number of desired instances has only a short-term effect and will be automatically adjusted by the Auto Scaling policy?
- For example, if it scaled down to the minimum during the night and scaled up again the next day, does it mean that the initial "number of desired instances" setting from the previous day is now obsolete? In other words, do I need to worry only about setting reasonable values for minimum and maximum, and will AWS figure out the rest?
- AWS doesn't say exactly how it works, but it creates two CloudWatch alarms for each target tracking policy, one for scaling out and one for scaling in. You can check the thresholds on those alarms to see when they will be triggered.
- It would, eventually: the Spot instances would be terminated, which would increase the load on the On-Demand instances, which would in turn cause that group to scale out.
- Correct. The 'desired capacity' is the value the scaling policy changes in order to launch or terminate instances.
- Correct. The min and max are the bounds for the desired capacity (it can't go below the min or above the max).
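To look at the two alarms mentioned above without clicking through the console, something like the following sketch could work. It assumes boto3 clients are passed in, and the function name is my own; it only reads the alarm names attached to the group's scaling policies and then fetches their thresholds from CloudWatch (pagination is ignored for brevity).

```python
def scaling_alarm_thresholds(autoscaling, cloudwatch, group_name):
    """Return (name, comparison operator, threshold, evaluation periods)
    for every alarm attached to the group's scaling policies."""
    policies = autoscaling.describe_policies(AutoScalingGroupName=group_name)
    alarm_names = [
        alarm["AlarmName"]
        for policy in policies["ScalingPolicies"]
        for alarm in policy.get("Alarms", [])
    ]
    if not alarm_names:
        return []
    alarms = cloudwatch.describe_alarms(AlarmNames=alarm_names)
    return [
        (a["AlarmName"], a["ComparisonOperator"], a["Threshold"], a["EvaluationPeriods"])
        for a in alarms["MetricAlarms"]
    ]
```

With real clients this would be called as `scaling_alarm_thresholds(boto3.client("autoscaling"), boto3.client("cloudwatch"), "web-spot")`, where `web-spot` is a hypothetical group name. Comparing the output for both groups shows which alarm would fire first.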
One thing you should look into is a new feature that lets you mix Spot and On-Demand instances in a single Auto Scaling group. You can also use multiple instance types in one group at the same time. So you could theoretically have a single group with a bunch of different backup instance types selected, using the 2 cheapest Spot instance types at any given time, with the others as fallbacks if those two run out of Spot capacity.
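As a rough sketch of what such a group's configuration might look like with boto3: the helper below builds the `MixedInstancesPolicy` structure. The launch template name and instance types are hypothetical, and I'm assuming the `lowest-price` Spot allocation strategy with two pools is what gives the "2 cheapest" behaviour.

```python
def mixed_instances_policy(launch_template_name, instance_types, on_demand_percent, spot_pools=2):
    """Build a MixedInstancesPolicy dict for a single mixed Spot/On-Demand group."""
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": launch_template_name,
                "Version": "$Latest",
            },
            # One override per instance type the group is allowed to use
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": {
            # Share of capacity (above any On-Demand base) that must be On-Demand
            "OnDemandPercentageAboveBaseCapacity": on_demand_percent,
            "SpotAllocationStrategy": "lowest-price",
            # Number of cheapest Spot pools to spread across
            "SpotInstancePools": spot_pools,
        },
    }


# Hypothetical template name and instance types, 50% On-Demand / 50% Spot.
policy = mixed_instances_policy("web-template", ["m5.large", "m4.large", "c5.large"], 50)
```

The resulting dict would be passed as the `MixedInstancesPolicy` argument to `create_auto_scaling_group` or `update_auto_scaling_group`.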
Two important things to note about this new feature:
1) If there is no Spot capacity in any of the availability zones you selected for any of the instance types you selected, it will NOT fall back to On-Demand automatically. So if you have it set up for 50% Spot and 50% On-Demand, and the desired capacity is 10, with no Spot availability you'll just have 5 On-Demand instances. If you had enough different instance types selected I'd imagine this wouldn't be an issue, but who knows.
2) Most of the load balancers there use round robin or something like it for distributing connections to instances, so if there's 1 fast instance and 1 slow one, they'll both receive the same number of connections and the slow one will eventually get bogged down.
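To illustrate the second point, here is a toy simulation of round-robin distribution across instances of different speeds. The capacities and request rate are made-up numbers, and real load balancers are more nuanced than this.

```python
from itertools import cycle


def round_robin_utilization(capacities, total_requests):
    """Hand out requests one at a time in turn; return each instance's
    load as a fraction of its capacity."""
    assigned = [0] * len(capacities)
    for _, index in zip(range(total_requests), cycle(range(len(capacities)))):
        assigned[index] += 1
    return [count / capacity for count, capacity in zip(assigned, capacities)]


# Hypothetical: the fast instance handles 100 req/s, the slow one 50 req/s,
# and 120 req/s arrive at the load balancer.
utilization = round_robin_utilization([100, 50], 120)
print(utilization)  # [0.6, 1.2]
```

Each instance gets 60 requests, so the fast one sits at 60% while the slow one is at 120% of what it can handle.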
https://aws.amazon.com/blogs/aws/new-ec2-auto-scaling-groups-with-multiple-instance-types-purchase-options/