I'm trying to move off a big monolithic server that ran many jobs to a design that uses Auto Scaling to add servers as more jobs are run. In testing, I found that when the default scale-in took place, it terminated a server that was running a task.
Is it possible for a server to tell AWS that it is working and needs to stay up? Or can it switch between "OK to Terminate" and "Working" as it picks up new jobs?
Jobs can take minutes or hours, so one flat cooldown timer wouldn't provide the right protection.
EC2 Auto Scaling has a feature called "scale-in protection": a protected instance won't be picked for termination during a scale-in event (usually caused by the desired capacity going down, but this can also apply to things like an Instance Refresh).
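As a rough sketch of how a worker could toggle this around each job, assuming Python with boto3: the helper names `set_scale_in_protection` and `run_job_protected`, the group name, and the instance ID are hypothetical placeholders, while `set_instance_protection` is the boto3 Auto Scaling client call. The client is passed in, so the worker would create it itself with `boto3.client("autoscaling")`.

```python
def set_scale_in_protection(autoscaling, asg_name, instance_id, protected):
    """Turn scale-in protection on or off for a single instance.

    `autoscaling` is expected to be a boto3 Auto Scaling client,
    e.g. boto3.client("autoscaling").
    """
    autoscaling.set_instance_protection(
        InstanceIds=[instance_id],
        AutoScalingGroupName=asg_name,
        ProtectedFromScaleIn=protected,
    )


def run_job_protected(autoscaling, asg_name, instance_id, job):
    """Mark the instance as "Working", run the job, then mark it "OK to terminate".

    `job` is any zero-argument callable (hypothetical stand-in for real work).
    """
    set_scale_in_protection(autoscaling, asg_name, instance_id, True)
    try:
        return job()
    finally:
        # Always release protection, even if the job raised an exception,
        # so a failed worker does not stay protected forever.
        set_scale_in_protection(autoscaling, asg_name, instance_id, False)
```

Something like `run_job_protected(client, "my-asg", "i-0123456789abcdef0", do_work)` would then wrap each job. Because protection is held for as long as the job runs, it doesn't matter whether a job takes minutes or hours.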
If you have a large number of instances, be careful about API throttling. To avoid throttling, these are some best practices:
Alternatively, you could use scaling policies only for scale-out, and then have the instances themselves control scale-in. Use the same logic as above, but when an instance has no work left and is ready to be terminated, have it call the terminate-instance-in-auto-scaling-group command on itself. This method might not be ideal if you don't want the ASG going down to 0 instances.
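A minimal sketch of that self-termination step, again assuming Python with boto3: `terminate_self_if_idle` and the `has_work` callback are hypothetical names, and `terminate_instance_in_auto_scaling_group` is the boto3 counterpart of the CLI command. Passing `ShouldDecrementDesiredCapacity=True` is what keeps the group from launching a replacement.

```python
def terminate_self_if_idle(autoscaling, instance_id, has_work, decrement_capacity=True):
    """Terminate this instance through its Auto Scaling group when it is idle.

    `autoscaling` is expected to be a boto3 Auto Scaling client.
    `has_work` is a zero-argument callable (hypothetical) that returns True
    while jobs are still queued or running on this instance.
    Returns True if termination was requested, False otherwise.
    """
    if has_work():
        return False
    autoscaling.terminate_instance_in_auto_scaling_group(
        InstanceId=instance_id,
        # Shrink the desired capacity too, so the ASG does not
        # immediately launch a replacement instance.
        ShouldDecrementDesiredCapacity=decrement_capacity,
    )
    return True
```

The worker would call this between jobs, for example `terminate_self_if_idle(client, my_instance_id, queue_has_jobs)`, where `my_instance_id` and `queue_has_jobs` are placeholders for however the instance looks up its own ID and checks for pending work.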