We are in the process of moving away from AWS, where we have a highly available setup built on EC2's auto scaling feature. However, we aren't using it to resize the pool based on resource usage; we only use it to spin up a new instance when one fails or becomes unresponsive.
On cloud providers that lack this auto scaling feature (we are looking specifically at DigitalOcean, but the question should apply anywhere), what are some options for achieving the same setup? My first thought was to create an instance that monitors the others, but then that server becomes a single point of failure. Are there any services or established patterns that accomplish this, whether through an automated service or through scripts written against the provider's API, without creating a single point of failure?