I have multiple webservers with the same content, hosted across different providers. However, I can't seem to find a nice, simple failover solution. Load-balancing software (Pound, HAProxy, etc.) is unnecessary, and I need the flexibility to manage 100+ domains, so the paid DNS failover solutions I've found are too expensive.
So far the simplest solution I've thought of is to set a very low TTL (30 min to 1 hr) on each zone entry on my nameservers (running BIND), then continuously monitor each server and temporarily remove failed servers from the zone entries. But this seems like something that should already exist.
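A rough sketch of the monitor-and-remove idea in Python, just to make it concrete. Everything here is an assumption for illustration: the function names, the plain TCP connect on port 80 as the health check, the 1800-second TTL, and the choice to keep all records if every server looks down (so the zone never ends up empty).

```python
import socket


def is_up(ip, port=80, timeout=3.0):
    """Return True if a TCP connection to ip:port succeeds within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_ips(ips, check=is_up):
    """Keep only the IPs that pass the check.

    Assumption: if nothing passes, keep every IP rather than
    publishing an empty record set.
    """
    alive = [ip for ip in ips if check(ip)]
    return alive or list(ips)


def render_a_records(name, ips, ttl=1800):
    """Build zone-file A record lines (name, TTL, class, type, address)."""
    return [f"{name} {ttl} IN A {ip}" for ip in ips]
```

The rendered lines would still have to be written into the zone file, the SOA serial bumped, and BIND told to reload (for example with `rndc reload`), all of which is left out here.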
I only have root access to different VPSes running CentOS. Any suggestions? Thanks!
What you are looking for is called a global load balancer (the technique is often referred to as global server load balancing, or GSLB). It basically does the same thing you are doing, only in a more automated fashion.
We do something similar with one of our systems. DNS is served by MyDNS, so all the records are stored in MySQL, which makes updates nice and simple. The TTLs are also kept very low, as even a 5-minute outage can be a pain.
The system basically works by checking heartbeats every few minutes and updating the records accordingly.
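A rough illustration of that loop in Python, not the actual implementation. The class and function names are hypothetical, and the rule of marking a host down only after a few consecutive missed heartbeats is an added assumption to avoid flapping on a single lost check. The database side is deliberately left out: the output is just the list of addresses to add to or remove from the published records.

```python
from collections import defaultdict


class HeartbeatTracker:
    """Treat a host as down only after `threshold` consecutive misses."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.misses = defaultdict(int)  # ip -> consecutive missed heartbeats

    def record(self, ip, ok):
        """Record one heartbeat result; return True if the host counts as up."""
        if ok:
            self.misses[ip] = 0
        else:
            self.misses[ip] += 1
        return self.misses[ip] < self.threshold


def plan_changes(published, up):
    """Compare published IPs with those currently up.

    Returns (to_add, to_remove) as sorted lists.
    """
    published, up = set(published), set(up)
    return sorted(up - published), sorted(published - up)
```

Each run of the checker would feed its results into `record`, collect the hosts that still count as up, and pass them to `plan_changes`; the resulting two lists map directly onto the inserts and deletes to issue against the records table.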
It's not perfect: when a host goes down, users who were already handed that DNS record, or who sit behind proxies with stupid DNS cache policies, still see an outage until the cached entry expires. The only way around this is to cluster the hosts together at each location in some sort of HA setup.
Take a look at Perlbal