We host some web sites on servers in our office. We have two internet connections, a primary and a backup connection (with separate IP address ranges.) We need to be able to detect, from outside our LAN, that the primary internet connection has failed.
We use a special DNS service that will switch the DNS records to point at our backup IP addresses if the primary addresses go offline. It detects if our primary addresses have gone offline by pinging a server, or loading a web page on our end, and if that fails, it switches our DNS over to the secondary IP addresses.
We did have it check for the existence of a page on our web server, but the problem is when we reboot that web server for windows updates, it thinks the internet connection has gone down and switches over to the secondary connection, which causes problems. It switches back after a few minutes, once the server reboots.
Up until recently, my procedure was to log into the external DNS and disable the fail-over, then reboot the server, then re-enable it. It's not an optimal solution because the fail-over has to set on on per-domain basis, and there are quite a few.
I would like to have the fail-over check a device which is always on, 100% of the time. Preferably no moving parts on a UPS, so it never goes down. I tried to use a Hawking print server, which has a web-based interface, but it seems unreliable as it has detected the internet connection has failed when it really has not. Worst case, I can build a fan-less PC that boots from USB maybe, and just runs a stripped down unix version with a web host that serves one page.
Are there any simple, 100% reliable devices that could be pressed into service that can be pinged, or serve a web page that could be used?
Ok, so my original answer assumed you were talking about failover web servers. What you want is entirely different from that. Thanks for updating your question.
What we've just done here we've accomplished with BGP. We have 2 providers that gave us a BGP link each. We then have an IP range that each provider advertises over the BGP. When either link fails the BGP takes care of routing the network. This is the way the internet works when links go down. You don't need DNS changes at all and it's much more reliable than what you have and fully automated. Could this be an option for you?
Sorry, but this sounds to me like the failover is doing what you intended. If you reboot the server is not serving any content until it's fully rebooted and the services have started. If you don't want it to check the web server then just a simple ping would work. But, this is still not going to be ideal when users try to reach your web site as IIS may not have started even though you can ping it.
Using DNS for failover is never going to be reliable. It looks like you've found this out the hard way.
The reason for this is that DNS entries are cached all over the internet and DNS caches and servers can choose to ignore your TTL values even when set rather low to say 30 seconds. If you've got it set to 5 minutes then it's going to be at least that long before things change back to normal. In any case it also sounds like your backup servers are not working as intended or is it simply a static web page to say the site is temporarily down?
There are better solutions available in the form of failover web servers. I'm not familiar with how microsoft do it exactly but I believe Microsoft Cluster Server is the correct name for the product. Or you may be able to switch your web servers to Linux. mono and mod-mono is available if you need .NET support under the apache web server. These are free. I don't think MS cluster server comes cheap.
A third option may be to set up a linux router inside your network. Then hack together a script which checks the web server every minute. If it's down, the script can change a route or redirect to a static page or another server. You might even be able to do it using iptables redirect. Anyway, just some thoughts and I know this is an old question now.
Some small board computer such as a Gumstix Overo based micro computer or Rasberry Pi would work. For web server you could use Python's built in webserver and supervisor or another web server.
I have had a Gumstix running for over 3 weeks with no over heating or downtime.