I'm working on a setup with two datacenters linked by a MAN (bridged), with everything doubled between them in fail-over mode using RedHat Cluster, DRBD and that kind of thing.
I have one DNS server for each location, but it turns out that having both in /etc/resolv.conf doesn't help much; if one goes down, the client waits 10s or so half of the time. In other words, it's using them for load balancing, not fail-over. So I configured the two servers to use a VIP with ucarp (≈VRRP).
Is there a way to have my two DNS servers both be up and, for example, respond to the same IP, all the time? It's no big deal if one NS request gets two answers.
Is there a way to do this with Anycast / Multicast and so on?
Edit: it turns out anycast won't do me any good in my scenario: I only have static routes, and most traffic actually goes through a bridge.
What would be interesting would be a way to have two DNS servers answer to requests on the same IP, if that's somehow possible.
You can massively mitigate problems by setting a couple of options in your resolv.conf:
rotate makes the resolver round-robin between your nameservers, rather than always using the first one unless it times out. timeout:2 reduces the DNS timeout to two seconds, rather than the default of five.
(NB: this was tested on Debian/Ubuntu, but I don't think this is a Debian specific change)
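For example, a resolv.conf using both options might look like this (the nameserver addresses are placeholders for your own two servers):

    # /etc/resolv.conf -- addresses below are placeholders
    nameserver 192.0.2.53
    nameserver 198.51.100.53
    options rotate timeout:2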
Anycast DNS would allow you to configure one resolver IP in all your clients; client requests would be forwarded to the 'closest' (from a network routing perspective) server.
If you tied the advertisement of the anycast VIP to a healthcheck (e.g. requesting the A record for a well known domain), then should one of your servers fail its route would be withdrawn. Once the network reconverged, all requests would be forwarded to the other device without any manual reconfiguration.
In terms of implementation, this can be done either with hardware appliances (e.g. F5 Big IP, Citrix Netscaler) or with your own configuration: either run a routing daemon (e.g. Quagga) on your DNS servers, or use custom scripts that log in to your routers to change the state of each anycast VIP.
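As a rough sketch of the withdraw-on-failure idea (assuming the DNS server itself runs a routing daemon such as Quagga that redistributes connected routes, and using a made-up anycast address), a small watchdog run from cron could look like this:

    #!/bin/sh
    # Sketch only: keep the anycast VIP on the loopback while the local
    # nameserver answers, drop it when it does not.  With the routing daemon
    # redistributing connected routes, removing the address withdraws the route.
    VIP=192.0.2.53        # made-up anycast service address
    if dig +time=2 +tries=1 @127.0.0.1 www.example.com A >/dev/null 2>&1; then
        ip addr show dev lo | grep -q "$VIP" || ip addr add "$VIP/32" dev lo
    else
        ip addr del "$VIP/32" dev lo 2>/dev/null
    fi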
Fix the client - use a better resolver.
lwresd is part of BIND. It runs as a local service. You configure libc to use it via /etc/nsswitch.conf, so using it is transparent to all but statically compiled programs.
lwresd monitors the performance and availability of configured name servers (this is standard BIND behaviour). Should a host become unavailable, lwresd will back off from that server and send all queries to the other configured servers. As it runs locally on each host, it should normally send all queries to the closest server.
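If I remember correctly, the glue on Debian-style systems is the lwres NSS module (packaged as libnss-lwres); the hosts line in /etc/nsswitch.conf then looks something like:

    # /etc/nsswitch.conf -- hosts lookups go through lwresd, plain DNS as fallback
    hosts: files lwres dns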
I run an internal BGP anycast recursive DNS Cluster on two Linux Virtual Server (IPVS) Loadbalancers and it works like a charm.
The basic setup is described in the link below (sorry, new users aren't allowed to add inline hyperlinks, so the link comes later).
The problem with using VRRP for the service IP is that it will wander between your two servers, so your nameserver needs to bind to it quickly in order to answer queries after a failover. You could work around this by NATing, just as in my IPVS setup, but I'd recommend load balancing with active service checks so you know when something is wrong.
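A minimal IPVS table for DNS (NAT mode, all addresses made up) would look roughly like this:

    # 192.0.2.53 is the service VIP, 10.0.0.11/10.0.0.12 the two real DNS servers
    ipvsadm -A -u 192.0.2.53:53 -s rr                 # UDP virtual service, round robin
    ipvsadm -a -u 192.0.2.53:53 -r 10.0.0.11:53 -m    # real server 1, NAT mode
    ipvsadm -a -u 192.0.2.53:53 -r 10.0.0.12:53 -m    # real server 2, NAT mode
    ipvsadm -A -t 192.0.2.53:53 -s rr                 # same again for DNS over TCP
    ipvsadm -a -t 192.0.2.53:53 -r 10.0.0.11:53 -m
    ipvsadm -a -t 192.0.2.53:53 -r 10.0.0.12:53 -m

ipvsadm itself does no health checking, which is why in practice something like ldirectord or keepalived maintains this table and pulls a real server out when it stops answering test queries.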
Please note that while there are DNS implementations that make use of multicast (Apple Bonjour/mDNS, for example), these are usually not well suited to resilient or high-volume recursive DNS service and are also commonly limited to use within the same collision domain, i.e. the LAN.
The simple dumb way:
Ask your Linux box to be much more aggressive about the DNS servers in resolv.conf: options timeout:1 rotate (the glibc resolver only takes whole seconds for timeout, so 1 is as aggressive as it gets).
So timeouts are quick and rotate makes it round-robin the load across both, without any VIP/VRRP/stuff to manage, just two DNS servers doing their job...
Anycast is frequently used to solve this requirement. Anycast DNS is the use of routing and addressing policies to select the most efficient path between a single source (the DNS client) and several geographically dispersed targets that "listen" for a service (DNS) within a receiver group. In anycast, the same IP address is used to address each of the listening targets (the DNS servers in this case). Layer 3 routing dynamically handles the calculation and transmission of packets from our source (DNS client) to its most appropriate target (DNS server).
Please see www.netlinxinc.com for an entire series of blog posts devoted to Anycast DNS, with recipes for how to configure it. The series has covered Anycast DNS using static routing and RIP so far, and I will be posting recipes on OSPF and BGP shortly.
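For the static-routing variant (which matches the original poster's constraints), the gist, with made-up addresses, is roughly: put the shared anycast address on a loopback on both DNS servers and point a static /32 route at the nearest one, with a floating backup route to the other:

    # On both DNS servers: bind the shared anycast address to a loopback
    ip addr add 192.0.2.53/32 dev lo

    # On each site's router (Cisco-style syntax, purely illustrative):
    ip route 192.0.2.53 255.255.255.255 10.0.1.11
    ip route 192.0.2.53 255.255.255.255 10.0.2.11 250

The catch is that a plain static route is never withdrawn when the nameserver process dies, so on its own this only covers the host or link disappearing; for anything smarter the route needs to be tied to a health check or a dynamic routing protocol.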
If it's acceptable to have a few seconds of DNS failure before the swapover occurs, you can create a simple shell script to do this. Non working pseudocode follows:
If you are using load balancers anywhere in your site, you should be able to configure them to have DNS as a virtual service.
My Kemp Loadmaster 1500s can be set up to do round-robin with failover. They use service checking to make sure that each DNS server is up every few seconds and divide the traffic between the two servers. If one dies, it drops out of the RR pool and only the "up" server gets queried.
You'd just have to point your resolv.conf at the VIP on the load balancer.
You want DNS to be reliable. Adding a huge amount of complexity to the setup will cause an absolute nightmare when something breaks.
Some of the proposed solutions only work when the redundant DNS servers are at the same site.
The fundamental issue is that the DNS client is broken as designed. It doesn't remember when a server was unreachable, and keeps trying to connect to the same nonresponsive server.
NIS handled this issue by having ypbind keep state. A clumsy solution, but it usually works.
The solution here is to lean on vendors to implement a reasonable fix. It's getting worse with IPv6, as the AAAA requests add to the time wasted on timeouts. I have seen protocols fail (e.g. an sshd connection) because they spent so much time waiting on DNS timeouts due to a single unreachable DNS server.
In the interim, as has been previously suggested, write a script that replaces resolv.conf with one that contains only valid nameservers. Share this script with vendors to demonstrate the unclean solution that you were forced to implement.
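A rough, untested sketch of such a script (the nameserver addresses, the example.com header and the /tmp/resolv.conf target are placeholders, as discussed below):

    #!/bin/sh
    # Rough sketch: keep only the nameservers that still answer in resolv.conf.
    nameservers="192.0.2.1 192.0.2.2"     # placeholder addresses
    resolv_conf=/tmp/resolv.conf          # change to /etc/resolv.conf for real use
    tmp=$resolv_conf.tmp

    # Default header; adjust the search domain to taste.
    echo "search example.com" > "$tmp"

    for ns in $nameservers; do
        # Keep a server only if it still answers a lookup for a well-known name.
        if nslookup www.example.com "$ns" 2>/dev/null | grep -q 'Name:'; then
            echo "nameserver $ns" >> "$tmp"
        fi
    done

    # Never install an empty server list, and only rewrite the file when it changed.
    if grep -q '^nameserver' "$tmp" && ! cmp -s "$tmp" "$resolv_conf"; then
        mv "$tmp" "$resolv_conf"
    fi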
This hasn't been seriously tested, and it assumes an nslookup that parses like mine, and a grep that supports "-q".
Run this out of cron every 5 minutes or so.
I'm not seriously suggesting that anyone actually use cron and a shell script for critical failover management, the error-handling surprises are just too great. This is a proof of concept only.
To test this for real, change the "nameservers=" line at the top, change resolv_conf at the top from /tmp/resolv.conf to /etc/resolv.conf, and adjust the default resolv.conf header that contains example.com.
You may need to restart nscd if you replace resolv.conf.
I would first try duplicating your VRRP, but with an additional VIP. For each VIP, alternate the primary and backup nodes.
DNS1 = vip1 primary, vip2 secondary
DNS2 = vip2 primary, vip1 secondary
Then list both VIPs in the resolver configuration on each of your client machines. That way the load is spread across the nameservers, but if one goes down, the other one just takes over the additional load.
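With ucarp (which the original poster is already using), the crossed setup might look roughly like this on DNS1; swap the -k (advskew) values on DNS2 so each box is the preferred master for one VIP. Addresses, password and script paths are made up:

    # Two ucarp instances on DNS1, one per VIP (lower advskew = preferred master)
    ucarp -B -i eth0 -s 10.0.0.11 -v 1 -p secret -a 192.0.2.53 -k 0 \
          --upscript=/etc/ucarp/vip-up.sh --downscript=/etc/ucarp/vip-down.sh
    ucarp -B -i eth0 -s 10.0.0.11 -v 2 -p secret -a 192.0.2.54 -k 100 \
          --upscript=/etc/ucarp/vip-up.sh --downscript=/etc/ucarp/vip-down.sh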