I have two DNS servers running BIND9, one master and one slave. When the zone file is updated on the master, I want the slave server to immediately start serving the changed record(s), but BIND is giving me some guff.
DNS zone transfer is already working correctly between the master and the slave. I can log into the slave server, run dig @dnsmaster myzone. AXFR, and it prints out the entire contents of the zone. To make that work, the DNS master is configured with notify yes and also-notify { dnsslave }. Likewise, the slave is configured with allow-transfer { dnsmaster }.
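For reference, here is a minimal sketch of that kind of configuration. The zone name, file paths, and addresses are hypothetical (in a real named.conf these options take IP addresses or ACL names, not bare hostnames):

```
// named.conf on the master (sketch; addresses hypothetical)
zone "myzone." {
    type master;
    file "/var/named/myzone.db";
    notify yes;
    also-notify { 192.0.2.2; };      // the slave
    allow-transfer { 192.0.2.2; };   // let the slave pull the zone
};

// named.conf on the slave
zone "myzone." {
    type slave;
    masters { 192.0.2.1; };          // the master
    file "/var/named/slavedata/myzone.db";
};
```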
When dnsmaster is updated, I run rndc reload and it tells me that notifications are being sent. This is confirmed on the slave by inspecting the zone files in /var/named/slavedata/. They contain the most recent data, matching what the master knows.
Now comes the weird part.
The slave server will continue to serve up old, stale DNS records, completely ignoring the fact that new data is available on disk after being notified by the master. I'm checking the results with dig, using this command: dig @slaveserver record.zone.tld.
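One way to make "stale vs. fresh" concrete is to compare the SOA serial each server answers with, and then the record itself. A sketch (hostnames hypothetical):

```shell
# Ask both servers for the zone's SOA serial; once a transfer
# has completed, the two serials should match.
dig +short SOA myzone. @dnsmaster
dig +short SOA myzone. @slaveserver

# Then compare the specific record that looks stale.
dig @dnsmaster record.zone.tld
dig @slaveserver record.zone.tld
```

If the serials already match but the answers differ, the problem is not the transfer itself but which copy of the zone the slave is answering from.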
I thought BIND might be keeping an in-memory cache of its authoritative zones, so I set max-cache-size and max-cache-ttl to 0, but that had no effect.
I tried other ways of flushing this alleged cache, by running commands like rndc flush and rndc reload on the slave server, but it still returns the old stale records.
Finally, I noticed that the MINTTL on the zone was set to 86400 (24 hours), so I temporarily changed it to 15 seconds and restarted the slave server. That made no difference either: the slave only serves updated DNS results after the service is restarted.
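For context, the value usually called MINTTL is the last field of the zone's SOA record, and it governs negative-answer caching on resolvers (RFC 2308), not how long a slave keeps its copy of the zone; it is the refresh field that controls how often a slave polls the master when NOTIFY is not working. A sketch with illustrative values:

```
myzone.  IN SOA ns1.myzone. admin.myzone. (
             2013010101 ; serial  - must increase on every change
             3600       ; refresh - how often the slave polls the master
             600        ; retry   - poll interval after a failed refresh
             604800     ; expire  - slave stops answering after this
             86400 )    ; minimum / "MINTTL" - negative-caching TTL
```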
What's going on here? What is the expected behaviour of BIND9 when it receives notification that a zone has been updated? Does it always respect TTL and MINTTL? I would assume that it would always use the most recent data available.
At my wit's end, I am considering setting up a crontab to restart the BIND slaves on an hourly basis, just to avoid serving up stale data. Is there anything better?
From your description I can't tell you exactly what the problem is, but I can help you rule out several things.
The cache size and cache TTL settings apply only to cached recursive-query data and (as you already suspected) don't affect authoritative data. Similarly, rndc flush is inapplicable here.
Suggested troubleshooting method: after editing and reloading the zone on the master, compare the SOA serial that the master and the slave each answer with (they should match once the transfer completes), and watch the slave's logs for the incoming NOTIFY and the subsequent zone transfer.
If that doesn't work, consider posting more information, including named.conf sections from both the master and slave and logs from both servers of what is occurring after you load a freshly edited zone on the master.
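As a hedged sketch of that kind of check (zone name and log path hypothetical; rndc zonestatus needs a reasonably recent BIND 9):

```shell
# On the slave: what does named think the zone's state is?
rndc zonestatus myzone.

# Force the slave to re-fetch the zone from the master,
# bypassing NOTIFY entirely:
rndc retransfer myzone.

# Watch the logs for the NOTIFY and the transfer
# (the log path varies by distribution):
tail -f /var/log/messages | grep -i -e notify -e transfer
```

If rndc retransfer makes the fresh data appear immediately, the transfer machinery is fine and the problem is in how the NOTIFY reaches (or fails to reach) the zone the slave is actually serving.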
I faced the same situation. My research led me to the following realization: if you are using views, then dig against the local machine is only answered from the localhost view, and that view is only refreshed when named is restarted. The latest zone file (transferred from the master) is still available on the slave and is served to all queries coming from external sources or external views. So you need to make arrangements so that your localhost view is refreshed as well.
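A sketch of the kind of views setup that produces this symptom (names and addresses hypothetical). The point is that each view has its own independent copy of the zone, so each view must slave it separately, with its own file:

```
view "localhost-view" {
    match-clients { 127.0.0.1; };
    zone "myzone." {
        type slave;
        masters { 192.0.2.1; };
        file "slavedata/myzone-local.db";  // per-view copy
    };
};

view "external" {
    match-clients { any; };
    zone "myzone." {
        type slave;
        masters { 192.0.2.1; };
        file "slavedata/myzone-ext.db";    // separate per-view copy
    };
};
```

Note that the master's single NOTIFY will be matched against one view only, so a view that never sees the NOTIFY won't refresh until its SOA refresh timer fires (or until named restarts), which matches the behaviour described above.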
Please do not forget to increment the serial number in the zone files on the master server when you make changes, before reloading named; otherwise the zones will not be replicated to the slave server.
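Concretely, that means bumping the SOA serial in the zone file before the reload (values illustrative):

```shell
# In the zone file on the master, increase the serial, e.g.
#   2013010101  ->  2013010102
# (a common convention is YYYYMMDDnn), then tell named to pick it up:
rndc reload myzone.
```

The slave decides whether to transfer by comparing serials, so an unchanged serial means "nothing new here" no matter how much the rest of the file changed.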