The engineering team I work with has been in the process of moving equipment from one datacenter to another. Ten days ago we moved one of our name servers authoritative for our client's domains (ns1.faithhiway.com) and updated its IP address with its respective DNS provider (register.com) to point to the new datacenter. All tests done show that this name server is correctly running at its new location and when queried, returning the correct response for any domains it is responsible for.
The problem is that well after 72 hours had gone by we were still seeing more DNS activity at its old IP address than at the new. The good news is that we kept a name server responding on the old IP address for the time being so we are not seeing any issues with the domains our nameserver is responsible for but the goal is to retire that as soon as possible. As you can see from WhatsMyDNS.net, a decent amount of propagation has occurred over the last 10 days since we made this change, but still there are some locations reporting our original IP.
Considering that the TTL is only 3600 with the name servers responsible for this domain, it does not make any sense to myself or the other engineers working with me that we are having this issue.
Now if I run a DNS check using one of the Register.com DNS servers (direct nameservers for faithhiway.com), I get the following (correct) result:
# dig @dns01.gpn.register.com ns1.faithhiway.com A
; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> @dns01.gpn.register.com. ns1.faithhiway.com A
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43232
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 5, ADDITIONAL: 5
;; QUESTION SECTION:
;ns1.faithhiway.com. IN A
;; ANSWER SECTION:
ns1.faithhiway.com. 3601 IN A 206.127.2.71
;; AUTHORITY SECTION:
faithhiway.com. 3600 IN NS dns01.gpn.register.com.
faithhiway.com. 3600 IN NS dns02.gpn.register.com.
faithhiway.com. 3600 IN NS dns03.gpn.register.com.
faithhiway.com. 3600 IN NS dns04.gpn.register.com.
faithhiway.com. 3600 IN NS dns05.gpn.register.com.
;; ADDITIONAL SECTION:
dns01.gpn.register.com. 3600 IN A 98.124.192.1
dns02.gpn.register.com. 3600 IN A 98.124.197.1
dns03.gpn.register.com. 3600 IN A 98.124.193.1
dns04.gpn.register.com. 3600 IN A 69.64.145.225
dns05.gpn.register.com. 3600 IN A 98.124.196.1
;; Query time: 50 msec
;; SERVER: 98.124.192.1#53(98.124.192.1)
;; WHEN: Thu Jan 27 15:16:57 2011
;; MSG SIZE rcvd: 269
Just as a reference, here are the results when the same query is checked against a variety of Public DNS servers:
Google:
# dig @8.8.8.8 ns1.faithhiway.com A
; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> @8.8.8.8. ns1.faithhiway.com A
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12773
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;ns1.faithhiway.com. IN A
;; ANSWER SECTION:
ns1.faithhiway.com. 997 IN A 206.127.2.71
;; Query time: 29 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Thu Jan 27 15:17:31 2011
;; MSG SIZE rcvd: 52
Level 3:
# dig @4.2.2.1 ns1.faithhiway.com A
; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> @4.2.2.1. ns1.faithhiway.com A
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46505
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;ns1.faithhiway.com. IN A
;; ANSWER SECTION:
ns1.faithhiway.com. 2623 IN A 206.127.2.71
;; Query time: 7 msec
;; SERVER: 4.2.2.1#53(4.2.2.1)
;; WHEN: Thu Jan 27 15:18:35 2011
;; MSG SIZE rcvd: 52
Verizon:
# dig @151.197.0.38 ns1.faithhiway.com A
; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> @151.197.0.38. ns1.faithhiway.com A
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 32658
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;ns1.faithhiway.com. IN A
;; ANSWER SECTION:
ns1.faithhiway.com. 3601 IN A 206.127.2.71
;; Query time: 81 msec
;; SERVER: 151.197.0.38#53(151.197.0.38)
;; WHEN: Thu Jan 27 15:19:15 2011
;; MSG SIZE rcvd: 52
Cisco:
# dig @64.102.255.44 ns1.faithhiway.com A
; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> @64.102.255.44. ns1.faithhiway.com A
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39689
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 5, ADDITIONAL: 0
;; QUESTION SECTION:
;ns1.faithhiway.com. IN A
;; ANSWER SECTION:
ns1.faithhiway.com. 3601 IN A 206.127.2.71
;; AUTHORITY SECTION:
faithhiway.com. 3600 IN NS dns01.gpn.register.com.
faithhiway.com. 3600 IN NS dns04.gpn.register.com.
faithhiway.com. 3600 IN NS dns05.gpn.register.com.
faithhiway.com. 3600 IN NS dns02.gpn.register.com.
faithhiway.com. 3600 IN NS dns03.gpn.register.com.
;; Query time: 105 msec
;; SERVER: 64.102.255.44#53(64.102.255.44)
;; WHEN: Thu Jan 27 15:20:05 2011
;; MSG SIZE rcvd: 165
OpenDNS:
# dig @208.67.222.222 ns1.faithhiway.com A
; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> @208.67.222.222. ns1.faithhiway.com A
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12328
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;ns1.faithhiway.com. IN A
;; ANSWER SECTION:
ns1.faithhiway.com. 169507 IN A 207.200.19.162
;; Query time: 6 msec
;; SERVER: 208.67.222.222#53(208.67.222.222)
;; WHEN: Thu Jan 27 15:19:29 2011
;; MSG SIZE rcvd: 52
SpeakEasy:
# dig @66.93.87.2 ns1.faithhiway.com A
; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> @66.93.87.2. ns1.faithhiway.com A
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9342
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;ns1.faithhiway.com. IN A
;; ANSWER SECTION:
ns1.faithhiway.com. 169323 IN A 207.200.19.162
;; Query time: 69 msec
;; SERVER: 66.93.87.2#53(66.93.87.2)
;; WHEN: Thu Jan 27 15:19:51 2011
;; MSG SIZE rcvd: 52
As you can see above, the majority of queries are returning the correct result. But a few (OpenDNS and SpeakEasy in the examples above) are still showing the old IP address. Considering the length of time that has gone by, it seems obvious to me that either we have made a mistake and not thoroughly handled the DNS changes on our end (likely) or there is a problem with either the DNS provider for this domain (Register) or with some of the DNS servers out in the wild (rather unlikely).
Any advice on how I can proceed with this?
UPDATE (January 31, 2011):
First of all, I apologize for the length of both the original question and this update. I contemplated removing some of the excess from the original post but just in case this problem and its solution are helpful to someone else in the future I'm just going to leave everything as it is.
Anyway, I've been doing some more research into this problem, and have discovered the following interesting occurrence. While running a check on the glue records for faithhiway.com always resolve correctly, if I go and check a client domain (where ns1.faithhiway.com is authoritative), I get a strange response. It looks like the root servers are returning nsX.faithhiway.com as their old IP addresses still (under Additional Section). Because we have a server still there responding to DNS queries, the trace finishes and returns the correct IP addresses as the final step (again, under Additional Section). The example below uses one of the domains that we use that uses ns1.faithhiway.com as its authoritative DNS server.
# dig +trace +nosearch +all +norecurse ignitemail.com
; <<>> DiG 9.2.4 <<>> +trace +nosearch +all +norecurse ignitemail.com
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46856
;; flags: qr ra; QUERY: 1, ANSWER: 13, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;. IN NS
;; ANSWER SECTION:
. 7986 IN NS a.root-servers.net.
. 7986 IN NS b.root-servers.net.
. 7986 IN NS c.root-servers.net.
. 7986 IN NS d.root-servers.net.
. 7986 IN NS e.root-servers.net.
. 7986 IN NS f.root-servers.net.
. 7986 IN NS g.root-servers.net.
. 7986 IN NS h.root-servers.net.
. 7986 IN NS i.root-servers.net.
. 7986 IN NS j.root-servers.net.
. 7986 IN NS k.root-servers.net.
. 7986 IN NS l.root-servers.net.
. 7986 IN NS m.root-servers.net.
;; Query time: 39 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Mon Jan 31 09:22:17 2011
;; MSG SIZE rcvd: 228
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16325
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 14
;; QUESTION SECTION:
;ignitemail.com. IN A
;; AUTHORITY SECTION:
com. 172800 IN NS h.gtld-servers.net.
com. 172800 IN NS m.gtld-servers.net.
com. 172800 IN NS i.gtld-servers.net.
com. 172800 IN NS l.gtld-servers.net.
com. 172800 IN NS c.gtld-servers.net.
com. 172800 IN NS k.gtld-servers.net.
com. 172800 IN NS d.gtld-servers.net.
com. 172800 IN NS f.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS e.gtld-servers.net.
com. 172800 IN NS g.gtld-servers.net.
com. 172800 IN NS j.gtld-servers.net.
;; ADDITIONAL SECTION:
a.gtld-servers.net. 172800 IN A 192.5.6.30
a.gtld-servers.net. 172800 IN AAAA 2001:503:a83e::2:30
b.gtld-servers.net. 172800 IN A 192.33.14.30
b.gtld-servers.net. 172800 IN AAAA 2001:503:231d::2:30
c.gtld-servers.net. 172800 IN A 192.26.92.30
d.gtld-servers.net. 172800 IN A 192.31.80.30
e.gtld-servers.net. 172800 IN A 192.12.94.30
f.gtld-servers.net. 172800 IN A 192.35.51.30
g.gtld-servers.net. 172800 IN A 192.42.93.30
h.gtld-servers.net. 172800 IN A 192.54.112.30
i.gtld-servers.net. 172800 IN A 192.43.172.30
j.gtld-servers.net. 172800 IN A 192.48.79.30
k.gtld-servers.net. 172800 IN A 192.52.178.30
l.gtld-servers.net. 172800 IN A 192.41.162.30
;; Query time: 64 msec
;; SERVER: 198.41.0.4#53(a.root-servers.net)
;; WHEN: Mon Jan 31 09:22:17 2011
;; MSG SIZE rcvd: 504
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12860
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 2, ADDITIONAL: 2
;; QUESTION SECTION:
;ignitemail.com. IN A
;; AUTHORITY SECTION:
ignitemail.com. 172800 IN NS ns1.faithhiway.com.
ignitemail.com. 172800 IN NS ns2.faithhiway.com.
;; ADDITIONAL SECTION:
ns1.faithhiway.com. 172800 IN A 207.200.19.162
ns2.faithhiway.com. 172800 IN A 207.200.50.142
;; Query time: 152 msec
;; SERVER: 192.54.112.30#53(h.gtld-servers.net)
;; WHEN: Mon Jan 31 09:22:17 2011
;; MSG SIZE rcvd: 111
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43016
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 2
;; QUESTION SECTION:
;ignitemail.com. IN A
;; ANSWER SECTION:
ignitemail.com. 3600 IN A 206.127.2.64
;; AUTHORITY SECTION:
ignitemail.com. 3600 IN NS ns1.faithhiway.com.
ignitemail.com. 3600 IN NS ns2.faithhiway.com.
;; ADDITIONAL SECTION:
ns1.faithhiway.com. 3600 IN A 206.127.2.71
ns2.faithhiway.com. 3600 IN A 206.127.2.72
;; Query time: 25 msec
;; SERVER: 206.127.2.71#53(ns1.faithhiway.com)
;; WHEN: Mon Jan 31 09:22:18 2011
;; MSG SIZE rcvd: 127
I really think this is a problem we have somewhere in our setup, but whether it is ignorance of something with DNS on my or my fellow engineer's end or just a dumb mistake we made, I have yet to find it.
Problem solved. Finally. Apparently Register.com did not update the glue records for ns1 and ns2.faithhiway.com despite our initial request for them to do so (and their confirmation that it had been done).
The tests that I posted above in my update showed that despite their confirmation of the update, the glue records were not propagating correctly. I went ahead and pushed another update to our glue records and it looks like this time we are seeing propagation:
You have 2 problems:
The queries for ns1.faithhiway.com are returning incorrect results.
The name servers listed for your domain are wrong.
You're actually testing a little backward. You're testing for what ip address is being returned when querying for ns1.faithhiway.com but what you should be testing for first is what name servers are actually being returned for faithhiway.com. A Whois lookup and an nslookup return the following servers as being the name servers for faithhiway.com:
dns01.gpn.register.com
dns02.gpn.register.com
dns03.gpn.register.com
dns04.gpn.register.com
dns05.gpn.register.com
So you need to get that fixed first.
Lots of servers ignore your TTL and cache the info much longer than they should. The easiest way to solve this is usually to contact the affected network's operators and let them know. They're usually really good about fixing it pretty quickly.