I'm experiencing symptoms similar to those in the other question. That is, people are complaining about a domain name being inaccessible, and then it suddenly starts working after a while. But I doubt it's because of unreliable DNS servers. They belong to my domain name registrar, which is pretty popular in my country.
Could there be some other reasons for this? It's a subdomain, like sub.example.com, and I make it work with a wildcard DNS record. Could that possibly be a reason? Maybe some old DNS servers don't understand this sort of thing? The other reasons I can think of are temporary issues on the internet, where some hosts can't reach others, or some DNS servers filtering out records by some criterion.
UPD The setup is simple: no load balancers, no clusters, no round-robin DNS servers. And I'm talking about a publicly available server. The users are the users of the internet. I'm managing both example.com and sub.example.com. They are in one DNS zone.
UPD Or maybe after all it's because of the domain registrar. Its servers respond with a timeout:
>nslookup sub.example.com ns.domain.registrar.com
DNS request timed out.
timeout was 2 seconds.
Server: UnKnown
Address: 52.16.198.15
Name: sub.example.com
Address: 51.59.10.10
I suppose not all DNS servers would tolerate timeouts, don't you think?
From the comments above, this definitely sounds like a DNS issue. I could see this being caused by hosts caching records, by DNS servers themselves caching records (recursively), or even by a load balancer such as an F5 appliance doing its own caching. One that's bitten me more times than I care to admit is web proxies caching resolved results.
In visiting a website, you have two basic parts, the journey and the destination. The journey is the pre-connection stage and the destination is the actual connection to the server. You need to isolate the pre-connection part of the equation from the actual connection.
Once you resolve the IP address of the website, try typing in the IP address instead of the domain name in your browser. Do your ping tests and traceroutes against this IP address. Is the IP address responding? Is it continually responding over a period of time (ping -t x.x.x.x on Windows) or is it going up and down? If you can get the site with the IP and not the hostname, it's your DNS. If you can't get to it with either, you've got yourself a connectivity problem. If this is intermittent for either the domain name or the IP, you probably have an intermittent load-balancing problem (such as round-robin DNS servers, or improperly configured clustering) or an intermittent connectivity issue such as a layer 1 or 2 issue.
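To make that split between "journey" and "destination" repeatable, here is a minimal Python sketch using only the standard `socket` module. The helper names (`resolve`, `tcp_reachable`, `classify`, `diagnose`) and the choice of TCP port 80 are my own assumptions for illustration, not something from your setup. It also does not compare the freshly resolved address with the known one, so treat it as a starting point only.

```python
import socket


def resolve(hostname):
    """Return the IPv4 addresses the local resolver gives for hostname ([] on failure)."""
    try:
        infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def tcp_reachable(ip, port=80, timeout=3.0):
    """True if a TCP connection to ip:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def classify(name_resolves, ip_reachable):
    """Apply the rule of thumb above to the two observations."""
    if ip_reachable and not name_resolves:
        return "DNS problem"
    if not ip_reachable:
        return "connectivity problem"
    return "both stages OK right now"


def diagnose(hostname, known_ip, port=80):
    """known_ip is the address you resolved earlier, while things were working."""
    return classify(bool(resolve(hostname)), tcp_reachable(known_ip, port))
```

Calling something like `diagnose("sub.example.com", "51.59.10.10")` from a machine where users report the problem, and again from one where it works, should tell you which half to dig into.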
Something else to add to your toolbox is a looking-glass website called isitdownrightnow.com, which is designed to help you tell whether the problem is on your side or is a more global issue.
Some of this info I've provided is just shooting blindly in the dark, since this is an ongoing troubleshooting process and information is limited so far. If you can post more info as comments, I'm confident we can help get you straightened out.
For the future, I will just add some tips from my experience:
On Unix boxes you use
On Windows boxes you use
I did not fully understand which server is the SOA (Start of Authority) for example.com and for your sub.example.com, and whether they are different from the caching DNS servers that your users use. If they are not the same, all three or more must be dig-ged. DHCP might give several DNS servers to your users, and they might have different opinions on the records (caches).
On both Unix and Windows you can create cron jobs / scheduled tasks to log dig results once per minute to some logfile and then grep through it to catch hard-to-spot anomalies. (It helped me prove in one case that our servers indeed sometimes cannot ping each other on the local network!)
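A rough sketch of that logging idea in Python, in case a script is easier to schedule than a one-liner. It assumes `dig` is installed and on the PATH (on Windows that usually means installing the BIND tools). The function names, the log format and the `NO_ANSWER` marker are made up for illustration.

```python
import subprocess
from datetime import datetime, timezone


def dig_short(name, server, timeout=5):
    """Ask one specific DNS server for A records; [] on timeout or error."""
    try:
        out = subprocess.run(
            ["dig", "+short", "+time=2", "+tries=1", f"@{server}", name, "A"],
            capture_output=True, text=True, timeout=timeout,
        ).stdout
    except (OSError, subprocess.TimeoutExpired):
        return []
    # dig reports failures as ';;' comment lines; keep only real answers.
    return sorted(l.strip() for l in out.splitlines()
                  if l.strip() and not l.startswith(";"))


def format_line(when, server, name, answers):
    """One log line: timestamp, server asked, name, comma-joined answers."""
    return f"{when.isoformat()} {server} {name} {','.join(answers) or 'NO_ANSWER'}"


def find_anomalies(lines, expected):
    """Return the log lines whose answer field differs from the expected one."""
    return [l for l in lines if l.split()[-1] != expected]


def log_once(path, name, servers):
    """Append one line per server; meant to be run once a minute by cron."""
    now = datetime.now(timezone.utc)
    with open(path, "a") as fh:
        for server in servers:
            fh.write(format_line(now, server, name, dig_short(name, server)) + "\n")
```

You would schedule `log_once` with the authoritative server and each resolver your users get from DHCP, then run `find_anomalies` (or plain grep) over the file after a day or two.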