I have an intermittent problem, and I'm not sure where to start trying to troubleshoot it.
In our dev environment, we have two visible IP addresses on load balancers, one to the front-end, and one to a number of back-end service machines. The front-end is configured to take a wildcard DNS name to support generic "portals."
dev.example.com A 10.1.1.1
*.dev.example.com CNAME dev.example.com
The back-end servers are all specific names within the same space:
core.dev.example.com A 10.1.1.2
cms.dev.example.com CNAME core.dev.example.com
search.dev.example.com CNAME core.dev.example.com
Here's the problem. Periodically a developer or a program trying to reach, say, cms.dev.example.com will get a result that points to the front-end, instead of the back-end load balancer:
cms.dev.example.com is an alias to core.dev.example.com
core.dev.example.com is an alias to dev.example.com (WRONG!)
dev.example.com 10.1.1.1
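As a point of comparison, here is a minimal Python sketch of the lookup order a resolver is expected to follow for this zone: an exact name match wins, and the wildcard is consulted only when the name does not exist. The zone data is copied from the records above. The `lookup` and `resolve` helpers are hypothetical names used for illustration, and the wildcard handling is simplified to a single leftmost label, so this is not a full DNS implementation.

```python
# Zone data copied from the question; the lookup rules are a simplified sketch.
ZONE = {
    "dev.example.com": ("A", "10.1.1.1"),
    "*.dev.example.com": ("CNAME", "dev.example.com"),
    "core.dev.example.com": ("A", "10.1.1.2"),
    "cms.dev.example.com": ("CNAME", "core.dev.example.com"),
    "search.dev.example.com": ("CNAME", "core.dev.example.com"),
}


def lookup(name, zone=ZONE):
    """Return the record for name: exact match first, wildcard only as fallback."""
    if name in zone:
        return zone[name]
    # Swap the leftmost label for '*' to look for a wildcard under the parent.
    # (Simplification: real wildcards can also cover several missing labels.)
    _, _, parent = name.partition(".")
    return zone.get("*." + parent)


def resolve(name, zone=ZONE, max_hops=8):
    """Follow CNAMEs until an A record is found; return (chain, address)."""
    chain = [name]
    for _ in range(max_hops):
        record = lookup(name, zone)
        if record is None:
            return chain, None
        rtype, value = record
        if rtype == "A":
            return chain, value
        # CNAME: continue the lookup at the alias target.
        name = value
        chain.append(name)
    raise RuntimeError("CNAME chain too long")


if __name__ == "__main__":
    print(resolve("cms.dev.example.com"))
    print(resolve("portal1.dev.example.com"))
```

Under these rules cms.dev.example.com should always end at 10.1.1.2, and only names with no record of their own (portal1.dev.example.com here is a made-up example) should fall through to the wildcard and reach 10.1.1.1.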
The developers are all on Mac OS X machines, though I've seen the problem occur on an Ubuntu machine as well, using a local cloud host DNS resolver.
Sometimes the developer is using a VPN, which directs the DNS to its own resolver, and sometimes he's on the local net using a DNS resolver assigned by the NAT router.
Sometimes clearing the Mac OS X DNS cache, logging into the VPN, then logging out of the VPN, will make the problem go away.
The authoritative DNS is hosted at Zerigo, and a dig directly against their name servers always seems to give the correct answer. The published TTL for these records is 15 minutes, but the problem has been intermittent for about a week.
Any troubleshooting suggestions?
Hmm... what happens when you replace the wildcard CNAME directive with:
Don't mix CNAMEs with wildcards, especially if one of your CNAMEs might match a wildcard name.
It appears that the mix of wildcards, CNAMEs and DNS caching can give you inconsistent results under these conditions.
I solved my intermittent resolving problems by eliminating all of the CNAMEs and replacing them with A records. Not very DRY, but no more inconsistent lookups.
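If the duplication is a concern, one option is to keep the aliases in a single source list and generate the A records from it. The following is a minimal Python sketch of that idea, assuming the zone contains only A and CNAME records. The record list is copied from the question, and `flatten` is a hypothetical helper name, not part of any DNS tool or of Zerigo's interface.

```python
# Source-of-truth records, copied from the question: (name, type, value).
RECORDS = [
    ("dev.example.com", "A", "10.1.1.1"),
    ("*.dev.example.com", "CNAME", "dev.example.com"),
    ("core.dev.example.com", "A", "10.1.1.2"),
    ("cms.dev.example.com", "CNAME", "core.dev.example.com"),
    ("search.dev.example.com", "CNAME", "core.dev.example.com"),
]


def flatten(records):
    """Replace every CNAME with the record its chain of targets ends at."""
    by_name = {name: (rtype, value) for name, rtype, value in records}
    flat = []
    for name, rtype, value in records:
        seen = {name}
        # Walk the alias chain until a non-CNAME record is reached.
        while rtype == "CNAME":
            if value in seen or value not in by_name:
                raise ValueError(f"cannot flatten {name}: bad target {value}")
            seen.add(value)
            rtype, value = by_name[value]
        flat.append((name, rtype, value))
    return flat


if __name__ == "__main__":
    for name, rtype, value in flatten(RECORDS):
        print(name, rtype, value)
```

The output can be pasted or imported into the DNS provider by hand, so the aliases are still written once while the published zone holds only A records.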