Preface:
We have caching resolvers at each of our geographic network locations. These are clustered for resiliency and their locality reduces the latency of internal requests generated by our servers.
This works well, except that a large proportion of the requests seen on the wire are lookups for the same records, generated by applications that don't perform any DNS caching of their own.
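For illustration, the per-process caching these applications lack could look like the following minimal sketch. This is not part of our setup; the class name, TTL value, and injectable resolver are assumptions for the example:

```python
import socket
import time


class TTLDNSCache:
    """Minimal per-process DNS cache: the kind of caching the
    applications described above do NOT do, so every one of their
    lookups goes over the wire to the site resolvers."""

    def __init__(self, ttl=60, resolver=socket.getaddrinfo):
        self.ttl = ttl            # seconds to keep an answer
        self.resolver = resolver  # injectable, so tests avoid the network
        self._cache = {}          # (host, port) -> (expiry, answer)

    def getaddrinfo(self, host, port):
        key = (host, port)
        now = time.monotonic()
        hit = self._cache.get(key)
        if hit and hit[0] > now:
            return hit[1]                   # served from cache, no network
        answer = self.resolver(host, port)  # network round trip
        self._cache[key] = (now + self.ttl, answer)
        return answer
```

With something like this in place, repeated lookups for the same record within the TTL never leave the process, which is exactly the traffic we see duplicated on the wire today.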
Questions:
Is there a significant benefit to running lightweight caching daemons on the individual servers in order to reduce repeated requests from hitting the network?
Does anyone have experience of using nscd, lwresd or dnscache to do such a thing? Are there any other packages worth looking at? Any caveats to beware of, besides the obvious one of caching (and negatively caching) stale results?
Answers:
We use nscd on a few hundred machines; in my experience it "just works". It massively reduced the load on our DNS servers.
The only thing to watch for is that, by default, it will cache group/user/service lookups as well as host lookups. You may wish to disable this (in our case we wanted to cache those lookups as well).
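On glibc systems this is controlled per-database in /etc/nscd.conf. A sketch that keeps host caching but turns the other databases off might look like this (the TTL values are illustrative, not recommendations):

```
# /etc/nscd.conf (excerpt) -- keep only the hosts cache
enable-cache            hosts     yes
positive-time-to-live   hosts     600
negative-time-to-live   hosts     20

enable-cache            passwd    no
enable-cache            group     no
enable-cache            services  no
```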
I remember having problems with nscd many years ago, but the recent versions seem much improved.
I've used nscd in a few environments and, as others have said, it pretty much "just works". But if your caching resolvers at each site are working properly, I would say there isn't much of an upside to local caching: latency should already be low, and you're not sending any extra requests to the outside world, just generating some extra internal chatter.
I have been bitten a couple of times by nscd holding on to a stale entry and ending up with "one broken machine" because of it. So my paranoid advice is: if your internal DNS chatter isn't really impacting performance, don't bother with a local cache; and if performance is being impacted, document the existence of nscd so you remember to flush its cache whenever you make a time-critical change.
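For reference, with glibc's nscd the hosts cache can be flushed with the daemon's invalidate option (assuming a standard install):

```
# drop all cached host entries on this machine
nscd --invalidate=hosts    # short form: nscd -i hosts
```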