I have a caching-only DNS server that gets ~3k queries per second. Here are the specs:
Xeon dual-core 2.8 GHz, 4 GB of RAM
CentOS 5.x (kernel 2.6.18-164.15.1.el5PAE)
BIND 9.4.2
rndc status: recursive clients: 666/4900/5000
About 300 new queries (not in cache) per second.
BIND always uses 100% of one core in the single-threaded configuration. After I recompiled it with multi-threading, it uses nearly 200% across the two cores :( There is no iowait, only sys and user time. I searched around but didn't find any information about how BIND uses the CPU. Why does it become the bottleneck?
One more thing, here is RAM usage:
cat /proc/meminfo
MemTotal: 4147876 kB
MemFree: 1863972 kB
Buffers: 143632 kB
Cached: 372792 kB
SwapCached: 0 kB
Active: 1916804 kB
Inactive: 276056 kB
I've set max-cache-size to 0 so that BIND can use as much RAM as it wants, but it always stops at ~2GB. Since we get uncached queries every second, in theory the RAM should eventually be exhausted, but it isn't.
Do you have any idea?
TIA,
-Gk
Which version of BIND are you using? Versions before BIND 9.5 have known scalability problems under high load; see https://www.dns-oarc.net/files/dnsops-2007/Graff-BIND9-cache.pdf.
Besides:
I recommend you run a side-by-side test with dnscache from djbdns. It takes 10 minutes to install, is extremely simple to tune and maintain, and has predictable performance.
Interesting problem... Never seen bind use 100% CPU, but a quick search turned up a very interesting page that may help you fix the problem... Let me know how it turns out. I am interested to know the outcome.
3k qps for a server of that class is relatively low volume in raw I/O and memory bandwidth terms - I'd expect to be able to get nearer 20k if it was an authoritative server.
That said, BIND 9.4.2 is old. If you can roll your own or use non-RHEL RPMs you really should try BIND 9.7.x instead and see if that solves your performance issues.
Also, to use more than 2GB of RAM you'd need to be running on x64 in 64-bit mode rather than x86.
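For a rough sense of scale, here is a back-of-the-envelope sketch in Python using the figures from the question (about 3k qps in total, about 300 uncached queries per second, two saturated cores). The function names are only illustrative, and the CPU figure assumes both cores are spent entirely on BIND.

```python
def cache_hit_ratio(total_qps: float, miss_qps: float) -> float:
    """Fraction of queries that were answered from the cache."""
    if total_qps <= 0:
        raise ValueError("total_qps must be positive")
    return 1.0 - miss_qps / total_qps


def cpu_ms_per_query(total_qps: float, busy_cores: float) -> float:
    """Average CPU milliseconds spent per query.

    Assumes `busy_cores` cores are fully saturated by the DNS server,
    so `busy_cores` CPU-seconds are consumed every wall-clock second.
    """
    if total_qps <= 0:
        raise ValueError("total_qps must be positive")
    return busy_cores * 1000.0 / total_qps


if __name__ == "__main__":
    # Figures taken from the question: ~3k qps, ~300 misses/s, 2 busy cores.
    print(f"hit ratio: {cache_hit_ratio(3000, 300):.0%}")
    print(f"CPU per query: {cpu_ms_per_query(3000, 2):.2f} ms")
```

With those inputs the hit ratio works out to about 90% and the average cost to roughly two thirds of a millisecond of CPU per query, which seems high for answers that mostly come from cache.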
You will probably get much better performance with Unbound. If you are using BIND only as a caching recursive server with nothing special in the configuration, switching to Unbound will be really easy.