Named/BIND is crashing every few days, usually I have few tools that take care of such crash and restart the service but lately they can't really restart it. What's odd is that when I try to manually restart I get this error:
named failed to start named dead but subsys locked
When running this command:
ps aux | grep named
There is some output indicating that the service is still "running" and deleting /var/lock/subsys/named or the pid file won't help. The only thing that help is kill -9 (and I hate running that command)
Looking at my /var/log/messages don't give me much clues about what happened there. What I'd like is to understand what happened there, it bugs me because having my domain name server down is critical.
Could you share if you had similar problems? or how I could investigate further such problems?
I am running centos 5.3 - 64bit - kernel: 2.6.18-128.2.1.el5.028stab064.4 / BIND 9.3.4-P1
Thanks,
The version of BIND that you are running seems to be susceptible the remote denial of service recently advertised in CVE-2009-0696. Exploits are available in the wild and your frequent crashes may relate to this. I'd advise you to upgrade as soon as you can and then see if the problem persists.
Centos is my preferred distro but for important components such as bind I wouldn't even consider using the repositories, as they can be quite out of date. I suggest you do likewise and compile from source. Long term it's more maintainable and predictable that way. BTW, what does rndc status show?
If you want to confirm that your BIND instance isn't vulnerable to the recent DoS vulnerability, try running this exploit code against your server. Make sure you're logged in to the machine so you can restart BIND if it crashes :-)
I know this isn't strictly in line with the question, but on a security note you should identify whether you need the features of bind or whether you could get away with one of the alternatives. Djbdns is significantly more secure, and powerdns and maradns aren't far behind. Bind is an old, massive, and complicated program, that never the less excels at what it does but is also quite labyrinthine. Complexity is the enemy of security, depending on your needs and the size of your managed domain you may nbe better off with one of the other available options.