I have a small environment running Windows 2008 R2 where the DHCP service on the domain controller fails every two weeks.
The most visible error is Event ID 1059, and the Event Viewer message is:
"The DHCP service failed to see a directory server for authorization."
The setup features two domain controllers and the usual services and roles (file, print, Exchange). Restarting the service fails for a variety of reasons. I've had the following messages at different times:
- "Not enough storage is available to complete this operation".
- "Unable to determine the DHCP Server version for the Server 192.168.x.x"
- "The DHCP service has detected that it is running on a DC and has no credentials configured for use with Dynamic DNS registrations initiated by the DHCP service."
A reboot of the domain controller resolves the issue for ~2 weeks. The systems are virtualized and there are no network connectivity issues.
Any ideas as to what's happening here?
Edit - The solution seems to be to fix a misbehaving domain controller.
This part really jumps out at me: "Not enough storage is available to complete this operation".
I'm assuming that you actually do have disk space available on the server. If so, this points to the possibility of data or disk corruption. Have you run chkdsk? Does the account the DHCP service runs under have permissions on the log directory and on the directory where the DHCP database is stored?
Once those possibilities are ruled out, the next step is to check that there are no invalid entries in DNS for your domain, especially if a DC was removed from the domain at some point. First, run nslookup on the FQDN of your domain and make sure no invalid IP addresses are returned (I've sometimes seen a second, unused NIC on a DC with a 169.254.x.x address register itself in DNS as a valid NS/DC). Next, on the DNS server, check the SRV entries for LDAP and Kerberos and make sure they are all valid.
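A minimal sketch of the first check in Python, using only the standard library. It resolves a name to its IPv4 addresses and flags any link-local (169.254.x.x) or loopback results. The function names and the domain `example.local` are placeholders made up for illustration, and the sketch does not look at SRV records.

```python
import ipaddress
import socket


def suspect_addresses(addresses):
    """Return the addresses that should not be registered for a DC."""
    suspects = []
    for text in addresses:
        ip = ipaddress.ip_address(text)
        # 169.254.0.0/16 (APIPA) and loopback addresses usually mean a
        # stray or unconfigured NIC registered itself in DNS.
        if ip.is_link_local or ip.is_loopback:
            suspects.append(text)
    return suspects


def resolve_ipv4(name):
    """Resolve a name to its unique IPv4 addresses, like a basic nslookup."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET)
    return sorted({info[4][0] for info in infos})
```

Calling `suspect_addresses(resolve_ipv4("example.local"))` with your own domain FQDN in place of the placeholder lists any addresses worth investigating. An empty list only means nothing obviously wrong was returned, so stale records for removed DCs still need a manual look.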
Since I've had the privilege of actually working in this specific environment, I can say with certainty that the DC that is hosting DHCP fails replication and goes unresponsive to requests for various Directory Services functions (like authorizing DHCP servers) every few weeks. This DHCP issue is a symptom of the larger replication problem.
Since the server that DHCP is on is a DC, it only ever looks to itself for authorization. When Directory Services stops functioning on it, so does DHCP.
The issue seems to be that you are not an enterprise administrator in your forest. Do you have any other DHCP servers in your domain? If you do, try to de-authorize one of them. If you can't, then you don't have access, which supports the point about not being an enterprise administrator. Please also take a look at this article:
http://technet.microsoft.com/en-us/library/cc775255(v=ws.10).aspx
Maybe there is a rogue DHCP server (check with nmap)? Also, check http://support.microsoft.com/kb/938456, which describes conflicting records in AD.
Maybe you have run into a bug: http://support.microsoft.com/kb/2632816/en-gb
Just a few articles to look at...some may not seem to apply, but look carefully and consider the causes in each article:
http://support.microsoft.com/kb/935744
http://blogs.technet.com/b/abizerh/archive/2009/07/12/troubleshooting-the-error-not-enough-storage-is-available-to-complete-this-operation.aspx
http://forums.whirlpool.net.au/archive/1533833
I would check for AD replication issues.
http://www.microsoft.com/en-us/download/details.aspx?id=30005
A couple of questions for you... Can you try running DCDiag on both DCs and posting any errors? Are there any other errors in the event logs? If there are no errors now, try running it again on both DCs once the service has failed, before rebooting the server.
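As a rough sketch, the DCDiag runs could be scripted in Python so the output is captured at the moment of failure. It assumes `dcdiag` is on the PATH of the machine running it, and the server names `DC1` and `DC2` are hypothetical. The `/s:` switch selects the target server and `/q` limits the output to errors. The function names are made up for illustration.

```python
import subprocess


def dcdiag_command(server, quiet=True):
    """Build a dcdiag command line targeting one domain controller."""
    command = ["dcdiag", f"/s:{server}"]
    if quiet:
        command.append("/q")  # /q prints only error messages
    return command


def run_dcdiag(server, quiet=True):
    """Run dcdiag against one DC and return its combined output as text."""
    result = subprocess.run(
        dcdiag_command(server, quiet),
        capture_output=True,
        text=True,
        check=False,  # dcdiag failures are what we want to read, not raise on
    )
    return result.stdout + result.stderr
```

Running `run_dcdiag("DC1")` and `run_dcdiag("DC2")` once while everything is healthy and again after DHCP has failed gives two sets of output to compare.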
Have you tried simply reinstalling DHCP on the trouble server?
So, two virtual Domain Controllers... are both DHCP servers? It sounds like only one is. In which case I'd be tempted to run for a few weeks with the DHCP server only using the other domain controller as DNS. And then for a few weeks with the other domain controller shut down.
You can always revert the change if it impacts on users but it might help narrow down which box (if it is only one) is causing the issue.
I'd also be tempted to add a third DC and then decommission the second one to rule out it being some weird installation corruption of the type Windows loves to flump into.
Have you tried restarting services on the DC rather than rebooting it?
Do the DCs host other services (file, Exchange, etc.)? Since you've got a virtualised environment, do you have the headroom to move those services onto their own servers for a few weeks to rule out confusion from those roles clashing?
Additionally, since it hasn't been commented on: with respect to the "Not enough storage is available to complete this operation" error, if the server's disks are full then all of its DC functions are going to start to fail. Are the disks full?
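A quick way to answer that question is a small Python sketch using `shutil.disk_usage` from the standard library. The function names and the choice of `C:\` as the path are assumptions for illustration; point it at whichever volumes hold the AD database, the logs, and the DHCP database.

```python
import shutil


def percent_free(total_bytes, free_bytes):
    """Return free space as a percentage of the total volume size."""
    if total_bytes == 0:
        return 0.0  # avoid dividing by zero on an empty or virtual volume
    return 100.0 * free_bytes / total_bytes


def free_space_report(path):
    """Return the percentage of free space on the volume containing path."""
    usage = shutil.disk_usage(path)
    return percent_free(usage.total, usage.free)
```

For example, `free_space_report("C:\\")` on the domain controller returns the percentage of free space on the system volume. A value close to zero would be consistent with the disks being full.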
To resolve the issue, kindly remove the server bindings (remember that the DHCP server must have a static IP address to do this).
Steps: