While doing some unrelated troubleshooting (at least I think so, shared-printer issues) I came across a set of Event Log entries that have me concerned.
Machine Name: labcomputer82
Source: Security-Kerberos
Event ID: 4
Event Description:
The Kerberos client received a KRB_AP_ERR_MODIFIED error from the server labcomputer143$. The target name used was RPCSS/imagemaster4.ad.domain.edu. This indicates that the target server failed to decrypt the ticket provided by the client. This can occur when the target server principal name (SPN) is registered on an account other than the account the target service is using. Please ensure that the target SPN is registered on, and only registered on, the account used by the server. This error can also happen when the target service is using a different password for the target service account than what the Kerberos Key Distribution Center (KDC) has for the target service account. Please ensure that the service on the server and the KDC are both updated to use the current password. If the server name is not fully qualified, and the target domain (AD.DOMAIN.EDU) is different from the client domain (AD.DOMAIN.EDU), check if there are identically named server accounts in these two domains, or use the fully-qualified name to identify the server.
There are three machine names used in this message. It's generated on labcomputer82, it's attempting to talk to another lab workstation called labcomputer143, and the service in question (RPCSS) refers to the name of the machine that this machine was imaged from (and possibly also that of labcomputer143, I'm not sure). The thing that has me raising both eyebrows is that the machine named labcomputer82 is attempting to use an SPN of RPCSS/imagemaster4.ad.domain.edu.
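For anyone wanting to spot this pattern across many workstations, here is a minimal sketch that pulls the responding server and the target name out of the event description and flags when they disagree. It assumes the description text has already been exported as a string and that it follows the wording quoted above; the function name `parse_event` and the returned dictionary keys are hypothetical, chosen only for illustration.

```python
import re

# Matches the two sentences of the Event ID 4 description that carry the names.
# The pattern is based only on the wording quoted in this post.
EVENT_RE = re.compile(
    r"KRB_AP_ERR_MODIFIED error from the server (?P<server>\S+?)\.\s+"
    r"The target name used was (?P<target>\S+?)\.\s"
)


def parse_event(description):
    """Extract the responding server and target SPN from an event description.

    Returns None when the text does not look like this event.
    """
    match = EVENT_RE.search(description)
    if match is None:
        return None
    # Computer accounts are reported with a trailing '$'; drop it for comparison.
    server = match.group("server").rstrip("$").lower()
    # An SPN has the form service/host; keep the short host name only.
    service, _, host = match.group("target").partition("/")
    target_host = host.split(".")[0].lower()
    return {
        "responding_server": server,
        "service": service,
        "target_host": target_host,
        "mismatch": server != target_host,
    }
```

A `mismatch` of `True` means the machine that answered is not the machine named in the SPN, which is the situation described in this post.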
The SPN attribute on the computer object in AD looks just fine. It has all the names it should have.
These machines are imaged using Ghost and (at least in this specific case) sysprep was not used. Of the over 3,000 computer objects in our AD domain, somewhere around 1,700 are computer-lab seats that are frequently imaged, and as of September the majority were imaged using the Ghost/Profile-Copy method instead of the Ghost/sysprep method Microsoft recommends. If this error report points to something major that Windows is quietly working around (perhaps Kerberos is broken for these machines and they're falling back to NTLMv2), I'd like to know so I can add pressure in my drive for sysprep adoption.
You should absolutely run Sysprep to ensure each machine has a unique ID. Sysprep does other things too, and skipping it can cause various Windows components to fail unpredictably.
Just by looking at the message (minus your sysprep details) this is what we can infer. A process running on labcomputer82 was trying to communicate with imagemaster4, but the name it resolved led it to a machine identifying itself as labcomputer143. It is likely that you have DNS issues here. Comparing the nslookup output for imagemaster4 and labcomputer143 would show whether both names resolve to the same IP address.
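The nslookup comparison can also be scripted. The following is a rough sketch, assuming the check is run from a machine that uses the same DNS servers as the affected workstations; `resolve_ips` and `shared_addresses` are hypothetical helper names, and the `resolver` parameter exists only so the lookup can be swapped out for testing.

```python
import socket


def resolve_ips(hostname, resolver=socket.getaddrinfo):
    """Return the set of IP addresses the resolver reports for hostname."""
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        # The name does not resolve at all.
        return set()
    # Each result is (family, type, proto, canonname, sockaddr);
    # the address string is the first element of sockaddr.
    return {entry[4][0] for entry in results}


def shared_addresses(name_a, name_b, resolver=socket.getaddrinfo):
    """Return the addresses that both names resolve to (empty set if none)."""
    return resolve_ips(name_a, resolver) & resolve_ips(name_b, resolver)
```

Calling `shared_addresses("imagemaster4.ad.domain.edu", "labcomputer143.ad.domain.edu")` and getting a non-empty set back would suggest both names point at the same host.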
The RPCSS/imagemaster4.ad.domain.edu SPN that labcomputer82 requested was presented to what labcomputer82 thought was imagemaster4. However, it turned out to be a machine/service identifying itself as labcomputer143. The computer account passwords for labcomputer143 and imagemaster4 will obviously differ, so a ticket encrypted with imagemaster4's key can't be decrypted by labcomputer143. Hence the error.
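To make the key mismatch concrete, here is a toy illustration. It is not Kerberos and does not use any Kerberos encryption type: it only derives a key from a password and protects a payload with a keyed integrity tag, which is enough to show why a recipient holding a different password rejects the data. All names (`derive_key`, `seal`, `open_sealed`) and the passwords in the usage note are made up for the example.

```python
import hashlib
import hmac


def derive_key(password, salt):
    """Derive a 32-byte key from a password (stand-in for an account's key)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt.encode(), 1000)


def seal(key, payload):
    """Append a keyed integrity tag to the payload (toy stand-in for a ticket)."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag


def open_sealed(key, blob):
    """Verify the tag with the recipient's key and return the payload.

    Raises ValueError when the key does not match the one used to seal.
    """
    payload, tag = blob[:-32], blob[-32:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed: wrong key for this ticket")
    return payload
```

Sealing with `derive_key("password-A", "imagemaster4")` and opening with `derive_key("password-B", "labcomputer143")` raises `ValueError`, which is loosely analogous to the server reporting KRB_AP_ERR_MODIFIED.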
I recommend two things. 1. Rule out any suspicions around Sysprep usage: make sure each machine has a unique ID and associates itself in AD with a unique computer account. 2. Read the Kerberos troubleshooting blog entries on http://blogs.technet.com/askds to get an understanding of how to troubleshoot these issues and of their common causes.
The details are largely as maweeras suspected. The lack of Sysprep in the generation of the drive image left the RPCSS service convinced its SPN was RPCSS/imagemaster4.ad.domain.edu. When it attempted to get its TGT it was contacting imagemaster4.ad.domain.edu, which at the time was occupied by a computer named labcomputer143. That failed, which was causing fall-back to NTLM for these stations.
As a workaround, I added a preemptable DNS entry in the domain DNS pointing imagemaster4.ad.domain.edu (not currently in service) at 127.0.0.1. Judging by the Event Logs of several affected workstations, they're now functioning correctly for Kerberos. This is only a workaround, and it will stop working when imagemaster4 joins the domain and registers its own DNS entries. But at least I have a method now.
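Since the workaround silently breaks once the real machine registers itself, a periodic check may help. This is a small sketch, assuming it runs on a client that uses the domain DNS; `resolves_only_to_loopback` is a hypothetical name, and the `resolver` parameter is there only so the lookup can be replaced in tests.

```python
import ipaddress
import socket


def resolves_only_to_loopback(hostname, resolver=socket.getaddrinfo):
    """Return True when every address the name resolves to is a loopback address."""
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        # No record at all means the workaround entry is gone.
        return False
    # Strip any IPv6 scope suffix (e.g. "%eth0") before parsing the address.
    addresses = {entry[4][0].split("%")[0] for entry in results}
    return bool(addresses) and all(
        ipaddress.ip_address(address).is_loopback for address in addresses
    )
```

If `resolves_only_to_loopback("imagemaster4.ad.domain.edu")` starts returning `False`, the name has either been removed or re-registered by a real machine, and the Kerberos errors may return.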
Sysprep for the future.