Domain Controller in a VM is unable to authenticate users

Findings in the Windows event logs in this kind of scenario:

1.       The NetLogon service is paused.
2.       A rollback operation on the NTDS database cannot be completed.
3.       An attempt to write to edb.log fails.
Directory Service Logs:

ESX Logs:
The events below were observed in the VM logs at the same point in time, just before the NetLogon service was paused.

Why this behavior?

Based on the events, the Active Directory database is encountering problems with read and write operations on the NTDS database.
The sequence of events observed indicates possible AD database corruption. After multiple failed attempts to update the directory database, users can no longer log on to AD, and as a proactive measure AD pauses the NetLogon service. This leaves users and machines unable to authenticate and log on to the server or domain.
Suspected Causes:
Possible causes include:
  • Database corruption
  • The snapshot process causing a performance hit and freezing the system, especially disk I/O
  • Antivirus software scanning the database and its related files
This issue can also occur after an unsuccessful P2V conversion of the DC, or when the DC has been restored from a snapshot.
Suggestions and Recommendations:

  • Perform an offline defragmentation of the AD database.
  • Check with the application team whether any specific tasks are running that interfere with the snapshot backup process and leave the system non-responsive.
  • Confirm the antivirus scan timings, and confirm that the scan excludes the NTDS folder and other AD-related folders.
I recommend creating another DC (VM) and moving all roles to the new DC, then demoting the old DC and, if required, promoting it as a DC again. This avoids having to go through offline defragmentation, repair, and restore of the database.
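
The antivirus check recommended above can be sketched as a small script that compares the folders that must be excluded against the exclusion list exported from the antivirus product. This is only an illustration: the function name `uncovered` is hypothetical, and the folder paths in the usage note are assumed default locations, so substitute the actual database, log, and SYSVOL paths of the DC.

```python
from pathlib import PureWindowsPath


def uncovered(required, exclusions):
    """Return the required folders that no exclusion entry covers.

    A folder counts as covered when an exclusion entry is the folder itself
    or one of its parent folders. PureWindowsPath compares paths
    case-insensitively, which matches Windows path semantics.
    """
    excluded = [PureWindowsPath(e) for e in exclusions]
    missing = []
    for folder in required:
        path = PureWindowsPath(folder)
        if not any(e == path or e in path.parents for e in excluded):
            missing.append(folder)
    return missing
```

For example, `uncovered([r"C:\Windows\NTDS", r"C:\Windows\SYSVOL"], [r"c:\windows\ntds"])` reports that the SYSVOL folder is not yet excluded.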

ADPREP /DOMAINPREP failures

I am sharing my investigation into this problem and the solution I found, hoping it will be helpful to others in similar scenarios.
Note that the fix may not apply if the cause of your failure differs in any way from the one I faced.
I recently upgraded the AD in one of my customers' environments from Windows Server 2003 to Windows Server 2008 R2.
During this activity, the command ADPREP /DOMAINPREP returned the following error.
Error Code:
Message: 000020B5: AtrErr: DSID-03152395, #1: 0: 000020B5:
DSID-03152395, problem 1005 (CONSTRAINT_ATT_TYPE), data 0, Att 9054f (otherWellKnownObjects).

The error returned in the log file (0x13, DSID-03152395) has to be converted to a readable format with the tool DSID.exe, which is available only inside Microsoft and not to the general public. After decoding the code with the help of Microsoft, I arrived at the findings below, based on the status message.
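
Although the DSID itself cannot be decoded without the Microsoft tool, the visible fields of the message quoted above can be pulled apart for logging or searching. The following is a minimal sketch, assuming the message keeps the layout shown in this post; the function name `parse_adprep_error` and the regular expression are my own illustration, not part of adprep.

```python
import re

# Matches the attribute-error text in the layout quoted in this post.
ATTR_ERROR = re.compile(
    r"(?P<code>[0-9A-Fa-f]{8}): AtrErr: (?P<dsid>DSID-[0-9A-Fa-f]{8}).*?"
    r"problem (?P<problem>\d+) \((?P<problem_name>\w+)\).*?"
    r"Att (?P<att_id>[0-9A-Fa-f]+) \((?P<att_name>\w+)\)",
    re.DOTALL,  # the message wraps over two lines in the log
)


def parse_adprep_error(text):
    """Split the error text into named fields, or return None if no match."""
    match = ATTR_ERROR.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    fields["code_decimal"] = int(fields["code"], 16)  # hex error code as decimal
    fields["problem"] = int(fields["problem"])
    return fields


SAMPLE = (
    "Message: 000020B5: AtrErr: DSID-03152395, #1: 0: 000020B5:\n"
    "DSID-03152395, problem 1005 (CONSTRAINT_ATT_TYPE), data 0, "
    "Att 9054f (otherWellKnownObjects)."
)
```

Calling `parse_adprep_error(SAMPLE)` yields the DSID, the problem number and name, and the attribute name (`otherWellKnownObjects`), which is the part that points at the well-known container creation discussed next.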
What caused the Failure?
Running adprep /domainprep performs various domain-wide operations that change the domain configuration to prepare it for Windows Server 2008 R2. One of these operations creates the Managed Service Accounts container in AD.
Windows Server 2008 R2 introduces a new type of account called a managed service account, together with a Managed Service Accounts container that holds such accounts. In a way, this container functions like the Builtin container in default domain configurations.
The error cited in Adprep.log indicated that a domain configuration attribute could not be populated, because an object named Managed Service Accounts was already present in the current AD.
By chance or by mistake, we had an OU called “managed service account” in the Windows Server 2003 environment. Hence, while preparing the AD, the system command ran into a conflict when it tried to create its own Managed Service Accounts container.

What fixed the failure?

Delete the Managed Service Accounts OU from the Windows Server 2003 AD and then run adprep /domainprep again.
Don’t rename the OU, because the chances of solving the issue by renaming it are very slim.
It is better to delete it.
Actions to be performed:
          Take a system state backup on all domain controllers.
          Move the contents of the Managed service account OU to another OU.
          When all users and sub-OUs have been moved, delete the OU.

Then run adprep /domainprep; it should complete without errors.
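
To catch this kind of clash before the upgrade, the names of existing objects can be compared against the container name that adprep creates. The sketch below assumes you have already exported a list of distinguished names from the domain by whatever means you prefer; the helper names `first_rdn` and `find_conflicts` are hypothetical, the parsing is deliberately simple (it does not handle escaped commas inside names), and it flags a matching name anywhere in the list rather than only at the domain root.

```python
# Name of the container that adprep /domainprep creates, per the text above.
RESERVED_NAME = "Managed Service Accounts"


def first_rdn(dn):
    """Return (attribute, value) of the leftmost RDN of a distinguished name.

    Simplification: escaped commas inside a name are not handled.
    """
    rdn = dn.split(",", 1)[0]
    attribute, _, value = rdn.partition("=")
    return attribute.strip().upper(), value.strip()


def find_conflicts(dns, reserved=RESERVED_NAME):
    """Return the DNs whose leftmost name equals the reserved name, ignoring case."""
    reserved_key = reserved.casefold()
    return [dn for dn in dns if first_rdn(dn)[1].casefold() == reserved_key]
```

For example, given `["OU=Sales,DC=example,DC=com", "OU=managed service accounts,DC=example,DC=com"]` (made-up names), `find_conflicts` returns only the second entry, which is the OU to empty and delete before running adprep /domainprep.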