I had Exchange SCR all set up and working until my cluster failed over to the backup node. Now replication won't run, and when I try to query the replication status using either the UI or the CLI I get the following.
Get-StorageGroupCopyStatus : Microsoft Exchange Replication service RPC failed
: Microsoft.Exchange.Rpc.RpcException: Error e0434f4d from cli_GetCopyStatusEx
at Microsoft.Exchange.Rpc.Cluster.ReplayRpcClient.GetCopyStatusEx(Guid[] sgG
uids, RpcStorageGroupCopyStatus[]& sgStatuses)
at Microsoft.Exchange.Cluster.Replay.ReplayRpcClientWrapper.InternalGetCopyS
tatus(String serverName, Guid[] sgGuids, RpcStorageGroupCopyStatus[]& sgStatuse
s, Int32 serverVersion)
At line:1 char:26
+ get-storagegroupcopystatus <<<<
Now if I add the -StandByMachine switch, it returns a recordset. I can reseed the replication to the node which was active, but beyond that the replication never does anything, and I can't query for the status without the switch.
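For reference, the query that does return data would look roughly like the sketch below. The storage group identity is taken from the event log text in this post; `SCRTARGET01` is a placeholder for the SCR target server, not a real name from this environment.

```powershell
# 'SCRTARGET01' is a hypothetical placeholder for the SCR target (standby) server.
# The storage group identity comes from the event log entries quoted in this post.
Get-StorageGroupCopyStatus -Identity 'ASCOEX101V1\Mailbox Group 1' -StandbyMachine 'SCRTARGET01' |
    Format-List Identity, SummaryCopyStatus, CopyQueueLength, ReplayQueueLength, LastInspectedLogTime
```

A copy queue length that keeps growing while the replay queue stays flat would suggest the target can no longer pull logs from the source share.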
So far I've done the following.
- Changed the rights on the GUID network shares on both nodes of the cluster by adding the Exchange Servers group with Full Control.
- Changed the rights on the folders which are the base of those network shares, giving the Exchange Servers group Full Control.
- Checked that the RPC services are up and running on both nodes.
- Checked that the Remote Registry services are up and running on both nodes.
- Checked that the Exchange Servers domain group contains all Exchange servers.
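The service checks above can be repeated from one prompt with something like the following sketch. The node names are the ones from this post; including `MSExchangeRepl` (the Microsoft Exchange Replication Service) is my own addition to the list, and the account running it is assumed to have WMI access to both nodes.

```powershell
# Node names are from this post. RpcSs and RemoteRegistry are the standard
# Windows service names; MSExchangeRepl is added here as an assumption,
# since it is the service reporting the errors.
$nodes    = 'ascoex101a', 'ascoex101b'
$services = 'RpcSs', 'RemoteRegistry', 'MSExchangeRepl'

foreach ($node in $nodes) {
    foreach ($name in $services) {
        # Query each service over WMI and show its state next to the node name.
        Get-WmiObject -Class Win32_Service -ComputerName $node -Filter "Name='$name'" |
            Select-Object @{Name='Node'; Expression={$node}}, Name, State, StartMode
    }
}
```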
I've looked in the event logs for the two servers which make up the cluster. I'm seeing errors about the network shares, which led me to change the rights. About every 15 minutes, in the logs for the machine that is currently passive, I'm seeing 5 errors.
The directory '\\ascoex101b\be321a6a-7b2b-427a-9130-9f0ac04438b2$' required by the Microsoft Exchange Replication Service for ASCOEX101V1\Mailbox Group 1 could not be accessed. Check the network connectivity and name resolution. Error: 53.
The directory \\ascoex101b\be321a6a-7b2b-427a-9130-9f0ac04438b2$ required by the Microsoft Exchange Replication Service for ASCOEX101V1\Mailbox Group 1 does not exist. Check the file system and its permissions.
There was a problem with 'ascoex101b', which is an alternate name for 'ASCOEX101B'. The list of aliases is now 'ascoex101b', and the alias 'was' removed from the list. The specific problem is ''.
The directory '\\ascoex101b\2e491d6f-3691-49e7-b1b8-3e563b495c6b$' required by the Microsoft Exchange Replication Service for ASCOEX101V1\Public Folder Group could not be accessed. Check the network connectivity and name resolution. Error: 53.
The directory \\ascoex101b\2e491d6f-3691-49e7-b1b8-3e563b495c6b$ required by the Microsoft Exchange Replication Service for ASCOEX101V1\Public Folder Group does not exist. Check the file system and its permissions.
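Worth noting that Win32 error 53 is ERROR_BAD_NETPATH (network path not found), not error 5 (ERROR_ACCESS_DENIED), so the share may be unreachable rather than locked down. A rough sketch for checking that from the passive node follows; the server and share names are copied from the events above, and `Get-Win32Message` is just a helper name I made up for illustration.

```powershell
# Hypothetical helper: turn a Win32 error number into its system message.
# 53 is ERROR_BAD_NETPATH; 5 would be ERROR_ACCESS_DENIED.
function Get-Win32Message([int]$Code) {
    (New-Object System.ComponentModel.Win32Exception -ArgumentList $Code).Message
}
Get-Win32Message 53

# Run from the passive node (ascoex101a). Names are taken from the event text.
$server = 'ascoex101b'
$shares = 'be321a6a-7b2b-427a-9130-9f0ac04438b2$',
          '2e491d6f-3691-49e7-b1b8-3e563b495c6b$'

# Does the active node's name resolve at all?
[System.Net.Dns]::GetHostAddresses($server) | ForEach-Object { "resolved: $_" }

# Is each storage group share visible over the network?
foreach ($share in $shares) {
    $path = "\\$server\$share"
    "{0} reachable: {1}" -f $path, (Test-Path $path)
}
```

If the name resolves but `Test-Path` returns False for both shares, the problem is more likely on the path to the shares than in the share or folder rights.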
On the server that is active I'm seeing this error at about the same time.
Error updating public folder with free/busy information on virtual machine ASCOEX101V1. The error number is 0x8004010f.
The machine which is currently active is ascoex101b, the machine which is currently passive is ascoex101a, and the virtual name is ascoex101v1.
I'm assuming that the problem is related to the network shares, but according to the share rights and folder rights everything should be working correctly. Any idea what I should check next?
It would appear that in my case the answer was simple: I rebooted the now-passive node and the replication started working again.