During my recent deployment of Exchange 2013 I came across an interesting error via Managed Availability. We had some networking issues to work out during the deployment, and as part of my troubleshooting I installed Exchange 2013 on a server with a different class of hardware to rule out local NIC or driver issues. Sadly, I was in a hurry and the server wasn’t a clean OS build. When it came time to remove the server, the uninstall went sideways, and all the server would do was log in to a black desktop with a command window. And no, this was not a Server Core install.
As an Exchange administrator, this is not what we like to see. We’ve all had to manually remove an Exchange server from AD, and while it is not an overly time-consuming process, it is not best practice. I had no other option, though, so off I went.
I removed the server from the Configuration container of Active Directory, along with the databases that had been created on it. I then removed the computer object before rebuilding the server. The rebuild went smoothly, but then I started getting alerts from SCOM telling me that it could not get a heartbeat from the machine. In my haste I had forgotten to have the AD team remove the server from SCOM, which I quickly remedied.
A day later, I was gifted with the following alert via email from one of my existing Exchange 2013 servers.
Source: ServerName – RemoteMonitoring
Path: FQDN; FQDN
Last modified by: System
Last modified time: 11/20/2013 3:57:20 PM
Alert description: Machine ‘(removed server FQDN)’ has failed to heartbeat since ’11/14/2013 10:59:03 PM’, as observed by machine ‘Current 2013 FQDN’. Restarting the Exchange Health Manager Service did not fix the problem.
If you run the Exchange PowerShell command below against the alerting server, you will see the health set reported as ‘Unhealthy’:
Get-ServerHealth -Identity 'Server Name' -HealthSet 'RemoteMonitoring'
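To see which servers are raising the alert, the same check can be run against every Exchange 2013 server in one pass. This is only a rough sketch: it assumes you are in the Exchange Management Shell, that filtering on a version string starting with "Version 15." is good enough to skip older servers in your organization, and that the output exposes the usual Server, Name, TargetResource and AlertValue properties.

```powershell
# Sketch: list every RemoteMonitoring entry that is not reported as Healthy.
# Assumption: the 'Version 15.*' match on AdminDisplayVersion is enough to skip
# pre-2013 servers, which do not run Managed Availability.
Get-ExchangeServer |
    Where-Object { "$($_.AdminDisplayVersion)" -like 'Version 15.*' } |
    ForEach-Object { Get-ServerHealth -Identity $_.Name -HealthSet 'RemoteMonitoring' } |
    Where-Object { $_.AlertValue -ne 'Healthy' } |
    Select-Object Server, Name, TargetResource, AlertValue
```

Any server that shows up here is a candidate for the registry check described further down.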
This alert told me that Managed Availability still thought the removed Exchange server existed, despite the lack of AD objects. Since not every server was giving me this error, it had to be somewhat localized, so I turned to the Managed Availability settings in the registry of the alerting server.
Local Managed Availability settings such as Overrides, ServerComponentStates and BugCheck Dates can be found in this location.
The key I was looking for was another level down: HKLM\Software\Microsoft\ExchangeServer\V15\ActiveMonitoring\Subjects.
In this location I found a string value named after the old server, holding the last notification date. I backed up the registry (we all do this, right?) and deleted the string value. No more erroneous alerts in email or SCOM.
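If you would rather look before you delete, the values under the Subjects key can be listed from PowerShell on the alerting server. This is a minimal sketch; the Server and LastNotification column labels are just names I picked for the output, not anything Exchange defines.

```powershell
# Sketch: show each value under the Subjects key (run locally on the alerting server).
$subjects = 'HKLM:\Software\Microsoft\ExchangeServer\V15\ActiveMonitoring\Subjects'
$key = Get-Item -Path $subjects

foreach ($name in $key.GetValueNames()) {
    # Column names below are illustrative labels only.
    [pscustomobject]@{
        Server           = $name
        LastNotification = $key.GetValue($name)
    }
}
```

A server that no longer exists but still appears in this list is the entry to remove.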
If you prefer remote PowerShell, here is how you can do the above remotely.
Enter Remote PS Session
$credential = Get-Credential
Enter-PSSession -ComputerName <remote server> -Credential $credential
Reg export HKLM\Software\Microsoft\ExchangeServer\V15\ActiveMonitoring\Subjects export.reg
(This will export the key to export.reg in the Documents folder of your user profile on the remote server)
Remove-ItemProperty -Path "Registry::HKLM\Software\Microsoft\ExchangeServer\V15\ActiveMonitoring\Subjects" -Name "Server FQDN"
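The same steps can also be wrapped in a single Invoke-Command call instead of an interactive session. Treat this as a sketch rather than a tested procedure: the two placeholder strings must be replaced, the backup file name and its location under the remote TEMP folder are my own choices, and the check that the value exists is an extra guard that was not part of my original fix.

```powershell
# Sketch: back up the Subjects key and remove one stale value on a remote server.
$server    = '<remote server>'   # placeholder: the server raising the alert
$staleFqdn = 'Server FQDN'       # placeholder: the value name of the removed server

Invoke-Command -ComputerName $server -Credential (Get-Credential) -ArgumentList $staleFqdn -ScriptBlock {
    param($name)

    $regPath = 'HKLM\Software\Microsoft\ExchangeServer\V15\ActiveMonitoring\Subjects'
    $key     = Get-Item -Path "Registry::$regPath"

    if ($key.GetValueNames() -contains $name) {
        # Assumed backup location; /y overwrites an earlier backup without prompting.
        reg export $regPath "$env:TEMP\Subjects-backup.reg" /y
        Remove-ItemProperty -Path "Registry::$regPath" -Name $name
    }
    else {
        Write-Warning "No value named '$name' found under $regPath"
    }
}
```

The guard means a typo in the server name produces a warning instead of a misleading "success".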
I hope you never have to uninstall Exchange 2013 the hard way, but if you do, I hope this removes one obstacle to a stable system.