SCOM 2007 R2 : Configuration not loaded

Hi Guys,

On Tuesday, the SCOM 2007 R2 infrastructure of one of my customers showed the following symptoms:

  • Newly installed agents display as "Not Monitored" in the Operations Console.
  • Agents show as being in maintenance mode in the Operations Console, yet the workflows are not actually unloaded by the System Center Management service on the monitored computer.
  • Configuration changes, new rules or monitors, or overrides are not applied to some agents.
  • The Operations Manager event log on one or more agents will display event 21026, indicating that the current configuration is still valid, even though the configuration for these agents should have been updated.
  • The file "OpsMgrConnector.Config.xml" in the management group folder under "Health Service State\Connector Configuration Cache" does not update for long periods of time relative to the rest of the management group on one or more agents.
  • SCOM email alerts are not triggered.
  • The RMS event log is flooded with event 21042: Operations Manager has discarded 1 items in management group MGTGROUP, which came from $$ROOT$$. These items have been discarded because no valid route exists at this time. This can happen when new devices are added to the topology but the complete topology has not been distributed yet. The discarded items will be regenerated.
  • The RMS event log contains event 29106: The request to synchronize state for OpsMgr Health Service identified by "a340d2a9-ab1b-2e53-ca78-d303510c831d" failed due to the following exception "Microsoft.EnterpriseManagement.Common.DataItemDoesNotExistException: An instance was deleted before its properties could be read."
  • On the RMS, if you delete the Health Service State folder and restart the 3 SCOM services, the file "OpsMgrConnector.Config.xml" is not generated, and on the MS you get event 20070: The OpsMgr Connector connected to RMSFQDN, but the connection was closed immediately after authentication occurred. The most likely cause of this error is that the agent is not authorized to communicate with the server, or the server has not received configuration. Check the event log on the server for the presence of 20000 events, indicating that agents which are not approved are attempting to connect.

Concerning the resolution, the first step is to make sure you have a good backup of all your Operations Manager databases. /!\ The actions below are not supported; perform them at your own risk. /!\

Connect to the OperationsManager database and run the following query:

-- Find Windows Computer entities that have no matching row in MT_Computer
SELECT MT.BaseManagedEntityId, BME.BaseManagedEntityId
FROM BaseManagedEntity BME
LEFT OUTER JOIN MT_Computer MT
    ON MT.BaseManagedEntityId = BME.BaseManagedEntityId
WHERE BME.BaseManagedTypeId = (SELECT ManagedTypeId FROM ManagedType
                               WHERE TypeName = N'Microsoft.Windows.Computer')
  AND MT.BaseManagedEntityId IS NULL

This query looks for objects that have not been completely and correctly deleted from the database. Normally it should return nothing, but in our case it returned a BaseManagedEntityId. This GUID corresponds to an object that has been deleted but still has some references in the database.

We now have to identify which computer is behind this ID. To do so, run the following query:

select * from basemanagedentity where basemanagedentityid in ('IDFROMTHEPREVIOUSQUERY')

In our case, it returned an Exchange server. I did a quick check of the server itself: the SCOM agent is installed and there is nothing strange in the log. Back in the SCOM console, however, the computer was impossible to find in the Agent Managed view. The object itself was deleted, but not all of its references.
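
The two lookups above can also be done in one pass. The following is only a sketch, not the query that was actually run: it assumes that BaseManagedEntity exposes the DisplayName and FullName columns alongside IsDeleted, and it simply returns the name of each orphaned Windows Computer entity next to its ID.

```sql
-- Sketch: list orphaned Windows Computer entities together with their names.
-- Assumption: DisplayName and FullName are columns of BaseManagedEntity.
SELECT BME.BaseManagedEntityId,
       BME.DisplayName,
       BME.FullName,
       BME.IsDeleted
FROM BaseManagedEntity BME
LEFT OUTER JOIN MT_Computer MT
    ON MT.BaseManagedEntityId = BME.BaseManagedEntityId
WHERE BME.BaseManagedTypeId = (SELECT ManagedTypeId FROM ManagedType
                               WHERE TypeName = N'Microsoft.Windows.Computer')
  AND MT.BaseManagedEntityId IS NULL  -- no matching MT_Computer row
```

This is read-only, so it can be run safely before deciding whether any change is needed.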

As this server seems to be the cause of our trouble, we will remove its remaining references from the database by running the following query, which flags the entity as deleted:

BEGIN TRAN

-- ID of the orphaned entity returned by the first query
DECLARE @NetworkDeviceID AS uniqueidentifier
SET @NetworkDeviceID = 'IDFROMTHEPREVIOUSQUERY'

-- Flag the entity as deleted
UPDATE BaseManagedEntity
SET IsDeleted = 1
WHERE BaseManagedEntityId = @NetworkDeviceID

COMMIT TRAN
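
The update above commits unconditionally. A slightly more defensive variant could look like the sketch below. It is an assumption, not part of the fix that was applied: it commits only when exactly one row was changed and rolls back otherwise, then reads the row back. The variable name @EntityId is illustrative, and DisplayName is assumed to be a column of BaseManagedEntity.

```sql
-- Sketch only: same unsupported update, guarded by a row-count check.
BEGIN TRAN

DECLARE @EntityId uniqueidentifier   -- illustrative name
SET @EntityId = 'IDFROMTHEPREVIOUSQUERY'

UPDATE BaseManagedEntity
SET IsDeleted = 1
WHERE BaseManagedEntityId = @EntityId

-- @@ROWCOUNT must be read immediately after the UPDATE
IF @@ROWCOUNT = 1
    COMMIT TRAN
ELSE
    ROLLBACK TRAN

-- Read the row back to confirm the flag
SELECT BaseManagedEntityId, DisplayName, IsDeleted
FROM BaseManagedEntity
WHERE BaseManagedEntityId = @EntityId
```

If the ID is mistyped and no row matches, the transaction is rolled back and the final SELECT returns nothing.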

Now, go back to the RMS, stop the 3 SCOM services, delete the ‘Health Service State’ folder and restart the 3 SCOM services.

Normally, after a few seconds the OpsMgrConnector.Config.xml file will be created in the “%ProgramFiles%\System Center Operations Manager 2007\Health Service State\Connector Configuration Cache\MGTGROUP” folder and everything will start to work correctly.

As for the root cause itself, I don’t have any explanation. Why were this server and all of its references not successfully deleted the first time? How could the references of a single server cause so much trouble for the infrastructure?

I would like to thank my MVP friends Silvio Di Benedetto, Marnix Wolf, Bob Cornelissen, and also Mihai Buia from Microsoft Premier Support for their help in resolving this problem.

Cheers

Christopher

About Christopher Keyaert

Christopher Keyaert is a Consultant, focused on helping partners to leverage the System Center and Microsoft Azure cloud platform. He is also a Microsoft Most Valuable Professional (MVP) for Cloud and Data Center Management and a Microsoft Certified Trainer (MCT).