Friday, July 19, 2013

SCOM 2007 R2: Alert - Processing backlogged events taking a long time


At my workplace, several production servers in SCOM 2007 R2 were showing as critical with the alert "Processing backlogged events taking a long time".


The monitor transitions to the Warning state if Operations Manager is still processing backlogged events after 10 minutes, at which point event ID 25017 is logged. If the backlogged events are still being processed after another 10 minutes, event ID 25018 is logged and the monitor transitions to the Critical state. These event IDs appeared on some of the servers I checked.
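As a rough illustration of the escalation described above, here is a minimal Python sketch that maps the event IDs seen on an agent to the state the monitor would be expected to show. The function name `backlog_state` and the "Healthy" fallback label are my own for illustration; only the two event IDs and the Warning/Critical states come from the behaviour described here.

```python
from typing import Iterable

# Event IDs as described in this post.
WARNING_EVENT_ID = 25017   # still processing the backlog after 10 minutes
CRITICAL_EVENT_ID = 25018  # still processing after another 10 minutes


def backlog_state(event_ids: Iterable[int]) -> str:
    """Return the worst monitor state implied by the event IDs seen.

    Hypothetical helper: it only looks at which IDs are present and does
    not model the monitor returning to healthy once the backlog clears.
    """
    seen = set(event_ids)
    if CRITICAL_EVENT_ID in seen:
        return "Critical"
    if WARNING_EVENT_ID in seen:
        return "Warning"
    return "Healthy"
```

For example, feeding it the IDs exported from the Operations Manager event log on one server gives a quick view of how far the monitor escalated on that machine.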

This alert can indicate one of the following:
1. The computer where the agent is installed may be low on resources. Check memory and CPU usage on the computer.
2. The computer is logging several events per minute. Check the event log to see whether a particular application or source is logging these events.
3. The health service was stopped on the computer. When it is started again, it has to process all the events logged since the last one it processed.
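For the second cause, it can help to see which source is flooding the log. The sketch below assumes you have already exported the events into (timestamp, source) pairs by whatever means you prefer; the function name `busiest_sources` and the record layout are assumptions for illustration, not part of SCOM.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List, Tuple

Record = Tuple[datetime, str]  # (time the event was logged, event source)


def busiest_sources(records: Iterable[Record], top: int = 3) -> List[Tuple[Tuple[datetime, str], int]]:
    """Count events per (minute, source) and return the busiest combinations."""
    per_minute = Counter(
        (ts.replace(second=0, microsecond=0), source)  # truncate to the minute
        for ts, source in records
    )
    return per_minute.most_common(top)
```

A source that shows up with a high count in many consecutive minutes is a likely candidate for the backlog.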

The alert is available in the System Center Core Monitoring Management Pack as can be seen below:



It appears that the backlog had been processed, but the state had not returned to healthy on some servers.


The following resolution worked in our environment:

1. On the affected server, stop the System Center Management service.

2. Rename the folder C:\Program Files\System Center Operations Manager 2007\Health Service State.

3. Start the System Center Management service again.

4. Check that a new Health Service State folder has been created.

5. Manually close the alert if it is still open.

6. Check that the agent is healthy again; if it is not, use Reset Health from the SCOM console.
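Steps 1 to 3 could be scripted along the following lines. This is only a sketch under some assumptions: `HealthService` is assumed to be the short service name behind the "System Center Management" display name (check with `sc query` on your server first), the `.old-<timestamp>` suffix for the renamed folder is my own choice, and the script defaults to a dry run that only prints what it would do. It must be run from an elevated prompt on the affected server to do anything for real.

```python
import os
import subprocess
import time

# Assumption: short name of the "System Center Management" service.
SERVICE_NAME = "HealthService"
# Folder path as given in step 2 of this post.
STATE_DIR = r"C:\Program Files\System Center Operations Manager 2007\Health Service State"


def backup_path(state_dir: str, stamp: str) -> str:
    """Build the name the old state folder is renamed to (illustrative suffix)."""
    return f"{state_dir}.old-{stamp}"


def reset_health_service_state(state_dir=STATE_DIR, service=SERVICE_NAME, dry_run=True):
    """Stop the service, rename the state folder, start the service again.

    With dry_run=True (the default) nothing is changed; the planned steps
    are printed and returned so they can be reviewed first.
    """
    target = backup_path(state_dir, time.strftime("%Y%m%d-%H%M%S"))
    steps = [
        ("stop", ["net", "stop", service]),
        ("rename", [state_dir, target]),
        ("start", ["net", "start", service]),
    ]
    for action, args in steps:
        if dry_run:
            print(action, args)
        elif action == "rename":
            os.rename(args[0], args[1])
        else:
            # check=True aborts the sequence if the service command fails
            subprocess.run(args, check=True)
    return steps


if __name__ == "__main__":
    reset_health_service_state(dry_run=True)
```

Steps 4 to 6 (confirming the new folder, closing the alert and checking agent health) are still best done by hand in Explorer and the SCOM console.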
