At my workplace, several production servers in SCOM 2007 R2 were showing as critical with the alert "Processing Backlogged Events Taking a Long Time".
The monitor transitions to a Warning state if Ops Manager is still processing backlogged events after 10 minutes, at which point event ID 25017 is logged. If the backlogged events are still being processed after another 10 minutes, event ID 25018 is logged and the monitor transitions to a Critical state. These event IDs appeared on some of the servers I checked.
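The thresholds described above can be sketched as a small state function. This is only an illustration of the timing, not how the management pack implements the monitor; the names `backlog_state`, `WARNING_AFTER_MIN` and `CRITICAL_AFTER_MIN` are made up for this example, and treating the thresholds as inclusive is an assumption.

```python
from enum import Enum

# Thresholds taken from the description above; the names are hypothetical.
WARNING_AFTER_MIN = 10    # event ID 25017 is logged at this point
CRITICAL_AFTER_MIN = 20   # another 10 minutes later, event ID 25018 is logged


class State(Enum):
    HEALTHY = "Healthy"
    WARNING = "Warning"
    CRITICAL = "Critical"


def backlog_state(minutes_processing: float) -> State:
    """Map how long the backlog has been processing to a monitor state.

    Assumption: thresholds are inclusive (exactly 10 minutes counts as Warning).
    """
    if minutes_processing >= CRITICAL_AFTER_MIN:
        return State.CRITICAL
    if minutes_processing >= WARNING_AFTER_MIN:
        return State.WARNING
    return State.HEALTHY
```

For example, `backlog_state(12)` gives `State.WARNING` and `backlog_state(25)` gives `State.CRITICAL`.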
This alert indicates one of the following:
1. The computer where the agent is installed may be low on resources. Check the memory and CPU usage on the computer.
2. The computer is logging several events per minute. Check the event log to see whether an application or event source is logging these events.
3. If the health service was stopped on the computer, then when it starts again it has to process all the events logged since the last one it processed.
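For the second cause, a quick way to spot a noisy source is to count events per minute from an exported copy of the event log. The sketch below assumes you have already loaded the export into `(timestamp, source)` tuples; the function names and the `threshold` value are hypothetical and not part of SCOM.

```python
from collections import Counter
from datetime import datetime


def events_per_minute(events):
    """Count events per (minute, source).

    `events` is an iterable of (datetime, source_name) tuples, for example
    loaded from an exported event log (the loading step is not shown here).
    """
    counts = Counter()
    for timestamp, source in events:
        # Truncate the timestamp to the start of its minute.
        minute = timestamp.replace(second=0, microsecond=0)
        counts[(minute, source)] += 1
    return counts


def noisy_sources(events, threshold):
    """Return sources that logged at least `threshold` events in any one minute."""
    counts = events_per_minute(events)
    return sorted({source for (_, source), n in counts.items() if n >= threshold})
```

What counts as "too many" depends on the server, so pick the threshold by comparing against a healthy machine.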
The alert comes from the System Center Core Monitoring Management Pack.
It appears that the backlog has been processed but the state has not returned to healthy on some servers.
The resolution below worked in our environment:
1. On the affected server, stop the System Center Management service.
2. Rename the folder C:\Program Files\System Center Operations Manager 2007\Health Service State.
3. Start the System Center Management service again.
4. Check that a new Health Service State folder is created.
5. Manually close the alert if it is still open.
6. Check that the agent is healthy again; otherwise, use Reset Health from the SCOM console.
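If you have to repeat steps 1 to 4 on several servers, they could be scripted roughly as follows. This is a sketch, not something we ran in production: it assumes `net stop` and `net start` are called with the service display name used above, that the script runs in an elevated prompt on the affected server, and the `.old-<timestamp>` suffix for the renamed folder is just a naming choice for this example.

```python
import os
import subprocess
from datetime import datetime

SERVICE = "System Center Management"  # display name from the steps above
STATE_DIR = r"C:\Program Files\System Center Operations Manager 2007\Health Service State"


def build_plan(state_dir=STATE_DIR, stamp=None):
    """Return the stop / rename / start steps as data, so they can be reviewed first."""
    stamp = stamp or datetime.now().strftime("%Y%m%d-%H%M%S")
    backup_dir = f"{state_dir}.old-{stamp}"  # hypothetical naming for the renamed folder
    return [
        ("stop", ["net", "stop", SERVICE]),
        ("rename", [state_dir, backup_dir]),
        ("start", ["net", "start", SERVICE]),
    ]


def run_plan(plan):
    """Execute the plan; stops at the first step that fails."""
    for action, args in plan:
        if action == "rename":
            os.rename(args[0], args[1])
        else:
            subprocess.run(args, check=True)


def state_dir_recreated(state_dir=STATE_DIR):
    """Step 4: check that the agent created a fresh Health Service State folder."""
    return os.path.isdir(state_dir)
```

Calling `run_plan(build_plan())` on the affected server performs the three steps in order; the agent may take a moment to recreate the folder, so call `state_dir_recreated()` a little later. Closing the alert and resetting health still happen in the SCOM console.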