Quirky notification behaviour of three-state monitors

If you’re like me, you like the idea of a single monitor that can generate both warning and critical alerts. The Logical Disk % Free Space monitor is one example of a three-state monitor: it can alert on both warning and critical states (once you override it to do so). The catch is that only one alert notification is generated, when the monitor first leaves the healthy state (for either warning or critical). When the state later changes from warning to critical, the repeat count on the existing alert is incremented and a second notification is not sent. This means that you would only get a critical notification if the monitor went straight from healthy to critical.
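To make the quirk concrete, here is a minimal toy model of the behaviour described above. It is only a sketch in Python and does not use SCOM or any of its APIs; the names ToyMonitor, Alert and change_state are made up for illustration, and the way the existing alert's severity is updated is my reading of the description, not something taken from the product.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Alert:
    # Hypothetical stand-in for an alert; not a SCOM object.
    severity: str
    repeat_count: int = 0
    is_open: bool = True


@dataclass
class ToyMonitor:
    # Toy three-state monitor: "healthy", "warning" or "critical".
    state: str = "healthy"
    alert: Optional[Alert] = None
    notifications: List[str] = field(default_factory=list)

    def change_state(self, new_state: str) -> None:
        if new_state == "healthy":
            # Returning to healthy closes any open alert.
            if self.alert is not None:
                self.alert.is_open = False
                self.alert = None
        elif self.alert is None:
            # First unhealthy state: new alert, one notification.
            self.alert = Alert(severity=new_state)
            self.notifications.append(new_state)
        else:
            # Already unhealthy: the existing alert is reused, its
            # repeat count goes up, and no notification is sent.
            self.alert.severity = new_state
            self.alert.repeat_count += 1
        self.state = new_state


# Healthy -> warning -> critical: only the warning is notified.
escalating = ToyMonitor()
escalating.change_state("warning")
escalating.change_state("critical")

# Healthy -> critical: the critical notification is sent.
direct = ToyMonitor()
direct.change_state("critical")
```

In this model, escalating ends up with a single "warning" notification and an alert whose repeat count is 1, while direct gets the "critical" notification straight away.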

This quirk was resolved with UR11 – I highly recommend that you update your SCOM environment to take advantage of this and other fixes. If you can’t do that for whatever reason, read on. Before UR11, people had different ways of dealing with this behaviour – two of the most popular ideas were duplicating the disk space monitor so there were separate monitors for warning and critical thresholds, or only alerting on critical and using the SCOM console to look at the warning alerts. Both of these would have required changing the behaviour of the support teams, who were already setting both warning and critical threshold overrides on the disk space monitor. So, I came up with another solution.

I wrote a script to reset the health of disk space monitors that had changed from a warning to a critical state. The script retrieves the critical logical disk alerts that have a non-zero repeat count, since a repeat count indicates that the health state has changed more than once. The script then resets the disk health (which closes the existing alert). As the disk is still critically low on space, the state will change back to critical and a new alert will be generated that sends a notification. I had this script running as a scheduled task every 15 minutes. You can download the script from my GitHub.
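The selection and reset logic can be sketched roughly like this. To be clear, this is not the script from my GitHub and it does not talk to SCOM; it is a Python illustration of the idea, and the dictionary keys (monitor, severity, repeat_count, is_open) and function names are assumptions invented for the example.

```python
def alerts_needing_reset(alerts):
    """Pick open, critical logical disk alerts with a non-zero repeat count."""
    return [
        a for a in alerts
        if a["is_open"]
        and a["severity"] == "critical"
        and a["repeat_count"] > 0
        and "Logical Disk" in a["monitor"]
    ]


def reset_and_realert(alert):
    """Simulate a health reset: close the old alert and return the fresh
    alert raised when the disk goes critical again."""
    alert["is_open"] = False
    return {
        "monitor": alert["monitor"],
        "severity": "critical",
        "repeat_count": 0,   # a new alert, so a notification is sent
        "is_open": True,
    }


# Hypothetical sample data.
sample_alerts = [
    # Went warning -> critical: no critical notification was sent.
    {"monitor": "Logical Disk % Free Space", "severity": "critical",
     "repeat_count": 1, "is_open": True},
    # Went healthy -> critical: already notified, leave it alone.
    {"monitor": "Logical Disk % Free Space", "severity": "critical",
     "repeat_count": 0, "is_open": True},
    # Still only a warning.
    {"monitor": "Logical Disk % Free Space", "severity": "warning",
     "repeat_count": 0, "is_open": True},
]

to_reset = alerts_needing_reset(sample_alerts)
new_alerts = [reset_and_realert(a) for a in to_reset]
```

Only the first sample alert is selected. The repeat count filter is what stops the scheduled task from resetting the same disk over and over: once the replacement alert exists with a repeat count of zero, the next run skips it.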

Now that I’ve updated my SCOM environment I don’t need to use the script any more, but I thought I’d blog about it anyway in case it can help someone else.
