Connection Center performance expectations
Machine specifications
Performance testing used a management group with two management servers, with nothing else using the SQL Server at the time.
SCOM Management Servers:
Azure D4ds_v5 VMs
4 vCPUs (Xeon Platinum 8370C, 2.8 GHz)
16 GB RAM
SQL Server:
Azure D8ds_v4 VM
8 vCPUs (Xeon Platinum 8272CL, 2.6 GHz)
32 GB RAM
Throughput
In testing, we aimed to generate 20 alerts a minute over a 72-hour period. Each update changes an alert, which is then sent back out again, so with updates this equates to 40 changes a minute, or roughly 2,400 an hour.
We have also tested 300 alerts a minute over a 30-minute period. With updates, this equates to 600 changes a minute. The default configuration will not allow you to achieve this rate, because of the default caps described under "Behavior when exceeding throughput thresholds".
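The figures above follow from simple arithmetic. As a minimal sketch, assuming each alert receives exactly one update (which is what the 20-to-40 and 300-to-600 figures imply), the change rate can be worked out as follows. The function names are illustrative only and are not part of Connection Center.

```python
def changes_per_minute(alerts_per_minute: int, updates_per_alert: int = 1) -> int:
    """Changes sent per minute: each alert counts once, plus once per update.

    updates_per_alert=1 is an assumption inferred from the test figures.
    """
    return alerts_per_minute * (1 + updates_per_alert)


def changes_per_hour(alerts_per_minute: int, updates_per_alert: int = 1) -> int:
    """Hourly change volume for a steady per-minute alert rate."""
    return changes_per_minute(alerts_per_minute, updates_per_alert) * 60


if __name__ == "__main__":
    print(changes_per_minute(20))   # 40 changes a minute (72-hour test)
    print(changes_per_hour(20))     # 2400 changes an hour
    print(changes_per_minute(300))  # 600 changes a minute (30-minute test)
```

If your environment typically sees more than one update per alert, raise `updates_per_alert` to estimate the change rate you would need to plan for.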
Behavior when exceeding throughput thresholds
By default, Connection Center caps outbound alerts at 60 a minute (from v3.1.2.0) and alert updates at 250 a minute (from v4.0.0.0 for SCOM / v2.2.2 for ServiceNow). In both cases, any alerts over the cap are held back and sent on later runs. As long as the number of changes per minute eventually falls below the threshold, Connection Center will catch up.
During stress testing, we put both sides of the connection 6 hours behind and allowed them to catch up without issue. This shouldn’t be the norm; if you find yourself in this situation regularly, consider adjusting your capacities.
In the unlikely event that SCOM falls so far behind that alerts have started to be groomed out, an alert will fire in SCOM to notify you that the alert could not be pulled. These alerts can also fire for other reasons, so it’s usually a good idea to check whether you can pull the alert by its ID.
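To get a feel for how long catching up might take, the capped behavior can be approximated with a simple queue model. This is a rough sketch, not Connection Center's actual scheduling logic: it assumes a fixed per-minute cap, a constant arrival rate, and that anything over the cap simply waits for a later run. The function names are hypothetical; the default of 60 is the outbound alert cap quoted above.

```python
import math
from typing import Optional

DEFAULT_ALERT_CAP_PER_MIN = 60  # default outbound alert cap quoted in this section


def backlog_after_burst(burst_rate: int, burst_minutes: int,
                        cap: int = DEFAULT_ALERT_CAP_PER_MIN) -> int:
    """Alerts still queued after a burst, starting from an empty queue."""
    return max(0, burst_rate - cap) * burst_minutes


def minutes_to_catch_up(backlog: int, steady_rate: int,
                        cap: int = DEFAULT_ALERT_CAP_PER_MIN) -> Optional[int]:
    """Minutes to drain the backlog, or None if the rate never drops below the cap."""
    if backlog <= 0:
        return 0
    spare = cap - steady_rate  # capacity left over each minute for the backlog
    if spare <= 0:
        return None
    return math.ceil(backlog / spare)


if __name__ == "__main__":
    # Model only: a 30-minute burst of 300 alerts a minute against the default cap,
    # followed by a steady 20 alerts a minute.
    queued = backlog_after_burst(burst_rate=300, burst_minutes=30)
    print(queued)                           # 7200
    print(minutes_to_catch_up(queued, 20))  # 180
```

A result of `None` corresponds to the case where the incoming rate never falls below the cap, which is the situation in which adjusting your capacities is worth considering.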
Load
Load is difficult to predict and will vary on a case-by-case basis. However, when using normal notification channels, we typically see less than 300 MB of memory use and a couple of percentage points of CPU use every minute or so. In heavily nested monitoring environments (for example, where objects are members of complex nested group hierarchies), the requirements will be higher.
Inbound webhooks typically use around 150 MB of memory and a couple of percentage points of CPU when consistently fielding requests. Using large numbers of manual objects and/or leaving large numbers of alerts open will increase requirements.