Event Threshold Settings
Agent Event Threshold Settings
Overview
One of the main functions of the controller is to keep track of connected agents and report when an agent starts to experience performance problems, logs many errors in a short time, or unexpectedly goes offline. Agent events are held in the controller's configured database in the agent_events table. Within this table, you can find three types of event categories: agent, metric, and task. The agent event category is used for major connectivity events, such as loss of connectivity from an agent. The metric event category is used for events that report abnormal agent health statistics, such as abnormally high CPU usage. The task event category does not affect metrics, and is detailed in the Gateway Task Results section.
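As a rough illustration of the three event categories, the sketch below counts events per category. It uses an in-memory SQLite database as a stand-in for the controller's configured database. Only the table name agent_events and the category values agent, metric, and task come from the text above; every column name and every sample row is a hypothetical placeholder, so check the real table schema before adapting the query.

```python
import sqlite3

# Stand-in for the controller's configured database.
# Column names below are hypothetical; the real agent_events schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE agent_events ("
    " agent_name TEXT, event_category TEXT, event_level TEXT, message TEXT)"
)

# Made-up sample rows, one or more per category, for illustration only.
conn.executemany(
    "INSERT INTO agent_events VALUES (?, ?, ?, ?)",
    [
        ("agent-a", "agent", "error", "lost contact"),
        ("agent-a", "metric", "warning", "high CPU usage"),
        ("agent-b", "metric", "error", "high memory usage"),
        ("agent-b", "task", "error", "scheduled task failed"),
    ],
)


def count_by_category(connection):
    """Return a dict mapping each event category to its number of rows."""
    rows = connection.execute(
        "SELECT event_category, COUNT(*) FROM agent_events"
        " GROUP BY event_category ORDER BY event_category"
    ).fetchall()
    return dict(rows)
```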
General Settings
Alarm Evaluation
This section applies to activity and metric events. You can configure alarms to trigger when an event is reported at the warning or error level, and choose the alarm pipeline that will process the generated alarms.
Enable Activity Alarms: If true, alarms will be generated for agent activity events, such as when an agent stops responding.
Enable Metrics Alarms: If true, alarms will be generated for agent metric events.
Enable Task Alarms: If true, alarms will be generated when a scheduled task fails.
Warning Priority: The priority assigned to all of the warning thresholds. Options are: Diagnostic, Low, Medium, High, Critical.
Error Priority: The priority assigned to all of the error thresholds. Options are: Diagnostic, Low, Medium, High, Critical.
Active Pipeline: The Pipeline to use for active events. Note that this Pipeline must be created before the alarm event happens, and that the name is case-sensitive.
Ack Pipeline: The Pipeline to use for acknowledgement events. Note that this Pipeline must be created before the alarm event happens, and that the name is case-sensitive.
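The way these settings combine can be sketched as follows. This is a minimal model, not the controller's implementation: the class name, field names, and function name are hypothetical, and it assumes that every event carries a category (agent, metric, or task) and a level (warning or error). No default values are given for the fields because the text above does not state any.

```python
from dataclasses import dataclass
from typing import Optional

# Priority options listed in the settings above.
PRIORITIES = ("Diagnostic", "Low", "Medium", "High", "Critical")


@dataclass
class AlarmEvaluationSettings:
    """Hypothetical container mirroring the Alarm Evaluation settings."""
    enable_activity_alarms: bool
    enable_metrics_alarms: bool
    enable_task_alarms: bool
    warning_priority: str
    error_priority: str
    active_pipeline: str  # case-sensitive pipeline name
    ack_pipeline: str     # case-sensitive pipeline name


def alarm_priority(settings: AlarmEvaluationSettings,
                   category: str, level: str) -> Optional[str]:
    """Return the alarm priority for an event, or None if no alarm is raised."""
    enabled = {
        "agent": settings.enable_activity_alarms,
        "metric": settings.enable_metrics_alarms,
        "task": settings.enable_task_alarms,
    }
    if not enabled.get(category, False):
        return None  # alarms for this category are switched off
    if level == "warning":
        return settings.warning_priority
    if level == "error":
        return settings.error_priority
    return None  # other levels do not raise alarms in this sketch
```

For example, with Enable Metrics Alarms set to false, a metric event at the error level produces no alarm, while an agent event at the error level is assigned the configured Error Priority.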
Activity Monitor
The Activity Monitor configures how agent inactivity is reported. When contact is lost with an agent, an inactivity warning or error event is fired if the configured time in minutes has elapsed since last contact.
Inactivity Warning (Minutes): The number of minutes before a warning threshold alarm is activated. (default 5)
Inactivity Error (Minutes): The number of minutes before an error threshold alarm is activated. (default 15)
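The inactivity check can be pictured as a comparison between the time since last contact and the two configured limits. The sketch below uses the default values listed above. The function name is hypothetical, and treating a value exactly equal to a limit as having reached it is an assumption.

```python
from datetime import datetime, timedelta

# Defaults taken from the Activity Monitor settings above.
INACTIVITY_WARNING_MINUTES = 5
INACTIVITY_ERROR_MINUTES = 15


def inactivity_level(last_contact: datetime, now: datetime,
                     warning_minutes: float = INACTIVITY_WARNING_MINUTES,
                     error_minutes: float = INACTIVITY_ERROR_MINUTES) -> str:
    """Classify an agent as "ok", "warning", or "error" by time since last contact."""
    elapsed_minutes = (now - last_contact) / timedelta(minutes=1)
    # Check the error limit first so it takes precedence over the warning limit.
    if elapsed_minutes >= error_minutes:
        return "error"
    if elapsed_minutes >= warning_minutes:
        return "warning"
    return "ok"
```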
System Metric Thresholds
In addition to inactivity alarms, alarms can be raised for any agent when a metric such as CPU usage, number of connected clients, or error rate reaches its configured threshold. Each metric has both a warning and an error level.
CPU Usage Warning (%): The warning level of an Agent's CPU usage. (default 70)
CPU Usage Error (%): The error level of an Agent's CPU usage. (default 90)
Memory Usage Warning (%): The warning level of an Agent's memory (RAM) usage. (default 70)
Memory Usage Error (%): The error level of an Agent's memory (RAM) usage. (default 90)
Errors Per Minute Warning: The warning level of an Agent's per-minute error rate. The contents of the Agent's errors can be checked in the Agent's console. (default 2)
Errors Per Minute Error: The error level of an Agent's per-minute error rate. The contents of the Agent's errors can be checked in the Agent's console. (default 5)
Errors Per Hour Warning: The warning level of an Agent's hourly error rate. The contents of the Agent's errors can be checked in the Agent's console. (default 20)
Errors Per Hour Error: The error level of an Agent's hourly error rate. The contents of the Agent's errors can be checked in the Agent's console. (default 60)
Connected Clients Warning: The number of clients connected to an Agent required to raise a warning alarm. (default 50)
Connected Clients Error: The number of clients connected to an Agent required to raise an error alarm. (default 100)
DB Utilization Warning (%): The warning level for DB connection pool utilization, triggered when utilization exceeds the specified percentage. (default 80)
DB Utilization Error (%): The error level for DB connection pool utilization, triggered when utilization exceeds the specified percentage. (default 100)
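The default thresholds above can be summarized in a small lookup table with a classifier, sketched below. The dictionary keys and function names are hypothetical labels rather than the controller's setting names. Whether a value equal to a threshold counts as crossing it is also an assumption: the descriptions above use both "reached" and "exceeds", and this sketch treats equality as crossing.

```python
# (warning, error) defaults copied from the System Metric Thresholds list.
# Keys are hypothetical labels chosen for this sketch.
DEFAULT_THRESHOLDS = {
    "cpu_usage_pct": (70, 90),
    "memory_usage_pct": (70, 90),
    "errors_per_minute": (2, 5),
    "errors_per_hour": (20, 60),
    "connected_clients": (50, 100),
    "db_utilization_pct": (80, 100),
}


def classify_metric(name, value, thresholds=DEFAULT_THRESHOLDS):
    """Return "ok", "warning", or "error" for a single metric reading."""
    warning, error = thresholds[name]
    # The error level takes precedence over the warning level.
    if value >= error:
        return "error"
    if value >= warning:
        return "warning"
    return "ok"


def classify_agent(metrics, thresholds=DEFAULT_THRESHOLDS):
    """Classify every reading in `metrics` that has a configured threshold."""
    return {
        name: classify_metric(name, value, thresholds)
        for name, value in metrics.items()
        if name in thresholds
    }
```

For instance, an agent reporting 75% CPU usage and 10 connected clients would be at the warning level for CPU and within limits for clients under the default settings.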