Operational Alerts
The operational alerts system monitors the state of your security coverage and notifies the team when a rule's condition is met, for example when coverage falls below a defined threshold.
Alert Rules
Alert rules define what to check and when to fire. Each rule has a type, severity, configuration thresholds, and notification preferences.
Rule Types
| Rule Type | What it checks |
|---|---|
| coverage_drop | Overall coverage score drops below a threshold |
| stale_test | A test has been in red_executing or blue_evaluating for too long |
| unvalidated_test | Tests stuck in in_review beyond a threshold duration |
| high_risk_uncovered | High-severity techniques have no validated tests |
| detection_gap | Technique has validated attack tests but no detection rule |
Rule Fields
{
"name": "Coverage below 70%",
"description": "Alert when overall coverage drops below 70%",
"rule_type": "coverage_drop",
"severity": "high",
"config": {
"threshold": 70.0,
"tactic_id": null
},
"is_enabled": true,
"cooldown_hours": 24,
"notify_in_app": true,
"notify_webhook": true,
"webhook_id": "webhook-uuid-or-null"
}
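Before submitting a rule, a client can check the payload against the values documented on this page. The following is a minimal sketch, assuming the documented rule types and severity levels are exhaustive. `validate_rule` is a hypothetical helper, not part of the Aegis API, and the server may enforce different or additional constraints.

```python
RULE_TYPES = {"coverage_drop", "stale_test", "unvalidated_test",
              "high_risk_uncovered", "detection_gap"}
SEVERITIES = {"info", "low", "medium", "high", "critical"}


def validate_rule(rule: dict) -> list[str]:
    """Return a list of problems found in a rule payload (empty if none)."""
    problems = []
    if rule.get("rule_type") not in RULE_TYPES:
        problems.append(f"unknown rule_type: {rule.get('rule_type')!r}")
    if rule.get("severity") not in SEVERITIES:
        problems.append(f"unknown severity: {rule.get('severity')!r}")
    # Assumption: cooldown_hours must be a non-negative number.
    cooldown = rule.get("cooldown_hours")
    if isinstance(cooldown, bool) or not isinstance(cooldown, (int, float)) or cooldown < 0:
        problems.append("cooldown_hours must be a non-negative number")
    # Webhook delivery needs a target webhook to post to.
    if rule.get("notify_webhook") and not rule.get("webhook_id"):
        problems.append("notify_webhook is true but webhook_id is not set")
    return problems
```

An empty list means the payload passed these client-side checks only; it does not guarantee the server will accept it.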
Severity Levels
| Severity | Use case |
|---|---|
| info | Informational; no action needed immediately |
| low | Worth noting but not urgent |
| medium | Should be addressed in next sprint |
| high | Requires prompt attention |
| critical | Immediate action required |
Rule Configuration Examples
Coverage drop:
{"threshold": 75.0}
Fires when organization score drops below 75%.
Stale test:
{"stale_days": 7}
Fires for any test that has been in the red_executing or blue_evaluating state for more than 7 days.
High risk uncovered:
{"min_severity": "high", "max_uncovered": 5}
Fires when more than 5 high-severity techniques have no validated test.
Detection gap:
{"require_detection_rule": true}
Fires for every validated attack test that has no linked detection rule.
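The conditions described above can be written as small predicates. This is an illustrative sketch rather than the actual Aegis evaluation code: the function names are hypothetical, and it assumes that min_severity means "this level or higher" and that technique severities use the same info-to-critical scale as alert severities.

```python
# Assumed ordering, lowest to highest, following the severity level list.
SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"]


def coverage_drop_fires(score: float, config: dict) -> bool:
    """True when the coverage score is strictly below the threshold."""
    return score < config["threshold"]


def stale_test_fires(days_in_state: float, config: dict) -> bool:
    """True when a test has been executing/evaluating for more than stale_days."""
    return days_in_state > config["stale_days"]


def high_risk_uncovered_fires(uncovered_severities: list[str], config: dict) -> bool:
    """True when more than max_uncovered techniques at or above min_severity
    have no validated test. Each list entry is one uncovered technique."""
    floor = SEVERITY_ORDER.index(config["min_severity"])
    count = sum(1 for s in uncovered_severities
                if SEVERITY_ORDER.index(s) >= floor)
    return count > config["max_uncovered"]
```

Note the strict comparisons: with {"threshold": 75.0}, a score of exactly 75.0 does not fire under this reading of "drops below".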
Alert Instances
When a rule's condition is met and the rule is not in cooldown, an alert instance is created.
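The cooldown gate can be sketched as follows. This assumes the cooldown window is measured from the time the rule last fired (fired_at) and lasts cooldown_hours; the page does not state the exact reference point, so treat both the helper names and that choice as assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def in_cooldown(last_fired_at: Optional[datetime], cooldown_hours: float,
                now: datetime) -> bool:
    """True while fewer than cooldown_hours have passed since the last firing."""
    if last_fired_at is None:
        return False  # the rule has never fired, so nothing to suppress
    return now - last_fired_at < timedelta(hours=cooldown_hours)


def should_create_instance(condition_met: bool,
                           last_fired_at: Optional[datetime],
                           cooldown_hours: float,
                           now: datetime) -> bool:
    """An instance is created only if the condition holds and the rule is not cooling down."""
    return condition_met and not in_cooldown(last_fired_at, cooldown_hours, now)
```

With cooldown_hours set to 24, a rule that fired at 10:00 would not create another instance until 10:00 the next day, even if its condition stays true the whole time.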
Instance Lifecycle
open ──────────────> acknowledged ──────────────> resolved
  │                                                  │
  └────────────────> dismissed                       │
                        │                            │
                        └── suppressed until         └── final state
                            cooldown resets              (immutable)
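The lifecycle can also be expressed as a transition table. The sketch below encodes only the arrows drawn in the diagram; whether the API also accepts other transitions (for example, resolving an open alert directly) is not stated here, so the table and the `transition` helper are assumptions for illustration.

```python
# Allowed status changes, read directly from the lifecycle diagram.
ALLOWED_TRANSITIONS = {
    "open": {"acknowledged", "dismissed"},
    "acknowledged": {"resolved"},
    "resolved": set(),   # final state (immutable)
    "dismissed": set(),  # assumption: the diagram shows no exit from dismissed
}


def transition(current: str, target: str) -> str:
    """Return the new status, or raise ValueError if the move is not in the diagram."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot move alert from {current!r} to {target!r}")
    return target
```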
Instance Fields
{
"id": "uuid",
"rule_id": "uuid",
"rule_name": "Coverage below 70%",
"rule_type": "coverage_drop",
"severity": "high",
"status": "open",
"details": {"current_score": 67.3, "threshold": 70.0},
"fired_at": "2024-03-15T10:00:00Z",
"acknowledged_at": null,
"acknowledged_by": null,
"resolved_at": null,
"dismissed_at": null
}
Alert Lifecycle Actions
Acknowledge
Marks the alert as seen and being investigated. Does NOT suppress re-firing.
POST /api/v1/alerts/{id}/acknowledge
{"notes": "Investigating coverage drop — two campaigns just completed"}
Required role: red_lead, blue_lead, admin
Resolve
Marks the underlying issue as fixed. Prevents re-evaluation from creating a duplicate alert (until cooldown expires and condition is met again).
POST /api/v1/alerts/{id}/resolve
{"resolution_notes": "Coverage restored to 78% after campaign validation"}
Required role: red_lead, blue_lead, admin
Dismiss
Suppresses the alert for the rule's cooldown period.
POST /api/v1/alerts/{id}/dismiss
{"reason": "Planned maintenance window — coverage drop expected"}
Required role: red_lead, blue_lead, admin
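The three lifecycle actions share the same URL pattern and differ only in the body field they carry, so a client can build them with one helper. This is a sketch using only the Python standard library; the host name, the bearer-token header, and `build_action_request` itself are assumptions, since this page does not describe authentication.

```python
import json
import urllib.request

BASE_URL = "https://aegis.example.com"  # hypothetical host

# Body field used by each action, as shown in the request examples.
ACTION_BODY_FIELD = {
    "acknowledge": "notes",
    "resolve": "resolution_notes",
    "dismiss": "reason",
}


def build_action_request(alert_id: str, action: str, text: str,
                         token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for a lifecycle action."""
    field = ACTION_BODY_FIELD[action]  # raises KeyError for unknown actions
    body = json.dumps({field: text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/alerts/{alert_id}/{action}",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumption: bearer-token auth
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends it. The caller's account needs one of the roles listed for the action.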
Alert Evaluation
Automatic (hourly)
Aegis runs alert evaluation every hour via APScheduler:
- Checks all is_enabled=true rules
- For each rule, evaluates the condition against current data
- Creates an instance if condition is met AND rule is not in cooldown
- Sends in-app notifications and/or webhook calls per rule configuration
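One evaluation pass following the steps above might look like the sketch below. It is not the Aegis implementation: `evaluate_rules`, the `evaluators` registry, and the `in_cooldown` callback are hypothetical names, and the scheduling itself (APScheduler in Aegis) is left out so the example stays self-contained.

```python
from typing import Callable, Optional


def evaluate_rules(
    rules: list[dict],
    evaluators: dict[str, Callable[[dict], Optional[dict]]],
    in_cooldown: Callable[[dict], bool],
) -> list[dict]:
    """Run one pass and return the alerts that would be created.

    evaluators maps a rule_type to a function that takes the rule's config
    and returns a details dict when the condition is met, otherwise None.
    """
    fired = []
    for rule in rules:
        if not rule.get("is_enabled"):
            continue  # only enabled rules are checked
        evaluate = evaluators.get(rule["rule_type"])
        if evaluate is None:
            continue  # no evaluator registered for this rule type
        details = evaluate(rule["config"])
        if details is None or in_cooldown(rule):
            continue  # condition not met, or rule is cooling down
        # Pick notification channels from the rule's preferences.
        channels = []
        if rule.get("notify_in_app"):
            channels.append("in_app")
        if rule.get("notify_webhook") and rule.get("webhook_id"):
            channels.append("webhook")
        fired.append({"rule_name": rule["name"], "details": details,
                      "channels": channels})
    return fired
```

Injecting the evaluators and the cooldown check as arguments keeps the loop easy to test without a database or a scheduler.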
Manual trigger
POST /api/v1/alerts/evaluate
Required role: red_lead, blue_lead, admin
Useful when you've made changes and want to check immediately without waiting for the hourly job.
In-App Notifications
When a rule has notify_in_app: true, an in-app notification is sent to all users
with the role red_lead, blue_lead, or admin.
View notifications:
GET /api/v1/notifications
Mark as read:
PATCH /api/v1/notifications/{id}
{"is_read": true}
Webhook Notifications
When notify_webhook: true and a webhook_id is set, Aegis POSTs to the configured
webhook URL when the alert fires.
Webhook payload:
{
"event": "alert.fired",
"alert_id": "uuid",
"rule_name": "Coverage below 70%",
"severity": "high",
"details": {"current_score": 67.3, "threshold": 70.0},
"fired_at": "2024-03-15T10:00:00Z"
}
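On the receiving side, a webhook consumer can parse this payload into a short message. The sketch below assumes the body is JSON with exactly the fields shown above; `parse_alert_webhook` is a hypothetical helper, and other event types, if any exist, are rejected because this page documents only alert.fired.

```python
import json
from datetime import datetime


def parse_alert_webhook(raw_body: bytes) -> dict:
    """Parse an alert.fired payload into a summary suitable for chat or paging."""
    payload = json.loads(raw_body)
    if payload.get("event") != "alert.fired":
        raise ValueError(f"unexpected event: {payload.get('event')!r}")
    # datetime.fromisoformat accepts a trailing "Z" only from Python 3.11,
    # so normalize it to an explicit UTC offset first.
    fired_at = datetime.fromisoformat(payload["fired_at"].replace("Z", "+00:00"))
    return {
        "alert_id": payload["alert_id"],
        "summary": f"[{payload['severity'].upper()}] {payload['rule_name']}",
        "fired_at": fired_at,
        "details": payload.get("details", {}),
    }
```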
Summary
GET /api/v1/alerts/summary
Returns:
{
"total": 12,
"by_status": {"open": 5, "acknowledged": 3, "resolved": 3, "dismissed": 1},
"by_severity": {"critical": 1, "high": 4, "medium": 5, "low": 2, "info": 0},
"by_type": {
"coverage_drop": 2,
"stale_test": 4,
"unvalidated_test": 3,
"high_risk_uncovered": 2,
"detection_gap": 1
}
}
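The same breakdown can be computed client-side from a list of alert instances, which is useful for checking the response or for summarizing a filtered subset. This is a sketch with a hypothetical `summarize` helper; unlike the endpoint response above, it omits keys whose count is zero.

```python
from collections import Counter


def summarize(instances: list[dict]) -> dict:
    """Count alert instances by status, severity, and rule type."""
    return {
        "total": len(instances),
        "by_status": dict(Counter(i["status"] for i in instances)),
        "by_severity": dict(Counter(i["severity"] for i in instances)),
        "by_type": dict(Counter(i["rule_type"] for i in instances)),
    }
```

Each breakdown sums to total, as in the example response, because every instance has exactly one status, one severity, and one rule type.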