Aegis — Scoring System
Aegis uses a granular 0–100 scoring system to measure security coverage at multiple levels: individual techniques, tactics, threat actors, and the overall organization.
Technique Score (0–100)
Each ATT&CK technique receives a composite score based on five weighted components:
| Component | Default Weight | Description |
|---|---|---|
| Tests Validated | 40% | Ratio of detected tests to total validated tests |
| Detection Rules | 20% | Number of active detection rules linked to the technique |
| D3FEND Coverage | 15% | Number of D3FEND defensive techniques mapped |
| Freshness | 15% | How recent the latest validated test is |
| Platform Diversity | 10% | Coverage across different platforms (Windows, Linux, macOS) |
Tests Validated Component
score = (detected_tests / total_validated_tests) × weight
- Only tests in `validated` state are counted
- `detected` means `detection_result = "detected"`
- Example: 2 detected out of 3 validated → 2/3 × 40 = 26.7
Detection Rules Component
score = min(active_rules / 3, 1.0) × weight
- Counts active detection rules linked to the technique's `mitre_id`
- 3+ rules gives full marks (capped at 1.0)
- Example: 2 active rules → 2/3 × 20 = 13.3
D3FEND Coverage Component
score = min(d3fend_mappings / 2, 1.0) × weight
- Counts D3FEND defensive technique mappings
- 2+ mappings gives full marks
- Example: 1 mapping → 1/2 × 15 = 7.5
Freshness Component
days = (now - newest_validated_test.red_validated_at).days
score = max(0, 1.0 - days / 180) × weight
- 0 days old = full freshness score
- 180+ days old = 0 (completely stale)
- Linear decay between 0 and 180 days
- Example: test is 60 days old → (1 - 60/180) × 15 = 10.0
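The linear decay can be sketched in Python as follows. This is a minimal illustration of the formula above, not Aegis's actual code: the function name `freshness_score` and the constant `STALE_AFTER_DAYS` are hypothetical, and timestamps are assumed to be timezone-aware `datetime` values.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER_DAYS = 180  # age at which the freshness component reaches zero


def freshness_score(red_validated_at: datetime, now: datetime, weight: float = 15.0) -> float:
    """Full weight at 0 days, linear decay, 0 once the test is 180+ days old."""
    days = (now - red_validated_at).days
    return max(0.0, 1.0 - days / STALE_AFTER_DAYS) * weight


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
print(round(freshness_score(now - timedelta(days=60), now), 1))   # 60-day-old test
print(round(freshness_score(now - timedelta(days=200), now), 1))  # stale test
```

Passing `now` explicitly rather than reading the clock inside the function keeps the calculation deterministic and easy to test.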
Platform Diversity Component
platforms_covered = unique platforms across validated tests
score = min(platforms_covered / 3, 1.0) × weight
- Counts unique platforms (windows, linux, macos) from validated tests
- 3+ platforms gives full marks
- Example: windows + linux → 2/3 × 10 = 6.7
Example Calculation
A technique with:
- 2/3 tests detected, 2 detection rules, 1 D3FEND mapping, 60 days old, 2 platforms
Tests: (2/3) × 40 = 26.7
Detection: (2/3) × 20 = 13.3
D3FEND: (1/2) × 15 = 7.5
Freshness: (1 - 60/180) × 15 = 10.0
Platform: (2/3) × 10 = 6.7
─────
Total: 64.2
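The same calculation can be reproduced with a short Python sketch. The `TechniqueInputs` dataclass and the `technique_score` function are hypothetical names chosen for illustration; the weight keys mirror the fields of the scoring config API. Treating a technique with no validated tests as a tests ratio of 0 is an assumption, since the formulas above do not define that case.

```python
from dataclasses import dataclass

DEFAULT_WEIGHTS = {
    "tests": 40,
    "detection_rules": 20,
    "d3fend": 15,
    "freshness": 15,
    "platform_diversity": 10,
}


@dataclass
class TechniqueInputs:  # hypothetical container for the raw counts
    detected_tests: int
    total_validated_tests: int
    active_rules: int
    d3fend_mappings: int
    days_since_validation: int
    platforms_covered: int


def technique_score(t: TechniqueInputs, weights: dict = None) -> float:
    w = weights or DEFAULT_WEIGHTS
    # Assumption: no validated tests means the tests component contributes 0.
    tests = t.detected_tests / t.total_validated_tests if t.total_validated_tests else 0.0
    rules = min(t.active_rules / 3, 1.0)        # 3+ rules gives full marks
    d3fend = min(t.d3fend_mappings / 2, 1.0)    # 2+ mappings gives full marks
    fresh = max(0.0, 1.0 - t.days_since_validation / 180)
    platforms = min(t.platforms_covered / 3, 1.0)
    return (
        tests * w["tests"]
        + rules * w["detection_rules"]
        + d3fend * w["d3fend"]
        + fresh * w["freshness"]
        + platforms * w["platform_diversity"]
    )


example = TechniqueInputs(2, 3, 2, 1, 60, 2)
print(round(technique_score(example), 1))  # matches the total above: 64.2
```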
Configuring Weights
Weights are configurable via environment variables or the admin API. They must sum to 100.
Environment Variables
SCORING_WEIGHT_TESTS=40
SCORING_WEIGHT_DETECTION_RULES=20
SCORING_WEIGHT_D3FEND=15
SCORING_WEIGHT_FRESHNESS=15
SCORING_WEIGHT_PLATFORM_DIVERSITY=10
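A loader for these variables might look like the sketch below. The function name `load_weights` is hypothetical, and raising `ValueError` when the weights do not sum to 100 is an assumption about how the rule is enforced; the variable names and defaults are the ones listed above.

```python
import os
from typing import Mapping

DEFAULT_WEIGHTS = {
    "tests": 40,
    "detection_rules": 20,
    "d3fend": 15,
    "freshness": 15,
    "platform_diversity": 10,
}


def load_weights(env: Mapping[str, str] = os.environ) -> dict:
    """Read SCORING_WEIGHT_* variables, falling back to the documented defaults."""
    weights = {
        name: float(env.get(f"SCORING_WEIGHT_{name.upper()}", default))
        for name, default in DEFAULT_WEIGHTS.items()
    }
    total = sum(weights.values())
    if abs(total - 100) > 1e-9:
        # Assumed behaviour: reject configurations that break the sum-to-100 rule.
        raise ValueError(f"scoring weights must sum to 100, got {total}")
    return weights


# Usage with an explicit mapping instead of the real process environment:
print(load_weights({"SCORING_WEIGHT_TESTS": "50",
                    "SCORING_WEIGHT_D3FEND": "10",
                    "SCORING_WEIGHT_FRESHNESS": "10"}))
```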
API Configuration
# Get current weights
GET /api/v1/scores/config
# Update weights (admin only)
PATCH /api/v1/scores/config
{
"tests": 50,
"detection_rules": 20,
"d3fend": 10,
"freshness": 10,
"platform_diversity": 10
}
Note: Runtime changes do not persist across restarts. Update the .env file or environment variables for permanent changes.
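As an illustration, the update request could be built from Python with the standard library. The host name, the bearer-token header, and the function name `build_weights_patch` are all assumptions made for this sketch; only the path, the HTTP method, and the payload fields come from the description above. The request is constructed but not sent.

```python
import json
import urllib.request

BASE_URL = "https://aegis.example.com"  # hypothetical host


def build_weights_patch(weights: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) the PATCH request that updates scoring weights."""
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/scores/config",
        data=json.dumps(weights).encode("utf-8"),
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            # Assumption: admin access is granted through a bearer token.
            "Authorization": f"Bearer {token}",
        },
    )


req = build_weights_patch(
    {"tests": 50, "detection_rules": 20, "d3fend": 10,
     "freshness": 10, "platform_diversity": 10},
    token="<admin-token>",
)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send it against a real deployment.
```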
Tactic Score
The tactic score is the average of all technique scores within that tactic:
tactic_score = mean(technique_scores for techniques in tactic)
Also provides:
- `techniques_total`: number of techniques in the tactic
- `techniques_evaluated`: techniques with score > 0
- `techniques_by_status`: count by status (validated, partial, not_covered, not_evaluated)
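The aggregation can be sketched as below. The input shape (a list of dicts with `score` and `status` keys) and the function name `tactic_summary` are hypothetical, and returning 0.0 for a tactic with no techniques is an assumption.

```python
from collections import Counter
from statistics import mean


def tactic_summary(techniques: list) -> dict:
    """Aggregate per-technique results into the tactic-level fields."""
    scores = [t["score"] for t in techniques]
    return {
        # Assumption: an empty tactic scores 0 rather than raising an error.
        "tactic_score": mean(scores) if scores else 0.0,
        "techniques_total": len(techniques),
        "techniques_evaluated": sum(1 for s in scores if s > 0),
        "techniques_by_status": dict(Counter(t["status"] for t in techniques)),
    }


summary = tactic_summary([
    {"score": 64.2, "status": "partial"},
    {"score": 0.0, "status": "not_evaluated"},
    {"score": 90.0, "status": "validated"},
])
print(summary)
```

Note that unevaluated techniques (score 0) still count toward the mean, so they pull the tactic score down.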
API
GET /api/v1/scores/tactic/execution
GET /api/v1/scores/tactic/persistence
Threat Actor Coverage Score
Measures how well the organization is covered against a specific threat actor:
actor_score = mean(technique_scores for techniques used by actor)
Also provides:
- `techniques_total`: techniques attributed to the actor
- `techniques_covered`: techniques with score > 0
- `coverage_percentage`: percentage of techniques covered
- `uncovered_techniques`: list of technique IDs with score = 0
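A minimal sketch of these fields, assuming the actor's techniques arrive as a mapping from technique ID to technique score. The function name `actor_coverage` and the handling of an actor with no attributed techniques (all values zero) are assumptions.

```python
from statistics import mean


def actor_coverage(scores_by_technique: dict) -> dict:
    """Compute actor-level coverage from {technique_id: technique_score}."""
    total = len(scores_by_technique)
    covered = [tid for tid, s in scores_by_technique.items() if s > 0]
    uncovered = [tid for tid, s in scores_by_technique.items() if s == 0]
    return {
        "actor_score": mean(scores_by_technique.values()) if total else 0.0,
        "techniques_total": total,
        "techniques_covered": len(covered),
        "coverage_percentage": 100 * len(covered) / total if total else 0.0,
        "uncovered_techniques": uncovered,
    }


result = actor_coverage({"T1059": 64.2, "T1547": 0.0, "T1003": 80.0, "T1021": 0.0})
print(result["coverage_percentage"], result["uncovered_techniques"])
```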
API
GET /api/v1/scores/threat-actor/{actor_id}
Organization Score
The top-level organizational security score is a weighted average of four sub-scores:
| Sub-score | Weight | Description |
|---|---|---|
| Total Coverage | 40% | Average technique score across all evaluated techniques |
| Critical Coverage | 25% | Average score for techniques with high/critical severity templates |
| Detection Maturity | 20% | (triggered_rules / total_active_rules) × 100 |
| Response Readiness | 15% | (remediation_completed / remediation_total) × 100 |
org_score = total_coverage × 0.4
+ critical_coverage × 0.25
+ detection_maturity × 0.2
+ response_readiness × 0.15
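The weighted average can be sketched in Python as follows. The function names are hypothetical, and treating a zero denominator (no active rules, or no remediation items) as a sub-score of 0 is an assumption not stated above.

```python
ORG_WEIGHTS = {
    "total_coverage": 0.40,
    "critical_coverage": 0.25,
    "detection_maturity": 0.20,
    "response_readiness": 0.15,
}


def ratio_pct(numerator: int, denominator: int) -> float:
    # Assumption: an empty denominator yields 0 instead of an error.
    return 100 * numerator / denominator if denominator else 0.0


def organization_score(total_coverage: float, critical_coverage: float,
                       triggered_rules: int, total_active_rules: int,
                       remediation_completed: int, remediation_total: int) -> float:
    sub_scores = {
        "total_coverage": total_coverage,
        "critical_coverage": critical_coverage,
        "detection_maturity": ratio_pct(triggered_rules, total_active_rules),
        "response_readiness": ratio_pct(remediation_completed, remediation_total),
    }
    return sum(sub_scores[name] * weight for name, weight in ORG_WEIGHTS.items())


# 60 × 0.4 + 50 × 0.25 + 75 × 0.2 + 80 × 0.15
print(round(organization_score(60.0, 50.0, 30, 40, 8, 10), 1))
```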
Caching
The organization score is cached in-memory for 5 minutes. The cache is automatically invalidated when:
- A test is validated (state → `validated`)
- Scoring weights are updated via the API
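One way to picture this behaviour is a small time-to-live cache with an explicit invalidation hook. The class below is a sketch under stated assumptions, not the actual Aegis implementation: `OrgScoreCache`, its `compute` callback, and the injectable `clock` are hypothetical names.

```python
import time
from typing import Callable, Optional


class OrgScoreCache:
    TTL_SECONDS = 300  # 5 minutes

    def __init__(self, compute: Callable[[], float],
                 clock: Callable[[], float] = time.monotonic):
        self._compute = compute      # recomputes the organization score
        self._clock = clock          # injectable so tests can control time
        self._value: Optional[float] = None
        self._expires_at = 0.0

    def get(self) -> float:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._compute()
            self._expires_at = now + self.TTL_SECONDS
        return self._value

    def invalidate(self) -> None:
        """Call when a test reaches `validated` or weights change via the API."""
        self._value = None


cache = OrgScoreCache(compute=lambda: 63.5)
print(cache.get())   # computed on first access
cache.invalidate()   # e.g. after a weight update
print(cache.get())   # recomputed
```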
API
GET /api/v1/scores/organization
Operational Metrics
In addition to coverage scores, Aegis tracks operational KPIs:
Mean Time to Detect (MTTD)
Time from test execution start (start_execution audit entry) to red team submission (submit_red).
MTTD = mean(submit_red.timestamp - start_execution.timestamp) for all tests
Mean Time to Respond (MTTR)
Time from blue team evaluation (blue_validated_at) to remediation completion (update_remediation audit entry).
MTTR = mean(update_remediation.timestamp - blue_validated_at) for remediated tests
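Both metrics are means over time intervals, which can be sketched with one helper. The function name `mean_duration` and the input shape (a list of start/end timestamp pairs, one per test) are assumptions; returning `None` when there are no tests is also an assumption.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


def mean_duration(pairs: List[Tuple[datetime, datetime]]) -> Optional[timedelta]:
    """Average of (end - start) over all pairs, or None if there are none."""
    if not pairs:
        return None
    total = sum((end - start for start, end in pairs), timedelta())
    return total / len(pairs)


start = datetime(2026, 1, 5, 9, 0)
# MTTD: (start_execution, submit_red) timestamps per test
mttd = mean_duration([
    (start, start + timedelta(hours=2)),
    (start, start + timedelta(hours=4)),
])
print(mttd)
```

For MTTR the same helper would take `(blue_validated_at, update_remediation timestamp)` pairs, restricted to remediated tests.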
Detection Efficacy
efficacy = (detected_tests / total_validated_tests) × 100
Alert Fidelity
Ratio of true positive detections to total detection rule evaluations.
Coverage Velocity
Rate at which new techniques are being covered over time (techniques covered per week).
Validation Throughput
Number of tests moving through the pipeline per time period.
Rejection Rate
Percentage of tests rejected during dual validation.
API
# All operational metrics
GET /api/v1/metrics/operational
# Weekly trend data
GET /api/v1/metrics/operational/trend?period=90d
# Breakdown by team
GET /api/v1/metrics/operational/by-team
Score History
Weekly score snapshots for trend analysis:
GET /api/v1/scores/history?period=90d
# Returns weekly data points with: date, overall_score, total_coverage,
# critical_coverage, detection_maturity, response_readiness
Periods: 30d, 90d, 1y
Coverage Snapshots
Point-in-time captures of the complete coverage state for historical comparison:
# Create a snapshot
POST /api/v1/snapshots
{ "name": "Q1 2026 Baseline" }
# Compare two snapshots
GET /api/v1/snapshots/compare?a={snapshot_id_a}&b={snapshot_id_b}
# Returns: score_delta, improved techniques, worsened techniques, unchanged count
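The comparison logic might be sketched as below. The snapshot shape (`overall_score` plus a `techniques` mapping of ID to score) and the function name `compare_snapshots` are hypothetical, and scoring a technique that is absent from one snapshot as 0 is an assumption.

```python
def compare_snapshots(a: dict, b: dict) -> dict:
    """Diff snapshot b against baseline a."""
    improved, worsened, unchanged = [], [], 0
    for tid in sorted(set(a["techniques"]) | set(b["techniques"])):
        # Assumption: a technique missing from a snapshot counts as score 0.
        before = a["techniques"].get(tid, 0.0)
        after = b["techniques"].get(tid, 0.0)
        if after > before:
            improved.append(tid)
        elif after < before:
            worsened.append(tid)
        else:
            unchanged += 1
    return {
        "score_delta": b["overall_score"] - a["overall_score"],
        "improved": improved,
        "worsened": worsened,
        "unchanged": unchanged,
    }


baseline = {"overall_score": 55.0,
            "techniques": {"T1059": 40.0, "T1547": 70.0, "T1003": 20.0}}
current = {"overall_score": 61.5,
           "techniques": {"T1059": 64.2, "T1547": 65.0, "T1003": 20.0}}
print(compare_snapshots(baseline, current))
```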
Automatic weekly snapshots are created every Sunday at 00:00 by the scheduler, with old snapshots cleaned up to keep the last 52 (one year).
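The retention rule can be sketched as a simple sort-and-slice. The function name `prune_snapshots` and the record shape (`id`, `created_at`) are hypothetical; whether manually created snapshots are exempt from cleanup is not stated here, so this sketch assumes it is only given automatic ones.

```python
from datetime import date, timedelta

KEEP_LAST = 52  # one year of weekly snapshots


def prune_snapshots(snapshots: list, keep: int = KEEP_LAST) -> tuple:
    """Return (kept, removed), keeping the `keep` most recent snapshots."""
    ordered = sorted(snapshots, key=lambda s: s["created_at"], reverse=True)
    return ordered[:keep], ordered[keep:]


# 60 weekly snapshots, starting on a Sunday
weekly = [{"id": i, "created_at": date(2025, 1, 5) + timedelta(weeks=i)}
          for i in range(60)]
kept, removed = prune_snapshots(weekly)
print(len(kept), len(removed))
```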