# Aegis: Scoring System

Aegis uses a granular 0–100 scoring system to measure security coverage at multiple levels: individual techniques, tactics, threat actors, and the overall organization.

---

## Technique Score (0–100)

Each ATT&CK technique receives a composite score based on five weighted components:

| Component | Default Weight | Description |
|-----------|---------------|-------------|
| Tests Validated | 40% | Ratio of detected tests to total validated tests |
| Detection Rules | 20% | Number of active detection rules linked to the technique |
| D3FEND Coverage | 15% | Number of D3FEND defensive techniques mapped |
| Freshness | 15% | How recent the latest validated test is |
| Platform Diversity | 10% | Coverage across different platforms (Windows, Linux, macOS) |

### Tests Validated Component

```
score = (detected_tests / total_validated_tests) × weight
```

- Only tests in `validated` state are counted
- `detected` means `detection_result = "detected"`
- Example: 2 detected out of 3 validated → `2/3 × 40 = 26.7`

### Detection Rules Component

```
score = min(active_rules / 3, 1.0) × weight
```

- Counts active detection rules linked to the technique's `mitre_id`
- 3 or more rules give full marks (the ratio is capped at 1.0)
- Example: 2 active rules → `2/3 × 20 = 13.3`

### D3FEND Coverage Component

```
score = min(d3fend_mappings / 2, 1.0) × weight
```

- Counts D3FEND defensive technique mappings
- 2 or more mappings give full marks
- Example: 1 mapping → `1/2 × 15 = 7.5`

### Freshness Component

```
days = (now - newest_validated_test.red_validated_at).days
score = max(0, 1.0 - days / 180) × weight
```

- 0 days old = full freshness score
- 180+ days old = 0 (completely stale)
- Linear decay between 0 and 180 days
- Example: test is 60 days old → `(1 - 60/180) × 15 = 10.0`

### Platform Diversity Component

```
platforms_covered = unique platforms across validated tests
score = min(platforms_covered / 3, 1.0) × weight
```

- Counts unique platforms (windows, linux, macos) from validated tests
- 3 or more platforms give full marks
- Example: windows + linux → `2/3 × 10 = 6.7`

### Example Calculation

A technique with 2 of 3 tests detected, 2 detection rules, 1 D3FEND mapping, a newest validated test that is 60 days old, and 2 platforms:

```
Tests:      (2/3) × 40          = 26.7
Detection:  (2/3) × 20          = 13.3
D3FEND:     (1/2) × 15          =  7.5
Freshness:  (1 - 60/180) × 15   = 10.0
Platform:   (2/3) × 10          =  6.7
                                  ─────
Total:                            64.2
```

---

## Configuring Weights

Weights are configurable via environment variables or the admin API. They must sum to 100.

### Environment Variables

```env
SCORING_WEIGHT_TESTS=40
SCORING_WEIGHT_DETECTION_RULES=20
SCORING_WEIGHT_D3FEND=15
SCORING_WEIGHT_FRESHNESS=15
SCORING_WEIGHT_PLATFORM_DIVERSITY=10
```

### API Configuration

```bash
# Get current weights
GET /api/v1/scores/config

# Update weights (admin only)
PATCH /api/v1/scores/config
{
  "tests": 50,
  "detection_rules": 20,
  "d3fend": 10,
  "freshness": 10,
  "platform_diversity": 10
}
```

Note: Weight changes made through the API at runtime do not persist across restarts. Update the `.env` file or environment variables for permanent changes.
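
The five component formulas and the sum-to-100 rule can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the Aegis implementation: the function names, the dictionary keys (borrowed from the API payload), and the choice to score the test-derived components as 0 when there are no validated tests are all assumptions.

```python
# Default weights from the component table; key names follow the API payload.
DEFAULT_WEIGHTS = {
    "tests": 40,
    "detection_rules": 20,
    "d3fend": 15,
    "freshness": 15,
    "platform_diversity": 10,
}


def validate_weights(weights):
    """Raise ValueError unless the weights sum to 100."""
    total = sum(weights.values())
    if total != 100:
        raise ValueError(f"weights must sum to 100, got {total}")


def technique_score(detected, validated, active_rules, d3fend_mappings,
                    days_since_newest_test, platforms, weights=None):
    """Return (components, total) for one technique (hypothetical helper)."""
    weights = DEFAULT_WEIGHTS if weights is None else weights
    validate_weights(weights)

    # Assumption: without validated tests, the test ratio and freshness are 0.
    tests_ratio = detected / validated if validated else 0.0
    freshness = max(0.0, 1.0 - days_since_newest_test / 180) if validated else 0.0

    components = {
        "tests": tests_ratio * weights["tests"],
        "detection_rules": min(active_rules / 3, 1.0) * weights["detection_rules"],
        "d3fend": min(d3fend_mappings / 2, 1.0) * weights["d3fend"],
        "freshness": freshness * weights["freshness"],
        "platform_diversity": min(len(set(platforms)) / 3, 1.0)
        * weights["platform_diversity"],
    }
    return components, sum(components.values())


# The worked example: 2/3 detected, 2 rules, 1 mapping, 60 days, 2 platforms.
components, total = technique_score(2, 3, 2, 1, 60, ["windows", "linux"])
print(round(total, 1))  # 64.2
```

Checking the weights before every calculation keeps a bad configuration from silently producing scores above or below the 0–100 range.
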

---

## Tactic Score

The tactic score is the **average** of all technique scores within that tactic:

```
tactic_score = mean(technique_scores for techniques in tactic)
```

Also provides:

- `techniques_total`: number of techniques in the tactic
- `techniques_evaluated`: techniques with score > 0
- `techniques_by_status`: count by status (validated, partial, not_covered, not_evaluated)

### API

```bash
GET /api/v1/scores/tactic/execution
GET /api/v1/scores/tactic/persistence
```

---

## Threat Actor Coverage Score

Measures how well the organization is covered against a specific threat actor:

```
actor_score = mean(technique_scores for techniques used by actor)
```

Also provides:

- `techniques_total`: techniques attributed to the actor
- `techniques_covered`: techniques with score > 0
- `coverage_percentage`: percentage of techniques covered
- `uncovered_techniques`: list of technique IDs with score = 0

### API

```bash
GET /api/v1/scores/threat-actor/{actor_id}
```

---

## Organization Score

The top-level organizational security score is a weighted average of four sub-scores:

| Sub-score | Weight | Description |
|-----------|--------|-------------|
| Total Coverage | 40% | Average technique score across all evaluated techniques |
| Critical Coverage | 25% | Average score for techniques with high/critical severity templates |
| Detection Maturity | 20% | `(triggered_rules / total_active_rules) × 100` |
| Response Readiness | 15% | `(remediation_completed / remediation_total) × 100` |

```
org_score = total_coverage × 0.4
          + critical_coverage × 0.25
          + detection_maturity × 0.2
          + response_readiness × 0.15
```

### Caching

The organization score is cached in-memory for 5 minutes.
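
As a rough sketch of how the weighted average and a five-minute in-memory cache could fit together, consider the following. The names `org_score`, `CachedOrgScore`, and the injectable `clock` parameter are hypothetical and chosen for illustration; only the weights and the 300-second lifetime come from the text above.

```python
import time

# Sub-score weights from the Organization Score table.
ORG_WEIGHTS = {
    "total_coverage": 0.40,
    "critical_coverage": 0.25,
    "detection_maturity": 0.20,
    "response_readiness": 0.15,
}


def org_score(sub_scores):
    """Weighted average of the four sub-scores, each on a 0-100 scale."""
    return sum(sub_scores[name] * weight for name, weight in ORG_WEIGHTS.items())


class CachedOrgScore:
    """Hypothetical time-based cache around an expensive score computation."""

    def __init__(self, compute, ttl_seconds=300, clock=time.monotonic):
        self._compute = compute      # callable returning the fresh score
        self._ttl = ttl_seconds      # 5 minutes by default
        self._clock = clock          # injectable so tests can control time
        self._value = None
        self._expires_at = None      # None means "nothing cached"

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._compute()
            self._expires_at = now + self._ttl
        return self._value

    def invalidate(self):
        """Drop the cached value so the next get() recomputes it."""
        self._expires_at = None
```

A monotonic clock is used so that wall-clock adjustments cannot extend or shorten the cache lifetime.
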

The cache is automatically invalidated when:

- A test is validated (state → `validated`)
- Scoring weights are updated via the API

### API

```bash
GET /api/v1/scores/organization
```

---

## Operational Metrics

In addition to coverage scores, Aegis tracks operational KPIs:

### Mean Time to Detect (MTTD)

Time from test execution start (`start_execution` audit entry) to red team submission (`submit_red`).

```
MTTD = mean(submit_red.timestamp - start_execution.timestamp) for all tests
```

### Mean Time to Respond (MTTR)

Time from blue team evaluation (`blue_validated_at`) to remediation completion (`update_remediation` audit entry).

```
MTTR = mean(update_remediation.timestamp - blue_validated_at) for remediated tests
```

### Detection Efficacy

```
efficacy = (detected_tests / total_validated_tests) × 100
```

### Alert Fidelity

Ratio of true positive detections to total detection rule evaluations.

### Coverage Velocity

Rate at which new techniques are being covered over time (techniques covered per week).

### Validation Throughput

Number of tests moving through the pipeline per time period.

### Rejection Rate

Percentage of tests rejected during dual validation.
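
The MTTD and detection efficacy formulas above reduce to a mean over timestamp differences and a simple ratio. The sketch below assumes each test is represented as a `(start_execution, submit_red)` pair of `datetime` values; that shape, the function names, and returning `None` for empty input are illustrative choices rather than documented Aegis behavior.

```python
from datetime import datetime, timedelta
from statistics import mean


def mean_time_to_detect(tests):
    """Average of (submit_red - start_execution) over all tests.

    `tests` is an iterable of (start_execution, submit_red) datetime pairs.
    Returns a timedelta, or None when there are no tests (assumption).
    """
    durations = [(submit - start).total_seconds() for start, submit in tests]
    if not durations:
        return None
    return timedelta(seconds=mean(durations))


def detection_efficacy(detected_tests, total_validated_tests):
    """Percentage of validated tests that were detected.

    Returns None when nothing has been validated yet (assumption).
    """
    if total_validated_tests == 0:
        return None
    return detected_tests / total_validated_tests * 100


# Two tests that took 10 and 20 minutes from execution start to submission.
start = datetime(2026, 1, 5, 9, 0)
sample = [
    (start, start + timedelta(minutes=10)),
    (start, start + timedelta(minutes=20)),
]
print(mean_time_to_detect(sample))  # 0:15:00
```

MTTR follows the same pattern with `blue_validated_at` and the `update_remediation` timestamp as the pair.
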

### API

```bash
# All operational metrics
GET /api/v1/metrics/operational

# Weekly trend data
GET /api/v1/metrics/operational/trend?period=90d

# Breakdown by team
GET /api/v1/metrics/operational/by-team
```

---

## Score History

Weekly score snapshots for trend analysis:

```bash
GET /api/v1/scores/history?period=90d
# Returns weekly data points with: date, overall_score, total_coverage,
# critical_coverage, detection_maturity, response_readiness
```

Periods: `30d`, `90d`, `1y`

---

## Coverage Snapshots

Point-in-time captures of the complete coverage state for historical comparison:

```bash
# Create a snapshot
POST /api/v1/snapshots
{ "name": "Q1 2026 Baseline" }

# Compare two snapshots
GET /api/v1/snapshots/compare?a={snapshot_id_a}&b={snapshot_id_b}
# Returns: score_delta, improved techniques, worsened techniques, unchanged count
```

Automatic weekly snapshots are created every Sunday at 00:00 by the scheduler, with old snapshots cleaned up to keep the last 52 (one year).
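
The keep-the-last-52 retention rule can be illustrated with a short sketch. It assumes each snapshot is a dictionary with a `created_at` date, which is a hypothetical shape used only for this example; the real scheduler and storage model may differ.

```python
from datetime import date, timedelta


def prune_snapshots(snapshots, keep=52):
    """Split snapshots into (kept, removed), keeping the `keep` newest.

    Each snapshot is assumed to be a dict with a `created_at` date.
    """
    ordered = sorted(snapshots, key=lambda s: s["created_at"], reverse=True)
    return ordered[:keep], ordered[keep:]


# Sixty weekly snapshots, one per Sunday, oldest first.
first_sunday = date(2025, 1, 5)
weekly = [
    {"name": f"auto-{i}", "created_at": first_sunday + timedelta(weeks=i)}
    for i in range(60)
]
kept, removed = prune_snapshots(weekly)
print(len(kept), len(removed))  # 52 8
```

Sorting by creation date before slicing means the rule holds even if snapshots arrive out of order.
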