Aegis — Architecture

High-Level Overview

┌────────────────────┐       ┌─────────────────────┐
│   React Frontend   │──────▶│   FastAPI Backend    │
│  (Vite / TS / TW)  │ REST  │  (Python 3.11)      │
└────────────────────┘       └──────┬──────┬────────┘
                                    │      │
                          ┌─────────┘      └─────────┐
                          ▼                          ▼
                 ┌─────────────────┐       ┌─────────────────┐
                 │   PostgreSQL    │       │     MinIO        │
                 │  (Data Store)   │       │ (Object Storage) │
                 └─────────────────┘       └─────────────────┘
  • Frontend — React 19 + TypeScript + Tailwind CSS v4 + TanStack Query
  • Backend — FastAPI with SQLAlchemy ORM + Alembic migrations
  • Database — PostgreSQL 15 with UUID primary keys and JSONB columns
  • Object Storage — MinIO (S3-compatible) for evidence files
  • Scheduler — APScheduler (in-process) for background jobs

Database Schema

Core Tables

| Table | Description |
| --- | --- |
| users | User accounts with role-based access (admin, red_tech, blue_tech, red_lead, blue_lead, viewer) |
| techniques | MITRE ATT&CK techniques with coverage status, tactic, platforms (JSONB) |
| tests | Security tests with full Red/Blue workflow fields, dual validation, remediation, and retest chain |
| test_templates | Predefined test catalog from Atomic Red Team, Sigma, CALDERA, LOLBAS, custom |
| evidences | Evidence files separated by team (red/blue) with SHA256 integrity verification |
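
The SHA256 integrity verification on evidence files can be illustrated with a minimal sketch. It assumes the digest is stored as a hex string next to the file at upload time; the function names `sha256_of_stream` and `verify_evidence` and the chunk size are hypothetical and not taken from the Aegis codebase.

```python
import hashlib
import hmac
from typing import BinaryIO

# Hypothetical chunk size: read in 64 KiB pieces so large files stay out of memory.
CHUNK_SIZE = 64 * 1024


def sha256_of_stream(stream: BinaryIO) -> str:
    """Return the hex SHA-256 digest of a binary stream."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(CHUNK_SIZE), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_evidence(stream: BinaryIO, stored_digest: str) -> bool:
    """Recompute the digest and compare it with the value recorded at upload."""
    return hmac.compare_digest(sha256_of_stream(stream), stored_digest.lower())
```

Hashing a stream rather than a byte string means the same helper works for an uploaded file object and for an object read back from S3-compatible storage.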

Detection & Defense

| Table | Description |
| --- | --- |
| detection_rules | Imported detection rules (Sigma, Elastic, custom) linked to ATT&CK techniques |
| test_detection_results | Per-test detection rule evaluation results (triggered / not triggered) |
| test_template_detection_rules | Template ↔ detection rule associations |
| defensive_techniques | MITRE D3FEND defensive techniques |
| defensive_technique_mappings | ATT&CK technique ↔ D3FEND defensive technique mappings |

Campaigns & Scheduling

| Table | Description |
| --- | --- |
| campaigns | Test campaign groupings with scheduling (recurring, weekly/monthly/quarterly) |
| campaign_tests | Ordered test assignments within campaigns with dependency support |
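
Dependency support between campaign tests implies that a new dependency must not close a loop (the campaign service is described as handling circular dependency prevention). One possible check is sketched below, assuming dependencies are held as a mapping from a test ID to the IDs it depends on. The function name and the data shape are hypothetical.

```python
def would_create_cycle(
    deps: dict[str, set[str]], test_id: str, depends_on: str
) -> bool:
    """Return True if adding the edge test_id -> depends_on would form a cycle.

    A cycle appears when test_id is already reachable from depends_on
    through existing dependency edges (or when a test depends on itself).
    """
    if test_id == depends_on:
        return True
    stack = [depends_on]
    seen: set[str] = set()
    while stack:
        current = stack.pop()
        if current == test_id:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(deps.get(current, ()))
    return False
```

An iterative depth-first search avoids recursion limits on long dependency chains.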

Intelligence & Actors

| Table | Description |
| --- | --- |
| threat_actors | MITRE CTI intrusion sets with aliases, country, motivation, JSONB targets |
| threat_actor_techniques | Threat actor ↔ ATT&CK technique mappings |
| intel_items | Threat intelligence items from RSS feeds |

Compliance

| Table | Description |
| --- | --- |
| compliance_frameworks | Compliance frameworks (e.g., NIST 800-53) |
| compliance_controls | Individual controls within a framework |
| compliance_control_mappings | Control ↔ ATT&CK technique mappings |

Operational

| Table | Description |
| --- | --- |
| coverage_snapshots | Point-in-time coverage status captures with aggregate metrics |
| snapshot_technique_states | Normalized per-technique state within a snapshot |
| audit_logs | System-wide audit trail with JSONB details |
| notifications | In-app notifications with read status |
| data_sources | External data source configuration and sync status |
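
Because each snapshot stores one normalized state per technique, comparing two snapshots reduces to a per-technique diff. The sketch below assumes each snapshot has been loaded into a mapping from technique ID to a status string. The function name and the status values in the usage example are hypothetical.

```python
from typing import Optional

# (old_status, new_status); None means the technique was absent in that snapshot.
Change = tuple[Optional[str], Optional[str]]


def compare_snapshots(
    before: dict[str, str], after: dict[str, str]
) -> dict[str, Change]:
    """Return only the techniques whose state differs between two snapshots."""
    changes: dict[str, Change] = {}
    for technique_id in sorted(before.keys() | after.keys()):
        old, new = before.get(technique_id), after.get(technique_id)
        if old != new:
            changes[technique_id] = (old, new)
    return changes


# Usage with made-up status values:
older = {"T1059": "not_covered", "T1003": "covered"}
newer = {"T1059": "covered", "T1003": "covered", "T1566": "partial"}
diff = compare_snapshots(older, newer)
```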

Key Relationships

Technique ──1:N── Test ──1:N── Evidence
    │                │
    │                ├── TestDetectionResult ──N:1── DetectionRule
    │                └── CampaignTest ──N:1── Campaign
    │
    ├── ThreatActorTechnique ──N:1── ThreatActor
    ├── DefensiveTechniqueMapping ──N:1── DefensiveTechnique
    ├── ComplianceControlMapping ──N:1── ComplianceControl ──N:1── ComplianceFramework
    └── SnapshotTechniqueState ──N:1── CoverageSnapshot

Test ──retest_of──▶ Test  (self-referential retest chain)
Campaign ──parent_campaign_id──▶ Campaign  (recurring execution history)

Backend Architecture

Layered Structure

routers/          ← HTTP endpoints (input validation, auth, response shaping)
  ↓
services/         ← Business logic (state machines, calculations, imports)
  ↓
models/           ← SQLAlchemy ORM models
  ↓
database.py       ← Engine + session management (lazy initialization)
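
Lazy initialization here means that importing the database module does not open a connection; the engine is created on first use and reused afterwards. The sketch below shows the pattern with the standard-library `sqlite3` module as a stand-in for the SQLAlchemy engine, so it is an illustration of the idea rather than the project's `database.py`. `DATABASE_PATH` and `get_connection` are hypothetical names.

```python
import sqlite3
from functools import lru_cache

# Hypothetical setting; the real backend targets PostgreSQL through SQLAlchemy.
DATABASE_PATH = ":memory:"


@lru_cache(maxsize=1)
def get_connection() -> sqlite3.Connection:
    """Create the connection on the first call and return the same one later."""
    return sqlite3.connect(DATABASE_PATH)
```

Deferring creation this way lets tests and tooling import the models without a reachable database.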

Services

| Service | Responsibility |
| --- | --- |
| test_workflow_service | Test state machine (draft → validated/rejected) with dual validation |
| scoring_service | 0–100 scoring for techniques, tactics, actors, organization |
| score_cache | In-memory TTL cache (5 min) for expensive score/metric calculations |
| operational_metrics_service | MTTD, MTTR, detection efficacy, alert fidelity, coverage velocity |
| snapshot_service | Coverage snapshot creation, temporal comparison, cleanup |
| campaign_service | Campaign CRUD, progress tracking, circular dependency prevention |
| campaign_scheduler_service | Recurring campaign execution (clone + schedule next run) |
| status_service | Technique status recalculation from test results |
| notification_service | In-app notification CRUD and state-change alerts |
| audit_service | Immutable audit trail logging |
| mitre_sync_service | MITRE ATT&CK sync via TAXII 2.0 / GitHub fallback |
| atomic_import_service | Atomic Red Team template import from GitHub |
| sigma_import_service | SigmaHQ detection rule import |
| elastic_import_service | Elastic detection rule import (TOML) |
| caldera_import_service | CALDERA ability import |
| lolbas_import_service | LOLBAS/GTFOBins template import |
| d3fend_import_service | MITRE D3FEND defensive technique import |
| threat_actor_import_service | MITRE CTI threat actor import (STIX) |
| compliance_import_service | NIST 800-53 ↔ ATT&CK mapping import |
| intel_service | RSS-based threat intelligence scanning |
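
The in-memory TTL cache used for score calculations can be pictured with a minimal sketch. It assumes a 5-minute (300 s) lifetime per entry and a single process; the class name `TTLCache`, its methods, and the injectable clock are hypothetical, and the sketch is not thread-safe.

```python
import time
from typing import Any, Callable, Hashable, Optional


class TTLCache:
    """Cache values in memory and recompute them once they are older than the TTL."""

    def __init__(
        self,
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[Hashable, tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling compute() on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]
        value = compute()
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: Optional[Hashable] = None) -> None:
        """Drop one entry, or every entry when no key is given."""
        if key is None:
            self._entries.clear()
        else:
            self._entries.pop(key, None)
```

A monotonic clock is used so that wall-clock adjustments cannot extend or shorten an entry's lifetime.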

Scheduled Jobs (APScheduler)

| Job | Schedule | Description |
| --- | --- | --- |
| MITRE Sync | Every 24h | Sync ATT&CK techniques from TAXII/GitHub |
| Intel Scan | Every 7 days | Scan RSS feeds for threat intelligence |
| Notification Cleanup | Every 24h | Remove old read notifications |
| Weekly Snapshot | Sundays 00:00 | Create a coverage snapshot and clean up old ones |
| Recurring Campaigns | Every 24h | Check and execute due recurring campaigns |
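
To make the "Sundays 00:00" schedule concrete, the sketch below computes the next fire time after a given moment. APScheduler calculates fire times itself, so this is only an illustration of the schedule, not code the backend needs. It assumes naive datetimes in the scheduler's local timezone (the timezone is not stated in this document), and `next_weekly_snapshot` is a hypothetical name.

```python
from datetime import datetime, timedelta


def next_weekly_snapshot(now: datetime) -> datetime:
    """Return the first Sunday 00:00 strictly after `now`."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    # datetime.weekday(): Monday is 0, Sunday is 6.
    days_ahead = (6 - midnight.weekday()) % 7
    candidate = midnight + timedelta(days=days_ahead)
    if candidate <= now:
        # Today is Sunday and midnight has already passed (or is exactly now).
        candidate += timedelta(days=7)
    return candidate
```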

Test Lifecycle (State Machine)

┌──────┐    ┌──────────────┐    ┌─────────────────┐    ┌───────────┐
│ DRAFT│───▶│RED_EXECUTING │───▶│ BLUE_EVALUATING  │───▶│ IN_REVIEW │
└──────┘    └──────────────┘    └─────────────────┘    └─────┬─────┘
                                                             │
                                         ┌───────────────────┤
                                         ▼                   ▼
                                   ┌──────────┐       ┌──────────┐
                                   │ REJECTED │       │VALIDATED │
                                   └────┬─────┘       └──────────┘
                                        │                    │
                                        └──▶ Back to DRAFT   ├──▶ Remediation
                                                             └──▶ Auto Re-test
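
The lifecycle in the diagram can be expressed as a transition table. The sketch below is one way to encode it, assuming that remediation and auto re-test are follow-up actions on a VALIDATED test rather than status changes. The names `Status`, `ALLOWED_TRANSITIONS`, and `transition` are hypothetical.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "DRAFT"
    RED_EXECUTING = "RED_EXECUTING"
    BLUE_EVALUATING = "BLUE_EVALUATING"
    IN_REVIEW = "IN_REVIEW"
    VALIDATED = "VALIDATED"
    REJECTED = "REJECTED"


# Each status maps to the statuses it may move to, following the diagram.
ALLOWED_TRANSITIONS: dict[Status, frozenset[Status]] = {
    Status.DRAFT: frozenset({Status.RED_EXECUTING}),
    Status.RED_EXECUTING: frozenset({Status.BLUE_EVALUATING}),
    Status.BLUE_EVALUATING: frozenset({Status.IN_REVIEW}),
    Status.IN_REVIEW: frozenset({Status.VALIDATED, Status.REJECTED}),
    Status.REJECTED: frozenset({Status.DRAFT}),
    Status.VALIDATED: frozenset(),  # assumption: no further status change
}


def transition(current: Status, target: Status) -> Status:
    """Return the target status, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Illegal transition: {current.name} -> {target.name}")
    return target
```

Keeping the rules in one table makes it straightforward to reject out-of-order requests at the service layer.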

Dual Validation in IN_REVIEW:

  • Red Lead votes approve/reject
  • Blue Lead votes approve/reject
  • Both approve → VALIDATED
  • Either rejects → REJECTED
  • One votes, other pending → stays IN_REVIEW
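
The voting rules above can be sketched as a small pure function. Each vote is `True` (approve), `False` (reject), or `None` (pending). The list does not say what happens when one lead rejects while the other is still pending, so the sketch assumes a single rejection is decisive. The function name `resolve_review` is hypothetical.

```python
from typing import Optional


def resolve_review(
    red_lead_vote: Optional[bool], blue_lead_vote: Optional[bool]
) -> str:
    """Map the two lead votes to the resulting test status."""
    # Assumption: a rejection wins even if the other lead has not voted yet.
    if red_lead_vote is False or blue_lead_vote is False:
        return "REJECTED"
    if red_lead_vote is True and blue_lead_vote is True:
        return "VALIDATED"
    # At least one vote is still pending and nobody has rejected.
    return "IN_REVIEW"
```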

Auto Re-testing: When remediation is completed on a validated test, the system automatically creates a follow-up retest (up to MAX_RETEST_COUNT = 3).
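
A minimal sketch of the retest cap follows. It assumes the limit counts how many retests already sit between a test and the original at the root of its `retest_of` chain. The `TestRecord` class and both function names are hypothetical; only `MAX_RETEST_COUNT = 3` and the `retest_of` link come from this document.

```python
from dataclasses import dataclass
from typing import Optional

MAX_RETEST_COUNT = 3


@dataclass
class TestRecord:
    id: int
    retest_of: Optional["TestRecord"] = None  # self-referential retest chain


def retest_depth(test: TestRecord) -> int:
    """Count how many retests precede this test in its chain (0 for an original)."""
    depth = 0
    while test.retest_of is not None:
        depth += 1
        test = test.retest_of
    return depth


def create_retest(test: TestRecord, next_id: int) -> Optional[TestRecord]:
    """Create a follow-up retest, or return None once the chain has hit the cap."""
    if retest_depth(test) >= MAX_RETEST_COUNT:
        return None
    return TestRecord(id=next_id, retest_of=test)
```

Under this reading, an original test can be followed by at most three retests; a fourth request returns `None`.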


Frontend Architecture

Key Technologies

  • React 19 with TypeScript
  • Vite 7 for bundling
  • Tailwind CSS v4 for styling
  • TanStack Query for server state management
  • TanStack Virtual for table virtualization
  • React Router v7 for routing
  • Recharts for charts and visualizations
  • Lucide React for icons

Page Lazy Loading

All pages except LoginPage and DashboardPage are lazy-loaded via React.lazy() with <Suspense> fallbacks, which keeps the initial bundle small.

Role-Based Navigation

The sidebar dynamically filters navigation items based on the current user's role:

| Section | Visible to |
| --- | --- |
| Dashboard | All roles |
| Executive Dashboard | admin, red_lead, blue_lead |
| ATT&CK Matrix | All roles |
| Tests (sub-menu) | All roles |
| Campaigns | All roles |
| Threat Actors | All roles |
| Compliance | All roles |
| Comparison | admin, red_lead, blue_lead |
| Reports | All roles |
| System (admin section) | admin only |
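
The visibility rules in the table amount to a lookup from section to allowed roles. The actual sidebar is a React component; the sketch below expresses only the filtering logic, in Python for consistency with the backend. The names `NAV_VISIBILITY` and `visible_sections` are hypothetical, while the role and section names come from this document.

```python
ALL_ROLES = frozenset(
    {"admin", "red_tech", "blue_tech", "red_lead", "blue_lead", "viewer"}
)
LEADS_AND_ADMIN = frozenset({"admin", "red_lead", "blue_lead"})

# Insertion order is preserved, so sections come back in sidebar order.
NAV_VISIBILITY: dict[str, frozenset[str]] = {
    "Dashboard": ALL_ROLES,
    "Executive Dashboard": LEADS_AND_ADMIN,
    "ATT&CK Matrix": ALL_ROLES,
    "Tests": ALL_ROLES,
    "Campaigns": ALL_ROLES,
    "Threat Actors": ALL_ROLES,
    "Compliance": ALL_ROLES,
    "Comparison": LEADS_AND_ADMIN,
    "Reports": ALL_ROLES,
    "System": frozenset({"admin"}),
}


def visible_sections(role: str) -> list[str]:
    """Return the navigation sections a role may see, in sidebar order."""
    return [name for name, roles in NAV_VISIBILITY.items() if role in roles]
```

Hiding a navigation item is a usability measure only; the corresponding endpoints still need their own authorization checks.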

Performance Optimizations

  • React.memo on HeatmapCell (renders 3000+ times in full matrix)
  • useMemo for expensive calculations and useCallback for stable callback props passed to memoized components
  • useDebounce hook for search inputs (300ms delay)
  • TanStack Virtual for large table virtualization (test templates, detection rules, audit logs)
  • Lazy loading for all non-critical page bundles