# Aegis — Architecture

## High-Level Overview

```
┌────────────────────┐       ┌─────────────────────┐
│   React Frontend   │──────▶│   FastAPI Backend   │
│  (Vite / TS / TW)  │ REST  │    (Python 3.11)    │
└────────────────────┘       └──────┬──────┬───────┘
                                    │      │
                          ┌─────────┘      └─────────┐
                          ▼                          ▼
                 ┌─────────────────┐        ┌──────────────────┐
                 │   PostgreSQL    │        │      MinIO       │
                 │  (Data Store)   │        │ (Object Storage) │
                 └─────────────────┘        └──────────────────┘
```

- **Frontend** — React 19 + TypeScript + Tailwind CSS v4 + TanStack Query
- **Backend** — FastAPI with SQLAlchemy ORM + Alembic migrations
- **Database** — PostgreSQL 15 with UUID primary keys and JSONB columns
- **Object Storage** — MinIO (S3-compatible) for evidence files
- **Scheduler** — APScheduler (in-process) for background jobs

---

## Database Schema

### Core Tables

| Table | Description |
|-------|-------------|
| `users` | User accounts with role-based access (admin, red_tech, blue_tech, red_lead, blue_lead, viewer) |
| `techniques` | MITRE ATT&CK techniques with coverage status, tactic, platforms (JSONB) |
| `tests` | Security tests with full Red/Blue workflow fields, dual validation, remediation, and retest chain |
| `test_templates` | Predefined test catalog from Atomic Red Team, Sigma, CALDERA, LOLBAS, custom |
| `evidences` | Evidence files separated by team (red/blue) with SHA256 integrity verification |

### Detection & Defense

| Table | Description |
|-------|-------------|
| `detection_rules` | Imported detection rules (Sigma, Elastic, custom) linked to ATT&CK techniques |
| `test_detection_results` | Per-test detection rule evaluation results (triggered / not triggered) |
| `test_template_detection_rules` | Template ↔ detection rule associations |
| `defensive_techniques` | MITRE D3FEND defensive techniques |
| `defensive_technique_mappings` | ATT&CK technique ↔ D3FEND defensive technique mappings |

### Campaigns & Scheduling

| Table | Description |
|-------|-------------|
| `campaigns` | Test campaign groupings with scheduling (recurring, weekly/monthly/quarterly) |
| `campaign_tests` | Ordered test assignments within campaigns with dependency support |

### Intelligence & Actors

| Table | Description |
|-------|-------------|
| `threat_actors` | MITRE CTI intrusion sets with aliases, country, motivation, JSONB targets |
| `threat_actor_techniques` | Threat actor ↔ ATT&CK technique mappings |
| `intel_items` | Threat intelligence items from RSS feeds |

### Compliance

| Table | Description |
|-------|-------------|
| `compliance_frameworks` | Compliance frameworks (e.g., NIST 800-53) |
| `compliance_controls` | Individual controls within a framework |
| `compliance_control_mappings` | Control ↔ ATT&CK technique mappings |

### Operational

| Table | Description |
|-------|-------------|
| `coverage_snapshots` | Point-in-time coverage status captures with aggregate metrics |
| `snapshot_technique_states` | Normalized per-technique state within a snapshot |
| `audit_logs` | System-wide audit trail with JSONB details |
| `notifications` | In-app notifications with read status |
| `data_sources` | External data source configuration and sync status |

### Key Relationships

```
Technique ──1:N── Test ──1:N── Evidence
    │              │
    │              ├── TestDetectionResult ──N:1── DetectionRule
    │              └── CampaignTest ──N:1── Campaign
    │
    ├── ThreatActorTechnique ──N:1── ThreatActor
    ├── DefensiveTechniqueMapping ──N:1── DefensiveTechnique
    ├── ComplianceControlMapping ──N:1── ComplianceControl ──N:1── ComplianceFramework
    └── SnapshotTechniqueState ──N:1── CoverageSnapshot

Test ──retest_of──▶ Test (self-referential retest chain)
Campaign ──parent_campaign_id──▶ Campaign (recurring execution history)
```

---

## Backend Architecture

### Layered Structure

```
routers/       ← HTTP endpoints (input validation, auth, response shaping)
    ↓
services/      ← Business logic (state machines, calculations, imports)
    ↓
models/        ← SQLAlchemy ORM models
    ↓
database.py    ← Engine + session management (lazy initialization)
```

### Services

#### Business Logic Services

| Service | Responsibility |
|---------|---------------|
| `test_workflow_service` | Test state machine (draft → validated/rejected) with dual validation |
| `test_crud_service` | Test CRUD, query logic, permission validation |
| `scoring_service` | 0–100 scoring for techniques, tactics, actors, organization |
| `scoring_config_service` | DB-persisted scoring weights with validation |
| `score_cache` | In-memory TTL cache (5 min) for expensive score/metric calculations |
| `operational_metrics_service` | MTTD, MTTR, detection efficacy, alert fidelity, coverage velocity |
| `metrics_query_service` | Dashboard aggregation queries |
| `snapshot_service` | Coverage snapshot creation, temporal comparison, cleanup |
| `campaign_crud_service` | Campaign CRUD, lifecycle, scheduling |
| `campaign_service` | Campaign progress tracking, circular dependency prevention |
| `campaign_scheduler_service` | Recurring campaign execution (clone + schedule next run) |
| `status_service` | Technique status recalculation from test results |
| `coverage_report_service` | Coverage report generation and CSV export |
| `compliance_service` | Compliance framework analysis and gap detection |
| `detection_rule_service` | Detection rule queries, auto-association, evaluation |
| `threat_actor_service` | Threat actor queries, coverage, gap analysis |
| `evidence_service` | Evidence permission validation and queries |
| `heatmap_service` | ATT&CK Navigator layer generation |
| `user_service` | User CRUD, role validation, password hashing |
| `audit_query_service` | Paginated audit log queries and distinct lookups |
| `audit_service` | Immutable audit trail logging (write-only) |
| `data_source_service` | Data source CRUD, sync dispatch, statistics |
| `notification_service` | In-app notification CRUD and state-change alerts |
| `intel_service` | RSS-based threat intelligence scanning |

#### Import Services (all satisfy `ImportService` protocol)

| Service | Responsibility |
|---------|---------------|
| `mitre_sync_service` | MITRE ATT&CK sync via TAXII 2.0 / GitHub fallback |
| `atomic_import_service` | Atomic Red Team template import from GitHub |
| `sigma_import_service` | SigmaHQ detection rule import |
| `elastic_import_service` | Elastic detection rule import (TOML) |
| `caldera_import_service` | CALDERA ability import |
| `lolbas_import_service` | LOLBAS/GTFOBins template import |
| `d3fend_import_service` | MITRE D3FEND defensive technique import |
| `threat_actor_import_service` | MITRE CTI threat actor import (STIX) |
| `compliance_import_service` | NIST 800-53 ↔ ATT&CK mapping import |

### Domain Layer

```
domain/
├── entities/                  # Rich domain entities with business logic
│   ├── technique.py           # TechniqueEntity with status recalculation
│   ├── campaign.py            # CampaignEntity with lifecycle state machine
│   └── compliance.py          # ComplianceFrameworkEntity with coverage calculation
├── value_objects/             # Immutable value types
│   ├── mitre_id.py            # MITRE ATT&CK ID validation
│   └── scoring_weights.py     # Scoring weights (sum=100, non-negative)
├── ports/                     # Interfaces (Protocol contracts)
│   ├── repositories/          # TechniqueRepository, TestRepository
│   └── import_service.py      # ImportService protocol + IMPORT_REGISTRY
├── errors.py                  # Domain exceptions (EntityNotFoundError, etc.)
├── enums.py                   # TestState, TechniqueStatus, TestResult
├── test_entity.py             # TestEntity with state machine + domain events
└── unit_of_work.py            # UnitOfWork context manager
```

### Scheduled Jobs (APScheduler)

| Job | Schedule | Description |
|-----|----------|-------------|
| MITRE Sync | Every 24h | Sync ATT&CK techniques from TAXII/GitHub |
| Intel Scan | Every 7 days | Scan RSS feeds for threat intelligence |
| Notification Cleanup | Every 24h | Remove old read notifications |
| Weekly Snapshot | Sundays 00:00 | Create coverage snapshot + cleanup old ones |
| Recurring Campaigns | Every 24h | Check and execute due recurring campaigns |

---

## Test Lifecycle (State Machine)

```
┌──────┐    ┌──────────────┐    ┌─────────────────┐    ┌───────────┐
│ DRAFT│───▶│RED_EXECUTING │───▶│ BLUE_EVALUATING │───▶│ IN_REVIEW │
└──────┘    └──────────────┘    └─────────────────┘    └─────┬─────┘
                                                             │
                                         ┌───────────────────┤
                                         ▼                   ▼
                                    ┌──────────┐        ┌──────────┐
                                    │ REJECTED │        │VALIDATED │
                                    └────┬─────┘        └──────────┘
                                         │                   │
                                         └──▶ Back to DRAFT  ├──▶ Remediation
                                                             └──▶ Auto Re-test
```

**Dual Validation in IN_REVIEW:**
- Red Lead votes approve/reject
- Blue Lead votes approve/reject
- Both approve → VALIDATED
- Either rejects → REJECTED
- One votes, other pending → stays IN_REVIEW

**Auto Re-testing:** When remediation is completed on a validated test, the system automatically creates a follow-up retest (up to `MAX_RETEST_COUNT` = 3).

---

## Frontend Architecture

### Key Technologies

- **React 19** with TypeScript
- **Vite 7** for bundling
- **Tailwind CSS v4** for styling
- **TanStack Query** for server state management
- **TanStack Virtual** for table virtualization
- **React Router v7** for routing
- **Recharts** for charts and visualizations
- **Lucide React** for icons

### Page Lazy Loading

All pages except `LoginPage` and `DashboardPage` are lazy-loaded via `React.lazy()` with `<Suspense>` fallbacks, which keeps the initial bundle small.

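To illustrate the load-on-first-use behaviour that lazy loading relies on, here is a minimal, dependency-free TypeScript sketch. It deliberately does not use React: `loadOnce`, `PageModule`, and `fakeCampaignsImport` are hypothetical names introduced for illustration, and the fake loader stands in for a dynamic `import()` of a page bundle. `React.lazy()` accepts a loader of this same shape (a function returning a promise of a module with a default export).

```typescript
// Hypothetical shape of a page module: a default export that renders the page.
type PageModule = { default: () => string };
type Loader = () => Promise<PageModule>;

// Wrap a loader so the underlying load runs at most once, on first use.
// Later calls reuse the same promise instead of loading again.
function loadOnce(loader: Loader): Loader {
  let pending: Promise<PageModule> | undefined;
  return () => {
    if (pending === undefined) {
      pending = loader();
    }
    return pending;
  };
}

// Stand-in for `() => import("./pages/CampaignsPage")` (assumed path).
// The counter only exists to make the load-once behaviour observable.
let importCalls = 0;
const fakeCampaignsImport: Loader = () => {
  importCalls += 1;
  return Promise.resolve({ default: () => "CampaignsPage" });
};

const loadCampaignsPage = loadOnce(fakeCampaignsImport);

// Nothing is loaded until the page is first requested.
loadCampaignsPage().then((mod) => console.log(mod.default())); // logs "CampaignsPage"
```

In the application itself, the loader would presumably be a dynamic `import()` passed to `React.lazy()`, while `LoginPage` and `DashboardPage` stay as static imports so they are available immediately.
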
### Role-Based Navigation

The sidebar dynamically filters navigation items based on the current user's role:

| Section | Visible to |
|---------|-----------|
| Dashboard | All roles |
| Executive Dashboard | admin, red_lead, blue_lead |
| ATT&CK Matrix | All roles |
| Tests (sub-menu) | All roles |
| Campaigns | All roles |
| Threat Actors | All roles |
| Compliance | All roles |
| Comparison | admin, red_lead, blue_lead |
| Reports | All roles |
| System (admin section) | admin only |

### Performance Optimizations

- **React.memo** on `HeatmapCell` (renders 3000+ times in full matrix)
- **useMemo** / **useCallback** for expensive calculations in memoized components
- **useDebounce** hook for search inputs (300ms delay)
- **TanStack Virtual** for large table virtualization (test templates, detection rules, audit logs)
- **Lazy loading** for all non-critical page bundles
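
Returning to the role-based navigation table above, the filtering can be sketched as a data table plus a single filter function. This is an illustrative TypeScript sketch, not the project's actual sidebar code: `NavItem`, `NAV_ITEMS`, `LEADERSHIP`, and `visibleNavItems` are assumed names, and only the role names and section visibility are taken from this document.

```typescript
// Role names as listed for the `users` table.
type Role = "admin" | "red_tech" | "blue_tech" | "red_lead" | "blue_lead" | "viewer";

interface NavItem {
  label: string;
  // When omitted, the item is visible to all roles.
  allowedRoles?: readonly Role[];
}

// Hypothetical helper constant for the sections restricted to admin and leads.
const LEADERSHIP: readonly Role[] = ["admin", "red_lead", "blue_lead"];

// Mirrors the "Visible to" column of the navigation table.
const NAV_ITEMS: readonly NavItem[] = [
  { label: "Dashboard" },
  { label: "Executive Dashboard", allowedRoles: LEADERSHIP },
  { label: "ATT&CK Matrix" },
  { label: "Tests" },
  { label: "Campaigns" },
  { label: "Threat Actors" },
  { label: "Compliance" },
  { label: "Comparison", allowedRoles: LEADERSHIP },
  { label: "Reports" },
  { label: "System", allowedRoles: ["admin"] },
];

// Return the labels of the sections a given role may see, in sidebar order.
function visibleNavItems(role: Role): string[] {
  return NAV_ITEMS
    .filter((item) => item.allowedRoles === undefined || item.allowedRoles.indexOf(role) !== -1)
    .map((item) => item.label);
}

console.log(visibleNavItems("viewer"));
```

For a `viewer`, this yields the seven sections marked "All roles"; a `red_lead` additionally sees Executive Dashboard and Comparison, and only `admin` sees System. Keeping visibility as data makes the sidebar easy to audit against the table, though hiding a link is not a substitute for enforcing permissions on the backend.
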