feat(phase-33): final polish V3 - navigation, performance, and documentation (T-238 to T-240)
220
docs/ARCHITECTURE.md
Normal file
@@ -0,0 +1,220 @@
# Aegis — Architecture

## High-Level Overview

```
┌────────────────────┐       ┌─────────────────────┐
│   React Frontend   │──────▶│   FastAPI Backend   │
│  (Vite / TS / TW)  │ REST  │    (Python 3.11)    │
└────────────────────┘       └──────┬──────┬───────┘
                                    │      │
                          ┌─────────┘      └─────────┐
                          ▼                          ▼
                 ┌─────────────────┐        ┌─────────────────┐
                 │   PostgreSQL    │        │      MinIO      │
                 │  (Data Store)   │        │ (Object Storage)│
                 └─────────────────┘        └─────────────────┘
```

- **Frontend** — React 19 + TypeScript + Tailwind CSS v4 + TanStack Query
- **Backend** — FastAPI with SQLAlchemy ORM + Alembic migrations
- **Database** — PostgreSQL 15 with UUID primary keys and JSONB columns
- **Object Storage** — MinIO (S3-compatible) for evidence files
- **Scheduler** — APScheduler (in-process) for background jobs

---

## Database Schema

### Core Tables

| Table | Description |
|-------|-------------|
| `users` | User accounts with role-based access (admin, red_tech, blue_tech, red_lead, blue_lead, viewer) |
| `techniques` | MITRE ATT&CK techniques with coverage status, tactic, platforms (JSONB) |
| `tests` | Security tests with full Red/Blue workflow fields, dual validation, remediation, and retest chain |
| `test_templates` | Predefined test catalog from Atomic Red Team, Sigma, CALDERA, LOLBAS, custom |
| `evidences` | Evidence files separated by team (red/blue) with SHA256 integrity verification |
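
The SHA256 integrity check on evidence files can be illustrated with a minimal sketch. The function names `sha256_hex` and `verify_evidence` are hypothetical and not taken from the Aegis codebase; the sketch assumes the digest is stored as a hex string when the file is uploaded.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of an evidence payload."""
    return hashlib.sha256(data).hexdigest()


def verify_evidence(data: bytes, stored_digest: str) -> bool:
    """Check a payload against the digest recorded at upload time (assumed hex)."""
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(sha256_hex(data), stored_digest.lower())


payload = b"example evidence bytes"
digest = sha256_hex(payload)
```

Re-running `verify_evidence` when a file is downloaded from object storage would detect any modification made after upload.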

### Detection & Defense

| Table | Description |
|-------|-------------|
| `detection_rules` | Imported detection rules (Sigma, Elastic, custom) linked to ATT&CK techniques |
| `test_detection_results` | Per-test detection rule evaluation results (triggered / not triggered) |
| `test_template_detection_rules` | Template ↔ detection rule associations |
| `defensive_techniques` | MITRE D3FEND defensive techniques |
| `defensive_technique_mappings` | ATT&CK technique ↔ D3FEND defensive technique mappings |

### Campaigns & Scheduling

| Table | Description |
|-------|-------------|
| `campaigns` | Test campaign groupings with scheduling (recurring, weekly/monthly/quarterly) |
| `campaign_tests` | Ordered test assignments within campaigns with dependency support |
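
Dependency support between campaign tests implies that a new dependency must not close a loop. The following is a minimal sketch of such a check, assuming dependencies are held as a mapping from a test ID to the set of test IDs it depends on. That representation and the name `would_create_cycle` are illustrative assumptions, not the actual `campaign_service` implementation.

```python
def would_create_cycle(deps: dict[str, set[str]], test_id: str, depends_on: str) -> bool:
    """Return True if making `test_id` depend on `depends_on` closes a loop."""
    # Walk the existing prerequisites of `depends_on`; reaching `test_id`
    # means the new edge would make the test (indirectly) depend on itself.
    stack, seen = [depends_on], set()
    while stack:
        current = stack.pop()
        if current == test_id:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(deps.get(current, ()))
    return False


# Hypothetical data: b runs after a, and c runs after b.
deps = {"b": {"a"}, "c": {"b"}}
```

With this data, asking whether `a` may depend on `c` returns `True`, because `c` already depends on `a` through `b`.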

### Intelligence & Actors

| Table | Description |
|-------|-------------|
| `threat_actors` | MITRE CTI intrusion sets with aliases, country, motivation, JSONB targets |
| `threat_actor_techniques` | Threat actor ↔ ATT&CK technique mappings |
| `intel_items` | Threat intelligence items from RSS feeds |

### Compliance

| Table | Description |
|-------|-------------|
| `compliance_frameworks` | Compliance frameworks (e.g., NIST 800-53) |
| `compliance_controls` | Individual controls within a framework |
| `compliance_control_mappings` | Control ↔ ATT&CK technique mappings |

### Operational

| Table | Description |
|-------|-------------|
| `coverage_snapshots` | Point-in-time coverage status captures with aggregate metrics |
| `snapshot_technique_states` | Normalized per-technique state within a snapshot |
| `audit_logs` | System-wide audit trail with JSONB details |
| `notifications` | In-app notifications with read status |
| `data_sources` | External data source configuration and sync status |

### Key Relationships

```
Technique ──1:N── Test ──1:N── Evidence
    │               │
    │               ├── TestDetectionResult ──N:1── DetectionRule
    │               └── CampaignTest ──N:1── Campaign
    │
    ├── ThreatActorTechnique ──N:1── ThreatActor
    ├── DefensiveTechniqueMapping ──N:1── DefensiveTechnique
    ├── ComplianceControlMapping ──N:1── ComplianceControl ──N:1── ComplianceFramework
    └── SnapshotTechniqueState ──N:1── CoverageSnapshot

Test ──retest_of──▶ Test (self-referential retest chain)
Campaign ──parent_campaign_id──▶ Campaign (recurring execution history)
```

---

## Backend Architecture

### Layered Structure

```
routers/        ← HTTP endpoints (input validation, auth, response shaping)
    ↓
services/       ← Business logic (state machines, calculations, imports)
    ↓
models/         ← SQLAlchemy ORM models
    ↓
database.py     ← Engine + session management (lazy initialization)
```

### Services

| Service | Responsibility |
|---------|---------------|
| `test_workflow_service` | Test state machine (draft → validated/rejected) with dual validation |
| `scoring_service` | 0–100 scoring for techniques, tactics, actors, organization |
| `score_cache` | In-memory TTL cache (5 min) for expensive score/metric calculations |
| `operational_metrics_service` | MTTD, MTTR, detection efficacy, alert fidelity, coverage velocity |
| `snapshot_service` | Coverage snapshot creation, temporal comparison, cleanup |
| `campaign_service` | Campaign CRUD, progress tracking, circular dependency prevention |
| `campaign_scheduler_service` | Recurring campaign execution (clone + schedule next run) |
| `status_service` | Technique status recalculation from test results |
| `notification_service` | In-app notification CRUD and state-change alerts |
| `audit_service` | Immutable audit trail logging |
| `mitre_sync_service` | MITRE ATT&CK sync via TAXII 2.0 / GitHub fallback |
| `atomic_import_service` | Atomic Red Team template import from GitHub |
| `sigma_import_service` | SigmaHQ detection rule import |
| `elastic_import_service` | Elastic detection rule import (TOML) |
| `caldera_import_service` | CALDERA ability import |
| `lolbas_import_service` | LOLBAS/GTFOBins template import |
| `d3fend_import_service` | MITRE D3FEND defensive technique import |
| `threat_actor_import_service` | MITRE CTI threat actor import (STIX) |
| `compliance_import_service` | NIST 800-53 ↔ ATT&CK mapping import |
| `intel_service` | RSS-based threat intelligence scanning |
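
To make the `score_cache` row concrete, here is a minimal sketch of an in-memory TTL cache with a 5-minute default. The class `TTLCache`, its method names, and the injectable `clock` parameter are assumptions made for illustration; the real module may be structured differently and may need locking if it is accessed from several threads.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[str, tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        """Return the cached value for `key`, recomputing it once it has expired."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # entry is still fresh
        value = compute()
        self._entries[key] = (now, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry, for example after a test result changes a score."""
        self._entries.pop(key, None)
```

A caller would wrap an expensive calculation as `cache.get_or_compute("org_score", compute_org_score)`, where `compute_org_score` is a hypothetical zero-argument function.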

### Scheduled Jobs (APScheduler)

| Job | Schedule | Description |
|-----|----------|-------------|
| MITRE Sync | Every 24h | Sync ATT&CK techniques from TAXII/GitHub |
| Intel Scan | Every 7 days | Scan RSS feeds for threat intelligence |
| Notification Cleanup | Every 24h | Remove old read notifications |
| Weekly Snapshot | Sundays 00:00 | Create coverage snapshot + cleanup old ones |
| Recurring Campaigns | Every 24h | Check and execute due recurring campaigns |
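
The daily "Recurring Campaigns" check needs a way to decide whether a campaign is due. The sketch below uses only the standard library and the three frequencies named in the `campaigns` table (weekly, monthly, quarterly). The function names and the rule of clamping to the last day of a shorter month are assumptions, not documented Aegis behavior.

```python
import calendar
from datetime import date, timedelta


def add_months(day: date, months: int) -> date:
    """Shift a date by whole months, clamping to the month's last day if needed."""
    index = day.year * 12 + (day.month - 1) + months
    year, month_index = divmod(index, 12)
    last_day = calendar.monthrange(year, month_index + 1)[1]
    return date(year, month_index + 1, min(day.day, last_day))


def next_run(last_run: date, frequency: str) -> date:
    """Compute the next scheduled date for a recurring campaign."""
    if frequency == "weekly":
        return last_run + timedelta(weeks=1)
    if frequency == "monthly":
        return add_months(last_run, 1)
    if frequency == "quarterly":
        return add_months(last_run, 3)
    raise ValueError(f"unknown frequency: {frequency}")


def is_due(last_run: date, frequency: str, today: date) -> bool:
    """True once `today` has reached the next scheduled date."""
    return today >= next_run(last_run, frequency)
```

Under the clamping assumption, a monthly campaign last run on 31 January 2024 would next run on 29 February 2024.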

---

## Test Lifecycle (State Machine)

```
┌──────┐    ┌──────────────┐    ┌─────────────────┐    ┌───────────┐
│ DRAFT│───▶│RED_EXECUTING │───▶│ BLUE_EVALUATING │───▶│ IN_REVIEW │
└──────┘    └──────────────┘    └─────────────────┘    └─────┬─────┘
                                                             │
                                         ┌───────────────────┤
                                         ▼                   ▼
                                    ┌──────────┐        ┌──────────┐
                                    │ REJECTED │        │VALIDATED │
                                    └────┬─────┘        └────┬─────┘
                                         │                   │
                                         └──▶ Back to DRAFT  ├──▶ Remediation
                                                             └──▶ Auto Re-test
```

**Dual Validation in IN_REVIEW:**

- Red Lead votes approve/reject
- Blue Lead votes approve/reject
- Both approve → VALIDATED
- Either rejects → REJECTED
- One votes, other pending → stays IN_REVIEW
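
The voting rules above can be sketched as a small pure function. The names `Vote` and `resolve_review` are hypothetical. The sketch also assumes that a single rejection is decisive even while the other vote is pending, which is one reading of "Either rejects → REJECTED"; the actual `test_workflow_service` may wait for both votes.

```python
from enum import Enum
from typing import Optional


class Vote(Enum):
    APPROVE = "approve"
    REJECT = "reject"


def resolve_review(red_lead: Optional[Vote], blue_lead: Optional[Vote]) -> str:
    """Map the two lead votes (None = not yet cast) to the resulting state."""
    if Vote.REJECT in (red_lead, blue_lead):
        return "REJECTED"  # assumption: one rejection decides immediately
    if red_lead is Vote.APPROVE and blue_lead is Vote.APPROVE:
        return "VALIDATED"
    return "IN_REVIEW"  # at least one vote is still pending
```

Keeping the decision in a function with no database access makes each row of the rule list easy to unit test.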

**Auto Re-testing:** When remediation is completed on a validated test, the system automatically creates a follow-up retest (up to `MAX_RETEST_COUNT` = 3).
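
A minimal sketch of the retest limit, using the self-referential `retest_of` link from the schema. The `SecurityTest` dataclass and the helper names are hypothetical, and the sketch assumes the limit counts retests along one chain starting from the original test.

```python
from dataclasses import dataclass
from typing import Optional

MAX_RETEST_COUNT = 3


@dataclass
class SecurityTest:
    id: int
    retest_of: Optional["SecurityTest"] = None  # previous test in the chain


def retest_depth(test: SecurityTest) -> int:
    """Count how many retests precede and include this test in its chain."""
    depth = 0
    while test.retest_of is not None:
        depth += 1
        test = test.retest_of
    return depth


def create_retest(test: SecurityTest, next_id: int) -> Optional[SecurityTest]:
    """Create a follow-up retest, or return None once the chain limit is hit."""
    if retest_depth(test) >= MAX_RETEST_COUNT:
        return None
    return SecurityTest(id=next_id, retest_of=test)
```

Starting from an original test, three successive calls each produce a retest and a fourth returns `None`.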

---

## Frontend Architecture

### Key Technologies

- **React 19** with TypeScript
- **Vite 7** for bundling
- **Tailwind CSS v4** for styling
- **TanStack Query** for server state management
- **TanStack Virtual** for table virtualization
- **React Router v7** for routing
- **Recharts** for charts and visualizations
- **Lucide React** for icons

### Page Lazy Loading

All pages except `LoginPage` and `DashboardPage` are lazy-loaded via `React.lazy()` with `<Suspense>` fallbacks, which keeps them out of the initial bundle.

### Role-Based Navigation

The sidebar dynamically filters navigation items based on the current user's role:

| Section | Visible to |
|---------|-----------|
| Dashboard | All roles |
| Executive Dashboard | admin, red_lead, blue_lead |
| ATT&CK Matrix | All roles |
| Tests (sub-menu) | All roles |
| Campaigns | All roles |
| Threat Actors | All roles |
| Compliance | All roles |
| Comparison | admin, red_lead, blue_lead |
| Reports | All roles |
| System (admin section) | admin only |
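
The visibility table can be modelled as data plus a filter. The actual sidebar is a React component; the rule is shown here in Python only to keep this document's examples in one language, and the names `NAV_ITEMS` and `visible_sections` are hypothetical.

```python
ALL_ROLES = frozenset(
    {"admin", "red_tech", "blue_tech", "red_lead", "blue_lead", "viewer"}
)
LEADS_AND_ADMIN = frozenset({"admin", "red_lead", "blue_lead"})
ADMIN_ONLY = frozenset({"admin"})

# (section label, roles allowed to see it), in sidebar order.
NAV_ITEMS = [
    ("Dashboard", ALL_ROLES),
    ("Executive Dashboard", LEADS_AND_ADMIN),
    ("ATT&CK Matrix", ALL_ROLES),
    ("Tests", ALL_ROLES),
    ("Campaigns", ALL_ROLES),
    ("Threat Actors", ALL_ROLES),
    ("Compliance", ALL_ROLES),
    ("Comparison", LEADS_AND_ADMIN),
    ("Reports", ALL_ROLES),
    ("System", ADMIN_ONLY),
]


def visible_sections(role: str) -> list[str]:
    """Return the sidebar sections a given role may see, in display order."""
    return [label for label, roles in NAV_ITEMS if role in roles]
```

Hiding a navigation item is a usability measure only; the corresponding API endpoints would still need their own role checks.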

### Performance Optimizations

- **React.memo** on `HeatmapCell` (renders 3000+ times in full matrix)
- **useMemo** / **useCallback** for expensive calculations in memoized components
- **useDebounce** hook for search inputs (300ms delay)
- **TanStack Virtual** for large table virtualization (test templates, detection rules, audit logs)
- **Lazy loading** for all non-critical page bundles