refactor(detection-rules): extract query/business logic to detection_rule_service, router is thin HTTP adapter

2026-02-19 17:39:31 +01:00
parent d305db8794
commit 560fc0c9f0
7 changed files with 5853 additions and 282 deletions

@@ -0,0 +1,228 @@
# Aegis — C4 Container Diagram (Level 2)
> **Author:** Architecture review
> **Date:** February 11, 2026
> **Notation:** C4 Model — Level 2 (Container Diagram)
---
## Diagram
```mermaid
C4Container
title Aegis — Container Diagram (C4 Level 2)
%% ─── Actors ─────────────────────────────────────────────────────
Person(security_team, "Security Team", "Red/Blue Technicians, Red/Blue Leads, Viewers — interact with the platform via browser")
Person(admin, "Administrator", "Manages users, triggers data syncs, configures scoring weights, reviews audit logs")
%% ─── System Boundary: Aegis Platform ────────────────────────────
Container_Boundary(aegis, "Aegis Platform") {
Container(frontend, "Frontend SPA", "React 19, TypeScript, Vite, Tailwind CSS, Nginx", "Single-page application served by Nginx in production. Provides dashboards, ATT&CK heatmaps, test workflows, campaign management, compliance views, and report exports. Proxies /api/ requests to backend.")
Container(backend, "Backend API", "Python 3.11, FastAPI, Uvicorn, SQLAlchemy", "REST API serving 21 router modules under /api/v1. Handles authentication (JWT + HttpOnly cookies), RBAC authorization (6 roles), Red/Blue test workflows, scoring engine, heatmap generation, report building, and CRUD for all domain entities. Rate-limited with SlowAPI.")
Container(scheduler, "Background Scheduler", "APScheduler (in-process)", "Runs inside the backend process as a BackgroundScheduler thread. Executes 5 periodic jobs: MITRE ATT&CK sync (24h), intel scan (7d), notification cleanup (24h), weekly coverage snapshot (Sundays 00:00), recurring campaigns check (24h). Each job manages its own DB session.")
ContainerDb(postgres, "PostgreSQL 15", "PostgreSQL, Alpine", "Primary relational data store. Holds techniques, tests, users, campaigns, threat actors, detection rules, compliance mappings, audit logs, notifications, coverage snapshots, and scoring configuration. Schema managed by Alembic migrations (18 versions).")
ContainerDb(minio, "MinIO", "MinIO (S3-compatible), Alpine", "Object storage for Red/Blue team evidence files (screenshots, logs, PCAPs, documents). Stores files in the 'evidence' bucket. Backend generates presigned URLs for secure direct downloads.")
}
%% ─── External Systems ───────────────────────────────────────────
System_Ext(mitre_taxii, "MITRE ATT&CK TAXII Server", "STIX/TAXII 2.0 feed providing Enterprise ATT&CK techniques and tactics catalog")
System_Ext(mitre_cti, "MITRE CTI GitHub", "STIX 2.0 bundles: ATT&CK techniques (fallback), threat actors (intrusion-sets), actor-technique relationships")
System_Ext(d3fend, "MITRE D3FEND API", "REST API providing defensive techniques and ATT&CK-to-D3FEND countermeasure mappings")
System_Ext(atomic, "Atomic Red Team", "GitHub repository with 1500+ atomic test YAML files mapped to ATT&CK techniques")
System_Ext(sigma, "SigmaHQ", "GitHub repository with Sigma detection rules in YAML, tagged with ATT&CK technique IDs")
System_Ext(elastic, "Elastic Detection Rules", "GitHub repository with Elastic SIEM rules in TOML format with MITRE threat mappings")
System_Ext(caldera, "MITRE CALDERA", "GitHub repository with CALDERA abilities in YAML, organized by tactic")
System_Ext(lolbas, "LOLBAS / GTFOBins", "GitHub repositories for Living Off The Land binaries (Windows) and GTFOBins (Linux)")
%% ─── Planned Systems ────────────────────────────────────────────
System_Ext(github_actions, "GitHub Actions (Planned)", "Future CI/CD: lint, type check, pytest, Docker build, deploy")
System_Ext(artifactory, "Artifactory (Planned)", "Future artifact repository for Docker images and versioned build artifacts")
%% ─── Relationships: Users → Containers ──────────────────────────
Rel(security_team, frontend, "Uses", "HTTPS / Browser")
Rel(admin, frontend, "Uses", "HTTPS / Browser")
%% ─── Relationships: Frontend → Backend ──────────────────────────
Rel(frontend, backend, "Proxies API requests to", "HTTP (Nginx reverse proxy to backend:8000/api/)")
%% ─── Relationships: Backend → Data Stores ───────────────────────
Rel(backend, postgres, "Reads/writes domain data", "TCP/5432, SQLAlchemy ORM")
Rel(backend, minio, "Uploads/downloads evidence files", "HTTP/9000, boto3 S3 API")
%% ─── Relationships: Scheduler → Data Stores ─────────────────────
Rel(scheduler, postgres, "Reads/writes via own sessions", "TCP/5432, SQLAlchemy")
%% ─── Relationships: Backend/Scheduler → External Sources ────────
Rel(scheduler, mitre_taxii, "Syncs techniques every 24h", "TAXII 2.0 / HTTPS")
Rel(backend, mitre_cti, "Imports threat actors + fallback sync", "HTTPS, ZIP download")
Rel(backend, d3fend, "Imports D3FEND techniques and mappings", "REST API / HTTPS")
Rel(backend, atomic, "Imports atomic test templates", "HTTPS, ZIP ~40MB")
Rel(backend, sigma, "Imports Sigma detection rules", "HTTPS, ZIP download")
Rel(backend, elastic, "Imports Elastic detection rules", "HTTPS, ZIP download")
Rel(backend, caldera, "Imports CALDERA abilities", "HTTPS, ZIP download")
Rel(backend, lolbas, "Imports LOLBAS and GTFOBins", "HTTPS, ZIP download")
%% ─── Relationships: Planned ─────────────────────────────────────
Rel(github_actions, backend, "Builds, tests, deploys (planned)", "HTTPS")
Rel(github_actions, frontend, "Builds, deploys (planned)", "HTTPS")
Rel(github_actions, artifactory, "Pushes Docker images (planned)", "HTTPS")
UpdateLayoutConfig($c4ShapeInRow="3", $c4BoundaryInRow="1")
```
---
## Container Responsibilities
### Frontend SPA
| Attribute | Detail |
|-----------|--------|
| **Technology** | React 19, TypeScript 5.9, Vite 7.3, Tailwind CSS 4, React Router 7 |
| **Runtime (Dev)** | Node 20 + Vite dev server on port 5173 |
| **Runtime (Prod)** | Nginx Alpine serving static build artifacts on port 80 |
| **State Management** | AuthContext (React Context) + TanStack React Query for server state |
| **API Communication** | Axios client with `withCredentials: true` (HttpOnly JWT cookie) |
| **Security** | CSP headers, X-Frame-Options: DENY, X-Content-Type-Options: nosniff, gzip compression |
| **Responsibilities** | Render UI (21 pages, 30+ components), route protection by role, lazy loading, API proxy via Nginx `/api/` → `backend:8000/api/` |
### Backend API
| Attribute | Detail |
|-----------|--------|
| **Technology** | Python 3.11, FastAPI, Uvicorn, SQLAlchemy, Alembic, Pydantic |
| **Runtime** | Uvicorn ASGI server on port 8000 (behind Nginx proxy) |
| **API Surface** | 21 routers, 80+ endpoints under `/api/v1` |
| **Auth** | JWT (HS256) in HttpOnly cookie, bcrypt passwords, in-memory token blacklist |
| **RBAC** | 6 roles: admin, red_tech, blue_tech, red_lead, blue_lead, viewer |
| **Rate Limiting** | SlowAPI (5 req/min on login) |
| **Error Handling** | Global handlers for ValidationError → 400, SQLAlchemyError → 500, Exception → 500 |
| **Responsibilities** | All business logic, test workflow state machine, scoring engine, heatmap generation, report building, CRUD, data import orchestration, audit logging |
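
The role check described in the RBAC row can be illustrated with a small, framework-free sketch. The six role names come from the table above; the `require_roles` decorator, the `Forbidden` exception, and the two example policies are hypothetical and only show the shape of such a check, not how Aegis actually wires it into FastAPI.

```python
from enum import Enum
from functools import wraps


class Role(str, Enum):
    """The six roles listed in the RBAC row."""
    ADMIN = "admin"
    RED_TECH = "red_tech"
    BLUE_TECH = "blue_tech"
    RED_LEAD = "red_lead"
    BLUE_LEAD = "blue_lead"
    VIEWER = "viewer"


class Forbidden(Exception):
    """Hypothetical error; an HTTP layer would translate it to a 403."""


def require_roles(*allowed: Role):
    """Reject callers whose role is not in `allowed` (illustrative helper)."""
    allowed_set = frozenset(allowed)

    def decorator(func):
        @wraps(func)
        def wrapper(current_role: Role, *args, **kwargs):
            if current_role not in allowed_set:
                raise Forbidden(
                    f"role {current_role.value!r} may not call {func.__name__}"
                )
            return func(current_role, *args, **kwargs)

        return wrapper

    return decorator


# Assumed policy: only administrators may trigger a data sync.
@require_roles(Role.ADMIN)
def trigger_sync(current_role: Role) -> str:
    return "sync started"


# Assumed policy: red-team roles and administrators may record red results.
@require_roles(Role.ADMIN, Role.RED_TECH, Role.RED_LEAD)
def submit_red_result(current_role: Role, test_id: int) -> str:
    return f"red result recorded for test {test_id}"
```

Keeping the allowed roles next to each operation makes the policy easy to audit; in a FastAPI codebase the same idea is usually expressed as a dependency rather than a decorator.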
### Background Scheduler
| Attribute | Detail |
|-----------|--------|
| **Technology** | APScheduler `BackgroundScheduler` (runs in-process within backend) |
| **Lifecycle** | Starts on FastAPI lifespan startup, shuts down on app shutdown |
| **Session Model** | Each job creates and closes its own `SessionLocal()` instance |
| **Registered Jobs** | See table below |

| Job | Trigger | Frequency | Action |
|-----|---------|-----------|--------|
| `mitre_sync` | Interval | Every 24 hours | Syncs ATT&CK techniques via TAXII 2.0 (fallback: GitHub ZIP) |
| `intel_scan` | Interval | Every 7 days | Scans threat intelligence sources for new indicators |
| `notification_cleanup` | Interval | Every 24 hours | Deletes read notifications older than 90 days |
| `weekly_snapshot` | Cron | Sundays at 00:00 | Creates coverage snapshot, cleans up old ones (keeps last 52) |
| `recurring_campaigns` | Interval | Every 24 hours | Checks and spawns due recurring test campaigns |
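
As a rough sketch, the job table above could be expressed as data and registered in one loop. The job IDs, trigger types, and frequencies are taken from the table; `JobSpec`, `register_jobs`, and the placeholder job bodies are hypothetical names introduced for illustration. The only APScheduler call assumed is `add_job(func, trigger, id=..., replace_existing=..., **trigger_args)`.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class JobSpec:
    job_id: str
    trigger: str                  # "interval" or "cron"
    trigger_args: Dict[str, Any]  # passed through to the trigger
    func: Callable[[], None]


def _placeholder_job(name: str) -> Callable[[], None]:
    """Stand-in for the real job body, which is not shown in this document."""
    def job() -> None:
        print(f"running {name}")
    job.__name__ = name
    return job


JOBS: List[JobSpec] = [
    JobSpec("mitre_sync", "interval", {"hours": 24},
            _placeholder_job("mitre_sync")),
    JobSpec("intel_scan", "interval", {"days": 7},
            _placeholder_job("intel_scan")),
    JobSpec("notification_cleanup", "interval", {"hours": 24},
            _placeholder_job("notification_cleanup")),
    JobSpec("weekly_snapshot", "cron",
            {"day_of_week": "sun", "hour": 0, "minute": 0},
            _placeholder_job("weekly_snapshot")),
    JobSpec("recurring_campaigns", "interval", {"hours": 24},
            _placeholder_job("recurring_campaigns")),
]


def register_jobs(scheduler, jobs: List[JobSpec] = JOBS) -> None:
    """Register every job on an APScheduler-compatible scheduler."""
    for spec in jobs:
        scheduler.add_job(
            spec.func,
            spec.trigger,
            id=spec.job_id,
            replace_existing=True,
            **spec.trigger_args,
        )


# Possible wiring (requires the apscheduler package):
#   from apscheduler.schedulers.background import BackgroundScheduler
#   scheduler = BackgroundScheduler()
#   register_jobs(scheduler)
#   scheduler.start()      # on FastAPI lifespan startup
#   scheduler.shutdown()   # on app shutdown
```

Because the scheduler runs in-process, running several backend workers would start one scheduler per worker, so each job would fire once per worker unless that is guarded against.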
### PostgreSQL 15
| Attribute | Detail |
|-----------|--------|
| **Image** | `postgres:15-alpine` |
| **Database** | `attackdb` |
| **Schema Management** | Alembic with 18 migration versions |
| **Connection** | `postgresql://user:pass@postgres:5432/attackdb` via SQLAlchemy |
| **Volumes** | Named volume `aegis_postgres_data_prod` for persistence |
| **Health Check** | `pg_isready` every 5 seconds |
| **Data Stored** | Techniques (ATT&CK), tests (Red/Blue workflow), users, campaigns, threat actors, detection rules (Sigma/Elastic), D3FEND mappings, compliance frameworks, audit logs, notifications, coverage snapshots, scoring config, intel items, data sources, evidence metadata |
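
The connection string in the table can be assembled from configuration rather than hard-coded. The sketch below is an assumption about how that might look: the environment variable names and the `build_database_url` helper are hypothetical, and the defaults simply mirror the placeholder URL shown above. Credentials are URL-encoded so that characters such as `@` or `/` in a password do not break parsing.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote_plus


def build_database_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Build a SQLAlchemy-style PostgreSQL URL from environment settings.

    The variable names below are assumptions, not Aegis's actual settings.
    """
    env = os.environ if env is None else env
    user = quote_plus(env.get("POSTGRES_USER", "user"))
    password = quote_plus(env.get("POSTGRES_PASSWORD", "pass"))
    host = env.get("POSTGRES_HOST", "postgres")
    port = int(env.get("POSTGRES_PORT", "5432"))
    database = env.get("POSTGRES_DB", "attackdb")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"
```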
### MinIO (S3-compatible)
| Attribute | Detail |
|-----------|--------|
| **Image** | `minio/minio:latest` |
| **Ports** | 9000 (S3 API), 9001 (admin console) |
| **Bucket** | `evidence` (auto-created on backend startup) |
| **Access** | Via boto3 S3 API from backend |
| **Volumes** | Named volume `aegis_minio_data_prod` for persistence |
| **Responsibilities** | Store Red/Blue team evidence files (screenshots, logs, PCAPs). Backend generates time-limited presigned URLs for secure browser downloads. |
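
A minimal sketch of the presigned-download step, assuming the backend holds a boto3 S3 client pointed at MinIO. The bucket name and port come from the table; the `minio` hostname, the helper names, and the 15-minute default expiry are assumptions. `boto3.client(...)` and `generate_presigned_url(...)` are standard boto3 calls; boto3 is imported lazily so the URL helper can be exercised with any object exposing the same method.

```python
EVIDENCE_BUCKET = "evidence"


def make_minio_client(access_key: str, secret_key: str,
                      endpoint_url: str = "http://minio:9000"):
    """Create an S3 client for MinIO (hostname is an assumed service name)."""
    import boto3  # lazy import: only needed when a real client is created

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def evidence_download_url(s3_client, object_key: str,
                          expires_in: int = 900) -> str:
    """Return a time-limited GET URL for one evidence object."""
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": EVIDENCE_BUCKET, "Key": object_key},
        ExpiresIn=expires_in,  # lifetime in seconds
    )
```

A presigned URL embeds the endpoint host the client was created with, so that host has to be reachable from the browser for direct downloads to work.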
### GitHub Actions (Planned)
| Attribute | Detail |
|-----------|--------|
| **Status** | Not yet implemented — no `.github/workflows/` directory exists |
| **Planned Scope** | Lint (ruff/flake8), type check (mypy), unit/integration tests (pytest), Docker image build, deploy to staging/production |
| **Integration** | Would trigger on push/PR to main branch |
| **Artifact Flow** | Build Docker images → push to Artifactory → deploy via compose |
### Artifactory (Planned)
| Attribute | Detail |
|-----------|--------|
| **Status** | Not yet implemented — no integration code exists |
| **Planned Scope** | Docker image registry for versioned backend/frontend images |
| **Integration** | Receive images from GitHub Actions CI pipeline, serve to production deploy |
---
## Network Topology
```
            Internet
                │ HTTPS (:80 / :443)
       ┌─────────────────┐
       │    Frontend     │
       │  Nginx + React  │
       │       :80       │
       └────────┬────────┘
┌───────────────┼──────────────────────────────┐
│               │  aegis-network (bridge)      │
│               │ /api/ proxy                  │
│               ▼                              │
│      ┌─────────────────┐                     │
│      │   Backend API   │◄── Scheduler        │
│      │ FastAPI/Uvicorn │ (in-process thread) │
│      │      :8000      │                     │
│      └─┬─────────────┬─┘                     │
│        │             │                       │
│        ▼             ▼                       │
│   ┌──────────┐   ┌───────┐                   │
│   │PostgreSQL│   │ MinIO │                   │
│   │  :5432   │   │ :9000 │                   │
│   └──────────┘   └───────┘                   │
└───────────────┬──────────────────────────────┘
                │ HTTPS (outbound only)
                ▼
      External Data Sources
  (MITRE, SigmaHQ, Elastic, etc.)
```
---
## Data Flow Summary
| Flow | Path | Protocol | Notes |
|------|------|----------|-------|
| User → UI | Browser → Nginx | HTTPS | Static SPA assets, gzip compressed, 1-year cache for static files |
| UI → API | Nginx → Uvicorn | HTTP (internal) | Reverse proxy with 300s timeout for long sync operations |
| API → DB | Uvicorn → PostgreSQL | TCP/5432 | SQLAlchemy ORM, request-scoped sessions via `get_db()` |
| API → Storage | Uvicorn → MinIO | HTTP/9000 | boto3 S3 API, presigned URLs for downloads |
| Scheduler → DB | APScheduler thread → PostgreSQL | TCP/5432 | Independent sessions per job, created/closed in try/finally |
| Scheduler → External | APScheduler thread → MITRE TAXII | HTTPS | Scheduled sync every 24h, fallback to GitHub ZIP |
| Admin → External | API on-demand → GitHub repos | HTTPS | ZIP download triggered by admin via `/api/v1/system/*` endpoints |
| Health Check | Docker → Backend `/health` | HTTP (internal) | Restricted to private IPs via Nginx `allow/deny` directives |
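
The two session scopes in the table (request-scoped via `get_db()` and job-scoped in `try/finally`) can be sketched side by side. `FakeSession` is a stand-in so the example runs without a database; in the real application `SessionLocal` would be a SQLAlchemy session factory. The `run_job` helper and its commit/rollback handling are assumptions, since the table only states that sessions are created and closed in `try/finally`.

```python
class FakeSession:
    """Stand-in for a SQLAlchemy Session; records what happened to it."""

    def __init__(self):
        self.committed = False
        self.rolled_back = False
        self.closed = False

    def commit(self):
        self.committed = True

    def rollback(self):
        self.rolled_back = True

    def close(self):
        self.closed = True


SessionLocal = FakeSession  # assumption: a sessionmaker(...) in the real app


def get_db():
    """Request-scoped session: yielded to the handler, closed afterwards."""
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()


def run_job(job):
    """Job-scoped session: each scheduled job opens and closes its own."""
    db = SessionLocal()
    try:
        job(db)
        db.commit()
    except Exception:
        db.rollback()
        raise
    finally:
        db.close()
```

Keeping the scheduler off the request-scoped dependency matters because jobs run in a background thread with no request to tie a session's lifetime to.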