docs: update architecture analysis and tech debt docs to reflect resolved items

2026-02-18 19:27:52 +01:00
parent f41b8fd8c2
commit 0b65f51d1c
5 changed files with 3131 additions and 0 deletions
@@ -0,0 +1,684 @@
 # Aegis — Deep Architectural Analysis
 > **Author:** Automated architecture review  
 > **Date:** February 11, 2026 (updated February 18, 2026)  
 > **Scope:** Backend (FastAPI/Python), Frontend (React/TypeScript), Infrastructure (Docker)
 >
 > **Note:** Sections marked with ✅ reflect changes implemented since the initial analysis.
 ---
 ## Table of Contents
 1. [Current Architecture](#1-current-architecture)
 2. [Coupling Analysis](#2-coupling-analysis)
 3. [Business Logic vs Infrastructure Separation](#3-business-logic-vs-infrastructure-separation)
 4. [SOLID Evaluation](#4-solid-evaluation)
 5. [Architectural Risks](#5-architectural-risks)
 6. [Refactor Proposal Towards Clean Architecture](#6-refactor-proposal-towards-clean-architecture)
 7. [Executive Summary](#7-executive-summary)
 ---
 ## 1. Current Architecture
 ### 1.1. Classification: Layered Monolith with Incomplete Service Layer
 Aegis follows a **layered monolithic architecture** deployed as two containers (backend + frontend) with a **partial and inconsistent** level of separation. It is not Clean Architecture, nor Hexagonal, nor microservices.
 ```
 ┌─────────────────────────────────────────────────┐
 │                   FRONTEND                       │
 │         React 19 + TypeScript + Vite             │
 │  ┌──────────┐  ┌──────────┐  ┌───────────────┐  │
 │  │  Pages   │→ │ API Layer│→ │ Axios Client  │  │
 │  │(21 pages)│  │(22 mods) │  │(HttpOnly JWT) │  │
 │  └──────────┘  └──────────┘  └───────────────┘  │
 └────────────────────────┬────────────────────────┘
                         │ HTTP/REST
 ┌────────────────────────▼────────────────────────┐
 │                   BACKEND                        │
 │              FastAPI + SQLAlchemy                 │
 │                                                  │
 │  ┌─────────────────────────────────────────────┐ │
 │  │              Router Layer (21 routers)       │ │
 │  │  Contains: validation, queries, partial     │ │
 │  │  business logic, serialization, auditing    │ │
 │  └────────┬──────────────────┬─────────────────┘ │
 │           │                  │                    │
 │  ┌────────▼───────┐  ┌──────▼──────────────────┐ │
 │  │ Service Layer  │  │   Direct DB Access       │ │
 │  │ (20 services)  │  │   (SQLAlchemy queries    │ │
 │  │ Partial: only  │  │    inside routers)       │ │
 │  │ for workflows  │  │                          │ │
 │  └────────┬───────┘  └──────┬──────────────────┘ │
 │           │                  │                    │
 │  ┌────────▼──────────────────▼─────────────────┐ │
 │  │          Model Layer (18 models)             │ │
 │  │     SQLAlchemy ORM — Anemic Domain Models    │ │
 │  └────────────────────┬────────────────────────┘ │
 │                       │                          │
 │  ┌────────────────────▼────────────────────────┐ │
 │  │          Database Layer                      │ │
 │  │  PostgreSQL + MinIO (evidence storage)       │ │
 │  └─────────────────────────────────────────────┘ │
 └──────────────────────────────────────────────────┘
 ```
 ### 1.2. Actual Distribution of Responsibilities
 | Layer | Files | Actual Responsibility |
 |-------|-------|----------------------|
 | **Routers** | 21 files | Validation, auth, direct SQL queries, partial business logic, serialization, CSV/JSON report generation |
 | **Services** | 20 files | Complex workflows (test state machine, scoring, notifications), external data source imports |
 | **Models** | 18 files | ORM table definitions — purely anemic (no behavior) |
 | **Schemas** | 10 files | Pydantic DTOs for request/response |
 | **Database** | 1 file | Session factory and `get_db()` generator |
 ### 1.3. The Core Problem: Two Coexisting Patterns
 Aegis has a **split architectural personality**:
 **Pattern A — Router-as-Controller (direct CRUD):**  
 Routers like `techniques.py`, `evidence.py`, `users.py`, `audit.py`, `reports.py`, `heatmap.py`, `metrics.py`, `detection_rules.py`, `threat_actors.py` execute SQLAlchemy queries directly:
 ```python
 # techniques.py — direct query inside the router
 query = db.query(Technique)
 if tactic is not None:
    query = query.filter(Technique.tactic == tactic)
 return query.order_by(Technique.mitre_id).all()
 ```
 **Pattern B — Router-delegates-to-Service:**  
 Routers like `tests.py`, `scores.py`, `notifications.py`, `campaigns.py` delegate to services:
 ```python
 # tests.py — delegates to workflow service
 wf_start_execution(db=db, test=test, user=current_user)
 ```
 **The result:** There is no clear contract about where logic lives. A new developer cannot predict whether to look for logic in the router or in a service.
 ---
 ## 2. Coupling Analysis
 ### 2.1. Coupling Matrix
 ```
              Routers    Services    Models    Database    Schemas    Config
 Routers         —         MEDIUM     HIGH      HIGH        HIGH      LOW
 Services       LOW          —        HIGH      HIGH        NONE      MEDIUM
 Models         NONE       NONE        —        HIGH        NONE      NONE
 Schemas        NONE       NONE       LOW        —          NONE      NONE
 Database       NONE       NONE       NONE       —          NONE      LOW
 ```
 ### 2.2. Router ↔ Model — HIGH COUPLING (Critical)
 Routers import and use SQLAlchemy models directly. **11 out of 21 routers** execute SQL queries without an intermediary:
 | Router | Directly Imported Models | Queries Inside Router |
 |--------|--------------------------|----------------------|
 | `techniques.py` | Technique | `db.query(Technique).filter(...)` |
 | `evidence.py` | Evidence, Test | `db.query(Evidence).filter(...)` |
 | `users.py` | User | `db.query(User).filter(...)` |
 | `audit.py` | AuditLog | `db.query(AuditLog).filter(...)` |
 | `reports.py` | Technique, Test | `db.query(Technique)...`, `db.query(Test)...` |
 | `heatmap.py` | Technique, Test, ThreatActor, DetectionRule, Campaign, DefensiveTechniqueMapping | Multiple complex queries |
 | `metrics.py` | Technique, Test | Aggregations with `func.count` |
 | `detection_rules.py` | DetectionRule, TestDetectionResult | Direct CRUD |
 | `threat_actors.py` | ThreatActor, ThreatActorTechnique, Technique | Queries with joins |
 | `data_sources.py` | DataSource, Technique, Test | CRUD + stats queries |
 | `compliance.py` | ComplianceFramework, ComplianceControl, etc. | Compliance queries |
 **Impact:** Changing a table schema requires modifying both the model and every router that queries it directly. There is no indirection.
 ### 2.3. Router ↔ Database — HIGH COUPLING
 All routers receive `db: Session = Depends(get_db)` and operate with the SQLAlchemy session directly. This means:
 - Routers know the ORM (`db.query`, `db.add`, `db.commit`, `joinedload`)
 - Routers handle transactions implicitly
 - There is no persistence abstraction — migrating from SQLAlchemy to another ORM or raw queries would require rewriting **all** routers
 ### 2.4. Service ↔ Model/Database — HIGH COUPLING
 Services also access SQLAlchemy directly:
 ```python
 # scoring_service.py
 all_tests = db.query(Test).filter(Test.technique_id == technique.id).all()
 # notification_service.py
 notif = db.query(Notification).filter(...).first()
 ```
 Services do not use repositories or abstractions — they are essentially functions that orchestrate queries and logic.
 ### 2.5. Service ↔ Service — MEDIUM COUPLING
 Inter-service coupling exists:
 - `test_workflow_service` → `audit_service` + `notification_service`
 - `scoring_service` reads from `settings` directly (mutable global config)
 - `campaign_scheduler_service` → `campaign_service`
 There is no dependency injection between services — everything is direct imports.
 ### 2.6. Service ↔ Framework — ✅ RESOLVED (was HIGH COUPLING)
 ~~Domain services import `HTTPException` from FastAPI.~~
 **Update (Feb 18):** `test_workflow_service.py` now raises domain exceptions (`InvalidOperationError`, `InvalidStateTransition`) from `app.domain.exceptions`. The `middleware/error_handler.py` maps these to HTTP responses automatically. Services no longer import `HTTPException`.
 ```python
 # Current: domain/errors.py exceptions mapped by middleware
 raise InvalidStateTransition(current_state=..., target_state=..., entity_type="Test")
 # middleware/error_handler.py → 400 Bad Request automatically
 ```
 ### 2.7. Frontend ↔ Backend — LOW COUPLING (Correct)
 Communication is via REST API with aligned but independent types (`types/models.ts` vs `schemas/*.py`). The frontend uses Axios with interceptors — good decoupling.
 ---
 ## 3. Business Logic vs Infrastructure Separation
 ### 3.1. Diagnosis: INSUFFICIENT SEPARATION
 | Aspect | Status | Detail |
 |--------|--------|--------|
 | **Workflow logic** | PARTIAL | `test_workflow_service.py` correctly encapsulates the state machine. It is the best designed service. |
 | **Scoring** | PARTIAL | `scoring_service.py` encapsulates calculations but accesses DB directly and reads `settings` as mutable global state. |
 | **CRUD** | NOT SEPARATED | CRUD operations live in routers, mixed with HTTP concerns. |
 | **Report generation** | NOT SEPARATED | `reports.py` (router) builds complex CSVs and JSONs with inline queries of 50+ lines. |
 | **Heatmap/visualization** | NOT SEPARATED | `heatmap.py` (router) has ~500 lines with all ATT&CK Navigator mapping logic embedded. |
 | **Metrics** | NOT SEPARATED | `metrics.py` and `operational_metrics.py` (routers) have complex aggregation queries. |
 | **Data import** | WELL SEPARATED | The 8 import services (`atomic_import_service`, `sigma_import_service`, etc.) are correctly isolated. |
 | **Notifications** | WELL SEPARATED | `notification_service.py` encapsulates all logic. |
 | **Auditing** | WELL SEPARATED | `audit_service.py` is a pure `log_action()` function. |
 ### 3.2. Anemic Model (Anti-pattern)
 SQLAlchemy models are purely declarative — they have no business methods:
 ```python
 # models/test.py — columns only, zero behavior
 class Test(Base):
    __tablename__ = "tests"
    id = Column(UUID, primary_key=True)
    state = Column(Enum(TestState))
    # ... more columns
    # Missing: can_transition(), validate(), calculate_score()
 ```
 Logic that should be in domain models (business validations, state transitions, calculations) is scattered across routers and services.
 ### 3.3. Infrastructure Bleeding Into Logic
 | Infrastructure | Where It Appears Inappropriately |
 |---------------|--------------------------------|
 | `SQLAlchemy Session` | Inside domain services (scoring, workflow, notifications) |
 | `FastAPI HTTPException` | Inside domain services (test_workflow_service) |
 | `MinIO/boto3` | `storage.py` is well isolated, but called from routers directly |
 | `APScheduler` | Directly coupled in `jobs/mitre_sync_job.py` with `SessionLocal()` |
 ---
 ## 4. SOLID Evaluation
 ### 4.1. Single Responsibility Principle (SRP) — PARTIAL VIOLATION
 | Component | Compliant? | Issue |
 |-----------|-----------|-------|
 | `heatmap.py` (router) | NO | 528 lines — HTTP handling + query building + color mapping + Navigator JSON serialization + export logic |
 | `reports.py` (router) | NO | HTTP handling + aggregation queries + CSV generation + JSON formatting |
 | `tests.py` (router) | PARTIAL | Delegates workflow but maintains CRUD, template instantiation, timeline queries |
 | `scoring_service.py` | PARTIAL | Scoring + mutable global config reading + direct queries |
 | `test_workflow_service.py` | YES | Single responsibility: test state machine |
 | `notification_service.py` | YES | Single responsibility: notification management |
 | `audit_service.py` | YES | Single responsibility: audit logging |
 **Verdict:** Well-isolated services comply with SRP. "Fat routers" flagrantly violate it.
 ### 4.2. Open/Closed Principle (OCP) — VIOLATION
 - **Scoring weights:** Scoring weights are read from `settings` (mutable global object). The `scores.py` router allows **mutating `settings` directly at runtime** via a PATCH endpoint. This is a global change without persistence that affects all requests.
 - **Heatmap layers:** Each heatmap type is a separate endpoint with hardcoded logic. Adding a new layer type requires modifying the router.
 - **Import services:** Each data source is a separate service (`atomic_import_service`, `sigma_import_service`, etc.) without a common interface. Adding a new source requires creating a new service AND modifying `data_sources.py` and `system.py`.
 - **Test states:** The state machine is well defined in `VALID_TRANSITIONS`, but adding a new state requires modifying the dictionary AND potentially all services that read `TestState`.
 ### 4.3. Liskov Substitution Principle (LSP) — N/A (Partial)
 There is no significant inheritance or polymorphism in the backend. Services are functions, not classes. There are no interfaces or abstract classes. **Does not directly apply**, but the absence of formal contracts (protocols/ABCs) is a symptom of not being designed for extensibility.
 ### 4.4. Interface Segregation Principle (ISP) — VIOLATION
 - No interfaces (`Protocol` or `ABC`) exist anywhere in the project.
 - Services expose loose functions, not contracts.
 - Routers depend on complete services when they only use one or two functions.
 - The `Settings` object is a monolithic entity with ~15 properties injected as a global.
 ### 4.5. Dependency Inversion Principle (DIP) — ✅ PARTIALLY RESOLVED (was SEVERE VIOLATION)
 **Update (Feb 18):** Protocol interfaces and abstractions now exist:
 ```python
 # domain/ports/repositories/ — Protocol interfaces
 class TechniqueRepository(Protocol):
    def find_by_id(self, technique_id: UUID) -> TechniqueEntity | None: ...
    def save(self, technique: TechniqueEntity) -> TechniqueEntity: ...
 # dependencies/repositories.py — FastAPI Depends() wiring
 def get_technique_repository(db=Depends(get_db)) -> SATechniqueRepository: ...
 ```
 - **Domain layer** has zero framework imports (no FastAPI, no SQLAlchemy).
 - **Repository ports** define contracts; infrastructure implements them.
 - `test_workflow_service.py` now uses domain exceptions instead of `HTTPException`.
 - `UnitOfWork` manages transactions.
 **Remaining:** Some services still use direct imports for `audit_service`, `notification_service`. Full DIP adoption is incremental.
 ---
 ## 5. Architectural Risks
 ### 5.1. CRITICAL RISK: God Routers
 | Router | Lines | Complexity |
 |--------|-------|------------|
 | `tests.py` | 664 | 15+ endpoints, CRUD + workflow + template instantiation |
 | `heatmap.py` | 528 | 5 endpoints, color logic, Navigator export |
 | `campaigns.py` | ~400+ | CRUD + scheduling + threat actor generation |
 | `reports.py` | 273 | 4 endpoints with complex aggregation queries |
 | `compliance.py` | ~350+ | CRUD + import + gap analysis + CSV export |
 These routers are **Fat Controllers** — they contain logic that should be in services, repositories, or domain objects.
 ### 5.2. ~~CRITICAL RISK: In-Memory Token Blacklist~~ ✅ RESOLVED
 **Update (Feb 18):** The token blacklist is now Redis-backed via `infrastructure/redis_client.py`. Tokens are stored with TTL matching expiration. Shared across all workers and survives restarts.
 ### 5.3. HIGH RISK: Mutable Settings at Runtime
 ```python
 # scores.py — direct mutation of global settings
 settings.SCORING_WEIGHT_TESTS = body.weight_tests
 settings.SCORING_WEIGHT_DETECTION_RULES = body.weight_detection_rules
 ```
 - Changes do not persist between restarts.
 - A server restart loses custom scoring configuration.
 - Thread-unsafe if FastAPI runs with multiple workers.
 - Violates the configuration immutability principle.
 ### 5.4. ~~HIGH RISK: No Repository Layer~~ ✅ PARTIALLY RESOLVED
 **Update (Feb 18):** Repository ports and implementations now exist:
 - `domain/ports/repositories/` — Protocol interfaces for `TechniqueRepository` and `TestRepository`.
 - `infrastructure/persistence/repositories/` — SQLAlchemy implementations (`SATechniqueRepository`, `SATestRepository`) with batch query methods.
 - `dependencies/repositories.py` — FastAPI `Depends()` wiring.
 **Remaining:** Old routers still use direct `db.query()`. Migration is incremental — new endpoints use repositories, old ones coexist.
 ### 5.5. ~~HIGH RISK: No CI/CD~~ ✅ RESOLVED
 **Update (Feb 18):** GitHub Actions CI pipeline exists at `.github/workflows/ci.yml`:
 - Runs `ruff` lint + `pytest` on every push/PR.
 - Uses PostgreSQL + Redis service containers (production-like environment).
 - Local validation via `scripts/agent_validate_backend.sh`.
 ### 5.6. MEDIUM RISK: Background Jobs with Own Sessions
 ```python
 # mitre_sync_job.py
 db = SessionLocal()
 try:
    sync_mitre(db)
 finally:
    db.close()
 ```
 Background jobs create sessions outside the request lifecycle. This is technically correct, but:
 - No robust error handling (no retry mechanism).
 - No observability (no structured logging).
 - No dead letter queue for failed jobs.
 ### 5.7. ~~MEDIUM RISK: Anemic Models~~ ✅ PARTIALLY RESOLVED
 **Update (Feb 18):** Rich domain entities now exist alongside ORM models:
 - `domain/test_entity.py` — Full state machine with business logic, domain events, dual validation, timers.
 - `domain/entities/technique.py` — Status recalculation, review lifecycle, MITRE ID validation.
 - `domain/value_objects/` — `MitreId`, `ScoringWeights` (immutable, validated).
 - ORM models remain anemic by design (persistence mapping only). Business logic lives in domain entities.
 **Remaining:** Campaign, ComplianceFramework, ThreatActor still lack domain entity counterparts.
 ### 5.8. ~~MEDIUM RISK: No Explicit Transaction Management~~ ✅ PARTIALLY RESOLVED
 **Update (Feb 18):** A `UnitOfWork` context manager exists at `domain/unit_of_work.py` with explicit `commit()`, `rollback()`, and `flush()`. Used by `test_workflow_service.py` which explicitly states "The caller (router) is responsible for committing the session via the Unit of Work pattern."
 **Remaining:** Some services like `audit_service.py` still call `db.commit()` directly. Needs incremental migration.
 ### 5.9. LOW RISK: No Semantic API Versioning
 The API is under `/api/v1` but there is no mechanism to support v2 without duplicating entire routers.
 ---
 ## 6. Refactor Proposal Towards Clean Architecture
 ### 6.1. Target Structure
 ```
 backend/
 ├── app/
 │   ├── main.py                          # FastAPI setup (minimal)
 │   ├── config.py                        # Settings (immutable)
 │   │
 │   ├── domain/                          # ★ DOMAIN LAYER (no external dependencies)
 │   │   ├── entities/                    # Entities with behavior
 │   │   │   ├── test.py                  # Test entity with can_transition(), validate()
 │   │   │   ├── technique.py             # Technique with calculate_status()
 │   │   │   ├── campaign.py              # Campaign with add_test(), activate()
 │   │   │   └── ...
 │   │   ├── value_objects/               # Immutable value objects
 │   │   │   ├── score.py                 # TechniqueScore, OrganizationScore
 │   │   │   ├── test_state.py            # TestState with valid transitions
 │   │   │   └── mitre_id.py              # MitreId with validation
 │   │   ├── exceptions.py               # Domain exceptions (NOT HTTPException)
 │   │   │   # InvalidTransitionError, EntityNotFoundError, etc.
 │   │   ├── events.py                   # Domain events
 │   │   │   # TestValidated, TestRejected, CampaignCompleted
 │   │   └── ports/                       # ★ INTERFACES (ABCs / Protocols)
 │   │       ├── repositories/
 │   │       │   ├── test_repository.py   # ABC: find_by_id(), save(), list_by_technique()
 │   │       │   ├── technique_repository.py
 │   │       │   ├── campaign_repository.py
 │   │       │   └── ...
 │   │       ├── services/
 │   │       │   ├── storage_port.py      # ABC: upload_file(), get_presigned_url()
 │   │       │   ├── notification_port.py # ABC: send_notification()
 │   │       │   └── event_bus_port.py    # ABC: publish(event)
 │   │       └── auth/
 │   │           └── token_service_port.py
 │   │
 │   ├── application/                     # ★ APPLICATION LAYER (use cases)
 │   │   ├── use_cases/
 │   │   │   ├── tests/
 │   │   │   │   ├── create_test.py       # CreateTestUseCase
 │   │   │   │   ├── start_execution.py   # StartExecutionUseCase
 │   │   │   │   ├── submit_red.py
 │   │   │   │   ├── validate_test.py
 │   │   │   │   └── get_retest_chain.py
 │   │   │   ├── scoring/
 │   │   │   │   ├── calculate_technique_score.py
 │   │   │   │   └── calculate_organization_score.py
 │   │   │   ├── campaigns/
 │   │   │   │   ├── create_campaign.py
 │   │   │   │   └── generate_from_threat_actor.py
 │   │   │   ├── heatmap/
 │   │   │   │   ├── generate_coverage_layer.py
 │   │   │   │   └── export_navigator.py
 │   │   │   └── reports/
 │   │   │       ├── generate_coverage_report.py
 │   │   │       └── export_coverage_csv.py
 │   │   ├── dto/                         # Input/Output DTOs for use cases
 │   │   │   ├── test_dto.py
 │   │   │   └── ...
 │   │   └── interfaces/                  # Application-level ports
 │   │       └── unit_of_work.py          # ABC: UnitOfWork with commit/rollback
 │   │
 │   ├── infrastructure/                  # ★ INFRASTRUCTURE LAYER (implementations)
 │   │   ├── persistence/
 │   │   │   ├── orm/                     # SQLAlchemy models (mapping only)
 │   │   │   │   ├── test_model.py
 │   │   │   │   ├── technique_model.py
 │   │   │   │   └── ...
 │   │   │   ├── repositories/            # Concrete implementations
 │   │   │   │   ├── sqlalchemy_test_repository.py
 │   │   │   │   ├── sqlalchemy_technique_repository.py
 │   │   │   │   └── ...
 │   │   │   ├── unit_of_work.py          # SQLAlchemy UoW implementation
 │   │   │   └── database.py              # Engine, session factory
 │   │   ├── storage/
 │   │   │   └── minio_storage.py         # Implements StoragePort
 │   │   ├── external/                    # Import services
 │   │   │   ├── mitre_sync.py
 │   │   │   ├── atomic_import.py
 │   │   │   ├── sigma_import.py
 │   │   │   └── ...
 │   │   ├── auth/
 │   │   │   ├── jwt_service.py           # Implements TokenServicePort
 │   │   │   └── token_blacklist.py       # Redis-backed blacklist
 │   │   ├── notifications/
 │   │   │   └── db_notification_service.py
 │   │   ├── jobs/
 │   │   │   └── scheduler.py             # APScheduler setup
 │   │   └── cache/
 │   │       └── redis_cache.py           # Score caching (Redis)
 │   │
 │   └── presentation/                    # ★ PRESENTATION LAYER (HTTP)
 │       ├── api/
 │       │   ├── v1/
 │       │   │   ├── tests.py             # Routing + request/response mapping only
 │       │   │   ├── techniques.py
 │       │   │   ├── heatmap.py
 │       │   │   └── ...
 │       │   └── dependencies.py          # FastAPI Depends() wiring
 │       ├── schemas/                     # Pydantic schemas (request/response)
 │       │   ├── test_schema.py
 │       │   └── ...
 │       ├── middleware/
 │       │   ├── error_handler.py         # Domain exceptions → HTTP responses
 │       │   └── rate_limiter.py
 │       └── mappers/                     # Entity ↔ Schema mappers
 │           ├── test_mapper.py
 │           └── ...
 ```
 ### 6.2. Dependency Rules
 ```
 Presentation → Application → Domain ← Infrastructure
     ↓              ↓           ↑           ↑
  FastAPI       Use Cases    Entities    SQLAlchemy
  Pydantic      DTOs        Ports       MinIO
                                        Redis
                                        APScheduler
 ```
 **The golden rule:** Dependencies only point towards the center (Domain). Infrastructure implements the ports defined in Domain.
 ### 6.3. Key Changes by Layer
 #### Domain Layer (New)
 ```python
 # domain/entities/test.py — Rich entity (not anemic)
 class TestEntity:
    def __init__(self, id, state, technique_id, ...):
        self._state = state
    def can_transition_to(self, target: TestState) -> bool:
        return target in VALID_TRANSITIONS[self._state]
    def start_execution(self, user: UserEntity) -> list[DomainEvent]:
        if not self.can_transition_to(TestState.red_executing):
            raise InvalidTransitionError(self._state, TestState.red_executing)
        self._state = TestState.red_executing
        return [TestExecutionStarted(test_id=self.id, user_id=user.id)]
 # domain/exceptions.py — Domain exceptions, NOT HTTPException
 class InvalidTransitionError(DomainException):
    def __init__(self, current: TestState, target: TestState):
        self.current = current
        self.target = target
 # domain/ports/repositories/test_repository.py — Abstract interface
 class TestRepository(Protocol):
    def find_by_id(self, test_id: UUID) -> TestEntity | None: ...
    def save(self, test: TestEntity) -> None: ...
    def list_by_technique(self, technique_id: UUID) -> list[TestEntity]: ...
 ```
 #### Application Layer (Use Cases)
 ```python
 # application/use_cases/tests/start_execution.py
 class StartExecutionUseCase:
    def __init__(self, test_repo: TestRepository, uow: UnitOfWork):
        self._test_repo = test_repo
        self._uow = uow
    def execute(self, test_id: UUID, user_id: UUID) -> TestDTO:
        with self._uow:
            test = self._test_repo.find_by_id(test_id)
            if not test:
                raise EntityNotFoundError("Test", test_id)
            events = test.start_execution(user)
            self._test_repo.save(test)
            self._uow.commit()
            # events are published after commit
            return TestDTO.from_entity(test)
 ```
 #### Presentation Layer (Slim Routers)
 ```python
 # presentation/api/v1/tests.py — HTTP concerns only
@router.post("/{test_id}/start-execution")
 def start_execution(
    test_id: UUID,
    use_case: StartExecutionUseCase = Depends(get_start_execution_use_case),
    current_user: User = Depends(get_current_user),
 ):
    try:
        result = use_case.execute(test_id, current_user.id)
        return result
    except EntityNotFoundError:
        raise HTTPException(404)
    except InvalidTransitionError as e:
        raise HTTPException(400, detail=str(e))
 ```
 #### Infrastructure Layer (Implementations)
 ```python
 # infrastructure/persistence/repositories/sqlalchemy_test_repository.py
 class SQLAlchemyTestRepository(TestRepository):
    def __init__(self, session: Session):
        self._session = session
    def find_by_id(self, test_id: UUID) -> TestEntity | None:
        model = self._session.query(TestModel).filter(TestModel.id == test_id).first()
        return TestMapper.to_entity(model) if model else None
    def save(self, test: TestEntity) -> None:
        model = TestMapper.to_model(test)
        self._session.merge(model)
 ```
 ### 6.4. Incremental Migration Plan (Phases)
 **The refactor must be incremental — not big bang.** Each phase delivers value and the system continues working.
 #### Phase 1: Foundations (1-2 weeks)
 1. Create the directory structure: `domain/`, `application/`, `infrastructure/`, `presentation/`.
 2. Create `domain/exceptions.py` with domain exceptions.
 3. Create `error_handler.py` middleware that maps domain exceptions → HTTP responses.
 4. Create `domain/ports/repositories/` with Protocol interfaces for the 3-4 most used entities (Test, Technique, Campaign).
 5. Create SQLAlchemy implementations of these repositories.
 6. **Do not move routers yet.**
 #### Phase 2: Extract the Test Domain (1-2 weeks)
 1. Create `domain/entities/test.py` with the state machine (extract from `test_workflow_service`).
 2. Create use cases for each state transition.
 3. Migrate the `tests.py` router to use the use cases.
 4. Remove `HTTPException` from `test_workflow_service`.
 5. **Pure unit tests** for the domain entity (no DB).
 #### Phase 3: Extract Fat Services from Routers (2-3 weeks)
 1. Move `heatmap.py` logic to `application/use_cases/heatmap/`.
 2. Move `reports.py` logic to `application/use_cases/reports/`.
 3. Move `metrics.py` logic to application services.
 4. Routers become thin controllers (< 20 lines per endpoint).
 #### Phase 4: Complete Repository Pattern (1-2 weeks)
 1. Create repositories for all remaining entities.
 2. Migrate scattered queries from routers to repositories.
 3. Remove `db.query(...)` from any file outside `infrastructure/`.
 #### Phase 5: Robust Infrastructure (1-2 weeks)
 1. Move token blacklist to Redis.
 2. Implement the Unit of Work pattern.
 3. Move scoring config to the database (not mutable `settings`).
 4. Add event bus for domain events (notifications, auditing).
 #### Phase 6: CI/CD and Observability
 1. Set up GitHub Actions (lint, type check, tests).
 2. Add structured logging.
 3. Add improved health checks.
 ---
 ## 7. Executive Summary
 ### Current Strengths
 | Strength | Detail |
 |----------|--------|
 | Well-modeled domain | The data model covers ATT&CK, D3FEND, compliance, threat actors, and campaigns comprehensively |
 | Solid test workflow | The state machine in `test_workflow_service` is the best designed component |
 | Clean frontend | API/pages/components separation with TanStack Query is correct |
 | Secure auth | HttpOnly cookies + RBAC with 6 well-defined roles |
 | Import services | The 8 import services are well encapsulated |
 | Existing tests | 18 test files with fixtures — a foundation to build upon |
 ### Critical Weaknesses (Updated Feb 18)
 | Weakness | Original Severity | Current Status |
 |----------|----------|--------|
 | Fat controllers (routers with business logic) | HIGH | Partially resolved — heatmap extracted |
 | No repository layer | HIGH | ✅ Resolved (Test, Technique repos exist) |
 | Services depend on FastAPI | HIGH | ✅ Resolved (domain exceptions + middleware) |
 | Anemic models | MEDIUM | ✅ Partially resolved (TestEntity, TechniqueEntity) |
 | In-memory token blacklist | HIGH | ✅ Resolved (Redis-backed) |
 | Mutable settings at runtime | MEDIUM | Open |
 | No CI/CD | MEDIUM | ✅ Resolved (GitHub Actions) |
 | No dependency inversion | HIGH | ✅ Partially resolved (ports + repos) |
 ### Final Classification
 ```
 ┌──────────────────────────────────────────────────────────┐
 │  Type:        Clean Modular Monolith (in transition)     │
 │  Maturity:    Pre-production → Production-ready          │
 │  SOLID:       3.5/5 (SRP partial, DIP started, OCP/ISP  │
 │               in progress)                               │
 │  Testability: 6/10 (326 tests, domain unit tests, repo  │
 │               integration tests)                         │
 │  Coupling:    5/10 (domain layer fully decoupled, old    │
 │               routers still coupled)                     │
 │  Cohesion:    7/10 (domain entities own business rules)  │
 │  Estimated remaining tech debt: ~2-3 weeks               │
 └──────────────────────────────────────────────────────────┘
 ```
 ### Recommendation (Updated Feb 18)
 The foundational Clean Architecture layers are now in place. The migration is proceeding incrementally. **The top 4 immediate priorities from the original analysis are all resolved:**
 1. ~~Extract domain exceptions~~ ✅ Done
 2. ~~Create repositories for Test and Technique~~ ✅ Done
 3. ~~Move token blacklist to Redis~~ ✅ Done
 4. ~~Set up basic CI/CD~~ ✅ Done
 **Next priorities:**
 1. Migrate fat routers to use repositories (incremental, per-router)
 2. Persist scoring weights in database
 3. Create domain entities for Campaign and ComplianceFramework
 4. Add structured JSON logging
@@ -0,0 +1,388 @@
 # Aegis — Backend Internal Dependency Analysis
 > **Author:** Architecture review  
 > **Date:** February 11, 2026 (updated February 18, 2026)  
 > **Scope:** All 21 routers and 20 services in `backend/app/`
 >
 > **Note:** This analysis describes the original state. Since then, a Clean Architecture refactor has begun. See [ARCHITECTURAL_ANALYSIS.md](ARCHITECTURAL_ANALYSIS.md) for current status. Key changes: domain exceptions replace HTTPException in services, repository ports and implementations exist for Test and Technique, domain entities with business logic exist for Test and Technique, Unit of Work pattern is available, CI pipeline is active.
 ---
 ## Table of Contents
 1. [Do Routers Import SQLAlchemy Models Directly?](#1-do-routers-import-sqlalchemy-models-directly)
 2. [Do Services Access the Database Directly?](#2-do-services-access-the-database-directly)
 3. [Do Services Contain Business Logic or Just CRUD?](#3-do-services-contain-business-logic-or-just-crud)
 4. [Is Business Logic Separated from Persistence?](#4-is-business-logic-separated-from-persistence)
 5. [Is Infrastructure Decoupled from Logic?](#5-is-infrastructure-decoupled-from-logic)
 6. [What Architecture Is Actually Implemented?](#6-what-architecture-is-actually-implemented)
 ---
 ## 1. Do Routers Import SQLAlchemy Models Directly?
 **Yes. Every single router imports at least one SQLAlchemy model. 19 of 21 routers execute raw database operations inline.**
 ### Complete Router-to-Model Import Map
 | Router | Models Imported Directly | DB Operations in Router |
 |--------|-------------------------|------------------------|
 | `audit.py` | AuditLog, User | 3 |
 | `auth.py` | User | 1 |
 | `campaigns.py` | User, Campaign, CampaignTest, Test, Technique, ThreatActor | 36 |
 | `compliance.py` | User, ComplianceFramework, ComplianceControl, ComplianceControlMapping, Technique, TestTemplate, ThreatActorTechnique | 13 |
 | `d3fend.py` | User, Technique, DefensiveTechnique, DefensiveTechniqueMapping | 3 |
 | `data_sources.py` | User, DataSource | 14 |
 | `detection_rules.py` | User, DetectionRule, TestTemplate, TestTemplateDetectionRule, TestDetectionResult | 21 |
 | `evidence.py` | Evidence, Test, User, enums | 11 |
 | `heatmap.py` | User, Technique, Test, ThreatActor, ThreatActorTechnique, DetectionRule, Campaign, CampaignTest, DefensiveTechniqueMapping, enums | 13 |
 | `metrics.py` | Technique, Test, User, enums | 12 |
 | `notifications.py` | Notification, User | 2 |
 | `operational_metrics.py` | User | 0 (delegates) |
 | `reports.py` | Technique, Test, User, enums | 6 |
 | `scores.py` | User, Technique, ThreatActor | 2 |
 | `snapshots.py` | User, CoverageSnapshot, SnapshotTechniqueState | 6 |
 | `system.py` | User | 0 (delegates) |
 | `techniques.py` | Technique, User, enums | 12 |
 | `test_templates.py` | TestTemplate, User | 20 |
 | `tests.py` | AuditLog, Technique, Test, TestTemplate, User, enums | 30 |
 | `threat_actors.py` | User, ThreatActor, ThreatActorTechnique, Technique, Test, TestTemplate, enums | 11 |
 | `users.py` | User | 9 |
 ### Key Numbers
 - **21 / 21** routers import at least one SQLAlchemy model.
 - **19 / 21** routers execute `db.query()`, `db.add()`, `db.commit()`, or `db.delete()` directly (only `operational_metrics.py` and `system.py` fully delegate).
 - **Total DB operations across all routers: 225** (`db.query`, `db.add`, `db.commit`, `db.delete`, `db.refresh` calls).
 - **All 21** routers import `Session` from SQLAlchemy.
 - **7** routers import `func` (aggregations).
 - **7** routers import `joinedload` (eager loading).
 - **2** routers import `or_` (compound filters).
 ### What This Means
 Routers are tightly coupled to the ORM. They know:
 - Table structure (column names, relationships)
 - Query syntax (`filter`, `join`, `group_by`, `order_by`)
 - Transaction management (`commit`, `refresh`, `add`)
 - Eager loading strategy (`joinedload`, `selectinload`)
 There is **no abstraction layer** between routers and the database. Changing a column name on the `Technique` model would require modifying at least 8 routers.
 ---
 ## 2. Do Services Access the Database Directly?
 **Yes. All 19 services that handle data (all except `score_cache.py`) receive a SQLAlchemy `Session` as a parameter and execute queries directly.**
 ### Complete Service-to-Database Map
 | Service | Models Used | DB Operations | Receives `Session` | Imports `app.database` |
 |---------|-------------|---------------|--------------------|-----------------------|
 | `atomic_import_service` | TestTemplate | 3 | Yes | No |
 | `audit_service` | AuditLog | 2 | Yes | No |
 | `caldera_import_service` | TestTemplate, DataSource | 5 | Yes | No |
 | `campaign_scheduler_service` | Campaign, CampaignTest, Test, User | 8 | Yes | No |
 | `campaign_service` | Campaign, CampaignTest, Test, TestTemplate, Technique, ThreatActor, ThreatActorTechnique, User | 10 | Yes | No |
 | `compliance_import_service` | ComplianceFramework, ComplianceControl, ComplianceControlMapping, Technique | 22 | Yes | No |
 | `d3fend_import_service` | Technique, DefensiveTechnique, DefensiveTechniqueMapping | 13 | Yes | No |
 | `elastic_import_service` | DetectionRule, DataSource | 5 | Yes | No |
 | `intel_service` | IntelItem, Technique | 4 | Yes | No |
 | `lolbas_import_service` | TestTemplate, DataSource | 7 | Yes | No |
 | `mitre_sync_service` | Technique, enums | 3 | Yes | No |
 | `notification_service` | Notification, User | 12 | Yes | No |
 | `operational_metrics_service` | Test, Technique, TestDetectionResult, AuditLog, enums | 21 | Yes | No |
 | `score_cache` | — | 0 | No | No |
 | `scoring_service` | Technique, Test, DetectionRule, TestDetectionResult, DefensiveTechniqueMapping, ThreatActor, ThreatActorTechnique | 17 | Yes | No |
 | `sigma_import_service` | DetectionRule, DataSource | 5 | Yes | No |
 | `snapshot_service` | Technique, CoverageSnapshot, SnapshotTechniqueState | 13 | Yes | No |
 | `status_service` | Technique, enums | 1 | Yes | No |
 | `test_workflow_service` | Test, User, enums | 13 | Yes | No |
 | `threat_actor_import_service` | ThreatActor, ThreatActorTechnique, Technique, DataSource | 8 | Yes | No |
 ### Key Numbers
 - **Total DB operations across all services: 172** (`db.query`, `db.add`, `db.commit`, etc.).
 - **19 / 20** services receive `Session` as a function parameter.
 - **0 / 20** services import `app.database` directly — sessions are always injected by callers (routers or background jobs).
 - **All 19** data-handling services import SQLAlchemy symbols (`Session`, `func`, `case`, etc.).
 ### Positive Pattern: Session Injection
 Services do follow one good practice: none of them create their own database sessions. Sessions are always passed in as arguments:
 ```python
 # All services use this pattern:
 def calculate_technique_score(technique: Technique, db: Session) -> dict:
    all_tests = db.query(Test).filter(Test.technique_id == technique.id).all()
 ```
 This makes sessions testable (you can pass a mock or test session). However, the services still know the full ORM API — they construct queries, call `commit()`, and manage eager loading.
 ---
 ## 3. Do Services Contain Business Logic or Just CRUD?
 **Mixed. Services fall into three distinct categories.**
 ### Category A: Rich Business Logic (5 services)
 These services contain genuine domain logic — rules, calculations, state machines, and business decisions:
 | Service | Logic Type | Complexity |
 |---------|-----------|------------|
 | `test_workflow_service` | State machine with valid transition map, role-based guards, multi-step validation, retest chain management | **High** — 456 lines, 10+ public functions, embeds the test lifecycle rules |
 | `scoring_service` | Multi-dimensional scoring algorithm with configurable weights, breakdown calculations, decay functions | **High** — 468 lines, complex math combining 5 weighted factors |
 | `campaign_service` | Circular dependency detection, campaign progress calculation, auto-generation from threat actors | **Medium** — business rules for campaign management |
 | `campaign_scheduler_service` | Recurring campaign scheduling, next-run calculation, campaign cloning | **Medium** — temporal business logic |
 | `operational_metrics_service` | MTTD/MTTR calculation, detection efficacy, trend analysis with time windows | **Medium** — analytical business logic |
 ### Category B: External Data Import (8 services)
 These services handle fetching, parsing, and upserting data from external sources. They are more "integration logic" than "business logic":
 | Service | External Source | Logic |
 |---------|----------------|-------|
 | `mitre_sync_service` | MITRE TAXII + GitHub | STIX 2.0 parsing, technique upsert |
 | `atomic_import_service` | GitHub (ZIP) | YAML parsing, template creation |
 | `sigma_import_service` | GitHub (ZIP) | YAML + ATT&CK tag extraction |
 | `elastic_import_service` | GitHub (ZIP) | TOML parsing, rule creation |
 | `caldera_import_service` | GitHub (ZIP) | YAML parsing, ability import |
 | `d3fend_import_service` | D3FEND REST API | JSON parsing, mapping creation |
 | `lolbas_import_service` | GitHub (ZIP) | YAML/Markdown parsing |
 | `threat_actor_import_service` | GitHub (ZIP) | STIX 2.0 bundle parsing |
 ### Category C: Thin CRUD Wrappers (7 services)
 These services are essentially database operations with minimal logic:
 | Service | What It Does | Lines of Logic |
 |---------|-------------|----------------|
 | `audit_service` | `log_action()` — creates an AuditLog row | ~10 lines |
 | `notification_service` | CRUD for notifications + `notify_test_state_change()` | ~30 lines of logic, rest is DB access |
 | `status_service` | `recalculate_technique_status()` — counts tests by state, sets status | ~20 lines |
 | `snapshot_service` | Creates snapshots by looping over techniques and calling scoring_service | Orchestration + DB writes |
 | `score_cache` | In-memory dict with TTL | ~30 lines, pure caching |
 | `compliance_import_service` | Parses NIST/CIS data and creates DB rows | Parsing + bulk insert |
 | `intel_service` | Fetches RSS/feeds and creates IntelItem rows | Fetch + parse + insert |
 ### The Missing Logic
 Significant business logic that should be in services but lives in routers instead:
 | Logic | Current Location | Should Be |
 |-------|-----------------|-----------|
 | ATT&CK Navigator layer generation | `heatmap.py` router (528 lines) | `heatmap_service` or use case |
 | Coverage report building | `reports.py` router (273 lines) | `report_service` or use case |
 | Coverage metrics aggregation | `metrics.py` router (316 lines) | `metrics_service` |
 | Detection rule CRUD + auto-association | `detection_rules.py` router (21 DB ops) | `detection_rule_service` |
 | Technique CRUD + review workflow | `techniques.py` router (12 DB ops) | `technique_service` |
 | Campaign full lifecycle | `campaigns.py` router (36 DB ops) | Partially in `campaign_service`, but router does most CRUD |
 ---
 ## 4. Is Business Logic Separated from Persistence?
 **No. There is no separation boundary between business logic and persistence anywhere in the codebase.**
 ### The Dependency Graph
 ```
 ┌─────────────────────────────────────────────────────────────────────┐
 │ PRESENTATION LAYER (Routers)                                        │
 │                                                                     │
 │  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐        │
 │  │techniques│   │ heatmap  │   │ reports  │   │ campaigns│  ...    │
 │  │ 12 db.q  │   │ 13 db.q  │   │ 6 db.q   │   │ 36 db.q  │        │
 │  └────┬─────┘   └────┬─────┘   └────┬─────┘   └────┬─────┘        │
 │       │ direct        │ direct        │ direct       │ direct       │
 │       │               │               │              │ + service    │
 ├───────┼───────────────┼───────────────┼──────────────┼──────────────┤
 │ SERVICE LAYER (Partial)                                              │
 │                                                                     │
 │  ┌──────────────┐   ┌───────────┐   ┌──────────────────────────┐   │
 │  │test_workflow  │   │  scoring  │   │     8 import services    │   │
 │  │ 13 db.q      │   │ 17 db.q   │   │    3-22 db.q each       │   │
 │  │ HTTPException │   │ settings  │   │    + HTTP requests       │   │
 │  └──────┬───────┘   └─────┬─────┘   └────────────┬─────────────┘   │
 │         │                  │                       │                 │
 ├─────────┼──────────────────┼───────────────────────┼─────────────────┤
 │ PERSISTENCE LAYER (SQLAlchemy — no abstraction)                      │
 │                                                                     │
 │  ┌─────────────────────────────────────────────────────────────────┐ │
 │  │  db.query(Model).filter(...).all()  ← called from EVERYWHERE   │ │
 │  │  db.add(instance)                                               │ │
 │  │  db.commit()                                                    │ │
 │  │  db.refresh(instance)                                           │ │
 │  └─────────────────────────────────────────────────────────────────┘ │
 │                                                                     │
 │  Total: 225 db operations in routers + 172 in services = 397 total  │
 │  Spread across: 19 routers + 19 services = 38 files                 │
 └─────────────────────────────────────────────────────────────────────┘
 ```
 ### Why There Is No Separation
 1. **No Repository Pattern.** There are no repository classes or functions that encapsulate database access. Every file that needs data constructs its own query.
 2. **No Domain Entity Layer.** The SQLAlchemy models serve dual duty as both persistence mapping AND domain objects. There is no separate domain entity with business methods — the same `Test` class that defines the database table is passed around as the business object.
 3. **No Abstraction Boundary.** There is no interface (Protocol/ABC) anywhere in the codebase that separates "what data I need" from "how to get it from the database."
 4. **Services Commit Transactions.** Some services call `db.commit()` internally, while their calling routers may also call `db.commit()`. There is no Unit of Work pattern governing transaction boundaries.
 ### Concrete Example: Scoring a Technique
 The `scoring_service.calculate_technique_score()` function mixes business logic and persistence in every line:
 ```python
 # Business logic (what to calculate) and persistence (how to get data)
 # are interleaved — inseparable:
 all_tests = db.query(Test).filter(Test.technique_id == technique.id).all()  # ← persistence
 validated_tests = [t for t in all_tests if t.state == TestState.validated]    # ← logic
 detected_tests = [t for t in validated_tests if t.detection_result == TestResult.detected]  # ← logic
 test_ratio = len(detected_tests) / len(validated_tests)                      # ← logic
 test_score = round(test_ratio * w_tests, 1)                                  # ← logic
 rule_count = db.query(func.count(DetectionRule.id))...scalar() or 0          # ← persistence
 rule_score = min(rule_count / 3.0, 1.0) * w_detection                        # ← logic
 ```
 To test the scoring **algorithm** in isolation (without a database), you would need to refactor every query into a repository that can be mocked.
 ---
 ## 5. Is Infrastructure Decoupled from Logic?
 **No. Infrastructure concerns are embedded directly in both routers and services.**
 ### Infrastructure Dependency Map
 | Infrastructure | Where It Bleeds Into Logic | Impact |
 |---------------|---------------------------|--------|
 | **SQLAlchemy ORM** | 19 routers (225 ops) + 19 services (172 ops) = 38 files, 397 operations | Cannot switch ORM or use raw SQL without rewriting 38 files |
 | **FastAPI HTTPException** | `test_workflow_service.py`, `campaign_service.py` (2 services) | Business logic throws HTTP-specific exceptions — cannot reuse from CLI, workers, or pure tests |
 | **MinIO (boto3)** | `storage.py` (well isolated) → called from `evidence.py` router | Storage itself is clean, but the router handles presigned URL generation |
 | **APScheduler** | `mitre_sync_job.py` → creates `SessionLocal()` directly → calls services | Jobs bypass the DI system and create their own sessions |
 | **`app.config.settings`** | `scoring_service.py` (reads weights), `test_workflow_service.py` (reads MAX_RETEST_COUNT), `auth.py` router (reads SECRET_KEY), `scores.py` router (mutates weights) | Global mutable singleton accessed from multiple layers |
 | **External HTTP (requests/httpx)** | 8 import services make outbound HTTP calls | Tightly coupled — cannot test import logic without network access or mocking `requests` |
 ### What Is Well Isolated
 | Component | Isolation Quality |
 |-----------|-------------------|
 | `storage.py` (MinIO) | **Good** — thin wrapper with 3 functions (`ensure_bucket_exists`, `upload_file`, `get_presigned_url`). Only accessed from 1 router. |
 | `auth.py` (JWT/bcrypt) | **Good** — self-contained module for token creation, verification, and password hashing. |
 | `dependencies/auth.py` | **Good** — composable FastAPI `Depends()` chain for auth and RBAC. |
 | `config.py` (Settings) | **Partial** — Pydantic Settings with env loading is clean, but the object is mutable and accessed as a global singleton. |
 ### What Is Poorly Isolated
 | Component | Problem |
 |-----------|---------|
 | Database session lifecycle | `get_db()` is a generator injected via `Depends()` in routers, but services receive raw `Session` objects. Background jobs create sessions with `SessionLocal()` directly, bypassing the DI system entirely. |
 | External API calls | Import services directly call `requests.get()` / `httpx.get()`. No port/adapter pattern — the HTTP client is an implementation detail embedded in business logic. |
 | Scoring configuration | `settings.SCORING_WEIGHT_*` is read from a mutable global object. The `scores.py` router mutates it at runtime. No database-backed configuration. |
 ---
 ## 6. What Architecture Is Actually Implemented?
 ### Classification: Inconsistent Layered Architecture with Partial Service Extraction
 The codebase does **not** follow any named architectural pattern consistently. It is a hybrid of two approaches that were never unified:
 ### Pattern 1: Transaction Script (60% of codebase)
 Most routers follow the Transaction Script pattern — each endpoint is a self-contained script that receives a request, queries the database, applies logic, mutates data, and returns a response. All in one function:
 ```
 HTTP Request → Router Function → [query DB → apply logic → write DB → return response]
 ```
 **Routers using this pattern:** techniques, evidence, users, audit, reports, heatmap, metrics, detection_rules, threat_actors, data_sources, compliance, test_templates, d3fend, snapshots (partially)
 ### Pattern 2: Service Layer (40% of codebase)
 Some routers delegate complex operations to services:
 ```
 HTTP Request → Router Function → Service Function → [query DB → apply logic → write DB]
                                                  → return to router → return response
 ```
 **Routers using this pattern:** tests (workflow), scores (scoring), notifications, operational_metrics, system (imports), campaigns (partially), snapshots (partially)
 ### The Actual Dependency Direction
 ```
             ┌──────────────────────────────────────────┐
             │            EVERYTHING DEPENDS ON          │
             │                                          │
             │   SQLAlchemy Models (18 concrete classes) │
             │   SQLAlchemy Session (passed everywhere)  │
             │                                          │
             └──────────┬───────────────┬───────────────┘
                        │               │
              ┌─────────▼──────┐  ┌─────▼──────────┐
              │    Routers     │  │    Services     │
              │  (21 files)    │  │  (20 files)     │
              │  225 db ops    │  │  172 db ops     │
              │  import models │  │  import models  │
              │  import Session│  │  receive Session│
              └────────┬───────┘  └────────┬────────┘
                       │                    │
                       │   cross-reference  │
                       │◄──────────────────►│
                       │  13 routers import │
                       │  services          │
                       │  10 services import│
                       │  other services    │
                       └────────────────────┘
 ```
 **The dependency direction is: everything points DOWN to SQLAlchemy.** There is no inversion. The models are the center of gravity, not the domain logic.
 ### Comparison with Named Architectures
 | Architecture | Aegis Implementation | Verdict |
 |-------------|---------------------|---------|
 | **Clean Architecture** | No domain layer, no use cases, no ports/adapters, no dependency inversion | **Not implemented** |
 | **Hexagonal Architecture** | No ports, no adapters, infrastructure is not pluggable | **Not implemented** |
 | **Layered Architecture** | Layers exist (routers → services → models) but boundaries are not enforced — routers bypass the service layer freely | **Partially implemented, inconsistently** |
 | **Domain-Driven Design** | Anemic models, no aggregates, no value objects, no domain events, no bounded contexts | **Not implemented** |
 | **Transaction Script** | Most endpoints follow this pattern | **De facto pattern for ~60% of code** |
 | **Active Record** | SQLAlchemy models don't have business methods (they're not Active Record either) | **Not implemented** |
 ### Summary Classification
 ```
 ┌─────────────────────────────────────────────────────────────────┐
 │                                                                 │
 │  Architecture:  Inconsistent Layered Monolith                   │
 │                                                                 │
 │  Dominant pattern:  Transaction Script (routers as scripts)     │
 │  Secondary pattern: Service Layer (for complex workflows)       │
 │                                                                 │
 │  Boundary enforcement:  None                                    │
 │  Dependency direction:  All code → SQLAlchemy (downward)        │
 │  Abstraction layers:    Zero (no interfaces, no repositories)   │
 │                                                                 │
 │  Files with direct DB access:  38 out of 41 (93%)              │
 │  Total scattered DB operations: 397                             │
 │                                                                 │
 │  Well-designed components:                                      │
 │    - test_workflow_service (state machine)                       │
 │    - scoring_service (algorithm — coupled to DB)                 │
 │    - storage.py (clean MinIO wrapper)                           │
 │    - dependencies/auth.py (composable auth chain)               │
 │                                                                 │
 │  Poorly-designed components:                                    │
 │    - heatmap.py router (528 lines, 13 DB ops, zero delegation)  │
 │    - campaigns.py router (36 DB ops, partial delegation)        │
 │    - detection_rules.py router (21 DB ops, zero delegation)     │
 │    - test_templates.py router (20 DB ops, zero delegation)      │
 │                                                                 │
 └─────────────────────────────────────────────────────────────────┘
 ```
@@ -0,0 +1,452 @@
 # SQLAlchemy Performance Analysis — backend/app/services
 **Analysis Date:** 2025-02-18 (updated February 18, 2026)  
 **Scope:** All Python files under `backend/app/services/`  
 **Focus:** N+1 queries, missing eager loading, redundant queries, queries in loops
 > **Update (Feb 18, 2026):** The most critical N+1 issues have been resolved:
 > - `scoring_service.py` — `bulk_technique_scores()` now uses 5 aggregated subqueries instead of per-technique loops (~3,500 queries reduced to ~5).
 > - `heatmap_service.py` — Extracted to a dedicated service with batch-fetching (`test_counts`, `rule_counts` in 2 SQL subqueries instead of per-technique N+1).
 > - `SATechniqueRepository.find_all_with_test_counts()` — Single query with subqueries providing pre-aggregated counts for all techniques.
 > - Missing database indexes added via Alembic migrations (b024, b026) covering `tests`, `techniques`, `audit_logs`, and `detection_rules` tables.
 ---
 ## Executive Summary
 | Severity | Count | Files Affected |
 |----------|-------|----------------|
 | **Critical (N+1)** | 12 | 8 files |
 | **High (Missing eager loading)** | 4 | 4 files |
 | **Medium (Redundant queries)** | 3 | 3 files |
 ---
 ## 1. operational_metrics_service.py
 ### 1.1 `calculate_mttd` — N+1 query problem
 **Lines:** 44–79  
 **Problem type:** N+1 — 2 queries per test inside loop
 ```python
 tests = db.query(Test).filter(Test.state == TestState.validated).all()
 for test in tests:
    red_start = db.query(AuditLog.timestamp).filter(...).first()   # Query per test
    blue_start = db.query(AuditLog.timestamp).filter(...).first()   # Query per test
 ```
 **Extra queries:** 2 × N (N = number of validated tests)  
 **Fix:** Use a single query with `func.max` and `case` to get both timestamps per test, or batch-fetch all audit log entries for the test IDs in one query.
 ---
 ### 1.2 `calculate_mttr` — N+1 query problem
 **Lines:** 86–123  
 **Problem type:** N+1 — 1 query per test inside loop
 ```python
 tests = db.query(Test).filter(...).all()
 for test in tests:
    remediation_complete = db.query(AuditLog.timestamp).filter(
        AuditLog.entity_id == str(test.id), ...
    ).first()
 ```
 **Extra queries:** N (N = tests with completed remediation)  
 **Fix:** Batch-fetch audit log entries for all test IDs in one query, then build a lookup dict.
 ---
 ### 1.3 `get_operational_trend` — N+1 query problem
 **Lines:** 354–392  
 **Problem type:** N+1 — 1 query per week inside loop
 ```python
 while current < now:
    validated_up_to = db.query(Test).filter(
        Test.state == TestState.validated,
        Test.red_validated_at <= week_end,
    ).all()
    # ... process ...
    current = week_end
 ```
 **Extra queries:** ~13 (for 90-day period) or ~52 (for 1-year period)  
 **Fix:** Single query with `date_trunc` and `group_by` to get counts per week, or fetch all validated tests once and filter in Python.
 ---
 ### 1.4 `calculate_rejection_rate` — Redundant queries
 **Lines:** 286–328  
 **Problem type:** Redundant — 6 separate count queries that could be combined
 ```python
 validated_count = db.query(func.count(Test.id)).filter(...).scalar()
 rejected_count = db.query(func.count(Test.id)).filter(...).scalar()
 red_rejected = db.query(func.count(Test.id)).filter(...).scalar()
 red_total = db.query(func.count(Test.id)).filter(...).scalar()
 blue_rejected = db.query(func.count(Test.id)).filter(...).scalar()
 blue_total = db.query(func.count(Test.id)).filter(...).scalar()
 ```
 **Extra queries:** 5 (could be 1–2 with conditional aggregation)  
 **Fix:** Single query with `func.count` and `case` for each condition.
 ---
 ## 2. scoring_service.py
 ### 2.1 `calculate_technique_score` — Multiple queries per call
 **Lines:** 26–204  
 **Problem type:** 5+ separate queries per technique (Tests, DetectionRule count, TestDetectionResult count, DefensiveTechniqueMapping count, Test.max)
 Each call to `calculate_technique_score` executes:
 - 1 query for `all_tests`
 - 1 query for `total_rules`
 - 1 query for `triggered_rules` (if total_rules > 0)
 - 1 query for `total_countermeasures`
 - 1 query for `most_recent_test`
 **Extra queries per technique:** ~5
 ---
 ### 2.2 `calculate_tactic_score` — N+1 via helper
 **Lines:** 209–234  
 **Problem type:** Queries in loop — calls `calculate_technique_score` for each technique
 ```python
 techniques = db.query(Technique).filter(...).all()
 for tech in techniques:
    result = calculate_technique_score(tech, db)  # 5+ queries each
 ```
 **Extra queries:** 5 × N (N = techniques in tactic, often 10–50)
 ---
 ### 2.3 `calculate_actor_coverage_score` — N+1 via helper
 **Lines:** 241–293  
 **Problem type:** Queries in loop — calls `calculate_technique_score` for each technique
 ```python
 for tech in techniques:
    result = calculate_technique_score(tech, db)
 ```
 **Extra queries:** 5 × N (N = techniques used by actor)
 ---
 ### 2.4 `calculate_organization_score` — Severe N+1
 **Lines:** 300–309  
 **Problem type:** Queries in loop — calls `calculate_technique_score` for every technique
 ```python
 all_techniques = db.query(Technique).all()
 for tech in all_techniques:
    result = calculate_technique_score(tech, db)
 ```
 **Extra queries:** 5 × N where N = total techniques (~700–800) → **~3,500–4,000 queries**
 ---
 ### 2.5 `calculate_organization_score` — Second N+1 loop
 **Lines:** 352–355  
 **Problem type:** Queries in loop — second pass over critical techniques
 ```python
 for tech in critical_techniques:
    result = calculate_technique_score(tech, db)
 ```
 **Extra queries:** 5 × M (M = critical techniques, ~50–200)
 ---
 ## 3. d3fend_import_service.py
 ### 3.1 `_upsert_techniques` — N+1 query problem
 **Lines:** 90–96  
 **Problem type:** N+1 — 1 query per technique in loop
 ```python
 for tech_data in techniques:
    existing = db.query(DefensiveTechnique).filter(
        DefensiveTechnique.d3fend_id == tech_data["d3fend_id"]
    ).first()
 ```
 **Extra queries:** N (N = number of D3FEND techniques, ~50–100)
 **Fix:** Pre-load all existing techniques into a dict keyed by `d3fend_id` before the loop.
 ---
 ### 3.2 `import_d3fend_mappings` — N+1 query problem
 **Lines:** 324–331  
 **Problem type:** N+1 — 1 query per (mitre_id, d3fend_id) pair in nested loop
 ```python
 for mitre_id, d3fend_ids in _ATTACK_TO_D3FEND.items():
    for d3fend_id in d3fend_ids:
        existing = db.query(DefensiveTechniqueMapping).filter(
            DefensiveTechniqueMapping.attack_technique_id == attack_tech.id,
            DefensiveTechniqueMapping.defensive_technique_id == def_tech.id,
        ).first()
 ```
 **Extra queries:** ~200–500 (depends on mapping size)
 **Fix:** Pre-load existing mappings into a set of `(attack_tech_id, def_tech_id)` tuples.
 ---
 ### 3.3 `get_defenses_for_technique` — Missing eager loading
 **Lines:** 428–453  
 **Problem type:** Lazy loading — accesses `m.defensive_technique` in loop
 ```python
 mappings = db.query(DefensiveTechniqueMapping).filter(...).all()
 for m in mappings:
    dt = m.defensive_technique  # Lazy load per mapping
 ```
 **Extra queries:** N (N = number of mappings for the technique)
 **Fix:** Add `joinedload(DefensiveTechniqueMapping.defensive_technique)` to the query.
 ---
 ## 4. report_generation_service.py
 ### 4.1 `generate_purple_campaign_report` — N+1 query problem
 **Lines:** 36–46  
 **Problem type:** N+1 — 1 query per test in loop
 ```python
 for test in campaign_tests:
    technique = db.query(Technique).filter(Technique.id == test.technique_id).first()
 ```
 **Extra queries:** N (N = number of campaign tests)
 **Fix:** Eager-load Technique when fetching campaign_tests, or batch-query techniques by IDs.
 ---
 ## 5. osint_enrichment_service.py
 ### 5.1 `enrich_technique_with_cves` — N+1 query problem
 **Lines:** 59–75  
 **Problem type:** N+1 — 1 query per CVE in loop
 ```python
 for vuln in data.get("vulnerabilities", []):
    exists = db.query(OsintItem.id).filter(
        OsintItem.technique_id == technique.id,
        OsintItem.source_url.contains(cve_id),
    ).first()
 ```
 **Extra queries:** Up to 10 per technique (resultsPerPage=10)
 ---
 ### 5.2 `enrich_all_techniques` — N+1 cascade
 **Lines:** 134–153  
 **Problem type:** Queries in loop — calls `enrich_technique_with_cves` for each technique
 ```python
 techniques = db.query(Technique).all()
 for i, tech in enumerate(techniques):
    total += enrich_technique_with_cves(db, tech)  # N+1 inside
 ```
 **Extra queries:** ~10 × N (N = all techniques, ~700+)
 ---
 ## 6. campaign_service.py
 ### 6.1 `get_campaign_progress` — Missing eager loading
 **Lines:** 74–92  
 **Problem type:** Lazy loading — accesses `ct.test` for each CampaignTest
 ```python
 campaign_tests = db.query(CampaignTest).filter(...).all()
 for ct in campaign_tests:
    test = ct.test  # Lazy load per CampaignTest
 ```
 **Extra queries:** N (N = campaign tests)
 **Fix:** Add `joinedload(CampaignTest.test)` or `selectinload(CampaignTest.test)`.
 ---
 ### 6.2 `generate_campaign_from_threat_actor` — N+1 query problem
 **Lines:** 155–168  
 **Problem type:** N+1 — 1 query per technique in loop
 ```python
 for tech, _at in gap_techniques:
    template = db.query(TestTemplate).filter(
        TestTemplate.mitre_technique_id == tech.mitre_id,
        ...
    ).first()
 ```
 **Extra queries:** N (N = gap techniques for the actor)
 **Fix:** Pre-load templates by mitre_id into a dict before the loop.
 ---
 ## 7. campaign_scheduler_service.py
 ### 7.1 `_clone_campaign` — Missing eager loading
 **Lines:** 76–86  
 **Problem type:** Lazy loading — accesses `ct.test` for each CampaignTest
 ```python
 original_cts = db.query(CampaignTest).filter(...).all()
 for ct in original_cts:
    src_test = ct.test  # Lazy load per CampaignTest
 ```
 **Extra queries:** N (N = campaign tests)
 **Fix:** Add `joinedload(CampaignTest.test)`.
 ---
 ### 7.2 `check_and_run_recurring_campaigns` — N+1 query problem
 **Lines:** 175–185  
 **Problem type:** N+1 — 1 query per campaign for red_tech users
 ```python
 for campaign in due_campaigns:
    # ... clone ...
    red_techs = db.query(User).filter(User.role == "red_tech", ...).all()
    for user in red_techs:
        create_notification(...)  # Also commits per notification
 ```
 **Extra queries:** 1 per due campaign (for User query)  
 **Note:** `create_notification` does `db.commit()` each time — consider batching.
 ---
 ## 8. snapshot_service.py
 ### 8.1 `create_snapshot` — Severe N+1 via helper
 **Lines:** 41–77  
 **Problem type:** Queries in loop — calls `calculate_technique_score` for every technique
 ```python
 techniques = db.query(Technique).all()
 for tech in techniques:
    score_data = calculate_technique_score(tech, db)  # 5+ queries each
 ```
 **Extra queries:** 5 × N (N = all techniques, ~700+) → **~3,500+ queries**
 ---
 ## 9. status_service.py
 ### 9.1 `recalculate_technique_status` — Potential lazy loading
 **Lines:** 28–29  
 **Problem type:** Missing eager loading — accesses `technique.tests`
 ```python
 tests = technique.tests  # Lazy load if technique was loaded without tests
 ```
 **Extra queries:** 1 (if technique was loaded without `selectinload(Technique.tests)`)
 **Note:** Caller-dependent; if technique comes from a query without eager loading, this triggers 1 extra query.
 ---
 ## 10. test_workflow_service.py
 ### 10.1 `get_retest_chain` — Redundant queries
 **Lines:** 416–428  
 **Problem type:** Redundant — 3 separate queries that could be 1–2
 ```python
 test = db.query(Test).filter(Test.id == tid).first()
 original = db.query(Test).filter(Test.id == original_id).first()
 retests = db.query(Test).filter(Test.retest_of == original_id).order_by(...).all()
 ```
 **Fix:** Single query: get original by `original_id`, then get all retests in one query. The first test fetch is only needed to determine `original_id`; could use a CTE or single query with `UNION`/subquery.
 ---
 ## 11. Files with no SQLAlchemy performance issues
 The following service files were reviewed and do **not** exhibit the targeted problems:
 | File | Notes |
 |------|-------|
 | `audit_service.py` | Single insert per call, no loops |
 | `atomic_import_service.py` | Pre-loads existing_ids, no N+1 |
 | `caldera_import_service.py` | Pre-loads existing_ids, no N+1 |
 | `compliance_import_service.py` | Pre-loads all_techniques, existing_controls, existing_mappings |
 | `elastic_import_service.py` | Pre-loads existing_ids |
 | `intel_service.py` | Pre-loads techniques and existing_urls |
 | `jira_service.py` | No db.query in loops |
 | `lolbas_import_service.py` | Pre-loads existing_ids |
 | `mitre_sync_service.py` | Pre-loads existing_techniques |
 | `notification_service.py` | Queries are not in loops (create_notification is called in loops but does single insert) |
 | `report_engine.py` | No database access |
 | `score_cache.py` | No direct db queries |
 | `sigma_import_service.py` | Pre-loads existing_ids |
 | `stale_detection_service.py` | Single query with subquery, no N+1 |
 | `tempo_service.py` | Single query per call |
 | `threat_actor_import_service.py` | Pre-loads existing_actors, technique_by_mitre_id, existing_rels |
 | `worklog_service.py` | Simple CRUD, no loops |
 ---
 ## Summary Table
 | File | Function | Problem | Est. Extra Queries |
 |------|----------|---------|--------------------|
 | operational_metrics_service | calculate_mttd | N+1 | 2×N (validated tests) |
 | operational_metrics_service | calculate_mttr | N+1 | N (remediated tests) |
 | operational_metrics_service | get_operational_trend | N+1 | ~13–52 (weeks) |
 | operational_metrics_service | calculate_rejection_rate | Redundant | 5 |
 | scoring_service | calculate_organization_score | N+1 | ~3,500–4,000 |
 | scoring_service | calculate_tactic_score | N+1 | 5×N (tactic techniques) |
 | scoring_service | calculate_actor_coverage_score | N+1 | 5×N (actor techniques) |
 | scoring_service | calculate_technique_score | Multiple per call | 5 per technique |
 | d3fend_import_service | _upsert_techniques | N+1 | N (techniques) |
 | d3fend_import_service | import_d3fend_mappings | N+1 | ~200–500 |
 | d3fend_import_service | get_defenses_for_technique | Missing eager load | N (mappings) |
 | report_generation_service | generate_purple_campaign_report | N+1 | N (campaign tests) |
 | osint_enrichment_service | enrich_technique_with_cves | N+1 | ~10 per technique |
 | osint_enrichment_service | enrich_all_techniques | N+1 cascade | ~7,000+ |
 | campaign_service | get_campaign_progress | Missing eager load | N (campaign tests) |
 | campaign_service | generate_campaign_from_threat_actor | N+1 | N (gap techniques) |
 | campaign_scheduler_service | _clone_campaign | Missing eager load | N (campaign tests) |
 | campaign_scheduler_service | check_and_run_recurring_campaigns | N+1 | 1 per campaign |
 | snapshot_service | create_snapshot | N+1 | ~3,500+ |
 | status_service | recalculate_technique_status | Lazy load | 1 |
 | test_workflow_service | get_retest_chain | Redundant | 2 |
 ---
 ## Recommended Fix Priority
 1. **P0 — scoring_service.py** `calculate_organization_score`: ~3,500+ queries per call.
 2. **P0 — snapshot_service.py** `create_snapshot`: ~3,500+ queries per snapshot.
 3. **P1 — operational_metrics_service.py** `calculate_mttd`, `calculate_mttr`, `get_operational_trend`.
 4. **P1 — osint_enrichment_service.py** `enrich_technique_with_cves` and `enrich_all_techniques`.
 5. **P2 — d3fend_import_service.py** `_upsert_techniques`, `import_d3fend_mappings`, `get_defenses_for_technique`.
 6. **P2 — campaign_service.py** and **campaign_scheduler_service.py**.
 7. **P3 — report_generation_service.py**, **test_workflow_service.py**, **status_service.py**.
@@ -0,0 +1,953 @@
 # Aegis — Target Architecture: Clean Modular Monolith
 > **Author:** Architecture review  
 > **Date:** February 11, 2026 (updated February 18, 2026)  
 > **Status:** In Progress — foundational layers implemented  
 > **Depends on:** ARCHITECTURAL_ANALYSIS.md, DEPENDENCY_ANALYSIS.md, TECH_DEBT_AND_RISKS.md
 >
 > **Implementation Progress (Feb 18, 2026):**
 > - ✅ Domain exceptions hierarchy (`domain/errors.py`, `domain/exceptions.py`)
 > - ✅ Error handler middleware (`middleware/error_handler.py`)
 > - ✅ TestEntity with full state machine (`domain/test_entity.py`)
 > - ✅ TechniqueEntity with status recalculation (`domain/entities/technique.py`)
 > - ✅ Value objects: MitreId, ScoringWeights (`domain/value_objects/`)
 > - ✅ Repository ports/protocols (`domain/ports/repositories/`)
 > - ✅ SQLAlchemy repository implementations (`infrastructure/persistence/repositories/`)
 > - ✅ ORM-Entity mappers (`infrastructure/persistence/mappers/`)
 > - ✅ FastAPI dependency wiring (`dependencies/repositories.py`)
 > - ✅ Unit of Work (`domain/unit_of_work.py`)
 > - ✅ Redis-backed token blacklist (`infrastructure/redis_client.py`)
 > - ✅ CI pipeline (`.github/workflows/ci.yml`)
 > - ✅ 326 tests passing (domain unit tests + integration tests + API tests)
 > - ✅ Architecture rules file (`.cursor/rules/aegis-architecture.md`)
 >
 > **Remaining:** Application layer use cases, Campaign/Compliance domain entities, router migration to repositories, scoring config persistence, structured logging.
 ---
 ## Table of Contents
 1. [Target Architecture Overview](#1-target-architecture-overview)
 2. [Layer Definitions and Responsibilities](#2-layer-definitions-and-responsibilities)
 3. [Module Boundaries](#3-module-boundaries)
 4. [Dependency Rules](#4-dependency-rules)
 5. [Top 5 Modules to Refactor First](#5-top-5-modules-to-refactor-first)
 6. [Repository Pattern for Technique](#6-repository-pattern-for-technique)
 ---
 ## 1. Target Architecture Overview
 ### Design Philosophy
 The target architecture applies Clean Architecture principles to a modular monolith. This is not a microservices migration — it is an internal reorganization of the existing codebase to enforce separation of concerns, dependency inversion, and testability while maintaining a single deployable unit.
 ### Target Directory Structure
 ```
 backend/
 └── app/
    ├── main.py                              # FastAPI app bootstrap (minimal)
    ├── config.py                            # Pydantic Settings (read-only)
    │
    ├── domain/                              # ★ DOMAIN LAYER
    │   ├── __init__.py
    │   │
    │   ├── enums.py                         # TechniqueStatus, TestState, TeamSide, TestResult
    │   │                                    #   (moved from models/enums.py — these are domain concepts)
    │   │
    │   ├── exceptions.py                    # Domain exception hierarchy
    │   │                                    #   EntityNotFoundError
    │   │                                    #   DuplicateEntityError
    │   │                                    #   InvalidTransitionError
    │   │                                    #   InvalidOperationError
    │   │                                    #   AuthorizationError
    │   │
    │   ├── events.py                        # Domain event definitions (data classes)
    │   │                                    #   TestStateChanged, TechniqueStatusRecalculated,
    │   │                                    #   CampaignCompleted, EvidenceUploaded
    │   │
    │   ├── entities/                        # Rich domain entities with behavior
    │   │   ├── __init__.py
    │   │   ├── technique.py                 #   TechniqueEntity: recalculate_status(), mark_reviewed()
    │   │   ├── test.py                      #   TestEntity: can_transition(), start_execution(),
    │   │   │                                #     submit_red(), submit_blue(), validate(), reopen()
    │   │   ├── campaign.py                  #   CampaignEntity: add_test(), remove_test(), activate(),
    │   │   │                                #     complete(), has_circular_dependency()
    │   │   ├── user.py                      #   UserEntity: has_role(), can_access()
    │   │   ├── detection_rule.py            #   DetectionRuleEntity
    │   │   ├── threat_actor.py              #   ThreatActorEntity
    │   │   └── evidence.py                  #   EvidenceEntity: validate_upload_permission()
    │   │
    │   ├── value_objects/                   # Immutable, equality-by-value
    │   │   ├── __init__.py
    │   │   ├── mitre_id.py                  #   MitreId: validated format (T1059, T1059.001)
    │   │   ├── score.py                     #   TechniqueScore, TacticScore, OrgScore (with breakdown)
    │   │   └── scoring_weights.py           #   ScoringWeights: validated weight set (sum == 100)
    │   │
    │   └── ports/                           # ★ INTERFACES — the contracts
    │       ├── __init__.py
    │       ├── repositories/                #   Data access contracts (one per aggregate root)
    │       │   ├── __init__.py
    │       │   ├── technique_repository.py  #   TechniqueRepository protocol
    │       │   ├── test_repository.py       #   TestRepository protocol
    │       │   ├── campaign_repository.py   #   CampaignRepository protocol
    │       │   ├── user_repository.py       #   UserRepository protocol
    │       │   ├── detection_rule_repository.py
    │       │   ├── threat_actor_repository.py
    │       │   ├── evidence_repository.py
    │       │   ├── audit_repository.py
    │       │   ├── notification_repository.py
    │       │   └── snapshot_repository.py
    │       │
    │       └── services/                    #   External capability contracts
    │           ├── __init__.py
    │           ├── storage_port.py          #   StoragePort: upload_file(), get_download_url()
    │           ├── event_publisher_port.py  #   EventPublisherPort: publish(DomainEvent)
    │           └── token_blacklist_port.py  #   TokenBlacklistPort: revoke(), is_revoked()
    │
    ├── application/                         # ★ APPLICATION LAYER
    │   ├── __init__.py
    │   │
    │   ├── interfaces/                      # Application-level contracts
    │   │   ├── __init__.py
    │   │   └── unit_of_work.py              #   UnitOfWork protocol: commit(), rollback(), __enter__/__exit__
    │   │
    │   ├── dto/                             # Input/output data structures for use cases
    │   │   ├── __init__.py                  #   Pure data classes — no ORM, no Pydantic
    │   │   ├── technique_dto.py             #   TechniqueListFilters, TechniqueResult, TechniqueDetail
    │   │   ├── test_dto.py                  #   CreateTestInput, TestResult, TestTimeline
    │   │   ├── scoring_dto.py              #   ScoreRequest, ScoreResult, ScoreHistoryResult
    │   │   ├── heatmap_dto.py              #   HeatmapFilters, HeatmapLayer, NavigatorExport
    │   │   ├── report_dto.py               #   CoverageReportResult, CsvExportResult
    │   │   └── campaign_dto.py             #   CreateCampaignInput, CampaignProgress
    │   │
    │   └── use_cases/                       # Orchestrators — one class per operation
    │       ├── __init__.py
    │       │
    │       ├── techniques/
    │       │   ├── list_techniques.py       #   ListTechniquesUseCase
    │       │   ├── get_technique.py         #   GetTechniqueUseCase
    │       │   ├── create_technique.py      #   CreateTechniqueUseCase
    │       │   ├── update_technique.py      #   UpdateTechniqueUseCase
    │       │   └── review_technique.py      #   ReviewTechniqueUseCase
    │       │
    │       ├── tests/
    │       │   ├── create_test.py           #   CreateTestUseCase
    │       │   ├── create_from_template.py  #   CreateFromTemplateUseCase
    │       │   ├── start_execution.py       #   StartExecutionUseCase
    │       │   ├── submit_red.py            #   SubmitRedUseCase
    │       │   ├── submit_blue.py           #   SubmitBlueUseCase
    │       │   ├── validate_test.py         #   ValidateTestUseCase
    │       │   ├── reopen_test.py           #   ReopenTestUseCase
    │       │   └── get_retest_chain.py      #   GetRetestChainUseCase
    │       │
    │       ├── scoring/
    │       │   ├── calculate_technique_score.py
    │       │   ├── calculate_tactic_score.py
    │       │   ├── calculate_org_score.py
    │       │   └── update_scoring_weights.py
    │       │
    │       ├── heatmap/
    │       │   ├── generate_coverage_layer.py
    │       │   ├── generate_actor_layer.py
    │       │   ├── generate_detection_layer.py
    │       │   └── export_navigator.py
    │       │
    │       ├── reports/
    │       │   ├── generate_coverage_report.py
    │       │   ├── generate_test_results_report.py
    │       │   ├── generate_remediation_report.py
    │       │   └── export_coverage_csv.py
    │       │
    │       └── campaigns/
    │           ├── create_campaign.py
    │           ├── manage_campaign_tests.py
    │           ├── activate_campaign.py
    │           ├── generate_from_threat_actor.py
    │           └── schedule_recurring.py
    │
    ├── infrastructure/                      # ★ INFRASTRUCTURE LAYER
    │   ├── __init__.py
    │   │
    │   ├── persistence/
    │   │   ├── __init__.py
    │   │   ├── database.py                  #   Engine, SessionLocal, get_db() — unchanged
    │   │   │
    │   │   ├── orm/                         #   SQLAlchemy models (table mapping ONLY)
    │   │   │   ├── __init__.py              #     Re-export all models for Alembic
    │   │   │   ├── base.py                  #     declarative_base()
    │   │   │   ├── technique_model.py       #     Current models/technique.py — unchanged
    │   │   │   ├── test_model.py            #     Current models/test.py — unchanged
    │   │   │   ├── campaign_model.py
    │   │   │   ├── user_model.py
    │   │   │   └── ...                      #     All 18 current models, untouched
    │   │   │
    │   │   ├── repositories/                #   Concrete repository implementations
    │   │   │   ├── __init__.py
    │   │   │   ├── sa_technique_repository.py
    │   │   │   ├── sa_test_repository.py
    │   │   │   ├── sa_campaign_repository.py
    │   │   │   └── ...                      #     One per domain port
    │   │   │
    │   │   ├── unit_of_work.py              #   SQLAlchemy UoW (wraps Session commit/rollback)
    │   │   │
    │   │   └── mappers/                     #   ORM Model ↔ Domain Entity converters
    │   │       ├── __init__.py
    │   │       ├── technique_mapper.py      #     to_entity(model) → TechniqueEntity
    │   │       │                            #     to_model(entity) → TechniqueORM
    │   │       ├── test_mapper.py
    │   │       └── ...
    │   │
    │   ├── storage/
    │   │   └── minio_storage.py             #   Implements StoragePort (current storage.py logic)
    │   │
    │   ├── auth/
    │   │   ├── jwt_service.py               #   Token creation and verification
    │   │   └── redis_token_blacklist.py     #   Implements TokenBlacklistPort
    │   │
    │   ├── external/                        #   External data source adapters
    │   │   ├── mitre_taxii_adapter.py       #   Current mitre_sync_service.py
    │   │   ├── atomic_red_team_adapter.py   #   Current atomic_import_service.py
    │   │   ├── sigma_adapter.py
    │   │   ├── elastic_adapter.py
    │   │   ├── caldera_adapter.py
    │   │   ├── d3fend_adapter.py
    │   │   ├── lolbas_adapter.py
    │   │   └── threat_actor_adapter.py
    │   │
    │   ├── events/
    │   │   └── sync_event_publisher.py      #   Implements EventPublisherPort (in-process dispatch)
    │   │
    │   ├── cache/
    │   │   └── redis_score_cache.py         #   Replaces current in-memory score_cache.py
    │   │
    │   └── jobs/
    │       └── scheduler.py                 #   APScheduler setup (current mitre_sync_job.py)
    │
    └── presentation/                        # ★ PRESENTATION LAYER
        ├── __init__.py
        │
        ├── api/
        │   └── v1/                          #   Thin routers — HTTP mapping only
        │       ├── __init__.py
        │       ├── techniques.py            #     Injects use case via Depends(), maps exceptions
        │       ├── tests.py
        │       ├── campaigns.py
        │       ├── heatmap.py
        │       ├── reports.py
        │       ├── scores.py
        │       ├── metrics.py
        │       └── ...                      #     All 21 current routers, thinned
        │
        ├── schemas/                         #   Pydantic models (request/response shapes)
        │   ├── __init__.py                  #     Current schemas/ — unchanged
        │   ├── technique_schema.py
        │   ├── test_schema.py
        │   └── ...
        │
        ├── dependencies/                    #   FastAPI Depends() wiring
        │   ├── __init__.py
        │   ├── auth.py                      #     Current dependencies/auth.py
        │   ├── repositories.py              #     get_technique_repo(), get_test_repo(), ...
        │   └── use_cases.py                 #     get_create_technique_use_case(), ...
        │
        ├── middleware/
        │   ├── error_handler.py             #   Maps domain exceptions → HTTP responses
        │   └── rate_limiter.py
        │
        └── mappers/                         #   Pydantic schema ↔ application DTO converters
            ├── __init__.py
            ├── technique_mapper.py          #     TechniqueCreate → CreateTechniqueInput
            │                                #     TechniqueResult → TechniqueOut
            └── ...
 ```
 ---
 ## 2. Layer Definitions and Responsibilities
 ### Domain Layer — The Core
 ```
 Depends on: NOTHING (zero imports from outside domain/)
 ```
 | Component | Responsibility | What It Must NOT Do |
 |-----------|---------------|---------------------|
 | **Entities** | Encapsulate business rules, invariants, and state transitions. A `TestEntity` knows which transitions are valid. A `TechniqueEntity` can recalculate its own status from a list of test results. | Import SQLAlchemy, FastAPI, Pydantic, or any framework. Access the database. Make HTTP calls. |
 | **Value Objects** | Represent domain concepts with value equality. `MitreId("T1059.001")` validates format on construction. `ScoringWeights` ensures the 5 weights sum to 100. | Be mutable. Have identity (no primary key). |
 | **Enums** | Define domain vocabularies: `TechniqueStatus`, `TestState`, `TeamSide`, `TestResult`. | Change based on infrastructure (these are the same enums currently in `models/enums.py`). |
 | **Exceptions** | Domain-specific error conditions. `InvalidTransitionError(current=draft, target=validated)`. | Reference HTTP status codes. Know about FastAPI. |
 | **Events** | Facts about things that happened. `TestStateChanged(test_id, old_state, new_state, user_id, timestamp)`. | Carry behavior. Know how they will be handled. |
 | **Ports** | Interfaces (Protocol) defining what the domain needs from the outside world. `TechniqueRepository`, `StoragePort`, `EventPublisherPort`. | Contain implementations. Reference concrete classes. |
 ### Application Layer — The Orchestrators
 ```
 Depends on: domain/ only
 ```
 | Component | Responsibility | What It Must NOT Do |
 |-----------|---------------|---------------------|
 | **Use Cases** | Orchestrate a single business operation by calling domain entities and ports. `CreateTechniqueUseCase` validates uniqueness via `TechniqueRepository`, constructs a `TechniqueEntity`, saves it, and publishes an event. | Know about HTTP, Pydantic, SQLAlchemy, or FastAPI. Contain business rules (those belong in entities). Contain queries (those belong in repositories). |
 | **DTOs** | Plain data containers for use case input/output. No validation logic, no ORM awareness. | Inherit from Pydantic `BaseModel`. Reference ORM models. |
 | **Unit of Work** | Interface for transaction boundaries. Use cases call `uow.commit()` or `uow.rollback()`. | Know about SQLAlchemy sessions. |
 ### Infrastructure Layer — The Implementations
 ```
 Depends on: domain/ (implements ports), application/ (implements UoW)
 ```
 | Component | Responsibility | What It Must NOT Do |
 |-----------|---------------|---------------------|
 | **ORM Models** | Map Python classes to database tables. Unchanged from current `models/`. | Contain business logic. Be passed outside the infrastructure layer (use mappers to convert to domain entities). |
 | **Repositories** | Implement port interfaces using SQLAlchemy. `SATechniqueRepository.find_by_mitre_id()` translates to `db.query(Technique).filter(...)`. | Be called by anything outside the application layer. Contain business decisions. |
 | **Mappers** | Convert between ORM models and domain entities. `TechniqueMapper.to_entity(orm_model) → TechniqueEntity`. | Contain business logic. Be a 1:1 field copy (they handle relationship loading and value object construction). |
 | **External Adapters** | Implement data source integrations. Download ZIPs, parse YAML/TOML/STIX, return domain-compatible data. | Be called from routers directly. Know about HTTP responses. |
 | **Storage, Cache, Auth** | Implement service ports. `MinioStorage` implements `StoragePort`. `RedisTokenBlacklist` implements `TokenBlacklistPort`. | Leak implementation details (Redis keys, S3 bucket names) outside the infrastructure layer. |
 ### Presentation Layer — The HTTP Boundary
 ```
 Depends on: application/ (calls use cases), domain/ (reads exceptions)
 ```
 | Component | Responsibility | What It Must NOT Do |
 |-----------|---------------|---------------------|
 | **Routers** | Map HTTP requests to use case calls. Parse path/query/body parameters, call the use case, return the response. 10-20 lines per endpoint maximum. | Contain business logic. Execute database queries. Build complex data structures. |
 | **Schemas** | Pydantic models for HTTP request/response validation. Unchanged from current `schemas/`. | Be used inside use cases or domain entities. |
 | **Dependencies** | Wire use cases via FastAPI `Depends()`. Construct repositories, inject into use cases, return. | Contain logic beyond wiring. |
 | **Error Handler** | Map domain exceptions to HTTP responses. `EntityNotFoundError → 404`, `InvalidTransitionError → 400`, `AuthorizationError → 403`. | Know about business rules. |
 | **Mappers** | Convert between Pydantic schemas and application DTOs. | Contain business logic. |
 ---
 ## 3. Module Boundaries
 The monolith is organized into domain modules. Each module owns its entities, repositories, and use cases. Cross-module communication goes through application-layer use cases or domain events — never through direct repository access.
 ```
 ┌─────────────────────────────────────────────────────────────────┐
 │                        Domain Modules                           │
 │                                                                 │
 │  ┌───────────┐  ┌───────────┐  ┌───────────┐  ┌─────────────┐  │
 │  │ Technique │  │   Test    │  │ Campaign  │  │   Scoring   │  │
 │  │           │  │           │  │           │  │             │  │
 │  │ entity    │  │ entity    │  │ entity    │  │ value objs  │  │
 │  │ repo port │  │ repo port │  │ repo port │  │ use cases   │  │
 │  │ use cases │  │ use cases │  │ use cases │  │ (reads from │  │
 │  │           │  │           │  │           │  │ other repos)│  │
 │  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘  └──────┬──────┘  │
 │        │              │              │               │          │
 │  ┌─────┴──────────────┴──────────────┴───────────────┴──────┐   │
 │  │              Shared Domain: enums, exceptions, events     │   │
 │  └───────────────────────────────────────────────────────────┘   │
 │                                                                 │
 │  ┌───────────┐  ┌───────────┐  ┌───────────┐  ┌─────────────┐  │
 │  │  Heatmap  │  │  Reports  │  │Compliance │  │ Threat Intel│  │
 │  │           │  │           │  │           │  │             │  │
 │  │ use cases │  │ use cases │  │ use cases │  │ adapters    │  │
 │  │ (reads    │  │ (reads    │  │ (reads    │  │ use cases   │  │
 │  │  repos)   │  │  repos)   │  │  repos)   │  │             │  │
 │  └───────────┘  └───────────┘  └───────────┘  └─────────────┘  │
 └─────────────────────────────────────────────────────────────────┘
 ```
 **Cross-module rule:** A use case in the Scoring module may read from `TechniqueRepository` and `TestRepository` (both defined as ports in the domain layer). It must NOT import the SQLAlchemy model directly.
 ---
 ## 4. Dependency Rules
 ```
         ┌─────────────────┐
         │   Presentation   │  Knows: FastAPI, Pydantic, HTTP
         │   (routers,      │  Depends on: Application, Domain
         │    schemas)       │
         └────────┬─────────┘
                  │ calls use cases
         ┌────────▼─────────┐
         │   Application     │  Knows: Domain entities, ports, DTOs
         │   (use cases)     │  Depends on: Domain ONLY
         └────────┬─────────┘
                  │ uses entities + ports
         ┌────────▼─────────┐
         │     Domain        │  Knows: NOTHING external
         │   (entities,      │  Depends on: NOTHING
         │    ports, enums)  │  (this is the core)
         └────────▲─────────┘
                  │ implements ports
         ┌────────┴─────────┐
         │  Infrastructure   │  Knows: SQLAlchemy, boto3, Redis, requests
         │  (repositories,   │  Depends on: Domain (ports), Application (UoW)
         │   adapters)       │
         └──────────────────┘
 ```
 ### Import Rules (Enforceable by Linting)
 | From \ To | domain/ | application/ | infrastructure/ | presentation/ |
 |-----------|---------|-------------|----------------|--------------|
 | **domain/** | Self only | FORBIDDEN | FORBIDDEN | FORBIDDEN |
 | **application/** | ALLOWED | Self only | FORBIDDEN | FORBIDDEN |
 | **infrastructure/** | ALLOWED (ports) | ALLOWED (UoW) | Self only | FORBIDDEN |
 | **presentation/** | ALLOWED (exceptions) | ALLOWED (use cases, DTOs) | ALLOWED (wiring only, in dependencies/) | Self only |
 ---
 ## 5. Top 5 Modules to Refactor First
 ### Selection Criteria
 Each module is scored on three axes from the DEPENDENCY_ANALYSIS.md findings:
 | Axis | Weight | Measurement |
 |------|--------|-------------|
 | **Complexity** | 35% | Lines of code, number of DB operations, number of models imported, number of concerns mixed |
 | **Technical Risk** | 35% | N+1 queries, security issues, silent exception swallowing, framework coupling, scalability bottleneck |
 | **Business Impact** | 30% | Centrality to the domain (how many other modules depend on it), user-facing frequency, correctness criticality |
 ---
 ### #1: Test Workflow Module
 **Refactor scope:** `routers/tests.py` (664 lines, 30 db ops) + `services/test_workflow_service.py` (456 lines, 13 db ops) + `services/status_service.py` (47 lines)
 | Axis | Score | Evidence |
 |------|-------|----------|
 | Complexity | **10/10** | 664-line router with 15+ endpoints. Mixes CRUD, template instantiation, timeline queries, and workflow delegation. The workflow service itself is 456 lines with a state machine, notifications, and audit logging. |
 | Technical Risk | **10/10** | `test_workflow_service` imports `FastAPI.HTTPException` — the most severe framework coupling in the codebase. 4 `except Exception: pass` blocks silently swallow notification failures. No way to unit test the state machine without a database session. |
 | Business Impact | **10/10** | The Red/Blue validation workflow IS the core product. Every user role interacts with tests daily. A state transition bug could invalidate an entire assessment. 5 other modules depend on test data (scoring, heatmap, reports, metrics, campaigns). |
 **Why first:** This module contains the single most important business logic in Aegis (the test state machine), yet it has the most severe coupling problems (HTTPException in domain logic, swallowed exceptions). Extracting a `TestEntity` with the state machine as a domain object unlocks pure unit testing of the most critical business rules.
 **What to extract:**
 - `TestEntity` with `can_transition()`, `start_execution()`, `submit_red()`, `submit_blue()`, `validate()`, `reopen()` → `domain/entities/test.py`
 - `InvalidTransitionError`, `EntityNotFoundError` → `domain/exceptions.py`
 - `TestRepository` protocol → `domain/ports/repositories/test_repository.py`
 - One use case per state transition → `application/use_cases/tests/`
 - Remove all `HTTPException` from services
 - Replace `except Exception: pass` with event-based notification dispatch
 ---
 ### #2: Scoring Module
 **Refactor scope:** `services/scoring_service.py` (468 lines, 17 db ops) + `services/score_cache.py` + `routers/scores.py` (2 db ops) + `services/operational_metrics_service.py` (21 db ops)
 | Axis | Score | Evidence |
 |------|-------|----------|
 | Complexity | **9/10** | Multi-dimensional scoring algorithm reading from 7 different models. 5 configurable weights. Tactic, actor, and org scores compound technique scores. Operational metrics add MTTD/MTTR calculations with audit log queries. |
 | Technical Risk | **9/10** | **SR-001 from risk registry:** Org score generates ~3,500 DB queries (N+1 pattern). Settings mutated at runtime (thread-unsafe). In-memory cache does not scale across workers. Operational metrics N+1 on audit logs adds ~1,000 more queries. |
 | Business Impact | **9/10** | Scores drive executive dashboards, compliance reports, and snapshot history. Incorrect scores misrepresent organizational security posture. Scoring weights mutability without persistence means config is lost on restart. |
 **Why second:** Scoring is the second most critical domain concept and the most severe scalability bottleneck. Refactoring it introduces the repository pattern for batch queries and moves scoring weights to a persistent, immutable configuration.
 **What to extract:**
 - `TechniqueScore`, `TacticScore`, `OrgScore` value objects → `domain/value_objects/score.py`
 - `ScoringWeights` value object with validation → `domain/value_objects/scoring_weights.py`
 - Scoring algorithm as pure functions operating on domain objects → `application/use_cases/scoring/`
 - Batch query methods in repositories → `TechniqueRepository.find_all_with_test_counts()`
 - Redis-backed cache → `infrastructure/cache/`
 - Persist weights in DB → `ScoringConfigRepository`
 ---
 ### #3: Heatmap Module
 **Refactor scope:** `routers/heatmap.py` (528 lines, 13 db ops, 0 service delegation)
 | Axis | Score | Evidence |
 |------|-------|----------|
 | Complexity | **9/10** | 528 lines in a single router file. Imports 10 models from 6 different domains. Mixes HTTP handling, complex multi-table queries, color mapping algorithms, ATT&CK Navigator JSON serialization, and streaming export — all in one file with zero delegation. |
 | Technical Risk | **8/10** | **SR-003 from risk registry:** 1,400+ queries per request (2 per technique × 700). No caching. Full table scan. Every heatmap page load hammers the database. Most-visited view in the platform. |
 | Business Impact | **8/10** | The ATT&CK heatmap is the primary visualization — it is the first thing executives see. Navigator export is used for external reporting and audit evidence. Incorrect heatmap data directly impacts security decision-making. |
 **Why third:** This is the purest "fat controller" in the codebase — 528 lines of business logic, queries, and serialization with zero abstraction. It is also the most-visited page and the second-worst scalability bottleneck. Extracting it demonstrates the pattern for all other fat routers.
 **What to extract:**
 - Layer generation logic → `application/use_cases/heatmap/generate_coverage_layer.py` etc.
 - Navigator export format → `application/use_cases/heatmap/export_navigator.py`
 - Color mapping → `domain/value_objects/` or utility in application layer
 - Batch metadata queries → `TechniqueRepository.find_all_with_coverage_metadata()`
 - Router reduced from 528 lines to ~80 (5 endpoints × ~15 lines each)
 ---
 ### #4: Campaign Module
 **Refactor scope:** `routers/campaigns.py` (36 db ops) + `services/campaign_service.py` (10 db ops, imports HTTPException) + `services/campaign_scheduler_service.py` (8 db ops)
 | Axis | Score | Evidence |
 |------|-------|----------|
 | Complexity | **8/10** | Router has 36 db operations — the highest count of any router. Campaign lifecycle spans creation, test management, activation, completion, scheduling, and threat actor generation. Three files with partially overlapping responsibilities. |
 | Technical Risk | **7/10** | `campaign_service.py` imports `HTTPException` (framework coupling). Scheduler creates campaigns in background jobs with its own session. Circular dependency detection logic is complex and untested (no campaign router tests exist). |
 | Business Impact | **8/10** | Campaigns organize test execution for entire threat actor profiles. A bug in campaign scheduling or circular dependency detection could spawn infinite campaigns or skip critical test coverage. Campaigns drive the operational workflow for Red/Blue leads. |
 **Why fourth:** The campaign module has the most scattered responsibilities (36 db ops in router + service + scheduler) and the second instance of HTTPException in a service. It is a natural candidate after tests, scoring, and heatmap because it depends on both test and technique entities, testing the cross-module communication pattern.
 **What to extract:**
 - `CampaignEntity` with `add_test()`, `activate()`, `complete()`, `has_circular_dependency()` → `domain/entities/campaign.py`
 - `CampaignRepository` protocol → `domain/ports/repositories/`
 - Use cases for lifecycle operations → `application/use_cases/campaigns/`
 - Remove `HTTPException` from `campaign_service.py`
 - Campaign scheduling as infrastructure concern → `infrastructure/jobs/`
 ---
 ### #5: Reports & Metrics Module
 **Refactor scope:** `routers/reports.py` (273 lines, 6 db ops) + `routers/metrics.py` (316 lines, 12 db ops) + `routers/compliance.py` (~350 lines, 13 db ops)
 | Axis | Score | Evidence |
 |------|-------|----------|
 | Complexity | **8/10** | Three routers totaling ~940 lines with zero service delegation. Complex aggregation queries, CSV generation, in-memory data transformation, and compliance gap analysis — all inline in route handlers. |
 | Technical Risk | **7/10** | **SR-004 from risk registry:** Reports load unbounded result sets (all techniques, all tests). N+1 per-technique test counts in reports. In-memory aggregation instead of SQL GROUP BY. No streaming for CSV export. Compliance calls `calculate_technique_score()` per technique per control — multiplicative N+1. |
 | Business Impact | **7/10** | Reports and metrics are consumed by leads and executives for decision-making. Compliance reports map to regulatory requirements (NIST 800-53, CIS Controls). Incorrect metrics erode trust in the platform. |
 **Why fifth:** These three routers share the same anti-pattern (fat controller with inline queries and aggregations) and the same fix (extract to application-layer use cases with repository-backed batch queries). Refactoring them as a group establishes the pattern for the remaining 8 routers that still have direct DB access.
 **What to extract:**
 - Report generation → `application/use_cases/reports/`
 - Metrics calculation → `application/use_cases/metrics/` (or merge with scoring)
 - Compliance gap analysis → `application/use_cases/compliance/`
 - SQL-level aggregation in repositories → `TechniqueRepository.get_coverage_summary()`
 - CSV streaming as infrastructure concern → `infrastructure/export/csv_writer.py`
 ---
 ### Refactor Priority Summary
 ```
 Module              Complexity  Risk  Impact  Weighted  Order
 ─────────────────────────────────────────────────────────
 Test Workflow        10         10     10      10.0       #1
 Scoring              9          9      9       9.0        #2
 Heatmap              9          8      8       8.4        #3
 Campaigns            8          7      8       7.7        #4
 Reports & Metrics    8          7      7       7.4        #5
 ```
 ---
 ## 6. Repository Pattern for Technique
 This section designs a concrete repository pattern for `Technique` that can be introduced **without breaking existing code**. The strategy is additive: new code uses the repository, old code continues working until incrementally migrated.
 ### 6.1. Domain Port — The Interface
 ```python
 # domain/ports/repositories/technique_repository.py
 from __future__ import annotations
 import uuid
 from typing import Protocol, runtime_checkable
 from app.domain.enums import TechniqueStatus
@runtime_checkable
 class TechniqueRepository(Protocol):
    """Port defining how the application accesses technique data.
    This is a domain contract — implementations live in infrastructure/.
    The domain layer NEVER imports the implementation.
    """
    # ── Single-entity access ─────────────────────────────────────
    def find_by_id(self, technique_id: uuid.UUID) -> TechniqueEntity | None:
        """Return a technique by primary key, or None."""
        ...
    def find_by_mitre_id(self, mitre_id: str) -> TechniqueEntity | None:
        """Return a technique by its MITRE ATT&CK identifier (e.g. 'T1059.001')."""
        ...
    def find_by_mitre_id_with_tests(self, mitre_id: str) -> TechniqueEntity | None:
        """Return a technique with its tests eagerly loaded."""
        ...
    # ── List access ──────────────────────────────────────────────
    def list_all(
        self,
        *,
        tactic: str | None = None,
        status: TechniqueStatus | None = None,
        review_required: bool | None = None,
    ) -> list[TechniqueEntity]:
        """Return techniques matching the given filters, ordered by mitre_id."""
        ...
    def list_by_tactic(self, tactic: str) -> list[TechniqueEntity]:
        """Return all techniques for a given tactic."""
        ...
    def list_by_ids(self, ids: list[uuid.UUID]) -> list[TechniqueEntity]:
        """Return techniques matching a list of primary keys."""
        ...
    # ── Batch queries (for scoring/heatmap performance) ──────────
    def count_by_status(self) -> dict[TechniqueStatus, int]:
        """Return technique counts grouped by status_global.
        Single SQL query — replaces the per-technique counting pattern."""
        ...
    def find_all_with_test_counts(self) -> list[TechniqueWithCounts]:
        """Return all techniques with pre-aggregated test counts and
        detection rule counts. Single query with subqueries — eliminates
        the N+1 pattern in heatmap and scoring."""
        ...
    # ── Mutations ────────────────────────────────────────────────
    def save(self, technique: TechniqueEntity) -> TechniqueEntity:
        """Persist a new or updated technique. Returns the saved entity."""
        ...
    def exists_by_mitre_id(self, mitre_id: str) -> bool:
        """Check existence without loading the full entity."""
        ...
 ```
 **Key design decisions:**
 - Uses `typing.Protocol` (structural subtyping) rather than `ABC` — no need for the implementation to explicitly inherit. This is idiomatic Python and works with `isinstance()` checks via `@runtime_checkable`.
 - Methods return domain entities (`TechniqueEntity`), never ORM models.
 - Batch methods (`count_by_status`, `find_all_with_test_counts`) are designed to eliminate the N+1 patterns identified in SR-001 and SR-003.
 - No `Session` parameter — the session is an implementation detail of the SQLAlchemy repository.
 ### 6.2. Infrastructure Implementation — SQLAlchemy
 ```python
 # infrastructure/persistence/repositories/sa_technique_repository.py
 import uuid
 from typing import NamedTuple
 from sqlalchemy import func
 from sqlalchemy.orm import Session, joinedload
 from app.domain.enums import TechniqueStatus
 from app.domain.entities.technique import TechniqueEntity
 from app.domain.ports.repositories.technique_repository import TechniqueRepository
 from app.infrastructure.persistence.orm.technique_model import Technique
 from app.infrastructure.persistence.orm.test_model import Test
 from app.infrastructure.persistence.orm.detection_rule_model import DetectionRule
 from app.infrastructure.persistence.mappers.technique_mapper import TechniqueMapper
 class TechniqueWithCounts(NamedTuple):
    """Pre-aggregated technique data for heatmap/scoring."""
    entity: TechniqueEntity
    test_count: int
    validated_test_count: int
    detection_rule_count: int
 class SATechniqueRepository:
    """SQLAlchemy implementation of TechniqueRepository.
    Receives a Session from the Unit of Work — does NOT create its own.
    Does NOT call commit() — that is the Unit of Work's responsibility.
    """
    def __init__(self, session: Session) -> None:
        self._session = session
    # ── Single-entity access ─────────────────────────────────────
    def find_by_id(self, technique_id: uuid.UUID) -> TechniqueEntity | None:
        model = self._session.query(Technique).filter(
            Technique.id == technique_id
        ).first()
        return TechniqueMapper.to_entity(model) if model else None
    def find_by_mitre_id(self, mitre_id: str) -> TechniqueEntity | None:
        model = self._session.query(Technique).filter(
            Technique.mitre_id == mitre_id
        ).first()
        return TechniqueMapper.to_entity(model) if model else None
    def find_by_mitre_id_with_tests(self, mitre_id: str) -> TechniqueEntity | None:
        model = (
            self._session.query(Technique)
            .options(joinedload(Technique.tests))
            .filter(Technique.mitre_id == mitre_id)
            .first()
        )
        return TechniqueMapper.to_entity_with_tests(model) if model else None
    # ── List access ──────────────────────────────────────────────
    def list_all(
        self,
        *,
        tactic: str | None = None,
        status: TechniqueStatus | None = None,
        review_required: bool | None = None,
    ) -> list[TechniqueEntity]:
        query = self._session.query(Technique)
        if tactic is not None:
            query = query.filter(Technique.tactic == tactic)
        if status is not None:
            query = query.filter(Technique.status_global == status)
        if review_required is not None:
            query = query.filter(Technique.review_required == review_required)
        models = query.order_by(Technique.mitre_id).all()
        return [TechniqueMapper.to_entity(m) for m in models]
    def list_by_tactic(self, tactic: str) -> list[TechniqueEntity]:
        models = (
            self._session.query(Technique)
            .filter(Technique.tactic == tactic)
            .order_by(Technique.mitre_id)
            .all()
        )
        return [TechniqueMapper.to_entity(m) for m in models]
    def list_by_ids(self, ids: list[uuid.UUID]) -> list[TechniqueEntity]:
        models = (
            self._session.query(Technique)
            .filter(Technique.id.in_(ids))
            .all()
        )
        return [TechniqueMapper.to_entity(m) for m in models]
    # ── Batch queries ────────────────────────────────────────────
    def count_by_status(self) -> dict[TechniqueStatus, int]:
        rows = (
            self._session.query(
                Technique.status_global,
                func.count(Technique.id),
            )
            .group_by(Technique.status_global)
            .all()
        )
        result = {s: 0 for s in TechniqueStatus}
        for status_val, count in rows:
            result[status_val] = count
        return result
    def find_all_with_test_counts(self) -> list[TechniqueWithCounts]:
        """Single query that replaces the N+1 pattern.
        Instead of: for each technique → query tests → query rules
        This does:  one query with subqueries for counts.
        """
        test_count_sq = (
            self._session.query(
                Test.technique_id,
                func.count(Test.id).label("test_count"),
                func.count(Test.id).filter(Test.state == "validated").label("validated_count"),
            )
            .group_by(Test.technique_id)
            .subquery()
        )
        rule_count_sq = (
            self._session.query(
                DetectionRule.mitre_technique_id,
                func.count(DetectionRule.id).label("rule_count"),
            )
            .group_by(DetectionRule.mitre_technique_id)
            .subquery()
        )
        rows = (
            self._session.query(
                Technique,
                func.coalesce(test_count_sq.c.test_count, 0),
                func.coalesce(test_count_sq.c.validated_count, 0),
                func.coalesce(rule_count_sq.c.rule_count, 0),
            )
            .outerjoin(test_count_sq, Technique.id == test_count_sq.c.technique_id)
            .outerjoin(rule_count_sq, Technique.mitre_id == rule_count_sq.c.mitre_technique_id)
            .order_by(Technique.mitre_id)
            .all()
        )
        return [
            TechniqueWithCounts(
                entity=TechniqueMapper.to_entity(tech),
                test_count=tc,
                validated_test_count=vtc,
                detection_rule_count=rc,
            )
            for tech, tc, vtc, rc in rows
        ]
    # ── Mutations ────────────────────────────────────────────────
    def save(self, technique: TechniqueEntity) -> TechniqueEntity:
        model = TechniqueMapper.to_model(technique)
        merged = self._session.merge(model)
        self._session.flush()  # flush to get generated values, but do NOT commit
        return TechniqueMapper.to_entity(merged)
    def exists_by_mitre_id(self, mitre_id: str) -> bool:
        return (
            self._session.query(Technique.id)
            .filter(Technique.mitre_id == mitre_id)
            .first()
        ) is not None
 ```
 **Key design decisions:**
 - **No `commit()`**: The repository flushes but never commits. Transaction control belongs to the Unit of Work, which the use case manages.
 - **Returns domain entities**: The mapper converts ORM models to domain entities at the repository boundary. No ORM model ever crosses into the application or domain layers.
 - **Batch method**: `find_all_with_test_counts()` replaces the N+1 pattern with subqueries — reducing 1,400+ queries to 1 for the heatmap.
 ### 6.3. Injection into a Use Case
 ```python
 # presentation/dependencies/repositories.py
 from fastapi import Depends
 from sqlalchemy.orm import Session
 from app.domain.ports.repositories.technique_repository import TechniqueRepository
 from app.infrastructure.persistence.database import get_db
 from app.infrastructure.persistence.repositories.sa_technique_repository import (
    SATechniqueRepository,
 )
 def get_technique_repository(
    db: Session = Depends(get_db),
 ) -> TechniqueRepository:
    """FastAPI dependency that provides a TechniqueRepository.
    Wiring lives ONLY in the presentation layer — the use case
    never knows it's getting a SQLAlchemy implementation.
    """
    return SATechniqueRepository(db)
 ```
 ```python
 # presentation/dependencies/use_cases.py
 from fastapi import Depends
 from app.application.use_cases.techniques.create_technique import CreateTechniqueUseCase
 from app.domain.ports.repositories.technique_repository import TechniqueRepository
 from app.presentation.dependencies.repositories import get_technique_repository
 def get_create_technique_use_case(
    technique_repo: TechniqueRepository = Depends(get_technique_repository),
 ) -> CreateTechniqueUseCase:
    return CreateTechniqueUseCase(technique_repo=technique_repo)
 ```
 ```python
 # application/use_cases/techniques/create_technique.py
 import uuid
 from app.domain.entities.technique import TechniqueEntity
 from app.domain.exceptions import DuplicateEntityError
 from app.domain.ports.repositories.technique_repository import TechniqueRepository
 from app.application.dto.technique_dto import CreateTechniqueInput, TechniqueResult
 class CreateTechniqueUseCase:
    """Application use case: create a new MITRE ATT&CK technique.
    This class knows NOTHING about:
    - FastAPI, HTTP, Pydantic
    - SQLAlchemy, PostgreSQL
    - How the repository is implemented
    """
    def __init__(self, technique_repo: TechniqueRepository) -> None:
        self._repo = technique_repo
    def execute(self, input: CreateTechniqueInput, user_id: uuid.UUID) -> TechniqueResult:
        # Business rule: mitre_id must be unique
        if self._repo.exists_by_mitre_id(input.mitre_id):
            raise DuplicateEntityError("Technique", "mitre_id", input.mitre_id)
        # Create domain entity
        technique = TechniqueEntity.create(
            mitre_id=input.mitre_id,
            name=input.name,
            description=input.description,
            tactic=input.tactic,
            platforms=input.platforms,
        )
        # Persist through repository
        saved = self._repo.save(technique)
        # Return application DTO
        return TechniqueResult.from_entity(saved)
 ```
 ```python
 # presentation/api/v1/techniques.py  (refactored — thin router)
 from fastapi import APIRouter, Depends, status
 from app.application.use_cases.techniques.create_technique import CreateTechniqueUseCase
 from app.domain.exceptions import DuplicateEntityError, EntityNotFoundError
 from app.presentation.dependencies.auth import get_current_user, require_role
 from app.presentation.dependencies.use_cases import get_create_technique_use_case
 from app.presentation.schemas.technique_schema import TechniqueCreate, TechniqueOut
 router = APIRouter(prefix="/techniques", tags=["techniques"])
@router.post("", response_model=TechniqueOut, status_code=status.HTTP_201_CREATED)
 def create_technique(
    payload: TechniqueCreate,
    use_case: CreateTechniqueUseCase = Depends(get_create_technique_use_case),
    current_user = Depends(require_role("admin")),
 ):
    """Create a new technique.
    This router:
    - Receives the HTTP request (Pydantic validates it)
    - Calls the use case
    - The error handler middleware maps domain exceptions to HTTP responses
    - Returns the result
    Total: 5 lines of actual logic.
    """
    result = use_case.execute(
        input=CreateTechniqueInput(
            mitre_id=payload.mitre_id,
            name=payload.name,
            description=payload.description,
            tactic=payload.tactic,
            platforms=payload.platforms,
        ),
        user_id=current_user.id,
    )
    return result
 ```
 ### 6.4. Coexistence Strategy — No Big Bang
 The repository can be introduced **alongside existing code** without breaking anything:
 ```
 Phase 1: Create the repository interface and SQLAlchemy implementation.
         Both old (direct db.query) and new (repository) code coexist.
         New endpoints use the repository. Old endpoints are unchanged.
 Phase 2: Migrate routers one endpoint at a time.
         Replace db.query(Technique).filter(...) with repo.find_by_mitre_id().
         Each migration is a small, reviewable PR.
 Phase 3: When all consumers are migrated, the ORM model is no longer
         imported outside infrastructure/. Enforce via linting rule.
 ```
 At no point does existing functionality break. Both patterns access the same database, the same tables, the same session. The repository is an additive abstraction — it wraps what already exists.
@@ -0,0 +1,654 @@
 # Aegis — Technical Debt, Risks & Improvement Plan
 > **Author:** Architecture review  
 > **Date:** February 11, 2026 (updated February 18, 2026)  
 > **Scope:** Backend, Frontend, Infrastructure, Security, Scalability, Maintainability
 >
 > **Note:** Items marked with ✅ have been resolved. See inline status annotations.
 ---
 ## Table of Contents
 1. [Technical Debt](#1-technical-debt)
 2. [Scalability Risks](#2-scalability-risks)
 3. [Security Risks](#3-security-risks)
 4. [Maintainability Risks](#4-maintainability-risks)
 5. [Recommended Medium-Term Improvements](#5-recommended-medium-term-improvements)
 6. [Priority Matrix](#6-priority-matrix)
 ---
 ## 1. Technical Debt
 ### HIGH PRIORITY
 #### TD-001: Fat Controllers (Routers with Embedded Business Logic)
 **Current state:** 11 of 21 routers execute raw SQLAlchemy queries directly. The worst offenders:
 | Router | Lines | Embedded Logic |
 |--------|-------|----------------|
 | `heatmap.py` | 528 | Query building + color mapping + ATT&CK Navigator JSON serialization + export |
 | `tests.py` | 664 | CRUD + template instantiation + timeline queries (workflow delegated) |
 | `reports.py` | 273 | Aggregation queries + CSV generation + JSON formatting |
 | `compliance.py` | ~350 | CRUD + import + gap analysis + CSV export |
 | `metrics.py` | ~316 | Complex aggregation queries with in-memory processing |
 **Impact:** Cannot unit test business logic without spinning up FastAPI + DB. Logic duplication across routers. Changes to one query pattern must be replicated manually in every router that uses it.
 **Remediation:** Extract query logic to service/repository layer. Each router endpoint should be < 20 lines.
 ---
 #### TD-002: No Repository Layer — Scattered Duplicate Queries ✅ PARTIALLY RESOLVED
 **Current state (updated Feb 18):** Repository ports (Protocol interfaces) and SQLAlchemy implementations now exist for `Technique` and `Test`:
 - `domain/ports/repositories/technique_repository.py` — Protocol with `find_by_id()`, `find_by_mitre_id()`, `list_all()`, `count_by_status()`, `find_all_with_test_counts()`, `save()`, etc.
 - `domain/ports/repositories/test_repository.py` — Protocol with `find_by_id()`, `list_by_technique()`, `get_states_and_results_for_technique()`, etc.
 - `infrastructure/persistence/repositories/sa_technique_repository.py` — Concrete implementation with batch queries (eliminates N+1 for heatmap/scoring).
 - `infrastructure/persistence/repositories/sa_test_repository.py` — Concrete implementation.
 - `dependencies/repositories.py` — FastAPI `Depends()` wiring.
 **Remaining:** Old routers still use direct `db.query()`. New endpoints should use repositories; existing endpoints will be migrated incrementally.
 **Impact:** New code has centralized query management. Old queries still scattered but coexist safely.
 ---
 #### TD-003: Services Depend on FastAPI (HTTPException in Domain Logic) ✅ RESOLVED
 **Current state (updated Feb 18):** Domain exceptions have been implemented and are in active use:
 - `domain/errors.py` — Full exception hierarchy: `DomainError`, `EntityNotFoundError`, `DuplicateEntityError`, `InvalidStateTransition`, `BusinessRuleViolation`, `InvalidOperationError`, `PermissionViolation`.
 - `domain/exceptions.py` — Backward-compatible re-exports.
 - `middleware/error_handler.py` — Maps domain exceptions to HTTP responses automatically (404, 409, 400, 403).
 - `test_workflow_service.py` — Now raises `InvalidOperationError` and `InvalidStateTransition` instead of `HTTPException`.
 **No further action needed** for the core services. Some secondary routers may still raise `HTTPException` directly (which is acceptable at the presentation layer).
 ---
 #### TD-004: Mutable Global Settings at Runtime
 **Current state:** The `scores.py` router mutates `settings` directly:
 ```python
 settings.SCORING_WEIGHT_TESTS = body.weight_tests
 settings.SCORING_WEIGHT_DETECTION_RULES = body.weight_detection_rules
 ```
 **Impact:** Changes lost on restart. Thread-unsafe with multiple workers. No audit trail for config changes.
 **Remediation:** Persist scoring weights in the database. Create a `ScoringConfig` table. Load weights from DB in scoring_service.
 ---
 #### TD-005: Anemic Domain Models ✅ PARTIALLY RESOLVED
 **Current state (updated Feb 18):** Rich domain entities now exist alongside the ORM models:
 - `domain/test_entity.py` — `TestEntity` dataclass with full state machine (`can_transition()`, `transition_to()`, `start_execution()`, `submit_red_evidence()`, `submit_blue_evidence()`, `validate()`, `reopen()`), dual validation, pause/resume timers, and domain events. Comprehensive unit tests (46 tests).
 - `domain/entities/technique.py` — `TechniqueEntity` with `recalculate_status()`, `mark_reviewed()`, `flag_for_review()`, `create()`, `from_orm()`/`apply_to()`. Comprehensive unit tests (16 tests).
 - `domain/value_objects/mitre_id.py` — Immutable value object with ATT&CK ID validation.
 - `domain/value_objects/scoring_weights.py` — Immutable weight set enforcing sum-to-100.
 **ORM models remain anemic** (by design — they are persistence mapping only). Business logic lives in domain entities, bridged via `from_orm()`/`apply_to()`.
 **Remaining:** Campaign, ComplianceFramework, and other entities still lack domain entity counterparts.
 ---
 ### MEDIUM PRIORITY
 #### TD-006: Inconsistent Error Response Format
 **Current state:** API error responses use three different formats:
 | Format | Used In |
 |--------|---------|
 | `detail: "string"` | Most routers (`techniques.py`, `users.py`, `evidence.py`) |
 | `detail: {message, code, ...}` | `tests.py`, `test_workflow_service.py` |
 | `detail: "Validation error", code: "VALIDATION_ERROR", errors: [...]` | Global handler in `main.py` |
 **Impact:** Frontend must handle multiple error shapes. No reliable error code for programmatic handling.
 **Remediation:** Standardize all errors to `{detail: string, code: string, errors?: [...]}`.
 ---
 #### TD-007: Silently Swallowed Exceptions in Workflow Service
 **Current state:** `test_workflow_service.py` has 4 bare `except Exception: pass` blocks:
 | Line | What is swallowed |
 |------|-------------------|
 | 106 | `notify_test_state_change()` failure |
 | 286 | Notification failure |
 | 295 | Notification failure |
 | 299 | Score cache invalidation failure |
 **Impact:** Notification failures and cache invalidation errors go completely unnoticed. Users may miss critical workflow notifications with no trace in logs.
 **Remediation:** Replace `pass` with `logger.warning(...)` at minimum. Consider async event dispatch so failures don't block the main flow.
 ---
 #### TD-008: Test Suite Gaps
 **Current state:** ~167 test functions across 18 files, but coverage is uneven:
 | Category | Covered | Not Covered |
 |----------|---------|-------------|
 | **Routers** | auth, techniques, tests, evidence, test_templates, metrics, system | audit, campaigns, compliance, d3fend, detection_rules, heatmap, operational_metrics, scores, snapshots, threat_actors, users |
 | **Services** | workflow, status, atomic_import, campaign, scoring, notifications | audit, caldera, compliance_import, d3fend, elastic, intel, lolbas, mitre_sync, score_cache, sigma, threat_actor_import |
 4 integration tests are `pytest.skip`ped by default (Sigma, LOLBAS, CALDERA, Elastic full imports).
 Some tests use `inspect.getsource()` to verify code structure rather than actually calling endpoints.
 **Impact:** Regressions in untested routers/services go undetected. No security-focused tests (injection, rate limiting, CSRF).
 **Remediation:** Add integration tests for all routers. Add dedicated security test suite. Run skipped integration tests in CI.
 ---
 #### TD-009: No CI/CD Pipeline ✅ RESOLVED
 **Current state (updated Feb 18):** A fully functional CI pipeline exists at `.github/workflows/ci.yml`:
 - Runs `ruff` linting on every push/PR.
 - Runs `pytest` against a real PostgreSQL + Redis service container.
 - Tests run against the same stack as production (not SQLite).
 Additionally, `scripts/agent_validate_backend.sh` provides a local validation script that runs lint + tests inside the Docker container.
 **No further action needed** for basic CI. Potential enhancements: add `mypy` type checking, Docker build verification.
 ---
 #### TD-010: Unstructured Logging
 **Current state:** Logging uses plain format strings with no structured fields:
 ```python
 logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s  %(levelname)-8s  %(name)s — %(message)s",
 )
 ```
 Global exception handlers use `logging.error(f"...")` instead of a logger instance. No request ID, user ID, or correlation ID in log output.
 **Impact:** Cannot query logs for "all actions by user X" or "all errors in request Y". Log analysis in production requires manual grep.
 **Remediation:** Add structured JSON logging (e.g., `structlog` or `python-json-logger`). Include request_id middleware.
 ---
 ### LOW PRIORITY
 #### TD-011: Entrypoint Scripts Have No Retry Logic
 **Current state:** Both `entrypoint.sh` and `entrypoint.prod.sh` use `set -e`. If `alembic upgrade head` or `python -m app.seed` fails, Uvicorn never starts. No retry, no clear error message.
 **Impact:** Transient DB connection failures during container startup cause the backend to fail permanently until manually restarted (Docker `restart: always` will retry, but seed may fail repeatedly).
 **Remediation:** Add retry loop for migration with backoff. Make seed idempotent and non-fatal.
 ---
 #### TD-012: No Database Migration Tests
 **Current state:** Alembic migrations (18 versions) are never tested in isolation. The test suite uses in-memory SQLite with tables created from models, bypassing Alembic entirely.
 **Impact:** Migration scripts may fail on real PostgreSQL (different dialect, JSONB handling) despite tests passing on SQLite.
 **Remediation:** Add a CI step that runs `alembic upgrade head` against a real PostgreSQL container.
 ---
 ## 2. Scalability Risks
 ### HIGH PRIORITY
 #### SR-001: N+1 Query Explosion in Scoring Engine ✅ PARTIALLY RESOLVED
 **Current state (updated Feb 18):** The worst N+1 patterns have been addressed:
 - `scoring_service.py` — `bulk_technique_scores()` performs 5 aggregated subqueries to fetch all scoring data in bulk, reducing organization-wide scoring from ~3,500 queries to ~5.
 - `SATechniqueRepository.find_all_with_test_counts()` — Single query with subqueries for test counts, validated test counts, and detection rule counts.
 - Heatmap service uses batch-fetching techniques.
 **Remaining:** Individual technique scoring (`calculate_technique_score()`) still performs per-technique queries when called in isolation. `create_snapshot()` could benefit from using the bulk method.
 **Impact:** Organization score calculation reduced from seconds to sub-second. Individual technique scoring unchanged.
 ---
 #### SR-002: In-Memory Cache Does Not Scale
 **Current state:** `score_cache.py` uses a Python dict with 300-second TTL. Each worker process has its own cache.
 **Impact:** With N workers, each has a cold cache on startup and after every TTL expiration. Cache miss triggers the full org score calculation (3,500+ queries). Effectively no caching under multiple workers.
 **Remediation:** Move cache to Redis. Invalidate granularly when tests or techniques change.
 ---
 #### SR-003: Heatmap Endpoints Load All Techniques Without Pagination ✅ RESOLVED
 **Current state (updated Feb 18):** The heatmap service has been extracted and optimized:
 - `services/heatmap_service.py` — Dedicated service with batch-fetching techniques (pre-aggregated `test_counts`, `rule_counts` in 2 SQL subqueries instead of N+1).
 - The `SATechniqueRepository.find_all_with_test_counts()` method provides a single-query alternative for scoring/heatmap use cases.
 - Router reduced from ~528 lines to a thin delegation layer.
 **No further action needed** for query performance. The repository method can replace direct usage in remaining endpoints.
 ---
 ### MEDIUM PRIORITY
 #### SR-004: Reports Load Full Tables Into Memory
 **Current state:** All 4 report endpoints load unbounded result sets:
 | Endpoint | Pattern |
 |----------|---------|
 | `coverage-summary` | All techniques + per-technique test count query (N+1) |
 | `coverage-csv` | Same as above + CSV serialization in memory |
 | `test-results` | All tests, aggregated in Python |
 | `remediation-status` | All tests, filtered in Python |
 **Impact:** For datasets with thousands of tests, memory usage spikes. No streaming — entire response built in memory before sending.
 **Remediation:** Use SQL aggregations. Stream CSV output. Add date range filters as required parameters.
 ---
 #### SR-005: Operational Metrics N+1 on Audit Logs
 **Current state:** MTTD and MTTR calculations in `operational_metrics_service.py` load all validated tests, then query `AuditLog` twice per test to find state transition timestamps.
 **Impact:** For 500 validated tests: 1,000 audit log queries. Grows linearly with test count.
 **Remediation:** Denormalize key timestamps onto the Test model (e.g., `red_started_at`, `blue_started_at`, `remediation_completed_at`) or use a single batch audit log query with window functions.
 ---
 #### SR-006: Missing Database Indexes ✅ RESOLVED
 **Current state (updated Feb 18):** All critical indexes are now in place:
 | Table | Index | Status |
 |-------|-------|--------|
 | `tests` | `(technique_id, state)` | ✅ Exists (model `__table_args__`) |
 | `tests` | `(created_at)`, `(state, created_at)` | ✅ Added in migration `b024` |
 | `techniques` | `(tactic)` | ✅ Added in migration `b026` |
 | `techniques` | `(status_global)` | ✅ Added in migration `b026` |
 | `audit_logs` | `(entity_type, entity_id)`, `(timestamp)`, `(entity_type, entity_id, action)` | ✅ Exists (model `__table_args__`) |
 | `detection_rules` | `(mitre_technique_id)`, `(source)`, `(severity)` | ✅ Exists (model `__table_args__`) |
 **No further action needed.**
 ---
 ### LOW PRIORITY
 #### SR-007: Single-Instance Scheduler Constraint
 **Current state:** APScheduler runs in-process. If multiple backend instances exist, each runs its own scheduler — causing duplicate MITRE syncs, duplicate snapshots, duplicate campaign spawns.
 **Impact:** No impact today (single instance), but blocks horizontal scaling.
 **Remediation:** Use APScheduler PostgreSQL JobStore for distributed locking. Or migrate to Celery Beat.
 ---
 #### SR-008: Evidence Presigned URLs Point to Internal Hostname
 **Current state:** MinIO presigned URLs contain `minio:9000` (Docker internal hostname), which is not resolvable from the user's browser.
 **Impact:** Evidence download links fail in production unless Nginx proxies MinIO or MinIO has a public endpoint.
 **Remediation:** Configure `MINIO_EXTERNAL_ENDPOINT` env var. Use it when generating presigned URLs.
 ---
 ## 3. Security Risks
 ### HIGH PRIORITY
 #### SEC-001: In-Memory Token Blacklist ✅ RESOLVED
 **Current state (updated Feb 18):** The token blacklist is now Redis-backed:
 - `infrastructure/redis_client.py` — Singleton Redis connection.
 - `auth.py` — `blacklist_token()` and `is_token_blacklisted()` use Redis with TTL matching token expiration.
 - Shared across all workers. Survives server restarts.
 **No further action needed.**
 ---
 #### SEC-002: Default Credentials in Configuration ✅ RESOLVED
 **Current state (updated Feb 18):** Production startup validation now rejects default credentials:
 - `config.py` — `SECRET_KEY` validation already existed; now also checks `MINIO_ACCESS_KEY` and `MINIO_SECRET_KEY` against their defaults (`minioadmin`). Fails fast with `RuntimeError` in production mode.
 - The `install.sh` script generates random passwords for production.
 **No further action needed.**
 ---
 #### SEC-003: Rate Limiting Only on Login
 **Current state:** SlowAPI rate limiting is applied only to `POST /auth/login` (5/minute). All other endpoints have no rate limits:
 | Unprotected Endpoint | Risk |
 |---------------------|------|
 | `POST /users` | Bulk user creation |
 | `POST /tests` | Resource exhaustion |
 | `POST /system/sync-mitre` | Repeated expensive syncs |
 | `POST /system/import-atomic-tests` | Repeated 40MB ZIP downloads |
 | `POST /tests/{id}/evidence` | Large file upload flooding |
 | `GET /reports/*` | Expensive report generation DoS |
 **Impact:** An authenticated attacker (or compromised account) can DoS the system by triggering expensive operations repeatedly.
 **Remediation:** Add tiered rate limits: strict on auth, moderate on write endpoints, relaxed on read endpoints. Add specific limits on sync/import endpoints (1/hour).
 ---
 ### MEDIUM PRIORITY
 #### SEC-004: No Input Validation on Username ✅ RESOLVED
 **Current state (updated Feb 18):** Username validation is now enforced:
 - `schemas/user.py` — `_validate_username()` function checks: 3-50 characters, only letters/digits/underscores/hyphens, rejects reserved names (`admin`, `root`, `system`, `api`, `null`, `undefined`, `administrator`, `superuser`, `aegis`).
 - Applied via `@field_validator("username")` on `UserCreate`.
 - 9 unit tests covering valid, invalid, and reserved usernames.
 **No further action needed.**
 ---
 #### SEC-005: Timing-Based User Enumeration on Login ✅ RESOLVED
 **Current state (updated Feb 18):** Login now uses constant-time comparison:
 - `routers/auth.py` — Always runs `verify_password()` against a dummy bcrypt hash when user is not found, ensuring consistent response time regardless of whether the username exists.
 **No further action needed.**
 ---
 #### SEC-006: Pydantic Validation Errors Leak Schema Details
 **Current state:** The global validation error handler returns full Pydantic error details:
 ```python
 content={
    "detail": "Validation error",
    "code": "VALIDATION_ERROR",
    "errors": exc.errors(),  # Full field paths, types, constraints
 }
 ```
 **Impact:** Attackers can probe endpoints to discover internal field names, types, and validation rules.
 **Remediation:** Sanitize error output in production. Return field names and human-readable messages only, strip internal type information.
 ---
 #### SEC-007: No Password Complexity Requirements ✅ RESOLVED
 **Current state (updated Feb 18):** Password complexity is enforced:
 - `schemas/user.py` — `_validate_password_strength()` requires: minimum 12 characters, at least one uppercase, one lowercase, one digit, one special character.
 - Applied on `UserCreate`, `UserUpdate`, and `PasswordChange` schemas.
 - 6 unit tests covering all complexity rules.
 **No further action needed.**
 ---
 ### LOW PRIORITY
 #### SEC-008: CORS Origins Not Validated in Production
 **Current state:** `CORS_ORIGINS` is a comma-separated string from environment. If set to `*` or overly broad patterns, credentials (HttpOnly cookies) are sent to unintended origins.
 **Impact:** Low (requires misconfiguration), but could enable cross-origin attacks.
 **Remediation:** Validate `CORS_ORIGINS` at startup — reject `*` when `AEGIS_ENV=production`.
 ---
 #### SEC-009: No Audit Log for Failed Login Attempts
 **Current state:** Successful logins are not audited. Failed logins are not audited. Only post-login actions are recorded.
 **Impact:** Cannot detect brute force attacks or compromised account usage patterns.
 **Remediation:** Log all login attempts (success/failure) to audit_logs with IP address and timestamp.
 ---
 ## 4. Maintainability Risks
 ### HIGH PRIORITY
 #### MR-001: No Dependency Inversion — Everything Points to Concrete Implementations ✅ PARTIALLY RESOLVED
 **Current state (updated Feb 18):** Protocol interfaces and dependency injection now exist for core entities:
 - `domain/ports/repositories/technique_repository.py` — `@runtime_checkable` Protocol.
 - `domain/ports/repositories/test_repository.py` — `@runtime_checkable` Protocol.
 - `dependencies/repositories.py` — FastAPI `Depends()` wiring for `SATechniqueRepository` and `SATestRepository`.
 - `domain/unit_of_work.py` — `UnitOfWork` context manager for transaction control.
 **Remaining:** Services like `notification_service`, `audit_service`, `scoring_service` still use direct imports. Additional ports needed for storage, notifications, and event bus.
 **Impact:** New code follows DIP. Old code will be migrated incrementally.
 ---
 #### MR-002: Two Coexisting Architectural Patterns
 **Current state:** Some routers delegate to services, others do everything inline. A developer cannot predict where to find or place logic.
 | Pattern | Routers |
 |---------|---------|
 | Delegates to services | tests, scores, notifications, campaigns, snapshots |
 | Direct DB queries | techniques, evidence, users, audit, reports, heatmap, metrics, detection_rules, threat_actors, data_sources, compliance |
 **Impact:** Inconsistent codebase. New developers learn one pattern and find the other. Code reviews cannot enforce a single standard.
 **Remediation:** Establish a single pattern (all through services/use cases) and migrate incrementally.
 ---
 ### MEDIUM PRIORITY
 #### MR-003: No Type Checking Enforcement
 **Current state:** `tsconfig.json` has `strict: true` for the frontend, but the backend has no `mypy` configuration. Python type hints exist but are never verified.
 **Impact:** Type errors in Python code go undetected until runtime. Particularly risky for Optional fields and JSONB data.
 **Remediation:** Add `mypy` to requirements. Create `mypy.ini` with strict settings. Add to CI pipeline.
 ---
 #### MR-004: Test Infrastructure Uses SQLite Instead of PostgreSQL
 **Current state:** `conftest.py` creates an in-memory SQLite database and patches PostgreSQL-specific types (UUID → String, JSONB → JSON):
 ```python
 from sqlalchemy.dialects.postgresql import UUID as PG_UUID, JSONB
 # Patch to use SQLite-compatible types
 sqlalchemy.dialects.postgresql.UUID = _patched_uuid
 sqlalchemy.dialects.postgresql.JSONB = _patched_jsonb
 ```
 **Impact:** Tests pass on SQLite but may fail on real PostgreSQL. JSONB-specific queries (containment `@>`, GIN indexes) are untestable. UUID behavior differs between dialects.
 **Remediation:** Use `testcontainers-python` to spin up a real PostgreSQL container for tests, or use PostgreSQL in CI.
 ---
 #### MR-005: Frontend Types Not Generated from Backend Schemas
 **Current state:** `types/models.ts` is manually maintained and must stay in sync with `schemas/*.py`. There is no code generation or validation step.
 **Impact:** Type drift between frontend and backend. A backend schema change that isn't reflected in `types/models.ts` causes runtime errors in the frontend.
 **Remediation:** Generate TypeScript types from OpenAPI spec (`openapi-typescript` or similar). Run as a pre-build step.
 ---
 ### LOW PRIORITY
 #### MR-006: Documentation Scattered Across Multiple Formats
 **Current state:** Documentation exists in `README.md`, `docs/API.md`, `docs/ARCHITECTURE.md`, `docs/DATA_SOURCES.md`, `docs/SCORING.md`, plus the new analysis documents. No central index or documentation site.
 **Impact:** New developers must discover docs by browsing. No searchable documentation.
 **Remediation:** Create a `docs/INDEX.md` linking all documents. Consider MkDocs or similar for a browsable doc site.
 ---
 #### MR-007: No Conventional Commit or Changelog
 **Current state:** No commit message convention enforced. No CHANGELOG file.
 **Impact:** Difficult to understand what changed between releases. No automated release notes.
 **Remediation:** Adopt Conventional Commits. Add commitlint as a pre-commit hook. Generate CHANGELOG automatically.
 ---
 ## 5. Recommended Medium-Term Improvements
 ### Architecture
 | ID | Improvement | Effort | Impact | Status |
 |----|-------------|--------|--------|--------|
 | IMP-001 | Extract domain exceptions + error handler middleware | 2-3 days | Removes FastAPI dependency from services | ✅ Done |
 | IMP-002 | Create repository layer for Test, Technique, Campaign | 1 week | Centralizes queries, enables caching and mocking | ✅ Done (Test, Technique) |
 | IMP-003 | Extract heatmap/reports/metrics logic to application services | 1-2 weeks | Thin controllers, testable business logic | ✅ Heatmap done |
 | IMP-004 | Persist scoring weights in database | 2-3 days | Eliminates mutable global state | Pending |
 | IMP-005 | Add domain entities with behavior (rich models) | 2-3 weeks | Consolidates scattered business rules | ✅ Done (Test, Technique) |
 ### Scalability
 | ID | Improvement | Effort | Impact | Status |
 |----|-------------|--------|--------|--------|
 | IMP-006 | Batch scoring queries (single SQL per metric) | 1 week | Reduces org score from 3,500 queries to ~10 | ✅ Done |
 | IMP-007 | Add missing composite indexes | 1 day | Immediate query performance improvement | ✅ Done |
 | IMP-008 | Move score cache to Redis | 2-3 days | Shared cache across workers | Pending |
 | IMP-009 | Batch heatmap metadata queries | 2-3 days | Reduces heatmap from 1,400 to 3 queries | ✅ Done |
 | IMP-010 | Denormalize MTTD/MTTR timestamps onto Test model | 3-5 days | Eliminates operational metrics N+1 | Pending |
 ### Security
 | ID | Improvement | Effort | Impact | Status |
 |----|-------------|--------|--------|--------|
 | IMP-011 | Move token blacklist to Redis | 1-2 days | Fixes multi-instance logout | ✅ Done |
 | IMP-012 | Reject default credentials in production | 0.5 days | Prevents insecure deployments | ✅ Done |
 | IMP-013 | Add rate limiting to write/sync endpoints | 1 day | Prevents DoS from authenticated users | Pending |
 | IMP-014 | Add password complexity validation | 0.5 days | Prevents weak passwords | ✅ Done |
 | IMP-015 | Add login attempt auditing | 1 day | Enables brute force detection | Pending |
 ### DevOps
 | ID | Improvement | Effort | Impact | Status |
 |----|-------------|--------|--------|--------|
 | IMP-016 | Create GitHub Actions CI pipeline | 1-2 days | Automated lint + type check + test | ✅ Done |
 | IMP-017 | Add mypy strict type checking | 1-2 days | Catches type errors before runtime | Pending |
 | IMP-018 | Replace SQLite test DB with PostgreSQL (testcontainers) | 1 day | Tests match production behavior | ✅ CI uses PG |
 | IMP-019 | Generate frontend types from OpenAPI | 0.5 days | Eliminates frontend/backend type drift | Pending |
 | IMP-020 | Add structured JSON logging | 1-2 days | Production-ready observability | Pending |
 ---
 ## 6. Priority Matrix
 ### Immediate (Sprint 1 — Week 1-2)
 | ID | Item | Category | Effort |
 |----|------|----------|--------|
 | SEC-001 | Move token blacklist to Redis | Security | 1-2 days |
 | SEC-002 | Reject default credentials in production | Security | 0.5 days |
 | SR-006 | Add missing database indexes | Scalability | 1 day |
 | TD-007 | Replace `except: pass` with logging in workflow service | Tech Debt | 0.5 days |
 | SEC-007 | Add password complexity requirements | Security | 0.5 days |
 | IMP-016 | Create basic CI pipeline | DevOps | 1-2 days |
 **Total estimated effort: ~5-7 days**
 ---
 ### Short-Term (Sprint 2-3 — Week 3-6)
 | ID | Item | Category | Effort |
 |----|------|----------|--------|
 | TD-003 | Extract domain exceptions, remove HTTPException from services | Tech Debt | 2-3 days |
 | SR-001 | Batch scoring queries to eliminate N+1 | Scalability | 1 week |
 | SR-003 | Batch heatmap metadata queries | Scalability | 2-3 days |
 | TD-002 | Create repository layer for core entities | Tech Debt | 1 week |
 | SEC-003 | Add rate limiting to write/sync endpoints | Security | 1 day |
 | TD-004 | Persist scoring weights in database | Tech Debt | 2-3 days |
 | SEC-009 | Add login attempt auditing | Security | 1 day |
 | IMP-008 | Move score cache to Redis | Scalability | 2-3 days |
 **Total estimated effort: ~3-4 weeks**
 ---
 ### Medium-Term (Month 2-3)
 | ID | Item | Category | Effort |
 |----|------|----------|--------|
 | TD-001 | Extract heatmap/reports/metrics to application services | Tech Debt | 2-3 weeks |
 | TD-008 | Expand test coverage to all routers and services | Maintainability | 2-3 weeks |
 | TD-005 | Create rich domain entities (Clean Architecture Phase 2) | Tech Debt | 2-3 weeks |
 | SR-004 | Optimize report endpoints (SQL aggregations, streaming) | Scalability | 1 week |
 | SR-005 | Denormalize MTTD/MTTR timestamps | Scalability | 3-5 days |
 | MR-003 | Add mypy type checking | Maintainability | 1-2 days |
 | MR-004 | Replace SQLite tests with PostgreSQL | Maintainability | 1 day |
 | MR-005 | Generate frontend types from OpenAPI | Maintainability | 0.5 days |
 | IMP-020 | Structured JSON logging | DevOps | 1-2 days |
 **Total estimated effort: ~6-8 weeks**
 ---
 ### Low Priority (Backlog)
 | ID | Item | Category | Effort |
 |----|------|----------|--------|
 | TD-011 | Add retry logic to entrypoint scripts | Tech Debt | 0.5 days |
 | TD-012 | Add migration tests against real PostgreSQL | Maintainability | 1 day |
 | SR-007 | APScheduler PostgreSQL JobStore for horizontal scaling | Scalability | 2-3 days |
 | SR-008 | Fix MinIO presigned URL hostname for production | Scalability | 1 day |
 | SEC-008 | Validate CORS origins in production | Security | 0.5 days |
 | SEC-005 | Constant-time login to prevent user enumeration | Security | 0.5 days |
 | SEC-006 | Sanitize Pydantic validation errors in production | Security | 1 day |
 | MR-006 | Create documentation index / MkDocs site | Maintainability | 1-2 days |
 | MR-007 | Adopt Conventional Commits + CHANGELOG | Maintainability | 1 day |
 | SEC-004 | Username input validation | Security | 0.5 days |
 ---
 ## Summary Scorecard
 ```
 ┌─────────────────────────────────────────────────────────────────────────────┐
 │              Category          │ High │ Med │ Low │ Total│ Resolved │ Open │
 │────────────────────────────────┼──────┼─────┼─────┼──────┼──────────┼──────│
 │  Technical Debt                │   5  │  4  │  2  │  11  │    4     │   7  │
 │  Scalability Risks             │   3  │  3  │  2  │   8  │    3     │   5  │
 │  Security Risks                │   3  │  4  │  2  │   9  │    5     │   4  │
 │  Maintainability Risks         │   2  │  3  │  2  │   7  │    1     │   6  │
 │────────────────────────────────┼──────┼─────┼─────┼──────┼──────────┼──────│
 │  TOTAL                         │  13  │ 14  │  8  │  35  │   13     │  22  │
 └─────────────────────────────────────────────────────────────────────────────┘
 Resolved: 13 of 35 items (37%)
 Remaining estimated effort: ~7-9 weeks (down from ~12-15)
 ```