Aegis — Backend Internal Dependency Analysis
Author: Architecture review
Date: February 11, 2026 (updated February 18, 2026)
Scope: All 21 routers and 20 services in backend/app/
Note: This analysis describes the original state. Since then, a Clean Architecture refactor has begun. See ARCHITECTURAL_ANALYSIS.md for current status. Key changes: domain exceptions replace HTTPException in services, repository ports and implementations exist for Test and Technique, domain entities with business logic exist for Test and Technique, Unit of Work pattern is available, CI pipeline is active.
Table of Contents
- Do Routers Import SQLAlchemy Models Directly?
- Do Services Access the Database Directly?
- Do Services Contain Business Logic or Just CRUD?
- Is Business Logic Separated from Persistence?
- Is Infrastructure Decoupled from Logic?
- What Architecture Is Actually Implemented?
1. Do Routers Import SQLAlchemy Models Directly?
Yes. Every single router imports at least one SQLAlchemy model. 19 of 21 routers execute raw database operations inline.
Complete Router-to-Model Import Map
| Router | Models Imported Directly | DB Operations in Router |
|---|---|---|
| audit.py | AuditLog, User | 3 |
| auth.py | User | 1 |
| campaigns.py | User, Campaign, CampaignTest, Test, Technique, ThreatActor | 36 |
| compliance.py | User, ComplianceFramework, ComplianceControl, ComplianceControlMapping, Technique, TestTemplate, ThreatActorTechnique | 13 |
| d3fend.py | User, Technique, DefensiveTechnique, DefensiveTechniqueMapping | 3 |
| data_sources.py | User, DataSource | 14 |
| detection_rules.py | User, DetectionRule, TestTemplate, TestTemplateDetectionRule, TestDetectionResult | 21 |
| evidence.py | Evidence, Test, User, enums | 11 |
| heatmap.py | User, Technique, Test, ThreatActor, ThreatActorTechnique, DetectionRule, Campaign, CampaignTest, DefensiveTechniqueMapping, enums | 13 |
| metrics.py | Technique, Test, User, enums | 12 |
| notifications.py | Notification, User | 2 |
| operational_metrics.py | User | 0 (delegates) |
| reports.py | Technique, Test, User, enums | 6 |
| scores.py | User, Technique, ThreatActor | 2 |
| snapshots.py | User, CoverageSnapshot, SnapshotTechniqueState | 6 |
| system.py | User | 0 (delegates) |
| techniques.py | Technique, User, enums | 12 |
| test_templates.py | TestTemplate, User | 20 |
| tests.py | AuditLog, Technique, Test, TestTemplate, User, enums | 30 |
| threat_actors.py | User, ThreatActor, ThreatActorTechnique, Technique, Test, TestTemplate, enums | 11 |
| users.py | User | 9 |
Key Numbers
- 21 / 21 routers import at least one SQLAlchemy model.
- 19 / 21 routers execute db.query(), db.add(), db.commit(), or db.delete() directly (only operational_metrics.py and system.py fully delegate).
- Total DB operations across all routers: 225 (db.query, db.add, db.commit, db.delete, db.refresh calls).
- All 21 routers import Session from SQLAlchemy.
- 7 routers import func (aggregations).
- 7 routers import joinedload (eager loading).
- 2 routers import or_ (compound filters).
What This Means
Routers are tightly coupled to the ORM. They know:
- Table structure (column names, relationships)
- Query syntax (filter, join, group_by, order_by)
- Transaction management (commit, refresh, add)
- Eager loading strategy (joinedload, selectinload)
There is no abstraction layer between routers and the database. Changing a column name on the Technique model would require modifying at least 8 routers.
2. Do Services Access the Database Directly?
Yes. All 19 services that handle data (all except score_cache.py) receive a SQLAlchemy Session as a parameter and execute queries directly.
Complete Service-to-Database Map
| Service | Models Used | DB Operations | Receives Session | Imports app.database |
|---|---|---|---|---|
| atomic_import_service | TestTemplate | 3 | Yes | No |
| audit_service | AuditLog | 2 | Yes | No |
| caldera_import_service | TestTemplate, DataSource | 5 | Yes | No |
| campaign_scheduler_service | Campaign, CampaignTest, Test, User | 8 | Yes | No |
| campaign_service | Campaign, CampaignTest, Test, TestTemplate, Technique, ThreatActor, ThreatActorTechnique, User | 10 | Yes | No |
| compliance_import_service | ComplianceFramework, ComplianceControl, ComplianceControlMapping, Technique | 22 | Yes | No |
| d3fend_import_service | Technique, DefensiveTechnique, DefensiveTechniqueMapping | 13 | Yes | No |
| elastic_import_service | DetectionRule, DataSource | 5 | Yes | No |
| intel_service | IntelItem, Technique | 4 | Yes | No |
| lolbas_import_service | TestTemplate, DataSource | 7 | Yes | No |
| mitre_sync_service | Technique, enums | 3 | Yes | No |
| notification_service | Notification, User | 12 | Yes | No |
| operational_metrics_service | Test, Technique, TestDetectionResult, AuditLog, enums | 21 | Yes | No |
| score_cache | None | 0 | No | No |
| scoring_service | Technique, Test, DetectionRule, TestDetectionResult, DefensiveTechniqueMapping, ThreatActor, ThreatActorTechnique | 17 | Yes | No |
| sigma_import_service | DetectionRule, DataSource | 5 | Yes | No |
| snapshot_service | Technique, CoverageSnapshot, SnapshotTechniqueState | 13 | Yes | No |
| status_service | Technique, enums | 1 | Yes | No |
| test_workflow_service | Test, User, enums | 13 | Yes | No |
| threat_actor_import_service | ThreatActor, ThreatActorTechnique, Technique, DataSource | 8 | Yes | No |
Key Numbers
- Total DB operations across all services: 172 (db.query, db.add, db.commit, etc.).
- 19 / 20 services receive Session as a function parameter.
- 0 / 20 services import app.database directly; sessions are always injected by callers (routers or background jobs).
- All 19 data-handling services import SQLAlchemy symbols (Session, func, case, etc.).
Positive Pattern: Session Injection
Services do follow one good practice: none of them create their own database sessions. Sessions are always passed in as arguments:
# All services use this pattern:
def calculate_technique_score(technique: Technique, db: Session) -> dict:
    all_tests = db.query(Test).filter(Test.technique_id == technique.id).all()
    ...
This makes the services testable: a test can pass in a mock or a dedicated test session. However, the services still know the full ORM API. They construct queries, call commit(), and manage eager loading.
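As a rough illustration of what session injection buys, the sketch below replaces the session with a standard-library mock. The function count_tests and the stand-in class FakeTest are hypothetical and exist only for this example; they are not taken from the codebase.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


class FakeTest:
    """Stand-in for the ORM model; only the attribute used in the filter."""
    technique_id = None


def count_tests(technique, db) -> int:
    """Hypothetical service function following the injected-session pattern."""
    all_tests = db.query(FakeTest).filter(FakeTest.technique_id == technique.id).all()
    return len(all_tests)


# A MagicMock stands in for the Session. Each chained call returns another
# mock, and the final .all() is configured to return canned rows.
db = MagicMock()
db.query.return_value.filter.return_value.all.return_value = [
    SimpleNamespace(id=1),
    SimpleNamespace(id=2),
]

technique = SimpleNamespace(id=42)
result = count_tests(technique, db)
```

Note that the mock has to mirror the exact query chain (query, filter, all). That is a direct consequence of the service knowing the ORM API: any change to how the query is built also breaks the test setup.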
3. Do Services Contain Business Logic or Just CRUD?
Mixed. Services fall into three distinct categories.
Category A: Rich Business Logic (5 services)
These services contain genuine domain logic — rules, calculations, state machines, and business decisions:
| Service | Logic Type | Complexity |
|---|---|---|
| test_workflow_service | State machine with valid transition map, role-based guards, multi-step validation, retest chain management | High: 456 lines, 10+ public functions, embeds the test lifecycle rules |
| scoring_service | Multi-dimensional scoring algorithm with configurable weights, breakdown calculations, decay functions | High: 468 lines, complex math combining 5 weighted factors |
| campaign_service | Circular dependency detection, campaign progress calculation, auto-generation from threat actors | Medium: business rules for campaign management |
| campaign_scheduler_service | Recurring campaign scheduling, next-run calculation, campaign cloning | Medium: temporal business logic |
| operational_metrics_service | MTTD/MTTR calculation, detection efficacy, trend analysis with time windows | Medium: analytical business logic |
Category B: External Data Import (8 services)
These services handle fetching, parsing, and upserting data from external sources. They are more "integration logic" than "business logic":
| Service | External Source | Logic |
|---|---|---|
| mitre_sync_service | MITRE TAXII + GitHub | STIX 2.0 parsing, technique upsert |
| atomic_import_service | GitHub (ZIP) | YAML parsing, template creation |
| sigma_import_service | GitHub (ZIP) | YAML + ATT&CK tag extraction |
| elastic_import_service | GitHub (ZIP) | TOML parsing, rule creation |
| caldera_import_service | GitHub (ZIP) | YAML parsing, ability import |
| d3fend_import_service | D3FEND REST API | JSON parsing, mapping creation |
| lolbas_import_service | GitHub (ZIP) | YAML/Markdown parsing |
| threat_actor_import_service | GitHub (ZIP) | STIX 2.0 bundle parsing |
Category C: Thin CRUD Wrappers (7 services)
These services are essentially database operations with minimal logic:
| Service | What It Does | Lines of Logic |
|---|---|---|
| audit_service | log_action() creates an AuditLog row | ~10 lines |
| notification_service | CRUD for notifications + notify_test_state_change() | ~30 lines of logic, rest is DB access |
| status_service | recalculate_technique_status() counts tests by state, sets status | ~20 lines |
| snapshot_service | Creates snapshots by looping over techniques and calling scoring_service | Orchestration + DB writes |
| score_cache | In-memory dict with TTL | ~30 lines, pure caching |
| compliance_import_service | Parses NIST/CIS data and creates DB rows | Parsing + bulk insert |
| intel_service | Fetches RSS/feeds and creates IntelItem rows | Fetch + parse + insert |
The Missing Logic
Significant business logic that should be in services but lives in routers instead:
| Logic | Current Location | Should Be |
|---|---|---|
| ATT&CK Navigator layer generation | heatmap.py router (528 lines) | heatmap_service or use case |
| Coverage report building | reports.py router (273 lines) | report_service or use case |
| Coverage metrics aggregation | metrics.py router (316 lines) | metrics_service |
| Detection rule CRUD + auto-association | detection_rules.py router (21 DB ops) | detection_rule_service |
| Technique CRUD + review workflow | techniques.py router (12 DB ops) | technique_service |
| Campaign full lifecycle | campaigns.py router (36 DB ops) | Partially in campaign_service, but router does most CRUD |
4. Is Business Logic Separated from Persistence?
No. There is no separation boundary between business logic and persistence anywhere in the codebase.
The Dependency Graph
┌─────────────────────────────────────────────────────────────────────┐
│ PRESENTATION LAYER (Routers) │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │techniques│ │ heatmap │ │ reports │ │ campaigns│ ... │
│ │ 12 db.q │ │ 13 db.q │ │ 6 db.q │ │ 36 db.q │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
│ │ direct │ direct │ direct │ direct │
│ │ │ │ │ + service │
├───────┼───────────────┼───────────────┼──────────────┼──────────────┤
│ SERVICE LAYER (Partial) │
│ │
│ ┌──────────────┐ ┌───────────┐ ┌──────────────────────────┐ │
│ │test_workflow │ │ scoring │ │ 8 import services │ │
│ │ 13 db.q │ │ 17 db.q │ │ 3-22 db.q each │ │
│ │ HTTPException │ │ settings │ │ + HTTP requests │ │
│ └──────┬───────┘ └─────┬─────┘ └────────────┬─────────────┘ │
│ │ │ │ │
├─────────┼──────────────────┼───────────────────────┼─────────────────┤
│ PERSISTENCE LAYER (SQLAlchemy — no abstraction) │
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ db.query(Model).filter(...).all() ← called from EVERYWHERE │ │
│ │ db.add(instance) │ │
│ │ db.commit() │ │
│ │ db.refresh(instance) │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
│ Total: 225 db operations in routers + 172 in services = 397 total │
│ Spread across: 19 routers + 19 services = 38 files │
└─────────────────────────────────────────────────────────────────────┘
Why There Is No Separation
- No Repository Pattern. There are no repository classes or functions that encapsulate database access. Every file that needs data constructs its own query.
- No Domain Entity Layer. The SQLAlchemy models serve dual duty as both persistence mapping AND domain objects. There is no separate domain entity with business methods; the same Test class that defines the database table is passed around as the business object.
- No Abstraction Boundary. There is no interface (Protocol/ABC) anywhere in the codebase that separates "what data I need" from "how to get it from the database."
- Services Commit Transactions. Some services call db.commit() internally, while their calling routers may also call db.commit(). There is no Unit of Work pattern governing transaction boundaries.
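To make the missing abstraction boundary concrete, here is one possible shape for a repository port, sketched with typing.Protocol and an in-memory adapter. Every name in it (DetectionTest, DetectionTestRepository, InMemoryDetectionTestRepository, count_validated) is hypothetical and chosen for illustration; it is not the design used by the refactor mentioned in the header note.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class DetectionTest:
    """Hypothetical plain domain object, independent of any ORM."""
    id: int
    technique_id: int
    state: str


class DetectionTestRepository(Protocol):
    """Port: states what data is needed, not how it is fetched."""

    def list_for_technique(self, technique_id: int) -> list[DetectionTest]: ...


class InMemoryDetectionTestRepository:
    """Adapter for tests; an ORM-backed adapter could satisfy the same port."""

    def __init__(self, rows: list[DetectionTest]) -> None:
        self._rows = rows

    def list_for_technique(self, technique_id: int) -> list[DetectionTest]:
        return [r for r in self._rows if r.technique_id == technique_id]


def count_validated(repo: DetectionTestRepository, technique_id: int) -> int:
    """Business rule that depends only on the port, never on a Session."""
    tests = repo.list_for_technique(technique_id)
    return sum(1 for t in tests if t.state == "validated")


repo = InMemoryDetectionTestRepository([
    DetectionTest(1, 10, "validated"),
    DetectionTest(2, 10, "draft"),
    DetectionTest(3, 11, "validated"),
])
validated_count = count_validated(repo, technique_id=10)
```

With a port like this, the query lives in exactly one adapter, so a schema change touches that adapter rather than every router and service that needs the data.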
Concrete Example: Scoring a Technique
The scoring_service.calculate_technique_score() function mixes business logic and persistence in every line:
# Business logic (what to calculate) and persistence (how to get data)
# are interleaved — inseparable:
all_tests = db.query(Test).filter(Test.technique_id == technique.id).all() # ← persistence
validated_tests = [t for t in all_tests if t.state == TestState.validated] # ← logic
detected_tests = [t for t in validated_tests if t.detection_result == TestResult.detected] # ← logic
test_ratio = len(detected_tests) / len(validated_tests) # ← logic
test_score = round(test_ratio * w_tests, 1) # ← logic
rule_count = db.query(func.count(DetectionRule.id))...scalar() or 0 # ← persistence
rule_score = min(rule_count / 3.0, 1.0) * w_detection # ← logic
To test the scoring algorithm in isolation (without a database), you would need to refactor every query into a repository that can be mocked.
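A minimal sketch of what the separated calculation might look like once the counts are fetched elsewhere. The formulas follow the excerpt above; the function names, the zero-validated-tests guard, and the example weights are assumptions made for this illustration.

```python
def score_tests(validated: int, detected: int, w_tests: float) -> float:
    """Test component: share of validated tests that were detected, weighted."""
    if validated == 0:
        # Assumption: a technique with no validated tests scores zero here.
        return 0.0
    return round(detected / validated * w_tests, 1)


def score_rules(rule_count: int, w_detection: float) -> float:
    """Rule component: saturates once three rules exist, as in the excerpt."""
    return min(rule_count / 3.0, 1.0) * w_detection


# Example weights are made up for this sketch.
test_score = score_tests(validated=4, detected=3, w_tests=40.0)
rule_score = score_rules(rule_count=6, w_detection=30.0)
```

Because both functions take plain numbers, they can be unit-tested without a session, a mock, or any ORM model.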
5. Is Infrastructure Decoupled from Logic?
No. Infrastructure concerns are embedded directly in both routers and services.
Infrastructure Dependency Map
| Infrastructure | Where It Bleeds Into Logic | Impact |
|---|---|---|
| SQLAlchemy ORM | 19 routers (225 ops) + 19 services (172 ops) = 38 files, 397 operations | Cannot switch ORM or use raw SQL without rewriting 38 files |
| FastAPI HTTPException | test_workflow_service.py, campaign_service.py (2 services) | Business logic raises HTTP-specific exceptions and cannot be reused from CLI, workers, or pure tests |
| MinIO (boto3) | storage.py (well isolated) → called from evidence.py router | Storage itself is clean, but the router handles presigned URL generation |
| APScheduler | mitre_sync_job.py → creates SessionLocal() directly → calls services | Jobs bypass the DI system and create their own sessions |
| app.config.settings | scoring_service.py (reads weights), test_workflow_service.py (reads MAX_RETEST_COUNT), auth.py router (reads SECRET_KEY), scores.py router (mutates weights) | Global mutable singleton accessed from multiple layers |
| External HTTP (requests/httpx) | 8 import services make outbound HTTP calls | Tightly coupled: cannot test import logic without network access or mocking requests |
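The HTTPException row can be illustrated with a small sketch of the alternative: the service raises a domain exception, and only the presentation layer translates it to a status code. The class names, the transition map, and the status mapping below are hypothetical and are not taken from test_workflow_service.py.

```python
class DomainError(Exception):
    """Base class for business-rule failures; carries no HTTP knowledge."""


class InvalidTransitionError(DomainError):
    def __init__(self, current: str, target: str) -> None:
        super().__init__(f"cannot move test from {current!r} to {target!r}")
        self.current = current
        self.target = target


class NotFoundError(DomainError):
    pass


# Hypothetical transition map, for illustration only.
ALLOWED = {"draft": {"in_review"}, "in_review": {"validated", "draft"}}


def transition(current: str, target: str) -> str:
    """Service-side rule: raises a domain error, never an HTTP one."""
    if target not in ALLOWED.get(current, set()):
        raise InvalidTransitionError(current, target)
    return target


# The presentation layer owns the translation to status codes.
STATUS_BY_ERROR = {InvalidTransitionError: 409, NotFoundError: 404}


def http_status_for(exc: DomainError) -> int:
    return STATUS_BY_ERROR.get(type(exc), 400)


try:
    transition("draft", "validated")
except DomainError as exc:
    status = http_status_for(exc)
```

A CLI command or background worker could call transition() and handle DomainError in its own way, which is not possible when the service raises a framework exception directly.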
What Is Well Isolated
| Component | Isolation Quality |
|---|---|
| storage.py (MinIO) | Good: thin wrapper with 3 functions (ensure_bucket_exists, upload_file, get_presigned_url). Only accessed from 1 router. |
| auth.py (JWT/bcrypt) | Good: self-contained module for token creation, verification, and password hashing. |
| dependencies/auth.py | Good: composable FastAPI Depends() chain for auth and RBAC. |
| config.py (Settings) | Partial: Pydantic Settings with env loading is clean, but the object is mutable and accessed as a global singleton. |
What Is Poorly Isolated
| Component | Problem |
|---|---|
| Database session lifecycle | get_db() is a generator injected via Depends() in routers, but services receive raw Session objects. Background jobs create sessions with SessionLocal() directly, bypassing the DI system entirely. |
| External API calls | Import services directly call requests.get() / httpx.get(). No port/adapter pattern — the HTTP client is an implementation detail embedded in business logic. |
| Scoring configuration | settings.SCORING_WEIGHT_* is read from a mutable global object. The scores.py router mutates it at runtime. No database-backed configuration. |
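For the external API row, one possible port/adapter split is sketched below using only the standard library. The Fetcher alias, import_mappings, the payload shape, and the placeholder IDs are all invented for this example and do not describe the real import services or any real feed format.

```python
import json
from typing import Callable
from urllib.request import urlopen

# Port: given a URL, return the response body.
Fetcher = Callable[[str], bytes]


def urllib_fetch(url: str) -> bytes:
    """Production-style adapter built on the standard library."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def import_mappings(url: str, fetch: Fetcher) -> list[tuple[str, str]]:
    """Hypothetical import step: parse (technique, defense) pairs from JSON."""
    payload = json.loads(fetch(url))
    return [(m["technique"], m["defense"]) for m in payload["mappings"]]


def fake_fetch(url: str) -> bytes:
    """Test adapter: returns a canned body, so no network access is needed."""
    return b'{"mappings": [{"technique": "T0000", "defense": "D3-EXAMPLE"}]}'


pairs = import_mappings("https://example.invalid/mappings.json", fake_fetch)
```

The parsing logic sees only the Fetcher signature, so swapping requests for httpx, or for a canned response in a test, does not touch it.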
6. What Architecture Is Actually Implemented?
Classification: Inconsistent Layered Architecture with Partial Service Extraction
The codebase does not follow any named architectural pattern consistently. It is a hybrid of two approaches that were never unified:
Pattern 1: Transaction Script (60% of codebase)
Most routers follow the Transaction Script pattern: each endpoint is a self-contained script that receives a request, queries the database, applies logic, mutates data, and returns a response, all in one function:
HTTP Request → Router Function → [query DB → apply logic → write DB → return response]
Routers using this pattern: techniques, evidence, users, audit, reports, heatmap, metrics, detection_rules, threat_actors, data_sources, compliance, test_templates, d3fend, snapshots (partially)
Pattern 2: Service Layer (40% of codebase)
Some routers delegate complex operations to services:
HTTP Request → Router Function → Service Function → [query DB → apply logic → write DB]
→ return to router → return response
Routers using this pattern: tests (workflow), scores (scoring), notifications, operational_metrics, system (imports), campaigns (partially), snapshots (partially)
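The structural difference between the two patterns can be sketched as follows. A plain dict stands in for the database so the example runs without SQLAlchemy; the function names and the "draft"/"submitted" states are hypothetical.

```python
# In-memory stand-in for the database. All names and states are hypothetical.
ROWS = {1: {"state": "draft"}, 2: {"state": "draft"}}


def script_endpoint(test_id: int) -> dict:
    """Pattern 1: the endpoint itself queries, applies the rule, and writes."""
    row = ROWS[test_id]                  # query DB
    if row["state"] != "draft":          # apply logic
        return {"ok": False, "state": row["state"]}
    row["state"] = "submitted"           # write DB
    return {"ok": True, "state": row["state"]}


def submit(rows: dict, test_id: int) -> bool:
    """Pattern 2, service half: owns the rule and the write."""
    row = rows[test_id]
    if row["state"] != "draft":
        return False
    row["state"] = "submitted"
    return True


def service_endpoint(test_id: int) -> dict:
    """Pattern 2, router half: only translates between request and service."""
    ok = submit(ROWS, test_id)
    return {"ok": ok, "state": ROWS[test_id]["state"]}


first = script_endpoint(1)
second = service_endpoint(2)
repeat = service_endpoint(2)
```

In the second pattern the rule in submit() can be called from a scheduler or another service, while the rule inside script_endpoint() is reachable only through the HTTP handler. In both cases the code still touches storage directly, which matches the observation that services here are not decoupled from persistence.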
The Actual Dependency Direction
┌──────────────────────────────────────────┐
│ EVERYTHING DEPENDS ON │
│ │
│ SQLAlchemy Models (18 concrete classes) │
│ SQLAlchemy Session (passed everywhere) │
│ │
└──────────┬───────────────┬───────────────┘
│ │
┌─────────▼──────┐ ┌─────▼──────────┐
│ Routers │ │ Services │
│ (21 files) │ │ (20 files) │
│ 225 db ops │ │ 172 db ops │
│ import models │ │ import models │
│ import Session│ │ receive Session│
└────────┬───────┘ └────────┬────────┘
│ │
│ cross-reference │
│◄──────────────────►│
│ 13 routers import │
│ services │
│ 10 services import│
│ other services │
└────────────────────┘
The dependency direction is: everything points DOWN to SQLAlchemy. There is no inversion. The models are the center of gravity, not the domain logic.
Comparison with Named Architectures
| Architecture | Aegis Implementation | Verdict |
|---|---|---|
| Clean Architecture | No domain layer, no use cases, no ports/adapters, no dependency inversion | Not implemented |
| Hexagonal Architecture | No ports, no adapters, infrastructure is not pluggable | Not implemented |
| Layered Architecture | Layers exist (routers → services → models) but boundaries are not enforced — routers bypass the service layer freely | Partially implemented, inconsistently |
| Domain-Driven Design | Anemic models, no aggregates, no value objects, no domain events, no bounded contexts | Not implemented |
| Transaction Script | Most endpoints follow this pattern | De facto pattern for ~60% of code |
| Active Record | SQLAlchemy models don't have business methods (they're not Active Record either) | Not implemented |
Summary Classification
┌─────────────────────────────────────────────────────────────────┐
│ │
│ Architecture: Inconsistent Layered Monolith │
│ │
│ Dominant pattern: Transaction Script (routers as scripts) │
│ Secondary pattern: Service Layer (for complex workflows) │
│ │
│ Boundary enforcement: None │
│ Dependency direction: All code → SQLAlchemy (downward) │
│ Abstraction layers: Zero (no interfaces, no repositories) │
│ │
│ Files with direct DB access: 38 out of 41 (93%) │
│ Total scattered DB operations: 397 │
│ │
│ Well-designed components: │
│ - test_workflow_service (state machine) │
│ - scoring_service (algorithm — coupled to DB) │
│ - storage.py (clean MinIO wrapper) │
│ - dependencies/auth.py (composable auth chain) │
│ │
│ Poorly-designed components: │
│ - heatmap.py router (528 lines, 13 DB ops, zero delegation) │
│ - campaigns.py router (36 DB ops, partial delegation) │
│ - detection_rules.py router (21 DB ops, zero delegation) │
│ - test_templates.py router (20 DB ops, zero delegation) │
│ │
└─────────────────────────────────────────────────────────────────┘