fix: production detection only triggers on AEGIS_ENV=production, not SECRET_KEY presence

2026-02-20 17:20:48 +01:00
parent 309b3bc02d
commit abef2a45e0
2 changed files with 283 additions and 3 deletions


@@ -7,9 +7,7 @@ from pydantic_settings import BaseSettings
# ---------------------------------------------------------------------------
# Detect environment: "production" when AEGIS_ENV or common indicators are set
# ---------------------------------------------------------------------------
-_is_production = os.environ.get("AEGIS_ENV", "").lower() == "production" or bool(
-    os.environ.get("SECRET_KEY")  # having an explicit SECRET_KEY hints prod
-)
+_is_production = os.environ.get("AEGIS_ENV", "").lower() == "production"
class Settings(BaseSettings):
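
To make the behavioral change in this hunk concrete, here is a minimal sketch of the old and new checks written as pure functions. The names `is_production_old` and `is_production_new` are hypothetical and do not appear in the codebase; they take the environment as a mapping so they can be exercised without touching `os.environ`.

```python
from typing import Mapping


def is_production_old(env: Mapping[str, str]) -> bool:
    """Pre-commit behavior: AEGIS_ENV=production OR any non-empty SECRET_KEY."""
    return env.get("AEGIS_ENV", "").lower() == "production" or bool(env.get("SECRET_KEY"))


def is_production_new(env: Mapping[str, str]) -> bool:
    """Post-commit behavior: only AEGIS_ENV=production counts."""
    return env.get("AEGIS_ENV", "").lower() == "production"


# A developer machine or CI runner that merely sets SECRET_KEY (illustrative value).
dev_env = {"SECRET_KEY": "local-dev-key"}
old_result = is_production_old(dev_env)  # treated as production before the fix
new_result = is_production_new(dev_env)  # no longer treated as production
```
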

docs/FEATURE_ROADMAP.md Normal file

@@ -0,0 +1,282 @@
# Aegis — Feature Roadmap
> **Status:** Phase 0 (Foundations) completed. Platform ready for new feature development.
> **Architecture:** Clean Modular Monolith · 367+ tests · CI/CD · Zero tech debt
---
## Vision
Aegis evolves in three stages:
1. **Operational Features (Phases 1-7)** — Integrations, reporting, compliance, intelligence
2. **Detection Assurance Platform (Phases 8-13)** — Every detection has lifecycle, ownership, measurable health, and the system proactively orchestrates revalidation
3. **Enterprise Readiness (Phase 14)** — SSO/SAML and API keys for corporate deployment
---
## Dependency Map
```
Phase 0 (DONE)
│
├──► Phase 1 (Jira + Tempo)
│
├──► Phase 2 (Reporting)
│      └──► Phase 6 (Analytics + Webhooks)
│             └──► Phase 13 (Intelligent Alerts)
│
├──► Phase 3 (Compliance)
│      └──► Phase 7 (Multi-Channel Notifications)
│
├──► Phase 4 (Intel Auto)
│      └──► Phase 12 (Risk Intelligence)
│
├──► Phase 5 (Advanced Operations)
│
├──► Phase 8 (Detection Lifecycle)
│      │
│      ├──► Phase 9 (Ownership & Daily Ops)
│      │      ├──► Phase 10 (Attack Paths)
│      │      └──► Phase 12 (Risk Intelligence)
│      │
│      ├──► Phase 11 (Knowledge Management)
│      │
│      └──► Phase 13 (Intelligent Alerts)
│
└──► Phase 14 (Enterprise SSO + API Keys)
```
> **Parallelism:** Phases 1-7 can run in parallel with Phase 8. Phases 9-13 cannot start before Phase 8 is done; Phase 12 also requires Phase 4, and Phase 13 also requires Phase 6. Phase 14 is independent.
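
The parallelism note can be checked mechanically. The sketch below encodes the per-phase "Dependencies" lines of this roadmap as a graph and groups the phases into waves that could in principle start together. It uses the standard-library `graphlib` module (Python 3.9+); the helper name `parallel_waves` is an assumption made for illustration.

```python
from graphlib import TopologicalSorter

# Phase -> set of phases it depends on, copied from the Dependencies lines of this roadmap.
DEPENDS_ON = {
    1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0},
    6: {2}, 7: {3}, 8: {0}, 9: {8}, 10: {8, 9},
    11: {8}, 12: {4, 8, 9}, 13: {6, 8, 9, 12}, 14: {0},
}


def parallel_waves(graph):
    """Group phases into waves; each phase's dependencies all sit in earlier waves."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the roadmap had a dependency cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = parallel_waves(DEPENDS_ON)
for index, wave in enumerate(waves):
    print(f"wave {index}: phases {wave}")
```

Under these dependencies, Phase 13 is the only phase in the last wave, which matches its position at the end of the chain.
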
---
## Phase 1 — Jira + Tempo Integration
**Dependencies:** Phase 0 (done)
| Feature | Description | User Value |
|---------|-------------|------------|
| Jira Link Management | Associate any Aegis entity (test, technique, campaign) with a Jira ticket via bidirectional links | Traceability between security testing and project management |
| Jira Issue Search & Auto-creation | Search Jira from Aegis; auto-create tickets from tests/campaigns with pre-filled data | Eliminates context-switching and manual ticket creation |
| Jira Bidirectional Sync | Hourly background sync pulls Jira status/assignee/priority; push test results as Jira comments | Single pane of glass for both teams |
| Tempo Worklog Integration | Automatically log time to Tempo when tests complete, using Jira link | Accurate time tracking without manual entry |
| Internal Audited Worklogs | Immutable internal time registry with a SHA-256 integrity hash per entry | Compliance-grade time audit trail |
| Frontend: Jira Panel + Worklog Timeline | React components for linking issues and viewing worklog history in detail views | Self-service Jira integration for all team members |
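
One possible shape for the integrity hash on audited worklogs is a hash chain, where each entry's SHA-256 also covers the previous entry's hash. The roadmap does not specify chaining, the canonical encoding, or the field names, so everything below is an assumption for illustration.

```python
import hashlib
import json


def worklog_hash(entry: dict, previous_hash: str = "") -> str:
    """SHA-256 over the previous hash plus a canonical JSON encoding of the entry."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((previous_hash + canonical).encode("utf-8")).hexdigest()


def verify_chain(entries: list[dict], hashes: list[str]) -> bool:
    """Recompute every hash in order; an edited or reordered entry breaks the chain."""
    if len(entries) != len(hashes):
        return False
    previous = ""
    for entry, stored in zip(entries, hashes):
        if worklog_hash(entry, previous) != stored:
            return False
        previous = stored
    return True


# Hypothetical worklog fields, for illustration only.
log = [
    {"user": "alice", "test_id": 17, "seconds": 1800},
    {"user": "bob", "test_id": 18, "seconds": 600},
]
hashes = []
for entry in log:
    hashes.append(worklog_hash(entry, hashes[-1] if hashes else ""))
```
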
---
## Phase 2 — Professional Reporting Engine
**Dependencies:** Phase 0 (done)
| Feature | Description | User Value |
|---------|-------------|------------|
| Report Template Engine | Jinja2-based HTML rendering with PDF (WeasyPrint) and DOCX (docxtpl) export | Professional branded reports for stakeholders |
| Purple Team Campaign Report | Executive summary, scope, techniques tested, critical findings, coverage evolution | Deliverable for management after every Purple Team exercise |
| Coverage & Executive Summary Reports | Coverage report, quarterly summary, technique detail — PDF/DOCX/HTML | Board-level reporting without spreadsheets |
| BI-Ready Analytics Endpoints | Flat JSON endpoints for coverage, tests, trends, operator metrics | Direct PowerBI/Tableau integration, zero ETL |
| Advanced Metrics | Coverage by tactic, never-tested techniques, avg validation time, detection trends | Operational KPIs for security leadership |
---
## Phase 3 — Compliance & Security Hardening
**Dependencies:** Phase 0 (done)
| Feature | Description | User Value |
|---------|-------------|------------|
| Enhanced Audit Trail | IP address, user agent, SHA-256 integrity hash, and session ID on every audit entry | SOC 2 / ISO 27001 audit compliance |
| Login Attempt Auditing | Record every login success and failure with source IP; compare credentials in constant time | Security monitoring and incident response |
| Password & Username Validation | Minimum 10 chars with complexity; reserved username blocking; character whitelist | Credential hardening |
| Extended Rate Limiting | Per-endpoint limits: sync 2/hr, writes 30/min, uploads 10/min, reports 5/min | DDoS and abuse protection |
| Data Classification & Retention | Labels (public/internal/sensitive/restricted) on tests, evidence, campaigns; automated retention | Data governance compliance |
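
A minimal sketch of the credential rules in this phase, using only the standard library. The roadmap states the 10-character minimum but not what "complexity" means, so the four character classes, the reserved-name list, and the username pattern are assumptions. `hmac.compare_digest` is the standard-library constant-time comparison.

```python
import hmac
import re

RESERVED_USERNAMES = {"admin", "root", "system"}      # hypothetical list
USERNAME_PATTERN = re.compile(r"[a-z0-9_.-]{3,32}")    # hypothetical whitelist


def password_problems(password: str) -> list[str]:
    """Return the unmet rules; an empty list means the password is accepted."""
    problems = []
    if len(password) < 10:
        problems.append("shorter than 10 characters")
    if not re.search(r"[a-z]", password):
        problems.append("no lowercase letter")
    if not re.search(r"[A-Z]", password):
        problems.append("no uppercase letter")
    if not re.search(r"\d", password):
        problems.append("no digit")
    if not re.search(r"[^A-Za-z0-9]", password):
        problems.append("no symbol")
    return problems


def username_allowed(username: str) -> bool:
    """Reject reserved names and anything outside the allowed character set."""
    name = username.lower()
    return name not in RESERVED_USERNAMES and USERNAME_PATTERN.fullmatch(name) is not None


def digests_match(expected_hex: str, provided_hex: str) -> bool:
    """Constant-time comparison, so timing does not reveal how many characters matched."""
    return hmac.compare_digest(expected_hex.encode(), provided_hex.encode())
```
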
---
## Phase 4 — Automated Intelligence
**Dependencies:** Phase 0 (done)
| Feature | Description | User Value |
|---------|-------------|------------|
| OSINT Enrichment per Technique | Automatic CVE discovery via NVD API linked to ATT&CK techniques; weekly job | Proactive awareness of exploitable techniques |
| Stale Coverage Detection | Daily job flags techniques whose last validated test is more than 12 months old | Prevents a false sense of security from outdated validations |
> Note: Stale detection is superseded by Phase 8's Decay Engine but serves as a functional stepping stone.
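
The stale-coverage rule can be sketched as a pure function over "last validated" dates. The 365-day approximation of 12 months, the function name, and the data shape are assumptions; never-validated techniques are deliberately excluded because the roadmap counts them separately as never-tested.

```python
from datetime import date, timedelta
from typing import Optional

STALE_AFTER = timedelta(days=365)  # "12 months" approximated as 365 days


def stale_techniques(last_validated: dict[str, Optional[date]], today: date) -> list[str]:
    """Return technique IDs whose last validated test is older than STALE_AFTER."""
    return sorted(
        technique
        for technique, validated_on in last_validated.items()
        if validated_on is not None and today - validated_on > STALE_AFTER
    )


history = {
    "T1059": date(2024, 1, 10),  # validated long ago
    "T1566": date(2025, 11, 3),  # validated recently
    "T1003": None,               # never validated
}
stale = stale_techniques(history, today=date(2026, 2, 20))
```
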
---
## Phase 5 — Advanced Operational Management
**Dependencies:** Phase 0 (done)
| Feature | Description | User Value |
|---------|-------------|------------|
| Mature Composite Scoring | Recency decay factor (1.0 recent → 0.2 if >1yr); DB-persisted configurable weights | Scores reflect actual security posture, not just test count |
| Coverage Evolution & History | Enhanced snapshots with tactic breakdown, stale/never-tested counts; temporal comparison | Track security improvement over months/quarters |
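
The roadmap gives only the endpoints of the recency decay factor (1.0 when recent, 0.2 once older than a year). The sketch below assumes a linear decay between those endpoints; the real curve and the parameter names are not specified in this document.

```python
def recency_factor(age_days: float, horizon_days: float = 365.0,
                   fresh: float = 1.0, floor: float = 0.2) -> float:
    """Decay linearly from `fresh` at age 0 to `floor` at `horizon_days`, then stay flat."""
    if age_days <= 0:
        return fresh
    if age_days >= horizon_days:
        return floor
    return fresh - (fresh - floor) * (age_days / horizon_days)


# A test validated about six months ago lands midway between the two endpoints.
half_year = recency_factor(182.5)
```

Keeping `horizon_days`, `fresh`, and `floor` as parameters mirrors the "DB-persisted configurable weights" idea: the values could be loaded from storage rather than hard-coded.
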
---
## Phase 6 — BI Analytics + Webhooks
**Dependencies:** Phase 2
| Feature | Description | User Value |
|---------|-------------|------------|
| Webhook System | Configurable outbound HTTP on events (test validated, campaign completed, MITRE sync) with HMAC signatures | Real-time integration with Slack, Teams, SOAR, SIEM |
| Webhook Management | CRUD for configs; failure tracking; auto-disable on repeated failures | Self-service integrations for ops team |
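
A minimal sketch of signing and verifying a webhook body with HMAC. The roadmap says "HMAC signatures" without naming the hash, so SHA-256, the header name `X-Aegis-Signature`, and the payload fields are assumptions.

```python
import hashlib
import hmac
import json


def sign_payload(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received: str) -> bool:
    """Receiver side: recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign_payload(secret, body), received)


secret = b"per-webhook-shared-secret"  # hypothetical value
body = json.dumps({"event": "test.validated", "test_id": 42}).encode("utf-8")
headers = {"X-Aegis-Signature": "sha256=" + sign_payload(secret, body)}  # hypothetical header
```

Signing the raw bytes rather than a re-serialized object avoids mismatches caused by key ordering or whitespace on the receiving side.
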
---
## Phase 7 — Multi-Channel Notifications
**Dependencies:** Phase 3
| Feature | Description | User Value |
|---------|-------------|------------|
| Email Notifications | SMTP-based dispatch for critical events (test validated, campaign completed, new MITRE techniques) | Reach team members not actively in the platform |
| Per-User Notification Preferences | Configurable preferences per user: email on test validated, campaign completed, etc. | Users control their notification volume |
---
## Phase 8 — Detection Lifecycle Management (DLM)
**Dependencies:** Phase 0 (done). Can run in parallel with Phases 1-7.
> This is the transformational phase: Aegis evolves from a MITRE ATT&CK tracker into a Detection Assurance Platform.
| Feature | Description | User Value |
|---------|-------------|------------|
| Detection Assets | First-class entities for SIEM/EDR/Sigma/YARA/SPL/KQL rules with content hashing, version tracking, log source tracking | Every detection rule is a managed, versioned asset |
| Detection-Technique Mapping | N:M between detection assets and ATT&CK techniques with coverage type and confidence | Know exactly which detections cover which techniques |
| Detection Validations | Immutable records with expiry dates, environment snapshots, integrity hashes | Every detection has a "quality stamp" with an expiration date |
| Decay Engine | Configurable policies per platform/tactic; daily recalculation using recency, coverage, health, diversity factors | Automated detection of degrading security posture |
| Technique Confidence Scores | Composite 0-100 score with 4 factors and risk factor identification | Quantified confidence in detection capability per technique |
| Infrastructure Change Tracking | Log SIEM/EDR updates, parser changes, log source changes; auto-invalidate affected detections | No more silent detection failures after infrastructure changes |
| Configurable Decay Policies | Different decay rates for different platforms, asset types, or tactics | Policy flexibility for different risk appetites |
| DLM Dashboard | Health distribution, confidence distribution, expiring validations, infrastructure changes | Single-view detection health for CISO / SOC Manager |
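
The composite confidence score could be computed along these lines. The four factor names come from the Decay Engine row; the 0-1 scale per factor, the equal default weights, and the 0.5 risk threshold are assumptions made for this sketch.

```python
FACTORS = ("recency", "coverage", "health", "diversity")


def confidence_score(factors, weights=None, risk_threshold=0.5):
    """Return (score 0-100, names of factors below `risk_threshold`)."""
    weights = weights or {name: 1.0 for name in FACTORS}
    total_weight = sum(weights[name] for name in FACTORS)
    # Clamp each factor to [0, 1] so a bad input cannot push the score out of range.
    weighted = sum(weights[name] * min(max(factors[name], 0.0), 1.0) for name in FACTORS)
    score = round(100 * weighted / total_weight)
    risks = [name for name in FACTORS if factors[name] < risk_threshold]
    return score, risks


score, risks = confidence_score(
    {"recency": 0.9, "coverage": 0.8, "health": 0.3, "diversity": 0.6}
)
```

Returning the weak factors alongside the number is one way to support "risk factor identification": the score says how confident, the list says why not more.
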
---
## Phase 9 — Ownership & Daily Operations
**Dependencies:** Phase 8
| Feature | Description | User Value |
|---------|-------------|------------|
| Technique & Detection Ownership | Owner, backup owner, and team on every technique and detection rule | Clear accountability for every detection gap |
| Bulk Ownership Assignment | Assign by tactic, platform, or team; orphan detection report | Quick onboarding of ownership model |
| Revalidation Queue | Auto-generated prioritized queue from expired validations, infra changes, OSINT, MITRE updates | Analysts know exactly what to work on each day |
| Analyst Dashboard | Personalized daily view: pending revalidations, expiring validations, active tests, infra changes | "My workday" in one API call |
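
A prioritized revalidation queue can be sketched with a heap. The four sources are the ones named in the table; the numeric priorities, class names, and identifiers are hypothetical.

```python
import heapq
from dataclasses import dataclass, field

# Hypothetical priorities: lower number = handled first.
SOURCE_PRIORITY = {"infra_change": 0, "expired_validation": 1, "osint": 2, "mitre_update": 3}


@dataclass(order=True)
class QueueItem:
    priority: int
    sequence: int  # tie-breaker: keeps insertion order within one priority
    detection_id: str = field(compare=False)
    source: str = field(compare=False)


class RevalidationQueue:
    def __init__(self):
        self._heap = []
        self._counter = 0

    def push(self, detection_id: str, source: str) -> None:
        item = QueueItem(SOURCE_PRIORITY[source], self._counter, detection_id, source)
        self._counter += 1
        heapq.heappush(self._heap, item)

    def pop(self) -> QueueItem:
        return heapq.heappop(self._heap)

    def __len__(self) -> int:
        return len(self._heap)


queue = RevalidationQueue()
queue.push("det-101", "osint")
queue.push("det-102", "expired_validation")
queue.push("det-103", "infra_change")
queue.push("det-104", "expired_validation")
order = [queue.pop().detection_id for _ in range(len(queue))]
```
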
---
## Phase 10 — Attack Paths & Advanced Purple Team
**Dependencies:** Phases 8, 9
| Feature | Description | User Value |
|---------|-------------|------------|
| Attack Path Modeling | Chained attack scenarios (Initial Access → Execution → Persistence → Lateral Movement → Exfiltration) as first-class entities | Model realistic adversary behavior, not isolated techniques |
| Step-by-Step Execution | Execute attack paths step-by-step with detection tracking at each stage | Measure where in the kill chain detection fails |
| Collaborative Timeline | Real-time Red/Blue action recording with timestamps for MTTD/MTTR | Precise detection and response time measurement |
| Kill Chain Metrics | Auto-calculated detection rate, MTTD, furthest step reached without detection | Quantified Purple Team exercise results |
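
The kill chain metrics could be derived from per-step timestamps as sketched below. The `Step` structure is hypothetical, and "furthest step reached without detection" is interpreted here as the last step completed before any detection fired, which is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Step:
    name: str
    executed_at: datetime
    detected_at: Optional[datetime] = None  # None means the step was never detected


def kill_chain_metrics(steps: list[Step]) -> dict:
    detected = [s for s in steps if s.detected_at is not None]
    delays = [(s.detected_at - s.executed_at).total_seconds() for s in detected]
    # Count the steps completed before the first detection anywhere in the chain.
    undetected_prefix = 0
    for step in steps:
        if step.detected_at is not None:
            break
        undetected_prefix += 1
    return {
        "detection_rate": len(detected) / len(steps) if steps else 0.0,
        "mttd_seconds": sum(delays) / len(delays) if delays else None,
        "furthest_undetected_step": steps[undetected_prefix - 1].name if undetected_prefix else None,
    }


t0 = datetime(2026, 2, 20, 9, 0)
path = [
    Step("Initial Access", t0),
    Step("Execution", t0 + timedelta(minutes=5)),
    Step("Persistence", t0 + timedelta(minutes=10), detected_at=t0 + timedelta(minutes=14)),
    Step("Lateral Movement", t0 + timedelta(minutes=20), detected_at=t0 + timedelta(minutes=22)),
]
metrics = kill_chain_metrics(path)
```
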
---
## Phase 11 — Knowledge Management
**Dependencies:** Phase 8
| Feature | Description | User Value |
|---------|-------------|------------|
| Playbooks per Technique | Attack, detect, investigate, respond, hunt — Markdown, versioned, with tools and prerequisites | Institutional knowledge capture; onboarding accelerator |
| Lessons Learned | Immutable records linked to tests, campaigns, attack paths: what happened, root cause, fix | Continuous improvement loop |
---
## Phase 12 — Risk Intelligence & Recommendations
**Dependencies:** Phases 4, 8, 9
| Feature | Description | User Value |
|---------|-------------|------------|
| Technique Risk Score | Multidimensional: exploitability, threat frequency, detection gap, staleness, tactic severity | Prioritize by actual risk, not just coverage status |
| Automated Recommendations | Prioritized actions: uncovered critical techniques, silent detections, orphan rules, tactic gaps | Intelligent prioritization out of the box |
---
## Phase 13 — Intelligent Operational Alerts
**Dependencies:** Phases 6, 8, 9, 12
| Feature | Description | User Value |
|---------|-------------|------------|
| Rule-Based Operational Alerts | Configurable rules evaluated hourly; multi-channel dispatch (in-app, email, webhook) | Proactive detection of operational issues |
| Pre-configured Alert Rules | Stale critical techniques, EDR update pending revalidation, new uncovered MITRE techniques, coverage regression, validation expiry wave | Operational intelligence out of the box |
---
## Phase 14 — Enterprise Readiness (SSO + API Keys)
**Dependencies:** Phase 0 (done)
| Feature | Description | User Value |
|---------|-------------|------------|
| API Key Management | Scoped API keys for BI tools, SOAR, and scripts; stored only as SHA-256 hashes; plaintext shown once on creation | Secure automated integrations without sharing user credentials |
| SSO / SAML 2.0 | Single Sign-On via SAML 2.0 with any IdP (Okta, Azure AD, etc.) | Enterprise authentication; eliminates password management |
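
The "hashed, shown once" handling of API keys can be sketched with the standard library. The `aegis_` prefix, the key length, and the function names are assumptions; scopes are omitted to keep the example short.

```python
import hashlib
import hmac
import secrets


def create_api_key(prefix: str = "aegis") -> tuple[str, str]:
    """Return (plaintext_key, sha256_hex). Only the hash should be persisted."""
    plaintext = f"{prefix}_{secrets.token_urlsafe(32)}"
    return plaintext, hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def key_matches(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare against the stored hash in constant time."""
    presented_hash = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(presented_hash, stored_hash)


# Show `plaintext` to the user once; store only `stored`.
plaintext, stored = create_api_key()
```

A plain SHA-256 is reasonable here because the key is a long random token rather than a human-chosen password, so a slow password hash adds little.
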
---
## Phase Summary
| Phase | Name | Dependencies |
|-------|------|-------------|
| ~~0~~ | ~~Foundations~~ | **DONE** |
| 1 | Jira + Tempo | Phase 0 |
| 2 | Professional Reporting | Phase 0 |
| 3 | Compliance & Security | Phase 0 |
| 4 | Automated Intelligence | Phase 0 |
| 5 | Advanced Operations | Phase 0 |
| 6 | Analytics + Webhooks | Phase 2 |
| 7 | Multi-Channel Notifications | Phase 3 |
| **8** | **Detection Lifecycle (DLM)** | **Phase 0** |
| 9 | Ownership & Daily Ops | Phase 8 |
| 10 | Attack Paths & Purple Team | Phases 8, 9 |
| 11 | Knowledge Management | Phase 8 |
| 12 | Risk Intelligence | Phases 4, 8, 9 |
| 13 | Intelligent Alerts | Phases 6, 8, 9, 12 |
| 14 | Enterprise SSO + API Keys | Phase 0 |
---
## Recommended Additional Features
| # | Feature | Rationale | Suggested Phase |
|---|---------|-----------|-----------------|
| A1 | Role-customizable dashboard | CISO sees executive metrics; Red Team operators see pending tests | Phase 5 |
| A2 | ATT&CK Navigator layer import/export | Teams already use Navigator externally | Phase 2 |
| A3 | Approval workflow for scoring weight changes | Prevent unsupervised config changes | Phase 3 |
| A4 | Custom tags and fields | Every org has its own taxonomy | Phase 5 |
| A5 | Bulk operations | Validate/reject multiple tests at once, mass campaign assignment | Phases 5, 9 |
| A6 | Markdown in descriptions | Technicians want to format procedures | Phase 11 |
| A7 | Detection Rule Git Sync | Sync rules from corporate Git repo | Phase 8 |
| A8 | Confidence overlay on heatmap | Heatmap shows coverage + confidence as second layer | Phase 8 |
| A9 | Auto Detection Gap → Ticket pipeline | Red Team evades a detection → queue item is created automatically → item is assigned to Blue Team | Phase 10 |
| A10 | Navigator export with Confidence | Export layer including confidence level per technique | Phase 8 |
| A11 | Comparative Attack Path Results | Compare same path executed on different dates | Phase 10 |
| A12 | SLA Tracking for Detection Gaps | Measure time from gap to rule implementation | Phase 13 |
---
## New Python Dependencies by Phase
| Phase | Package | Purpose |
|-------|---------|---------|
| 1 | `atlassian-python-api` | Jira REST API |
| 1 | `tempo-api-python-client` | Tempo worklog API |
| 2 | `weasyprint` | HTML → PDF |
| 2 | `docxtpl` | DOCX template rendering |
| 11 | `markdown`, `Pygments` | Markdown rendering + syntax highlighting |
| 14 | `python3-saml` | SAML 2.0 SSO |