docs: enterprise refactor plan with ralph specs

debian
2026-03-04 16:17:03 -05:00
parent 4c92712d20
commit f8191133c8
204 changed files with 32722 additions and 422 deletions


@@ -0,0 +1,130 @@
# ABE — AI Bug Report Enrichment Specification
## Concept
This is ABE's most important differentiator against competitors.
After detecting an anomaly, ABE can use an LLM to enrich
the bug report with an intelligent analysis: probable cause, impact,
suggested fix, and a prompt ready to use with Claude/GPT.
## IMPORTANT: this is an OPTIONAL layer on top of the deterministic core.
The core engine never calls LLMs. Enrichment is a post-processing step
that runs only if the user configures it.
## What the AI generates
### 1. Root Cause Analysis
From the action trace, HTTP log, console errors, and DOM snapshot,
the AI proposes the most probable cause of the bug.
Example: "The 500 error is likely caused by missing server-side validation
of the email field. The server crashes when receiving an empty string
where a valid email is expected."
### 2. User Impact Assessment
The AI assesses the bug's impact in business terms:
"This bug blocks users from completing registration. Any user who
submits an empty email will encounter an unhandled server error,
preventing account creation."
### 3. Suggested Fix
The AI proposes a concrete fix:
"Add server-side validation: check if email is present and valid
before processing. Return a 422 with a descriptive error message
instead of propagating the exception."
### 4. AI-Ready Debug Prompt
A complete prompt ready to copy and paste into Claude/ChatGPT:
```
Bug Report Context:
- Type: HTTP 500 on form submission
- Steps to reproduce: [exact action trace]
- Error: [exact error message]
- Request: POST /api/register with body {"email": ""}
- Response: 500 Internal Server Error
Please analyze this bug and provide:
1. Root cause
2. Code fix
3. Test case to prevent regression
```
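A minimal sketch of how the prompt above could be assembled from an anomaly's data. The `DebugPromptInput` shape and the `buildDebugPrompt` name are hypothetical and exist only for illustration; the spec fixes the prompt layout, not the helper.

```typescript
// Hypothetical input shape; real values would come from IAnomaly and IEnrichmentContext.
interface DebugPromptInput {
  type: string;      // e.g. "HTTP 500 on form submission"
  steps: string[];   // human-readable action trace
  error: string;     // exact error message
  request: string;   // e.g. 'POST /api/register with body {"email": ""}'
  response: string;  // e.g. "500 Internal Server Error"
}

// Renders the fixed prompt layout from the spec, one field per line.
function buildDebugPrompt(input: DebugPromptInput): string {
  const steps = input.steps.map((s, i) => `${i + 1}. ${s}`).join('; ');
  return [
    'Bug Report Context:',
    `- Type: ${input.type}`,
    `- Steps to reproduce: ${steps}`,
    `- Error: ${input.error}`,
    `- Request: ${input.request}`,
    `- Response: ${input.response}`,
    'Please analyze this bug and provide:',
    '1. Root cause',
    '2. Code fix',
    '3. Test case to prevent regression',
  ].join('\n');
}
```

Keeping the template in one pure function makes the prompt easy to snapshot-test and keeps provider classes free of string formatting.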
## Implementation
### Provider abstraction
```typescript
interface IAIProvider {
  name: string;
  enrich(anomaly: IAnomaly, context: IEnrichmentContext): Promise<IAIEnrichment>;
}
interface IEnrichmentContext {
  domSnapshot: string;
  httpLog: IHttpResponse[];
  consoleErrors: string[];
  actionTrace: IAction[];
  pageTitle: string;
  url: string;
}
interface IAIEnrichment {
  rootCause: string;
  userImpact: string;
  suggestedFix: string;
  debugPrompt: string;
  confidence: 'low' | 'medium' | 'high';
  generatedAt: number;
  provider: string;
  model: string;
}
```
### Implemented providers
- `ClaudeProvider`: uses the Anthropic API (claude-3-5-haiku, fast and cheap)
- `OpenAIProvider`: uses the OpenAI API (gpt-4o-mini)
- `OllamaProvider`: uses a local Ollama instance (llama3.2, no API key, works offline)
### When it runs
- Automatic: if `aiEnrichment.autoEnrich: true`, it runs after every high/critical anomaly
- Manual: "Enrich with AI" button on the AnomalyDetail page
- Non-blocking: the bug report is saved without enrichment and the AI adds it asynchronously
## Configuration in .env
```
ABE_AI_PROVIDER=claude # claude | openai | ollama | none
ABE_AI_API_KEY=sk-ant-xxx # Anthropic key (if provider=claude)
ABE_OPENAI_API_KEY=sk-xxx # OpenAI key (if provider=openai)
ABE_OLLAMA_URL=http://localhost:11434 # (if provider=ollama)
ABE_AI_MODEL=claude-haiku-4-5 # specific model (optional)
ABE_AI_AUTO_ENRICH=false # defaults to false to avoid incurring costs
ABE_AI_MIN_SEVERITY=high # auto-enrich only high/critical anomalies
```
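A minimal sketch of how these variables might be read and turned into the auto-enrich decision. The `AiConfig` shape and both function names are hypothetical; falling back to `none` when `ABE_AI_PROVIDER` is unset or unrecognized is an assumption, not something the spec states.

```typescript
type Severity = 'low' | 'medium' | 'high' | 'critical';
type AiProviderName = 'claude' | 'openai' | 'ollama' | 'none';

interface AiConfig {
  provider: AiProviderName;
  autoEnrich: boolean;
  minSeverity: Severity;
}

const SEVERITY_ORDER: Severity[] = ['low', 'medium', 'high', 'critical'];
const PROVIDERS: AiProviderName[] = ['claude', 'openai', 'ollama', 'none'];

// Reads the ABE_AI_* variables from an env-like object, falling back to safe defaults.
// Assumption: an unset or unknown provider is treated as 'none'.
function loadAiConfig(env: Record<string, string | undefined>): AiConfig {
  const provider = PROVIDERS.find((p) => p === env.ABE_AI_PROVIDER) ?? 'none';
  const minSeverity = SEVERITY_ORDER.find((s) => s === env.ABE_AI_MIN_SEVERITY) ?? 'high';
  return { provider, autoEnrich: env.ABE_AI_AUTO_ENRICH === 'true', minSeverity };
}

// True only when a provider is configured, auto-enrich is on, and severity meets the threshold.
function shouldAutoEnrich(config: AiConfig, severity: Severity): boolean {
  if (config.provider === 'none' || !config.autoEnrich) return false;
  return SEVERITY_ORDER.indexOf(severity) >= SEVERITY_ORDER.indexOf(config.minSeverity);
}
```

In the server this would typically be called once at startup with `process.env`, and `shouldAutoEnrich` consulted each time an anomaly is persisted.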
## Data model: additions to SQLite
### Add columns to anomalies
```sql
ALTER TABLE anomalies ADD COLUMN ai_enrichment_json TEXT;
ALTER TABLE anomalies ADD COLUMN ai_enriched_at INTEGER;
```
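SQLite's `ALTER TABLE ... ADD COLUMN` has no `IF NOT EXISTS` form, so a migration that runs on every startup needs its own guard. The sketch below shows one way to do that by checking `PRAGMA table_info` first. `MigrationDb` is a hypothetical structural subset of a `better-sqlite3` database (its `pragma` and `exec` methods), and the helper names are illustrative.

```typescript
// Minimal structural view of a better-sqlite3 Database: only the two methods used here.
interface MigrationDb {
  pragma(source: string): unknown;
  exec(source: string): unknown;
}

// Adds a column only when it is not already present on the table.
// Table and column names are trusted constants here, never user input.
function addColumnIfMissing(db: MigrationDb, table: string, column: string, type: string): boolean {
  const rows = db.pragma(`table_info(${table})`) as Array<{ name: string }>;
  if (rows.some((r) => r.name === column)) return false;
  db.exec(`ALTER TABLE ${table} ADD COLUMN ${column} ${type}`);
  return true;
}

// Applies the two enrichment columns; safe to call on every startup.
function migrateAiEnrichment(db: MigrationDb): void {
  addColumnIfMissing(db, 'anomalies', 'ai_enrichment_json', 'TEXT');
  addColumnIfMissing(db, 'anomalies', 'ai_enriched_at', 'INTEGER');
}
```

Typing the helper against a small interface rather than the concrete driver keeps it testable with an in-memory fake.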
## Frontend: AI panel in AnomalyDetail
If the anomaly has ai_enrichment_json, show an "AI Analysis" panel with:
- 🔍 Root Cause (text with icon)
- 👥 User Impact (text with icon)
- 🔧 Suggested Fix (code block if it contains code)
- 📋 "Copy debug prompt" button (copies the debugPrompt to the clipboard)
- Badge: "Analyzed by Claude" / "Analyzed by GPT-4o-mini" / "Analyzed by Llama 3.2"
- Timestamp of when it was generated
If there is no enrichment, show a "✨ Analyze with AI" button that calls:
POST /api/anomalies/:id/enrich
## New endpoint
### POST /api/anomalies/:anomalyId/enrich
Triggers enrichment of a specific anomaly (async).
Immediate response: { status: 'enriching' }
When it finishes, the server emits a WebSocket event: anomaly:enriched { anomalyId, enrichment }
### GET /api/anomalies/:anomalyId (updated)
Includes ai_enrichment if available.


@@ -0,0 +1,59 @@
# ABE — API Security Specification
## Authentication: API Key
All API endpoints require an API key passed in the header:
`X-ABE-API-Key: <key>`
If missing or invalid → 401 Unauthorized.
## Configuration
API key is set via environment variable: `ABE_API_KEY`
If not set, server logs a warning and runs without auth (dev mode only).
## Implementation
Create `src/server/middleware/auth.ts`:
```typescript
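import { timingSafeEqual } from 'node:crypto';
import type { Request, Response, NextFunction } from 'express';

export function apiKeyAuth(req: Request, res: Response, next: NextFunction) {
  const apiKey = process.env.ABE_API_KEY;
  if (!apiKey) return next(); // dev mode: no auth
  const provided = req.get('x-abe-api-key') ?? '';
  // Compare in constant time; the length check keeps timingSafeEqual from throwing.
  const a = Buffer.from(provided);
  const b = Buffer.from(apiKey);
  if (a.length !== b.length || !timingSafeEqual(a, b)) {
    return res.status(401).json({ error: 'Invalid or missing API key' });
  }
  next();
}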
```
Apply this middleware to ALL routes EXCEPT:
- GET /health
- GET /ready
## CORS
Only allow requests from the frontend origin.
Configure via environment variable: `ABE_CORS_ORIGIN` (default: `http://localhost:5173`)
## Rate Limiting
Add `express-rate-limit`:
- Max 20 POST /api/sessions per hour per IP
- Max 200 requests per minute per IP for other endpoints
## Environment Variables (full list for .env)
```
ABE_API_KEY=change-me-in-production
ABE_CORS_ORIGIN=http://localhost:5173
ABE_PORT=3001
ABE_DB_PATH=./data/abe.db
ABE_REPORTS_DIR=./reports
ABE_LOGS_DIR=./logs
NODE_ENV=production
```
## docker-compose update
Add .env file support and environment variables to docker-compose.yml.
Add a volumes entry for `data/` directory for SQLite persistence.


@@ -0,0 +1,187 @@
# ABE — API Server Specification
## General architecture
```
React (port 5173)
↕ HTTP REST + WebSocket
API Server Express (port 3001)
↕ direct imports
ExplorationEngine (core)
```
The server lives in `src/server/` and is the only entry point to the engine from outside. The frontend NEVER imports core code directly.
---
## Server technology
- Framework: Express.js
- WebSocket: socket.io (for real-time streaming)
- Files: `src/server/index.ts` and `src/server/routes/`
---
## REST Endpoints
### POST /api/sessions
Starts a new exploration.
Request body:
```json
{
"url": "http://localhost:3000",
"seed": 42,
"maxStates": 50
}
```
Response:
```json
{
"sessionId": "sess_abc123",
"status": "running",
"startedAt": "2025-01-15T10:00:00.000Z"
}
```
---
### GET /api/sessions
Lists all sessions (active and historical).
Response:
```json
[
{
"sessionId": "sess_abc123",
"url": "http://localhost:3000",
"status": "running",
"startedAt": "2025-01-15T10:00:00.000Z",
"anomaliesFound": 3,
"statesVisited": 12
}
]
```
---
### GET /api/sessions/:sessionId
Details of a specific session.
Response:
```json
{
"sessionId": "sess_abc123",
"url": "http://localhost:3000",
"status": "completed",
"startedAt": "2025-01-15T10:00:00.000Z",
"finishedAt": "2025-01-15T10:05:00.000Z",
"statesVisited": 12,
"anomaliesFound": 3,
"seed": 42
}
```
---
### DELETE /api/sessions/:sessionId
Stops an active session.
Response:
```json
{ "stopped": true }
```
---
### GET /api/anomalies
Lists all anomalies found (across all sessions).
Optional query params: `?sessionId=sess_abc123&severity=high`
Response:
```json
[
{
"id": "anom_a1b2c3",
"sessionId": "sess_abc123",
"type": "http_error",
"severity": "high",
"description": "Form returns HTTP 500 on empty email",
"timestamp": 1705312200000,
"screenshotUrl": "/api/anomalies/anom_a1b2c3/screenshot"
}
]
```
---
### GET /api/anomalies/:anomalyId
Full details of an anomaly, including reproduction steps.
Response: the complete serialized IAnomaly object (defined in interfaces.md)
---
### GET /api/anomalies/:anomalyId/screenshot
Returns the PNG image of the anomaly's screenshot.
Response: binary image with Content-Type: image/png
---
### POST /api/anomalies/:anomalyId/replay
Starts a replay of a specific anomaly.
Response:
```json
{
"replayId": "replay_xyz",
"status": "running"
}
```
---
## WebSocket Events (socket.io)
The client connects to `ws://localhost:3001` and listens for these events:
### Events emitted by the SERVER → client
`session:started`
```json
{ "sessionId": "sess_abc123", "url": "http://localhost:3000" }
```
`state:discovered`
```json
{ "sessionId": "sess_abc123", "stateId": "s_xyz", "url": "/register", "title": "Register" }
```
`action:executed`
```json
{ "sessionId": "sess_abc123", "actionType": "click", "selector": "button#submit", "timestamp": 1705312197000 }
```
`anomaly:detected`
```json
{ "sessionId": "sess_abc123", "anomalyId": "anom_a1b2c3", "type": "http_error", "severity": "high", "description": "..." }
```
`session:completed`
```json
{ "sessionId": "sess_abc123", "statesVisited": 12, "anomaliesFound": 3 }
```
`session:error`
```json
{ "sessionId": "sess_abc123", "error": "Target URL unreachable" }
```
### Events emitted by the CLIENT → server
`session:stop`
```json
{ "sessionId": "sess_abc123" }
```


@@ -0,0 +1,118 @@
# ABE — CLI & CI/CD Integration Specification
## CLI Entry Point
File: `src/cli.ts`
Script in package.json: `"abe": "ts-node src/cli.ts"`
After install: run `npx abe`, or `abe` if the package is installed globally.
## CLI Usage
```bash
# Basic run
abe run --url http://localhost:3000
# With auth
abe run --url http://app.com \
--auth-type login_flow \
--login-url http://app.com/login \
--username test@app.com \
--password secret
# With scope limits
abe run --url http://app.com \
--max-states 30 \
--max-depth 4 \
--allowed-domains app.com
# CI mode: exit 1 if any anomaly found
abe run --url http://localhost:3000 --fail-on-anomaly
# CI mode: exit 1 only on high/critical anomalies
abe run --url http://localhost:3000 --fail-on-severity high
# Output formats
abe run --url http://localhost:3000 --output json # prints JSON summary to stdout
abe run --url http://localhost:3000 --output junit # generates abe-results.xml for CI
# Connect to a running ABE server instead of running inline
abe run --url http://localhost:3000 --server http://abe-server:3001 --api-key mykey
```
## Exit Codes
- 0 → exploration complete, no anomalies (or no anomalies above threshold)
- 1 → anomalies found above threshold
- 2 → exploration failed (target unreachable, auth failed, etc.)
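The mapping from flags to exit codes can be sketched as a small pure function. The `ExitOptions` shape and the function name are hypothetical. Returning 0 when neither `--fail-on-anomaly` nor `--fail-on-severity` is passed is an assumption, since the spec does not say what the default threshold is.

```typescript
type Severity = 'low' | 'medium' | 'high' | 'critical';

interface ExitOptions {
  failOnAnomaly?: boolean;    // --fail-on-anomaly
  failOnSeverity?: Severity;  // --fail-on-severity <level>
}

const SEVERITY_ORDER: Severity[] = ['low', 'medium', 'high', 'critical'];

// 2 = exploration failed, 1 = anomalies above threshold, 0 = otherwise.
function computeExitCode(
  anomalies: Array<{ severity: Severity }>,
  opts: ExitOptions,
  explorationFailed = false,
): 0 | 1 | 2 {
  if (explorationFailed) return 2;
  if (opts.failOnAnomaly && anomalies.length > 0) return 1;
  if (opts.failOnSeverity) {
    const min = SEVERITY_ORDER.indexOf(opts.failOnSeverity);
    if (anomalies.some((a) => SEVERITY_ORDER.indexOf(a.severity) >= min)) return 1;
  }
  return 0; // assumption: with no flag set, anomalies alone do not fail the run
}
```

The CLI entry point would pass the result to `process.exit` after reports are written.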
## stdout JSON output (--output json)
```json
{
"sessionId": "sess_abc123",
"url": "http://localhost:3000",
"duration_ms": 45000,
"states_visited": 12,
"anomalies": [
{
"id": "anom_xyz",
"type": "http_error",
"severity": "high",
"description": "Form returns 500 on empty email",
"report_path": "reports/anom_xyz/report.json"
}
],
"exit_code": 1
}
```
## JUnit XML output (--output junit)
Generates `abe-results.xml` compatible with Jenkins, GitHub Actions, GitLab CI:
- Each anomaly = one failing test case
- Each explored state = one passing test case
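A minimal sketch of the JUnit mapping described above: one passing test case per explored state and one failing test case per anomaly. The input shapes, the `classname` values, and the function names are illustrative assumptions; only the state/anomaly mapping comes from the spec.

```typescript
interface StateResult { id: string; url: string }
interface AnomalyResult { id: string; type: string; severity: string; description: string }

// Escapes the characters that are unsafe inside XML attribute values and text.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

function toJUnitXml(states: StateResult[], anomalies: AnomalyResult[]): string {
  // Each explored state becomes a passing (empty) test case.
  const passing = states.map(
    (s) => `  <testcase classname="abe.states" name="${escapeXml(s.url)}"/>`,
  );
  // Each anomaly becomes a test case containing a <failure> element.
  const failing = anomalies.map(
    (a) =>
      `  <testcase classname="abe.anomalies" name="${escapeXml(a.id)}">\n` +
      `    <failure type="${escapeXml(a.type)}" message="${escapeXml(a.description)}">` +
      `severity: ${escapeXml(a.severity)}</failure>\n` +
      `  </testcase>`,
  );
  const total = states.length + anomalies.length;
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    `<testsuite name="ABE" tests="${total}" failures="${anomalies.length}">`,
    ...passing,
    ...failing,
    '</testsuite>',
  ].join('\n');
}
```

Escaping matters here because anomaly descriptions can contain fuzzed payloads such as script tags.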
## GitHub Actions Example Workflow
Create file: `.github/workflows/abe-example.yml` in the repo:
```yaml
name: ABE Exploratory Testing
on:
  push:
    branches: [main]
  pull_request:
jobs:
  explore:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Start application
        run: docker-compose up -d app
        # assumes the project has a docker-compose with the target app
      - name: Wait for app
        run: npx wait-on http://localhost:3000 --timeout 30000
      - name: Run ABE
        run: |
          npm install -g abe-explorer # or: npx abe
          abe run \
            --url http://localhost:3000 \
            --max-states 30 \
            --fail-on-severity high \
            --output junit
      - name: Upload results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: abe-reports
          path: reports/
      - name: Publish test results
        if: always()
        uses: EnricoMi/publish-unit-test-result-action@v2
        with:
          files: abe-results.xml
```


@@ -0,0 +1,99 @@
# ABE — Database Specification (SQLite)
## Rationale
File-based storage inside the container loses all data when the container is recreated, unless the directory is mounted as a volume.
SQLite requires zero extra services and is perfect for self-hosted deployment.
## Library
Use `better-sqlite3` (synchronous, faster than async alternatives for this use case).
## Location
Database file: `data/abe.db` (persisted via Docker volume)
## Schema
### Table: sessions
```sql
CREATE TABLE IF NOT EXISTS sessions (
id TEXT PRIMARY KEY,
url TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'running',
seed INTEGER NOT NULL,
max_states INTEGER NOT NULL DEFAULT 50,
states_visited INTEGER NOT NULL DEFAULT 0,
anomalies_found INTEGER NOT NULL DEFAULT 0,
started_at INTEGER NOT NULL,
finished_at INTEGER,
config_json TEXT NOT NULL DEFAULT '{}'
);
```
### Table: states
```sql
CREATE TABLE IF NOT EXISTS states (
id TEXT PRIMARY KEY,
session_id TEXT NOT NULL REFERENCES sessions(id),
url TEXT NOT NULL,
title TEXT NOT NULL,
dom_snapshot_path TEXT,
visit_count INTEGER NOT NULL DEFAULT 0,
discovered_at INTEGER NOT NULL
);
```
### Table: actions
```sql
CREATE TABLE IF NOT EXISTS actions (
id TEXT PRIMARY KEY,
session_id TEXT NOT NULL REFERENCES sessions(id),
state_id TEXT NOT NULL REFERENCES states(id),
type TEXT NOT NULL,
selector TEXT,
value TEXT,
url TEXT,
seed INTEGER NOT NULL,
executed_at INTEGER NOT NULL,
sequence_order INTEGER NOT NULL
);
```
### Table: anomalies
```sql
CREATE TABLE IF NOT EXISTS anomalies (
id TEXT PRIMARY KEY,
session_id TEXT NOT NULL REFERENCES sessions(id),
type TEXT NOT NULL,
severity TEXT NOT NULL,
description TEXT NOT NULL,
action_trace_json TEXT NOT NULL,
evidence_json TEXT NOT NULL,
screenshot_path TEXT,
dom_snapshot_path TEXT,
detected_at INTEGER NOT NULL
);
```
### Table: notifications
```sql
CREATE TABLE IF NOT EXISTS notifications (
id TEXT PRIMARY KEY,
anomaly_id TEXT NOT NULL REFERENCES anomalies(id),
channel TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'pending',
sent_at INTEGER,
error TEXT
);
```
## Repository Pattern
Create `src/db/` with:
- `src/db/connection.ts` — singleton SQLite connection, runs migrations on startup
- `src/db/SessionRepository.ts` — CRUD for sessions
- `src/db/AnomalyRepository.ts` — CRUD for anomalies, includes filter by session/severity
- `src/db/migrations.ts` — runs all CREATE TABLE IF NOT EXISTS on startup
## Rules
- All DB operations are synchronous (better-sqlite3 is sync)
- Repositories are injected into the API server, never imported directly by core engine
- The engine emits events → the API server listens and persists to DB


@@ -0,0 +1,102 @@
# ABE — Docker Specification
## Goal
Start the whole project (backend + frontend) with a single command:
docker-compose up --build
## Backend Dockerfile (project root)
```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
EXPOSE 3001
CMD ["node", "dist/server/index.js"]
```
## Frontend Dockerfile (frontend/Dockerfile)
Uses a multi-stage build: first it compiles with Node, then it serves with nginx.
```dockerfile
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html
COPY nginx.conf /etc/nginx/conf.d/default.conf
EXPOSE 80
```
## nginx.conf (frontend/nginx.conf)
Required for React Router to work correctly (all routes resolve to index.html):
```nginx
server {
    listen 80;
    root /usr/share/nginx/html;
    index index.html;
    location / {
        try_files $uri $uri/ /index.html;
    }
    location /api {
        proxy_pass http://backend:3001;
    }
    location /socket.io {
        proxy_pass http://backend:3001;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```
## docker-compose.yml (root)
```yaml
version: '3.8'
services:
  backend:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "3001:3001"
    environment:
      - NODE_ENV=production
      - PORT=3001
    volumes:
      - ./reports:/app/reports
      - ./logs:/app/logs
    networks:
      - abe-network
  frontend:
    build:
      context: ./frontend
      dockerfile: Dockerfile
    ports:
      - "5173:80"
    depends_on:
      - backend
    networks:
      - abe-network
networks:
  abe-network:
    driver: bridge
```
## Important notes
- In production, the frontend (nginx) proxies /api and /socket.io to the backend
- The reports/ and logs/ volumes persist data across container restarts
- The frontend is available at http://localhost:5173
- The backend is available at http://localhost:3001


@@ -0,0 +1,84 @@
# ABE — Exploration Scope & Target Authentication Specification
## Exploration Config Object
This config is passed via POST /api/sessions and stored in sessions.config_json.
```typescript
interface ExplorationConfig {
  // Scope
  allowedDomains: string[];    // e.g. ["localhost", "myapp.com"]; never follow external links
  maxStates: number;           // default: 50; stop after this many unique states
  maxDepth: number;            // default: 5; max click depth from start URL
  actionDelayMs: number;       // default: 500; wait between actions (politeness)
  sessionTimeoutMs: number;    // default: 300000 (5 min); hard stop
  // Exclusions
  excludedPaths: string[];     // e.g. ["/logout", "/admin"]; never navigate here
  excludedSelectors: string[]; // e.g. ["button.delete", "a[href*='delete']"]
  // Target authentication
  auth: AuthConfig | null;
  // Fuzzing
  fuzzingEnabled: boolean;     // default: true
  fuzzingIntensity: 'low' | 'medium' | 'high'; // default: 'medium'
}
type AuthConfig =
  | { type: 'cookies'; cookies: Array<{ name: string; value: string; domain: string }> }
  | { type: 'headers'; headers: Record<string, string> }
  | {
      type: 'login_flow';
      loginUrl: string;
      usernameSelector: string;
      passwordSelector: string;
      submitSelector: string;
      username: string;
      password: string;
    };
```
## Scope Rules (enforced in PlaywrightAgent)
1. Before navigating to any URL, check if hostname is in allowedDomains. If not, skip.
2. Before executing any action, check if current path matches excludedPaths. If yes, skip.
3. Before clicking any element, check if it matches excludedSelectors. If yes, skip.
4. Stop exploration when statesVisited >= maxStates OR depth >= maxDepth OR elapsed > sessionTimeoutMs.
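Rules 1, 2, and 4 above can be sketched as two pure checks. The helper names and the `ScopeConfig` subset are hypothetical. Exact hostname matching (no subdomain wildcards) and prefix matching on whole path segments are assumptions the spec does not pin down.

```typescript
// Subset of ExplorationConfig needed for scope checks.
interface ScopeConfig {
  allowedDomains: string[];
  excludedPaths: string[];
  maxStates: number;
  maxDepth: number;
  sessionTimeoutMs: number;
}

// Returns a parsed URL, or null when the input is not a valid absolute URL.
function parseUrl(raw: string): URL | null {
  try {
    return new URL(raw);
  } catch {
    return null;
  }
}

// Rules 1 and 2: hostname must be allowed and the path must not be excluded.
// Assumption: "/logout" excludes "/logout" and "/logout/..." but not "/logout-help".
function isUrlInScope(rawUrl: string, config: ScopeConfig): boolean {
  const url = parseUrl(rawUrl);
  if (!url) return false;
  if (config.allowedDomains.indexOf(url.hostname) === -1) return false;
  return !config.excludedPaths.some(
    (p) => url.pathname === p || url.pathname.startsWith(p + '/'),
  );
}

// Rule 4: any one limit being reached stops the exploration.
function shouldStop(
  progress: { statesVisited: number; depth: number; elapsedMs: number },
  config: ScopeConfig,
): boolean {
  return (
    progress.statesVisited >= config.maxStates ||
    progress.depth >= config.maxDepth ||
    progress.elapsedMs > config.sessionTimeoutMs
  );
}
```

Rule 3 (excluded selectors) needs a live DOM to evaluate, so it is left to the agent rather than modeled here.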
## Authentication Flow
### type: 'cookies'
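Inject cookies before the first navigation using playwright context.addCookies(). Playwright requires each cookie to carry either a `url` or a `domain`/`path` pair, so default `path` to `/` when injecting.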
### type: 'headers'
Set extra HTTP headers on the browser context using context.setExtraHTTPHeaders().
### type: 'login_flow'
Before starting exploration:
1. Navigate to loginUrl
2. Fill usernameSelector with username
3. Fill passwordSelector with password
4. Click submitSelector
5. Wait for navigation to complete
6. Verify we are no longer on loginUrl (if still there, login failed → abort session with error)
7. Proceed with exploration from the start URL (the `url` field of the session request)
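The login steps above can be sketched against a small structural interface. `PageLike` is a hypothetical subset of Playwright's `Page` (`goto`, `fill`, `click`, `waitForLoadState`, `url`), which lets the flow be exercised with a fake. Comparing origin and pathname to decide "still on loginUrl" is an assumption, and a real implementation may prefer to wait for a URL change rather than a load state.

```typescript
// Structural subset of Playwright's Page; only the methods the login flow needs.
interface PageLike {
  goto(url: string): Promise<unknown>;
  fill(selector: string, value: string): Promise<void>;
  click(selector: string): Promise<void>;
  waitForLoadState(state?: 'load' | 'domcontentloaded' | 'networkidle'): Promise<void>;
  url(): string;
}

interface LoginFlowAuth {
  type: 'login_flow';
  loginUrl: string;
  usernameSelector: string;
  passwordSelector: string;
  submitSelector: string;
  username: string;
  password: string;
}

// Two URLs are the "same page" when origin and path match, ignoring query and hash.
function samePage(a: string, b: string): boolean {
  const ua = new URL(a);
  const ub = new URL(b);
  return ua.origin === ub.origin && ua.pathname === ub.pathname;
}

// Steps 1 to 6 of the login flow; throws when the browser is still on loginUrl.
async function performLoginFlow(page: PageLike, auth: LoginFlowAuth): Promise<void> {
  await page.goto(auth.loginUrl);
  await page.fill(auth.usernameSelector, auth.username);
  await page.fill(auth.passwordSelector, auth.password);
  await page.click(auth.submitSelector);
  await page.waitForLoadState('load');
  if (samePage(page.url(), auth.loginUrl)) {
    throw new Error('Login failed: still on loginUrl after submit');
  }
}
```

The caller is expected to catch the error and abort the session, then start exploration from the start URL on success.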
## Updated POST /api/sessions request body
```json
{
"url": "http://localhost:3000",
"seed": 42,
"config": {
"allowedDomains": ["localhost"],
"maxStates": 50,
"maxDepth": 5,
"actionDelayMs": 500,
"sessionTimeoutMs": 300000,
"excludedPaths": ["/logout"],
"excludedSelectors": [],
"auth": {
"type": "login_flow",
"loginUrl": "http://localhost:3000/login",
"usernameSelector": "input[name='email']",
"passwordSelector": "input[name='password']",
"submitSelector": "button[type='submit']",
"username": "test@example.com",
"password": "password123"
},
"fuzzingEnabled": true,
"fuzzingIntensity": "medium"
}
}
```


@@ -0,0 +1,72 @@
# ABE — Frontend v2 Specification
## New pages and components to add
### New Page: Settings (route: /settings)
Sections:
1. API Key — show current key, button to copy
2. Notifications — form to set Slack webhook URL and min severity (calls PATCH /api/config)
3. Default Exploration Config — form with default values for maxStates, maxDepth, delay, excluded paths
4. About — version, links to docs
### Updated: NewSessionForm
Add fields:
- Allowed Domains (chips input, default: hostname of URL)
- Max States (number, default 50)
- Max Depth (number, default 5)
- Action Delay ms (number, default 500)
- Excluded Paths (chips input)
- Auth Type (select: none / cookies / headers / login_flow)
- If login_flow: show loginUrl, usernameSelector, passwordSelector, submitSelector, username, password
- If cookies: textarea for JSON cookie array
- If headers: key-value pairs input
- Fuzzing enabled (toggle)
- Fuzzing intensity (select: low / medium / high)
### Updated: Dashboard
Add stats bar at the top with 4 numbers:
- Total sessions
- Total anomalies found
- Critical/High anomalies (highlighted in red)
- Sessions running now
### Updated: AnomalyList
Add filter bar:
- Filter by severity (multi-select: low, medium, high, critical)
- Filter by type (multi-select: http_error, js_exception, etc.)
- Filter by session (dropdown)
- Search by description (text input)
- Sort by: newest first / severity desc
### Updated: AnomalyDetail
Add:
- Download button → downloads report.json
- Download MD button → downloads report.md
- Copy replay command button → copies `abe replay --anomaly-id anom_xxx` to clipboard
### New Component: SeverityBadge
Reusable badge component used everywhere:
- critical → red bg, white text
- high → orange bg, white text
- medium → yellow bg, dark text
- low → blue bg, white text
### New API endpoints needed (add to api-server spec)
PATCH /api/config
- Updates server config (slack webhook, min severity, defaults)
- Body: Partial<ServerConfig>
- Returns: updated ServerConfig
GET /api/config
- Returns current server config (without API key value)
GET /api/stats
- Returns: { totalSessions, totalAnomalies, criticalHighCount, runningSessions }
- Used by dashboard stats bar
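A minimal sketch of how the `/api/stats` payload could be derived from session and anomaly rows. The row shapes are trimmed, hypothetical versions of the stored records, and `computeStats` is an illustrative name; a real implementation would more likely use SQL `COUNT` queries in the repositories.

```typescript
type Severity = 'low' | 'medium' | 'high' | 'critical';

interface SessionRow { status: string }     // e.g. 'running', 'completed'
interface AnomalyRow { severity: Severity }

interface Stats {
  totalSessions: number;
  totalAnomalies: number;
  criticalHighCount: number;
  runningSessions: number;
}

// Aggregates the four numbers shown in the dashboard stats bar.
function computeStats(sessions: SessionRow[], anomalies: AnomalyRow[]): Stats {
  return {
    totalSessions: sessions.length,
    totalAnomalies: anomalies.length,
    criticalHighCount: anomalies.filter(
      (a) => a.severity === 'critical' || a.severity === 'high',
    ).length,
    runningSessions: sessions.filter((s) => s.status === 'running').length,
  };
}
```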


@@ -0,0 +1,99 @@
# ABE — Frontend Specification
## Technology
- React 18 + TypeScript
- Vite (bundler, simpler than webpack)
- TailwindCSS (styling without writing CSS by hand)
- socket.io-client (WebSocket)
- React Router v6 (navigation between pages)
## Location
The frontend lives in `frontend/` at the project root, completely separate from `src/`.
```
frontend/
├── src/
│   ├── pages/
│   │   ├── Dashboard.tsx        ← main page
│   │   ├── SessionDetail.tsx    ← live detail of a session
│   │   └── AnomalyDetail.tsx    ← detail of a bug report
│   ├── components/
│   │   ├── NewSessionForm.tsx   ← form to launch an exploration
│   │   ├── SessionList.tsx      ← list of sessions
│   │   ├── AnomalyList.tsx      ← list of anomalies
│   │   ├── LiveFeed.tsx         ← real-time event stream
│   │   └── AnomalyCard.tsx      ← card for a single anomaly
│   ├── hooks/
│   │   ├── useSocket.ts         ← reusable WebSocket connection
│   │   └── useApi.ts            ← fetch helper for the REST API
│   ├── types.ts                 ← frontend TypeScript types (mirror of interfaces.ts)
│   ├── App.tsx                  ← main router
│   └── main.tsx                 ← entry point
├── index.html
├── vite.config.ts
├── tailwind.config.ts
└── package.json
```
---
## Page 1: Dashboard (route: `/`)
Contains:
- "New Exploration" button that opens the form
- `NewSessionForm`: URL and Seed fields, Start button
- `SessionList`: table of all sessions (status, URL, anomalies found, date)
- `AnomalyList`: list of the latest anomalies across all sessions
---
## Page 2: Session Detail (route: `/sessions/:sessionId`)
Contains:
- Header with the explored URL, status (running/completed), and seed
- "Stop" button if the session is active
- `LiveFeed`: real-time list of WebSocket events
  - Each event shows icon + text + timestamp
  - Auto-scroll to the latest event
  - Colors: green for state:discovered, yellow for action:executed, red for anomaly:detected
- `AnomalyList`: anomalies found in this session (updates in real time)
---
## Page 3: Anomaly Detail (route: `/anomalies/:anomalyId`)
Contains:
- Header with type, severity (colored badge), and description
- "Reproduction Steps" section: numbered list of actions
- "Evidence" section:
  - Full-size screenshot (image)
  - Button to view the DOM snapshot (opens in a new tab)
- "HTTP Log" section: table of requests (URL, method, status, duration)
- "Raw Errors" section: code block with the raw error text
- "Run Replay" button: calls POST /api/anomalies/:id/replay and shows the status
---
## Severity colors (badges)
- critical → red (#ef4444)
- high → orange (#f97316)
- medium → yellow (#eab308)
- low → blue (#3b82f6)
---
## Connecting to the API
All calls target the API server at `http://localhost:3001`.
In `vite.config.ts`, configure a proxy for `/api` and `/socket.io` pointing to `localhost:3001`.
```typescript
// vite.config.ts
import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';

export default defineConfig({
  plugins: [react()],
  server: {
    proxy: {
      '/api': 'http://localhost:3001',
      '/socket.io': { target: 'http://localhost:3001', ws: true },
    },
  },
});
```


@@ -0,0 +1,94 @@
# ABE — Fuzzing / Disruption Module Specification
## Purpose
This is ABE's core differentiator. Instead of only clicking valid elements,
ABE injects abnormal inputs into forms to provoke unexpected server behavior.
## Architecture
```
src/plugins/fuzzers/
├── FuzzingEngine.ts               ← orchestrator, decides when and how to fuzz
├── strategies/
│   ├── EmptyValueStrategy.ts
│   ├── OversizedStringStrategy.ts
│   ├── SpecialCharsStrategy.ts
│   ├── TypeMismatchStrategy.ts
│   └── BoundaryValueStrategy.ts
└── InputTypeDetector.ts           ← detects field type from DOM attributes
```
## InputTypeDetector
Detects field type from: input[type], input[name], input[placeholder], label text, aria-label.
```typescript
type DetectedInputType =
| 'email' | 'password' | 'number' | 'date' | 'phone'
| 'url' | 'search' | 'text' | 'textarea' | 'select' | 'file'
```
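A minimal sketch of the detection order: tag name first, then the `type` attribute, then keyword hints in name, placeholder, label text, and aria-label. The `FieldAttrs` shape and the keyword list are illustrative assumptions, and a real detector would need a broader and more carefully tuned set of hints.

```typescript
type DetectedInputType =
  | 'email' | 'password' | 'number' | 'date' | 'phone'
  | 'url' | 'search' | 'text' | 'textarea' | 'select' | 'file';

// Hypothetical snapshot of the DOM attributes the detector looks at.
interface FieldAttrs {
  tagName: string;
  type?: string;
  name?: string;
  placeholder?: string;
  label?: string;
  ariaLabel?: string;
}

// input[type] values that map directly onto a detected type.
const DIRECT_TYPES: string[] = ['email', 'password', 'number', 'date', 'url', 'search', 'file'];

// Illustrative keyword hints, checked in priority order.
const HINTS: Array<[RegExp, DetectedInputType]> = [
  [/e-?mail/i, 'email'],
  [/passw|pwd/i, 'password'],
  [/phone|mobile/i, 'phone'],
  [/website|homepage/i, 'url'],
  [/birth/i, 'date'],
];

function detectInputType(field: FieldAttrs): DetectedInputType {
  const tag = field.tagName.toLowerCase();
  if (tag === 'textarea') return 'textarea';
  if (tag === 'select') return 'select';
  const type = (field.type ?? 'text').toLowerCase();
  if (type === 'tel') return 'phone';
  if (DIRECT_TYPES.indexOf(type) !== -1) return type as DetectedInputType;
  // Fall back to keyword hints for generic text inputs.
  const haystack = [field.name, field.placeholder, field.label, field.ariaLabel]
    .filter((v): v is string => Boolean(v))
    .join(' ');
  const hit = HINTS.find(([re]) => re.test(haystack));
  return hit ? hit[1] : 'text';
}
```

Checking the explicit `type` attribute before the heuristics keeps well-marked forms deterministic; the hints only matter for plain `type="text"` fields.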
## Fuzzing Strategies
### EmptyValueStrategy
Submits forms with all fields empty. Catches missing server-side validation.
Applies to: all input types.
Values: `""`, `" "` (space only), `"\t"` (tab).
### OversizedStringStrategy
Submits strings far beyond expected length. Catches buffer issues and UI overflow.
Applies to: text, email, password, textarea.
Values by intensity:
- low: 256 chars
- medium: 1024 chars
- high: 10000 chars + unicode chars
### SpecialCharsStrategy
Injects characters that break SQL, HTML, and shell contexts.
Applies to: text, email, search, textarea.
Values:
```
' OR 1=1 --
<script>alert(1)</script>
../../etc/passwd
${7*7}
\x00\x01\x02
```
### TypeMismatchStrategy
Submits wrong data types for the field.
- email field → "not-an-email", "12345", "@@@"
- number field → "abc", "-999999", "9.9.9", "NaN"
- date field → "yesterday", "32/13/2025", "0000-00-00"
- url field → "javascript:alert(1)", "not a url"
- phone field → "000", "++++", "abcdefghij"
### BoundaryValueStrategy
Tests values at the edges of expected ranges.
- number field → 0, -1, 2147483647, 2147483648, -2147483648
- date field → "1900-01-01", "2099-12-31", "1970-01-01"
## Fuzzing Execution Flow
```
For each form discovered in state:
1. InputTypeDetector analyzes each field
2. FuzzingEngine selects strategies based on fuzzingIntensity:
- low: EmptyValue + TypeMismatch only
- medium: + OversizedString + BoundaryValue
- high: + SpecialChars
3. For each strategy, fill all fields with fuzz values
4. Submit the form
5. Observe response via AnomalyDetector
6. Record results
```
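Step 2 of the flow above, plus the oversized-string lengths from the strategy section, can be sketched as two small functions. The `StrategyName` union and function names are illustrative, and the specific unicode characters appended at high intensity are an arbitrary choice.

```typescript
type FuzzingIntensity = 'low' | 'medium' | 'high';
type StrategyName =
  | 'EmptyValue' | 'TypeMismatch' | 'OversizedString' | 'BoundaryValue' | 'SpecialChars';

// Intensity levels are cumulative: each level adds strategies to the previous one.
function selectStrategies(intensity: FuzzingIntensity): StrategyName[] {
  const selected: StrategyName[] = ['EmptyValue', 'TypeMismatch'];
  if (intensity !== 'low') selected.push('OversizedString', 'BoundaryValue');
  if (intensity === 'high') selected.push('SpecialChars');
  return selected;
}

// Builds the oversized payload: 256, 1024, or 10000 chars, plus unicode at high intensity.
function oversizedValue(intensity: FuzzingIntensity): string {
  const lengths: Record<FuzzingIntensity, number> = { low: 256, medium: 1024, high: 10000 };
  const base = 'A'.repeat(lengths[intensity]);
  return intensity === 'high' ? base + 'é漢🙂' : base;
}
```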
## AnomalyDetector additions for fuzzing
Add these new anomaly types:
- `validation_bypass` — server accepted clearly invalid input (e.g. submitted empty required email, got 200)
- `server_error_on_fuzz` — server returned 500 on a fuzzed input
- `xss_reflection` — fuzzed script tag appears in response body
## Integration point
FuzzingEngine is called from ExplorationEngine AFTER normal action discovery,
only when `config.fuzzingEnabled === true`.
It is passed as an optional plugin, so the core engine doesn't depend on it directly.


@@ -0,0 +1,164 @@
# ABE — Core Interfaces Specification
## Fundamental rule
`src/core/` may only import the interfaces defined in this document.
`src/plugins/` implements these interfaces, never the other way around.
---
## IState
Represents a unique state of the application under exploration.
```typescript
interface IState {
  id: string;          // SHA-1 hash of the DOM snapshot + URL
  url: string;         // full URL in this state
  title: string;       // document.title
  timestamp: number;   // Date.now() at capture time
  domSnapshot: string; // serialized outerHTML of the body
  visitCount: number;  // number of times this state has been visited
}
```
---
## IAction
Represents an action the agent can execute.
```typescript
interface IAction {
  id: string;        // uuid v4 generated when the action is created
  type: 'click' | 'fill' | 'navigate' | 'select' | 'submit';
  selector?: string; // CSS selector of the element (if applicable)
  value?: string;    // value to enter (for fill/select)
  url?: string;      // destination (navigate only)
  timestamp: number; // when it was executed
  seed: number;      // seed used for random selection
  stateId: string;   // ID of the state it was executed from
}
```
---
## IObservation
What the agent observes AFTER executing an action.
```typescript
interface IObservation {
  id: string;                     // uuid v4
  actionId: string;               // action that produced this observation
  newStateId: string;             // ID of the resulting new state
  httpResponses: IHttpResponse[]; // all requests made during the action
  consoleErrors: string[];        // captured console.error messages
  jsExceptions: string[];         // uncaught JS exceptions
  timestamp: number;
}
interface IHttpResponse {
  url: string;
  status: number;
  method: string;
  durationMs: number;
}
```
---
## IAnomaly
A detected deviation from expected behavior.
```typescript
interface IAnomaly {
  id: string;             // uuid v4
  type: AnomalyType;
  severity: 'low' | 'medium' | 'high' | 'critical';
  observationId: string;  // observation that triggered it
  actionTrace: IAction[]; // exact sequence of actions that led here
  description: string;    // human-readable text explaining what happened
  evidence: IAnomalyEvidence;
  timestamp: number;
}
type AnomalyType =
  | 'http_error'      // HTTP 4xx or 5xx response
  | 'js_exception'    // uncaught JavaScript exception
  | 'console_error'   // console.error detected
  | 'navigation_fail' // navigation did not complete
  | 'element_missing' // expected element disappears
  | 'timeout';        // action exceeds the time limit
interface IAnomalyEvidence {
  screenshotPath?: string;   // relative path to the screenshot
  domSnapshotPath?: string;  // relative path to the serialized DOM
  httpLog?: IHttpResponse[]; // relevant requests
  rawErrors?: string[];      // original error text
}
```
---
## IInteractionAgent (plugin interface)
What any interaction agent must implement.
```typescript
interface IInteractionAgent {
  launch(url: string): Promise<void>;
  close(): Promise<void>;
  discoverActions(state: IState): Promise<IAction[]>;
  executeAction(action: IAction): Promise<IObservation>;
  captureState(): Promise<IState>;
}
```
---
## ICollector (plugin interface)
What any context collector must implement.
```typescript
interface ICollector {
  name: string;
  collect(anomaly: IAnomaly, agent: IInteractionAgent): Promise<IAnomalyEvidence>;
}
```
---
## IReproducer
Generates a replay script from an action trace.
```typescript
interface IReproducer {
  serialize(trace: IAction[]): string;      // serialized JSON
  deserialize(raw: string): IAction[];      // rebuilds the trace
  generateScript(trace: IAction[]): string; // executable Playwright script
}
```
---
## IExporter (plugin interface)
Transforms an anomaly into a consumable report.
```typescript
interface IExporter {
  format: 'markdown' | 'json';
  export(anomaly: IAnomaly, outputDir: string): Promise<string>; // returns the path of the generated file
}
```
---
## StateGraph
Not an interface, but its contract must be explicit.
```typescript
declare class StateGraph {
  addState(state: IState): void;
  hasState(stateId: string): boolean;
  recordTransition(fromId: string, action: IAction, toId: string): void;
  getUnvisited(): IState[];          // states with visitCount === 0
  getNextToExplore(): IState | null; // BFS heuristic by default
  toJSON(): object;                  // serializable for logs
}
```
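A minimal in-memory sketch of this contract. `InMemoryStateGraph` is a hypothetical name, `IState` and `IAction` are trimmed copies included only to keep the block self-contained, and the sketch assumes the engine increments `visitCount` on the state objects it has added. Returning the earliest-discovered unvisited state (FIFO) gives breadth-first order when the engine expands one state at a time.

```typescript
interface IState {
  id: string;
  url: string;
  title: string;
  timestamp: number;
  domSnapshot: string;
  visitCount: number;
}
interface IAction { id: string; type: string; stateId: string } // trimmed for the sketch

class InMemoryStateGraph {
  private states = new Map<string, IState>();
  private order: string[] = []; // discovery order; consumed front to back
  private transitions: Array<{ from: string; actionId: string; to: string }> = [];

  // Adding a state that is already known is a no-op.
  addState(state: IState): void {
    if (this.states.has(state.id)) return;
    this.states.set(state.id, state);
    this.order.push(state.id);
  }

  hasState(stateId: string): boolean {
    return this.states.has(stateId);
  }

  recordTransition(fromId: string, action: IAction, toId: string): void {
    this.transitions.push({ from: fromId, actionId: action.id, to: toId });
  }

  // States with visitCount === 0, in discovery order.
  getUnvisited(): IState[] {
    return this.order
      .map((id) => this.states.get(id) as IState)
      .filter((s) => s.visitCount === 0);
  }

  // FIFO over unvisited states.
  getNextToExplore(): IState | null {
    return this.getUnvisited()[0] ?? null;
  }

  toJSON(): object {
    return {
      states: this.order.map((id) => this.states.get(id)),
      transitions: this.transitions,
    };
  }
}
```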


@@ -0,0 +1,119 @@
# ABE — Multi-Browser, Mobile Emulation & Accessibility Specification
## Multi-browser testing
### Supported browsers (via Playwright)
- chromium (Chrome/Edge): always available
- firefox: optional
- webkit (Safari): optional
### Configuration in ExplorationConfig
```typescript
browsers: Array<'chromium' | 'firefox' | 'webkit'>; // default: ['chromium']
```
### Behavior
When multiple browsers are specified:
- ABE runs the same exploration in parallel in each browser
- Each browser creates its own sub-session with the same seed
- Results are grouped under the same parent session
- Each anomaly records which browser detected it
- Anomalies that appear in ALL browsers → severity += 1 level
- Anomalies that appear in only one browser → add a tag naming that browser, e.g. "browser-specific: webkit"
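The last two rules can be sketched as a pure function. The `tags` field, the `BrowserAnomaly` subset, and the rule that two anomalies match when `type` and `description` are equal are all assumptions made for this illustration.

```typescript
type Browser = 'chromium' | 'firefox' | 'webkit';
type Severity = 'low' | 'medium' | 'high' | 'critical';

// Hypothetical subset of IAnomaly, with a `tags` field assumed for this sketch.
interface BrowserAnomaly {
  type: string;
  description: string;
  severity: Severity;
  browser: Browser;
  tags: string[];
}

const LEVELS: Severity[] = ['low', 'medium', 'high', 'critical'];

function mergeAcrossBrowsers(anomalies: BrowserAnomaly[], browsers: Browser[]): BrowserAnomaly[] {
  // Assumption: anomalies are "the same" when type and description match.
  const keyOf = (a: BrowserAnomaly): string => `${a.type}::${a.description}`;
  const seenIn = new Map<string, Set<Browser>>();
  for (const a of anomalies) {
    const set = seenIn.get(keyOf(a)) ?? new Set<Browser>();
    set.add(a.browser);
    seenIn.set(keyOf(a), set);
  }
  return anomalies.map(a => {
    const hits = seenIn.get(keyOf(a))!;
    if (browsers.length > 1 && hits.size === browsers.length) {
      // Seen in every browser: raise severity one level, capped at critical.
      const bumped = LEVELS[Math.min(LEVELS.indexOf(a.severity) + 1, LEVELS.length - 1)];
      return { ...a, severity: bumped };
    }
    if (browsers.length > 1 && hits.size === 1) {
      return { ...a, tags: [...a.tags, `browser-specific: ${a.browser}`] };
    }
    return a;
  });
}
```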
### Add to IAnomaly
```typescript
browser: 'chromium' | 'firefox' | 'webkit';
browserVersion: string;
```
---
## Mobile Viewport Emulation
### Predefined devices (from the Playwright device registry)
```typescript
type MobileDevice =
| 'iPhone 14'
| 'iPhone 14 Pro Max'
| 'Pixel 7'
| 'Galaxy S23'
| 'iPad Pro'
| 'none' // desktop (default)
```
### In ExplorationConfig
```typescript
mobileDevice: MobileDevice; // default: 'none'
viewport: { width: number; height: number } | null; // manual override
```
### Implementation in PlaywrightAgent
```typescript
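// If mobileDevice !== 'none':
const device = playwright.devices[config.mobileDevice];
// Device names vary between Playwright versions, so fail fast on an unknown one.
if (!device) throw new Error(`Unknown Playwright device: ${config.mobileDevice}`);
const context = await browser.newContext({ ...device });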
```
### Mobile-specific anomalies
Add the type `mobile_layout_issue`, detected when:
- A clickable element is smaller than 44x44 CSS px (the WCAG 2.5.5 target size)
- There is unexpected horizontal scroll (viewport overflow)
- An element is rendered outside the viewport on mobile
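The three checks above can be expressed over plain measurements, independent of the browser. In this sketch `ElementBox` is a hypothetical shape (for example filled from `getBoundingClientRect()`), and "outside the viewport" is interpreted as off-screen horizontally or above the top edge, since content below the fold is normal on mobile.

```typescript
// Hypothetical measurement of a clickable element.
interface ElementBox { selector: string; x: number; y: number; width: number; height: number; }
interface Viewport { width: number; height: number; }

const MIN_TOUCH_TARGET_PX = 44;

function findMobileLayoutIssues(boxes: ElementBox[], viewport: Viewport, scrollWidth: number): string[] {
  const issues: string[] = [];
  // Horizontal overflow: the document is wider than the viewport.
  if (scrollWidth > viewport.width) {
    issues.push(`horizontal overflow: content is ${scrollWidth}px wide in a ${viewport.width}px viewport`);
  }
  for (const b of boxes) {
    if (b.width < MIN_TOUCH_TARGET_PX || b.height < MIN_TOUCH_TARGET_PX) {
      issues.push(`${b.selector}: touch target ${b.width}x${b.height}px is below 44x44px`);
    }
    // Assumption: only horizontal or above-the-top placement counts as "outside".
    const outside = b.x + b.width <= 0 || b.x >= viewport.width || b.y + b.height <= 0;
    if (outside) issues.push(`${b.selector}: rendered outside the viewport`);
  }
  return issues;
}
```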
---
## Accessibility Testing (axe-core)
### Library
Use `@axe-core/playwright` (the official axe + Playwright integration).
### When to run
After every action that changes the state (a navigation, or a click that results in a new state).
Do NOT run it on every fill action (too frequent).
### Implementation
```typescript
import AxeBuilder from '@axe-core/playwright';
import type { Page } from 'playwright';
// In PlaywrightAgent, after captureState():
async function runAccessibilityCheck(page: Page): Promise<IAccessibilityResult[]> {
// checkA11y from 'axe-playwright' only asserts and returns nothing; AxeBuilder.analyze() returns the results.
const results = await new AxeBuilder({ page }).analyze();
return results.violations.map(v => ({
id: v.id,
impact: v.impact, // 'minor' | 'moderate' | 'serious' | 'critical'
description: v.description,
helpUrl: v.helpUrl,
nodes: v.nodes.length,
selector: v.nodes[0]?.target?.join(', '),
}));
}
```
### New anomaly type
- type: `accessibility_violation`
- severity mapping from axe impact:
- minor → low
- moderate → medium
- serious → high
- critical → critical
- description: "[axe] {violation.description}"
- evidence: { helpUrl, affectedNodes, wcagCriteria }
### In ExplorationConfig
```typescript
accessibility: {
enabled: boolean; // default: true
minImpact: 'minor' | 'moderate' | 'serious' | 'critical'; // default: 'serious'
wcagLevel: 'A' | 'AA' | 'AAA'; // default: 'AA'
}
```
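The severity mapping and the `minImpact` filter can be combined in one pure step. This is a sketch under two assumptions: `Violation` is a reduced, hypothetical shape of the axe result, and `minImpact` is treated as an inclusive lower bound.

```typescript
type AxeImpact = 'minor' | 'moderate' | 'serious' | 'critical';
type Severity = 'low' | 'medium' | 'high' | 'critical';

const IMPACT_ORDER: AxeImpact[] = ['minor', 'moderate', 'serious', 'critical'];
const IMPACT_TO_SEVERITY: Record<AxeImpact, Severity> = {
  minor: 'low',
  moderate: 'medium',
  serious: 'high',
  critical: 'critical',
};

// Hypothetical reduced violation shape.
interface Violation { id: string; impact: AxeImpact; description: string; }

function toAccessibilityAnomalies(violations: Violation[], minImpact: AxeImpact = 'serious') {
  const floor = IMPACT_ORDER.indexOf(minImpact);
  return violations
    // Keep only violations at or above the configured impact.
    .filter(v => IMPACT_ORDER.indexOf(v.impact) >= floor)
    .map(v => ({
      type: 'accessibility_violation' as const,
      severity: IMPACT_TO_SEVERITY[v.impact],
      description: `[axe] ${v.description}`,
    }));
}
```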
### In the bug report
Add an "Accessibility Violations" section to report.md with:
- The list of violations, each with an impact badge
- A link to the documentation for each rule (axe's helpUrl)
- The CSS selector of the affected element

View File

@@ -0,0 +1,88 @@
# ABE — Network Chaos Specification
## Concept
Inspired by Gremlin and LitmusChaos, but applied at the browser level.
ABE can simulate adverse network conditions during exploration
to discover how the app behaves on slow or intermittent networks,
or when external services are failing.
## How this differs from input fuzzing:
- Fuzzing: invalid inputs in forms
- Network chaos: adverse network conditions (latency, packet loss, timeouts)
## Implementation via Playwright CDP
Playwright exposes the Chrome DevTools Protocol (CDP), which allows network control. CDP sessions are only available in Chromium-based browsers, so this throttling does not apply to Firefox or WebKit runs:
```typescript
// In PlaywrightAgent; `page` is the agent's current Playwright Page
async function applyNetworkCondition(page: Page, condition: NetworkCondition): Promise<void> {
const client = await page.context().newCDPSession(page);
await client.send('Network.emulateNetworkConditions', {
offline: condition.offline,
downloadThroughput: condition.downloadKbps * 1024 / 8,
uploadThroughput: condition.uploadKbps * 1024 / 8,
latency: condition.latencyMs,
});
}
```
## Predefined network profiles
```typescript
const NETWORK_PROFILES = {
'fast-3g': { downloadKbps: 1500, uploadKbps: 750, latencyMs: 40, offline: false },
'slow-3g': { downloadKbps: 400, uploadKbps: 150, latencyMs: 400, offline: false },
'2g': { downloadKbps: 50, uploadKbps: 30, latencyMs: 800, offline: false },
'offline': { downloadKbps: 0, uploadKbps: 0, latencyMs: 0, offline: true },
'none': null // no throttling (default)
}
```
## API request interception (simulating failed services)
```typescript
// Simulate a specific endpoint failing with 503
await page.route('**/api/payment**', route =>
route.fulfill({ status: 503, body: 'Service Unavailable' })
);
// Simulate latency on a specific endpoint
await page.route('**/api/search**', async route => {
await new Promise(r => setTimeout(r, 3000)); // 3s delay
await route.continue();
});
```
## Configuration in ExplorationConfig
```typescript
networkChaos: {
enabled: boolean; // default: false
profile: keyof typeof NETWORK_PROFILES; // default: 'none'
blockedEndpoints: string[]; // glob patterns that respond with 503
slowEndpoints: Array<{
pattern: string; // glob
delayMs: number;
}>;
}
```
## Network chaos anomalies
Add these types to AnomalyDetector:
- `offline_handling_missing`: the app shows a blank screen or an unhandled error when offline
- `slow_network_no_feedback`: under slow-3g, the app shows no loading indicator (detected when CLS=0 but LCP>5000ms and no element has the role 'progressbar' or 'status')
- `external_service_crash`: a blocked endpoint causes a 500 error in the frontend
## Integration with the exploration flow
NetworkChaos is applied sequentially, not simultaneously:
1. First pass: normal exploration (baseline)
2. Second pass (if networkChaos.enabled): same seed, with the network profile applied
3. Compare results: anomalies that appear only in the second pass are network-related
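Step 3 amounts to a set difference between the two passes. The sketch below assumes that an anomaly is identified by its `type` plus the `stateId` where it occurred, which should line up across passes because both use the same seed; `PassAnomaly` is a hypothetical reduced shape.

```typescript
// Hypothetical reduced anomaly shape for comparing two passes.
interface PassAnomaly { type: string; stateId: string; description: string; }

function findNetworkRelated(baseline: PassAnomaly[], chaos: PassAnomaly[]): PassAnomaly[] {
  // Assumption: type + stateId identifies "the same" anomaly across passes.
  const keyOf = (a: PassAnomaly): string => `${a.type}@${a.stateId}`;
  const known = new Set(baseline.map(keyOf));
  // Anything present only in the chaos pass is attributed to the network condition.
  return chaos.filter(a => !known.has(keyOf(a)));
}
```

Anomalies that already appear in the baseline are left to the normal reporting path, so they are not double-counted as network-related.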
## Frontend — Network Chaos Config
In NewSessionForm, add a collapsible "Network Chaos" section:
- Toggle "Enable network chaos"
- Profile select: Fast 3G / Slow 3G / 2G / Offline
- Textarea "Blocked endpoints" (one per line, glob patterns)
- "Slow endpoints" list with pattern + delay ms fields

View File

@@ -0,0 +1,64 @@
# ABE — Notifications Specification
## Purpose
When ABE finds an anomaly autonomously, notify the team immediately.
## Supported Channels
### 1. Slack Webhook
Environment variable: `ABE_SLACK_WEBHOOK_URL`
Payload sent to Slack on anomaly:detected:
```json
{
"text": "🐛 ABE found a bug!",
"blocks": [
{
"type": "section",
"text": {
"type": "mrkdwn",
"text": "*ABE Bug Report*\n*Severity:* 🔴 HIGH\n*Type:* http_error\n*Description:* Form returns HTTP 500 on empty email\n*Session:* sess_abc123\n*Target:* http://localhost:3000"
}
},
{
"type": "actions",
"elements": [
{
"type": "button",
"text": { "type": "plain_text", "text": "View Report" },
"url": "http://localhost:5173/anomalies/anom_abc123"
}
]
}
]
}
```
Only send for severity: high or critical (configurable via `ABE_NOTIFY_MIN_SEVERITY`).
### 2. Generic Webhook
Environment variable: `ABE_WEBHOOK_URL`
POST request with the full IAnomaly object as JSON body.
Includes header: `X-ABE-Event: anomaly.detected`
## Implementation
Create `src/server/notifications/`:
- `NotificationService.ts` — main service, called after anomaly is persisted to DB
- `SlackNotifier.ts` — implements Slack webhook
- `WebhookNotifier.ts` — implements generic webhook
NotificationService.notify(anomaly) is called from the API server
after every anomaly:detected event from the engine.
## Configuration (environment variables)
```
ABE_SLACK_WEBHOOK_URL=https://hooks.slack.com/services/xxx/yyy/zzz
ABE_WEBHOOK_URL=https://myapp.com/webhooks/abe
ABE_NOTIFY_MIN_SEVERITY=high # low | medium | high | critical
```
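A minimal sketch of the severity gate driven by `ABE_NOTIFY_MIN_SEVERITY`. The function names are hypothetical, and falling back to `high` for a missing or invalid value is an assumption based on the "high or critical" default stated for Slack.

```typescript
type Severity = 'low' | 'medium' | 'high' | 'critical';

const SEVERITY_ORDER: Severity[] = ['low', 'medium', 'high', 'critical'];

// Parses the raw environment value; unknown or missing values fall back to 'high'.
function parseMinSeverity(raw: string | undefined): Severity {
  const value = (raw ?? '').trim().toLowerCase();
  const index = (SEVERITY_ORDER as string[]).indexOf(value);
  return index >= 0 ? SEVERITY_ORDER[index] : 'high';
}

// True when the anomaly is at least as severe as the configured minimum.
function shouldNotify(severity: Severity, minSeverity: Severity): boolean {
  return SEVERITY_ORDER.indexOf(severity) >= SEVERITY_ORDER.indexOf(minSeverity);
}
```

In the service this would be called as `shouldNotify(anomaly.severity, parseMinSeverity(process.env.ABE_NOTIFY_MIN_SEVERITY))` before any notifier is invoked.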
## Notification record
Every notification attempt (success or failure) is saved to the notifications table in SQLite.
Failed notifications are retried once after 60 seconds.

View File

@@ -0,0 +1,130 @@
# ABE — Output Format Specification
Each anomaly generates TWO files in `reports/{anomaly-id}/`:
---
## 1. report.json (for AI and tooling consumption)
```json
{
"version": "1.0",
"generated_at": "2025-01-15T10:30:00.000Z",
"environment": {
"target_url": "http://localhost:3000",
"abe_version": "0.1.0",
"os": "linux",
"node_version": "20.x"
},
"anomaly": {
"id": "anom_a1b2c3d4",
"type": "http_error",
"severity": "high",
"description": "Form submission returns HTTP 500 on empty email field",
"timestamp": 1705312200000
},
"reproduction": {
"seed": 42,
"steps": [
{
"step": 1,
"action_type": "navigate",
"url": "http://localhost:3000/register",
"timestamp": 1705312195000
},
{
"step": 2,
"action_type": "fill",
"selector": "input[name='email']",
"value": "",
"timestamp": 1705312196000
},
{
"step": 3,
"action_type": "click",
"selector": "button[type='submit']",
"timestamp": 1705312197000
}
]
},
"evidence": {
"screenshot": "screenshot.png",
"dom_snapshot": "dom.html",
"http_log": [
{
"url": "http://localhost:3000/api/register",
"method": "POST",
"status": 500,
"duration_ms": 234
}
],
"console_errors": [],
"js_exceptions": []
}
}
```
---
## 2. report.md (for human readers)
The Markdown file must have exactly this structure:
```markdown
# Bug Report — [anomaly type] — [date]
## Summary
[One sentence describing what happened and where]
## Severity
[low | medium | high | critical] — [one-sentence justification]
## Reproduction Steps
1. Navigate to `[url]`
2. [action 2]
3. [action 3]
...
**Seed used**: `42`
**Replay command**: `npm run replay -- --report reports/anom_a1b2c3d4/report.json`
## Observed Behavior
[What exactly happened: errors, HTTP responses, messages]
## Evidence
- Screenshot: `reports/anom_a1b2c3d4/screenshot.png`
- DOM Snapshot: `reports/anom_a1b2c3d4/dom.html`
- HTTP Log: [table of the relevant requests]
## Raw Errors
\`\`\`
[error text exactly as it appeared]
\`\`\`
```
---
## Output folder structure
```
reports/
└── anom_a1b2c3d4/
├── report.json ← structured for AI
├── report.md ← readable by humans
├── screenshot.png ← capture taken at the moment of the anomaly
└── dom.html ← full DOM snapshot
logs/
└── session_20250115_103000.jsonl ← one JSON line per event
```
---
## Session log format (.jsonl)
Each line is an independent JSON object:
```jsonl
{"event":"session_start","timestamp":1705312190000,"seed":42,"target":"http://localhost:3000"}
{"event":"state_discovered","timestamp":1705312191000,"state_id":"s_abc123","url":"/","title":"Home"}
{"event":"action_executed","timestamp":1705312196000,"action_id":"act_xyz","type":"fill","selector":"input[name='email']","value":""}
{"event":"anomaly_detected","timestamp":1705312197000,"anomaly_id":"anom_a1b2c3d4","type":"http_error","severity":"high"}
{"event":"session_end","timestamp":1705312210000,"states_visited":3,"anomalies_found":1}
```
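Because every line is a standalone JSON object, the log can be consumed line by line. The sketch below parses a log and derives a small summary; `SessionEvent`, `parseSessionLog`, and `summarizeSession` are hypothetical names, and only the `event` and `timestamp` fields shown above are relied on.

```typescript
// Only the two fields common to every event are typed; the rest stay unknown.
interface SessionEvent { event: string; timestamp: number; [key: string]: unknown; }

function parseSessionLog(jsonl: string): SessionEvent[] {
  return jsonl
    .split('\n')
    .map(line => line.trim())
    .filter(line => line.length > 0) // tolerate blank lines and a trailing newline
    .map(line => JSON.parse(line) as SessionEvent);
}

function summarizeSession(events: SessionEvent[]): { durationMs: number | null; anomalies: number } {
  const start = events.find(e => e.event === 'session_start');
  const end = events.find(e => e.event === 'session_end');
  return {
    durationMs: start && end ? end.timestamp - start.timestamp : null,
    anomalies: events.filter(e => e.event === 'anomaly_detected').length,
  };
}
```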

View File

@@ -0,0 +1,124 @@
# ABE — Performance Metrics Specification
## Concept
During exploration, ABE captures performance metrics for each
visited state. Inspired by Checkly and Datadog RUM.
This makes it possible to detect performance anomalies in addition to functional errors.
## Metrics captured per state
```typescript
interface IPerformanceMetrics {
stateId: string;
url: string;
timestamp: number;
// Navigation Timing (available via Playwright)
ttfb: number; // Time to First Byte (ms)
domContentLoaded: number; // DOMContentLoaded event (ms)
loadComplete: number; // Load event (ms)
// Core Web Vitals (via web-vitals library injected)
lcp: number | null; // Largest Contentful Paint (ms)
cls: number | null; // Cumulative Layout Shift (score)
fid: number | null; // First Input Delay (ms), only after an interaction
inp: number | null; // Interaction to Next Paint (ms)
// Resource counts
totalRequests: number;
failedRequests: number;
totalTransferSize: number; // bytes
}
```
## Implementation
### TTFB, DOMContentLoaded, Load
Via `page.evaluate()`, using the Navigation Timing API after navigation:
```typescript
const timing = await page.evaluate(() => {
// performance.timing is deprecated; Navigation Timing Level 2 values are relative to navigation start.
const [nav] = performance.getEntriesByType('navigation') as PerformanceNavigationTiming[];
return {
ttfb: nav.responseStart - nav.requestStart,
domContentLoaded: nav.domContentLoadedEventEnd,
loadComplete: nav.loadEventEnd,
};
});
```
### Core Web Vitals
Inject the `web-vitals` script (npm) into the page:
```typescript
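await page.addScriptTag({ url: 'https://unpkg.com/web-vitals/dist/web-vitals.iife.js' });
const vitals = await page.evaluate(() => new Promise(resolve => {
const result = {};
// Current web-vitals releases expose onLCP/onCLS/onINP; the older get* names are gone.
// reportAllChanges makes the callbacks fire before the page is hidden.
const opts = { reportAllChanges: true };
webVitals.onLCP(m => result.lcp = m.value, opts);
webVitals.onCLS(m => result.cls = m.value, opts);
webVitals.onINP(m => result.inp = m.value, opts);
setTimeout(() => resolve(result), 3000); // wait 3s for vitals
}));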
```
## Performance anomalies (new types)
Add to AnomalyDetector, with thresholds based on Google's Core Web Vitals guidance:
| Metric | Good | Needs Improvement | Poor (anomaly) |
|--------|---------|-------------------|------------------|
| LCP | <2500ms | 2500-4000ms | >4000ms → high |
| CLS | <0.1 | 0.1-0.25 | >0.25 → medium |
| INP | <200ms | 200-500ms | >500ms → high |
| TTFB | <800ms | 800-1800ms | >1800ms → medium |
Anomaly type: `performance_degradation`
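The threshold table translates directly into a lookup plus a rating function. This is a sketch with hypothetical names (`THRESHOLDS`, `rateMetric`, `detectPerformanceAnomalies`); boundary values such as exactly 2500ms are placed in "needs improvement", matching the ranges in the table.

```typescript
type Rating = 'good' | 'needs-improvement' | 'poor';
type Metric = 'lcp' | 'cls' | 'inp' | 'ttfb';

const THRESHOLDS: Record<Metric, { good: number; poor: number; severity: 'medium' | 'high' }> = {
  lcp: { good: 2500, poor: 4000, severity: 'high' },
  cls: { good: 0.1, poor: 0.25, severity: 'medium' },
  inp: { good: 200, poor: 500, severity: 'high' },
  ttfb: { good: 800, poor: 1800, severity: 'medium' },
};

function rateMetric(metric: Metric, value: number): Rating {
  const t = THRESHOLDS[metric];
  if (value > t.poor) return 'poor';
  if (value >= t.good) return 'needs-improvement';
  return 'good';
}

// Null metrics (not captured) are skipped rather than treated as failures.
function detectPerformanceAnomalies(metrics: Partial<Record<Metric, number | null>>) {
  const found: Array<{ type: 'performance_degradation'; metric: Metric; value: number; severity: 'medium' | 'high' }> = [];
  for (const metric of Object.keys(THRESHOLDS) as Metric[]) {
    const value = metrics[metric];
    if (typeof value === 'number' && rateMetric(metric, value) === 'poor') {
      found.push({ type: 'performance_degradation', metric, value, severity: THRESHOLDS[metric].severity });
    }
  }
  return found;
}
```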
## Data model (add to SQLite)
### Table: performance_metrics
```sql
CREATE TABLE IF NOT EXISTS performance_metrics (
id TEXT PRIMARY KEY,
session_id TEXT NOT NULL,
state_id TEXT NOT NULL,
url TEXT NOT NULL,
ttfb INTEGER,
dom_content_loaded INTEGER,
load_complete INTEGER,
lcp INTEGER,
cls REAL,
fid INTEGER,
inp INTEGER,
total_requests INTEGER,
failed_requests INTEGER,
total_transfer_size INTEGER,
captured_at INTEGER NOT NULL
);
```
## Frontend — Performance tab
Add a "Performance" tab to SessionDetail:
- Table of all visited states and their metrics
- Color-coded columns: green/yellow/red according to Google's thresholds
- Bar chart: LCP per state (to identify slow pages)
- Summary cards: worst LCP, worst CLS, and worst TTFB of the session
## In the bug report
If there is a performance_degradation anomaly, add a section to report.md:
```
## Performance Issue
- LCP: 5200ms (threshold: 4000ms) ❌
- CLS: 0.08 ✅
- TTFB: 2100ms (threshold: 1800ms) ❌
- Total page size: 4.2MB
```
## Configuration
Add to ExplorationConfig:
```typescript
performance: {
enabled: boolean; // default: true
lcpThresholdMs: number; // default: 4000
clsThreshold: number; // default: 0.25
inpThresholdMs: number; // default: 500
ttfbThresholdMs: number; // default: 1800
}
```

View File

@@ -0,0 +1,77 @@
# ABE — Production Hardening Specification
## Health Endpoints (no auth required)
### GET /health
Returns 200 if server is up.
```json
{ "status": "ok", "version": "0.1.0", "uptime_seconds": 3600 }
```
### GET /ready
Returns 200 if server is ready to accept requests (DB connected, no critical errors).
Returns 503 if not ready.
```json
{ "status": "ready", "db": "connected", "active_sessions": 2 }
```
Used by Docker HEALTHCHECK and Kubernetes readiness probes.
## Docker improvements
### Backend Dockerfile
Add HEALTHCHECK:
```dockerfile
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD curl -f http://localhost:3001/health || exit 1
```
### docker-compose.yml updates
- Add healthcheck to backend service
- Add `restart: unless-stopped` to both services
- Add `data/` volume for SQLite persistence
- Load `.env` file: `env_file: .env`
- Add `depends_on: backend: condition: service_healthy` to frontend
### .env.example file
Create `.env.example` in repo root with all variables and example values.
`.env` added to `.gitignore`.
## Error handling improvements
Global Express error handler in `src/server/index.ts`:
- Catch all unhandled errors
- Log with timestamp and stack trace
- Return consistent JSON error format:
```json
{ "error": "Internal server error", "code": "INTERNAL_ERROR", "timestamp": 1705312200000 }
```
Never expose stack traces in production (NODE_ENV=production).
## Graceful shutdown
On SIGTERM/SIGINT:
1. Stop accepting new sessions
2. Wait for active sessions to finish (max 30s)
3. Close DB connection
4. Exit 0
## Concurrency limits
- Max concurrent exploration sessions: configurable via `ABE_MAX_CONCURRENT_SESSIONS` (default: 3)
- If limit reached, POST /api/sessions returns 429 with:
```json
{ "error": "Max concurrent sessions reached", "active": 3, "limit": 3 }
```
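The limit can be enforced with a small counter that the POST /api/sessions handler consults before starting a session. `SessionLimiter` and `StartResult` are hypothetical names for this sketch; only the 429 status and the response body come from the spec above.

```typescript
type StartResult =
  | { ok: true; active: number }
  | { ok: false; status: 429; body: { error: string; active: number; limit: number } };

class SessionLimiter {
  private active = 0;

  // The default of 3 mirrors the ABE_MAX_CONCURRENT_SESSIONS default.
  constructor(private readonly limit: number = 3) {}

  tryStart(): StartResult {
    if (this.active >= this.limit) {
      return {
        ok: false,
        status: 429,
        body: { error: 'Max concurrent sessions reached', active: this.active, limit: this.limit },
      };
    }
    this.active += 1;
    return { ok: true, active: this.active };
  }

  // Must be called when a session ends, whether it succeeded or failed.
  finish(): void {
    this.active = Math.max(0, this.active - 1);
  }
}
```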
## Logging improvements
Replace console.log with structured logger (use `pino`):
```typescript
log.info({ sessionId, url, event: 'session_started' }, 'Session started')
log.error({ anomalyId, err: error }, 'Failed to capture screenshot') // pino's default error serializer applies to the `err` key
```
All logs go to stdout (Docker captures them).
Log level configurable via `ABE_LOG_LEVEL` env var (default: 'info').

View File

@@ -0,0 +1,138 @@
# ABE — Project Structure Specification
## Full tree of files to create
```
abe/
├── src/
│ ├── core/
│ │ ├── interfaces.ts ← ALL interfaces (IState, IAction, etc.)
│ │ ├── StateGraph.ts ← state graph implementation
│ │ ├── ExplorationEngine.ts ← main exploration loop
│ │ └── AnomalyDetector.ts ← heuristic detection rules
│ ├── plugins/
│ │ ├── agents/
│ │ │ └── PlaywrightAgent.ts ← implements IInteractionAgent
│ │ ├── collectors/
│ │ │ ├── ScreenshotCollector.ts
│ │ │ ├── NetworkCollector.ts
│ │ │ └── DOMSnapshotCollector.ts
│ │ ├── exporters/
│ │ │ ├── MarkdownExporter.ts
│ │ │ └── JSONExporter.ts
│ │ └── reproducers/
│ │ └── PlaywrightReproducer.ts
│ └── index.ts ← entry point, wires everything together
├── tests/
│ ├── core/
│ │ ├── StateGraph.test.ts
│ │ ├── ExplorationEngine.test.ts
│ │ └── AnomalyDetector.test.ts
│ └── plugins/
│ ├── agents/
│ │ └── PlaywrightAgent.test.ts
│ └── exporters/
│ ├── MarkdownExporter.test.ts
│ └── JSONExporter.test.ts
├── reports/ ← generated at runtime, git-ignored
├── logs/ ← generated at runtime, git-ignored
├── package.json
├── tsconfig.json
├── jest.config.ts
└── CLAUDE.md
```
---
## Import rules (VERY IMPORTANT)
```
✅ ALLOWED:
src/core/ExplorationEngine.ts → imports from src/core/interfaces.ts
src/plugins/agents/PlaywrightAgent.ts → imports from src/core/interfaces.ts
src/index.ts → imports from src/core/ AND src/plugins/
❌ FORBIDDEN:
src/core/ExplorationEngine.ts → imports from src/plugins/ (breaks decoupling)
src/plugins/agents/A.ts → imports from src/plugins/exporters/B.ts (plugins do not know about each other)
```
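These rules are simple enough to check mechanically, for example from a lint step. The function below is a sketch, not part of the spec: `isImportAllowed` is a hypothetical name, it takes the strictest reading that any plugin-to-plugin import is forbidden, and it assumes nothing may import the entry file.

```typescript
type Layer = 'core' | 'plugin' | 'entry';

function layerOf(path: string): Layer {
  if (path.startsWith('src/core/')) return 'core';
  if (path.startsWith('src/plugins/')) return 'plugin';
  return 'entry'; // e.g. src/index.ts
}

function isImportAllowed(fromFile: string, toFile: string): boolean {
  const from = layerOf(fromFile);
  const to = layerOf(toFile);
  if (from === 'entry') return true;   // the entry point wires core and plugins together
  if (to === 'entry') return false;    // assumption: nothing imports the entry file
  if (from === 'core') return to === 'core';
  // Plugins may depend on core only; strictest reading of "plugins do not know about each other".
  return to === 'core';
}
```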
---
## How everything is wired together in src/index.ts
The entry file must follow this pattern:
```typescript
// src/index.ts
import { ExplorationEngine } from './core/ExplorationEngine';
import { StateGraph } from './core/StateGraph';
import { PlaywrightAgent } from './plugins/agents/PlaywrightAgent';
import { ScreenshotCollector } from './plugins/collectors/ScreenshotCollector';
import { NetworkCollector } from './plugins/collectors/NetworkCollector';
import { DOMSnapshotCollector } from './plugins/collectors/DOMSnapshotCollector';
import { JSONExporter } from './plugins/exporters/JSONExporter';
import { MarkdownExporter } from './plugins/exporters/MarkdownExporter';
import { PlaywrightReproducer } from './plugins/reproducers/PlaywrightReproducer';
const graph = new StateGraph();
const agent = new PlaywrightAgent();
const collectors = [new ScreenshotCollector(), new NetworkCollector(), new DOMSnapshotCollector()];
const exporters = [new JSONExporter(), new MarkdownExporter()];
const reproducer = new PlaywrightReproducer();
const engine = new ExplorationEngine({ graph, agent, collectors, exporters, reproducer });
engine.run({ url: process.argv[2] || 'http://localhost:3000', seed: 42 });
```
---
## package.json (required scripts)
```json
{
"name": "abe",
"version": "0.1.0",
"scripts": {
"build": "tsc",
"test": "jest",
"typecheck": "tsc --noEmit",
"lint": "eslint src/ tests/",
"explore": "ts-node src/index.ts",
"replay": "ts-node src/replay.ts"
}
}
```
---
## tsconfig.json (base configuration)
```json
{
"compilerOptions": {
"target": "ES2020",
"module": "commonjs",
"lib": ["ES2020"],
"outDir": "./dist",
"rootDir": "./src",
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true,
"forceConsistentCasingInFileNames": true
},
"include": ["src/**/*"],
"exclude": ["node_modules", "dist", "tests"]
}
```
---
## jest.config.ts (base configuration)
```typescript
export default {
preset: 'ts-jest',
testEnvironment: 'node',
roots: ['<rootDir>/tests'],
testMatch: ['**/*.test.ts'],
};
```

View File

@@ -0,0 +1,79 @@
# ABE — Scheduled Monitoring Specification
## Concept
ABE can run explorations automatically at defined intervals,
without human intervention. This turns ABE from a manual tool
into a continuous monitoring system, in the style of Checkly.
## Data model (add to SQLite)
### Table: schedules
```sql
CREATE TABLE IF NOT EXISTS schedules (
id TEXT PRIMARY KEY,
name TEXT NOT NULL,
url TEXT NOT NULL,
config_json TEXT NOT NULL,
cron_expression TEXT NOT NULL, -- e.g. "0 */6 * * *" (every 6h)
enabled INTEGER NOT NULL DEFAULT 1,
last_run_at INTEGER,
next_run_at INTEGER,
created_at INTEGER NOT NULL
);
```
## Supported cron expressions (presets in the UI)
| Label | Cron |
|------------------|----------------|
| Every 15 minutes | `*/15 * * * *` |
| Every hour | `0 * * * *` |
| Every 6 hours | `0 */6 * * *` |
| Every day at 2am | `0 2 * * *` |
| Every Monday 9am | `0 9 * * 1` |
## Implementation
Use `node-cron` for the scheduler.
Create `src/server/scheduler/SchedulerService.ts`:
- On startup, it loads every schedule with enabled=1 from the DB
- It registers one cron job per schedule
- When a job fires, it internally calls POST /api/sessions with the saved config
- It updates last_run_at and next_run_at in the DB after each run
- If the previous session is still running, it skips this tick and logs a warning
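The skip rule in the last bullet can be isolated from the cron library. In this sketch `ScheduleGate` is a hypothetical helper that the cron callback would consult; it collects warnings in memory instead of calling the real logger.

```typescript
class ScheduleGate {
  private running = new Set<string>();
  readonly warnings: string[] = [];

  // Returns true when the caller should start a session for this tick.
  onTick(scheduleId: string): boolean {
    if (this.running.has(scheduleId)) {
      this.warnings.push(`schedule ${scheduleId}: previous session still running, skipping tick`);
      return false;
    }
    this.running.add(scheduleId);
    return true;
  }

  // Called when the session started by a tick reaches a terminal state.
  onSessionFinished(scheduleId: string): void {
    this.running.delete(scheduleId);
  }
}
```

Tracking by schedule id means a long run of one schedule never blocks the others.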
## New API endpoints
### GET /api/schedules
Lists all schedules.
### POST /api/schedules
Creates a new schedule.
Body:
```json
{
"name": "Production daily check",
"url": "https://myapp.com",
"config": { ... same ExplorationConfig ... },
"cronExpression": "0 2 * * *",
"enabled": true
}
```
### PATCH /api/schedules/:id
Updates a schedule, or enables/disables it.
### DELETE /api/schedules/:id
Deletes a schedule.
## Frontend: new section in Settings
Add a "Schedules" tab to /settings:
- List of schedules showing: name, URL, cron, last run, next run, enabled/disabled toggle
- "New Schedule" button that opens a modal with: name, URL, exploration config, frequency selector (presets + custom cron)
- "Running" badge when a session from the schedule is currently active
## Schedule-specific notifications
When a schedule triggers an exploration that finds high/critical anomalies,
send a notification with the subject: "[SCHEDULED] ABE found bugs in {url}"

View File

@@ -0,0 +1,124 @@
# ABE — Visual Regression Testing Specification
## Concept
ABE takes screenshots during exploration. Instead of only storing them,
it compares them against an approved baseline to detect unexpected
visual changes between runs. Inspired by Percy and Chromatic,
but integrated directly into the autonomous exploration flow.
## How it works
### First run (no baseline)
1. ABE explores the app and takes a screenshot of each discovered state
2. All screenshots are marked "pending review" in the UI
3. The user approves or rejects each one from the GUI
4. Approved screenshots become the BASELINE
### Subsequent runs
1. ABE explores the app and takes a screenshot of each state
2. For each screenshot, it looks up the matching baseline by state_id (hash of DOM+URL)
3. If there is no baseline: mark as "new state" and notify
4. If there is a baseline: compare using pixelmatch (npm library)
5. If diff > threshold (default 0.1%): create an anomaly of type visual_regression
6. If diff <= threshold: mark as "passed"
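Steps 3 to 6 reduce to a small decision once the number of differing pixels is known (pixelmatch returns that count). The helper names below are hypothetical, and the threshold is treated as a fraction of the image, so 0.001 means 0.1%.

```typescript
type ComparisonStatus = 'passed' | 'failed' | 'new_state';

// Converts a mismatched-pixel count into a fraction of the whole image.
function diffFraction(diffPixels: number, width: number, height: number): number {
  const total = width * height;
  return total === 0 ? 0 : diffPixels / total;
}

function decideStatus(hasBaseline: boolean, fraction: number, threshold: number = 0.001): ComparisonStatus {
  if (!hasBaseline) return 'new_state';
  // Strictly greater than the threshold fails; equal to it still passes.
  return fraction > threshold ? 'failed' : 'passed';
}
```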
## Comparison library
Use `pixelmatch` (npm) for pixel-by-pixel comparison.
Use `sharp` to resize and normalize images before comparing.
```typescript
import pixelmatch from 'pixelmatch';
import sharp from 'sharp';
async function compareScreenshots(
baselinePath: string,
currentPath: string,
diffOutputPath: string,
threshold: number = 0.001 // fraction of changed pixels (0.1%), matching visualRegression.threshold
): Promise<{ diffPixels: number; diffPercent: number; hasDiff: boolean }> {
// resize both to same dimensions, compare, generate diff image
}
```
## Data model (add to SQLite)
### Table: visual_baselines
```sql
CREATE TABLE IF NOT EXISTS visual_baselines (
id TEXT PRIMARY KEY,
state_id TEXT NOT NULL,
url TEXT NOT NULL,
screenshot_path TEXT NOT NULL,
approved_at INTEGER NOT NULL,
approved_by TEXT DEFAULT 'user',
width INTEGER NOT NULL,
height INTEGER NOT NULL
);
```
### Table: visual_comparisons
```sql
CREATE TABLE IF NOT EXISTS visual_comparisons (
id TEXT PRIMARY KEY,
session_id TEXT NOT NULL,
state_id TEXT NOT NULL,
baseline_id TEXT,
current_screenshot_path TEXT NOT NULL,
diff_screenshot_path TEXT,
diff_pixels INTEGER,
diff_percent REAL,
status TEXT NOT NULL, -- 'passed' | 'failed' | 'new_state' | 'pending'
created_at INTEGER NOT NULL
);
```
## New anomaly type
Add to AnomalyDetector:
- type: `visual_regression`
- severity: derived from diff_percent:
- < 1% → low
- 1-5% → medium
- 5-15% → high
- > 15% → critical
- description: "Visual regression detected: X% of pixels changed"
- evidence: baseline screenshot + current screenshot + diff image (highlighted in red)
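The severity bands can be written as a single function. The listed ranges share their endpoints (5% appears in both "1-5%" and "5-15%"), so this sketch assumes each band includes its lower bound; `severityForDiff` is a hypothetical name.

```typescript
type Severity = 'low' | 'medium' | 'high' | 'critical';

// diffPercent is expressed in percent, e.g. 7.5 means 7.5% of pixels changed.
function severityForDiff(diffPercent: number): Severity {
  if (diffPercent > 15) return 'critical';
  if (diffPercent >= 5) return 'high';
  if (diffPercent >= 1) return 'medium';
  return 'low';
}

function describeRegression(diffPercent: number): string {
  return `Visual regression detected: ${diffPercent}% of pixels changed`;
}
```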
## New API endpoints
### GET /api/visual/comparisons
Lists all comparisons for the most recent session.
Query: ?status=failed&sessionId=xxx
### POST /api/visual/baselines/:comparisonId/approve
Approves a screenshot as the new baseline.
### POST /api/visual/baselines/:comparisonId/reject
Rejects it (confirmed anomaly; the baseline is not updated).
### POST /api/visual/baselines/approve-all
Approves all pending "new_state" comparisons of a session.
## Frontend: new Visual Review section
New page /visual-review:
- Grid of cards, each showing the state's URL and a thumbnail of the current screenshot
- Filters: passed | failed | new_state | pending
- Clicking a card opens a modal with:
- Side-by-side view: baseline on the left, current on the right
- Diff view: image with changed pixels in red
- Percentage of change
- Buttons: Approve as new baseline | Mark as bug | Ignore
- Bulk actions: "Approve all new states", "Mark all failed as bugs"
## Configuration
Add to ExplorationConfig:
```typescript
visualRegression: {
enabled: boolean; // default: true
threshold: number; // default: 0.001 (0.1%)
screenshotFullPage: boolean; // default: false (viewport only)
ignoreSelectors: string[]; // e.g. [".timestamp", ".ad-banner"], to exclude dynamic regions
}
```