ABE — Production Hardening Specification

Health Endpoints (no auth required)

GET /health

Returns 200 if the server process is up and able to respond.

{ "status": "ok", "version": "0.1.0", "uptime_seconds": 3600 }

GET /ready

Returns 200 if server is ready to accept requests (DB connected, no critical errors). Returns 503 if not ready.

{ "status": "ready", "db": "connected", "active_sessions": 2 }

These endpoints are used by the Docker HEALTHCHECK (/health) and by Kubernetes readiness probes (/ready).
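A minimal sketch of the two probe responses as pure functions, assuming the route handlers only forward the status code and body. The names ProbeResult, ReadyDeps, buildHealth and buildReady are hypothetical, and the "not_ready" and "disconnected" values are assumptions, since the spec only shows the success payloads.

```typescript
// Result of a probe: HTTP status code plus JSON body.
interface ProbeResult {
  status: number;
  body: Record<string, unknown>;
}

// Hypothetical inputs for the readiness decision.
interface ReadyDeps {
  dbConnected: boolean;
  hasCriticalError: boolean;
  activeSessions: number;
}

// GET /health: always 200 while the process can answer.
function buildHealth(version: string, startedAtMs: number, nowMs: number): ProbeResult {
  return {
    status: 200,
    body: {
      status: "ok",
      version,
      uptime_seconds: Math.floor((nowMs - startedAtMs) / 1000),
    },
  };
}

// GET /ready: 200 only when the DB is connected and no critical error is set.
function buildReady(deps: ReadyDeps): ProbeResult {
  const ready = deps.dbConnected && !deps.hasCriticalError;
  return {
    status: ready ? 200 : 503,
    body: {
      status: ready ? "ready" : "not_ready", // "not_ready" is an assumed value
      db: deps.dbConnected ? "connected" : "disconnected", // "disconnected" is assumed
      active_sessions: deps.activeSessions,
    },
  };
}

// Possible Express wiring (not part of this sketch):
//   app.get("/health", (_req, res) => {
//     const r = buildHealth(version, startedAt, Date.now());
//     res.status(r.status).json(r.body);
//   });
```

Keeping the payload logic separate from the route handlers lets both endpoints be unit-tested without starting the server.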

Docker improvements

Backend Dockerfile

Add HEALTHCHECK (curl must be installed in the image for this command to work):

HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD curl -f http://localhost:3001/health || exit 1

docker-compose.yml updates

  • Add healthcheck to backend service
  • Add restart: unless-stopped to both services
  • Add data/ volume for SQLite persistence
  • Load .env file: env_file: .env
  • Add depends_on: backend: condition: service_healthy to frontend

.env.example file

Create .env.example in the repo root listing every environment variable with an example value. Add .env to .gitignore.

Error handling improvements

Global Express error handler in src/server/index.ts:

  • Catch all unhandled errors
  • Log with timestamp and stack trace
  • Return consistent JSON error format:
{ "error": "Internal server error", "code": "INTERNAL_ERROR", "timestamp": 1705312200000 }

Never expose stack traces in API responses when NODE_ENV=production; they belong in the server log only.
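The response format above could be produced by a small helper such as the following sketch. ErrorBody and toErrorBody are hypothetical names. Including the stack outside production is an assumption; the spec only forbids it in production.

```typescript
// Shape of the JSON error response. `stack` is only set outside production.
interface ErrorBody {
  error: string;
  code: string;
  timestamp: number;
  stack?: string;
}

// Build the response body for an unhandled error.
function toErrorBody(err: unknown, nodeEnv: string | undefined, nowMs: number): ErrorBody {
  const body: ErrorBody = {
    error: "Internal server error",
    code: "INTERNAL_ERROR",
    timestamp: nowMs,
  };
  // Assumption: expose the stack in non-production environments to aid debugging.
  if (nodeEnv !== "production" && err instanceof Error && err.stack) {
    body.stack = err.stack;
  }
  return body;
}

// Possible Express wiring (not part of this sketch). Express error middleware
// takes four arguments and is registered after all routes. Status 500 is an
// assumption; the spec does not name the status code.
//   app.use((err, _req, res, _next) => {
//     log.error({ err }, "Unhandled error");
//     res.status(500).json(toErrorBody(err, process.env.NODE_ENV, Date.now()));
//   });
```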

Graceful shutdown

On SIGTERM/SIGINT:

  1. Stop accepting new sessions
  2. Wait for active sessions to finish (max 30s)
  3. Close DB connection
  4. Exit with code 0
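The four steps above can be sketched as one async routine with its dependencies injected. ShutdownDeps and gracefulShutdown are hypothetical names, and polling the session count is an assumed mechanism. The spec does not say what happens when the 30s limit expires; this sketch assumes shutdown proceeds anyway.

```typescript
// Hypothetical hooks into the server; real implementations live elsewhere.
interface ShutdownDeps {
  stopAcceptingSessions: () => void;
  activeSessionCount: () => number;
  closeDb: () => Promise<void> | void;
  exit: (code: number) => void;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function gracefulShutdown(
  deps: ShutdownDeps,
  maxWaitMs = 30000,
  pollMs = 100,
): Promise<void> {
  // 1. Stop accepting new sessions.
  deps.stopAcceptingSessions();

  // 2. Wait for active sessions to finish, up to maxWaitMs.
  const deadline = Date.now() + maxWaitMs;
  while (deps.activeSessionCount() > 0 && Date.now() < deadline) {
    await sleep(pollMs);
  }

  // 3. Close the DB connection (assumption: also done if the wait timed out).
  await deps.closeDb();

  // 4. Exit with code 0.
  deps.exit(0);
}

// Possible wiring in the server entry point (not part of this sketch):
//   process.once("SIGTERM", () => void gracefulShutdown(deps));
//   process.once("SIGINT", () => void gracefulShutdown(deps));
```

Injecting exit and closeDb keeps the routine testable without terminating the test process.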

Concurrency limits

  • Max concurrent exploration sessions: configurable via ABE_MAX_CONCURRENT_SESSIONS (default: 3)
  • If limit reached, POST /api/sessions returns 429 with:
{ "error": "Max concurrent sessions reached", "active": 3, "limit": 3 }
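One way to enforce the limit is a small in-memory counter, sketched below. SessionLimiter, parseMaxSessions and RejectionBody are hypothetical names. Falling back to the default on an invalid ABE_MAX_CONCURRENT_SESSIONS value is an assumption.

```typescript
const DEFAULT_MAX_SESSIONS = 3;

// Parse ABE_MAX_CONCURRENT_SESSIONS; fall back to the default if unset or invalid.
function parseMaxSessions(raw: string | undefined): number {
  const n = Number.parseInt(raw ?? "", 10);
  return Number.isInteger(n) && n > 0 ? n : DEFAULT_MAX_SESSIONS;
}

// Body returned with HTTP 429 when the limit is reached.
interface RejectionBody {
  error: string;
  active: number;
  limit: number;
}

class SessionLimiter {
  private active = 0;
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Returns null when a slot was acquired, or the 429 body when the limit is reached.
  tryAcquire(): RejectionBody | null {
    if (this.active >= this.limit) {
      return {
        error: "Max concurrent sessions reached",
        active: this.active,
        limit: this.limit,
      };
    }
    this.active += 1;
    return null;
  }

  // Call when a session ends, whether it succeeded or failed.
  release(): void {
    if (this.active > 0) this.active -= 1;
  }
}

// Possible use in the POST /api/sessions handler (not part of this sketch):
//   const rejection = limiter.tryAcquire();
//   if (rejection) return res.status(429).json(rejection);
```

The slot has to be released on every exit path of a session, for example in a finally block, or the counter drifts upward and the server eventually rejects all requests.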

Logging improvements

Replace console.log with a structured logger (pino):

log.info({ sessionId, url, event: 'session_started' }, 'Session started')
log.error({ anomalyId, err: error }, 'Failed to capture screenshot') // pino's default error serializer applies to the "err" key

All logs go to stdout (Docker captures them). Log level configurable via ABE_LOG_LEVEL env var (default: 'info').
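A sketch of how the ABE_LOG_LEVEL value might be validated before it is handed to pino. resolveLogLevel is a hypothetical helper. The level names are pino's built-in levels plus 'silent'; falling back to 'info' on an unrecognised value is an assumption.

```typescript
// pino's built-in level names, plus "silent" to disable logging.
const LOG_LEVELS = ["trace", "debug", "info", "warn", "error", "fatal", "silent"] as const;
type LogLevel = (typeof LOG_LEVELS)[number];

// Resolve the configured level; unset or unknown values fall back to "info".
function resolveLogLevel(raw: string | undefined): LogLevel {
  const candidate = (raw ?? "").trim().toLowerCase();
  const match = LOG_LEVELS.find((level) => level === candidate);
  return match ?? "info";
}

// Possible logger setup (not part of this sketch). pino writes to stdout by default.
//   import pino from "pino";
//   const log = pino({ level: resolveLogLevel(process.env.ABE_LOG_LEVEL) });
```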