
Architecture

Tripwire is two standalone apps over a shared engine: a FastAPI backend (the JSON API plus the execution engine) and a Vite + React dashboard. An MCP server and a CLI reuse the same engine by importing it directly.

Cursor / Claude Code ─► MCP server ─┐
CI / terminal ───────► CLI ─────────┤   (both import the engine directly)

Browser ──HTTP──► Dashboard (Vite/React SPA)
                      │  fetch /api/v1

                 Backend (FastAPI)
   api/v1/routes ─► services ─► store (SQLAlchemy repos)
        │               │
        │               └─ engine ── Claude (brain) ─► Playwright (hands) ─► REAL browser
        ▼                                │
   suites · runs · issues ·              ▼  network capture → trace_id
   settings · generate ·        failure → server-log correlation → Claude → root cause
   analytics · export · health

Claude is the brain, Playwright is the hands

The execution core (engine/pw_agent.py + engine/pw.py) runs plain-English steps in a real browser on any OS — Windows, macOS, or Linux — headless or headed. No AppleScript, no Quartz, no OS permission prompts:

  • Claude is the brain. For each step it reads the live DOM and turns the plain-English intent into concrete browser actions.
  • Playwright is the universal hands. It launches and drives a Chromium browser (headless or headed) that executes those actions, and it captures network traffic so that a UI failure still yields a trace_id.
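
How a captured network buffer can yield a trace_id is sketched below. This is an illustration only, not Tripwire's code: the record shape, the function name find_trace_id, and the header names are assumptions, since the real header depends on the backend under test.

```python
from typing import Iterable, Optional

# Hypothetical header names; the real ones depend on the backend under test.
TRACE_HEADERS = ("x-trace-id", "x-request-id")


def find_trace_id(responses: Iterable[dict]) -> Optional[str]:
    """Return the trace id of the most recent failed response, if any."""
    for resp in reversed(list(responses)):
        if resp["status"] < 400:
            continue  # only failed responses are interesting here
        # Header names are case-insensitive, so normalize before the lookup.
        headers = {k.lower(): v for k, v in resp.get("headers", {}).items()}
        for name in TRACE_HEADERS:
            if name in headers:
                return headers[name]
    return None


captured = [
    {"url": "/api/cart", "status": 200, "headers": {"X-Trace-Id": "aaa111"}},
    {"url": "/api/checkout", "status": 500, "headers": {"X-Trace-Id": "bbb222"}},
]
print(find_trace_id(captured))  # bbb222
```

With Playwright's Python API, records like these could plausibly be collected from the page's "response" events; how pw.py actually does it is not shown here.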

There is also a legacy macOS desktop path (structure.py + mac_control.py) that drives the local browser via native Quartz events. It's optional and selected with TRIPWIRE_DRIVER=desktop. The default driver is playwright — cross-platform — and is what the CLI, MCP server, and GitHub Action all use.
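
Driver selection can be pictured as a small lookup on the TRIPWIRE_DRIVER environment variable. The sketch below is hypothetical: only the variable name and the values playwright and desktop come from the text above, while the function name select_driver, the case-insensitive matching, and the error handling are assumptions.

```python
import os

# Driver names taken from the text; everything else here is hypothetical.
KNOWN_DRIVERS = ("playwright", "desktop")
DEFAULT_DRIVER = "playwright"


def select_driver(environ=None):
    """Return the driver named by TRIPWIRE_DRIVER, defaulting to playwright."""
    env = os.environ if environ is None else environ
    # Assumption: the value is matched case-insensitively and blank means default.
    name = env.get("TRIPWIRE_DRIVER", "").strip().lower() or DEFAULT_DRIVER
    if name not in KNOWN_DRIVERS:
        raise ValueError(
            f"unknown TRIPWIRE_DRIVER {name!r}; expected one of {KNOWN_DRIVERS}"
        )
    return name


print(select_driver({}))                              # playwright
print(select_driver({"TRIPWIRE_DRIVER": "desktop"}))  # desktop
```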

The step toolbox

On the Playwright path, each step is driven by a small, explicit toolset the model calls:

Tool           What it does
read_dom       Read the page: title, URL, visible text, and numbered interactive elements (#ref).
goto           Navigate to a path or URL.
click_element  Click an element by its #ref.
set_field      Set an input / textarea / select by #ref (option text for selects); optionally submit.
press_key      Press a key or chord (Enter, Control+a).
create_file    Create a file with the given content, typically one to upload.
upload_file    Upload a file at a path into a file input by #ref.
done           The step is complete.

Because the model both creates and uploads files, an import/upload flow needs no pre-made fixture file: "Create a small CSV and upload it as your import" just works.
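
The per-step loop over this toolset can be sketched roughly as follows. The tool names come from the table above; the argument names (url, ref, value), the FakePage stand-in, and the scripted model are all hypothetical, chosen so the example runs without a browser or an LLM.

```python
from typing import Callable, Dict, List


class FakePage:
    """In-memory stand-in for a browser page (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.url = "about:blank"
        self.fields: Dict[int, str] = {}
        self.clicked: List[int] = []

    def read_dom(self) -> dict:
        return {"url": self.url, "fields": dict(self.fields)}


def run_step(intent: str, page: FakePage, model: Callable[[str, list], dict],
             max_turns: int = 10) -> list:
    """Ask the model for one tool call at a time until it calls `done`."""
    tools = {
        "read_dom": lambda args: page.read_dom(),
        "goto": lambda args: setattr(page, "url", args["url"]),
        "click_element": lambda args: page.clicked.append(args["ref"]),
        "set_field": lambda args: page.fields.__setitem__(args["ref"], args["value"]),
    }
    history: list = []
    for _ in range(max_turns):
        call = model(intent, history)
        if call["tool"] == "done":
            return history
        result = tools[call["tool"]](call.get("args", {}))
        history.append((call["tool"], result))
    raise RuntimeError("step did not finish within max_turns")


def scripted_model(intent: str, history: list) -> dict:
    """Replays a fixed plan; a real run would call the LLM here."""
    plan = [
        {"tool": "goto", "args": {"url": "/login"}},
        {"tool": "read_dom"},
        {"tool": "set_field", "args": {"ref": 1, "value": "ada@example.com"}},
        {"tool": "click_element", "args": {"ref": 2}},
        {"tool": "done"},
    ]
    return plan[len(history)]


page = FakePage()
trace = run_step("Log in as Ada", page, scripted_model)
print([name for name, _ in trace])
# ['goto', 'read_dom', 'set_field', 'click_element']
```

The max_turns cap is a design choice of this sketch: it keeps a confused model from looping forever on a single step.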

Backend (backend/app)

Layered so each concern is swappable and testable in isolation.

Layer     Module                                        Responsibility
core      core/                                         config.py (pydantic-settings + data paths), errors.py, logging.py.
schemas   schemas/models.py                             Pydantic request models.
store     store/db.py, store/models.py, store/repos.py  SQLAlchemy-backed repositories (suites, runs, issues, plans, settings, users/API tokens) behind a clean interface. SQLite by default, Postgres in production.
services  services/                                     runner (a durable, parallel worker pool that claims queued runs and executes them, streaming progress and persisting results), auth (users, sessions, API tokens), and issues (the native File-to-Tripwire tracker with dedup, plus dispatch to GitHub / GitLab / Jira).
api/v1    api/v1/routes/                                The HTTP surface (see below).
engine    engine/                                       The execution core.
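
The issue dedup mentioned for the issues service can be illustrated with a fingerprint over the failure's identifying fields. This is a minimal sketch under stated assumptions: which fields feed the fingerprint, the normalization, and the IssueTracker class are all hypothetical rather than Tripwire's actual scheme.

```python
import hashlib


def fingerprint(suite: str, step: str, error: str) -> str:
    """Stable id for a failure; the choice of fields is an assumption."""
    normalized = "\n".join(part.strip().lower() for part in (suite, step, error))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


class IssueTracker:
    """In-memory stand-in for a tracker that files each fingerprint once."""

    def __init__(self) -> None:
        self.issues: dict = {}

    def file(self, suite: str, step: str, error: str) -> tuple:
        """Return (fingerprint, created); created is False for a duplicate."""
        fp = fingerprint(suite, step, error)
        if fp in self.issues:
            self.issues[fp]["occurrences"] += 1
            return fp, False  # duplicate: bump the counter, do not refile
        self.issues[fp] = {"suite": suite, "step": step, "error": error,
                           "occurrences": 1}
        return fp, True


tracker = IssueTracker()
first = tracker.file("checkout", "Click Pay", "500 from /pay")
again = tracker.file("checkout", " click pay ", "500 from /pay")
print(first[1], again[1], first[0] == again[0])  # True False True
```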

The engine modules

Module                Role
pw.py                 The Playwright Driver: launch, navigate, page state, network capture, root cause.
pw_agent.py           The cross-platform runner: per-step LLM loop + suite executor (the default path).
e2e.py                Suite loading, ${var} substitution, fixtures, deterministic checks, model adjudication.
serverlogs.py         Server-log correlation + LLM synthesis: trace_id → backend error + cause + fix.
generate.py           URL → suite generation (crawl, read structure, propose a suite).
export_playwright.py  Compile a suite to a portable Playwright .spec.ts.
record.py             Record-from-clicks: a recorded action trace → a plain-English suite.
analytics.py          Flake / run analytics over persisted runs.
reports.py            report.json · junit.xml · report.html + provider-agnostic issues (fingerprinted).
trackers.py           Live issue posting to GitHub / GitLab / Jira, deduped by fingerprint.
structure.py, mac_control.py, agent.py, cdp.py, logs.py
                      The legacy macOS desktop path + shared DOM-read JS and network/log capture.
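
The ${var} substitution listed for e2e.py can be approximated with the standard library's string.Template, which uses the same placeholder syntax. This is a sketch, not the module's implementation: the function name substitute_vars and the fail-on-missing behavior are assumptions.

```python
from string import Template


def substitute_vars(steps, variables):
    """Expand ${name} placeholders in each plain-English step.

    Raises KeyError when a step references an undefined variable, so a typo
    fails before any browser is launched.
    """
    return [Template(step).substitute(variables) for step in steps]


steps = [
    "Go to ${base_url}/login",
    "Sign in as ${user}",
]
print(substitute_vars(steps, {"base_url": "https://staging.example.com", "user": "ada"}))
# ['Go to https://staging.example.com/login', 'Sign in as ada']
```

One caveat of this approach: string.Template treats a bare dollar sign as the start of a placeholder, so a literal one has to be written as $$.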

The HTTP surface (/api/v1)

health · suites (CRUD + {name}/export) · runs (queue, inspect, queue state) · issues (the native tracker: list, transition, comment) · settings (secret-masked) · generate (URL → suite) · analytics (flake/run trends). Full details in the API reference.

Storage

Structured data lives in a SQL database via SQLAlchemy — SQLite by default (a single file under data/), Postgres in production (TRIPWIRE_DATABASE_URL). The tables: suites, runs, issues, plans, settings (secret values encrypted at rest), and users / api_tokens. Tables are created on startup; on first boot any legacy file-based data is imported once.
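
The choice between SQLite and Postgres could come down to a single URL lookup, sketched below. The variable TRIPWIRE_DATABASE_URL and the file name tripwire.db come from this page; the function name database_url and the exact fallback logic are assumptions. The sqlite:/// prefix is SQLAlchemy's URL form for a relative SQLite path.

```python
import os
from pathlib import Path


def database_url(environ=None, data_dir="data"):
    """Pick the database URL: TRIPWIRE_DATABASE_URL if set, else SQLite under data/."""
    env = os.environ if environ is None else environ
    url = env.get("TRIPWIRE_DATABASE_URL", "").strip()
    if url:
        return url
    # Fall back to a single-file SQLite database in the data directory.
    return "sqlite:///" + (Path(data_dir) / "tripwire.db").as_posix()


print(database_url({}))
# sqlite:///data/tripwire.db
print(database_url({"TRIPWIRE_DATABASE_URL": "postgresql://tripwire@db.internal/tripwire"}))
# postgresql://tripwire@db.internal/tripwire
```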

Run artifacts stay on the filesystem under data/ (served at /api/v1/artifacts):

data/
  tripwire.db    the default SQLite database (suites, runs, issues, plans, settings, users)
  artifacts/     report.json / junit.xml / report.html / per-step screenshots / issues
  baselines/     visual-diff baselines (screenshot_region checks)

The repository interface in store/repos.py keeps the same method signatures whether the backing store is SQLite or Postgres, so the services and API above it are unchanged.
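
The idea of a stable repository interface can be sketched with a typing.Protocol and an in-memory double. The names SuiteRepo, InMemorySuiteRepo, and their methods are hypothetical; the real signatures in store/repos.py are not shown on this page.

```python
from typing import Dict, List, Optional, Protocol


class SuiteRepo(Protocol):
    """Hypothetical repository interface; method names are illustrative."""

    def get(self, name: str) -> Optional[dict]: ...
    def save(self, name: str, suite: dict) -> None: ...
    def list_names(self) -> List[str]: ...


class InMemorySuiteRepo:
    """A test double that satisfies SuiteRepo without a database."""

    def __init__(self) -> None:
        self._suites: Dict[str, dict] = {}

    def get(self, name: str) -> Optional[dict]:
        return self._suites.get(name)

    def save(self, name: str, suite: dict) -> None:
        self._suites[name] = suite

    def list_names(self) -> List[str]:
        return sorted(self._suites)


def suite_step_count(repo: SuiteRepo, name: str) -> int:
    """Service-level code depends only on the interface, not the backing store."""
    suite = repo.get(name)
    return 0 if suite is None else len(suite.get("steps", []))


repo = InMemorySuiteRepo()
repo.save("login", {"steps": ["Go to /login", "Sign in as ada"]})
print(suite_step_count(repo, "login"))  # 2
```

Because suite_step_count only sees the protocol, the same call would work against a SQLite-backed or Postgres-backed implementation with matching methods.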

Frontend (frontend/src)

A Vite + React + Tailwind SPA with TanStack Query as the data layer. It talks only to /api/v1, so the dashboard and any other client (scripts, CI, the MCP server) share exactly the same contract.

How a run flows

  1. The dashboard (or an API client) POSTs to /runs with a suite (or several). Each run is persisted as a queued row — a durable queue that survives a restart.
  2. The runner service runs a worker pool (TRIPWIRE_MAX_CONCURRENT_RUNS, default 2) that claims queued rows transactionally and executes them in parallel, streaming progress. A crash mid-run is reconciled on the next startup.
  3. The engine opens one Playwright browser for the whole suite; Claude drives each step; assertions are checked deterministically where a check exists and adjudicated by the model otherwise.
  4. On a failure, the network buffer yields a trace_id; if a log backend is configured, serverlogs.py correlates it and Claude synthesizes the backend root cause.
  5. reports.py writes the artifacts and builds a deduped issue; services/issues or trackers.py files it. Results persist to the store, and the dashboard renders them live.
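
The durable queue in steps 1 and 2 can be sketched with the standard library's sqlite3 module. The table layout, status values, and function names below are assumptions, not Tripwire's schema; requeueing on startup is only one possible reconciliation policy (marking the run failed is another).

```python
import sqlite3


def make_queue(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "id INTEGER PRIMARY KEY, suite TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'queued', worker TEXT)"
    )


def enqueue(conn, suite):
    """Persist a run as a queued row and return its id."""
    with conn:
        return conn.execute("INSERT INTO runs (suite) VALUES (?)", (suite,)).lastrowid


def claim_next(conn, worker):
    """Claim the oldest queued run; return (id, suite), or None if nothing was claimed."""
    with conn:  # commit on success, roll back on error
        row = conn.execute(
            "SELECT id, suite FROM runs WHERE status = 'queued' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status guard makes the claim a no-op if another worker got there first.
        cur = conn.execute(
            "UPDATE runs SET status = 'running', worker = ? "
            "WHERE id = ? AND status = 'queued'",
            (worker, row[0]),
        )
        return row if cur.rowcount == 1 else None


def reconcile(conn):
    """On startup, requeue runs that a crashed process left in 'running'."""
    with conn:
        return conn.execute(
            "UPDATE runs SET status = 'queued', worker = NULL WHERE status = 'running'"
        ).rowcount


conn = sqlite3.connect(":memory:")
make_queue(conn)
enqueue(conn, "login")
enqueue(conn, "checkout")
print(claim_next(conn, "worker-1"))  # (1, 'login')
print(claim_next(conn, "worker-2"))  # (2, 'checkout')
print(claim_next(conn, "worker-1"))  # None
```

A worker pool sized by TRIPWIRE_MAX_CONCURRENT_RUNS would call something like claim_next from each worker; the guarded UPDATE is what keeps two workers from running the same row.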

Continue to Runs & assertions.
