Architecture
Tripwire is two standalone apps over a shared engine: a FastAPI backend (the JSON API plus the execution engine) and a Vite + React dashboard. An MCP server and a CLI reuse the same engine by importing it directly.
Cursor / Claude Code ─► MCP server ─┐
CI / terminal ───────► CLI ─────────┤ (both import the engine directly)
│
Browser ──HTTP──► Dashboard (Vite/React SPA)
│ fetch /api/v1
▼
Backend (FastAPI)
api/v1/routes ─► services ─► store (SQL repos)
│ │
│ └─ engine ── Claude (brain) ─► Playwright (hands) ─► REAL browser
▼ │
suites · runs · issues · ▼ network capture → trace_id
settings · generate · failure → server-log correlation → Claude → root cause
analytics · export · health
Claude is the brain, Playwright is the hands
The execution core (engine/pw_agent.py + engine/pw.py) runs plain-English steps in a real browser on any OS — Windows, macOS, or Linux — headless or headed. No AppleScript, no Quartz, no OS permission prompts:
- Claude is the brain. For each step it reads the live DOM and turns the plain-English intent into concrete browser actions.
- Playwright is the universal hands. It launches and drives Chromium (headless or headed) to execute those actions, and captures network traffic so that a UI failure still yields a trace_id.
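
To make the network-capture idea concrete, here is a minimal sketch of a response buffer that can hand back a trace_id after a failure. It is not the code in engine/pw.py: the class name NetworkBuffer, the x-trace-id header, and the "most recent 5xx response" rule are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Dict, Optional


class NetworkBuffer:
    """Keep the most recent responses so a failed step can be tied to a request."""

    def __init__(self, size: int = 50) -> None:
        # Bounded buffer: the oldest entries fall off once `size` is reached.
        self._items: Deque[Dict] = deque(maxlen=size)

    def record(self, url: str, status: int, headers: Dict[str, str]) -> None:
        # Header names are case-insensitive, so store them lowercased.
        lowered = {k.lower(): v for k, v in headers.items()}
        self._items.append({"url": url, "status": status, "headers": lowered})

    def failing_trace_id(self, header: str = "x-trace-id") -> Optional[str]:
        """Trace id of the most recent 5xx response, if it carried one.

        Assumption: the backend echoes its trace id in a response header.
        """
        for item in reversed(self._items):
            if item["status"] >= 500 and header in item["headers"]:
                return item["headers"][header]
        return None
```

With Playwright's Python API, record could plausibly be fed from a page.on("response", ...) handler that passes response.url, response.status, and response.headers.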
There is also a legacy macOS desktop path (structure.py + mac_control.py) that drives the local browser via native Quartz events. It's optional and selected with TRIPWIRE_DRIVER=desktop. The default driver is playwright — cross-platform — and is what the CLI, MCP server, and GitHub Action all use.
The step toolbox
On the Playwright path, each step is driven by a small, explicit toolset the model calls:
| Tool | What it does |
|---|---|
| read_dom | Read the page: title, URL, visible text, and numbered interactive elements (#ref). |
| goto | Navigate to a path or URL. |
| click_element | Click an element by its #ref. |
| set_field | Set an input / textarea / select by #ref (option text for selects); optionally submit. |
| press_key | Press a key or chord (Enter, Control+a). |
| create_file | Create a file with given content, to produce a file to upload. |
| upload_file | Upload a file at a path into a file input by #ref. |
| done | The step is complete. |
Because the model both creates and uploads files, an import/upload flow needs no pre-made fixture file: "Create a small CSV and upload it as your import" just works.
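
The create-then-upload flow can be sketched as a small dispatch loop. Only the tool names (create_file, upload_file, done) come from the table above; FakeBrowser, run_step, and the argument names (name, content, ref, path) are hypothetical, and the scripted list stands in for the calls the model would choose.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Tuple

ToolCall = Tuple[str, dict]


class FakeBrowser:
    """In-memory stand-in for the real driver (hypothetical, illustration only)."""

    def __init__(self, workdir: Path) -> None:
        self.workdir = workdir
        self.uploads: Dict[int, str] = {}  # file-input #ref -> uploaded content

    def create_file(self, name: str, content: str) -> str:
        path = self.workdir / name
        path.write_text(content, encoding="utf-8")
        return str(path)

    def upload_file(self, ref: int, path: str) -> str:
        self.uploads[ref] = Path(path).read_text(encoding="utf-8")
        return f"uploaded {Path(path).name} to #{ref}"


def run_step(browser: FakeBrowser, calls: Iterable[ToolCall]) -> List[str]:
    """Dispatch tool calls in order and stop at the first done call."""
    tools: Dict[str, Callable[..., str]] = {
        "create_file": browser.create_file,
        "upload_file": browser.upload_file,
    }
    results: List[str] = []
    for name, args in calls:
        if name == "done":
            return results
        if name not in tools:
            raise ValueError(f"unknown tool: {name}")
        results.append(tools[name](**args))
    raise RuntimeError("calls ended without a done call")


def demo() -> Dict[int, str]:
    """Create a CSV and upload it to file input #3, with no fixture file on disk."""
    with tempfile.TemporaryDirectory() as tmp:
        browser = FakeBrowser(Path(tmp))
        csv_path = str(Path(tmp) / "import.csv")
        run_step(browser, [
            ("create_file", {"name": "import.csv", "content": "id,name\n1,Ada\n"}),
            ("upload_file", {"ref": 3, "path": csv_path}),
            ("done", {}),
        ])
        return browser.uploads
```

In the real loop the model would read each tool result before choosing the next call; the fixed list here only keeps the sketch deterministic.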
Backend (backend/app)
Layered so each concern is swappable and testable in isolation.
| Layer | Module | Responsibility |
|---|---|---|
| core | core/ | config.py (pydantic-settings + data paths), errors.py, logging.py. |
| schemas | schemas/models.py | Pydantic request models. |
| store | store/db.py, store/models.py, store/repos.py | SQLAlchemy-backed repositories — suites, runs, issues, plans, settings, users/API tokens — behind a clean interface. SQLite by default, Postgres in production. |
| services | services/ | runner (a durable, parallel worker pool that claims queued runs and executes them, streaming progress + persisting), auth (users, sessions, API tokens), and issues (the native File-to-Tripwire tracker + dedup, and dispatch to GitHub / GitLab / Jira). |
| api/v1 | api/v1/routes/ | The HTTP surface — see below. |
| engine | engine/ | The execution core. |
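
The dedup mentioned for the issues service can be pictured as fingerprinting each failure and filing only unseen fingerprints. This is a sketch under stated assumptions, not Tripwire's actual scheme: the inputs to the hash (suite, step, normalized error), the masking rule, and the function names are all hypothetical.

```python
import hashlib
import re


def fingerprint(suite: str, step: str, error: str) -> str:
    """Stable hash of a failure; volatile parts of the message are masked first."""
    # Assumption: numbers and long hex ids (timeouts, ports, trace ids) vary
    # between runs and should not produce distinct issues.
    normalized = re.sub(r"\b[0-9a-f]{8,}\b|\d+", "#", error.lower())
    raw = "\x1f".join((suite, step, normalized))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


def file_issue(issues: dict, suite: str, step: str, error: str) -> bool:
    """Record the failure; return True only when a new issue was created."""
    key = fingerprint(suite, step, error)
    if key in issues:
        issues[key]["count"] += 1  # same failure seen again: bump, do not refile
        return False
    issues[key] = {"suite": suite, "step": step, "error": error, "count": 1}
    return True
```

Masking before hashing is the design choice that matters here: without it, every run with a different timeout value or trace id would open a fresh issue.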
The engine modules
| Module | Role |
|---|---|
| pw.py | The Playwright Driver: launch, navigate, page state, network capture, root cause. |
| pw_agent.py | The cross-platform runner: per-step LLM loop + suite executor (the default path). |
| e2e.py | Suite loading, ${var} substitution, fixtures, deterministic checks, model adjudication. |
| serverlogs.py | Server-log correlation + LLM synthesis: trace_id → backend error + cause + fix. |
| generate.py | URL → suite generation (crawl, read structure, propose a suite). |
| export_playwright.py | Compile a suite to a portable Playwright .spec.ts. |
| record.py | Record-from-clicks: a recorded action trace → a plain-English suite. |
| analytics.py | Flake / run analytics over persisted runs. |
| reports.py | report.json · junit.xml · report.html + provider-agnostic issues (fingerprinted). |
| trackers.py | Live issue posting to GitHub / GitLab / Jira, deduped by fingerprint. |
| structure.py, mac_control.py, agent.py, cdp.py, logs.py | The legacy macOS desktop path + shared DOM-read JS and network/log capture. |
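
As an illustration of the ${var} substitution listed for e2e.py, the sketch below fills placeholders in a plain-English step. The function name substitute and the choice to raise on an undefined variable are assumptions; the real module may behave differently.

```python
import re

# Matches ${name} only, so a literal "$5" in a step is left untouched.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute(text: str, variables: dict) -> str:
    """Replace each ${name} with its value; undefined names raise KeyError."""

    def repl(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in variables:
            # Assumption: failing loudly beats running a step with a hole in it.
            raise KeyError(f"undefined variable: {name}")
        return str(variables[name])

    return _VAR.sub(repl, text)
```

For example, substitute("Log in as ${email}", {"email": "ada@example.com"}) yields "Log in as ada@example.com".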
The HTTP surface (/api/v1)
health · suites (CRUD + {name}/export) · runs (queue, inspect, queue state) · issues (the native tracker: list, transition, comment) · settings (secret-masked) · generate (URL → suite) · analytics (flake/run trends). Full details in the API reference.
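
A client call against this surface might look like the sketch below, which builds (but does not send) a request that queues a run. Only the /api/v1 prefix and the runs resource come from this page. The JSON body shape, the bearer-token header, and the helper name build_run_request are assumptions, so check the API reference for the real contract.

```python
import json
import urllib.request


def build_run_request(base_url: str, suites: list, token: str = "") -> urllib.request.Request:
    """Build a POST to /api/v1/runs without sending it."""
    # Assumption: the body is a JSON object with a "suites" list.
    body = json.dumps({"suites": suites}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: API tokens travel as a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v1/runs",
        data=body,
        headers=headers,
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would send it. The host and port in any base_url you supply depend on how the backend is deployed.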
Storage
Structured data lives in a SQL database via SQLAlchemy — SQLite by default (a single file under data/), Postgres in production (TRIPWIRE_DATABASE_URL). The tables: suites, runs, issues, plans, settings (secret values encrypted at rest), and users / api_tokens. Tables are created on startup; on first boot any legacy file-based data is imported once.
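
The SQLite-by-default, Postgres-by-configuration choice could be resolved along these lines. TRIPWIRE_DATABASE_URL, the data/ directory, and tripwire.db come from this page; the function name database_url and the exact fallback URL format are assumptions.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def database_url(env: Optional[Mapping[str, str]] = None, data_dir: str = "data") -> str:
    """Return the configured database URL, or a SQLite file under data_dir."""
    env = os.environ if env is None else env
    url = env.get("TRIPWIRE_DATABASE_URL", "").strip()
    if url:
        return url  # e.g. a Postgres URL in production
    # Three slashes followed by a relative path is SQLAlchemy's relative-file form.
    return f"sqlite:///{(Path(data_dir) / 'tripwire.db').as_posix()}"
```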
Run artifacts stay on the filesystem under data/ (served at /api/v1/artifacts):
data/
  tripwire.db   the default SQLite database (suites, runs, issues, plans, settings, users)
  artifacts/    report.json / junit.xml / report.html / per-step screenshots / issues
  baselines/    visual-diff baselines (screenshot_region checks)

The repository interface in store/repos.py keeps the same method signatures whether the backing store is SQLite or Postgres, so the services and API above it are unchanged.
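
One way to picture a backend-independent repository interface is a typing.Protocol that service code depends on, with interchangeable implementations behind it. The method names below (add, get, queued) and the in-memory class are hypothetical and are not taken from store/repos.py.

```python
from typing import Dict, List, Optional, Protocol


class RunRepo(Protocol):
    """The methods callers rely on (hypothetical names)."""

    def add(self, run_id: str, suite: str) -> None: ...
    def get(self, run_id: str) -> Optional[dict]: ...
    def queued(self) -> List[dict]: ...


class InMemoryRunRepo:
    """Test double; a SQL-backed class would expose the same three methods."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def add(self, run_id: str, suite: str) -> None:
        self._rows[run_id] = {"id": run_id, "suite": suite, "status": "queued"}

    def get(self, run_id: str) -> Optional[dict]:
        return self._rows.get(run_id)

    def queued(self) -> List[dict]:
        return [r for r in self._rows.values() if r["status"] == "queued"]


def queue_suites(repo: RunRepo, suites: List[str]) -> List[str]:
    """Service-level code depends only on the protocol, not on the backend."""
    ids: List[str] = []
    for n, suite in enumerate(suites, start=len(repo.queued()) + 1):
        run_id = f"run-{n}"
        repo.add(run_id, suite)
        ids.append(run_id)
    return ids
```

Because queue_suites only sees RunRepo, swapping the in-memory class for a database-backed one would not change the calling code.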
Frontend (frontend/src)
A Vite + React + Tailwind SPA with TanStack Query as the data layer. It talks only to /api/v1, so the dashboard and any other client (scripts, CI, the MCP server) share exactly the same contract.
How a run flows
- The dashboard (or an API client) POSTs to /runs with a suite (or several). Each run is persisted as a queued row (a durable queue that survives a restart).
- The runner service runs a worker pool (TRIPWIRE_MAX_CONCURRENT_RUNS, default 2) that claims queued rows transactionally and executes them in parallel, streaming progress. A crash mid-run is reconciled on the next startup.
- The engine opens one Playwright browser for the whole suite; Claude drives each step; assertions are adjudicated (deterministically where a check exists).
- On a failure, the network buffer yields a trace_id; if a log backend is configured, serverlogs.py correlates it and Claude synthesizes the backend root cause.
- reports.py writes the artifacts and builds a deduped issue; services/issues or trackers.py files it. Results persist to store, and the dashboard renders live.
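
The durable queue in the first two steps can be sketched with the standard-library sqlite3 module. The runs table layout, the status values, and the requeue-on-startup policy are assumptions for illustration. The real runner goes through SQLAlchemy, and this page does not say how it reconciles crashed runs.

```python
import sqlite3
from typing import Optional


def init(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "id INTEGER PRIMARY KEY, suite TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'queued')"
    )


def enqueue(conn: sqlite3.Connection, suite: str) -> int:
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("INSERT INTO runs (suite) VALUES (?)", (suite,))
    return cur.lastrowid


def claim_next(conn: sqlite3.Connection) -> Optional[int]:
    """Move the oldest queued run to 'running' and return its id, if any."""
    with conn:
        row = conn.execute(
            "SELECT id FROM runs WHERE status = 'queued' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status guard turns the claim into a no-op if another worker won.
        cur = conn.execute(
            "UPDATE runs SET status = 'running' WHERE id = ? AND status = 'queued'",
            (row[0],),
        )
        return row[0] if cur.rowcount == 1 else None


def reconcile(conn: sqlite3.Connection) -> int:
    """On startup, requeue runs left 'running' by a crash (assumed policy)."""
    with conn:
        cur = conn.execute("UPDATE runs SET status = 'queued' WHERE status = 'running'")
    return cur.rowcount
```

The guarded UPDATE is what lets several workers poll the same table without two of them picking up the same run.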
Continue to Runs & assertions.