possibilities/claudetests-inventory.md

## claudetests-inventory.md

      
    claudetests — Repo Inventory

E2E test harness for Claude Code. Drives real Claude sessions in tmux panes and asserts on their behavior.
File Tree

claudetests/
├── e2e.toml                                    # Config: claude binary path + config dir
├── .gitignore                                  # Ignores .test-config/
├── hook-inventory.md                           # Reference doc: yolo mode & permission hooks
│
├── lib/
│   └── e2e_harness.py                          # Core harness library (476 lines)
│
├── hooks/
│   └── session-state.py                        # Notification hook → JSONL state files
│
├── scripts/
│   ├── setup-auth.py                           # Manual auth: tmux pane for human OAuth
│   └── claude-auth-login-agent-browser.py      # Automated auth: pexpect + agent-browser + local callback replay
│
├── tests/
│   └── e2e/
│       └── test_noop.py                        # Smoke test: launch → idle → teardown
│
└── .test-config/                               # Isolated Claude config (gitignored)

Components

lib/e2e_harness.py — The Engine

Public API:


| Function | What it does |
| --- | --- |
| `create_pane(name)` | Splits tmux, creates temp workdir with git init + settings.local.json hook config, seeds workspace trust |
| `launch_claude(pane_id)` | Sends `claude --dangerously-skip-permissions` with `CLAUDE_CONFIG_DIR`, waits for first idle |
| `send_prompt(pane_id, text)` | Types prompt into pane, waits for idle, returns captured output |
| `wait_for_idle(pane_id)` | Primary: polls JSONL session-state files for `idle_prompt`. Fallback: tmux regex (looks for `% ctx` + `❯` prompt, no spinner). 2-poll debounce at 0.5s |
| `check_auth()` | Checks `claude auth status`; if not logged in, runs tmux-based OAuth flow with `agent-browser` |
| `capture_pane(pane_id)` | Reads tmux pane content |
| `kill_pane(pane_id)` | Cleanup |
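
The "2-poll debounce at 0.5s" in wait_for_idle can be sketched as below. This is a minimal illustration, not the harness's actual code: `wait_until_stable`, its parameters, and the `probe` callable are hypothetical names, and the real function may combine the JSONL and tmux signals differently.

```python
import time
from typing import Callable


def wait_until_stable(
    probe: Callable[[], bool],
    *,
    required: int = 2,        # consecutive idle readings needed (the "2-poll" part)
    interval: float = 0.5,    # seconds between polls
    timeout: float = 60.0,    # assumed overall budget; not stated in the inventory
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once `probe` reports idle on `required` consecutive polls."""
    deadline = clock() + timeout
    streak = 0
    while clock() < deadline:
        # Any non-idle reading resets the streak, so a brief pause between
        # tool calls is not mistaken for a finished turn.
        streak = streak + 1 if probe() else 0
        if streak >= required:
            return True
        sleep(interval)
    return False
```

Here `probe` stands in for either detection path (checking the session-state file for `idle_prompt`, or matching the pane contents). Injecting `sleep` and `clock` keeps the loop testable without real waiting.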


Config bootstrapping force-merges hasCompletedOnboarding, theme: dark, autoUpdates: false, isTrusted into .test-config/.claude.json on every launch so Claude never shows onboarding dialogs.
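
A force-merge of this kind might look like the following sketch. The function name `force_merge`, the flat top-level key layout, and the `True` values are assumptions for illustration; the inventory does not say where each key lives inside .claude.json.

```python
import json
from pathlib import Path

# Assumed shape: the four settings named above as top-level keys.
FORCED_SETTINGS = {
    "hasCompletedOnboarding": True,
    "theme": "dark",
    "autoUpdates": False,
    "isTrusted": True,
}


def force_merge(config_path: Path, forced: dict) -> dict:
    """Overwrite `forced` keys in the JSON config, keeping every other key."""
    try:
        current = json.loads(config_path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        current = {}
    if not isinstance(current, dict):
        current = {}  # a non-object config cannot be merged into; start over
    current.update(forced)  # forced values win over whatever was on disk
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(current, indent=2))
    return current
```

Running the merge on every launch, rather than only when the file is missing, means a setting changed during one test run cannot leak into the next.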
hooks/session-state.py — Idle Detection Bridge

Registered per-test as a project-local Notification hook. Appends JSONL records ({ts, type, session_id, message, tmux_pane}) to ~/.local/state/claude/session-state/{session_id}. This is how the harness detects idle without scraping the terminal.
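
The append step could be sketched roughly as follows. The record fields come from the description above; the helper names and the payload key `notification_type` are assumptions, since the inventory does not show how the hook derives `type` from its input.

```python
import json
from pathlib import Path


def build_record(payload: dict, env: dict, now: float) -> dict:
    """Map a hook payload onto the {ts, type, session_id, message, tmux_pane} record."""
    return {
        "ts": now,
        # Assumption: the notification kind (e.g. idle_prompt) arrives under this key.
        "type": payload.get("notification_type", "unknown"),
        "session_id": payload.get("session_id", "unknown"),
        "message": payload.get("message", ""),
        # tmux exports TMUX_PANE to processes running inside a pane.
        "tmux_pane": env.get("TMUX_PANE"),
    }


def append_record(state_dir: Path, record: dict) -> Path:
    """Append one JSON line to the per-session state file and return its path."""
    state_dir.mkdir(parents=True, exist_ok=True)
    path = state_dir / record["session_id"]
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return path
```

In a real hook, `payload` would be parsed from stdin, `env` would be `os.environ`, and `state_dir` would point at ~/.local/state/claude/session-state. Because the file is append-only, the harness can re-read it on each poll and only inspect the last line.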
scripts/claude-auth-login-agent-browser.py — Automated OAuth

The automated auth script (308 lines, committed) performs the whole OAuth handshake without a human:

1. Runs claude auth login in a PTY with pexpect and captures the OAuth URL.
2. Overrides the BROWSER env var with a temp shell helper to grab the auto-auth URL that Claude opens.
3. Opens that URL in agent-browser and clicks consent.
4. Polls the browser's navigation entries for the localhost/callback redirect and extracts the code/state params.
5. Uses lsof to find Claude's local listener port, then replays the callback via curl.
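
The extract-and-replay part of that flow can be illustrated with the standard library alone. This is a sketch under assumptions: the helper names are hypothetical, and the `/callback` path and `http://localhost` replay target are inferred from the "localhost/callback redirect" wording rather than taken from the script.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def extract_callback_params(redirect_url: str) -> dict:
    """Pull the OAuth `code` and `state` out of a localhost callback URL."""
    parts = urlsplit(redirect_url)
    if parts.hostname not in ("localhost", "127.0.0.1"):
        raise ValueError(f"not a local callback: {redirect_url}")
    if not parts.path.endswith("/callback"):  # assumed callback path
        raise ValueError(f"unexpected path: {parts.path}")
    query = parse_qs(parts.query)
    missing = [key for key in ("code", "state") if key not in query]
    if missing:
        raise ValueError(f"missing parameters: {missing}")
    return {"code": query["code"][0], "state": query["state"][0]}


def build_replay_url(port: int, params: dict) -> str:
    """Rebuild the callback URL against the port where the CLI is listening."""
    return f"http://localhost:{port}/callback?{urlencode(params)}"
```

The port passed to `build_replay_url` would come from the lsof lookup, and the resulting URL is what gets handed to curl.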
scripts/setup-auth.py — Manual Auth Fallback

Opens a tmux pane running claude auth login so a human can complete the flow manually.
tests/e2e/test_noop.py — Smoke Test

check_auth() → create_pane("noop") → launch_claude() → assert "% ctx" in output → kill_pane(). 180s SIGALRM hard timeout.
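
A SIGALRM hard timeout like the one mentioned here can be written as a small context manager. This is a generic sketch, not the test's actual code: `hard_timeout` and `HardTimeout` are hypothetical names, and the approach only works on Unix in the main thread.

```python
import signal
from contextlib import contextmanager


class HardTimeout(Exception):
    """Raised when the guarded block runs longer than allowed."""


@contextmanager
def hard_timeout(seconds: int):
    """Abort the enclosed block with HardTimeout after `seconds` of wall time."""

    def _on_alarm(signum, frame):
        raise HardTimeout(f"exceeded {seconds}s")

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)  # schedule SIGALRM delivery
    try:
        yield
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler
```

A test would wrap its body in `with hard_timeout(180):` and call `kill_pane()` in a `finally` clause, so a hung Claude session still gets cleaned up when the alarm fires.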
hook-inventory.md — Permission System Reference

Catalogs the yolo-mode permission hook stack from the main arthack setup (6 layers: settings foundation → yolo auto-allow → protection hooks → ExitPlanMode auto-approval → side effects → command templates). Context documentation, not test infrastructure.
Git State


4 commits on main (1 unpushed: f6ca958 — the pexpect auth script)
docs/e2e-test-harness.md deleted (design spec superseded by implementation)
Most code is untracked: .gitignore, e2e.toml, hooks/, lib/, scripts/setup-auth.py, tests/
Only committed code beyond docs: scripts/claude-auth-login-agent-browser.py
No pyproject.toml — all scripts use inline uv run --script dependency metadata
One test exists. No test runner config.
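
For reference, inline `uv run --script` metadata is a comment block at the top of the file, in the format defined by PEP 723. The skeleton below is hypothetical: the Python version bound and the `main` function are assumptions, and pexpect is listed only because the auth script is described as using it.

```python
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "pexpect",
# ]
# ///
"""Hypothetical script skeleton with inline dependency metadata."""
import sys


def main(argv: list[str]) -> int:
    # A real script would import and use its declared dependencies here.
    print(f"running with {len(argv)} argument(s)")
    return 0


if __name__ == "__main__":
    main(sys.argv[1:])
```

With this header, uv resolves the declared dependencies into an ephemeral environment when the script runs, which is why the repo can do without a pyproject.toml.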