@frankdilo
Last active January 18, 2026 08:40
Accountability development style report

Accountability repo development style (detailed) — how to implement it in your repos

Purpose

This report distills the development style used in https://github.com/mikearnaldi/accountability into a reproducible playbook you can apply in your own repositories. It focuses on the workflow, artifacts, automation, and guardrails that make the "Ralph" agent loop practical and consistent.

High-level operating model (what the repo actually does)

  1. Spec-first execution. Work is defined in specs/ documents, each with tasks/phases and status. The agent is expected to read specs, pick a task, implement, and update the spec.
  2. Agent loop orchestration. A root script (ralph-auto.sh) runs the agent, feeds it a focused prompt, runs CI checks, and commits. The agent does not commit; the script does.
  3. Strict CI gating. The loop requires green typecheck, lint, build, and test runs (and optionally E2E) before commit. CI mirrors this with two separate jobs: typecheck plus unit tests, and Playwright E2E.
  4. Full-stack alignment. CLAUDE.md mandates backend and frontend changes stay aligned and forbids frontend-only shortcuts.
  5. Heavy testing and regression discipline. Frequent E2E fixes and explicit test counts in commits indicate an emphasis on stabilizing tests and tracking coverage.
  6. Strong guardrails. Domain-specific lint rules and best-practice docs prevent certain classes of mistakes (e.g., any, direct fetch, localStorage).

Evidence and concrete signals

  • ralph-auto.sh implements the agent loop and auto-commit behavior.
  • RALPH_AUTO_PROMPT.md defines focus mode, one-task-per-iteration, CI requirements, and completion signals.
  • progress-auto.txt records iterations and tasks, showing repeated automated execution.
  • CLAUDE.md hard-codes architecture rules and prohibits certain tools (Docker).
  • .github/workflows/ci.yml runs typecheck + tests and Playwright E2E.
  • Commit history shows a significant number of auto commits:
    • Total commits: 437
    • Commits with the feat(auto): prefix: 81 (~18.5% of total)
    • Ralph-Auto-Iteration metadata: 80 commits
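
The percentage above is plain arithmetic over commit subjects. A minimal sketch of how such a figure could be reproduced, assuming you have already collected one subject line per commit (for example with `git log --format=%s`); the helper name `autoCommitShare` is hypothetical and not part of the repo:

```typescript
// Hypothetical helper: share of commits whose subject starts with "feat(auto):".
// Input is assumed to be one subject line per commit.
function autoCommitShare(subjects: string[]): { total: number; auto: number; percent: number } {
  const auto = subjects.filter((s) => s.startsWith("feat(auto):")).length;
  const total = subjects.length;
  // Round to one decimal place; guard against an empty history.
  const percent = total === 0 ? 0 : Math.round((auto / total) * 1000) / 10;
  return { total, auto, percent };
}
```

With 81 matching subjects out of 437, this yields 18.5.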

Core artifacts to implement in your repos

Create these files/directories as the minimum foundation.

1) Agent guide (top-level)

Purpose: single source of truth for architecture rules, boundaries, and non-negotiables.
In repo: CLAUDE.md (could be AGENTS.md in your repo).
What it contains (based on CLAUDE.md):

  • Architecture overview, data flow, package boundaries
  • Critical rules (full-stack alignment, no frontend-only hacks, etc.)
  • Must-run test commands
  • Tool bans (e.g., no Docker in their repo)
  • Pointers to specs and best practices

2) Specs directory

Purpose: canonical tasks and best-practices library.
In repo: specs/
Observed patterns:

  • Task specs with phases and checklists (see specs/E2E_TEST_COVERAGE.md)
  • Best-practices specs per layer (EFFECT_BEST_PRACTICES.md, REACT_BEST_PRACTICES.md, etc.)
  • Architecture guidance (UI_ARCHITECTURE.md, HTTP_API_TANSTACK.md, etc.)

3) Agent prompt template

Purpose: enforce workflow in every agent run.
In repo: RALPH_AUTO_PROMPT.md
Core rules from template:

  • Focus mode (only work on the user-specified prompt)
  • One task per iteration
  • Must update specs
  • Must pass CI before signaling completion
  • Output signals: TASK_COMPLETE and NOTHING_LEFT_TO_DO
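
The two output signals are what the loop script keys on. A minimal sketch of detecting them in agent output follows; the matching strategy (substring search) and the precedence when both strings appear are assumptions, since the real script's parsing rules are not shown here:

```typescript
// The two signal strings come from the prompt template; "NONE" is a local placeholder.
type Signal = "TASK_COMPLETE" | "NOTHING_LEFT_TO_DO" | "NONE";

// Hypothetical parser. Assumption: NOTHING_LEFT_TO_DO takes precedence
// if both strings appear, so the loop can stop.
function detectSignal(agentOutput: string): Signal {
  if (agentOutput.includes("NOTHING_LEFT_TO_DO")) return "NOTHING_LEFT_TO_DO";
  if (agentOutput.includes("TASK_COMPLETE")) return "TASK_COMPLETE";
  return "NONE";
}
```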

4) Agent loop script

Purpose: run the agent, gate on CI, commit consistently.
In repo: ralph-auto.sh
Behavior observed:

  • Builds prompt by injecting spec list, progress log, and CI errors
  • Runs the agent in stream-json mode and filters output
  • Checks for TASK_COMPLETE or NOTHING_LEFT_TO_DO
  • Runs CI checks before commit
  • Auto-commits with standard message and iteration metadata
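
The prompt-building step can be pictured as string assembly. The sketch below is one possible shape, not the script's actual code: the section headings and the `PromptInputs` field names are hypothetical, and only the four ingredients (template, spec list, progress log, CI errors) plus the focus prompt come from the description above.

```typescript
interface PromptInputs {
  template: string;     // contents of RALPH_AUTO_PROMPT.md
  focus: string;        // the user-specified prompt
  specFiles: string[];  // file names under specs/
  progressLog: string;  // contents of progress-auto.txt
  ciErrors: string;     // output of the last failing CI run, or ""
}

// Hypothetical assembly; the section headings are illustrative only.
function buildPrompt(p: PromptInputs): string {
  const sections = [
    p.template.trim(),
    `## Focus\n${p.focus}`,
    `## Specs\n${p.specFiles.map((f) => `- specs/${f}`).join("\n")}`,
    `## Progress so far\n${p.progressLog.trim() || "(none)"}`,
  ];
  // Only mention CI errors when the previous run actually failed.
  if (p.ciErrors.trim() !== "") {
    sections.push(`## CI errors to fix first\n${p.ciErrors.trim()}`);
  }
  return sections.join("\n\n");
}
```

Keeping the CI section conditional means a green baseline produces a shorter prompt.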

5) Progress log

Purpose: track iteration-level outputs and make history visible.
In repo: progress-auto.txt
Behavior observed:

  • Updated by the loop script before committing
  • Contains iteration number, timestamp, task summary, status
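
As an illustration of a log entry carrying those four fields, here is a minimal sketch. The line layout is hypothetical; the real progress-auto.txt format may differ, so treat this only as a starting point for your own log.

```typescript
interface ProgressEntry {
  iteration: number;
  timestamp: Date;
  summary: string;
  status: "complete" | "partial";
}

// Hypothetical one-line layout: [ISO timestamp] iteration N (status): summary
function formatProgressEntry(e: ProgressEntry): string {
  return `[${e.timestamp.toISOString()}] iteration ${e.iteration} (${e.status}): ${e.summary}`;
}
```

A single-line format keeps the log easy to grep and cheap to inject into the next prompt.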

6) CI workflows

Purpose: keep the agent aligned with the same checks as CI.
In repo: .github/workflows/ci.yml
Observed setup:

  • Job 1: typecheck + unit tests
  • Job 2: Playwright E2E
  • Mirrors the loop's checks
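
One way to keep the loop and CI in step is to define the checks once and run them from both places. The sketch below assumes npm scripts named `typecheck`, `lint`, and `build` exist in your package.json (hypothetical names; substitute your own) and takes the command runner as a parameter, so the gating logic itself stays independent of how commands are executed.

```typescript
interface Check {
  name: string;
  command: string;
}

// Hypothetical commands; replace with the scripts your project defines.
function gateChecks(includeE2E: boolean): Check[] {
  const base: Check[] = [
    { name: "typecheck", command: "npm run typecheck" },
    { name: "lint", command: "npm run lint" },
    { name: "build", command: "npm run build" },
    { name: "test", command: "npm test" },
  ];
  return includeE2E ? [...base, { name: "e2e", command: "npx playwright test" }] : base;
}

// Runs every check and collects failures so they can be fed into the next prompt.
function runGate(
  checks: Check[],
  run: (command: string) => { ok: boolean; output: string },
): { passed: boolean; errors: string } {
  const failures: string[] = [];
  for (const check of checks) {
    const result = run(check.command);
    if (!result.ok) failures.push(`[${check.name}] ${result.output}`);
  }
  return { passed: failures.length === 0, errors: failures.join("\n") };
}
```

Collecting all failures, rather than stopping at the first, gives the agent the full picture in one iteration; stopping early would be cheaper if CI time is the main concern.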

7) Lint guardrails

Purpose: enforce architectural and security constraints in code.
In repo: eslint.config.mjs
Observed custom rules:

  • Enforce .ts/.tsx extensions for relative imports; no extensions for package imports
  • Ban disableValidation: true
  • Ban sql<Type>`...`
  • Prefer Option.fromNullable
  • Ban localStorage
  • Ban direct fetch
  • Ban window.location.href navigation
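
Several of these bans can be approximated without writing custom rules, using ESLint's built-in `no-restricted-globals` and `no-restricted-syntax`. The sketch below is a stand-in under that assumption, not the repo's actual configuration: the selectors and messages are illustrative, and `no-restricted-globals` only catches bare references such as `fetch(...)`, not `window.fetch(...)`.

```typescript
// Sketch of one flat-config entry built from ESLint's built-in rules.
// The accountability repo uses its own custom rules; these are stand-ins.
const guardrails = {
  files: ["**/*.ts", "**/*.tsx"],
  rules: {
    "no-restricted-globals": [
      "error",
      { name: "localStorage", message: "localStorage is banned by project rules." },
      { name: "fetch", message: "Direct fetch is banned; use the project's API client." },
    ],
    "no-restricted-syntax": [
      "error",
      {
        // Matches object properties written as `disableValidation: true`.
        selector: "Property[key.name='disableValidation'][value.value=true]",
        message: "disableValidation: true is banned.",
      },
      {
        // Matches assignments of the form `window.location.href = ...`.
        selector:
          "AssignmentExpression[left.type='MemberExpression'][left.property.name='href'][left.object.property.name='location']",
        message: "Do not navigate by assigning window.location.href.",
      },
    ],
  },
};

// In a real ESLint config file this array would be the default export.
const config = [guardrails];
```

Starting with built-in rules keeps the setup small; a dedicated custom rule becomes worthwhile once a selector gets too loose or too strict for your codebase.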

The Ralph Auto loop (how it works, step by step)

Runtime flow (from ralph-auto.sh)

  1. Prereq checks: validate agent CLI, git repo, specs/, RALPH_AUTO_PROMPT.md.
  2. Initial CI run: establishes baseline; failures are passed into next prompt.
  3. Iteration loop:
    • Build prompt (spec list + focus + progress + CI errors).
    • Run agent.
    • Parse output for TASK_COMPLETE or NOTHING_LEFT_TO_DO.
    • If task complete: run CI, update progress-auto.txt, commit.
    • If no explicit completion but code changes exist: run CI and commit as partial.
  4. Exit: print recent Ralph Auto commits and clean temp logs.
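
The branching in step 3 can be summarized as a small decision function. This is a sketch of the flow as described, not a port of ralph-auto.sh: the action names are hypothetical, and two behaviors are assumptions (that NOTHING_LEFT_TO_DO ends the loop, and that a failed CI run skips the commit and carries its errors into the next prompt).

```typescript
type Signal = "TASK_COMPLETE" | "NOTHING_LEFT_TO_DO" | "NONE";
type Action = "commit" | "commit-partial" | "retry-with-ci-errors" | "continue" | "stop";

// `runCi` is passed as a function so CI only runs when there is something to commit.
function nextAction(signal: Signal, hasChanges: boolean, runCi: () => boolean): Action {
  // Assumption: nothing left to do ends the loop without further checks.
  if (signal === "NOTHING_LEFT_TO_DO") return "stop";
  if (signal === "TASK_COMPLETE") {
    return runCi() ? "commit" : "retry-with-ci-errors";
  }
  // No explicit completion, but the working tree changed: treat as partial work.
  if (hasChanges) {
    return runCi() ? "commit-partial" : "retry-with-ci-errors";
  }
  return "continue";
}
```

Keeping this logic in one pure function makes the loop's policy easy to test separately from the agent and git plumbing.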

Commit format (from auto-commit function)

feat(auto): <task summary>

Ralph-Auto-Iteration: <n>

Automated commit by Ralph Auto loop.

Recommendation: preserve this convention in your repos for filtering and auditing.
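
A minimal sketch of producing and reading back this format follows. The function names are hypothetical; the message layout is the one shown above.

```typescript
// Builds a commit message in the feat(auto) format with iteration metadata.
function autoCommitMessage(summary: string, iteration: number): string {
  return [
    `feat(auto): ${summary}`,
    "",
    `Ralph-Auto-Iteration: ${iteration}`,
    "",
    "Automated commit by Ralph Auto loop.",
  ].join("\n");
}

// Recovers the iteration number from a message, e.g. when auditing history.
// Returns null for commits that carry no iteration metadata.
function parseIteration(message: string): number | null {
  const match = /^Ralph-Auto-Iteration: (\d+)$/m.exec(message);
  return match ? Number(match[1]) : null;
}
```

Because the metadata sits on its own line, `git log --grep="Ralph-Auto-Iteration"` is enough to list automated commits.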

Spec format patterns to copy

The style works best when specs are explicit about tasks and status.

Example (from specs/E2E_TEST_COVERAGE.md):

  • A "Current State" section with numbers
  • A "Coverage Gaps" section with explicit items
  • A multi-phase implementation plan with checkboxes
  • Detailed test patterns and examples
  • Notes on known limitations

Suggested template for your specs:

  1. Context / background
  2. Current state (metrics, baseline)
  3. Goals and non-goals
  4. Phased task list with checkboxes
  5. Tests required per phase
  6. Risks and dependencies
  7. Status log / completion notes
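
Checkbox-based task lists are also easy to inspect by machine, which helps when you want to report open work per phase. The sketch below assumes GitHub-style task list syntax (`- [ ]` and `- [x]`) under markdown headings; that syntax and the helper names are assumptions, since the exact spec markup is not reproduced here.

```typescript
interface SpecTask {
  phase: string; // nearest preceding heading, or "" if none
  text: string;
  done: boolean;
}

// Extracts checkbox items from a markdown spec, tagging each with its heading.
function parseSpecTasks(markdown: string): SpecTask[] {
  const tasks: SpecTask[] = [];
  let phase = "";
  for (const line of markdown.split("\n")) {
    const heading = /^#{1,6}\s+(.*)$/.exec(line);
    if (heading) {
      phase = (heading[1] ?? "").trim();
      continue;
    }
    const box = /^\s*[-*]\s+\[([ xX])\]\s+(.*)$/.exec(line);
    if (box) {
      tasks.push({ phase, text: (box[2] ?? "").trim(), done: box[1] !== " " });
    }
  }
  return tasks;
}

// First unchecked task in document order, or null when everything is done.
function nextOpenTask(tasks: SpecTask[]): SpecTask | null {
  return tasks.find((t) => !t.done) ?? null;
}
```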

Guardrails and conventions (how they enforce style)

The repo encodes its rules in both docs and tooling.

Docs (human rules)

  • CLAUDE.md: hard requirements (full-stack alignment, no Docker, etc.)
  • specs/EFFECT_BEST_PRACTICES.md: typed errors, no any, no casts, no catchAllCause
  • specs/TYPESCRIPT_CONVENTIONS.md: no barrels, .ts extensions, no /src/ imports

Tooling (automatic enforcement)

  • ESLint custom rules for architecture and security
  • TypeScript project references for incremental builds
  • CI checks that mirror local gating

Implementation advice:

  • Start with docs to define the rules.
  • Encode the most important rules in linting.
  • Fail CI on lint, typecheck, and tests.

How to implement this style in your repos

Below is a practical, step-by-step adoption plan.

Phase 0 — baseline hygiene

  • Define your project boundaries (backend vs frontend vs shared)
  • Write an agent guide (AGENTS.md or CLAUDE.md) with hard rules
  • Add lint, typecheck, and test scripts that can run headless

Phase 1 — specs as source of truth

  • Create specs/ with:
    • Feature specs that include tasks and phases
    • Best-practice docs for each layer
  • Require that specs be updated when tasks are completed

Phase 2 — agent loop automation

  • Add RALPH_AUTO_PROMPT.md with:
    • Focus mode
    • One task per iteration
    • CI gating requirement
    • Completion signals
  • Add ralph-auto.sh that:
    • Builds prompt from spec list + progress + CI errors
    • Runs the agent
    • Runs CI and commits
  • Add progress-auto.txt and update it automatically

Phase 3 — guardrails and enforcement

  • Add lint rules for your non-negotiables
  • Document any forbidden patterns (e.g., direct fetch, any)
  • Add an ESLint rule or codemod for each high-risk pattern

Phase 4 — testing discipline

  • Keep a specs/E2E_TEST_COVERAGE.md that explicitly tracks coverage
  • Require tests in the loop before completion
  • Add E2E to CI (preferably in a dedicated job)

Phase 5 — reference patterns (optional but powerful)

  • Vendor reference repos in repos/ for offline lookup
  • Document common search patterns for engineers and agents

Sample workflows you can copy

Workflow A — feature implementation

  1. Write/extend a spec with phases and tests.
  2. Run: ./ralph-auto.sh "Implement <feature>" --max-iterations 3
  3. Review auto commits and spec updates.
  4. Merge when CI is green.

Workflow B — test hardening

  1. Add a coverage gap to specs/E2E_TEST_COVERAGE.md.
  2. Run: ./ralph-auto.sh "Add E2E coverage for <module>" --e2e
  3. Verify new tests and update coverage spec.

Workflow C — refactor with guardrails

  1. Add a best-practice spec describing the target style.
  2. Run: ./ralph-auto.sh "Standardize <component> to match spec"
  3. Ensure lint rules enforce the new style.
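
The three workflows share one command shape: a quoted focus prompt followed by optional `--max-iterations <n>` and `--e2e` flags. If you reimplement the loop, argument handling could look like the sketch below. It is an illustration only: the validation rules, the error messages, and the choice to make the focus prompt mandatory are assumptions, and no default iteration limit is implied.

```typescript
interface LoopOptions {
  focus: string;
  maxIterations: number | null; // null means no limit was given
  e2e: boolean;
}

// Hypothetical parser for: <focus> [--max-iterations <n>] [--e2e]
function parseLoopArgs(argv: string[]): LoopOptions {
  const opts: LoopOptions = { focus: "", maxIterations: null, e2e: false };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i] ?? "";
    if (arg === "--e2e") {
      opts.e2e = true;
    } else if (arg === "--max-iterations") {
      const value = Number(argv[i + 1]);
      if (!Number.isInteger(value) || value < 1) {
        throw new Error("--max-iterations expects a positive integer");
      }
      opts.maxIterations = value;
      i++; // skip the value we just consumed
    } else if (opts.focus === "") {
      opts.focus = arg;
    } else {
      throw new Error(`Unexpected argument: ${arg}`);
    }
  }
  if (opts.focus === "") throw new Error("A focus prompt is required");
  return opts;
}
```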

Risks and mitigations

  • Risk: scope creep in agent runs.
    • Mitigation: strict focus prompts and one-task-per-iteration rules.
  • Risk: flaky E2E tests blocking automation.
    • Mitigation: bake stabilization tasks into specs; track flakiness explicitly.
  • Risk: spec drift (docs not updated).
    • Mitigation: enforce spec updates in the prompt and reject completion without them.
  • Risk: CI cost too high for every iteration.
    • Mitigation: allow a --max-iterations flag and optional --e2e gating.

Suggested file layout for a new repo

.
├── AGENTS.md (or CLAUDE.md)
├── RALPH_AUTO_PROMPT.md
├── ralph-auto.sh
├── progress-auto.txt
├── specs/
│   ├── FEATURE_X.md
│   ├── BEST_PRACTICES_BACKEND.md
│   ├── BEST_PRACTICES_FRONTEND.md
│   └── E2E_TEST_COVERAGE.md
├── .github/workflows/ci.yml
└── eslint.config.mjs

What to copy verbatim vs adapt

  • Copy verbatim:
    • Prompt structure with focus mode and completion signals
    • Auto-commit metadata (Ralph-Auto-Iteration)
    • Progress log format
  • Adapt:
    • CI checks (match your tech stack)
    • Lint rules (your domain constraints)
    • Spec taxonomy and coverage docs

Bottom line

If you want the same development style, the core requirements are process automation, documentation discipline, and strict CI gating. The accountability repo suggests that agents can auto-commit safely when tasks are tightly scoped, specs are the source of truth, and guardrails are enforced by tooling.
