frankdilo/accountability-development-style-report.md

## accountability-development-style-report.md

      
Accountability repo development style (detailed): how to implement it in your repos

Purpose

This report distills the development style used in https://github.com/mikearnaldi/accountability into a reproducible playbook you can apply in your own repositories. It focuses on the workflow, artifacts, automation, and guardrails that make the "Ralph" agent loop practical and consistent.

High-level operating model (what the repo actually does)


Spec-first execution. Work is defined in specs/ documents, each with tasks/phases and status. The agent is expected to read specs, pick a task, implement, and update the spec.
Agent loop orchestration. A root script (ralph-auto.sh) runs the agent, feeds it a focused prompt, runs CI checks, and commits. The agent does not commit; the script does.
Strict CI gating. The loop requires green typecheck, lint, build, test (and optionally E2E) before commit. CI mirrors this with separate build and E2E jobs.
Full-stack alignment. CLAUDE.md mandates backend and frontend changes stay aligned and forbids frontend-only shortcuts.
Heavy testing and regression discipline. Frequent E2E fixes and explicit test counts in commits indicate an emphasis on stabilizing tests and tracking coverage.
Strong guardrails. Domain-specific lint rules and best-practice docs prevent certain classes of mistakes (e.g., any, direct fetch, localStorage).

Evidence and concrete signals


ralph-auto.sh implements the agent loop and auto-commit behavior.
RALPH_AUTO_PROMPT.md defines focus mode, one-task-per-iteration, CI requirements, and completion signals.
progress-auto.txt records iterations and tasks, showing repeated automated execution.
CLAUDE.md hard-codes architecture rules and prohibits certain tools (Docker).
.github/workflows/ci.yml runs typecheck + tests and Playwright E2E.
Commit history shows a substantial share of automated commits:

Total commits: 437
Commits prefixed with feat(auto): 81 (~18.5%)
Commits carrying Ralph-Auto-Iteration metadata: 80
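
The percentage follows directly from the two counts reported above. A quick check, using only those numbers:

```python
total_commits = 437
auto_commits = 81

# Share of commits that carry the feat(auto): prefix, as a percentage.
auto_share = auto_commits / total_commits * 100
print(f"{auto_share:.1f}%")  # prints 18.5%
```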


Core artifacts to implement in your repos

Create these files/directories as the minimum foundation.

1) Agent guide (top-level)

Purpose: single source of truth for architecture rules, boundaries, and non-negotiables.
In repo: CLAUDE.md (could be AGENTS.md in your repo).
What it contains (based on CLAUDE.md):

Architecture overview, data flow, package boundaries
Critical rules (full-stack alignment, no frontend-only hacks, etc.)
Must-run test commands
Tool bans (e.g., no Docker in their repo)
Pointers to specs and best practices

2) Specs directory

Purpose: canonical tasks and best-practices library.
In repo: specs/
Observed patterns:

Task specs with phases and checklists (see specs/E2E_TEST_COVERAGE.md)
Best-practices specs per layer (EFFECT_BEST_PRACTICES.md, REACT_BEST_PRACTICES.md, etc.)
Architecture guidance (UI_ARCHITECTURE.md, HTTP_API_TANSTACK.md, etc.)

3) Agent prompt template

Purpose: enforce workflow in every agent run.
In repo: RALPH_AUTO_PROMPT.md
Core rules from template:

Focus mode (only work on the user-specified prompt)
One task per iteration
Must update specs
Must pass CI before signaling completion
Output signals: TASK_COMPLETE and NOTHING_LEFT_TO_DO
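
The two output signals are what lets a script decide what to do next without understanding the agent's prose. A minimal sketch of how a loop might detect them, written in Python for brevity (the repo's own loop is a shell script). The function name `detect_signal` and the "last signal wins" rule are assumptions for illustration, not taken from the repo:

```python
from typing import Optional

# Signal names come from the report; the matching strategy is an assumption.
SIGNALS = ("TASK_COMPLETE", "NOTHING_LEFT_TO_DO")

def detect_signal(agent_output: str) -> Optional[str]:
    """Return the last completion signal found in the output, or None."""
    found = None
    for line in agent_output.splitlines():
        for signal in SIGNALS:
            if signal in line:
                found = signal  # later lines override earlier ones
    return found

print(detect_signal("Edited 3 files\nTASK_COMPLETE: added login test"))
```

Returning `None` when no signal is present keeps the "no explicit completion" case distinguishable from both signals.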

4) Agent loop script

Purpose: run the agent, gate on CI, commit consistently.
In repo: ralph-auto.sh
Behavior observed:

Builds prompt by injecting spec list, progress log, and CI errors
Runs the agent in stream-json mode and filters output
Checks for TASK_COMPLETE or NOTHING_LEFT_TO_DO
Runs CI checks before commit
Auto-commits with standard message and iteration metadata
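
Prompt construction is the step that turns static files into per-iteration context. The sketch below shows one way to inject the spec list, progress log, and CI errors into a template. The `{{...}}` placeholder names and the fallback strings are hypothetical; the report does not say how the real template marks its insertion points:

```python
def build_prompt(template: str, focus: str, spec_files: list[str],
                 progress_tail: str, ci_errors: str) -> str:
    """Fill a prompt template with the context injected on each iteration."""
    spec_list = "\n".join(f"- {path}" for path in sorted(spec_files))
    return (
        template
        .replace("{{FOCUS}}", focus)
        .replace("{{SPECS}}", spec_list)
        .replace("{{PROGRESS}}", progress_tail or "(no previous iterations)")
        .replace("{{CI_ERRORS}}", ci_errors or "(CI is green)")
    )

# Hypothetical template text, standing in for the contents of RALPH_AUTO_PROMPT.md.
TEMPLATE = (
    "Focus: {{FOCUS}}\n"
    "Specs:\n{{SPECS}}\n"
    "Recent progress:\n{{PROGRESS}}\n"
    "CI errors:\n{{CI_ERRORS}}\n"
)

print(build_prompt(TEMPLATE, "Add login test",
                   ["specs/E2E_TEST_COVERAGE.md"], "", ""))
```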

5) Progress log

Purpose: track iteration-level outputs and make history visible.
In repo: progress-auto.txt
Behavior observed:

Updated by the loop script before committing
Contains iteration number, timestamp, task summary, status
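
A log entry with those four fields can be produced by a small helper. This is a sketch only: the report lists the fields but not the file's layout, so the heading and label format below are assumptions:

```python
from datetime import datetime, timezone
from typing import Optional

def format_progress_entry(iteration: int, summary: str, status: str,
                          now: Optional[datetime] = None) -> str:
    """Render one entry; `now` should be a UTC datetime (defaults to the current time)."""
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        f"## Iteration {iteration}\n"
        f"Timestamp: {stamp}\n"
        f"Task: {summary}\n"
        f"Status: {status}\n"
    )

def append_progress(path: str, entry: str) -> None:
    """Append an entry to the log file, creating the file if needed."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(entry + "\n")
```

Appending rather than rewriting keeps earlier iterations intact, which is what makes the file usable as history in later prompts.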

6) CI workflows

Purpose: keep the agent aligned with the same checks as CI.
In repo: .github/workflows/ci.yml
Observed setup:

Job 1: typecheck + unit tests
Job 2: Playwright E2E
Mirrors the loop's checks
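
To keep local gating and CI in step, it helps to define the checks once as data and run them the same way everywhere. The following sketch runs a set of named commands and collects the output of any that fail. The `npm` script names are hypothetical stand-ins; substitute whatever your repository actually defines:

```python
import subprocess
import sys
from typing import Mapping, Sequence

def run_checks(checks: Mapping[str, Sequence[str]]) -> dict[str, str]:
    """Run every check in order; return {name: output} for those that fail."""
    failures: dict[str, str] = {}
    for name, command in checks.items():
        try:
            result = subprocess.run(command, capture_output=True, text=True)
        except FileNotFoundError as error:
            failures[name] = str(error)  # the tool itself is not installed
            continue
        if result.returncode != 0:
            failures[name] = (result.stdout + result.stderr).strip()
    return failures

def report(failures: Mapping[str, str]) -> None:
    """Print failures to stderr so they can be copied into the next prompt."""
    for name, output in failures.items():
        print(f"[{name}] failed:\n{output}", file=sys.stderr)

# Hypothetical script names, not taken from the repo.
LOCAL_GATE = {
    "typecheck": ["npm", "run", "typecheck"],
    "lint": ["npm", "run", "lint"],
    "build": ["npm", "run", "build"],
    "test": ["npm", "test"],
}
```

An empty result from `run_checks(LOCAL_GATE)` is the "green" condition the loop would require before committing.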

7) Lint guardrails

Purpose: enforce architectural and security constraints in code.
In repo: eslint.config.mjs
Observed custom rules:

Enforce .ts/.tsx extensions for relative imports; no extensions for package imports
Ban disableValidation: true
Ban sql<Type>`...`
Prefer Option.fromNullable
Ban localStorage
Ban direct fetch
Ban window.location.href navigation
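
The repo enforces these bans with ESLint rules, which work on the parsed syntax tree. To illustrate the idea of a mechanical ban in isolation, here is a deliberately crude regex scanner covering four of the rules above. It is a toy, not a substitute for ESLint: the patterns and the `find_violations` name are assumptions, and regexes will misfire on comments and string literals:

```python
import re

# Toy patterns modelled on four of the listed bans (illustrative only).
BANNED = {
    "localStorage": re.compile(r"\blocalStorage\b"),
    "direct fetch": re.compile(r"(?<![.\w])fetch\s*\("),
    "window.location.href navigation": re.compile(r"window\.location\.href\s*=(?!=)"),
    "disableValidation: true": re.compile(r"disableValidation\s*:\s*true\b"),
}

def find_violations(source: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) pairs for every banned pattern hit."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for rule, pattern in BANNED.items():
            if pattern.search(line):
                hits.append((number, rule))
    return hits

sample = "const t = localStorage.getItem('k')\nawait client.fetch(url)\nawait fetch('/api')\n"
print(find_violations(sample))
```

The lookbehind in the fetch pattern lets method calls such as `client.fetch(...)` through while flagging the bare global, which mirrors the intent of banning direct fetch in favor of a shared client.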

The Ralph Auto loop (how it works, step by step)

Runtime flow (from ralph-auto.sh)


Prereq checks: validate agent CLI, git repo, specs/, RALPH_AUTO_PROMPT.md.
Initial CI run: establishes a baseline; any failures are passed into the next prompt.
Iteration loop:

Build prompt (spec list + focus + progress + CI errors).
Run agent.
Parse output for TASK_COMPLETE or NOTHING_LEFT_TO_DO.
If task complete: run CI, update progress-auto.txt, commit.
If no explicit completion but code changes exist: run CI and commit as partial.


Exit: print recent Ralph Auto commits and clean temp logs.
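
The control flow above can be sketched as a single function with its side effects injected, which also makes it testable without an agent or a git repo. This is written in Python for brevity, and two behaviors are assumptions the report does not spell out: that NOTHING_LEFT_TO_DO ends the loop, and that a failed CI run skips the commit and carries the errors into the next iteration. All parameter names are hypothetical:

```python
from typing import Callable

def run_loop(
    run_agent: Callable[[str], str],     # prompt -> agent output
    run_ci: Callable[[], str],           # returns error text, "" when green
    has_changes: Callable[[], bool],     # True if the working tree is dirty
    commit: Callable[[int, str], None],  # (iteration, summary)
    focus: str,
    max_iterations: int = 3,
) -> int:
    """Drive the loop; return the number of commits made."""
    commits = 0
    ci_errors = run_ci()                 # baseline run before the first iteration
    for iteration in range(1, max_iterations + 1):
        prompt = f"Focus: {focus}\nCI errors:\n{ci_errors or 'none'}"
        output = run_agent(prompt)
        if "NOTHING_LEFT_TO_DO" in output:
            break                        # assumed: this signal ends the run
        complete = "TASK_COMPLETE" in output
        if not complete and not has_changes():
            continue                     # nothing to gate or commit
        ci_errors = run_ci()
        if ci_errors:
            continue                     # assumed: retry with errors in the prompt
        commit(iteration, focus if complete else f"{focus} (partial)")
        commits += 1
    return commits
```

Keeping the commit call inside the loop driver, rather than in the agent, matches the division of labor described earlier: the agent edits, the script commits.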

Commit format (from auto-commit function)

feat(auto): <task summary>

Ralph-Auto-Iteration: <n>

Automated commit by Ralph Auto loop.

Recommendation: preserve this convention in your repos for filtering and auditing.
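
A fixed message shape is only useful for auditing if it can be both produced and parsed reliably. The helpers below are a sketch of that round trip for the format shown above; the function names are hypothetical:

```python
import re
from typing import Optional

def format_commit_message(summary: str, iteration: int) -> str:
    """Build a message in the three-part shape shown above."""
    return (
        f"feat(auto): {summary}\n\n"
        f"Ralph-Auto-Iteration: {iteration}\n\n"
        "Automated commit by Ralph Auto loop."
    )

def parse_iteration(message: str) -> Optional[int]:
    """Extract the iteration number from a commit message, if present."""
    match = re.search(r"^Ralph-Auto-Iteration: (\d+)$", message, flags=re.MULTILINE)
    return int(match.group(1)) if match else None

message = format_commit_message("Add login E2E test", 7)
print(parse_iteration(message))  # prints 7
```

On the command line, `git log --grep="Ralph-Auto-Iteration"` gives a comparable filter over existing history.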

Spec format patterns to copy

The style works best when specs are explicit about tasks and status.
Example (from specs/E2E_TEST_COVERAGE.md):

A "Current State" section with numbers
A "Coverage Gaps" section with explicit items
A multi-phase implementation plan with checkboxes
Detailed test patterns and examples
Notes on known limitations

Suggested template for your specs:

Context / background
Current state (metrics, baseline)
Goals and non-goals
Phased task list with checkboxes
Tests required per phase
Risks and dependencies
Status log / completion notes
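
Checkbox-based task lists have the side benefit of being machine-readable, so a script or agent can tell how much of a spec remains. A minimal sketch, assuming the specs use GitHub-style `- [ ]` and `- [x]` task items (the report mentions checkboxes but not their exact syntax):

```python
import re

# Matches "- [ ] ..." or "- [x] ..." (also "*" bullets), with optional indentation.
CHECKBOX = re.compile(r"^\s*[-*] \[( |x|X)\] ", re.MULTILINE)

def task_progress(spec_text: str) -> tuple[int, int]:
    """Return (done, total) for task-list items in a spec."""
    marks = CHECKBOX.findall(spec_text)
    done = sum(1 for mark in marks if mark.lower() == "x")
    return done, len(marks)

spec = "## Phase 1\n- [x] Add login test\n- [ ] Add logout test\n"
print(task_progress(spec))  # prints (1, 2)
```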

Guardrails and conventions (how they enforce style)

The repo encodes its rules in both docs and tooling.

Docs (human rules)


CLAUDE.md: hard requirements (full-stack alignment, no Docker, etc.)
specs/EFFECT_BEST_PRACTICES.md: typed errors, no any, no casts, no catchAllCause
specs/TYPESCRIPT_CONVENTIONS.md: no barrels, .ts extensions, no /src/ imports

Tooling (automatic enforcement)


ESLint custom rules for architecture and security
TypeScript project references for incremental builds
CI checks that mirror local gating

Implementation advice:

Start with docs to define the rules.
Encode the most important rules in linting.
Fail CI on lint, typecheck, and tests.

How to implement this style in your repos

Below is a practical, step-by-step adoption plan.

Phase 0 — baseline hygiene


Define your project boundaries (backend vs frontend vs shared)
Write an agent guide (AGENTS.md or CLAUDE.md) with hard rules
Add lint, typecheck, and test scripts that can run headless

Phase 1 — specs as source of truth


Create specs/ with:

Feature specs that include tasks and phases
Best-practice docs for each layer


Require that specs be updated when tasks are completed

Phase 2 — agent loop automation


Add RALPH_AUTO_PROMPT.md with:

Focus mode
One task per iteration
CI gating requirement
Completion signals


Add ralph-auto.sh that:

Builds prompt from spec list + progress + CI errors
Runs the agent
Runs CI and commits


Add progress-auto.txt and update it automatically

Phase 3 — guardrails and enforcement


Add lint rules for your non-negotiables
Document any forbidden patterns (e.g., direct fetch, any)
Add an ESLint rule or codemod for each high-risk pattern

Phase 4 — testing discipline


Keep a specs/E2E_TEST_COVERAGE.md that explicitly tracks coverage
Require tests in the loop before completion
Add E2E to CI (preferably in a dedicated job)

Phase 5 — reference patterns (optional but powerful)


Vendor reference repos in repos/ for offline lookup
Document common search patterns for engineers and agents

Sample workflows you can copy

Workflow A — feature implementation


Write/extend a spec with phases and tests.
Run: ./ralph-auto.sh "Implement <feature>" --max-iterations 3
Review auto commits and spec updates.
Merge when CI is green.

Workflow B — test hardening


Add a coverage gap to specs/E2E_TEST_COVERAGE.md.
Run: ./ralph-auto.sh "Add E2E coverage for <module>" --e2e
Verify new tests and update coverage spec.

Workflow C — refactor with guardrails


Add a best-practice spec describing the target style.
Run: ./ralph-auto.sh "Standardize <component> to match spec"
Ensure lint rules enforce the new style.

Risks and mitigations


Risk: scope creep in agent runs.

Mitigation: strict focus prompts and one-task-per-iteration rules.


Risk: flaky E2E tests blocking automation.

Mitigation: bake stabilization tasks into specs; track flakiness explicitly.


Risk: spec drift (docs not updated).

Mitigation: require spec updates in the prompt and reject completion without them.


Risk: CI cost too high for every iteration.

Mitigation: cap runs with a --max-iterations flag and make E2E gating opt-in via --e2e.
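
The two cost-control flags named in the last mitigation map naturally onto a small command-line interface. A sketch with Python's `argparse`, mirroring the invocations used in the sample workflows; the default of 10 iterations is an assumption, since the report does not state the script's default:

```python
import argparse

def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse a focus prompt plus the two cost-control flags."""
    parser = argparse.ArgumentParser(prog="ralph-auto")
    parser.add_argument("focus", help="the task the agent should work on")
    parser.add_argument("--max-iterations", type=int, default=10,
                        help="stop after this many agent runs (default is assumed)")
    parser.add_argument("--e2e", action="store_true",
                        help="also run the slower E2E suite before committing")
    return parser.parse_args(argv)

args = parse_args(["Implement <feature>", "--max-iterations", "3"])
print(args.focus, args.max_iterations, args.e2e)
```

Making E2E opt-in keeps the default iteration cheap, while the iteration cap bounds the total cost of an unattended run.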


Suggested file layout for a new repo

.
├── AGENTS.md (or CLAUDE.md)
├── RALPH_AUTO_PROMPT.md
├── ralph-auto.sh
├── progress-auto.txt
├── specs/
│   ├── FEATURE_X.md
│   ├── BEST_PRACTICES_BACKEND.md
│   ├── BEST_PRACTICES_FRONTEND.md
│   └── E2E_TEST_COVERAGE.md
├── .github/workflows/ci.yml
└── eslint.config.mjs

What to copy verbatim vs adapt


Copy verbatim:

Prompt structure with focus mode and completion signals
Auto-commit metadata (Ralph-Auto-Iteration)
Progress log format


Adapt:

CI checks (match your tech stack)
Lint rules (your domain constraints)
Spec taxonomy and coverage docs


Bottom line

If you want the same development style, the core requirements are process automation, documentation discipline, and strict CI gating. The accountability repo suggests that agents can auto-commit safely when tasks are tightly scoped, specs are the source of truth, and guardrails are enforced by tooling.