This document defines a machine-readable, state-tracked workflow for AI-driven code refactoring using the Test-Driven Development (TDD) Red → Green → Refactor cycle.
It enforces:
- One branch / one PR per issue
- Strict RED/GREEN/REFACTOR phases with mandatory relevant test passing at each step
- Automatic detection and correction of violations (root-cause fixes only, no shortcuts)
- Explicitly forbidden shortcuts that mask failing tests
- Iterative cycles within a single work stream, with cycle numbers tracked
- State logging of every action, commit, test result, and violation for auditability
- PR descriptions containing a full cycle trace and state log link
- This is designed for AI agents that must remain fully compliant, auditable, and iterative during software changes, preventing premature phase completion or skipped tests.
I currently use it for curosr rules but you could also use this prompt for claude code
BEGIN REFACTOR PROCEDURE — STRICT TEST ENFORCEMENT
1. PRINCIPLES
1.1 All work is done per-issue (one issue → one branch → one PR).
1.2 Follow Test-First TDD exactly: RED → GREEN → REFACTOR.
1.3 Never take shortcuts that mask failing tests (see disallowed actions).
2. RED / GREEN / REFACTOR (Definitions & Enforcement)
2.1 RED:
- Add a new automated test that currently FAILS because the feature/behavior is not implemented or is broken.
- Commit immediately with a single-line Conventional Commit message that includes the phase tag.
Example: `test(auth): add failing test for password reset [RED]`
2.2 GREEN:
- Implement the minimal code necessary to make the failing test(s) introduced in RED pass.
- **Requirement:** Before declaring GREEN complete, run the full set of "relevant tests" (see §5). **All** relevant tests must pass.
- If any relevant test still fails, do not mark GREEN. Follow the auto-correction root-cause steps in §3.
- Commit immediately after GREEN change with a single-line message that includes the phase tag.
Example: `fix(auth): implement password reset to satisfy tests [GREEN]`
2.3 REFACTOR:
- Improve code quality/structure without changing externally visible behavior.
- After REFACTOR, **all relevant tests must still pass** before marking REFACTOR complete.
- Commit with a single-line message including the phase tag.
Example: `refactor(auth): simplify password reset flow [REFACTOR]`
2.4 ITERATIVE NATURE OF TDD:
- Treat RED → GREEN → REFACTOR as a continuous, repeatable cycle.
- Within the same branch, PR, or work stream, expect to perform multiple TDD cycles for different behaviors, edge cases, or refinements.
- Each new test that introduces a failing condition (RED) begins a new cycle.
- After completing GREEN and REFACTOR for that failing test, return to RED for the next desired behavior or scenario.
- Maintain state log entries for each individual cycle so they can be independently reviewed.
3. VIOLATION HANDLING — AUTO-CORRECTION & ROOT-CAUSE RULES
3.1 Pre-execution check: before any step, compute relevant tests (see §5) and confirm they are runnable.
3.2 If a GREEN attempt would leave any relevant test failing:
- Reproduce the failing test(s) locally and capture the stack trace.
- Perform root-cause analysis (trace failing assertion → code path → commit/diff) and implement the actual fix that addresses the underlying defect.
- Re-run relevant tests. Repeat until they pass or the issue is provably unresolvable.
- Log each attempt and the fix applied.
3.3 If the agent attempts one of the disallowed shortcuts (see §6), treat this as a violation. Attempt to revert the shortcut and fix root cause. If unable, STOP and emit `UNRESOLVABLE RULE VIOLATION` with details.
3.4 Outputs on any violation attempt:
- If auto-fixed: `RULE VIOLATION DETECTED → AUTO-CORRECTED: <step> — <short reason>`
- If cannot auto-fix: `UNRESOLVABLE RULE VIOLATION: <step> — <reason>`
4. COMMITS / MESSAGES (format + required content)
4.1 Every commit must be a single line and follow Conventional Commits.
4.2 Every commit message must include the TDD phase in brackets: `[RED]`, `[GREEN]`, or `[REFACTOR]`.
4.3 If tests are modified, the state log must include a human-readable justification (see §7). Example commit: `fix(api): correct response shape to satisfy tests [GREEN]`
5. RELEVANT TEST SELECTION (how to determine what must pass)
5.1 Construct the set of relevant tests for this change by:
a. Listing files modified in the branch (diff vs base).
b. Selecting tests that:
- Import or reference any modified file/module (static-search for imports, symbols).
- Reside in the same package/directory as modified files.
- Were added/updated in this branch.
c. If available, include tests that coverage tools show exercise the modified code.
d. If uncertain, widen to run all tests in the same module/package.
5.2 The agent must run the relevant-test set after RED, after GREEN, and after REFACTOR for **each cycle**.
5.3 It is acceptable to skip running unrelated test suites for speed, **but not** the relevant tests defined above.
6. DISALLOWED SHORTCUTS (explicitly forbidden actions)
6.1 Never do any of the following to make tests "pass":
- Commenting out or deleting failing asserts in tests.
- Adding test skips/xfail markers to suppress failures as a permanent fix.
- Catching and silencing exceptions in production code purely to hide test failures.
- Weakening assertions (e.g., changing `== expected` to `in` or `is not None`) to accept incorrect behavior.
- Changing test expectations to match buggy production behavior without fixing the root cause.
- Permanently altering CI/test-runner configuration to avoid running the failing tests.
6.2 Allowed exception: modifying a test is permitted **only** if the test itself is proven incorrect (e.g., wrong expected value or outdated spec). In that case:
- Document the justification in the state log (see §7).
- The code causing the test to be wrong must not be unchanged for convenience; the test change must be minimal and clearly justified.
7. STATE TRACKING (mandatory, expanded)
7.1 After every action update the state log (exportable JSON). Each entry must include:
- Timestamp.
- Cycle number.
- Branch name and issue ID.
- Action performed (RED/GREEN/REFACTOR/TEST-RUN/COMMIT/PR).
- Files changed (with diffs or git hashes).
- Commit hash and single-line commit message.
- Current phase.
- List of relevant tests run, results (pass/fail) and full failing stack traces.
- If a failing test was fixed, record whether the fix was:
• root-cause code fix, OR
• test modification (include TEST_CHANGE_JUSTIFICATION text).
- Any RULE VIOLATION DETECTED entries and correction actions.
7.2 The branch/PR must link to the state log (e.g., include the path or paste the JSON as part of PR description).
8. PULL REQUEST REQUIREMENTS
8.1 PR description must include:
- Which issue it solves.
- The RED→GREEN→REFACTOR trace for each cycle: failing tests added, code changes made, tests run and pass, and final state log link.
- Any test-file modifications and their justifications.
8.2 CI must show the relevant tests passing before merge approval.
9. FAILURE MODES / STOP CONDITIONS
9.1 If the agent cannot fix a failing relevant test because of:
- Missing permissions, unavailable external service, flaky external dependency that cannot be isolated, or ambiguous spec requiring human decision,
then:
- Emit: `UNRESOLVABLE RULE VIOLATION` with full diagnostic (failing tests, attempted fixes, why unresolvable).
- STOP and WAIT for human instruction.
9.2 Do not mark any phase complete while an UNRESOLVABLE RULE VIOLATION exists.
END PROCEDURE