SKILL: Spawning & Monitoring Background Coding Agents from Claude Code — phase pipeline pattern, prompt templates, model selection

Skill: Spawning & Monitoring Background Coding Agents from Claude Code

Patterns for orchestrating long-running AI coding agents from within a Claude Code session.


The Core Pattern

Use the Bash tool with run_in_background: true to launch an agent, redirect its output to a file, and poll progress without blocking the main session.

npx oh-my-opencode run --agent hephaestus --directory /path/to/project \
  "$(cat /tmp/phase-prompt.txt)" > /tmp/phase-output.output 2>&1
# Returns a background task ID immediately

Then monitor with:

wc -l /tmp/phase-output.output && tail -80 /tmp/phase-output.output
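The monitoring command above can be wrapped in a small helper so each poll is a single call. This is a minimal sketch: the function name `peek_output` and its default of 80 lines are illustrative assumptions, not part of Claude Code or oh-my-opencode.

```shell
# peek_output FILE [LINES]: print the line count, then the last LINES lines.
# Hypothetical helper; the name and the default of 80 are assumptions.
peek_output() {
  local file="$1" lines="${2:-80}"
  if [ ! -f "$file" ]; then
    echo "no output yet: $file" >&2
    return 1
  fi
  # tr strips the padding some wc implementations add around the count.
  printf 'lines: %s\n' "$(wc -l < "$file" | tr -d ' ')"
  tail -n "$lines" "$file"
}
```

Calling `peek_output /tmp/phase-output.output` reproduces the `wc -l ... && tail -80 ...` check, and the non-zero return makes it obvious when the agent has not written anything yet.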

Why Not Use the Task Tool?

The Claude Code Task tool spawns subagents, but those subagents cannot run opencode: they lack shell permission by default, or the call is denied by permission hooks.

Solution: Run opencode/npx oh-my-opencode directly via the Bash tool in the main session with run_in_background: true.


Sequential Phase Pipeline Pattern

For multi-phase projects (e.g., Phase 5 → Phase 7 → Phase 6 → Phase 8):

  1. Write each phase's prompt to a temp file: /tmp/phaseN-prompt.txt
  2. Launch one agent at a time
  3. Wait for completion (poll or use task notification)
  4. Commit the phase's work
  5. Launch the next phase

# Phase N launch
npx oh-my-opencode run --agent hephaestus --directory /repo \
  "$(cat /tmp/phaseN-prompt.txt)" > /tmp/phaseN.output 2>&1

# Monitor
sleep 120 && wc -l /tmp/phaseN.output && tail -50 /tmp/phaseN.output

# After completion: check artifacts, run verification
cargo check --workspace

# Commit (git add -p prompts interactively, so stage non-interactively from the Bash tool)
git add -A && git commit -m "Phase N: description"

# Launch next phase
npx oh-my-opencode run --agent hephaestus --directory /repo \
  "$(cat /tmp/phaseN1-prompt.txt)" > /tmp/phaseN1.output 2>&1

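The steps above can be sketched as a small driver. This is a minimal sketch under stated assumptions, not the author's script: the function names `run_phase` and `run_pipeline` are hypothetical, prompts are assumed to already exist at /tmp/phaseN-prompt.txt, and each agent is awaited in the foreground, so the driver itself would be the thing launched with run_in_background: true.

```shell
# run_phase REPO PHASE: launch one agent, verify, then commit.
# Hypothetical helper; the commands are the ones used in this document.
run_phase() {
  local repo="$1" phase="$2"
  local prompt="/tmp/phase${phase}-prompt.txt"
  local output="/tmp/phase${phase}.output"

  [ -f "$prompt" ] || { echo "missing prompt: $prompt" >&2; return 1; }

  # Launch one agent and wait for it to exit.
  npx oh-my-opencode run --agent hephaestus --directory "$repo" \
    "$(cat "$prompt")" > "$output" 2>&1 || return 1

  # Verification gate: do not commit a phase that fails to build.
  (cd "$repo" && cargo check --workspace) || return 1

  # git add -A stages everything without prompting. git commit returns
  # non-zero when nothing changed, which also stops the pipeline.
  (cd "$repo" && git add -A && git commit -m "Phase ${phase}: automated commit")
}

# run_pipeline REPO PHASE...: run phases in order, stopping at the first failure.
run_pipeline() {
  local repo="$1"
  shift
  local phase
  for phase in "$@"; do
    run_phase "$repo" "$phase" || {
      echo "phase $phase failed; stopping" >&2
      return 1
    }
  done
}
```

For example, `run_pipeline /repo 5 7 6 8` would run the four phases in that order. Stopping at the first failure keeps a broken phase from being committed and built upon.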
Reading Agent Output

The output file contains the full event stream from the agent session. Key things to look for:

Agent started:

Starting server on port 4096
Session: ses_37007a5aeffeEP0RWYWMAk56Yp
  minimaxai/minimax-m2.1
  └─ Hephaestus (Deep Agent)

Tool calls (agent working):

→ Read /path/to/file.rs
└─ output ...

← Edit /path/to/file.rs
└─ output Updated

$ cargo check --workspace
└─ output Finished `dev` profile

Completion signal:

All tasks completed.
## Phase N Implementation Complete
...

Session cut off (mid-execution):

{
  "type": "tool",
  "state": { "status": "pending" }   ← incomplete tool call at end
}
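The markers above can be checked mechanically. The sketch below is a rough classifier that assumes the output contains exactly the strings shown in these samples ("Session:", "All tasks completed.", and a trailing "status": "pending"); real output may differ between versions, and the function name `agent_state` is hypothetical. The cut-off result is only meaningful once the background task has exited.

```shell
# agent_state FILE: print one of: missing, starting, running, cut-off, completed.
# Marker strings are copied from the sample output; treat them as assumptions.
agent_state() {
  local file="$1"
  [ -f "$file" ] || { echo missing; return; }
  if grep -q 'All tasks completed\.' "$file"; then
    echo completed
  elif tail -n 20 "$file" | grep -q '"status": *"pending"'; then
    # A pending tool call near the end of the file: the session was cut off.
    echo cut-off
  elif grep -q 'Session:' "$file"; then
    echo running
  else
    echo starting
  fi
}
```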

Recovering a Cut-Off Session

When the background process exits early (exit code 0 but agent didn't finish):

  1. Get the session ID from the output file: grep "Session:" /tmp/output.file
  2. Check what files were created: ls crates/*/tests/ etc.
  3. Resume with a targeted follow-up prompt:

npx oh-my-opencode run \
  --agent hephaestus \
  --directory /repo \
  --session-id ses_37007a5aeffeEP0RWYWMAk56Yp \
  "Continue. You created X but still need: (1) add dev-deps to Cargo.toml (2) create Y tests (3) run cargo test"

Important: The session retains full context including file reads from the previous run, so the resume prompt can be brief — just specify what remains.
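Step 1 can be scripted so the session ID never has to be copied by hand. This is a minimal sketch that assumes the "Session: ses_..." line format shown in the sample output; the function name `session_id` is hypothetical.

```shell
# session_id FILE: print the first session ID found in an agent output file.
# Returns non-zero when no "Session: ses_..." line is present.
session_id() {
  local id
  id="$(sed -n 's/^.*Session:[[:space:]]*\(ses_[A-Za-z0-9]*\).*$/\1/p' "$1" | head -n 1)"
  [ -n "$id" ] && printf '%s\n' "$id"
}

# Usage, reusing the resume command from this section:
#   sid="$(session_id /tmp/phaseN.output)" && npx oh-my-opencode run \
#     --agent hephaestus --directory /repo --session-id "$sid" "Continue. ..."
```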


Diagnosing What an Agent Actually Did

# Export full session as JSON (includes all tool inputs/outputs)
opencode export ses_37007a5aeffeEP0RWYWMAk56Yp 2>&1 | tail -100

# List all sessions
opencode session list

# Check what files changed
git diff --stat HEAD
git status

Timing Heuristics

For a Rust workspace with ~5 crates:

  • Simple file creation (new test files): ~2–3 min for Hephaestus
  • Cargo.toml modifications + test writing + cargo test run: ~5–8 min
  • New crate + CLI integration + cargo check: ~8–12 min
  • README writing: ~2–3 min

Use sleep N before checking output. If output hasn't grown after 5 min, the agent is likely stuck.
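The "output hasn't grown" check can be made explicit by comparing the file size before and after a wait. This is a sketch only: `file_size` and `output_stalled` are hypothetical names, and the 300-second default simply mirrors the 5-minute rule of thumb above.

```shell
# file_size FILE: print the size in bytes, or 0 if the file does not exist yet.
file_size() {
  if [ -f "$1" ]; then
    wc -c < "$1" | tr -d ' '
  else
    echo 0
  fi
}

# output_stalled FILE [SECONDS]: succeed (exit 0) if FILE did not grow
# during the window, i.e. the agent is likely stuck.
output_stalled() {
  local file="$1" window="${2:-300}"
  local before after
  before="$(file_size "$file")"
  sleep "$window"
  after="$(file_size "$file")"
  [ "$after" -le "$before" ]
}
```

A typical call would be `output_stalled /tmp/phaseN.output && echo "agent may be stuck"`. Byte size is used rather than line count so that long single-line events still register as progress.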


Writing Effective Phase Prompts

Must-haves:

  1. File operation instructions — explicitly state cat > file <<'EOF' for new files, Edit tool for modifications
  2. Verification step — last step should be cargo check --workspace or cargo test --workspace with "fix errors and re-run"
  3. Exact API examples — for any library with tricky API (e.g., rmcp 0.10, specific Rust patterns)
  4. Numbered steps — oh-my-opencode (OMO) tracks these as todos; clear steps = clean loop termination

Avoid:

  • "Use Python scripts to write files" — agents go off-track writing the script
  • Vague steps like "update the CLI" — agent may interpret too narrowly or too broadly
  • Multiple verification commands without a priority — agents may stop at the first passing check

Template:

You are implementing Phase N of <project>: <description>.

## Project
<brief codebase description, relevant crate names>

## CRITICAL INSTRUCTIONS
- For NEW files: use `cat > /path <<'EOF' ... EOF`
- For EXISTING file modifications: use Edit tool (read first, then edit)
- Final step: run `cargo check --workspace 2>&1` and fix any errors
- Report final result

## Files to Create/Modify
<explicit list with exact paths and content>

## Steps
1. <step>
2. <step>
...
N. Run cargo check --workspace 2>&1, fix errors, report final result
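Filling in the template by hand for every phase is error-prone, so it can be generated. The sketch below is an assumption-laden example rather than part of any tool: `write_phase_prompt` is a hypothetical name, and it covers only the opening line, the fixed instructions, and the numbered steps, leaving the Project and Files sections for the caller to append.

```shell
# write_phase_prompt N PROJECT DESCRIPTION STEP...: write /tmp/phaseN-prompt.txt
# and print its path. Hypothetical helper for illustration.
write_phase_prompt() {
  local n="$1" project="$2" description="$3"
  local out="/tmp/phase${n}-prompt.txt"
  shift 3
  {
    printf 'You are implementing Phase %s of %s: %s.\n\n' "$n" "$project" "$description"
    # Quoted delimiter: the body is copied literally, backticks included.
    cat <<'PROMPT'
## CRITICAL INSTRUCTIONS
- For NEW files: use `cat > /path <<'EOF' ... EOF`
- For EXISTING file modifications: use Edit tool (read first, then edit)
- Final step: run `cargo check --workspace 2>&1` and fix any errors
- Report final result

## Steps
PROMPT
    # Remaining arguments become the numbered steps.
    local i=1 step
    for step in "$@"; do
      printf '%s. %s\n' "$i" "$step"
      i=$((i + 1))
    done
    # The verification step always comes last.
    printf '%s. Run cargo check --workspace 2>&1, fix errors, report final result\n' "$i"
  } > "$out"
  printf '%s\n' "$out"
}
```

Appending the verification step automatically keeps it as the final numbered item no matter how many steps a phase has.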

Model Selection Guide (NVIDIA NIM via opencode)

| Task | Best Model | Why |
|------|------------|-----|
| Complex multi-file implementation | qwen3.5-397b-a17b (Sisyphus) | Large, smart orchestrator |
| Autonomous deep codebase work | minimaxai/minimax-m2.1 (Hephaestus) | Reads deeply, self-corrects |
| Hard debugging / architecture | qwen3-next-80b-a3b-thinking (Oracle) | Reasoning model |
| Fast file creation / simple edits | qwen3-next-80b-a3b-instruct (Explore) | Fast, good instruction following |
| Planning before implementation | z-ai/glm4.7 (Prometheus) | Interview-mode planning |

Observation from session: Hephaestus (minimax-m2.1) outperforms qwen3-next on tasks that require understanding existing code before modifying it. It reads the codebase extensively, adapts tests to match real behavior, and self-corrects Edit tool hash conflicts.
