A template/runbook to use when your agent gets stuck and is starting to flail about.

DEBUG SESSION TEMPLATE

Use this document as the structured workspace whenever a complex issue arises (e.g., “app fails to launch if it’s raining outside”). The goal is to capture observations, develop and test hypotheses systematically, and leave a durable record that future agents can build on. Follow the sections in order; update them as you work. Keep entries concise but specific.


1. Context & Scope

  • Problem statement: Describe the issue in one sentence. Include any user reports, reproducibility notes, or triggering conditions.
  • Environment: List OS, hardware, app version/commit, and any relevant runtime flags or services.
  • Guardrails: Note any actions that are out-of-bounds (e.g., no network access, no external APIs, approvals needed).
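For instance, a filled-in Context & Scope block for the rain example might look like the sketch below (every detail, including versions, commit hash, and owner, is invented for illustration):

```markdown
## 1. Context & Scope

- Problem statement: App crashes on launch whenever the local weather feed reports rain; reproduced 3/3 times on QA devices.
- Environment: macOS 14.5, MacBook Pro M2, app v2.3.1 (commit abc1234), staging weather endpoint.
- Guardrails: No production API calls; schema changes require approval from @owner.
```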

Agent prompt: “Summarize what we are debugging, where it happens, and any constraints before touching code.”


2. Observations

Document facts only (what you saw, not why you think it happened). Update this list as new evidence emerges.

  • Timestamped entries are encouraged (e.g., [2025-10-14 09:32] Launching app with rain condition triggers crash dialog.)
  • Note relevant logs, screenshots, stack traces, or CLI outputs (reference file paths or commands instead of pasting long dumps).
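A hypothetical Observations log for the rain example, with invented timestamps, entry IDs (O1–O3), and file paths, might read:

```markdown
## 2. Observations

- [2025-10-14 09:32] O1: Launching with the weather feed reporting rain shows a crash dialog within ~2 s.
- [2025-10-14 09:41] O2: The same launch with the feed mocked to "clear" starts normally.
- [2025-10-14 09:55] O3: Crash log (logs/crash_2025-10-14.txt) ends in the forecast parser.
```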

Agent prompt: “After each experiment, record the factual outcome here.”


3. Hypotheses

Maintain a running list of explanations ranked roughly by plausibility. Each hypothesis should map to one or more observations.

  • Include references to observation bullet numbers when possible.
  • Mark hypotheses as ✅ validated or ❌ rejected; explain briefly why.
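A minimal sketch of such a list, reusing the invented observation IDs from the example above:

```markdown
## 3. Hypotheses

- H1 (O1, O3): The forecast parser mishandles the precipitation field present in rain payloads. ❌ rejected: parser unit tests pass with and without the field (see E1).
- H2 (O1, O2): The rain icon asset fails to load and crashes the renderer. ✅ validated by E2.
```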

Agent prompt: “For every observation cluster, what could explain it? Keep hypotheses distinct from one another and revisit them after tests.”


4. Experiments & Results

For each experiment:

  1. Plan: Describe the change/test you intend to run. Link it to the hypothesis you’re testing.
  2. Action: Summarize what you actually did (commands run, code edited).
  3. Result: Record the outcome (pass/fail, new error, etc.).
  4. Impact on hypothesis: Mark whether the target hypothesis is strengthened, weakened, or resolved.

Use a bullet or subheading per experiment. Keep the sequence chronological. Reference git commits or diffs if applicable.
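As a sketch, an Experiments entry following this structure (hypothesis IDs, commit hashes, and outcomes all invented for illustration) might look like:

```markdown
### E1: testing H1 (parser precipitation handling)

1. Plan: Feed the parser a captured rain payload with the precipitation field removed; H1 predicts a crash.
2. Action: Ran the parser unit tests against the modified fixture (commit abc1234).
3. Result: All tests passed; no crash.
4. Impact on hypothesis: H1 weakened → rejected.
```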

Agent prompt: “Before editing, state the experiment. Afterward, record exactly what happened and how it affects the hypothesis.”


5. Interim Conclusions

Summarize what you currently believe to be true about the issue. Highlight remaining unknowns or risks.

  • If the root cause is identified, describe it along with evidence.
  • If not, outline the next decision point (e.g., “Need to inspect weather API integration; blocked on credentials.”).

Agent prompt: “Pause periodically to synthesize. Are we closer to the root cause? What remains uncertain?”


6. Next Steps / Hand-off Notes

List actionable items for yourself or the next agent:

  • Tests to run, logs to capture, approvals to request.
  • Code areas to inspect or revert.
  • External follow-ups (e.g., “Confirm weather API availability with ops.”)

Include @mentions or owner names if handing off to a teammate.

Agent prompt: “If you had to stop now, what instructions would you leave so the next person can continue seamlessly?”


7. Appendix (Optional)

  • Links to supporting documents, design specs, or past incidents.
  • Snippets of logs or stack traces (if short). For long artifacts, store them separately and reference the path.
  • Glossary for domain-specific terms encountered during the session.

Methodology Enhancements for Agentic Work

  • Consistent prompts: Use the inline prompts to stay disciplined. Treat each section as a mini checklist.
  • Versioning: If the debugging session spans multiple days or agents, append the date to the filename (e.g., DEBUG_SESSION_2025-10-14.md) and link to prior sessions.
  • Tool usage log: When agents invoke tools (CLI commands, web queries, apply_patch), record the tool and purpose in Observations or Experiments to maintain traceability.
  • Outcome tagging: Consider tagging hypotheses or experiments with IDs (H1, E1...) to make cross-referencing easier.
  • Post-mortem-ready notes: Aim for content that could drop directly into a post-mortem or knowledge base article; future agents should understand the narrative without re-reading console history.
  • Document lifecycle: Keep one document per issue so the full history stays coherent. When the file grows unwieldy (rule of thumb: >500 lines or hard to scan), summarize the current state, archive the file (e.g., DEBUG_issue_2025-11-02_v1.md), and start a continuation (…_v2.md) that links back. Add a short recap at the top of the new file so readers don’t re-read the entire history.
  • Incremental reproduction: If progress stalls, identify the last known-good state (commit, build, or feature subset) and reintroduce functionality step-by-step. Note each increment in Experiments/Results; this binary-search mindset often surfaces the precise change that reintroduces the issue.
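As a hypothetical illustration of that binary-search loop (git bisect automates the same idea over commits), the resulting Experiments entries might read:

```markdown
- E5: Checked out last known-good commit abc1234; rain launch works.
- E6: Reapplied the first half of the intervening commits; rain launch still works.
- E7: Reapplied the second half; crash returns, so the culprit lies in that range. Continue halving.
```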

Remember: this template is a guide, not a straitjacket. Adjust sections as needed, but keep the core loop—observe → hypothesize → experiment → conclude—intact.

© 2025 Really Him
