| name | description |
|---|---|
log-triage-playbook |
Structured triage for long build, deployment, and runtime log threads. Use when users paste large logs and ask for root-cause analysis, recovery steps, or troubleshooting guidance. Produce a fixed response with ranked root causes, exact verify commands, minimal patch, and rollback. |
Use this skill to turn noisy failure logs into a repeatable triage result.
Capture before analysis:
- Failing command, pipeline/job/stage, and environment (dev/test/prod)
- First failing timestamp and first hard error line
- Relevant context window around the first hard error (about +/- 40 lines)
- Recent change context (branch, commit, PR, or work item)
If key inputs are missing, state exactly what is missing and continue with best-effort analysis.
- Find the first hard failure.
- Prioritize fatal markers (
ERROR,Exception, non-zero exit) over downstream noise.
- Prioritize fatal markers (
- Separate primary failure from cascade failures.
- Treat repeated follow-up errors as symptoms unless they have independent evidence.
- Build and rank root-cause hypotheses.
- Rank by evidence strength and blast radius.
- Add deterministic verification.
- Give copy/paste commands for each top hypothesis.
- Propose the smallest safe fix.
- Prefer one minimal patch over broad refactors.
- Provide rollback or containment.
- Include the fastest safe fallback if the patch fails.
Return this structure in order:
Failure point: first hard error (with line/time reference if available)Likely scope: component/service affectedRisk: low/medium/high
| Rank | Hypothesis | Evidence | Confidence |
|---|---|---|---|
| 1 | ... | ... | High/Medium/Low |
| 2 | ... | ... | High/Medium/Low |
| 3 | ... | ... | High/Medium/Low |
Provide exact commands, one block per hypothesis:
# Verify hypothesis 1
...# Verify hypothesis 2
...- Smallest code/config change to address the top verified hypothesis
- Affected files/components
- Why this is minimal
- Immediate fallback action
- Command or deployment action to revert safely
- Condition for triggering rollback
- If verify #1 is true -> apply minimal patch
- If verify #1 is false and #2 is true -> apply fallback patch for #2
- If all verifies are false -> request the next exact data slice needed
- Do not claim certainty without log evidence.
- Do not skip rollback guidance.
- Do not recommend large refactors during incident triage.
- Keep commands environment-aware and explicit (project, resource group, namespace, branch).