**STOP — DO NOT READ THIS FILE. You are already reading it. This prompt was injected into your context by Claude Code's plugin system. Using the Read tool on this SKILL.md file wastes ~7,600 tokens. Begin executing Step 1 immediately.**
Step 0 — Immediate Output
Before ANY tool calls, display this banner:
╔══════════════════════════════════════════════════════════════╗
║ PLAN-BUILD-RUN ► DEBUGGING ║
╚══════════════════════════════════════════════════════════════╝
Then proceed to Step 1.
$pbr-debug — Systematic Debugging
You are running the debug skill. Your job is to run a structured, hypothesis-driven debugging session that persists across conversations. You track every hypothesis, test, and finding in a debug file so work is never lost.
This skill spawns Task(subagent_type: "pbr:debugger") for investigation work.
Context Budget
Reference: skills/shared/context-budget.md for the universal orchestrator rules.
Additionally for this skill:
- Never perform investigation work yourself — delegate ALL analysis to the debugger subagent
- Minimize reading debug file content — read only the latest hypothesis and result section
- Delegate all code reading, hypothesis testing, and fix attempts to the debugger subagent
Core Principle
Debug systematically, not randomly. Every investigation step must have a hypothesis, a test, and a recorded result. No "let me just try this" — every action has a reason and is documented.
Flow
Step 1: Ensure Debug Directory Exists
Before any file operations, ensure both directories exist by running:
mkdir -p .planning/debug
This handles the case where neither .planning/ nor .planning/debug/ exist yet (debug can be run before other skills that create .planning/). Do NOT skip this step — writing files to a non-existent directory will fail.
Step 2: Check for Active Debug Sessions
Load depth profile: Run node ${PLUGIN_ROOT}/scripts/pbr-tools.js config resolve-depth to get debug.max_hypothesis_rounds. If the command fails (no config.json or CLI error), default to 5 rounds. Initialize a round counter at 0. This counter increments each time a continuation debugger is spawned.
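The fallback logic above can be sketched as follows. This is a minimal sketch assuming the CLI prints JSON containing a `debug.max_hypothesis_rounds` field — the exact output shape is an assumption, not a documented contract:

```javascript
// Parse debug.max_hypothesis_rounds from the resolve-depth CLI output,
// falling back to 5 rounds on any error (missing config.json, bad JSON).
// NOTE: the JSON shape assumed here is hypothetical.
function parseMaxRounds(cliOutput) {
  const DEFAULT_ROUNDS = 5;
  try {
    const rounds = JSON.parse(cliOutput)?.debug?.max_hypothesis_rounds;
    return Number.isInteger(rounds) && rounds > 0 ? rounds : DEFAULT_ROUNDS;
  } catch {
    return DEFAULT_ROUNDS;
  }
}
```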
Scan .planning/debug/ for existing debug files:
.planning/debug/
{NNN}-{slug}.md # Each debug session is a file
Read each file's frontmatter to check status:
- `status: gathering` — collecting symptoms from user
- `status: investigating` — testing hypotheses
- `status: fixing` — applying fix
- `status: verifying` — confirming fix works
- `status: resolved` — session complete
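The status scan can be sketched as below — a minimal frontmatter reader that treats any non-resolved status as active (full YAML parsing is out of scope):

```javascript
// Any status other than "resolved" counts as an active session.
const ACTIVE_STATUSES = new Set(["gathering", "investigating", "fixing", "verifying"]);

// Extract the status field from a debug file's YAML frontmatter block,
// assuming the block is delimited by "---" lines.
function frontmatterStatus(fileText) {
  const fm = fileText.match(/^---\n([\s\S]*?)\n---/);
  if (!fm) return null;
  const line = fm[1].split("\n").find((l) => l.startsWith("status:"));
  return line ? line.slice("status:".length).trim() : null;
}

function isActiveSession(fileText) {
  return ACTIVE_STATUSES.has(frontmatterStatus(fileText));
}
```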
If active sessions found:
Use the debug-session-select pattern from skills/shared/gate-prompts.md:
question: "Found active debug sessions. Which would you like?"
Generate options dynamically from active sessions:
- Each active session becomes an option: label "#{NNN}: {title}", description "Started {date}, last: {last hypothesis}"
- Always include "New session" as the last option: description "Start a fresh debug investigation"
- If more than 3 active sessions exist, show only the 3 most recent plus "New session" (max 4 options)
Handle responses:
- If user selects an existing session: go to Resume Flow (Step 3b)
- If user selects "New session": go to New Session Flow (Step 3a)
- If user types a session number not in the list: look it up and resume it
If no active sessions found:
- Go to New Session Flow (Step 3a)
Step 3a: New Session Flow
Gather Symptoms
If $ARGUMENTS is provided and descriptive:
- Use it as the initial issue description
- Still ask targeted follow-up questions
If $ARGUMENTS is empty or minimal:
- Ask the user for symptoms
Symptom gathering questions (ask as plain text — these are freeform, do NOT use AskUserQuestion):
- Expected behavior: "What should happen?"
- Actual behavior: "What actually happens? Include error messages if any."
- Reproduction: "How do you trigger this? Steps to reproduce?"
- Onset: "When did this start? Did anything change recently (new code, dependency update, config change)?"
- Scope: "Does this affect everything or just specific cases? Any patterns?"
Optional follow-ups (ask if relevant):
- "What have you already tried?"
- "Does this happen in all environments (dev, prod, test)?"
- "Any relevant log output?"
Generate Session ID
- Scan `.planning/debug/` for existing files
- Extract NNN prefixes
- Next number = highest + 1 (start at 001)
- Generate slug from issue title (same rules as quick task slugs)
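The numbering and slug rules above can be sketched as follows (the slugifier is a hypothetical stand-in — the real quick-task slug rules may differ):

```javascript
// Next NNN = highest existing three-digit prefix + 1, zero-padded,
// starting at 001 when no files exist.
function nextSessionId(existingFiles) {
  const nums = existingFiles
    .map((name) => /^(\d{3})-/.exec(name))
    .filter(Boolean)
    .map((m) => parseInt(m[1], 10));
  const next = (nums.length ? Math.max(...nums) : 0) + 1;
  return String(next).padStart(3, "0");
}

// Hypothetical slug rules: lowercase, non-alphanumerics collapsed to hyphens.
function slugify(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}
```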
Create Debug File
Create .planning/debug/{NNN}-{slug}.md:
---
id: "{NNN}"
title: "{issue title}"
status: gathering
created: "{ISO date}"
updated: "{ISO date}"
severity: "{critical|high|medium|low}"
category: "{runtime|build|test|config|integration|unknown}"
---
# Debug Session: {title}
## Symptoms
**Expected:** {expected behavior}
**Actual:** {actual behavior}
**Reproduction:** {steps}
**Onset:** {when it started}
**Scope:** {affected areas}
## Environment
- OS: {detected or reported}
- Runtime: {node version, python version, etc.}
- Relevant config: {any config that matters}
## Investigation Log
### Round 1 (automated)
{This section is filled by debugger}
## Hypotheses
| # | Hypothesis | Status | Evidence |
|---|-----------|--------|----------|
| 1 | {hypothesis} | {testing/confirmed/rejected} | {evidence} |
## Root Cause
{Filled when found}
## Fix Applied
{Filled when fixed}
## Timeline
- {ISO date}: Session created
Spawn Debugger
Display to the user: ◐ Spawning debugger...
Spawn Task(subagent_type: "pbr:debugger") with the prompt template.
Read skills/debug/templates/initial-investigation-prompt.md.tmpl for the spawn prompt. Fill in the {NNN}, {slug}, and symptom placeholders with values from the debug file created above.
Step 3b: Resume Flow
- Read the debug file content
- Parse the investigation log and hypotheses table
- Display to user:
Resuming debug session #{NNN}: {title}
Last state:
- Hypotheses tested: {N}
- Confirmed: {list or "none yet"}
- Rejected: {list}
- Current lead: {most promising hypothesis}
Continuing investigation...
- Increment the round counter (resuming counts as a new round). Display to the user: `◐ Spawning debugger (resuming session #{NNN}, round {N})...`
- Spawn `Task(subagent_type: "pbr:debugger")` with the continuation prompt template. Read `skills/debug/templates/continuation-prompt.md.tmpl` for the spawn prompt. Fill in the `{NNN}`, `{slug}`, and `{paste investigation log...}` placeholders with data from the debug file.
Step 4: Handle Debugger Results
When the debugger agent completes, first check for completion markers in the Task() output before routing:
| Marker in Task() Output | Route To |
|---|---|
| `## DEBUG COMPLETE` | ROOT CAUSE FOUND + FIX path |
| `## ROOT CAUSE FOUND` | ROOT CAUSE FOUND (no fix) path |
| `## DEBUG SESSION PAUSED` | CHECKPOINT path |
| No marker found | INCONCLUSIVE path |
Spot-check: Before routing, verify .planning/debug/{NNN}-{slug}.md exists and was recently updated (modified timestamp is newer than the Task() spawn time). If the debug file was not updated, warn: ⚠ Debug file not updated by agent — results may be incomplete.
Display: ✓ Debug session complete — {N} hypotheses tested (read the hypothesis count from the debug file's Hypotheses table).
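The marker routing can be sketched as an ordered check over the Task() output (the returned labels are illustrative names, not part of the skill's contract):

```javascript
// Route debugger output to a handler path based on completion markers.
// "## DEBUG COMPLETE" is checked first; absence of all markers routes
// to the inconclusive path.
function routeDebuggerResult(taskOutput) {
  if (taskOutput.includes("## DEBUG COMPLETE")) return "root-cause-found-and-fixed";
  if (taskOutput.includes("## ROOT CAUSE FOUND")) return "root-cause-found-no-fix";
  if (taskOutput.includes("## DEBUG SESSION PAUSED")) return "checkpoint";
  return "inconclusive"; // no marker found
}
```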
The debugger returns one of four outcomes:
ROOT CAUSE FOUND + FIX
Root cause identified: {cause}
Fix applied: {description}
Commit: {hash}
Actions:
- Update debug file:
  - Set `status: resolved`
  - Fill "Root Cause" section
  - Fill "Fix Applied" section
  - Add timeline entry
- Update STATE.md if it has a Debug Sessions section
- Report to user with branded output:
╔══════════════════════════════════════════════════════════════╗
║ PLAN-BUILD-RUN ► BUG RESOLVED ✓ ║
╚══════════════════════════════════════════════════════════════╝
**Session #{NNN}:** {title}
**Root cause:** {cause}
**Fix:** {description}
**Commit:** {hash}
╔══════════════════════════════════════════════════════════════╗
║ ▶ NEXT UP ║
╚══════════════════════════════════════════════════════════════╝
**Continue your workflow** — the bug is fixed
`$pbr-status`
<sub>`/clear` first → fresh context window</sub>
**Also available:**
- `$pbr-continue` — execute next logical step
- `$pbr-review {N}` — verify the current phase
ROOT CAUSE FOUND (no fix)
Used when the debugger was invoked with find_root_cause_only or when the fix is too complex for auto-application.
Root cause identified: {cause}
Suggested fix: {approach}
Actions:
- Update debug file:
  - Set `status: resolved`
  - Fill "Root Cause" section
  - Add suggested fix to notes
- Suggest next steps to user:
╔══════════════════════════════════════════════════════════════╗
║ ▶ NEXT UP ║
╚══════════════════════════════════════════════════════════════╝
**Apply the fix** — root cause identified, fix needed
`$pbr-quick {fix description}`
<sub>`/clear` first → fresh context window</sub>
**Also available:**
- `$pbr-plan` — for complex fixes that need planning
- `$pbr-status` — see project status
CHECKPOINT
The debugger found something but needs user input or more investigation.
Investigation progress:
- Tested: {hypotheses tested}
- Found: {key finding}
- Need: {what's needed to continue}
Actions:
- Update debug file with findings so far
- Present checkpoint to user
- Use the debug-checkpoint pattern from `skills/shared/gate-prompts.md`: question: "Investigation has reached a checkpoint. How should we proceed?"
Handle responses:
- "Continue": Increment the round counter (e.g., round 1 becomes round 2). Then display `◐ Spawning debugger (continuing investigation, round {N})...` and spawn another `Task(subagent_type: "pbr:debugger")` with updated context from the debug file
- "More info": Increment the round counter. Ask the user freeform what additional context they have, then update the debug file and spawn another debugger
- "New approach": Increment the round counter. Ask the user freeform what alternative angle to try, then update hypotheses and spawn another debugger
INCONCLUSIVE
The debugger exhausted its hypotheses without finding the root cause.
Investigation exhausted:
- Tested: {all hypotheses}
- Rejected: {list}
- Remaining unknowns: {list}
Actions:
- Update debug file with all findings
- Report to user:
- What was tested and eliminated
- What remains unknown
- Suggested next investigation approaches:
- Different reproduction steps
- Log-level debugging
- Environment comparison
- Bisect (git bisect to find the breaking commit)
- External help (stack overflow, docs)
- Keep session active for future resumption
Debugger Investigation Protocol
The debugger agent follows this protocol internally:
Hypothesis-Driven Investigation
1. OBSERVE: Read error messages, logs, code around the failure point
2. HYPOTHESIZE: "The most likely cause is X because Y"
3. PREDICT: "If X is the cause, then test Z should show W"
4. TEST: Execute test Z
5. EVALUATE:
- Result matches prediction → hypothesis supported → investigate deeper
- Result contradicts → hypothesis rejected → try next hypothesis
- Result is unexpected → new information → form new hypothesis
Investigation Techniques
| Technique | When to Use |
|---|---|
| Stack trace analysis | Error with stack trace available |
| Code path tracing | Logic error, wrong behavior |
| Log injection | Need to see runtime values |
| Binary search | Know it worked before, need to find when it broke |
| Isolation | Complex system, need to narrow scope |
| Comparison | Works in one case, fails in another |
| Dependency audit | Recent dependency changes |
| Config diff | Works in one environment, not another |
Evidence Quality
| Quality | Description | Action |
|---|---|---|
| Strong | Directly proves/disproves hypothesis | Record and move on |
| Moderate | Suggests but doesn't prove | Record, seek corroboration |
| Weak | Tangentially related | Note but don't base decisions on it |
| Misleading | Red herring | Record as eliminated, explain why |
Hypothesis Round Limit
The maximum number of investigation rounds is controlled by the depth profile's debug.max_hypothesis_rounds setting:
- quick: 3 rounds (fast, surface-level investigation)
- standard: 5 rounds (default)
- comprehensive: 10 rounds (deep investigation)
The orchestrator tracks the round count. Before spawning each continuation debugger (Step 4 "CHECKPOINT" -> "Continue"), increment the round counter. If the counter reaches the limit:
- Do NOT spawn another debugger
- Present to user: "Debug session has reached the hypothesis round limit ({N} rounds for {depth} mode). Options:"
Use AskUserQuestion:
  question: "Reached {N}-round hypothesis limit. How should we proceed?"
  header: "Debug Limit"
  options:
    - label: "Extend"
      description: "Allow {N} more rounds (doubles the limit)"
    - label: "Wrap up"
      description: "Record findings so far and close the session"
    - label: "Escalate"
      description: "Save context for manual debugging"
- If "Extend": double the limit and continue
- If "Wrap up": update debug file to `status: resolved` with `resolution: abandoned`, record all findings, suggest next steps
- If "Escalate": write a detailed handoff document to the debug file with all hypotheses, evidence, and suggested manual investigation steps
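The limit enforcement can be sketched as below. The `ask_user` branch stands in for the AskUserQuestion prompt, and the function names are illustrative:

```javascript
// Increment the round counter before each continuation spawn; when the
// counter reaches the limit, stop and ask the user instead of spawning.
function beforeContinuation(state) {
  state.round += 1;
  if (state.round >= state.maxRounds) {
    return { action: "ask_user" }; // Extend / Wrap up / Escalate
  }
  return { action: "spawn", round: state.round };
}

// "Extend" doubles the current limit so investigation can continue.
function extendLimit(state) {
  state.maxRounds *= 2;
  return state;
}
```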
Debug File Management
Lifecycle
gathering → investigating → fixing → verifying → resolved
(any non-resolved) → resolved (with resolution: abandoned, if 7+ days stale)
(any non-resolved) → (same) (resumed after pause)
Staleness Detection
When scanning for active sessions, check the updated date. If more than 7 days old:
- Set `status: resolved` with `resolution: abandoned` in frontmatter
- Still offer to resume, but warn: "This session is {N} days old. Context may have changed."
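Staleness is a date comparison against the frontmatter `updated` field, for example:

```javascript
const STALE_MS = 7 * 24 * 60 * 60 * 1000; // 7 days in milliseconds

// A session is stale when its `updated` timestamp is more than 7 days
// before "now" (injected as a parameter for testability).
function isStale(updatedIso, now = new Date()) {
  return now.getTime() - new Date(updatedIso).getTime() > STALE_MS;
}
```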
Cleanup
Old resolved debug files can accumulate. They serve as a knowledge base for similar issues. Do NOT auto-delete them.
State Integration
Update STATE.md Debug Sessions section (create if needed):
### Debug Sessions
| # | Issue | Status | Root Cause |
|---|-------|--------|------------|
| 001 | Login timeout | resolved | DB connection pool exhausted |
| 002 | CSS not loading | active | investigating |
Git Integration
Reference: skills/shared/commit-planning-docs.md for the standard commit pattern.
If planning.commit_docs: true in config.json:
- New session: `docs(planning): open debug session {NNN} - {slug}`
- Resolution: `docs(planning): resolve debug session {NNN} - {root cause summary}`
- Fix commits use standard format: `fix({scope}): {description}`
Edge Cases
User provides a stack trace or error in arguments
- Parse it as the "Actual behavior" symptom
- Extract key information: error type, file, line number
- Use this to form initial hypotheses immediately
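The extraction step can be sketched as below, assuming Node-style `at fn (file:line:col)` frames — other runtimes (Python, JVM) need different patterns:

```javascript
// Pull error type, message, file, and line out of a stack trace passed
// in $ARGUMENTS. The frame regex assumes Node-style stack frames.
function parseStackTrace(trace) {
  const typeMatch = trace.match(/^(\w*Error): (.*)$/m);
  const frameMatch = trace.match(/at .*? \(?([^()\s]+):(\d+):\d+\)?/);
  return {
    errorType: typeMatch ? typeMatch[1] : null,
    message: typeMatch ? typeMatch[2] : null,
    file: frameMatch ? frameMatch[1] : null,
    line: frameMatch ? parseInt(frameMatch[2], 10) : null,
  };
}
```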
Debugger agent fails
If the debugger Task() fails or returns an error, display:
╔══════════════════════════════════════════════════════════════╗
║ ERROR ║
╚══════════════════════════════════════════════════════════════╝
Debugger agent failed for session #{NNN}.
**To fix:**
- Check the debug file at `.planning/debug/{NNN}-{slug}.md` for partial findings
- Re-run with `$pbr-debug` to resume the session
- If the issue persists, try a fresh session with different symptom details
Issue is in a dependency (not user code)
- Document which dependency and version
- Check if there's a known issue (search patterns in node_modules, site-packages, etc.)
- Suggest: update dependency, pin version, or work around
Issue is intermittent
- Note intermittency in symptoms
- Suggest: run multiple times, add timing/logging, check for race conditions
- Investigation must account for non-deterministic reproduction
Multiple issues interacting
- If investigation reveals multiple separate issues, split into separate debug sessions
- Create additional debug files
- Track each independently
Fix would be a breaking change
- Report the root cause but DO NOT auto-apply the fix
- Present the trade-offs to the user
- Let the user decide how to proceed
Anti-Patterns
- DO NOT skip hypothesis formation — every test must have a reason
- DO NOT make random changes hoping something works
- DO NOT ignore failed hypotheses — record why they failed
- DO NOT exceed the depth profile's `debug.max_hypothesis_rounds` limit without user confirmation (default: 5 for standard mode)
- DO NOT fix the symptom instead of the root cause
- DO NOT auto-apply fixes for breaking changes
- DO NOT delete debug files — they're a knowledge base
- DO NOT combine multiple bugs into one debug session
- DO NOT skip updating the debug file after each investigation round
- DO NOT start a new session when an active one covers the same issue
Source
https://github.com/SienkLogic/plan-build-run/blob/main/plugins/codex-pbr/skills/debug/SKILL.md
Overview
The debug skill guides you through structured, hypothesis-driven debugging that persists across conversations. It delegates analysis to a debugger subagent and records every hypothesis, test, and result so work isn't lost between sessions.
How This Skill Works
It spawns a pbr:debugger subagent to perform analysis while the orchestrator tracks hypotheses and routes results. It enforces a workflow: ensure the .planning/debug directory exists, load the depth profile, and scan existing debug files to resume an active session or start a new one. Each round logs status, hypothesis, test, and result to a persistent file.
When to Use It
- Diagnose intermittent or multi-step bugs that span sessions
- Coordinate a hypothesis-driven investigation with a dedicated debugger subagent
- Resume prior debugging sessions without losing context
- Organize multiple investigations in a central .planning/debug store
- Verify fixes incrementally by recording tests and outcomes
Quick Start
- Step 1: mkdir -p .planning/debug to ensure the directory exists
- Step 2: Load depth config to determine max hypothesis rounds (node ${PLUGIN_ROOT}/scripts/pbr-tools.js config resolve-depth)
- Step 3: Start a new session or resume an existing one, then log hypothesis, test, and result after each action
Best Practices
- Define a clear hypothesis before each test
- Log every hypothesis, test, and result in the debug file
- Delegate code understanding and testing to the debugger subagent
- Create the .planning/debug directory before starting
- Limit the number of hypothesis rounds based on the config (depth) to avoid drift
Example Use Cases
- Flaky API responses tracked across sessions with saved hypotheses
- Sequential bug fixes in a build pipeline using checkpointed tests
- User-reported bug reproduced by following structured steps
- Race condition investigation with persistent context between prompts
- Post-fix verification by re-running tests against saved hypotheses