# Parallel Debugging
Framework for debugging complex issues using the Analysis of Competing Hypotheses (ACH) methodology with parallel agent investigation.
## When to Use This Skill
- Bug has multiple plausible root causes
- Initial debugging attempts haven't identified the issue
- Issue spans multiple modules or components
- Need systematic root cause analysis with evidence
- Want to avoid confirmation bias in debugging
## Hypothesis Generation Framework
Generate hypotheses across 6 failure mode categories:
### 1. Logic Error
- Incorrect conditional logic (wrong operator, missing case)
- Off-by-one errors in loops or array access
- Missing edge case handling
- Incorrect algorithm implementation
### 2. Data Issue
- Invalid or unexpected input data
- Type mismatch or coercion error
- Null/undefined/None where value expected
- Encoding or serialization problem
- Data truncation or overflow
### 3. State Problem
- Race condition between concurrent operations
- Stale cache returning outdated data
- Incorrect initialization or default values
- Unintended mutation of shared state
- State machine transition error
### 4. Integration Failure
- API contract violation (request/response mismatch)
- Version incompatibility between components
- Configuration mismatch between environments
- Missing or incorrect environment variables
- Network timeout or connection failure
### 5. Resource Issue
- Memory leak causing gradual degradation
- Connection pool exhaustion
- File descriptor or handle leak
- Disk space or quota exceeded
- CPU saturation from inefficient processing
### 6. Environment
- Missing runtime dependency
- Wrong library or framework version
- Platform-specific behavior difference
- Permission or access control issue
- Timezone or locale-related behavior
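The six categories above can be kept explicit in tooling so no failure mode is skipped during hypothesis generation. A minimal sketch (the names `FailureMode`, `Hypothesis`, and `generate_hypotheses` are illustrative, not part of the skill):

```python
from dataclasses import dataclass, field
from enum import Enum

class FailureMode(Enum):
    LOGIC = "logic error"
    DATA = "data issue"
    STATE = "state problem"
    INTEGRATION = "integration failure"
    RESOURCE = "resource issue"
    ENVIRONMENT = "environment"

@dataclass
class Hypothesis:
    mode: FailureMode
    statement: str                       # e.g. "off-by-one in pagination loop"
    evidence: list = field(default_factory=list)

def generate_hypotheses(candidates: dict) -> list:
    """One Hypothesis per (mode, statement) pair, iterating all six
    failure-mode categories so none is overlooked by accident."""
    return [
        Hypothesis(mode, text)
        for mode in FailureMode
        for text in candidates.get(mode, [])
    ]
```

Iterating the enum rather than the caller's dict keeps the "consider every category" discipline in the code itself: a category with no candidate hypotheses simply contributes nothing, but it is still visited.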
## Evidence Collection Standards
### What Constitutes Evidence
| Evidence Type | Strength | Example |
|---|---|---|
| Direct | Strong | Code at `file.ts:42` shows `if (x > 0)` where it should be `if (x >= 0)` |
| Correlational | Medium | Error rate increased after commit `abc123` |
| Testimonial | Weak | "It works on my machine" |
| Absence | Variable | No null check found in the code path |
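For arbitration later, it can help to encode the strength column numerically. The weights below are an illustrative assumption (the table leaves "Absence" variable; it is treated as medium here):

```python
# Illustrative strength weights for the four evidence types in the table.
# "absence" is context-dependent; it is pinned to medium for this sketch.
EVIDENCE_STRENGTH = {
    "direct": 3,         # strong: code-level proof with a file:line citation
    "correlational": 2,  # medium: timing or rate changes
    "testimonial": 1,    # weak: unverified reports
    "absence": 2,        # variable: treated as medium here
}

def evidence_score(evidence: list) -> int:
    """Total strength of a hypothesis's evidence log.

    Each entry is a dict such as {"type": "direct", "citation": "file.ts:42"}.
    """
    return sum(EVIDENCE_STRENGTH[e["type"]] for e in evidence)
```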
### Citation Format
Always cite evidence with file:line references:
**Evidence**: The validation function at `src/validators/user.ts:87`
does not check for empty strings, only null/undefined. This allows
empty email addresses to pass validation.
### Confidence Levels
| Level | Criteria |
|---|---|
| High (>80%) | Multiple direct evidence pieces, clear causal chain, no contradicting evidence |
| Medium (50-80%) | Some direct evidence, plausible causal chain, minor ambiguities |
| Low (<50%) | Mostly correlational evidence, incomplete causal chain, some contradicting evidence |
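The level bands reduce to a small mapping. A sketch that follows the table's boundaries exactly (strictly above 80% is high; 50-80% inclusive is medium):

```python
def confidence_level(p: float) -> str:
    """Map a probability estimate to the bands in the table above:
    >80% high, 50-80% medium, <50% low."""
    if p > 0.80:
        return "high"
    if p >= 0.50:
        return "medium"
    return "low"
```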
## Result Arbitration Protocol
After all investigators report:
### Step 1: Categorize Results
- Confirmed: High confidence, strong evidence, clear causal chain
- Plausible: Medium confidence, some evidence, reasonable causal chain
- Falsified: Evidence contradicts the hypothesis
- Inconclusive: Insufficient evidence to confirm or falsify
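The four categories above can be expressed as a small decision function. This is a sketch of one reasonable ordering (contradiction checked first, since falsification trumps confidence):

```python
def categorize(confidence: str, contradicted: bool, has_evidence: bool) -> str:
    """Assign one of the four arbitration categories described above."""
    if contradicted:
        return "falsified"       # evidence contradicts the hypothesis
    if not has_evidence:
        return "inconclusive"    # nothing to confirm or falsify with
    if confidence == "high":
        return "confirmed"
    if confidence == "medium":
        return "plausible"
    return "inconclusive"        # low confidence despite some evidence
```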
### Step 2: Compare Confirmed Hypotheses
If multiple hypotheses are confirmed, rank by:
- Confidence level
- Number of supporting evidence pieces
- Strength of causal chain
- Absence of contradicting evidence
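The four criteria above form a lexicographic sort key. A sketch, assuming each confirmed hypothesis is a dict with `confidence`, `evidence_count`, `causal_strength` (an investigator-assessed 0.0-1.0 score, an assumption of this sketch), and `contradicted` fields:

```python
LEVEL_RANK = {"low": 0, "medium": 1, "high": 2}

def rank_confirmed(hypotheses: list) -> list:
    """Order confirmed hypotheses by the four criteria above, in priority
    order: confidence, supporting-evidence count, causal-chain strength,
    then absence of contradicting evidence."""
    return sorted(
        hypotheses,
        key=lambda h: (
            LEVEL_RANK[h["confidence"]],
            h["evidence_count"],
            h["causal_strength"],
            not h["contradicted"],   # True (no contradiction) sorts higher
        ),
        reverse=True,
    )
```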
### Step 3: Determine Root Cause
- If one hypothesis clearly dominates: declare as root cause
- If multiple hypotheses are equally likely: may be compound issue (multiple contributing causes)
- If no hypotheses confirmed: generate new hypotheses based on evidence gathered
### Step 4: Validate Fix
Before declaring the bug fixed:
- Fix addresses the identified root cause
- Fix doesn't introduce new issues
- Original reproduction case no longer fails
- Related edge cases are covered
- Relevant tests are added or updated
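The five criteria above work well as an explicit gate rather than an informal mental check. A minimal sketch (the criterion keys are illustrative names):

```python
# Illustrative keys for the five validation criteria listed above.
FIX_CHECKLIST = (
    "addresses_root_cause",
    "no_new_issues",
    "repro_case_passes",
    "edge_cases_covered",
    "tests_updated",
)

def fix_is_validated(checks: dict) -> bool:
    """True only when every criterion is explicitly marked satisfied;
    a missing key counts as unsatisfied."""
    return all(checks.get(item, False) for item in FIX_CHECKLIST)
```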
## Source
https://github.com/wshobson/agents/blob/main/plugins/agent-teams/skills/parallel-debugging/SKILL.md

## Overview
Parallel Debugging uses the Analysis of Competing Hypotheses (ACH) framework to tackle complex bugs with multiple plausible causes. It organizes parallel investigations across six failure modes, collects structured evidence, and applies a formal arbitration protocol to converge on the true root cause and remediation plan.
## How This Skill Works
Engineers generate hypotheses across six failure modes (Logic, Data, State, Integration, Resource, Environment). Multiple investigators run parallel probes and collect evidence using Direct, Correlational, Testimonial, and Absence types with file:line citations. Findings are categorized (Confirmed, Plausible, Falsified, Inconclusive) and then subjected to the Result Arbitration Protocol to converge on the root cause and next steps.
## When to Use It
- When a bug has multiple plausible root causes and a single-path debugging approach is inconclusive
- When initial debugging attempts haven’t identified the issue
- When the issue spans multiple modules or components and requires cross-team investigation
- When you need systematic root cause analysis supported by evidence
- When you want to reduce confirmation bias by examining competing hypotheses in parallel
## Quick Start
- Step 1: Define hypotheses across Logic, Data, State, Integration, Resource, and Environment; assign investigators and evidence targets
- Step 2: Launch parallel investigations and collect evidence with file:line citations; log Direct, Correlational, Testimonial, and Absence data
- Step 3: Apply the arbitration protocol to categorize results (Confirmed, Plausible, Falsified, Inconclusive) and converge on the root cause with a remediation plan
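Step 2's parallel launch can be sketched with a thread pool, assuming each investigator is a zero-argument callable that returns an evidence report (names here are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def investigate_all(investigators: dict) -> dict:
    """Run every investigator concurrently and collect reports keyed by
    hypothesis name. Each investigator is a zero-argument callable
    returning its evidence report."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(probe)
                   for name, probe in investigators.items()}
        # result() re-raises any exception from a probe, so a crashed
        # investigation surfaces instead of silently disappearing.
        return {name: future.result() for name, future in futures.items()}
```

In practice each probe would shell out to logs, tests, or profilers; the point of the sketch is that all hypotheses are investigated before any arbitration happens, which is what keeps confirmation bias out of the loop.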
## Best Practices
- Define explicit hypotheses for all six failure modes (Logic, Data, State, Integration, Resource, Environment) at the outset
- Run parallel investigations with clear ownership and a shared evidence log using file:line citations
- Collect and categorize evidence (Direct, Correlational, Testimonial, Absence) and record confidence levels
- Preserve an objective trace of hypothesis evolution; avoid pruning evidence prematurely
- Apply the arbitration protocol (Categorize Results: Confirmed, Plausible, Falsified, Inconclusive) to converge on a root cause
## Example Use Cases
- A flaky cache yields stale data across modules; parallel hypotheses test cache invalidation timing, synchronization, and data race, with evidence from logs and memory dumps
- An API contract violation after a library upgrade; investigators assess logic, serialization, version compatibility, and environment configuration with citation-backed evidence
- A memory leak appearing under specific workloads; hypotheses cover resource management, mutation of shared state, and GC behavior, supported by heap snapshots and event traces
- A race condition between concurrent handlers; evidence includes interleaved logs, timing diagrams, and reproduction steps across components
- Timezone-related behavior causing intermittent failures in distributed services; hypotheses examine environment, locale handling, and scheduling differences with precise file:line evidence