
debugging

npx machina-cli add skill tslateman/duet/debugging --openclaw

Systematic Debugging

Overview

Apply the scientific method to software failures. Observe before theorizing. Form hypotheses, design experiments, narrow the cause. Refuse to jump to fixes until the cause is isolated.

The Scientific Debugging Loop

  1. Observe — What exactly happens? What did you expect?
  2. Hypothesize — What could cause this specific discrepancy?
  3. Predict — If the hypothesis is true, what else must be true?
  4. Test — Design the smallest experiment that confirms or refutes it
  5. Conclude — Update understanding, repeat from step 2 if refuted

Never skip from Observe to Fix. The loop exists because human intuition about bug causes is wrong more often than right.
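The loop can be walked through on a small, invented bug (`total_price` and its truncation defect are hypothetical, purely for illustration):

```python
# Observe: total_price() returns 0 for a non-empty cart.

def total_price(items):
    # Hypothetical buggy implementation: per-item integer division.
    return sum(i["cents"] // 100 for i in items)

# Hypothesize: per-item truncation loses sub-dollar amounts.
# Predict: if true, an item under 100 cents must contribute 0.
# Test: the smallest experiment that can refute the hypothesis.
assert total_price([{"cents": 99}]) == 0  # confirmed: 99 // 100 == 0

# Conclude: divide once, after summing, not per item.
assert sum(i["cents"] for i in [{"cents": 99}]) / 100 == 0.99
```

Note that the experiment tests the prediction, not the fix: the hypothesis is confirmed before any code is changed.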

Agans' 9 Rules

Apply these rules in order. Most debugging failures trace to violating one of the first three.

  1. Understand the System — Read the docs, read the code, know what it's supposed to do before deciding what's wrong
  2. Make It Fail — Reproduce consistently. If you can't reproduce it, you can't confirm you fixed it
  3. Quit Thinking and Look — Observe actual behavior. Print statements, debuggers, logs. Stop guessing
  4. Divide and Conquer — Binary search the problem space. Narrow where the failure starts
  5. Change One Thing at a Time — One variable per experiment. Otherwise you learn nothing
  6. Keep an Audit Trail — Write down what you tried and what happened. Memory lies
  7. Check the Plug — Verify the obvious. Wrong environment? Stale build? Wrong branch?
  8. Get a Fresh View — Explain the problem aloud. The explanation often reveals the assumption
  9. If You Didn't Fix It, It Ain't Fixed — Coincidence is not causation. Verify the fix addresses the root cause
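Rule 3 in particular rewards instrumentation over speculation. A minimal sketch of "look, don't guess" (`parse_port` and the config-loader scenario are invented for illustration):

```python
import logging

logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
log = logging.getLogger("debug-session")

def parse_port(raw):
    # Log the actual value and its type instead of reasoning about
    # what it "should" be; %r exposes whitespace and type surprises.
    log.debug("raw port: %r (type %s)", raw, type(raw).__name__)
    return int(raw)

# The repr in the log shows the loader handed us "8080\n", not
# "8080" -- an assumption no amount of guessing would surface.
port = parse_port("8080\n")
```

Using `%r` rather than `%s` is the point: the repr makes invisible characters visible.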

See references/nine-rules.md for detailed application of each rule.

Debugging Workflow

1. Gather Facts

Before forming any hypothesis:

- What is the exact error message or unexpected behavior?
- When did it start? What changed?
- Does it reproduce consistently?
- What is the smallest reproduction case?
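When gathering facts, record the failure verbatim rather than from memory. A sketch, assuming a hypothetical `load_config` that is failing with a path that does not exist:

```python
import traceback

def load_config(path):
    # Hypothetical function under investigation.
    with open(path) as f:
        return f.read()

try:
    load_config("./missing/settings.yml")
    tb = None
except Exception:
    tb = traceback.format_exc()  # the exact message and stack, verbatim

# The first fact to write down is the traceback itself, not a
# paraphrase of it.
print(tb)
```

The exact exception class and message often answer "what changed?" before any hypothesis is needed.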

2. Form Hypotheses

List every plausible cause. Rank by:

  • Likelihood — What usually causes this class of failure?
  • Testability — Which hypothesis can you disprove fastest?

Start with the most testable, not the most likely. Fast elimination beats slow confirmation.
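The ranking can be made concrete as a sorted list. All three hypotheses and their estimates below are invented for illustration:

```python
# Hypothetical hypotheses for one failure, with rough estimates of
# prior likelihood and of how long each takes to disprove.
hypotheses = [
    {"cause": "stale build artifact", "likelihood": 0.2, "minutes_to_test": 1},
    {"cause": "race in worker pool",  "likelihood": 0.6, "minutes_to_test": 45},
    {"cause": "bad config value",     "likelihood": 0.4, "minutes_to_test": 2},
]

# Order by testability, not likelihood: cheap eliminations first.
by_testability = sorted(hypotheses, key=lambda h: h["minutes_to_test"])
assert by_testability[0]["cause"] == "stale build artifact"
```

Three minutes of elimination here may remove two hypotheses before the expensive one is ever attempted.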

3. Design Experiments

Each experiment should:

  • Test exactly one hypothesis
  • Have a clear pass/fail criterion before running it
  • Be the smallest possible change
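An experiment in this sense can be a few lines with the verdict criterion written down first. The cache below is a hypothetical stand-in for the component under suspicion:

```python
# Hypothesis: the cache returns stale entries after a key is updated.
# Pass/fail, decided BEFORE running: if the hypothesis holds, get()
# after a second put() with the same key returns the old value.

cache = {}

def put(key, value):
    cache[key] = value

def get(key):
    return cache.get(key)

put("user:1", "alice")
put("user:1", "bob")
observed = get("user:1")

# One hypothesis, one variable, one predefined criterion.
assert observed == "bob", f"stale read: {observed!r}"  # hypothesis refuted
```

A refuted hypothesis is a result, not a failure: it shrinks the search space and the audit trail records why.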

4. Narrow with Delta Debugging

When the cause lives in a large change:

  1. Find a known-good state and a known-bad state
  2. Binary search the difference (commits, config changes, code blocks)
  3. Reduce to the minimal change that introduces the failure
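Step 2 is the idea behind `git bisect`: binary search an ordered history using a pass/fail oracle. A sketch, where the six-change history and the oracle are invented for illustration:

```python
def bisect_changes(changes, is_bad):
    """Return the first change for which is_bad(change) is True.

    Precondition: changes is ordered and all good changes precede
    all bad ones (a single known-good start, known-bad end).
    """
    lo, hi = 0, len(changes) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(changes[mid]):
            hi = mid          # failure already present: look earlier
        else:
            lo = mid + 1      # still good: look later
    return changes[lo]

# Hypothetical history in which the defect landed in change "d".
history = ["a", "b", "c", "d", "e", "f"]
first_bad = bisect_changes(history, lambda c: c >= "d")
assert first_bad == "d"
```

With n changes this needs only about log2(n) oracle runs, which is why reproducing the failure cheaply (rule 2) pays off here.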

5. Document the Diagnosis

## Bug

[What happened vs what was expected]

## Root Cause

[The specific defect and why it produced this symptom]

## Evidence

[What experiments confirmed this, what was ruled out]

## Fix

[The change and why it addresses the root cause]

## Prevention

[What would catch this earlier next time]

Anti-Patterns

Shotgun debugging — Changing multiple things at once hoping something works. You'll never know what fixed it, or if it's actually fixed.

Debugging by coincidence — "It works now" without understanding why. The bug is still there.

The usual suspects — Blaming the framework, the OS, the compiler. It's almost always your code.

Rubber ducking without listening — Explaining the problem but not hearing your own assumptions. Slow down at the part that feels obvious.

Fix and forget — Fixing the symptom without asking why the system allowed this failure in the first place.

Output Quality

A good diagnosis:

  • Identifies the root cause, not just the symptom
  • Explains the causal chain from defect to failure
  • Provides evidence that eliminates alternatives
  • Suggests how to prevent recurrence

See Also

  • /testing — Test failures trigger debugging; debugging reveals missing tests
  • /performance — Performance regressions are bugs; profiling is debugging for speed
  • /review — Reviews catch bugs before they need debugging
  • skills/FRAMEWORKS.md — Full framework index
  • RECIPE.md — Agent recipe for parallel decomposition (2 workers)

Source

git clone https://github.com/tslateman/duet
View on GitHub: https://github.com/tslateman/duet/blob/main/skills/debugging/SKILL.md

Overview

Systematic debugging applies the scientific method to software failures: observe what happens, form hypotheses, design minimal experiments, and narrow the root cause before applying a fix. It emphasizes not skipping from observation to fix and uses Agans' 9 rules to guide the process.

How This Skill Works

Follow a five-step loop: Observe, Hypothesize, Predict, Test, Conclude. Gather facts, generate testable hypotheses, run the smallest possible experiments, and update your understanding after each test.

When to Use It

  • User reports a failure with prompts like "debug this" or "why is this failing"
  • An issue is intermittent or not consistently reproducible, requiring a structured approach
  • A bug is suspected and you need to form hypotheses and design minimal tests
  • You want to isolate root cause before applying any fix
  • You're stuck and thrashing without progress and need a disciplined workflow

Quick Start

  1. Gather Facts — collect exact error messages, timing, and reproduction steps
  2. Form Hypotheses — list plausible causes and rank by testability
  3. Design Experiments — craft small tests to confirm or refute a single hypothesis

Best Practices

  • Observe before hypothesizing; collect exact error messages, timing, and reproduction steps
  • Follow Agans' rules in order and avoid guessing
  • Test one hypothesis with the smallest possible change and predefine pass/fail criteria
  • Keep an audit trail of what you tried and the results
  • Use delta debugging to narrow large changes and isolate the cause

Example Use Cases

  • A failing API call where the error is unclear; observe logs, form hypotheses, and run minimal tests to identify the endpoint or auth issue
  • A flaky unit test that occasionally fails; reproduce reliably and narrow to a race condition
  • A regression after a dependency update; binary search commits to find the change introducing the bug
  • A user-reported bug with unclear reproduction; create a smallest reproducible case and verify the root cause
  • A long-running process showing increasing memory usage; apply delta debugging to configuration and code paths

