Get the FREE Ultimate OpenClaw Setup Guide →

debugging-strategies

npx machina-cli add skill wshobson/agents/debugging-strategies --openclaw
Files (1)
SKILL.md
12.0 KB

Debugging Strategies

Transform debugging from frustrating guesswork into systematic problem-solving with proven strategies, powerful tools, and methodical approaches.

When to Use This Skill

  • Tracking down elusive bugs
  • Investigating performance issues
  • Understanding unfamiliar codebases
  • Debugging production issues
  • Analyzing crash dumps and stack traces
  • Profiling application performance
  • Investigating memory leaks
  • Debugging distributed systems

Core Principles

1. The Scientific Method

1. Observe: What's the actual behavior? 2. Hypothesize: What could be causing it? 3. Experiment: Test your hypothesis 4. Analyze: Did it prove/disprove your theory? 5. Repeat: Until you find the root cause

2. Debugging Mindset

Don't Assume:

  • "It can't be X" - Yes it can
  • "I didn't change Y" - Check anyway
  • "It works on my machine" - Find out why

Do:

  • Reproduce consistently
  • Isolate the problem
  • Keep detailed notes
  • Question everything
  • Take breaks when stuck

3. Rubber Duck Debugging

Explain your code and problem out loud (to a rubber duck, colleague, or yourself). Often reveals the issue.

Systematic Debugging Process

Phase 1: Reproduce

## Reproduction Checklist

1. **Can you reproduce it?**
   - Always? Sometimes? Randomly?
   - Specific conditions needed?
   - Can others reproduce it?

2. **Create minimal reproduction**
   - Simplify to smallest example
   - Remove unrelated code
   - Isolate the problem

3. **Document steps**
   - Write down exact steps
   - Note environment details
   - Capture error messages

Phase 2: Gather Information

## Information Collection

1. **Error Messages**
   - Full stack trace
   - Error codes
   - Console/log output

2. **Environment**
   - OS version
   - Language/runtime version
   - Dependencies versions
   - Environment variables

3. **Recent Changes**
   - Git history
   - Deployment timeline
   - Configuration changes

4. **Scope**
   - Affects all users or specific ones?
   - All browsers or specific ones?
   - Production only or also dev?

Phase 3: Form Hypothesis

## Hypothesis Formation

Based on gathered info, ask:

1. **What changed?**
   - Recent code changes
   - Dependency updates
   - Infrastructure changes

2. **What's different?**
   - Working vs broken environment
   - Working vs broken user
   - Before vs after

3. **Where could this fail?**
   - Input validation
   - Business logic
   - Data layer
   - External services

Phase 4: Test & Verify

## Testing Strategies

1. **Binary Search**
   - Comment out half the code
   - Narrow down problematic section
   - Repeat until found

2. **Add Logging**
   - Strategic console.log/print
   - Track variable values
   - Trace execution flow

3. **Isolate Components**
   - Test each piece separately
   - Mock dependencies
   - Remove complexity

4. **Compare Working vs Broken**
   - Diff configurations
   - Diff environments
   - Diff data

Debugging Tools

JavaScript/TypeScript Debugging

// Chrome DevTools Debugger
function processOrder(order: Order) {
  debugger; // Execution pauses here

  const total = calculateTotal(order);
  console.log("Total:", total);

  // Conditional breakpoint
  if (order.items.length > 10) {
    debugger; // Only breaks if condition true
  }

  return total;
}

// Console debugging techniques
console.log("Value:", value); // Basic
console.table(arrayOfObjects); // Table format
console.time("operation");
/* code */ console.timeEnd("operation"); // Timing
console.trace(); // Stack trace
console.assert(value > 0, "Value must be positive"); // Assertion

// Performance profiling
performance.mark("start-operation");
// ... operation code
performance.mark("end-operation");
performance.measure("operation", "start-operation", "end-operation");
console.log(performance.getEntriesByType("measure"));

VS Code Debugger Configuration:

// .vscode/launch.json
{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "node",
      "request": "launch",
      "name": "Debug Program",
      "program": "${workspaceFolder}/src/index.ts",
      "preLaunchTask": "tsc: build - tsconfig.json",
      "outFiles": ["${workspaceFolder}/dist/**/*.js"],
      "skipFiles": ["<node_internals>/**"]
    },
    {
      "type": "node",
      "request": "launch",
      "name": "Debug Tests",
      "program": "${workspaceFolder}/node_modules/jest/bin/jest",
      "args": ["--runInBand", "--no-cache"],
      "console": "integratedTerminal"
    }
  ]
}

Python Debugging

# Built-in debugger (pdb)
import pdb

def calculate_total(items):
    total = 0
    pdb.set_trace()  # Debugger starts here

    for item in items:
        total += item.price * item.quantity

    return total

# Breakpoint (Python 3.7+)
def process_order(order):
    breakpoint()  # More convenient than pdb.set_trace()
    # ... code

# Post-mortem debugging
try:
    risky_operation()
except Exception:
    import pdb
    pdb.post_mortem()  # Debug at exception point

# IPython debugging (ipdb)
from ipdb import set_trace
set_trace()  # Better interface than pdb

# Logging for debugging
import logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

def fetch_user(user_id):
    logger.debug(f'Fetching user: {user_id}')
    user = db.query(User).get(user_id)
    logger.debug(f'Found user: {user}')
    return user

# Profile performance
import cProfile
import pstats

cProfile.run('slow_function()', 'profile_stats')
stats = pstats.Stats('profile_stats')
stats.sort_stats('cumulative')
stats.print_stats(10)  # Top 10 slowest

Go Debugging

// Delve debugger
// Install: go install github.com/go-delve/delve/cmd/dlv@latest
// Run: dlv debug main.go

import (
    "fmt"
    "runtime"
    "runtime/debug"
)

// Print stack trace
func debugStack() {
    debug.PrintStack()
}

// Panic recovery with debugging
func processRequest() {
    defer func() {
        if r := recover(); r != nil {
            fmt.Println("Panic:", r)
            debug.PrintStack()
        }
    }()

    // ... code that might panic
}

// Memory profiling
import _ "net/http/pprof"
// Visit http://localhost:6060/debug/pprof/

// CPU profiling
import (
    "os"
    "runtime/pprof"
)

f, _ := os.Create("cpu.prof")
pprof.StartCPUProfile(f)
defer pprof.StopCPUProfile()
// ... code to profile

Advanced Debugging Techniques

Technique 1: Binary Search Debugging

# Git bisect for finding regression
git bisect start
git bisect bad                    # Current commit is bad
git bisect good v1.0.0            # v1.0.0 was good

# Git checks out middle commit
# Test it, then:
git bisect good   # if it works
git bisect bad    # if it's broken

# Continue until bug found
git bisect reset  # when done

Technique 2: Differential Debugging

Compare working vs broken:

## What's Different?

| Aspect       | Working     | Broken         |
| ------------ | ----------- | -------------- |
| Environment  | Development | Production     |
| Node version | 18.16.0     | 18.15.0        |
| Data         | Empty DB    | 1M records     |
| User         | Admin       | Regular user   |
| Browser      | Chrome      | Safari         |
| Time         | During day  | After midnight |

Hypothesis: Time-based issue? Check timezone handling.

Technique 3: Trace Debugging

// Function call tracing
function trace(
  target: any,
  propertyKey: string,
  descriptor: PropertyDescriptor,
) {
  const originalMethod = descriptor.value;

  descriptor.value = function (...args: any[]) {
    console.log(`Calling ${propertyKey} with args:`, args);
    const result = originalMethod.apply(this, args);
    console.log(`${propertyKey} returned:`, result);
    return result;
  };

  return descriptor;
}

class OrderService {
  @trace
  calculateTotal(items: Item[]): number {
    return items.reduce((sum, item) => sum + item.price, 0);
  }
}

Technique 4: Memory Leak Detection

// Chrome DevTools Memory Profiler
// 1. Take heap snapshot
// 2. Perform action
// 3. Take another snapshot
// 4. Compare snapshots

// Node.js memory debugging
if (process.memoryUsage().heapUsed > 500 * 1024 * 1024) {
  console.warn("High memory usage:", process.memoryUsage());

  // Generate heap dump
  require("v8").writeHeapSnapshot();
}

// Find memory leaks in tests
let beforeMemory: number;

beforeEach(() => {
  beforeMemory = process.memoryUsage().heapUsed;
});

afterEach(() => {
  const afterMemory = process.memoryUsage().heapUsed;
  const diff = afterMemory - beforeMemory;

  if (diff > 10 * 1024 * 1024) {
    // 10MB threshold
    console.warn(`Possible memory leak: ${diff / 1024 / 1024}MB`);
  }
});

Debugging Patterns by Issue Type

Pattern 1: Intermittent Bugs

## Strategies for Flaky Bugs

1. **Add extensive logging**
   - Log timing information
   - Log all state transitions
   - Log external interactions

2. **Look for race conditions**
   - Concurrent access to shared state
   - Async operations completing out of order
   - Missing synchronization

3. **Check timing dependencies**
   - setTimeout/setInterval
   - Promise resolution order
   - Animation frame timing

4. **Stress test**
   - Run many times
   - Vary timing
   - Simulate load

Pattern 2: Performance Issues

## Performance Debugging

1. **Profile first**
   - Don't optimize blindly
   - Measure before and after
   - Find bottlenecks

2. **Common culprits**
   - N+1 queries
   - Unnecessary re-renders
   - Large data processing
   - Synchronous I/O

3. **Tools**
   - Browser DevTools Performance tab
   - Lighthouse
   - Python: cProfile, line_profiler
   - Node: clinic.js, 0x

Pattern 3: Production Bugs

## Production Debugging

1. **Gather evidence**
   - Error tracking (Sentry, Bugsnag)
   - Application logs
   - User reports
   - Metrics/monitoring

2. **Reproduce locally**
   - Use production data (anonymized)
   - Match environment
   - Follow exact steps

3. **Safe investigation**
   - Don't change production
   - Use feature flags
   - Add monitoring/logging
   - Test fixes in staging

Best Practices

  1. Reproduce First: Can't fix what you can't reproduce
  2. Isolate the Problem: Remove complexity until minimal case
  3. Read Error Messages: They're usually helpful
  4. Check Recent Changes: Most bugs are recent
  5. Use Version Control: Git bisect, blame, history
  6. Take Breaks: Fresh eyes see better
  7. Document Findings: Help future you
  8. Fix Root Cause: Not just symptoms

Common Debugging Mistakes

  • Making Multiple Changes: Change one thing at a time
  • Not Reading Error Messages: Read the full stack trace
  • Assuming It's Complex: Often it's simple
  • Debug Logging in Prod: Remove before shipping
  • Not Using Debugger: console.log isn't always best
  • Giving Up Too Soon: Persistence pays off
  • Not Testing the Fix: Verify it actually works

Quick Debugging Checklist

## When Stuck, Check:

- [ ] Spelling errors (typos in variable names)
- [ ] Case sensitivity (fileName vs filename)
- [ ] Null/undefined values
- [ ] Array index off-by-one
- [ ] Async timing (race conditions)
- [ ] Scope issues (closure, hoisting)
- [ ] Type mismatches
- [ ] Missing dependencies
- [ ] Environment variables
- [ ] File paths (absolute vs relative)
- [ ] Cache issues (clear cache)
- [ ] Stale data (refresh database)

Resources

  • references/debugging-tools-guide.md: Comprehensive tool documentation
  • references/performance-profiling.md: Performance debugging guide
  • references/production-debugging.md: Debugging live systems
  • assets/debugging-checklist.md: Quick reference checklist
  • assets/common-bugs.md: Common bug patterns
  • scripts/debug-helper.ts: Debugging utility functions

Source

git clone https://github.com/wshobson/agents/blob/main/plugins/developer-essentials/skills/debugging-strategies/SKILL.mdView on GitHub

Overview

Master systematic debugging techniques, profiling tools, and root-cause analysis to efficiently track down bugs across any codebase or technology stack. This skill emphasizes the scientific method, a disciplined debugging mindset, and practical techniques to diagnose bugs, performance issues, or unexpected behavior in development, production, or distributed environments.

How This Skill Works

Follow a repeatable workflow: reproduce the issue, gather information, form a hypothesis, and test the hypothesis. Leverage targeted strategies like binary search, added logs, and component isolation, and verify findings before proceeding. Use rubber duck debugging to surface hidden assumptions and document results for traceability.

When to Use It

  • Tracking down elusive bugs
  • Debugging production issues
  • Investigating performance issues
  • Understanding unfamiliar codebases
  • Investigating memory leaks

Quick Start

  1. Step 1: Reproduce the issue with a minimal example and document exact steps and environment
  2. Step 2: Gather information — collect error messages, environment, and recent changes
  3. Step 3: Form a hypothesis and test it using binary search, logging, and component isolation until the root cause is found

Best Practices

  • Reproduce consistently and create a minimal reproduction
  • Collect complete error messages, environment details, and recent changes
  • Form testable hypotheses about what changed and where it could fail
  • Apply targeted testing: binary search, add logging, and component isolation
  • Keep notes and use rubber duck debugging to articulate the problem

Example Use Cases

  • Debug a flaky UI bug by reproducing steps and capturing logs
  • Analyze a memory leak in a Node.js service using heap dumps and allocation tracing
  • Investigate a production crash with a stack trace and recent deploy information
  • Profile a slow API endpoint with timing logs and differential analysis
  • Debug a distributed system by isolating services and simulating failure scenarios

Frequently Asked Questions

Add this skill to your agents
Sponsor this space

Reach thousands of developers