What is root-cause-tracing?

A debugging approach that traces errors backward through the call stack to identify the original trigger and fix at the source, not just the symptom.

What tools are mentioned for tracing?

Serena MCP's find_referencing_symbols for call tracing and insert_before_symbol for surgical instrumentation; console.error is recommended for diagnostics in tests.

How do I know when to stop tracing?

Continue tracing until you identify the original trigger and verify the data flow, then implement a fix at the source and validate with tests.

root-cause-tracing

npx machina-cli add skill faulkdev/github-copilot-superpowers/root-cause-tracing --openclaw


Root Cause Tracing

Overview

Bugs often manifest deep in the call stack (git init in wrong directory, file created in wrong location, database opened with wrong path). Your instinct is to fix where the error appears, but that's treating a symptom.

Core principle: Trace backward through the call chain until you find the original trigger, then fix at the source.

When to Use

digraph when_to_use {
    "Bug appears deep in stack?" [shape=diamond];
    "Can trace backwards?" [shape=diamond];
    "Fix at symptom point" [shape=box];
    "Trace to original trigger" [shape=box];
    "BETTER: Also add defense-in-depth" [shape=box];

    "Bug appears deep in stack?" -> "Can trace backwards?" [label="yes"];
    "Can trace backwards?" -> "Trace to original trigger" [label="yes"];
    "Can trace backwards?" -> "Fix at symptom point" [label="no - dead end"];
    "Trace to original trigger" -> "BETTER: Also add defense-in-depth";
}

Use when:

Error happens deep in execution (not at entry point)
Stack trace shows long call chain
Unclear where invalid data originated
Need to find which test/code triggers the problem

The Tracing Process

1. Observe the Symptom

Error: git init failed in /Users/jesse/project/packages/core

2. Find Immediate Cause

What code directly causes this?

await execFileAsync('git', ['init'], { cwd: projectDir });

3. Ask: What Called This?

Using Serena MCP for efficient call tracing:

Use find_referencing_symbols(functionName) to get all callers with code context
Returns symbolic information showing which functions/classes call your target
Includes code snippets around each reference
Much faster than manually searching

Example:

// Instead of manually searching, use Serena:
find_referencing_symbols("gitInit")
// Returns: WorktreeManager.createSessionWorktree() calls gitInit()
find_referencing_symbols("createSessionWorktree")
// Returns: Session.initializeWorkspace() calls createSessionWorktree()
// Continue tracing upward...

Manual tracing (if Serena unavailable):

WorktreeManager.createSessionWorktree(projectDir, sessionId)
  → called by Session.initializeWorkspace()
  → called by Session.create()
  → called by test at Project.create()

4. Keep Tracing Up

What value was passed?

projectDir = '' (empty string!)
Empty string as cwd resolves to process.cwd()
That's the source code directory!

5. Find Original Trigger

Where did empty string come from?

const context = setupCoreTest(); // Returns { tempDir: '' }
Project.create('name', context.tempDir); // Accessed before beforeEach!
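The ordering problem can be reproduced in isolation. The following is a minimal sketch, assuming the context object starts with an empty tempDir that a beforeEach hook fills in later; the function name and the example path are hypothetical and not part of the original test suite.

```typescript
// Hypothetical reconstruction of the failure mode: a value read before the
// setup hook runs is captured as '' and never updated afterwards.
function demonstrateEarlyRead(): { early: string; late: string } {
  const context = { tempDir: '' };          // initial state, as in the trace above
  const early = context.tempDir;            // read at "top level", before setup
  context.tempDir = '/tmp/example-session'; // what a beforeEach hook would assign
  const late = context.tempDir;             // read inside the test body
  return { early, late };
}
```

The early read keeps the empty string even though the context is populated a moment later, which is why the caller ends up passing '' down the chain.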

Adding Stack Traces

When you can't trace manually, add instrumentation:

Using Serena MCP for surgical instrumentation:

Use insert_before_symbol to add diagnostic logging at function entry
No need to read entire file or guess indentation
Keeps logging changes isolated and minimal

// Use insert_before_symbol("gitInit", '...code...') to add:
const stack = new Error().stack;
console.error('DEBUG git init:', {
  directory,
  cwd: process.cwd(),
  nodeEnv: process.env.NODE_ENV,
  stack,
});

Manual insertion (if Serena unavailable):

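import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

async function gitInit(directory: string): Promise<void> {
  // Log before the problematic operation so the context survives a failure
  const stack = new Error().stack;
  console.error('DEBUG git init:', {
    directory,
    cwd: process.cwd(),
    nodeEnv: process.env.NODE_ENV,
    stack,
  });

  await execFileAsync('git', ['init'], { cwd: directory });
}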

Critical: Use console.error() in tests, not the logger, whose output may not be shown.

Run and capture (via subagent):

# (subagent) npm test 2>&1 | grep 'DEBUG git init'

Analyze stack traces:

Look for test file names
Find the line number triggering the call
Identify the pattern (same test? same parameter?)
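Scanning captured stacks for test files can be automated. This is a minimal sketch, assuming V8-style stack lines and test files named like `*.test.ts`; the helper name `findTestFrames` and the default pattern are illustrative assumptions, not part of the skill.

```typescript
// Return the stack lines that point into a test file, for example
// "at ... (/repo/src/project.test.ts:42:9)".
// The default pattern is an assumption about test file naming; pass a
// different non-global RegExp if the project uses another convention.
function findTestFrames(
  stack: string,
  pattern: RegExp = /\.test\.[cm]?[jt]sx?:\d+:\d+/,
): string[] {
  return stack
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => pattern.test(line));
}
```

Running this over every `DEBUG git init` entry and comparing the results shows quickly whether the same test file and line number keep appearing.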

Finding Which Test Causes Pollution

If something appears during tests but you don't know which test:

Ask a research subagent to run the bisection script ~/.claude/skills/root-cause-tracing/find-polluter.sh and return a cited Context Package that identifies the first polluter.

# (subagent) ./find-polluter.sh '.git' 'src/**/*.test.ts'

Runs tests one-by-one, stops at first polluter. See script for usage.
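The one-by-one search can be sketched as follows. This is not a transcription of find-polluter.sh; it is a minimal illustration of the same idea, with `runTest` and `artifactExists` as hypothetical callbacks so that no particular test runner is assumed. The up-front check for an existing artifact is also an assumption.

```typescript
// Run each test file in order and return the first one after which the
// unwanted artifact (for example a stray .git directory) exists.
function findFirstPolluter(
  testFiles: readonly string[],
  runTest: (file: string) => void,
  artifactExists: () => boolean,
): string | undefined {
  // Assumption: start from a clean state, otherwise the first test is blamed.
  if (artifactExists()) {
    throw new Error('artifact already exists; clean up before searching');
  }
  for (const file of testFiles) {
    runTest(file);
    if (artifactExists()) {
      return file; // stop at the first polluter
    }
  }
  return undefined; // no test created the artifact
}
```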

Real Example: Empty projectDir

Symptom: .git created in packages/core/ (source code)

Trace chain:

git init runs in process.cwd() ← empty cwd parameter
WorktreeManager called with empty projectDir
Session.create() passed empty string
Test accessed context.tempDir before beforeEach
setupCoreTest() returns { tempDir: '' } initially

Root cause: A top-level variable initializer read tempDir while it was still empty

Fix: Made tempDir a getter that throws if accessed before beforeEach
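A minimal sketch of the getter fix, assuming a context object whose directory is assigned from beforeEach. The real setupCoreTest() is not shown in this document, so `setupCoreTestSketch`, `CoreTestContext`, and `assign` are hypothetical names.

```typescript
// Hypothetical stand-in for setupCoreTest(); the real helper is not shown here.
interface CoreTestContext {
  readonly tempDir: string;  // throws until a directory has been assigned
  assign(dir: string): void; // intended to be called from beforeEach
}

function setupCoreTestSketch(): CoreTestContext {
  let dir: string | undefined;
  return {
    get tempDir(): string {
      if (!dir) {
        // Fail loudly at the source instead of letting '' flow downstream.
        throw new Error('tempDir accessed before beforeEach assigned it');
      }
      return dir;
    },
    assign(value: string): void {
      dir = value;
    },
  };
}
```

With this shape, a top-level read of `tempDir` fails immediately with a stack trace pointing at the offending test, rather than surfacing later as a stray .git directory.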

Also added defense-in-depth:

Layer 1: Project.create() validates directory
Layer 2: WorktreeManager validates not empty
Layer 3: NODE_ENV guard refuses git init outside tmpdir
Layer 4: Stack trace logging before git init
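The validation layers might look roughly like this. It is a sketch under stated assumptions: the function names are hypothetical, the guard only applies when NODE_ENV is 'test', and symlinks are not resolved (a real guard may want fs.realpathSync on both paths first).

```typescript
import * as os from 'node:os';
import * as path from 'node:path';

// Layers 1-2 (sketch): reject empty or whitespace-only directories at each boundary.
function assertNonEmptyDir(dir: string, label: string): void {
  if (dir.trim() === '') {
    throw new Error(`${label}: directory must not be empty`);
  }
}

// Layer 3 (sketch): when nodeEnv is 'test', allow git init only inside tmpRoot.
function assertSafeForGitInit(
  dir: string,
  nodeEnv: string | undefined = process.env.NODE_ENV,
  tmpRoot: string = os.tmpdir(),
): void {
  assertNonEmptyDir(dir, 'git init');
  if (nodeEnv !== 'test') {
    return;
  }
  // A relative path that climbs out of tmpRoot (or is absolute, as happens
  // across Windows drives) means dir lies outside the temp directory.
  const rel = path.relative(path.resolve(tmpRoot), path.resolve(dir));
  const outside =
    rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel);
  if (outside) {
    throw new Error(`Refusing git init outside ${tmpRoot}: ${dir}`);
  }
}
```

Calling `assertSafeForGitInit(directory)` as the first statement of the git init wrapper keeps the check next to the dangerous operation, while the non-empty check can be reused at the higher layers.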

Key Principle

digraph principle {
    "Found immediate cause" [shape=ellipse];
    "Can trace one level up?" [shape=diamond];
    "Trace backwards" [shape=box];
    "Is this the source?" [shape=diamond];
    "Fix at source" [shape=box];
    "Add validation at each layer" [shape=box];
    "Bug impossible" [shape=doublecircle];
    "NEVER fix just the symptom" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];

    "Found immediate cause" -> "Can trace one level up?";
    "Can trace one level up?" -> "Trace backwards" [label="yes"];
    "Can trace one level up?" -> "NEVER fix just the symptom" [label="no"];
    "Trace backwards" -> "Is this the source?";
    "Is this the source?" -> "Trace backwards" [label="no - keeps going"];
    "Is this the source?" -> "Fix at source" [label="yes"];
    "Fix at source" -> "Add validation at each layer";
    "Add validation at each layer" -> "Bug impossible";
}

NEVER fix just where the error appears. Trace back to find the original trigger.

Stack Trace Tips

In tests: Use console.error(), not logger; logger may be suppressed
Before operation: Log before the dangerous operation, not after it fails
Include context: Directory, cwd, environment variables, timestamps
Capture stack: new Error().stack shows complete call chain
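These tips can be bundled into one small helper. The following is a sketch only; `captureDebugContext` and the `DebugContext` shape are hypothetical names chosen for illustration.

```typescript
interface DebugContext {
  label: string;
  directory: string;
  cwd: string;
  nodeEnv: string | undefined;
  timestamp: string;
  stack: string | undefined;
}

// Collect the context worth logging before a risky operation runs.
function captureDebugContext(label: string, directory: string): DebugContext {
  return {
    label,
    directory,
    cwd: process.cwd(),
    nodeEnv: process.env.NODE_ENV,
    timestamp: new Date().toISOString(),
    stack: new Error().stack, // call chain at the point of capture
  };
}

// Usage: log first, then run the operation, so the record exists even if it throws.
// console.error('DEBUG git init:', captureDebugContext('git init', directory));
```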

Real-World Impact

From debugging session (2025-10-03):

Found root cause through 5-level trace
Fixed at source (getter validation)
Added 4 layers of defense
1847 tests passed, zero pollution

Source

https://github.com/faulkdev/github-copilot-superpowers/blob/integrate-obra-superpowers/.github/skills/root-cause-tracing/SKILL.md


Overview

Root Cause Tracing focuses on locating the actual origin of an error by tracing backward through the call stack. Instead of patching the symptom, it finds the source of the invalid data or incorrect behavior and fixes the problem there.

How This Skill Works

Start from the symptom, identify the immediate cause, then trace callers upward using tooling like Serena MCP's find_referencing_symbols to map the call graph. If tracing stalls, insert lightweight instrumentation at function entries (insert_before_symbol) to capture context and stack traces, then continue until the original trigger is found and addressed.

When to Use It

Error happens deep in execution (not at entry point)
Stack trace shows a long call chain
Unclear where invalid data originated
Need to determine which test/code triggers the problem
Symptoms suggest commands are running in the wrong working directory (e.g., git init in the wrong dir)

Quick Start

Step 1: Observe the symptom and reproduce if possible
Step 2: Use Serena MCP to run find_referencing_symbols and map the call chain
Step 3: If needed, insert lightweight instrumentation with insert_before_symbol to capture context and stack traces

Best Practices

Trace upward from the symptom to the origin rather than patching the symptom
Use find_referencing_symbols(functionName) to locate callers with code context
Follow the call chain upward to identify the original trigger
Add targeted instrumentation at function entry with insert_before_symbol
Use console.error in tests to surface diagnostics and keep instrumentation isolated

Example Use Cases

Error: git init failed in /Users/jesse/project/packages/core
Empty string as cwd resolves to process.cwd(), pointing to the wrong directory
context.tempDir is '' leading to an incorrect path being used
Tracing from WorktreeManager.createSessionWorktree to Session.initializeWorkspace
Inserting diagnostic logging via insert_before_symbol to capture stack traces
