Data Flow Tracing

npx machina-cli add skill allsmog/vuln-scout/data-flow-tracing --openclaw

Purpose

This skill guides tracing user-controlled input from entry points (sources) through the application to security-sensitive functions (sinks). Tracing is how you confirm that a suspected vulnerability is actually reachable and exploitable.

When to Use

Activate this skill when:

  • Confirming if identified sinks receive user input
  • Mapping the path from source to sink
  • Understanding data transformations and filters
  • Determining if sanitization can be bypassed

Core Concepts

Sources (Input Entry Points)

HTTP Sources:

| Language | Common Sources |
|----------|----------------|
| PHP | $_GET, $_POST, $_REQUEST, $_COOKIE, $_FILES, $_SERVER |
| Java | request.getParameter(), request.getHeader(), @RequestParam |
| Python | request.args, request.form, request.data, request.json |
| Node.js | req.query, req.body, req.params, req.headers |
| .NET | Request.QueryString, Request.Form, Request["param"] |

Other Sources:

  • Database query results (stored user data)
  • File contents (user-uploaded or modified)
  • Environment variables
  • External API responses
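The HTTP sources listed above can be turned into a small lookup table for a first-pass scan. The following is a minimal sketch in Python, not part of the original skill: the names `SOURCE_PATTERNS` and `find_sources` are hypothetical, and the regular expressions only cover the identifiers named in this section.

```python
import re

# Hypothetical lookup table built from the HTTP sources listed in this section.
SOURCE_PATTERNS = {
    "php": re.compile(r"\$_(GET|POST|REQUEST|COOKIE|FILES|SERVER)\b"),
    "java": re.compile(r"request\.get(Parameter|Header)\(|@RequestParam\b"),
    "python": re.compile(r"\brequest\.(args|form|data|json)\b"),
    "nodejs": re.compile(r"\breq\.(query|body|params|headers)\b"),
    "dotnet": re.compile(r"\bRequest(\.QueryString|\.Form|\[)"),
}

def find_sources(code: str, language: str) -> list[tuple[int, str]]:
    """Return (line_number, matched_text) for every source reference in the code."""
    pattern = SOURCE_PATTERNS[language]
    hits = []
    for lineno, line in enumerate(code.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

A textual match only shows where input may enter. Whether that input reaches a sink still has to be established by tracing.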

Sinks (Dangerous Functions)

Refer to the dangerous-functions skill for comprehensive sink lists.

Data Transformations

Track how data changes between source and sink:

  • Encoding/Decoding (base64, URL, HTML)
  • Concatenation with other strings
  • Array/object property access
  • Type conversions
  • String manipulations
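The order of transformations matters as much as their presence. As an illustration, the sketch below (Python, with a hypothetical `contains_traversal` check standing in for an application filter) shows how a check that runs before URL decoding, or after only one round of decoding, misses input that later decodes to a traversal sequence.

```python
from urllib.parse import unquote

def contains_traversal(value: str) -> bool:
    """Hypothetical filter: flag input that contains a '../' sequence."""
    return "../" in value

single = "%2e%2e%2fetc%2fpasswd"          # URL-encoded "../etc/passwd"
double = "%252e%252e%252fetc%252fpasswd"  # the same value, encoded twice

checks = {
    "single, checked before decoding": contains_traversal(single),
    "single, checked after decoding": contains_traversal(unquote(single)),
    "double, checked after one decode": contains_traversal(unquote(double)),
    "double, checked after two decodes": contains_traversal(unquote(unquote(double))),
}
```

When documenting a flow, record each decode step and where any filter sits relative to it.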

Tracing Methodology

Step 1: Identify the Sink

Start from the dangerous function identified during code review.

Step 2: Find Direct Parameters

Identify what variables/parameters are passed to the sink.

Example: system($cmd);
Direct parameter: $cmd

Step 3: Trace Backwards

Follow each parameter to its origin:

  1. Check function parameters
  2. Check variable assignments
  3. Check conditional branches
  4. Check loop iterations
  5. Check included/required files
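The backward walk can be approximated mechanically for straight-line code. The sketch below is a simplified Python illustration, not the skill's own tooling: `trace_back` is a hypothetical helper that only understands plain `$var = expr;` assignments and ignores branches, loops, compound operators such as `.=`, function boundaries, and included files, which still need manual review.

```python
import re

# Matches simple PHP assignments such as `$var = expr;` but not `==` comparisons.
ASSIGN = re.compile(r"^\s*(\$\w+)\s*=(?!=)\s*(.+?);")
VAR = re.compile(r"\$\w+")

def trace_back(lines, sink_line, var):
    """Walk upwards from the sink and follow the nearest assignment to each variable.

    Returns (chain, unresolved): the assignments visited, and the variables
    never assigned in this file (superglobals, parameters, included code).
    """
    chain, pending = [], [var]
    for lineno in range(sink_line - 1, 0, -1):    # 1-based, starting just above the sink
        match = ASSIGN.match(lines[lineno - 1])
        if not match or match.group(1) not in pending:
            continue
        pending.remove(match.group(1))
        chain.append((lineno, lines[lineno - 1].strip()))
        for name in VAR.findall(match.group(2)):  # right-hand side feeds the next hop
            if name not in pending:
                pending.append(name)
    return chain, pending
```

Anything left in `unresolved` is either a source such as `$_POST` or a value whose origin lies outside the file being examined.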

Step 4: Identify Sources

Determine where user input enters:

$cmd = $_GET['command'];  // Direct source
$cmd = $row['command'];   // Database (check how it was stored)
$cmd = $config['cmd'];    // Config file (check if user-modifiable)

Step 5: Map Transformations

Document all changes to the data:

Source: $_GET['input']
  -> urldecode()
  -> str_replace(['../', '..\\'], '', $input)
  -> escapeshellarg()
  -> Sink: exec()

Step 6: Assess Exploitability

Consider:

  • Are filters/sanitization bypassable?
  • Is the full input controllable?
  • Are there alternative paths?
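As one example of a bypassable filter, consider a single-pass replacement like the `str_replace(['../', '..\\'], '', $input)` step in Step 5. The sketch below emulates that step in Python (an assumption made for illustration, since `str.replace` is likewise non-recursive) and contrasts it with a hypothetical fixed version that repeats until the value stops changing.

```python
def strip_traversal(value: str) -> str:
    """Python stand-in for PHP's str_replace(['../', '..\\'], '', $input)."""
    for needle in ("../", "..\\"):
        value = value.replace(needle, "")  # one pass per needle, output is not rescanned
    return value

def strip_traversal_fixed(value: str) -> str:
    """Hypothetical fix: repeat the filter until nothing can re-form."""
    previous = None
    while previous != value:
        previous = value
        value = strip_traversal(value)
    return value

payload = "....//etc/passwd"
after_single_pass = strip_traversal(payload)       # the removed "../" lets a new one form
after_fixed_filter = strip_traversal_fixed(payload)
```

Removing the inner `../` from `....//` leaves the outer characters to join into a fresh `../`, so the filter's own output contains what it was meant to remove.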

Tracing Techniques

Static Analysis (Manual)

Forward Tracing: Start from source, follow to sinks

$input = $_GET['x'];
$processed = process($input);
dangerous_function($processed);
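Forward tracing amounts to propagating a "tainted" mark through assignments. The following Python sketch is a deliberately conservative approximation: `forward_taint` is a hypothetical helper, it handles only straight-line `$var = expr;` statements and bare calls, and it treats every function call (including sanitizers) as passing taint through, so its hits are candidates for review rather than confirmed findings.

```python
import re

ASSIGN = re.compile(r"^\s*(\$\w+)\s*=(?!=)\s*(.+?);")
CALL = re.compile(r"^\s*(\w+)\s*\((.*)\)\s*;")
VAR = re.compile(r"\$\w+")

def forward_taint(lines, sources, sinks):
    """Return (line_number, sink_name) for sink calls that receive a tainted variable."""
    tainted = set(sources)
    findings = []
    for lineno, line in enumerate(lines, start=1):
        assignment = ASSIGN.match(line)
        if assignment:
            target, rhs = assignment.groups()
            if tainted & set(VAR.findall(rhs)):
                tainted.add(target)       # taint flows from right-hand side to target
            else:
                tainted.discard(target)   # overwritten with an untainted value
            continue
        call = CALL.match(line)
        if call and call.group(1) in sinks and tainted & set(VAR.findall(call.group(2))):
            findings.append((lineno, call.group(1)))
    return findings
```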

Backward Tracing: Start from sink, trace to source

dangerous_function($var);
  <- $var = transform($data);
  <- $data = $_POST['param'];

Using IDE Features

  • Find all references to variable
  • Go to definition
  • Find usages
  • Call hierarchy

Using Grep

# Find where variable is assigned
grep -rn "\$varname\s*=" --include="*.php"

# Find where variable is used
grep -rn "\$varname" --include="*.php"

# Find function calls
grep -rn "functionName\s*(" --include="*.php"

Common Patterns

Direct Flow

$input = $_GET['cmd'];
system($input);  // Vulnerable

Database-Mediated Flow

// Store
$db->insert(['cmd' => $_POST['cmd']]);

// Later, retrieve and execute
$row = $db->query("SELECT cmd FROM jobs")->fetch();
system($row['cmd']);  // Vulnerable if original input wasn't sanitized

Configuration Flow

// Config loaded from user-modifiable file
$config = parse_ini_file('/var/www/config.ini');
system($config['backup_cmd']);  // Vulnerable if config is modifiable

Multi-File Flow

// file1.php
$_SESSION['cmd'] = $_GET['cmd'];

// file2.php
system($_SESSION['cmd']);  // Vulnerable
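Multi-file flows through shared state are easy to miss when reading one file at a time. As a rough aid, the Python sketch below pairs files that write a `$_SESSION` key with other files that read the same key. The helper name `cross_file_session_flows` is hypothetical, and the regular expression assumes literal quoted keys only.

```python
import re

# Group 1 is the session key; group 2 is set only when the access is an assignment.
SESSION_ACCESS = re.compile(r"\$_SESSION\[['\"](\w+)['\"]\](\s*=(?!=))?")

def cross_file_session_flows(files):
    """Return sorted (key, writer_file, reader_file) triples spanning two files.

    `files` maps a file name to its source text.
    """
    writers, readers = {}, {}
    for name, code in files.items():
        for match in SESSION_ACCESS.finditer(code):
            bucket = writers if match.group(2) else readers
            bucket.setdefault(match.group(1), set()).add(name)
    return sorted(
        (key, writer, reader)
        for key, writer_files in writers.items()
        for writer in writer_files
        for reader in readers.get(key, ())
        if writer != reader
    )
```

Each triple is a lead to follow manually: check what is written on the writer side and which sink, if any, consumes the value on the reader side.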

Sanitization Analysis

Identify Sanitization Functions

$input = htmlspecialchars($_GET['x']);  // XSS protection
$input = escapeshellarg($_GET['x']);    // Command injection protection
$input = intval($_GET['x']);            // Type casting
$input = preg_replace('/[^a-z]/', '', $_GET['x']);  // Whitelist

Assess Bypass Potential

| Sanitization | Bypass Considerations |
|--------------|-----------------------|
| Blacklist | Missing characters, encoding |
| Whitelist | Logic errors, regex flaws |
| Type casting | Depends on sink requirements |
| Encoding | Double encoding, context |
| Length limits | Truncation attacks |

Common Bypass Techniques

  • Case variations
  • Encoding (URL, Unicode, HTML)
  • Null bytes
  • Double encoding
  • Alternative representations
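A quick way to reason about a blacklist is to ask which dangerous characters it leaves behind. The Python sketch below assumes a hypothetical `sanitize` function that strips only semicolons and pipes, and an illustrative (not exhaustive) set of shell metacharacters.

```python
import re

def sanitize(value: str) -> str:
    """Hypothetical blacklist filter: strips ';' and '|' and nothing else."""
    return re.sub(r"[;|]", "", value)

# Illustrative subset of characters that can alter a shell command line.
SHELL_METACHARACTERS = set(";|&`$()<>\n\r")

def surviving_metacharacters(value: str) -> set[str]:
    """Return the shell metacharacters still present after sanitization."""
    return SHELL_METACHARACTERS & set(sanitize(value))
```

For this filter, a newline (`%0a` once URL-decoded) or `$()` substitution passes through untouched, which is the kind of gap the "missing characters" entry refers to.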

Documentation Template

When tracing, document findings:

## Finding: [Vulnerability Type]

### Sink
- File: path/to/file.php
- Line: 42
- Function: system($cmd)

### Source
- File: path/to/file.php  
- Line: 35
- Source: $_GET['command']

### Data Flow
1. $_GET['command'] received (line 35)
2. Passed to sanitize() function (line 36)
3. Concatenated with prefix (line 38)
4. Passed to system() (line 42)

### Sanitization
- sanitize() removes semicolons and pipes
- Bypass: Use newline (%0a) or $() syntax

### Exploitability
- Confirmed exploitable
- Payload: `valid_command%0awhoami`

Integration with Other Skills

  • Use dangerous-functions to identify sinks
  • Use vuln-patterns for exploitation techniques
  • Use exploit-techniques to develop PoC

Source

https://github.com/allsmog/vuln-scout/blob/main/whitebox-pentest/skills/data-flow-tracing/SKILL.md

Overview

Data Flow Tracing guides you to follow user-controlled input from entry points (sources) through the application to security-sensitive sinks. It helps confirm exploitability by revealing data paths, transformations, and whether sanitization can be bypassed.

How This Skill Works

It begins by identifying the sink and its direct parameters, then traces those values backward to their origins. The process emphasizes documenting data transformations (encoding, concatenation, type conversions) and uses static analysis, IDE features, and grep to map the flow and assess exploitability.

When to Use It

  • Confirming if identified sinks receive user input
  • Mapping the path from source to sink
  • Understanding data transformations and filters
  • Determining if sanitization can be bypassed
  • Tracing tainted data during whitebox pentesting

Quick Start

  1. Identify the sink in code review (dangerous function).
  2. Trace direct parameters to their origins and map all transformations.
  3. Assess exploitability and document findings with sources, sinks, and paths.

Best Practices

  • Identify the sink first during code review and map back to sources
  • Document all data transformations between source and sink
  • Leverage language-specific source lists to locate initial inputs
  • Test sanitization by attempting controlled bypass techniques in a safe test environment
  • Capture and annotate each step with code references and line numbers

Example Use Cases

  • Direct Flow: $input = $_GET['cmd']; system($input); // vulnerable
  • Database-Mediated Flow: Store in DB then read back and execute via system($row['cmd']);
  • Configuration Flow: Config value used in a command without proper validation
  • File Contents Flow: Read user-uploaded file contents and pass to a dangerous function
  • Environment/API Flow: Environment variable or external API response used to drive a shell command
