
cpg-analysis

npx machina-cli add skill allsmog/vuln-scout/cpg-analysis --openclaw

Code Property Graph (CPG) Analysis

What is a Code Property Graph?

A Code Property Graph (CPG) is a unified data structure that combines three representations of code:

  1. Abstract Syntax Tree (AST) - Structural representation
  2. Control Flow Graph (CFG) - Execution paths
  3. Program Dependence Graph (PDG) - Data and control dependencies

This combination supports semantic queries, such as tracking data across function boundaries, that syntax-level pattern matching cannot express.
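
To make the idea concrete, here is a minimal Python sketch of one graph that carries all three kinds of edges over the same nodes. It is a toy model only: the node IDs, the edge labels, and the `neighbors` helper are hypothetical and do not reflect Joern's actual schema.

```python
from collections import defaultdict

# Toy code property graph for:  x = source(); sink(x)
# Edge labels are illustrative, not Joern's real schema.
nodes = {
    1: "METHOD main",
    2: "CALL x = source()",
    3: "CALL sink(x)",
}

edges = defaultdict(list)  # (node_id, label) -> [target node ids]

def add_edge(src, label, dst):
    edges[(src, label)].append(dst)

add_edge(1, "AST", 2)   # syntax: main contains the assignment
add_edge(1, "AST", 3)   # syntax: main contains the sink call
add_edge(2, "CFG", 3)   # control flow: the assignment runs before sink(x)
add_edge(2, "PDG", 3)   # data dependence: x defined at 2 is used at 3

def neighbors(node_id, label):
    """Return targets reachable from node_id over one edge of the given kind."""
    return edges[(node_id, label)]

# One query can mix representations: statements inside main (AST)
# that feed data into another statement (PDG).
feeding = [n for n in neighbors(1, "AST") if neighbors(n, "PDG")]
print([nodes[n] for n in feeding])
```

Because every representation shares the same node set, a single traversal can hop from a syntactic fact to a data dependence without joining separate tool outputs.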

When to Use CPG vs Pattern Matching

| Approach | Use When | Example |
|---|---|---|
| Pattern Matching (Semgrep) | Known vulnerability patterns, syntax-level issues | Finding dynamic code execution calls |
| CPG Analysis (Joern) | Data flow tracking, cross-function analysis | Proving request input reaches database query through 5 functions |

Rule of thumb: Use CPG when you need to prove data flows between points, especially across function boundaries.

Joern Overview

Joern is the primary tool for CPG analysis. It:

  • Parses source code into CPG representation
  • Provides CPGQL (Scala-based) query language
  • Supports JavaScript, TypeScript, Python, Java, C/C++, Go, PHP

Basic Joern Workflow

# 1. Parse codebase into CPG
joern-parse /path/to/code --output cpg.bin

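# 2. Run an analysis script against the CPG
#    (parameter flag spelling depends on the Joern version; recent releases
#    document --param key=value, so check joern --help if --params is rejected)
joern --script analysis.sc --params cpgFile=cpg.bin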

# 3. Or use Joern REPL interactively
joern
> importCpg("cpg.bin")
> cpg.method.name(".*login.*").l
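
When the workflow is driven from automation rather than typed by hand, the two command lines can be assembled programmatically. The following is a minimal sketch with hypothetical helper names (`parse_command`, `script_command`). The flags are copied verbatim from the steps above, and nothing is executed, so it runs without Joern installed.

```python
import shlex

def parse_command(source_dir, cpg_file="cpg.bin"):
    """Build the step-1 command line: parse a codebase into a CPG file."""
    return ["joern-parse", source_dir, "--output", cpg_file]

def script_command(script, cpg_file="cpg.bin"):
    """Build the step-2 command line: run an analysis script against the CPG."""
    return ["joern", "--script", script, "--params", f"cpgFile={cpg_file}"]

commands = [parse_command("/path/to/code"), script_command("analysis.sc")]
for cmd in commands:
    # Printed rather than executed; hand the list to subprocess.run(cmd, check=True)
    # on a machine where Joern is installed.
    print(shlex.join(cmd))
```

Keeping each command as an argument list, rather than a single string, avoids shell-quoting problems when paths contain spaces.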

CPGQL Query Language

CPGQL uses Scala syntax with CPG-specific operations.

Core Concepts

Nodes: Represent code elements

  • cpg.method - All methods/functions
  • cpg.call - All function calls
  • cpg.parameter - Function parameters
  • cpg.literal - Literal values
  • cpg.identifier - Variable references

Traversals: Navigate the graph

  • .name("pattern") - Filter by name (regex)
  • .code("pattern") - Filter by code content
  • .argument - Get call arguments
  • .caller - Get calling methods
  • .callee - Get called methods
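
The `.name("pattern")` step can be pictured as a regular-expression filter over node names. The sketch below mimics it in plain Python on a hypothetical list of method names. It assumes the pattern has to match the whole name, which would explain why the workflow example writes `.*login.*` rather than `login`; consult the Joern documentation for the exact matching rules.

```python
import re

# Hypothetical method names standing in for the results of cpg.method.
method_names = ["login", "do_login", "login_user", "logout", "handleRequest"]

def filter_by_name(names, pattern):
    """Keep names that the regex matches in full (assumed semantics)."""
    compiled = re.compile(pattern)
    return [n for n in names if compiled.fullmatch(n)]

exact = filter_by_name(method_names, "login")          # the exact name only
containing = filter_by_name(method_names, ".*login.*")  # any name containing "login"
print(exact)
print(containing)
```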

Data Flow: Track how data moves

  • .reachableBy(source) - Return the sources whose data reaches this node
  • .reachableByFlows(source) - Return the full source-to-sink paths
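
The difference between the two data-flow steps is easier to see on a toy graph. The sketch below is not Joern's algorithm: it is a plain depth-first search over a hypothetical data-dependence map, with `reachable_by` and `reachable_by_flows` named after the CPGQL steps purely for illustration.

```python
# Toy data-dependence graph: an edge a -> b means "data flows from a to b".
# Node names are hypothetical and stand in for CPG nodes.
flows = {
    "req.body": ["parse(input)"],
    "parse(input)": ["buildSql(value)"],
    "buildSql(value)": ["db.query(sql)"],
    "config.timeout": ["setTimeout(ms)"],
}

def reachable_by_flows(sink, sources):
    """Return every acyclic path from any source to sink, source first."""
    paths = []

    def walk(node, path):
        if node == sink:
            paths.append(path)
            return
        for nxt in flows.get(node, []):
            if nxt not in path:  # skip cycles
                walk(nxt, path + [nxt])

    for src in sources:
        walk(src, [src])
    return paths

def reachable_by(sink, sources):
    """Return only the sources that have at least one path to sink."""
    starts = (path[0] for path in reachable_by_flows(sink, sources))
    return list(dict.fromkeys(starts))  # de-duplicate, keep order

sources = ["req.body", "config.timeout"]
print(reachable_by("db.query(sql)", sources))
for path in reachable_by_flows("db.query(sql)", sources):
    print(" -> ".join(path))
```

The first call answers "which inputs matter?", while the second yields the intermediate hops a reviewer needs to check for sanitizers along the way.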

Common Query Patterns

Find all calls to a function:

cpg.call.name("query").l

Find parameters that reach dangerous sinks:

val sources = cpg.parameter.name("req.*|request.*")
val sinks = cpg.call.name("query|execute|run")
sinks.argument.reachableBy(sources).l

Get full data flow paths:

val sources = cpg.parameter.name("userInput")
val sinks = cpg.call.name("executeQuery")
sinks.argument.reachableByFlows(sources).p

Confidence Scoring

After CPG verification:

| Verification Result | Confidence | Meaning |
|---|---|---|
| Data flow confirmed | HIGH (0.9+) | CPG proves exploitability |
| Partial flow found | MEDIUM (0.6-0.9) | Some path exists, manual review needed |
| No flow found | LOW (0.3-0.6) | May be false positive or complex flow |
| Verification failed | UNKNOWN | Query error, manual analysis required |
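
The table can be encoded directly when findings are post-processed by a script. This is a minimal sketch under stated assumptions: the names `CONFIDENCE` and `label_for_score` are hypothetical, the upper bound of 1.0 for HIGH is assumed, and scores on a shared boundary (0.6, 0.9) are assigned to the higher band because the table leaves them ambiguous.

```python
from typing import Optional

# Rows of the table above: verification result -> (label, score range).
# The upper bound 1.0 for HIGH is an assumption; the table only says "0.9+".
CONFIDENCE = {
    "Data flow confirmed": ("HIGH", (0.9, 1.0)),
    "Partial flow found": ("MEDIUM", (0.6, 0.9)),
    "No flow found": ("LOW", (0.3, 0.6)),
    "Verification failed": ("UNKNOWN", None),
}

def label_for_score(score: float) -> Optional[str]:
    """Map a numeric score to a confidence label.

    Assumption: boundary values go to the higher band. Scores below 0.3
    are not covered by the table, so None is returned for them.
    """
    if score >= 0.9:
        return "HIGH"
    if score >= 0.6:
        return "MEDIUM"
    if score >= 0.3:
        return "LOW"
    return None

print(CONFIDENCE["Partial flow found"])
print(label_for_score(0.75))
```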

Skill References

  • references/cpgql-patterns.md - Common vulnerability query patterns
  • references/joern-cheatsheet.md - Quick Joern/CPGQL reference

Related Skills

  • data-flow-tracing - Manual source-to-sink analysis
  • dangerous-functions - Sink identification by language
  • vuln-patterns - Pattern-based vulnerability knowledge

Source

https://github.com/allsmog/vuln-scout/blob/main/whitebox-pentest/skills/cpg-analysis/SKILL.md

Overview

A Code Property Graph (CPG) unifies the AST, CFG, and PDG into a single model, enabling semantic queries that go beyond pattern matching. This skill focuses on using Joern and CPGQL for data-flow verification and taint tracking, helping you show how input moves through code and where vulnerabilities may arise.

How This Skill Works

Joern parses source code into a Code Property Graph and exposes CPGQL for traversing code elements and dependencies. Data-flow analysis is done with traversals like reachableBy and reachableByFlows to prove paths from sources (inputs) to sinks (vulnerable calls), across functions when needed.

When to Use It

  • Proving that user input reaches a dangerous sink across multiple functions
  • Performing taint tracking and data-flow verification with Joern
  • Analyzing data flows in multi-language codebases (JS, Python, Java, C/C++, Go, PHP)
  • Obtaining end-to-end flow proof where pattern matching misses complex paths
  • Verifying full data-flow paths using reachableByFlows in CPGQL

Quick Start

  1. Parse the codebase: joern-parse /path/to/code --output cpg.bin
  2. Run an analysis script: joern --script analysis.sc --params cpgFile=cpg.bin
  3. Or work interactively: in the Joern REPL, run importCpg("cpg.bin"), then queries such as cpg.call.name("query").l

Best Practices

  • Start by parsing the codebase into a CPG with joern-parse /path/to/code --output cpg.bin
  • Use Joern REPL or an analysis script to run queries against the CPG (e.g., joern --script analysis.sc --params cpgFile=cpg.bin)
  • In queries, leverage .reachableBy(source) and .reachableByFlows(source) to identify paths and full data-flow chains
  • Compare CPG-based results with pattern-based checks (e.g., Semgrep) when appropriate, to validate findings
  • Keep a library of known sources (inputs) and sinks (dangerous calls) and update as code evolves
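
One lightweight way to keep such a library is a pair of lists from which the query text is generated. The sketch below uses hypothetical names (`SOURCES`, `SINKS`, `name_pattern`, `reachability_query`) and renders the "parameters that reach dangerous sinks" query shown earlier. It assumes the entries are regex fragments containing no double quotes.

```python
# Hypothetical catalogue; extend it as the codebase evolves.
SOURCES = ["req.*", "request.*"]
SINKS = ["query", "execute", "run"]

def name_pattern(names):
    """Join name regexes into one alternation for a .name("...") filter."""
    return "|".join(names)

def reachability_query(sources, sinks):
    """Render the source-to-sink reachability query as CPGQL text."""
    return "\n".join([
        f'val sources = cpg.parameter.name("{name_pattern(sources)}")',
        f'val sinks = cpg.call.name("{name_pattern(sinks)}")',
        "sinks.argument.reachableBy(sources).l",
    ])

print(reachability_query(SOURCES, SINKS))
```

Generating the query from data keeps the catalogue reviewable in version control and makes it harder for the script and the documentation to drift apart.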

Example Use Cases

  • Find all calls to a dangerous sink: cpg.call.name("query").l
  • Trace whether request parameters reach a sink: val sources = cpg.parameter.name("req.*|request.*"); val sinks = cpg.call.name("query|execute|run"); sinks.argument.reachableBy(sources).l
  • Get full data-flow paths from userInput to executeQuery: val sources = cpg.parameter.name("userInput"); val sinks = cpg.call.name("executeQuery"); sinks.argument.reachableByFlows(sources).p
  • Basic Joern workflow: run joern-parse /path/to/code --output cpg.bin, then either run joern --script analysis.sc --params cpgFile=cpg.bin or start joern and, in the REPL, run importCpg("cpg.bin") followed by cpg.method.name(".*login.*").l
  • Explore core CPGQL concepts: cpg.method, cpg.call, cpg.parameter, cpg.literal, cpg.identifier and traversals like .name(), .code(), .caller, .callee
