What is abstract-state-analyzer?

A static analysis tool that uses abstract interpretation to infer program states and detect potential runtime errors without executing the code.

Which abstract domains are supported?

Supported domains include Interval (numeric ranges), Sign (positive/negative/zero), Null (nullability), Type (possible types), and combinations of these for more precise analysis.

How should I use the results?

Review the reported locations, error types, and the associated abstract state. Use the explanations and suggestions to fix issues, then re-run analysis to confirm resolutions.

abstract-state-analyzer

Scanned

npx machina-cli add skill ArabelaTso/Skills-4-SE/abstract-state-analyzer --openclaw

Files (1)

SKILL.md

7.0 KB

Abstract State Analyzer

Overview

This skill performs abstract interpretation to statically analyze source code and infer possible program states, variable ranges, and data properties. It identifies potential runtime errors without executing the program.

Analysis Workflow

Step 1: Parse and Understand Code Structure

Analyze the code to identify:

Functions and their control flow
Variable declarations and types
Loops and conditionals
Array/buffer operations
Pointer/reference operations
Function calls and parameter passing

Step 2: Select Abstract Domains

Choose appropriate abstract domains based on the analysis goals:

Interval Domain: Track numeric variable ranges

Example: x ∈ [0, 100] means x is between 0 and 100
Good for: Array bounds checking, overflow detection

Sign Domain: Track whether values are positive, negative, or zero

Values: {+, -, 0, ⊤}
Good for: Division by zero, sign-dependent operations

Null Domain: Track whether pointers/references can be null

Values: {null, not-null, maybe-null, ⊤}
Good for: Null dereference detection

Type Domain: Track possible types of variables

Good for: Type consistency checking, dynamic language analysis

Combination: Use multiple domains together for more precise analysis

Step 3: Initialize Abstract States

Set initial abstract values for:

Function parameters (based on preconditions or ⊤ for unknown)
Global variables
Constants and literals

Example:

def process(arr, index):
    # Initial state:
    # arr: not-null (assumed)
    # index: ⊤ (unknown integer)

Step 4: Perform Forward Analysis

Propagate abstract states through the program:

Assignment: Update abstract value

x = 5
# x: [5, 5]

y = x + 3
# y: [8, 8]

Conditionals: Split into branches

if x > 10:
    # Branch 1: x ∈ [11, ∞]
else:
    # Branch 2: x ∈ [-∞, 10]

Loops: Iterate until fixpoint

i = 0
while i < n:
    # Iteration 1: i ∈ [0, 0]
    # Iteration 2: i ∈ [0, 1]
    # ...
    # Fixpoint: i ∈ [0, n-1]
    i += 1

Join Points: Merge states from multiple paths

if condition:
    x = 5  # x: [5, 5]
else:
    x = 10  # x: [10, 10]
# After join: x ∈ [5, 10]

Step 5: Detect Potential Errors

Check for violations at each operation:

Array Bounds:

arr[index]
# Check: index ∈ [0, len(arr)-1]?
# If index: ⊤ → Potential out-of-bounds
# If index: [0, 5] and len(arr) = 10 → Safe

Null Dereference:

ptr.field
# Check: ptr is not-null?
# If ptr: maybe-null → Potential null dereference

Division by Zero:

x / y
# Check: 0 ∉ y?
# If y: [1, 10] → Safe
# If y: [-5, 5] → Potential division by zero

Integer Overflow:

x = a * b
# Check: a * b within type bounds?
# If a: [1000, 2000], b: [1000, 2000] → Potential overflow for int32

Type Inconsistency:

result = func(arg)
# Check: arg type matches parameter type?

Step 6: Report Findings

For each potential error, report:

Location: File, line number, function
Error Type: Out-of-bounds, null dereference, etc.
Abstract State: Variable values at the error point
Severity: Definite error vs. potential error
Explanation: Why the error might occur
Suggestion: How to fix (add check, change bounds, etc.)

Complete Example

def find_max(arr, n):
    if n <= 0:
        return None

    max_val = arr[0]
    i = 1
    while i < n:
        if arr[i] > max_val:
            max_val = arr[i]
        i += 1
    return max_val

Analysis:

Initial State:

arr: not-null (assumed)
n: ⊤ (unknown integer)

Line 2: if n <= 0

Branch 1 (n ≤ 0): n ∈ [-∞, 0]
Branch 2 (n > 0): n ∈ [1, ∞]

Line 3: return None (Branch 1)

Safe return

Line 5: max_val = arr[0] (Branch 2)

Access: arr[0]
Check: 0 < len(arr)?
POTENTIAL ERROR: arr length unknown, might be empty
State: max_val = arr[0], n ∈ [1, ∞]

Line 6: i = 1

State: i = [1, 1]

Line 7: while i < n

Loop invariant: i ∈ [1, n]
Fixpoint: i ∈ [1, n-1] inside loop

Line 8: if arr[i] > max_val

Access: arr[i] where i ∈ [1, n-1]
Check: i < len(arr)?
POTENTIAL ERROR: If n > len(arr), out-of-bounds access
State: max_val updated if arr[i] > max_val

Line 10: i += 1

State: i ∈ [2, n]

Report:

POTENTIAL ERRORS FOUND:

1. Out-of-Bounds Access
   Location: line 5, arr[0]
   State: n ∈ [1, ∞], arr length unknown
   Severity: Potential
   Explanation: Array 'arr' might be empty when n > 0
   Suggestion: Add check: if len(arr) == 0 or add precondition

2. Out-of-Bounds Access
   Location: line 8, arr[i]
   State: i ∈ [1, n-1], arr length unknown
   Severity: Potential
   Explanation: If n > len(arr), accessing beyond array bounds
   Suggestion: Add precondition: n <= len(arr) or check i < len(arr)

Language-Specific Considerations

C/C++

Track pointer arithmetic carefully
Consider undefined behavior (signed overflow, null dereference)
Analyze memory allocation/deallocation
Check buffer sizes for string operations

Python

Dynamic typing requires type domain
List/dict operations need bounds checking
None values require null domain
Consider duck typing and attribute access

Java

Null pointer exceptions
Array bounds (ArrayIndexOutOfBoundsException)
Integer overflow (silent wraparound)
Type casting (ClassCastException)

JavaScript

Undefined and null values
Type coercion issues
Array bounds (returns undefined, not error)
Property access on null/undefined

Handling Complexity

Widening for Loops

When loops don't converge quickly, apply widening:

# Instead of: [0, 0] → [0, 1] → [0, 2] → ...
# Widen to: [0, ∞]

Function Summaries

For called functions, use summaries instead of full analysis:

def helper(x):
    # Summary: returns x + 1, no errors
    return x + 1

# Use summary instead of analyzing helper body

Path Sensitivity

For complex conditionals, track path conditions:

if x > 0 and x < 10:
    # x ∈ [1, 9] (path-sensitive)
    # vs x ∈ [-∞, ∞] (path-insensitive)

References

For detailed information on abstract interpretation techniques and domains:

references/abstract_domains.md: Detailed abstract domain definitions and operations
references/analysis_patterns.md: Common analysis patterns for different error types
references/language_specifics.md: Language-specific analysis considerations

Source

git clone https://github.com/ArabelaTso/Skills-4-SE/blob/main/skills/abstract-state-analyzer/SKILL.mdView on GitHub

Overview

Performs abstract interpretation to statically analyze source code and infer possible program states, variable ranges, and data properties without execution. It identifies potential runtime errors such as array bounds violations, null dereferences, type inconsistencies, division by zero, and integer overflows, helping you assess safety and verify behavior before running code.

How This Skill Works

The analyzer parses code structure, selects one or more abstract domains (Interval, Sign, Null, Type), initializes abstract states from parameters and globals, and performs forward analysis to propagate states through assignments, conditionals, and loops until a fixpoint. It then detects potential errors at operations like array indexing, dereferences, division, and type mismatches, and reports findings with context and guidance.

When to Use It

Auditing a codebase for potential runtime errors before deployment
Performing static analysis to verify safety properties without running the program
Checking array/buffer bounds, null safety, and division-by-zero risks in critical code
Verifying type consistency and data invariants in dynamic or weakly-typed languages
Investigating complex control flows to identify possible state violations and fixpoints

Quick Start

Step 1: Provide the source code or repository path to the analyzer
Step 2: Configure abstract domains (Interval, Sign, Null, Type) to match your goals
Step 3: Run analysis and review the reported potential errors with their abstract states and recommendations

Best Practices

Define clear input contracts and initialize parameters with ⊤ only when the value is truly unknown
Choose and combine abstract domains (Interval, Sign, Null, Type) tailored to your analysis goals
Analyze with forward propagation and ensure fixpoint convergence using appropriate widening
Carefully interpret array index and dereference checks to distinguish definite vs. potential issues
Validate findings with targeted tests or dynamic checks to confirm real-world impact

Example Use Cases

Identifying potential array bounds violations in C/C++ routines
Finding null dereferences in Java/C# code paths without running the program
Uncovering division-by-zero risks in numerical computations
Spotting integer overflow scenarios in arithmetic-intensive modules
Verifying type consistency across dynamic languages or runtime-typed segments

Frequently Asked Questions

Add this skill to your agents