Get the FREE Ultimate OpenClaw Setup Guide →

binary-re-triage

Scanned
npx machina-cli add skill aiskillstore/marketplace/binary-re-triage --openclaw
Files (1)
SKILL.md
6.6 KB

Binary Triage (Phase 1)

Purpose

Quick fingerprinting to establish baseline facts before deeper analysis. Runs in seconds, not minutes.

When to Use

  • First contact with an unknown binary
  • Need architecture/ABI info for tool selection
  • Quick capability assessment
  • Before committing to expensive analysis

Key Principle

Gather facts fast, defer analysis.

This phase identifies WHAT the binary is, not HOW it works.

Triage Sequence

Step 1: File Identification

# Basic identification
file binary

# Expected output patterns:
# ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3
# ELF 64-bit LSB pie executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1

Extract:

  • Architecture (ARM, ARM64, x86_64, MIPS)
  • Bit width (32/64)
  • Endianness (LSB/MSB)
  • Link type (static/dynamic)
  • Interpreter path (libc indicator)

Step 2: Structured Metadata (rabin2)

# All metadata as JSON
rabin2 -q -j -I binary | jq .

# Key fields:
# .arch     - "arm", "x86", "mips"
# .bits     - 32 or 64
# .endian   - "little" or "big"
# .os       - "linux", "none"
# .machine  - "ARM", "AARCH64"
# .stripped - true/false
# .static   - true/false

Step 3: ABI Detection

# Interpreter detection
readelf -p .interp binary 2>/dev/null

# Or via rabin2
rabin2 -I binary | grep interp

# ARM-specific: float ABI
readelf -A binary | grep "Tag_ABI_VFP_args"
# hard-float: "VFP registers"
# soft-float: missing or "compatible"

Interpreter → Libc mapping:

InterpreterLibcNotes
/lib/ld-linux-armhf.so.3glibcARM hard-float
/lib/ld-linux.so.3glibcARM soft-float
/lib/ld-musl-arm.so.1muslARM 32-bit
/lib/ld-musl-aarch64.so.1muslARM 64-bit
/lib/ld-uClibc.so.0uClibcEmbedded
/lib64/ld-linux-x86-64.so.2glibcx86_64

Step 4: Dependencies

# Library dependencies
rabin2 -q -j -l binary | jq '.libs[]'

# Common patterns:
# libcurl.so.* → HTTP client
# libssl.so.* → TLS/crypto
# libpthread.so.* → Threading
# libz.so.* → Compression
# libsqlite3.so.* → Local database

Step 5: Entry Points & Exports

# Entry points
rabin2 -q -j -e binary | jq .

# Exports (for shared libraries)
rabin2 -q -j -E binary | jq '.exports[] | {name, vaddr}'

Step 6: Quick String Scan

# All strings with metadata
rabin2 -q -j -zz binary | jq '.strings | length'  # Count first

# Filter interesting strings (URLs, paths, errors)
rabin2 -q -j -zz binary | jq '
  .strings[] |
  select(.length > 8) |
  select(.string | test("http|ftp|/etc|/var|error|fail|pass|key|token"; "i"))
'

Step 7: Import Analysis

# All imports
rabin2 -q -j -i binary | jq '.imports[] | {name, lib}'

# Group by capability
rabin2 -q -j -i binary | jq '
  .imports | group_by(.lib) |
  map({lib: .[0].lib, functions: [.[].name]})
'

Capability Mapping

Import PatternCapability
socket, connect, sendNetwork client
bind, listen, acceptNetwork server
open, read, writeFile I/O
fork, exec*, systemProcess spawning
pthread_*Multi-threading
SSL_*, EVP_*Cryptography
dlopen, dlsymDynamic loading
mmap, mprotectMemory manipulation

Output Format

After triage, record structured facts:

{
  "artifact": {
    "path": "/path/to/binary",
    "sha256": "abc123...",
    "size_bytes": 245760
  },
  "identification": {
    "arch": "arm",
    "bits": 32,
    "endian": "little",
    "os": "linux",
    "stripped": true,
    "static": false
  },
  "abi": {
    "interpreter": "/lib/ld-musl-arm.so.1",
    "libc": "musl",
    "float_abi": "hard"
  },
  "dependencies": [
    "libcurl.so.4",
    "libssl.so.1.1",
    "libz.so.1"
  ],
  "capabilities_inferred": [
    "network_client",
    "tls_encryption",
    "compression"
  ],
  "strings_of_interest": [
    {"value": "https://api.vendor.com/telemetry", "type": "url"},
    {"value": "/etc/config.json", "type": "path"}
  ],
  "complexity_estimate": {
    "functions": "unknown (stripped)",
    "strings": 847,
    "imports": 156
  }
}

Knowledge Journaling

After triage completes, record findings for episodic memory:

[BINARY-RE:triage] {filename} (sha256: {hash})

Identification:
  Architecture: {arch} {bits}-bit {endian}
  Libc: {glibc|musl|uclibc} ({interpreter_path})
  Stripped: {yes|no}
  Size: {bytes}

FACT: Links against {library} (source: rabin2 -l)
FACT: Contains {N} strings of interest (source: rabin2 -zz)
FACT: Imports {function} from {library} (source: rabin2 -i)

Capabilities inferred:
  - {capability_1} (evidence: {import/string})
  - {capability_2} (evidence: {import/string})

HYPOTHESIS: {what binary likely does} (confidence: {0.0-1.0})

QUESTION: {open unknown that needs investigation}

Next phase: {static-analysis|dynamic-analysis}
Sysroot needed: {path or "extract from device"}

Example Journal Entry

[BINARY-RE:triage] thermostat_daemon (sha256: a1b2c3d4...)

Identification:
  Architecture: ARM 32-bit LE
  Libc: musl (/lib/ld-musl-arm.so.1)
  Stripped: yes
  Size: 153,600 bytes

FACT: Links against libcurl.so.4 (source: rabin2 -l)
FACT: Links against libssl.so.1.1 (source: rabin2 -l)
FACT: Contains string "api.thermco.com" (source: rabin2 -zz)
FACT: Imports curl_easy_perform (source: rabin2 -i)

Capabilities inferred:
  - HTTP client (evidence: libcurl import)
  - TLS encryption (evidence: libssl import)
  - Network communication (evidence: URL string)

HYPOTHESIS: Telemetry client that reports to api.thermco.com (confidence: 0.6)

QUESTION: What data does it collect and transmit?

Next phase: static-analysis
Sysroot needed: musl ARM (extract from device or Alpine)

Decision Points

After triage, determine:

  1. Sysroot selection - Based on arch + libc
  2. Analysis tool chain - r2 vs Ghidra vs both
  3. Dynamic analysis feasibility - QEMU viability based on arch
  4. Initial hypotheses - What does this binary likely do?

Next Steps

→ Proceed to binary-re-static-analysis for function enumeration → Or binary-re-dynamic-analysis if behavior observation is priority

Source

git clone https://github.com/aiskillstore/marketplace/blob/main/skills/2389-research/binary-re-triage/SKILL.mdView on GitHub

Overview

Binary Triage Phase 1 provides rapid fingerprinting of unknown binaries, ELF files, or firmware blobs to establish baseline facts in seconds. It leverages rabin2 for architecture detection, ABI identification, dependency mapping, and string extraction to guide deeper analysis.

How This Skill Works

It starts with quick file identification, then uses rabin2 to produce a JSON metadata snapshot of arch, bits, endian, OS, and libraries. It then checks ABI signals via readelf or rabin2, maps dependencies, and enumerates entry points and exports, followed by a fast string scan to surface URLs, paths, and keys.

When to Use It

  • First contact with an unknown binary
  • Need architecture/ABI info for tool selection
  • Quick capability assessment
  • Before committing to expensive analysis
  • Initial triage to decide on file type and baseline facts

Quick Start

  1. Step 1: file binary
  2. Step 2: rabin2 -q -j -I binary | jq .
  3. Step 3: rabin2 -q -j -zz binary | jq '.strings | length'

Best Practices

  • Run file to identify ELF/PIE and architecture
  • Extract JSON metadata with rabin2 -q -j -I and inspect fields
  • Verify interpreter and libraries to map ABI
  • Inventory dependencies and potential capabilities from imports
  • Scan strings for URLs, paths, errors, keys to guide focus

Example Use Cases

  • Unknown ARM ELF on an embedded device
  • x86_64 Linux binary with dynamic linking and libc
  • MIPS firmware blob from a router
  • Shared library with exports and external dependencies
  • Static binary with non-stripped strings indicating potential targets

Frequently Asked Questions

Add this skill to your agents
Sponsor this space

Reach thousands of developers