exploring-codebases
npx machina-cli add skill oaustegard/claude-skills/exploring-codebases --openclaw
Exploring Codebases
Hybrid search tool that combines the speed of ripgrep with structural awareness from tree-sitter or pre-generated _MAP.md files. It finds matches and returns the entire function or class containing the match, de-duplicating results semantically.
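The expand-and-de-duplicate step can be pictured with a minimal sketch. It assumes symbol line ranges are already known (from tree-sitter or a map) and that raw match line numbers come from ripgrep. The `Symbol` class, `enclosing_symbol`, and `expand_matches` are hypothetical names for illustration, not the skill's actual internals, and choosing the innermost enclosing symbol is an assumption the source does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    """A structural element with an inclusive, 1-based line range (hypothetical shape)."""
    name: str
    start: int
    end: int


def enclosing_symbol(symbols, line):
    """Return the innermost symbol whose range contains `line`, or None."""
    containing = [s for s in symbols if s.start <= line <= s.end]
    # Assumption: "innermost" means the symbol with the smallest line span.
    return min(containing, key=lambda s: s.end - s.start, default=None)


def expand_matches(symbols, match_lines):
    """Map raw match lines to enclosing symbols, reporting each symbol once."""
    seen = set()
    results = []
    for line in match_lines:
        symbol = enclosing_symbol(symbols, line)
        if symbol is not None and symbol not in seen:
            seen.add(symbol)
            results.append(symbol)
    return results


symbols = [
    Symbol("User", 1, 20),
    Symbol("User.validate", 5, 12),
    Symbol("process_data", 25, 40),
]
# Lines 6 and 9 both fall inside User.validate, so it is reported once.
# Line 22 is outside every symbol and is dropped.
hits = expand_matches(symbols, [6, 9, 30, 22])
print([s.name for s in hits])  # ['User.validate', 'process_data']
```

Matches that fall outside every symbol are dropped here, which mirrors the limitation that non-structural lines such as imports are not returned.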
Progressive Disclosure
By default, returns signatures only (docstrings + declarations without function bodies), reducing token usage by 10-20x. Use --expand-full to get complete implementations when needed.
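To make "signatures only" concrete, here is a rough sketch that produces a similar view for Python source using the standard-library `ast` module. This is a stand-in technique: the skill itself relies on tree-sitter or map files, and the `signatures` function and `SAMPLE` text below are hypothetical.

```python
import ast

SAMPLE = '''\
class User:
    """User account model."""

    def validate(self, strict=True):
        """Check required fields."""
        if strict:
            raise ValueError("missing")
        return True
'''


def signatures(source):
    """Return declarations and first docstring lines, with bodies replaced by '...'."""
    lines = source.splitlines()
    kinds = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    nodes = [n for n in ast.walk(ast.parse(source)) if isinstance(n, kinds)]
    out = []
    for node in sorted(nodes, key=lambda n: n.lineno):
        # The declaration runs from the def/class line up to the first body statement.
        header_end = max(node.lineno, node.body[0].lineno - 1)
        out.extend(lines[node.lineno - 1:header_end])
        indent = " " * (node.col_offset + 4)
        doc = ast.get_docstring(node)
        if doc:
            out.append(f'{indent}"""{doc.splitlines()[0]}"""')
        out.append(f"{indent}...")
    return "\n".join(out)


print(signatures(SAMPLE))
```

The method body (the `if` and `return` lines) is omitted from the result, which is where the token savings come from; the actual reduction depends on how long the bodies are.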
Installation
uv venv /home/claude/.venv
# tree-sitter is optional if using --use-maps mode
uv pip install tree-sitter-language-pack --python /home/claude/.venv/bin/python
Usage
# Default: signatures only with tree-sitter (efficient)
/home/claude/.venv/bin/python /mnt/skills/user/exploring-codebases/scripts/search.py "query" /path/to/repo
# Full implementations
/home/claude/.venv/bin/python /mnt/skills/user/exploring-codebases/scripts/search.py "query" /path/to/repo --expand-full
# Map-based mode: no tree-sitter required (#276)
# Requires _MAP.md files generated by mapping-codebases
/home/claude/.venv/bin/python /mnt/skills/user/exploring-codebases/scripts/search.py "query" /path/to/repo --use-maps
Options
- --glob pattern: Filter files (e.g., *.py, *.ts)
- --expand-full: Return full implementations instead of signatures
- --json: Output JSON for machine processing (default is Markdown)
- --use-maps: Use pre-generated _MAP.md files instead of tree-sitter for context expansion. Eliminates redundant tree-sitter parsing when maps already exist.
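When calling the script from another program, the options above can be assembled into an argument list. The following is a small sketch: the `build_search_command` helper is hypothetical, the interpreter and script paths are the ones used in the Usage section, and the structure of the `--json` output is not documented here, so the sketch stops at building the command.

```python
import shlex

# Paths taken from the Usage section; adjust for your environment.
PYTHON = "/home/claude/.venv/bin/python"
SCRIPT = "/mnt/skills/user/exploring-codebases/scripts/search.py"


def build_search_command(query, repo, glob=None, expand_full=False,
                         as_json=False, use_maps=False):
    """Build an argv list for search.py from the documented options."""
    cmd = [PYTHON, SCRIPT, query, repo]
    if glob:
        cmd += ["--glob", glob]
    if expand_full:
        cmd.append("--expand-full")
    if as_json:
        cmd.append("--json")
    if use_maps:
        cmd.append("--use-maps")
    return cmd


cmd = build_search_command("process_data", "/path/to/repo",
                           glob="*.py", as_json=True)
# shlex.join quotes arguments so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, capture_output=True, text=True)`, which avoids shell quoting problems with glob patterns such as `*.py`.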
Map-Based Mode (v0.3.0)
When _MAP.md files exist (generated by mapping-codebases), use --use-maps to skip tree-sitter entirely:
- mapping-codebases runs tree-sitter once to generate _MAP.md files with symbol locations
- exploring-codebases --use-maps uses ripgrep + map data for context expansion
- No redundant AST parsing at search time
Benefits:
- Single tree-sitter execution per codebase (during map generation)
- Faster searches (no AST parsing overhead)
- tree-sitter-language-pack not required at runtime
- Maps serve as canonical index for both navigation and search
Recommended workflow:
# Step 1: Generate maps (one-time, or after code changes)
python /mnt/skills/user/mapping-codebases/scripts/codemap.py /path/to/repo
# Step 2: Search using maps (fast, no tree-sitter needed)
python /mnt/skills/user/exploring-codebases/scripts/search.py "query" /path/to/repo --use-maps
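A wrapper can decide between the two modes by checking whether maps are present. The sketch below is illustrative only: `has_maps` and `mode_flags` are hypothetical helpers, and it assumes _MAP.md files may sit anywhere under the repository root, which the source does not specify. It also does not detect stale maps, so regenerate them after code changes as noted in Step 1.

```python
import tempfile
from pathlib import Path


def has_maps(repo):
    """True if at least one _MAP.md file exists anywhere under `repo`."""
    return next(Path(repo).rglob("_MAP.md"), None) is not None


def mode_flags(repo):
    """Return ['--use-maps'] when maps are present, else no extra flags."""
    return ["--use-maps"] if has_maps(repo) else []


# Demonstrate on a throwaway directory instead of a real repository.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    before = mode_flags(repo)  # no maps generated yet
    (repo / "src").mkdir()
    (repo / "src" / "_MAP.md").write_text("# placeholder map\n")
    after = mode_flags(repo)   # a map file now exists

print(before, after)  # [] ['--use-maps']
```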
Examples
Find class signatures:
/home/claude/.venv/bin/python /mnt/skills/user/exploring-codebases/scripts/search.py "class User" /path/to/repo
Output:
class User:
    """User account model."""
    ...
Find full method implementation:
/home/claude/.venv/bin/python /mnt/skills/user/exploring-codebases/scripts/search.py "def validate" /path/to/repo --expand-full
Find usage of process_data in Python files:
/home/claude/.venv/bin/python /mnt/skills/user/exploring-codebases/scripts/search.py "process_data" /path/to/repo --glob "*.py"
Scope and Limitations
Returns structural code elements — functions, classes, methods, interfaces, enums, structs, traits, and modules across 11 supported languages (Python, JavaScript, TypeScript, Go, Rust, Ruby, Java, C, C++, PHP, C#).
Does not return:
- Import/require statements
- Module-level variable assignments or constants
- Standalone decorators (decorators attached to functions/classes are included with their parent)
- Type aliases or standalone type annotations
- Comments outside of functions/classes
For these non-structural elements, use plain ripgrep (via the Grep tool) directly.
Source
https://github.com/oaustegard/claude-skills/blob/main/exploring-codebases/SKILL.md
Overview
Exploring Codebases is a hybrid search tool that pairs ripgrep speed with structural awareness from tree-sitter or pre-generated _MAP.md files. It locates matches and returns the entire function or class containing the match, de-duplicating results semantically. This ensures you get complete, syntactically valid blocks across 11 languages when full context is needed.
How This Skill Works
The tool searches with ripgrep for a query and then expands matches to full AST nodes using tree-sitter or pre-generated _MAP.md maps. By default, it returns only signatures (docstrings and declarations) to save tokens; add --expand-full to retrieve full implementations. In map-based mode with --use-maps, pre-generated maps skip runtime AST parsing for faster results.
When to Use It
- Locating the full function or class that contains a search hit to understand context
- Looking for implementations or usage patterns across a large codebase
- Comparing how a function is implemented across languages
- Reducing token usage by inspecting only signatures before deep dives
- Speeding up searches when pre-generated _MAP.md files are available
Quick Start
- Step 1: Run the search script, e.g. python path/to/scripts/search.py "query" /path/to/repo; by default it returns signatures only
- Step 2: Scan the signatures, then rerun with --expand-full for the results whose full bodies you need
- Step 3: Optionally narrow the search with --glob, or speed it up with --use-maps when maps exist
Best Practices
- Start with signatures mode to quickly skim results
- Use --expand-full when you need full function or class bodies
- Filter results with --glob to narrow to relevant file types
- If maps exist, use --use-maps to skip runtime AST parsing
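- Expect only structural elements (functions, classes, methods, and similar) in the output; for imports, constants, or comments outside functions and classes, use plain ripgrep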
Example Use Cases
- Find class signatures for User with a repo-wide search
- Find full method implementation for a specific function like 'def validate'
- Find usage of a symbol such as process_data in Python files using a glob
- Inspect cross-language implementations of a function to compare styles
- Audit symbol locations across a codebase using map-based mode