scan-repo-readme

npx machina-cli add skill nainishshafi/developer-productivity-skills/scan-repo-readme --openclaw

Scan Repo README

Efficiently locate and extract information from repository README files using a haiku-model agent with minimal context. The agent performs dual-phase search (keyword + semantic), writes results to a timestamped file, and the main agent reads and presents the findings.

⚠️ IMPORTANT — always use --root-only by default.
Running find-readmes.py without --root-only performs a full recursive walk and returns every nested README (including .venv, vendor packages, and large example trees), producing noisy, slow, and unreliable results.
Only omit --root-only when you deliberately need a full repository audit.

Quick Start

Windows (PowerShell):

if (-not (Test-Path .venv)) { python -m venv .venv }
$PYTHON = if (Test-Path .venv\Scripts\python.exe) { ".venv\Scripts\python" } else { "python" }
& $PYTHON .github/skills/scan-repo-readme/scripts/find-readmes.py --root-only

Unix/macOS (Bash):

[ -d .venv ] || python -m venv .venv
PYTHON=$(if [ -f .venv/Scripts/python ]; then echo .venv/Scripts/python; else echo .venv/bin/python; fi)
$PYTHON .github/skills/scan-repo-readme/scripts/find-readmes.py --root-only

Pass the output path (line 1 of the script's output) and the file list (lines 2 onward) to the subagent; see Step 2.

Workflow

Step 1 — Find README Files

Windows (PowerShell):

if (-not (Test-Path .venv)) { python -m venv .venv }
$PYTHON = if (Test-Path .venv\Scripts\python.exe) { ".venv\Scripts\python" } else { "python" }

# RECOMMENDED — root-only scan (top-level + immediate child directories only)
& $PYTHON .github/skills/scan-repo-readme/scripts/find-readmes.py --root-only

# Full recursive scan — use only for broad audits; expect many noisy results
# & $PYTHON .github/skills/scan-repo-readme/scripts/find-readmes.py

# Include .github/skills SKILL.md files in the listing
# & $PYTHON .github/skills/scan-repo-readme/scripts/find-readmes.py --root-only --include-skills

# Integrated keyword+semantic scanner (writes timestamped Markdown report)
& $PYTHON .github/skills/scan-repo-readme/scripts/run_scan_design_patterns.py --root-only

Unix/macOS (Bash):

[ -d .venv ] || python -m venv .venv
PYTHON=$(if [ -f .venv/Scripts/python ]; then echo .venv/Scripts/python; else echo .venv/bin/python; fi)

# RECOMMENDED — root-only
$PYTHON .github/skills/scan-repo-readme/scripts/find-readmes.py --root-only

# Full recursive scan (noisy — use only for repo-wide audits)
# $PYTHON .github/skills/scan-repo-readme/scripts/find-readmes.py

# Include SKILL.md files
# $PYTHON .github/skills/scan-repo-readme/scripts/find-readmes.py --root-only --include-skills

# Integrated keyword+semantic scanner
$PYTHON .github/skills/scan-repo-readme/scripts/run_scan_design_patterns.py --root-only

find-readmes.py prints to stdout:

  • Line 1: output file path (e.g., .scan-readme-results/readme-scan-20240315-143022.md)
  • Line 2+: absolute paths of all README files found
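The two-part stdout layout above can be consumed with a few lines of Python. This is a minimal sketch, not part of the skill: `ScanListing` and `parse_listing` are hypothetical names, and the only assumption taken from the text is that line 1 is the output path and every later non-empty line is a README path.

```python
from pathlib import Path
from typing import List, NamedTuple


class ScanListing(NamedTuple):
    """Hypothetical container for the parsed stdout of find-readmes.py."""
    output_path: Path         # line 1: where the report should be written
    readme_paths: List[Path]  # lines 2+: README files that were found


def parse_listing(stdout: str) -> ScanListing:
    """Split the script's stdout into the report path and the README list."""
    # Drop blank lines so a trailing newline does not become an empty path.
    lines = [line.strip() for line in stdout.splitlines() if line.strip()]
    if not lines:
        raise ValueError("expected at least one line (the output file path)")
    return ScanListing(Path(lines[0]), [Path(p) for p in lines[1:]])
```

A caller would typically capture the script's stdout (for example with `subprocess.run(..., capture_output=True, text=True)`) and hand it to `parse_listing` before building the subagent prompt.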

If Bash is unavailable, use Glob with patterns **/README.md, **/readme.md, **/README.rst, **/README.txt, **/README and build the output path manually from the current timestamp.
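For environments with Python but no shell, the fallback can be approximated as follows. This is a sketch under assumptions: `glob_readmes` and `build_output_path` are hypothetical helpers, the file names mirror the glob patterns listed above, and the output path format is inferred from the example `.scan-readme-results/readme-scan-20240315-143022.md`.

```python
from datetime import datetime
from pathlib import Path
from typing import List, Optional

# File names taken from the fallback glob patterns in the text.
README_NAMES = {"README.md", "readme.md", "README.rst", "README.txt", "README"}


def glob_readmes(repo_root: Path) -> List[Path]:
    """Recursively collect files whose name matches one of the README patterns."""
    # Walking once and comparing names avoids duplicate hits on
    # case-insensitive file systems, where README.md and readme.md coincide.
    found = {
        p.resolve()
        for p in repo_root.rglob("*")
        if p.is_file() and p.name in README_NAMES
    }
    return sorted(found)


def build_output_path(repo_root: Path, now: Optional[datetime] = None) -> Path:
    """Build a timestamped report path shaped like the documented example."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return repo_root / ".scan-readme-results" / f"readme-scan-{stamp}.md"
```

Note that this walk is fully recursive, like running the script without --root-only, so the resulting list may need trimming before it is passed on.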

Step 2 — Launch Haiku Subagent (required)

Invoke a dedicated subagent to perform the dual-phase scan and write the report. There is no local Python fallback for this step, so the subagent must be granted write access to the repository workspace.

Agent parameters (example):

  • subagent_type: "general-purpose"
  • model: "haiku" (agent may fall back to the session model if unavailable)
  • description: "Scan README files"
  • write: true # the subagent must be able to create the timestamped output file

Provide the subagent with a compact prompt containing only the file list, the output path, and the user query (no conversation history). Example prompt payload:

Files to scan: {paste file paths from Step 1, one per line}

Output file path: {paste output path from Step 1}

User query: {user's search term, or "full summary" if no query provided}

The subagent must perform the dual-phase search described below and write the final Markdown report to the provided output file. Do not return the report text in the agent response.
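The three-field payload can be assembled mechanically, which helps keep conversation history out of the prompt. The helper below is a hypothetical sketch (`build_subagent_prompt` is not part of the skill); it only reproduces the field labels shown above and the "full summary" default for an empty query.

```python
from typing import Iterable, Optional


def build_subagent_prompt(
    files: Iterable[str], output_path: str, query: Optional[str] = None
) -> str:
    """Assemble the compact prompt: file list, output path, and user query."""
    file_block = "\n".join(str(f) for f in files)
    # Fall back to "full summary" when no search term was supplied.
    user_query = query.strip() if query and query.strip() else "full summary"
    return (
        f"Files to scan:\n{file_block}\n\n"
        f"Output file path: {output_path}\n\n"
        f"User query: {user_query}\n"
    )
```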

Dual-phase scan (subagent behavior):

  • Phase 1 — Keyword search:

    • Expand the user's query using synonym groups from references/search-patterns.md.
    • Grep all listed files for these keywords, recording file path, line number, and ±3 lines of context.
  • Phase 2 — Semantic search:

    • Split files into sections by Markdown headings (H2/H3 preferred).
    • Score each section as HIGH / MEDIUM / LOW for relevance to the query.
  • Combine results (priority order): HIGH keyword+semantic → HIGH semantic-only → keyword-only. Quote directly from sources, include file path and heading, and deduplicate overlapping excerpts.
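The mechanical parts of both phases can be illustrated in Python. This is a sketch only: the subagent performs these steps itself, the relevance tier is a model judgment rather than code, and all names here (`KeywordHit`, `Section`, `keyword_hits`, `split_sections`, `priority`) are hypothetical. The keyword list is assumed to be already expanded with synonyms from references/search-patterns.md.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class KeywordHit:
    line_number: int    # 1-based line of the match
    keyword: str
    context: List[str]  # matched line plus up to `radius` lines on each side


@dataclass
class Section:
    heading: str
    start_line: int     # 1-based line where the section starts
    body: str


def keyword_hits(text: str, keywords: List[str], radius: int = 3) -> List[KeywordHit]:
    """Phase 1: case-insensitive keyword matches with surrounding context."""
    lines = text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        for kw in keywords:
            if kw.lower() in line.lower():
                lo, hi = max(0, i - radius), min(len(lines), i + radius + 1)
                hits.append(KeywordHit(i + 1, kw, lines[lo:hi]))
    return hits


# Matches H2 and H3 headings only; other heading levels stay in the body.
HEADING = re.compile(r"^(#{2,3})\s+(.*)$")


def split_sections(text: str) -> List[Section]:
    """Phase 2 (first half): split a README into H2/H3 sections."""
    sections: List[Section] = []
    heading, start, body = "(preamble)", 1, []
    for n, line in enumerate(text.splitlines(), start=1):
        match = HEADING.match(line)
        if match:
            if body or heading != "(preamble)":
                sections.append(Section(heading, start, "\n".join(body)))
            heading, start, body = match.group(2).strip(), n, []
        else:
            body.append(line)
    if body or heading != "(preamble)":
        sections.append(Section(heading, start, "\n".join(body)))
    return sections


def priority(tier: str, has_keyword_hit: bool) -> int:
    """Sort key for combining results; lower values are presented first."""
    if tier == "HIGH" and has_keyword_hit:
        return 0  # HIGH keyword+semantic
    if tier == "HIGH":
        return 1  # HIGH semantic-only
    if has_keyword_hit:
        return 2  # keyword-only
    return 3      # everything else
```

Sorting candidate excerpts by `priority(...)` reproduces the ordering described above; deduplication of overlapping excerpts is left out of the sketch.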

Report format (Markdown, written to the output path):

  • H1: # Repository Scan Report (include query, date, file count)
  • H2: file path of each README scanned
  • H3: relevance tier (HIGH / MEDIUM / LOW) when a query is provided
  • Body: direct quotes with surrounding context
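As an illustration of that layout, the following sketch renders a report from already-scored excerpts. `render_report` is a hypothetical helper, and the exact wording of the metadata line under the H1 is an assumption; the text only specifies that query, date, and file count are included.

```python
from datetime import date
from typing import Dict, List, Optional


def render_report(
    query: str,
    scanned: Dict[str, Dict[str, List[str]]],
    on: Optional[date] = None,
) -> str:
    """Render Markdown; `scanned` maps README path -> tier -> quoted excerpts."""
    stamp = (on or date.today()).isoformat()
    lines = [
        "# Repository Scan Report",
        "",
        # Assumed metadata layout: query, date, and file count on one line.
        f"Query: {query} | Date: {stamp} | Files scanned: {len(scanned)}",
        "",
    ]
    for path, tiers in scanned.items():
        lines += [f"## {path}", ""]
        for tier in ("HIGH", "MEDIUM", "LOW"):
            excerpts = tiers.get(tier, [])
            if not excerpts:
                continue  # omit empty tiers
            lines += [f"### {tier}", ""]
            for excerpt in excerpts:
                # Blockquote each line so quotes stay distinct from commentary.
                lines += [f"> {row}" for row in excerpt.splitlines()] + [""]
    return "\n".join(lines)
```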

Note: the subagent must create the output file at the provided path. If you run this skill in an environment that cannot launch haiku, configure the agent to use the session-inherited model and allow write access.

Root-only mode (default recommendation)

Use --root-only when the user's intent is limited to top-level project documentation. This avoids excessive noise from nested examples, vendored packages, and virtualenvs. The full recursive scan is available for broad repo-wide audits but will include many irrelevant README files in large monorepos — validate and trim the file list before launching downstream scans.
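To make the scope difference concrete, here is a rough sketch of what a root-only search covers, based on the description "top-level + immediate child directories only". It is not the code of find-readmes.py, whose actual behavior may differ; `find_root_readmes` and `README_NAMES` are hypothetical.

```python
from pathlib import Path
from typing import List

# Assumed set of README file names, mirroring the glob patterns in Step 1.
README_NAMES = {"README.md", "readme.md", "README.rst", "README.txt", "README"}


def find_root_readmes(repo_root: str) -> List[Path]:
    """Return READMEs in the repo root and its immediate child directories."""
    root = Path(repo_root)
    # Depth is capped at one level: the root itself plus its direct children.
    candidates = [root] + sorted(d for d in root.iterdir() if d.is_dir())
    found = []
    for directory in candidates:
        for entry in sorted(directory.iterdir()):
            if entry.is_file() and entry.name in README_NAMES:
                found.append(entry.resolve())
    return found
```

Because the walk never descends past the first level, READMEs buried inside virtualenvs, vendored packages, or example trees are not visited at all.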

Step 3 — Read and Present

Read the timestamped output file and present the findings clearly and concisely to the user.

Additional Resources

  • references/search-patterns.md — README filename patterns, synonym groups, relevance scoring criteria, output format spec
  • scripts/find-readmes.py — Locate README files and print the output path + file list. Supports --root-only and opt-in --include-skills.
  • scripts/run_scan_design_patterns.py — Integrated scanner for the design patterns query that performs keyword+semantic scanning and writes a timestamped Markdown report. Supports --root-only and --out.

Source

https://github.com/nainishshafi/developer-productivity-skills/blob/master/.github/skills/scan-repo-readme/SKILL.md

How This Skill Works

The skill first performs a root-constrained scan to find README files (default --root-only to avoid noise). It then launches a dedicated haiku subagent to run a dual-phase keyword+semantic search, writing results to a timestamped Markdown report which the main agent reads and presents.

When to Use It

  • Find what a README says about a topic or keyword.
  • Scan top-level and immediate child README files quickly with root-only scans.
  • Generate a timestamped report of findings for a repo's documentation.
  • Perform keyword+semantic search across README files to locate relevant info.
  • Audit repository docs for coverage like install, usage, or contribution guidelines.

Quick Start

  1. Run find-readmes.py --root-only to locate README files and capture the output path.
  2. Run the integrated scanner (run_scan_design_patterns.py --root-only) to produce a timestamped report.
  3. Open the generated report at the path printed in step 1 and review the findings.

Best Practices

  • Always run with --root-only by default to avoid noisy, full-repo scans.
  • Omit --root-only only when you deliberately need a full repository audit.
  • Use --root-only --include-skills if you want to include SKILL.md files in the search.
  • Review the generated timestamped report before sharing results.
  • Keep the search scope focused on README documentation to maximize relevance.

Example Use Cases

  • Find installation steps or usage examples described in a project's README.
  • Check whether README files across modules consistently document APIs.
  • Locate contribution guidelines and license information in repository docs.
  • Identify gaps where READMEs lack sections like setup or troubleshooting.
  • Compare how different components describe configuration or environment setup.
