# Boof 🍑

by @chiefsegundo

```bash
npx machina-cli add skill @chiefsegundo/boof --openclaw
```
Local-first document processing: PDF → markdown → RAG index → token-efficient analysis.
Documents stay local. Only relevant chunks go to the LLM. Maximum knowledge absorption, minimum token burn.
## Quick Reference

Convert + index a document:

```bash
bash {SKILL_DIR}/scripts/boof.sh /path/to/document.pdf
```

Convert with a custom collection name:

```bash
bash {SKILL_DIR}/scripts/boof.sh /path/to/document.pdf --collection my-project
```

Query indexed content:

```bash
qmd query "your question" -c collection-name
```
## Core Workflow

1. **Boof it:** Run `boof.sh` on a PDF. This converts it to markdown via Marker (local ML, no API) and indexes it into QMD for semantic search.
2. **Query it:** Use `qmd query` to retrieve only the relevant chunks. Send those chunks to the LLM, not the entire document.
3. **Analyze it:** The LLM sees focused, relevant excerpts. No wasted tokens, no lost-in-the-middle problems.
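The three steps above can be sketched as a single session. This is a template, not a verbatim transcript: `{SKILL_DIR}` is the skill's install directory, and the file path, collection name, and question are all illustrative.

```bash
# 1. Boof it: convert the PDF to markdown and index it (collection name is an example)
bash {SKILL_DIR}/scripts/boof.sh ~/papers/attention.pdf --collection papers

# 2. Query it: retrieve only the chunks relevant to the question
qmd query "how is multi-head attention computed" -c papers

# 3. Analyze it: paste the retrieved excerpts, not the whole PDF, into the LLM prompt
```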
## When to Use Each Approach

- "Analyze this specific aspect of the paper" → Boof + query (cheapest, most focused)
- "Summarize this entire document" → Boof, then read the markdown section by section. Summarize each section individually, then merge the summaries. See `advanced-usage.md`.
- "Compare findings across multiple papers" → Boof all papers into one collection, then query across them.
- "Find where the paper discusses X" → `qmd search "X" -c collection` for exact match, `qmd query "X" -c collection` for semantic match.
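For the multi-paper case, one possible pattern is to boof every PDF into a shared collection, then query the collection as a whole. The directory, collection name, and question below are hypothetical, and `{SKILL_DIR}` stands for the skill's install directory:

```bash
# Index every paper in a folder into one shared collection
for pdf in ~/papers/*.pdf; do
  bash {SKILL_DIR}/scripts/boof.sh "$pdf" --collection survey
done

# One semantic query now searches across all indexed papers
qmd query "which papers report results on ImageNet" -c survey
```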
## Output Location

Converted markdown files are saved to `knowledge/boofed/` by default (override with `--output-dir`).
## Setup

If `boof.sh` reports missing dependencies, see `setup-guide.md` for installation instructions (Marker + QMD).
## Environment

- `MARKER_ENV`: Path to the marker-pdf Python venv (default: `~/.openclaw/tools/marker-env`)
- `QMD_BIN`: Path to the qmd binary (default: `~/.bun/bin/qmd`)
- `BOOF_OUTPUT_DIR`: Default output directory (default: `~/.openclaw/workspace/knowledge/boofed`)
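If the defaults do not match your machine, these variables can be overridden before running `boof.sh`. The paths below are examples only, not required locations:

```bash
# Example overrides; substitute your actual tool locations
export MARKER_ENV=/opt/venvs/marker-env
export QMD_BIN=/usr/local/bin/qmd
export BOOF_OUTPUT_DIR=~/projects/acme/knowledge/boofed

bash {SKILL_DIR}/scripts/boof.sh report.pdf
```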
## Overview

Boof converts PDFs and documents into markdown, then builds a local RAG index for efficient, token-conscious analysis. It keeps source files on your machine, sending only the relevant chunks to the LLM to maximize knowledge absorption while minimizing token burn.
## How This Skill Works

`boof.sh` converts a PDF to markdown using Marker (local ML, no API) and indexes it into QMD for semantic search. You then use `qmd query` to fetch only the most relevant chunks and feed those excerpts to the LLM for analysis, ensuring focused, efficient processing.
## When to Use It
- Read/analyze/summarize a PDF
- Process a document
- Boof a file
- Extract information from papers/decks/NOFOs
- Work with large documents without filling the context window
## Quick Start

1. Run `boof.sh /path/to/document.pdf` to convert the document to markdown and index it
2. Query content with `qmd query "your question" -c collection-name`
3. Review the retrieved excerpts and send them to the LLM for analysis
## Best Practices

- Run `boof.sh` on the target PDF to generate markdown and build the local index
- Organize projects with `--collection` and customize output with `--output-dir`
- Query selectively with `qmd query` to retrieve only the relevant chunks
- Batch-process multiple documents before performing analysis
- Review the retrieved excerpts and synthesize the results to avoid token waste
## Example Use Cases

- Analyze a long research paper locally and extract key findings
- Index a stack of NOFOs to pull deadlines and requirements across documents
- Compare findings across multiple decks by boofing them into a single collection
- Summarize a long report section by section and merge the summaries
- Build a local knowledge base for a project by ingesting multiple PDFs
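For the section-by-section summary use case, a small sketch for planning the split: list the heading outline of a boofed markdown file so each section can be extracted and summarized on its own. The output path is the default shown above and `report.md` is a hypothetical file name:

```bash
# Print each markdown heading with its line number; those line numbers mark
# the section boundaries for per-section summarization
grep -n '^#' ~/.openclaw/workspace/knowledge/boofed/report.md
```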