rag-research
npx machina-cli add skill docutray/docutray-claude-code-plugins/rag-research --openclaw

RAG Research Skill
Use this skill when users ask about RAG (Retrieval-Augmented Generation), semantic search, document indexing, embeddings, vector databases, or chunking strategies. This skill provides best practices for working with the rag-research plugin and optimizing document retrieval.
When to Use
Trigger this skill when users:
- Ask about RAG, embeddings, or semantic search concepts
- Want to optimize their document indexing strategy
- Need help with chunking or retrieval quality
- Ask "how does rag-research work?" or "how to improve search results?"
- Troubleshoot poor search results or missing information
Core Concepts
Document Indexing Pipeline
- Load Document: Extract text from PDF, Markdown, or Text files
- Chunk Text: Split into overlapping segments (default: 512 chars, 50 overlap)
- Generate Embeddings: Convert chunks to vectors using FastEmbed (BAAI/bge-small-en-v1.5)
- Store in Qdrant: Persist vectors with metadata for retrieval
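The chunking step above can be sketched as a sliding character window. This is a minimal illustration of the 512/50 defaults, not the plugin's actual implementation:

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    """Split text into overlapping fixed-size character chunks."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap  # advance 462 chars per chunk by default
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

sample = "".join(str(i % 10) for i in range(1200))
chunks = chunk_text(sample)
# consecutive chunks share their 50-char boundary region
```

Each chunk is then embedded and stored; the overlap ensures that text near a chunk boundary appears in two chunks.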
Embedding Models
The plugin uses FastEmbed with ONNX Runtime for efficient CPU inference:
| Model | Dimensions | Speed | Quality | Use Case |
|---|---|---|---|---|
| BAAI/bge-small-en-v1.5 | 384 | Fast | Good | Default, general use |
| BAAI/bge-base-en-v1.5 | 768 | Medium | Better | Higher accuracy needs |
| BAAI/bge-large-en-v1.5 | 1024 | Slow | Best | Maximum quality |
Chunking Strategies
Chunk size affects retrieval quality:
- Smaller chunks (256-512): More precise, may lose context
- Larger chunks (1024+): More context, may dilute relevance
- Overlap (10-20%): Prevents information loss at boundaries
Recommendation: Start with the defaults (512/50), then adjust based on results.
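Why overlap matters: a term that straddles a chunk boundary is split in half and becomes unsearchable unless neighboring chunks share it. A toy demonstration with sizes shrunk for readability:

```python
text = "The API key lives in .env. Qdrant stores the vectors on disk."

def split(text: str, size: int, step: int) -> list[str]:
    """Slice text into size-char chunks, advancing by step each time."""
    return [text[i:i + size] for i in range(0, len(text), step)]

no_overlap = split(text, size=30, step=30)         # 0 chars shared
with_overlap = split(text, size=30, step=30 - 10)  # 10 chars shared

# "Qdrant" is cut in half at the 30-char boundary without overlap,
# but survives intact inside an overlapping chunk.
found_without = any("Qdrant" in c for c in no_overlap)
found_with = any("Qdrant" in c for c in with_overlap)
```

The same effect at real scale is why the default keeps a 50-char overlap (~10% of a 512-char chunk).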
Search Quality Tips
- Use specific queries: "Mistral OCR API configuration" > "OCR"
- Check coverage: Run `/rag-research:list` to verify documents are indexed
- Increase limit: Use `--limit 20` for comprehensive research
- Review scores: Scores > 0.7 are highly relevant; < 0.5 may be tangential
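The score thresholds above can be applied when post-processing results. A sketch assuming hits arrive as (text, score) pairs; the plugin's actual result shape may differ:

```python
# Hypothetical search hits: (chunk text, similarity score).
results = [
    ("Set MISTRAL_API_KEY in .env to enable OCR", 0.82),
    ("Qdrant persists vectors under ~/.rag-research", 0.64),
    ("Unrelated changelog entry", 0.41),
]

HIGH = 0.7  # > 0.7: highly relevant
LOW = 0.5   # < 0.5: likely tangential

strong = [t for t, s in results if s > HIGH]
tangential = [t for t, s in results if s < LOW]
```

Hits between the two thresholds are worth a manual skim before citing them.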
Troubleshooting
Poor Search Results
- Check if document is indexed: `/rag-research:list --filter "keyword"`
- Re-index with different chunking: Adjust `CHUNK_SIZE` in `.env`
- Use more specific queries: Add domain-specific terms
- Verify embeddings: Check model compatibility
PDF Extraction Issues
- Enable Mistral OCR: Set `MISTRAL_API_KEY` in `.env` for scanned PDFs
- Fall back to pypdf: Use the `--no-ocr` flag for text-based PDFs
- Check file permissions: Ensure the PDF is readable
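Putting the `.env` settings mentioned in this troubleshooting section together (key names come from the tips above; values are the default and a placeholder):

```
CHUNK_SIZE=512
MISTRAL_API_KEY=your-key-here
```

See references/configuration.md for the full list of supported keys.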
Database Issues
- Reset database: `rm -rf ~/.rag-research` and re-index
- Check disk space: Qdrant needs space for vectors
- Verify installation: `uv run rag-research stats`
Configuration Reference
See references/configuration.md for detailed settings documentation.
Examples
See references/examples.md for common usage patterns.
Source
View on GitHub: https://github.com/docutray/docutray-claude-code-plugins/blob/main/plugins/rag-research/skills/rag-research/SKILL.md

Overview
RAG Research helps you design and optimize Retrieval-Augmented Generation workflows. It covers document indexing, embedding, vector databases, and chunking strategies, with practical tips for using the rag-research plugin to boost retrieval quality.
How This Skill Works
Load documents from PDFs, Markdown, or Text, then chunk into overlapping segments (default: 512 chars, 50 overlap). Generate embeddings with FastEmbed (ONNX Runtime) and store vectors in Qdrant with metadata for retrieval. Choose embedding models based on accuracy needs (BAAI/bge-small-en-v1.5, -base-en-v1.5, -large-en-v1.5) and tune chunk size/overlap to balance precision and context.
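Retrieval itself is nearest-neighbor search over embedding vectors. A toy cosine-similarity ranking, with made-up 3-dimensional vectors standing in for real 384-dim bge-small embeddings (Qdrant does this at scale with indexed search):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "index": chunk text -> embedding vector.
index = {
    "chunk about OCR":      [0.9, 0.1, 0.0],
    "chunk about chunking": [0.1, 0.9, 0.1],
    "chunk about Qdrant":   [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.1]  # pretend this embeds "OCR configuration"

ranked = sorted(index, key=lambda k: cosine(query, index[k]), reverse=True)
# most similar chunk first
```

The search score reported by the plugin is this kind of similarity, which is why the > 0.7 / < 0.5 thresholds are useful heuristics.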
When to Use It
- When you’re working with RAG, embeddings, or semantic search concepts
- When you want to optimize how documents are indexed for retrieval
- When you need to improve chunking or retrieval quality
- When asking 'how does rag-research work?' or 'how to improve search results?'
- When troubleshooting poor search results or missing information
Quick Start
- Step 1: Load documents (PDF/Markdown/Text) and chunk with default 512/50
- Step 2: Generate embeddings using FastEmbed (ONNX Runtime) and store in Qdrant
- Step 3: Run basic queries and verify results with /rag-research:list and score thresholds
Best Practices
- Start with default chunking (512 chars, 50 overlap) and adjust based on results
- Use the ONNX FastEmbed models for CPU inference and balance speed vs. accuracy
- Verify indexing with /rag-research:list and ensure documents are indexed
- Experiment with chunk sizes (256-512 vs 1024+) and measure retrieval quality
- Keep embeddings and database healthy: monitor Qdrant storage and model compatibility
Example Use Cases
- Index a corpus of PDFs and answer precise questions with chunked segments
- Tune chunking to improve retrieval for product manuals or technical docs
- Switch to a larger embedding model for higher accuracy when needed
- Audit result quality by checking scores and adjusting limit with --limit 20
- Re-index after corpus updates and verify with /rag-research:list