mcp-local-rag
Local-first RAG server for developers using MCP. Semantic + keyword search for code and technical docs. Fully private, zero setup.
claude mcp add --transport stdio shinpr-mcp-local-rag \
  --env BASE_DIR="/path/to/your/documents" \
  -- npx -y mcp-local-rag
How to use
MCP Local RAG is a fully private, offline-capable retrieval-augmented generation server. It performs local semantic search with keyword boosts to find exact terms (like useEffect, error codes, or class names) while preserving topic coherence through smart semantic chunking. The server lets you ingest documents (PDF, DOCX, TXT, Markdown) or HTML content, builds local embeddings, stores them in a vector database, and serves queries through six MCP tools: ingest_file, ingest_data, query_documents, list_files, delete_file, and status. You can tune search behavior with environment variables and adjust how results are grouped and boosted to balance semantic relevance with exact-term matching.
To use the server, register it in your tooling's MCP configuration as a stdio endpoint. The Quick Start shows integration with Cursor, Codex, and Claude Code by invoking npx mcp-local-rag with BASE_DIR pointing at the folder that contains your documents. After setup, ingest content to populate the local index, then ask natural-language questions to retrieve the most relevant chunks, with exact-term matches surfaced through keyword boosts. The system emphasizes privacy and offline operation: once the embedding model is downloaded, no external API calls are made.
You can interact with the server through the documented MCP tool commands, such as:
- ingest_file: index a document file
- ingest_data: index HTML or other content fetched externally
- query_documents: perform a semantic + keyword search
- list_files: view ingested documents and status
- delete_file: remove an ingested document
- status: check server health
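Under the hood, these tools are invoked over MCP's JSON-RPC protocol (your IDE or agent does this for you). As a sketch, a tools/call request for query_documents might look like the following; the `query` argument name is an assumption here, so check the server's actual tool schema via tools/list:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "query_documents",
    "arguments": { "query": "how does useEffect cleanup work?" }
  }
}
```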
How to install
Prerequisites:
- Node.js (recommended latest LTS, e.g., 18.x or newer) and npm installed on your machine
- Internet access for the first model download via npx
Install and run the MCP Local RAG server:
- Ensure Node.js and npm are installed
- Check: node -v and npm -v
- Create a working directory and set BASE_DIR to your documents location (absolute path recommended)
- Run the MCP server via npx (no local install required):
# Example: start the MCP Local RAG server with BASE_DIR set
BASE_DIR="/path/to/your/documents" \
npx -y mcp-local-rag
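Before launching, it can help to sanity-check that BASE_DIR points at a real directory. A minimal shell sketch (the path is a made-up placeholder):

```shell
# Illustrative: create the documents directory if missing, then verify it.
# /tmp/my-docs is a hypothetical path for this sketch only.
BASE_DIR="${BASE_DIR:-/tmp/my-docs}"
mkdir -p "$BASE_DIR"
[ -d "$BASE_DIR" ] && echo "BASE_DIR ok: $BASE_DIR"
```

With the directory confirmed, start the server as shown above.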
Optional: you can export BASE_DIR before running, or set it inline when starting the process. The quick-start examples pass the environment variable on the npx command line for tooling integrations (Cursor, Codex, Claude Code).
If you prefer a single-command invocation from your tooling configuration, you can mirror the start command shown in the Quick Start by wiring your tool to execute:
BASE_DIR=/path/to/your/documents npx -y mcp-local-rag
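For tools that read a JSON MCP configuration file (for example Cursor's mcp.json, or a project-level .mcp.json for Claude Code), a sketch of the equivalent entry follows; the server name key is arbitrary and the path is a placeholder:

```json
{
  "mcpServers": {
    "mcp-local-rag": {
      "command": "npx",
      "args": ["-y", "mcp-local-rag"],
      "env": {
        "BASE_DIR": "/path/to/your/documents"
      }
    }
  }
}
```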
Additional notes
Tips and common considerations:
- BASE_DIR is required for indexing and searching. Set it to a directory containing documents you want to search locally.
- The server runs entirely offline after the initial model download, so make sure the embedding model has been downloaded before you need to work without a connection.
- You can tune search behavior using environment variables (e.g., RAG_HYBRID_WEIGHT, RAG_GROUPING, RAG_MAX_DISTANCE, RAG_MAX_FILES) to balance semantic relevance with exact-term boosts.
- When ingesting HTML via ingest_data, the content is cleaned (Readability) and converted to Markdown before indexing. Ensure you respect copyright and terms of service when ingesting web content.
- Ingesting the same file replaces the previous version automatically, preserving a clean index.
- If you use multiple MCP tools or integrate with IDEs, you can reuse the same BASE_DIR across tools for a consistent search experience.
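The tuning variables above can be set in the environment before launching the server. A sketch, with made-up values chosen only for illustration (consult the project docs for actual ranges and defaults):

```shell
# Illustrative tuning: export variables before starting the server.
# All values below are assumptions for the sketch, not recommendations.
export RAG_HYBRID_WEIGHT=0.7   # assumed balance of semantic vs. keyword scoring
export RAG_MAX_FILES=5         # assumed cap on files returned per query
export RAG_MAX_DISTANCE=0.8    # assumed vector-distance cutoff
echo "hybrid=$RAG_HYBRID_WEIGHT files=$RAG_MAX_FILES distance=$RAG_MAX_DISTANCE"
# Then launch: BASE_DIR=/path/to/your/documents npx -y mcp-local-rag
```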
Related MCP Servers
grepai
Semantic Search & Call Graphs for AI Agents (100% Local)
Mantic.sh
A structural code search engine for AI agents.
swiftlens
SwiftLens is a Model Context Protocol (MCP) server that provides deep, semantic-level analysis of Swift codebases to any AI models. By integrating directly with Apple's SourceKit-LSP, SwiftLens enables AI models to understand Swift code with compiler-grade accuracy.
langfuse
A Model Context Protocol (MCP) server for Langfuse, enabling AI agents to query Langfuse trace data for enhanced debugging and observability
mcp-ragex
MCP server for intelligent code search: semantic (RAG), symbolic (tree-sitter), and regex (ripgrep) search modes. Built for Claude Code and AI coding assistants.
agent-configs
Control Claude Code, Cursor & Gemini CLI remotely — answer agent questions from your phone via Slack