
mcp-rag

Lightweight RAG server for the Model Context Protocol: ingest source code and docs, build a vector index, and expose search with citations to LLMs via MCP tools.

Installation
Run this command in your terminal to add the MCP server to Claude Code:
```
claude mcp add --transport stdio daniel-barta-mcp-rag-server node dist/index.js \
  --env HOST="0.0.0.0 (default) or specific host for HTTP transport" \
  --env MCP_PORT="3000 (default) or any port when using HTTP transport" \
  --env REPO_ROOT="path to target repository (absolute or relative to working dir)" \
  --env CHUNK_SIZE="chunk size in characters (default 800)" \
  --env MODEL_NAME="embedding model name (e.g. 'sentence-transformers/...')" \
  --env ALLOWED_EXT="comma-separated list of allowed file extensions (e.g. ts,tsx,js,md)" \
  --env CHUNK_OVERLAP="chunk overlap in characters (default 120)" \
  --env MCP_TRANSPORT="stdio | http" \
  --env EXCLUDED_FOLDERS="comma-separated glob patterns for folders to skip" \
  --env INDEX_STORE_PATH="path to persistent index storage (optional)" \
  --env TRANSFORMERS_CACHE="path to local transformers cache"
```
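For clients that read a JSON MCP configuration instead (for example, a project-level `.mcp.json` for Claude Code), a hypothetical equivalent fragment might look like this; all paths and values are illustrative placeholders, not defaults shipped by mcp-rag:

```json
{
  "mcpServers": {
    "mcp-rag": {
      "type": "stdio",
      "command": "node",
      "args": ["dist/index.js"],
      "env": {
        "REPO_ROOT": "/path/to/your-repo",
        "MCP_TRANSPORT": "stdio",
        "CHUNK_SIZE": "800",
        "CHUNK_OVERLAP": "120",
        "ALLOWED_EXT": "ts,tsx,js,md"
      }
    }
  }
}
```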

How to use

mcp-rag-server is a local Retrieval-Augmented Generation helper that needs no network access once the embedding model is loaded, and plugs into any client that speaks the Model Context Protocol (MCP). It indexes a repository directory, creates local embeddings with the Hugging Face Transformers library, and exposes MCP tools such as rag_query for semantic search, read_file for safe file access, and list_files to browse the repo. You can run the server in stdio mode for IDE integration or in HTTP mode to monitor logs and readiness remotely. It supports PDF extraction, per-language configuration, chunking with overlap for better recall, and incremental updates that avoid full rebuilds. Start it via Node.js (dist/index.js) and configure environment variables such as REPO_ROOT, MCP_TRANSPORT, and MODEL_NAME to tailor behavior. The rag_query, read_file, and list_files tools become available through the MCP interface once the server reports readiness.
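The "chunking with overlap" behavior controlled by CHUNK_SIZE and CHUNK_OVERLAP can be sketched roughly as follows; this is a simplified character-based illustration, not the server's actual implementation:

```typescript
// Split text into chunks of `chunkSize` characters, where consecutive
// chunks share `overlap` trailing/leading characters. Overlap keeps a
// sentence that straddles a chunk boundary retrievable from either side.
function chunkText(text: string, chunkSize = 800, overlap = 120): string[] {
  const chunks: string[] = [];
  const step = chunkSize - overlap; // advance less than chunkSize so chunks overlap
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // final chunk reached
  }
  return chunks;
}
```

With the defaults (800/120), each chunk repeats the last 120 characters of the previous one, trading a little index size for better recall at chunk boundaries.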

How to install

Prerequisites:
- Node.js 18+ installed on your system.
- A target repository directory you want to index (set via REPO_ROOT).
- Optional: a local Transformers model cache path (TRANSFORMERS_CACHE).

Install steps:

1) Install dependencies and build:

```
npm install
npm run build
```

2) Prepare environment and run the server locally:

```
# macOS/Linux shell examples (in Windows PowerShell, use $env:NAME = "...")
export REPO_ROOT="/path/to/your-repo"
export MCP_TRANSPORT="stdio"                  # default; change to "http" for HTTP transport
# Optional: point to a model cache to speed up first run
export TRANSFORMERS_CACHE="/path/to/cache"

node dist/index.js
```

3) Verify readiness (if using HTTP): http://127.0.0.1:3000/health should indicate ready once embeddings exist.

Optional: run the MCP Inspector against the server:

```
npx @modelcontextprotocol/inspector http://localhost:3000/mcp --transport http
```
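In HTTP mode, readiness can also be polled programmatically. A minimal sketch follows; the fetch function is injected so the assumed shape of the /health response (a JSON body with a `ready` flag, which may differ in the real server) is easy to adapt:

```typescript
// Poll a health URL until it reports ready, up to `retries` attempts.
// `fetchFn` is injected (e.g. the global fetch in Node 18+) for testability.
async function waitForReady(
  url: string,
  fetchFn: (u: string) => Promise<{ json(): Promise<any> }>,
  retries = 30,
  delayMs = 1000
): Promise<boolean> {
  for (let i = 0; i < retries; i++) {
    try {
      const body = await fetchFn(url).then((r) => r.json());
      if (body.ready) return true; // assumed field name; adjust to the real payload
    } catch {
      // server not accepting connections yet; fall through and retry
    }
    await new Promise((res) => setTimeout(res, delayMs));
  }
  return false;
}
```

Usage: `await waitForReady("http://127.0.0.1:3000/health", fetch)` before issuing rag_query calls.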

Additional notes

Tips and notes:
- Start with REPO_ROOT pointing to a local checkout of your repository. The server reads files relative to REPO_ROOT and rejects paths that escape that root for safety.
- For large repos, prefer MCP_TRANSPORT=http so you can monitor progress and readiness. The health endpoint reports indexing progress: filesDiscovered, chunksTotal, and chunksEmbedded.
- If you set INDEX_STORE_PATH, embeddings persist across restarts, enabling faster warm starts.
- PDF files are supported; extracted text is cached in pdf-text-cache.json and searched like normal text.
- Lazy/incremental indexing avoids full rebuilds on small changes.
- Fine-tuning: set ALLOWED_EXT to limit indexing to relevant languages and file types.
- If startup is slow, point TRANSFORMERS_CACHE at a fast local directory and ensure MODEL_NAME is reachable.
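The path-escape protection described above can be sketched like this; it is an illustrative check under the assumption of resolve-then-prefix-compare, not the server's actual code:

```typescript
import * as path from "node:path";

// Return true only if `requested` resolves to REPO_ROOT itself or a
// descendant of it. Resolving first neutralizes "../" traversal attempts.
function isInsideRoot(root: string, requested: string): boolean {
  const rootResolved = path.resolve(root);
  const resolved = path.resolve(root, requested);
  return resolved === rootResolved || resolved.startsWith(rootResolved + path.sep);
}
```

The `path.sep` suffix matters: without it, a sibling directory like `/repo-secrets` would pass a naive `startsWith("/repo")` check.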
