mcp-rag
mcp-rag-server is a Model Context Protocol (MCP) server that enables Retrieval Augmented Generation (RAG) capabilities. It empowers Large Language Models (LLMs) to answer questions based on your document content by indexing and retrieving relevant information efficiently.
claude mcp add --transport stdio kwanleefrmvi-mcp-rag-server npx -y mcp-rag-server \
  --env CHUNK_SIZE="500" \
  --env BASE_LLM_API="http://localhost:11434/v1" \
  --env EMBEDDING_MODEL="nomic-embed-text" \
  --env VECTOR_STORE_PATH="./vector_store"
How to use
mcp-rag-server indexes your documents, creates embeddings, and serves relevant context to MCP clients by exposing tools and resources over MCP. It keeps embeddings in a local vector store (SQLite) and supports multiple embedding providers; the chunk size and embedding model are configurable.
Once the server is running, you can index documents, query for relevant chunks, and retrieve document content through URIs such as rag://documents and rag://query-document/{n}/{query}.
The server exposes these tools:
- embedding_documents(path)
- query_documents(query, k)
- remove_document(path)
- remove_all_documents(confirm)
- list_documents()
The example configuration above runs the server via npx and sets environment variables for the API endpoint, embedding model, vector store path, and chunk size. To integrate with your MCP client, start the server and use the rag:// resource namespace for indexing, querying, and document retrieval.
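As a rough illustration of the tools and URIs described above, the sketch below builds the JSON-RPC 2.0 `tools/call` requests an MCP client would send. The tool names and argument names (`path`, `query`, `k`) come from the tool list; the helper names `toolCall` and `queryDocumentUri`, the sample path and query, and the choice to percent-encode the query segment are assumptions made for illustration, and `{n}` is presumably the number of chunks to return. In practice an MCP client library builds these messages for you.

```typescript
// Shape of a JSON-RPC 2.0 request as MCP uses it for tool invocations.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

// Hypothetical helper: wrap a tool name and its arguments in a tools/call request.
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Hypothetical helper: fill in the rag://query-document/{n}/{query} template.
// Assumption: the query segment is percent-encoded; the source does not specify this.
function queryDocumentUri(n: number, query: string): string {
  return `rag://query-document/${n}/${encodeURIComponent(query)}`;
}

// Index a folder (sample path), then ask for the 3 most relevant chunks.
const indexRequest = toolCall("embedding_documents", { path: "./docs" });
const searchRequest = toolCall("query_documents", {
  query: "How is chunking configured?",
  k: 3,
});

console.log(JSON.stringify(indexRequest));
console.log(JSON.stringify(searchRequest));
console.log(queryDocumentUri(3, "chunk size"));
```

The same `toolCall` helper would cover remove_document(path), remove_all_documents(confirm), and list_documents() by changing the name and arguments.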
How to install
Prerequisites:
- Node.js (and npm) installed on your machine
- Git (optional, for cloning the repository)
Installation steps:
- Global npm installation (recommended for quick start):
  npm install -g mcp-rag-server
- From source (if you want to build locally):
  git clone https://github.com/kwanLeeFrmVi/mcp-rag-server.git
  cd mcp-rag-server
  npm install
  npm run build
  npm start
- Run with npx (as shown in the integration example):
  npx -y mcp-rag-server
- If you prefer explicit environment setup, you can export variables before starting:
  export BASE_LLM_API=http://localhost:11434/v1
  export EMBEDDING_MODEL=granite-embedding-278m-multilingual-Q6_K-1743674737397:latest
  export VECTOR_STORE_PATH=./vector_store
  export CHUNK_SIZE=500
  npx -y mcp-rag-server
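Many MCP clients read server definitions from a JSON file with an `mcpServers` map instead of a CLI command. The sketch below generates such an entry from the same environment variables used in the steps above. Whether your client uses this exact layout is an assumption, so check its documentation; the server label `rag` and the helper `buildServerEntry` are hypothetical names chosen for this example.

```typescript
// Environment variables named in the installation steps.
interface RagServerEnv {
  BASE_LLM_API: string;
  EMBEDDING_MODEL: string;
  VECTOR_STORE_PATH: string;
  CHUNK_SIZE: string;
}

// Hypothetical helper: describe how a client should launch the server via npx.
function buildServerEntry(env: RagServerEnv) {
  return {
    command: "npx",
    args: ["-y", "mcp-rag-server"],
    env: { ...env },
  };
}

const clientConfig = {
  mcpServers: {
    // "rag" is an arbitrary label for this example, not a required name.
    rag: buildServerEntry({
      BASE_LLM_API: "http://localhost:11434/v1",
      EMBEDDING_MODEL: "nomic-embed-text",
      VECTOR_STORE_PATH: "./vector_store",
      CHUNK_SIZE: "500",
    }),
  },
};

// Print the JSON that would go into the client's configuration file.
console.log(JSON.stringify(clientConfig, null, 2));
```

Environment values are kept as strings because that is how they reach the server process, including CHUNK_SIZE.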
Additional notes
Tips and common considerations:
- The default integration uses BASE_LLM_API pointing to a local embedding service. Adjust BASE_LLM_API to your embedding provider if needed.
- EMBEDDING_MODEL can be switched to models like nomic-embed-text or granite-based embeddings. Ollama is recommended for best performance; make sure the chosen model is available in your environment.
- VECTOR_STORE_PATH controls where vectors are stored locally (SQLite). Point it at a persistent location if the index needs to survive server restarts.
- CHUNK_SIZE defines how much text is included per chunk; larger values may improve context but require more memory during embedding.
- When running via MCP clients, ensure the client is configured to use the rag namespace (rag://) for documents and queries.
- If you run into indexing issues, verify that your embedding API is reachable and that the vector store directory is writable.
- The npm package name is mcp-rag-server and the npm package page is linked in the README.
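To make the CHUNK_SIZE trade-off above concrete, here is a minimal chunking sketch. It assumes the size is measured in characters and that chunks do not overlap; the source does not describe the server's actual splitting strategy, so treat `chunkText` as a hypothetical stand-in rather than the server's implementation.

```typescript
// Hypothetical stand-in for the server's chunker.
// Assumptions: chunkSize counts characters, and chunks do not overlap.
function chunkText(text: string, chunkSize: number): string[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize) {
    // slice() clamps at the end of the string, so the last chunk may be shorter.
    chunks.push(text.slice(start, start + chunkSize));
  }
  return chunks;
}

// A 1200-character document split with CHUNK_SIZE=500.
const sampleDocument = "a".repeat(1200);
const sampleChunks = chunkText(sampleDocument, 500);
console.log(sampleChunks.map((chunk) => chunk.length)); // [ 500, 500, 200 ]
```

Under these assumptions, a larger chunk size means fewer embedding calls per document, but each call carries more text.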
Related MCP Servers
obsidian -tools
Add Obsidian integrations like semantic search and custom Templater prompts to Claude or any MCP client.
Mantic.sh
A structural code search engine for AI agents.
pluggedin-app
The Crossroads for AI Data Exchanges. A unified, self-hostable web interface for discovering, configuring, and managing Model Context Protocol (MCP) servers—bringing together AI tools, workspaces, prompts, and logs from multiple MCP sources (Claude, Cursor, etc.) under one roof.
furi
CLI & API for MCP management
mcp -arangodb
This is a TypeScript-based MCP server that provides database interaction capabilities through ArangoDB. It implements core database operations and allows seamless integration with ArangoDB through MCP tools. You can use it with the Claude app and with VS Code extensions that support MCP, such as Cline.
CodeRAG
Advanced graph-based code analysis for AI-assisted software development