
mcp-rag

mcp-rag-server is a Model Context Protocol (MCP) server that enables Retrieval Augmented Generation (RAG) capabilities. It empowers Large Language Models (LLMs) to answer questions based on your document content by indexing and retrieving relevant information efficiently.

Installation
Run this command in your terminal to add the MCP server to Claude Code:
claude mcp add --transport stdio kwanleefrmvi-mcp-rag-server \
  --env CHUNK_SIZE="500" \
  --env BASE_LLM_API="http://localhost:11434/v1" \
  --env EMBEDDING_MODEL="nomic-embed-text" \
  --env VECTOR_STORE_PATH="./vector_store" \
  -- npx -y mcp-rag-server

How to use

mcp-rag-server indexes your documents, creates embeddings, and serves relevant context to MCP clients by exposing tools and resources over the MCP protocol. The server uses a local vector store (SQLite) and supports multiple embedding providers, with options to customize the chunk size and embedding model.

Once running, you can index documents, query for relevant chunks, and retrieve document content through URIs like rag://documents and rag://query-document/{n}/{query}. The server exposes the following tools:

  • embedding_documents(path): index the documents at the given path
  • query_documents(query, k): retrieve the k most relevant chunks for a query
  • remove_document(path): remove a single document from the index
  • remove_all_documents(confirm): clear the entire index
  • list_documents(): list all indexed documents

The example configuration above shows how to run the server via npx, including environment variables for the API endpoint, embedding model, vector store path, and chunk size. To integrate with your MCP client, start the server and point the client at the rag:// resource namespace to perform indexing, querying, and document retrieval.
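The tools listed above are invoked as JSON-RPC 2.0 messages over the server's stdio transport. As a rough sketch (the message envelope follows the MCP specification's tools/call method; the tool names are taken from the list above, while the argument values here are purely illustrative):

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Index a directory of documents, then query for the 3 most relevant chunks.
index_request = make_tool_call(1, "embedding_documents", {"path": "./docs"})
query_request = make_tool_call(2, "query_documents", {"query": "vector store", "k": 3})

# An MCP client would write each message to the server's stdin as one JSON line.
for request in (index_request, query_request):
    print(json.dumps(request))
```

In practice your MCP client (Claude Code, for example) builds these messages for you; the sketch only shows what travels over the wire when a tool is called.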

How to install

Prerequisites:

  • Node.js (and npm) installed on your machine
  • Git (optional, for cloning the repository)

Installation steps:

  1. Global npm installation (recommended for quick start):

     npm install -g mcp-rag-server

  2. From source (if you want to build locally):

     git clone https://github.com/kwanLeeFrmVi/mcp-rag-server.git
     cd mcp-rag-server
     npm install
     npm run build
     npm start

  3. Run with npx (as shown in the integration example):

     npx -y mcp-rag-server

  4. If you prefer explicit environment setup, you can export variables before starting:

     export BASE_LLM_API=http://localhost:11434/v1
     export EMBEDDING_MODEL=granite-embedding-278m-multilingual-Q6_K-1743674737397:latest
     export VECTOR_STORE_PATH=./vector_store
     export CHUNK_SIZE=500
     npx -y mcp-rag-server
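The environment variables above can be pictured as a simple configuration lookup. This is only a sketch of how such a server might read them; the fallback values shown are the ones used in the integration example, not necessarily the server's built-in defaults:

```python
import os

# Read the four settings from the environment, falling back to the values
# used in the integration example above (assumed defaults, not verified).
config = {
    "base_llm_api": os.environ.get("BASE_LLM_API", "http://localhost:11434/v1"),
    "embedding_model": os.environ.get("EMBEDDING_MODEL", "nomic-embed-text"),
    "vector_store_path": os.environ.get("VECTOR_STORE_PATH", "./vector_store"),
    "chunk_size": int(os.environ.get("CHUNK_SIZE", "500")),
}
print(config)
```

Exporting a variable before launch (as in step 4) simply overrides the corresponding fallback.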

Additional notes

Tips and common considerations:

  • The default integration points BASE_LLM_API at a local embedding service. Adjust BASE_LLM_API to your embedding provider's endpoint if needed.
  • EMBEDDING_MODEL can be switched to models like nomic-embed-text or granite-based embeddings; Ollama is recommended for best performance, but ensure the model is available in your environment.
  • VECTOR_STORE_PATH controls where vectors are stored locally (SQLite). Set this to a persistent path so the index survives server restarts.
  • CHUNK_SIZE defines how much text is included per chunk; larger values may improve context but require more memory during embedding.
  • When running via MCP clients, ensure the client is configured to use the rag namespace (rag://) for documents and queries.
  • If you run into indexing issues, verify that your embedding API is reachable and that the vector store directory is writable.
  • The npm package name is mcp-rag-server and the npm package page is linked in the README.
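To make the CHUNK_SIZE trade-off concrete, here is a minimal fixed-size chunking sketch. The server's actual splitting logic may differ (it may respect sentence or paragraph boundaries, for instance), so treat this only as an illustration of how the setting bounds the amount of text per embedding call:

```python
def chunk_text(text: str, chunk_size: int = 500) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

document = "word " * 300  # a 1500-character toy document
chunks = chunk_text(document, chunk_size=500)

# With CHUNK_SIZE=500, this document yields three 500-character chunks;
# a larger CHUNK_SIZE would mean fewer, longer chunks per embedding call.
print(len(chunks), [len(c) for c in chunks])
```

Fewer, longer chunks carry more context per retrieved hit but cost more memory during embedding; shorter chunks retrieve more precisely but may split related sentences apart.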
