supernova-rag
A practical POC demonstrating how to build and run a local MCP server with Retrieval-Augmented Generation (RAG) for semantic search over internal documentation. Leverages Node.js, TypeScript, Hugging Face embeddings, and an in-memory vector store to enable fast, context-aware answers in tools like Cursor.
claude mcp add --transport stdio shabib87-supernova-mcp-rag \
  --env HUGGINGFACE_API_KEY="your_huggingface_token_here" \
  -- node /absolute-path-to/supernova-mcp-rag/mcp-rag-server/dist/index.js
How to use
SuperNova MCP RAG Server is a Node.js-based MCP server that provides semantic search over the SuperNova HTML documentation and answers questions through a Retrieval-Augmented Generation (RAG) pipeline. It exposes a single tool, search_docs, through the MCP protocol, which performs semantic search over pre-processed documentation chunks held in memory.
On startup, the server loads the documentation from docs/SuperNovaStorybook-Mobile-Swift, splits the text into chunks, embeds those chunks using HuggingFace embeddings, and stores them in an in-memory vector store. When a user asks a question, the server retrieves the most relevant chunks via semantic search and constructs an answer from them using the RAG pipeline.
To use it, run the MCP server and connect via Cursor or the MCP Inspector to issue queries and observe tool calls and responses in real time.
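The index-and-retrieve flow described above can be sketched in a few lines of TypeScript. This is a minimal, self-contained illustration, not the server's actual code: `embed` here is a toy bag-of-letters stand-in for the real HuggingFace embedding call, and the chunking is plain fixed-size slicing.

```typescript
// One entry in the in-memory vector store: a text chunk plus its embedding.
type Chunk = { text: string; vector: number[] };

// Toy stand-in for the HuggingFace embedding call (NOT a real embedding):
// a 26-dimensional letter-frequency vector, enough to make search work here.
function embed(text: string): number[] {
  const v = new Array(26).fill(0);
  for (const ch of text.toLowerCase()) {
    const i = ch.charCodeAt(0) - 97;
    if (i >= 0 && i < 26) v[i] += 1;
  }
  return v;
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Index step: split each doc into fixed-size chunks, embed each chunk,
// and keep everything in memory.
function buildStore(docs: string[], chunkSize = 200): Chunk[] {
  const store: Chunk[] = [];
  for (const doc of docs) {
    for (let i = 0; i < doc.length; i += chunkSize) {
      const text = doc.slice(i, i + chunkSize);
      store.push({ text, vector: embed(text) });
    }
  }
  return store;
}

// Query step (what search_docs does conceptually): embed the query,
// rank chunks by cosine similarity, and return the top-k chunk texts.
function searchDocs(store: Chunk[], query: string, k = 3): string[] {
  const qv = embed(query);
  return [...store]
    .sort((a, b) => cosine(qv, b.vector) - cosine(qv, a.vector))
    .slice(0, k)
    .map((c) => c.text);
}
```

The retrieved top-k chunks are what the RAG pipeline then passes to the model as context for answer construction.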
How to install
Prerequisites:
- Node.js 18+ (and Yarn for workspace support)
- Access to HuggingFace API (required for embeddings)
Install and run steps:
- Install dependencies at the repo root:
yarn install
- Verify the workspace layout (optional but recommended):
yarn workspaces info
- Build and start the MCP RAG server workspace:
# Build the server package
yarn workspace mcp-rag-server build
# Start the server
yarn workspace mcp-rag-server start
- Environment setup:
- Create a .env file under mcp-rag-server and set your HuggingFace API key:
# in mcp-rag-server/.env
HUGGINGFACE_API_KEY=your_huggingface_token_here
- If you’re testing with Cursor, add the MCP configuration (for example in Cursor settings) pointing to the Node.js command and the built index.js path:
{
"mcpServers": {
"mcp-rag-server": {
"command": "node",
"args": ["/absolute-path-to/supernova-mcp-rag/mcp-rag-server/dist/index.js"],
"disabled": false,
"autoApprove": []
}
}
}
- For development with hot-reload, you can run:
yarn dev
Prerequisites recap: ensure Node.js 18+ is installed, Yarn is available for workspace support, and you have a HuggingFace API key configured in .env.
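To show what the .env step above amounts to, here is a small sketch of dotenv-style loading plus a fail-fast check on the key. This is illustrative only: the file location (mcp-rag-server/.env) comes from the setup above, but the actual server may load it with the dotenv package rather than an inline parser.

```typescript
import * as fs from "fs";

// Parse KEY=value lines from a .env-style file into process.env, without
// overwriting variables that are already set (mirrors dotenv's default).
function loadEnv(path: string): void {
  if (!fs.existsSync(path)) return;
  for (const line of fs.readFileSync(path, "utf8").split("\n")) {
    const m = line.match(/^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*?)\s*$/);
    if (m && process.env[m[1]] === undefined) process.env[m[1]] = m[2];
  }
}

// Fail fast with a clear message instead of letting embeddings fail later.
function requireHfKey(): string {
  const key = process.env.HUGGINGFACE_API_KEY;
  if (!key) {
    throw new Error("HUGGINGFACE_API_KEY is not set (see mcp-rag-server/.env)");
  }
  return key;
}

// Usage at server startup:
// loadEnv("mcp-rag-server/.env");
// const apiKey = requireHfKey();
```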
Additional notes
Tips and common issues:
- The server may take a while to initialize as it loads and embeds all HTML docs into the in-memory vector store. Monitor logs for progress during startup.
- The MCP tool exposed is search_docs, which performs semantic search over the documentation chunks.
- If you hit HuggingFace API rate limits on embedding, consider batching requests where supported or increasing your quota. Review the Hugging Face API pricing and limits in the Hugging Face documentation.
- The in-memory vector store is suitable for small to medium doc sets; there is no persistence across restarts in the current setup, so startup time can be long on each launch due to re-embedding.
- Ensure your environment variable HUGGINGFACE_API_KEY is set correctly; otherwise embeddings will fail at startup.
- If you need to adapt the doc source, you can modify the path that the RAG pipeline scans (e.g., docs/SuperNovaStorybook-Mobile-Swift) to point to different HTML sources.
- Cursor integration steps rely on absolute path references; replace with your actual project path when configuring Cursor.
- If you plan to run in production, consider replacing the in-memory vector store with a persistent vector store (e.g., Pinecone, Weaviate, or Qdrant) for scalability.
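Several of the tips above concern embedding cost at startup. The batching idea can be sketched as below; `embedBatch` is a hypothetical stand-in for one API call that embeds several texts at once, and the real HuggingFace endpoint, payload shape, and supported batch sizes may differ.

```typescript
// Embed many chunk texts with one API request per batch instead of one
// request per chunk, reducing round-trips and rate-limit pressure.
async function embedAll(
  texts: string[],
  embedBatch: (batch: string[]) => Promise<number[][]>, // hypothetical batched API call
  batchSize = 32,
): Promise<number[][]> {
  const out: number[][] = [];
  for (let i = 0; i < texts.length; i += batchSize) {
    out.push(...(await embedBatch(texts.slice(i, i + batchSize))));
  }
  return out;
}
```

Batches are sent sequentially here, which keeps the sketch simple and naturally throttles the request rate; a production version might add retries with backoff on 429 responses.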
Related MCP Servers
minima
On-premises conversational RAG with configurable containers
mcp-pinecone
Model Context Protocol server to allow for reading and writing from Pinecone. Rudimentary RAG
langgraph-ai
LangGraph AI Repository
pluggedin-app
The Crossroads for AI Data Exchanges. A unified, self-hostable web interface for discovering, configuring, and managing Model Context Protocol (MCP) servers—bringing together AI tools, workspaces, prompts, and logs from multiple MCP sources (Claude, Cursor, etc.) under one roof.
nutrient-dws
A Model Context Protocol (MCP) server implementation that integrates with the Nutrient Document Web Service (DWS) Processor API, providing powerful PDF processing capabilities for AI assistants.
mcp-playground
A Streamlit-based chat app for LLMs with plug-and-play tool support via Model Context Protocol (MCP), powered by LangChain, LangGraph, and Docker.