supernova-rag
A practical POC demonstrating how to build and run a local MCP server with Retrieval-Augmented Generation (RAG) for semantic search over internal documentation. Leverages Node.js, TypeScript, Hugging Face embeddings, and an in-memory vector store to enable fast, context-aware answers in tools like Cursor.
claude mcp add --transport stdio shabib87-supernova-mcp-rag \
  --env HUGGINGFACE_API_KEY="your_huggingface_token_here" \
  -- node /absolute-path-to/supernova-mcp-rag/mcp-rag-server/dist/index.js
How to use
SuperNova MCP RAG Server is a Node.js-based MCP server that provides semantic search over the SuperNova HTML documentation and answers questions through a Retrieval-Augmented Generation (RAG) pipeline. It exposes a single tool, search_docs, through the MCP protocol, which performs semantic search over pre-processed documentation chunks held in memory.
On startup, the server loads the documentation from docs/SuperNovaStorybook-Mobile-Swift, splits the text into chunks, embeds those chunks using HuggingFace embeddings, and stores them in an in-memory vector store. When a user asks a question, the server retrieves the most relevant chunks via semantic search and constructs an answer from them using the RAG pipeline.
To use it, run the MCP server and connect via Cursor or the MCP Inspector to issue queries and observe tool calls and responses in real time.
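The index-and-retrieve flow described above can be sketched in a few lines of TypeScript. This is a minimal, self-contained illustration, not the server's actual code: `embed` here is a toy bag-of-letters stand-in for the real HuggingFace embedding call, and the chunking is plain fixed-size slicing.

```typescript
// One entry in the in-memory vector store: a text chunk plus its embedding.
type Chunk = { text: string; vector: number[] };

// Toy stand-in for the HuggingFace embedding call (NOT a real embedding):
// a 26-dimensional letter-frequency vector, enough to make search work here.
function embed(text: string): number[] {
  const v = new Array(26).fill(0);
  for (const ch of text.toLowerCase()) {
    const i = ch.charCodeAt(0) - 97;
    if (i >= 0 && i < 26) v[i] += 1;
  }
  return v;
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Index step: split each doc into fixed-size chunks, embed each chunk,
// and keep everything in memory.
function buildStore(docs: string[], chunkSize = 200): Chunk[] {
  const store: Chunk[] = [];
  for (const doc of docs) {
    for (let i = 0; i < doc.length; i += chunkSize) {
      const text = doc.slice(i, i + chunkSize);
      store.push({ text, vector: embed(text) });
    }
  }
  return store;
}

// Query step (what search_docs does conceptually): embed the query,
// rank chunks by cosine similarity, and return the top-k chunk texts.
function searchDocs(store: Chunk[], query: string, k = 3): string[] {
  const qv = embed(query);
  return [...store]
    .sort((a, b) => cosine(qv, b.vector) - cosine(qv, a.vector))
    .slice(0, k)
    .map((c) => c.text);
}
```

The retrieved top-k chunks are what the RAG pipeline then passes to the model as context for answer construction.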
How to install
Prerequisites:
- Node.js 18+ (and Yarn for workspace support)
- Access to HuggingFace API (required for embeddings)
Install and run steps:
- Install dependencies at the repo root:
yarn install
- Verify the workspace layout (optional but recommended):
yarn workspaces info
- Build and start the MCP RAG server workspace:
# Build the server package
yarn workspace mcp-rag-server build
# Start the server
yarn workspace mcp-rag-server start
- Environment setup:
- Create a .env file under mcp-rag-server and set your HuggingFace API key:
# in mcp-rag-server/.env
HUGGINGFACE_API_KEY=your_huggingface_token_here
- If you’re testing with Cursor, add the MCP configuration (for example in Cursor settings) pointing to the Node.js command and the built index.js path:
{
"mcpServers": {
"mcp-rag-server": {
"command": "node",
"args": ["/absolute-path-to/supernova-mcp-rag/mcp-rag-server/dist/index.js"],
"disabled": false,
"autoApprove": []
}
}
}
- For development with hot-reload, you can run:
yarn dev
Prerequisites recap: ensure Node.js 18+ is installed, Yarn is available for workspace support, and you have a HuggingFace API key configured in .env.
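To show what the .env step above amounts to, here is a small sketch of dotenv-style loading plus a fail-fast check on the key. This is illustrative only: the file location (mcp-rag-server/.env) comes from the setup above, but the actual server may load it with the dotenv package rather than an inline parser.

```typescript
import * as fs from "fs";

// Parse KEY=value lines from a .env-style file into process.env, without
// overwriting variables that are already set (mirrors dotenv's default).
function loadEnv(path: string): void {
  if (!fs.existsSync(path)) return;
  for (const line of fs.readFileSync(path, "utf8").split("\n")) {
    const m = line.match(/^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*?)\s*$/);
    if (m && process.env[m[1]] === undefined) process.env[m[1]] = m[2];
  }
}

// Fail fast with a clear message instead of letting embeddings fail later.
function requireHfKey(): string {
  const key = process.env.HUGGINGFACE_API_KEY;
  if (!key) {
    throw new Error("HUGGINGFACE_API_KEY is not set (see mcp-rag-server/.env)");
  }
  return key;
}

// Usage at server startup:
// loadEnv("mcp-rag-server/.env");
// const apiKey = requireHfKey();
```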
Additional notes
Tips and common issues:
- The server may take a while to initialize as it loads and embeds all HTML docs into the in-memory vector store. Monitor logs for progress during startup.
- The MCP tool exposed is search_docs, which performs semantic search over the documentation chunks.
- If you hit HuggingFace API rate limits on embedding, consider batching requests where supported or increasing your quota. Review the Hugging Face API pricing and limits in the Hugging Face documentation.
- The in-memory vector store is suitable for small to medium doc sets; there is no persistence across restarts in the current setup, so startup time can be long on each launch due to re-embedding.
- Ensure your environment variable HUGGINGFACE_API_KEY is set correctly; otherwise embeddings will fail at startup.
- If you need to adapt the doc source, you can modify the path that the RAG pipeline scans (e.g., docs/SuperNovaStorybook-Mobile-Swift) to point to different HTML sources.
- Cursor integration steps rely on absolute path references; replace with your actual project path when configuring Cursor.
- If you plan to run in production, consider replacing the in-memory vector store with a persistent vector store (e.g., Pinecone, Weaviate, or Qdrant) for scalability.
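Several of the tips above concern embedding cost at startup. The batching idea can be sketched as below; `embedBatch` is a hypothetical stand-in for one API call that embeds several texts at once, and the real HuggingFace endpoint, payload shape, and supported batch sizes may differ.

```typescript
// Embed many chunk texts with one API request per batch instead of one
// request per chunk, reducing round-trips and rate-limit pressure.
async function embedAll(
  texts: string[],
  embedBatch: (batch: string[]) => Promise<number[][]>, // hypothetical batched API call
  batchSize = 32,
): Promise<number[][]> {
  const out: number[][] = [];
  for (let i = 0; i < texts.length; i += batchSize) {
    out.push(...(await embedBatch(texts.slice(i, i + batchSize))));
  }
  return out;
}
```

Batches are sent sequentially here, which keeps the sketch simple and naturally throttles the request rate; a production version might add retries with backoff on 429 responses.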
Related MCP Servers
minima
On-premises conversational RAG with configurable containers
mcp-pinecone
Model Context Protocol server to allow for reading and writing from Pinecone. Rudimentary RAG
langgraph-ai
LangGraph AI Repository
pluggedin-app
The Crossroads for AI Data Exchanges. A unified, self-hostable web interface for discovering, configuring, and managing Model Context Protocol (MCP) servers—bringing together AI tools, workspaces, prompts, and logs from multiple MCP sources (Claude, Cursor, etc.) under one roof.
nutrient-dws
A Model Context Protocol (MCP) server implementation that integrates with the Nutrient Document Web Service (DWS) Processor API, providing powerful PDF processing capabilities for AI assistants.
mcp-playground
A Streamlit-based chat app for LLMs with plug-and-play tool support via Model Context Protocol (MCP), powered by LangChain, LangGraph, and Docker.