haiku.rag
Opinionated agentic RAG powered by LanceDB, Pydantic AI, and Docling
claude mcp add --transport stdio ggozad-haiku.rag haiku-rag serve --mcp --stdio
How to use
Haiku RAG exposes its document management, search, QA, and research capabilities as MCP tools, so you can orchestrate them from an AI assistant or other automation. The server integrates hybrid search (vector + full-text), QA with citations, RLM code execution, and multi-agent workflows, all backed by LanceDB storage and Docling document structures.

With MCP mode enabled, an AI assistant can call these capabilities as discrete tools (e.g., index documents, search chunks, perform QA with citations, or run iterative research workflows) and receive structured results that include provenance such as page numbers and section headings. The CLI and Python API let you index sources, perform searches, run QA queries, and manage conversations or research planning within your environment.

To use it from an assistant, run the MCP-enabled server and point your assistant's MCP configuration at the haiku-rag tool, which surfaces functions such as add-src, search, ask, research, rlm, chat, and serve, along with memory and monitoring features.
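Assuming the tool names listed above (add-src, search, ask) map to CLI subcommands of the same name, a typical workflow from the shell might look like the sketch below. The subcommand names come from this page, but the exact argument shapes are assumptions; verify with the tool's built-in help before relying on them.

```shell
# Hypothetical workflow sketch: subcommand names are taken from the tool
# list above, but argument shapes are assumptions -- check `haiku-rag --help`.

# Index a local document into the LanceDB store.
haiku-rag add-src ./docs/report.pdf

# Hybrid (vector + full-text) search over indexed chunks.
haiku-rag search "quarterly revenue"

# QA with citations over the indexed corpus.
haiku-rag ask "What drove revenue growth last quarter?"
```

Each command returns structured results, so the same operations are available whether you drive them from the shell or expose them to an assistant through MCP.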
How to install
Prerequisites:
- Python 3.12 or newer
- Access to install Python packages (pip)
- Optional: an embedding provider (e.g., Ollama, OpenAI) and a storage backend (LanceDB) configured as described in the Haiku RAG docs
- Create a virtual environment (recommended):
python3 -m venv env
source env/bin/activate
- Install the full Haiku RAG package (recommended):
pip install haiku.rag
This includes document processing, all embedding providers, and rerankers. If you want a minimal footprint, install the slim package instead:
pip install haiku.rag-slim
- Verify installation and run the MCP-enabled server:
haiku-rag serve --mcp --stdio
- (Optional) If using uv or an alternative runtime, follow the uv installation guidance and ensure your environment satisfies the provider dependencies described in the Haiku RAG installation docs.
- See the MCP integration docs for configuring your AI assistant to talk to the MCP server (the example in the README shows haiku-rag as the tool name and the command/args to expose).
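As an alternative to the `claude mcp add` one-liner shown at the top of this page, assistants that read a JSON MCP configuration (for example Claude Desktop's claude_desktop_config.json) can register the server with an entry like the following sketch; the tool name and command/args mirror the README example, while the surrounding file layout follows the standard mcpServers format.

```json
{
  "mcpServers": {
    "haiku-rag": {
      "command": "haiku-rag",
      "args": ["serve", "--mcp", "--stdio"]
    }
  }
}
```

After restarting the assistant, the haiku-rag tools should appear in its MCP tool list.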
Additional notes
Tips and notes:
- The MCP server exposes tools such as add-src (index documents), search (hybrid search), ask (QA with citations), research (iterative planning/search), rlm (code execution), and chat (interactive memory-enabled conversations).
- Environment variables can be used to configure embedding providers, LanceDB paths, and cloud storage backends; consult the Haiku RAG docs for provider-specific settings.
- When running in production, mount the directories you want watched so that file monitoring (--monitor) can pick up changes, and ensure the process has read/write permissions on those paths.
- If you encounter performance or memory issues, adjust index batch sizes, reranker selection, and the LanceDB storage configuration as described in the configuration documentation.
- The MCP integration is intended to be consumed by AI assistants (e.g., Claude Desktop); ensure the assistant is configured to use the haiku-rag MCP server and to handle citations from QA results.
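Putting the notes above together, a long-running launch with monitoring enabled might look like the sketch below. The serve, --mcp, --stdio, and --monitor options are all named on this page, but whether --monitor takes additional arguments (such as a directory path) is not documented here, so none are shown.

```shell
# Sketch: run the MCP server over stdio with file monitoring enabled.
# serve, --mcp, --stdio, and --monitor are named on this page; any
# further --monitor arguments are undocumented here and omitted.
haiku-rag serve --mcp --stdio --monitor
```

Run this under a process supervisor in production so the server restarts if it exits.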
Related MCP Servers
jupyter
🪐 🔧 Model Context Protocol (MCP) Server for Jupyter.
mcp-pinecone
Model Context Protocol server for reading from and writing to Pinecone, with rudimentary RAG support.
pluggedin-app
The Crossroads for AI Data Exchanges. A unified, self-hostable web interface for discovering, configuring, and managing Model Context Protocol (MCP) servers—bringing together AI tools, workspaces, prompts, and logs from multiple MCP sources (Claude, Cursor, etc.) under one roof.
beemcp
BeeMCP: an unofficial Model Context Protocol (MCP) server that connects your Bee wearable lifelogger to AI via the Model Context Protocol
RiMCP_hybrid
Rimworld Coding RAG MCP server
BinAssistMCP
Binary Ninja plugin to provide MCP functionality.