sourcerer
MCP for semantic code search & navigation that reduces token waste
claude mcp add --transport stdio st3v3nmw-sourcerer-mcp sourcerer \
  --env OPENAI_API_KEY="your-openai-api-key" \
  --env SOURCERER_WORKSPACE_ROOT="/path/to/your/project"
How to use
Sourcerer MCP is a semantic code search and navigation server designed to help AI agents work with codebases efficiently by indexing code chunks and enabling semantic retrieval rather than brute-force text matching. It builds a searchable vector database of code chunks (functions, classes, methods, types) using Tree-sitter-based parsing, stores embeddings, and exposes MCP tools for semantic queries and chunk retrieval. The available MCP tools let you perform semantic searches, fetch specific chunks by ID, find similar chunks, and manually trigger or monitor indexing. To use it, configure the server via mcp.json (or rely on environment variables), start the sourcerer binary, and then issue MCP tool commands like semantic_search or get_chunk_code through your MCP client. This setup reduces token usage for AI agents by returning contextual code chunks instead of entire files.
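As a rough illustration of what "issuing an MCP tool command" looks like on the wire, the sketch below builds JSON-RPC 2.0 `tools/call` requests, which is the request shape MCP uses for tool invocations. The tool names come from the list in this document; the argument names (`query`, `id`) are assumptions made for illustration and may differ from Sourcerer's actual tool schemas. In practice an MCP client performs the initialization handshake and sends these messages for you.

```python
import json
from itertools import count

_ids = count(1)  # each JSON-RPC request in a session needs its own id


def tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as one JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


# "query" and "id" are hypothetical argument names; the real ones are
# listed in the schemas the server returns for a `tools/list` request.
search = tool_call("semantic_search", {"query": "where are embeddings stored?"})
fetch = tool_call("get_chunk_code", {"id": "<chunk-id-from-search-results>"})

print(search)
print(fetch)
```

The point of the two-step flow is that the search result identifies a chunk, and only that chunk's code is fetched, rather than the whole file.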
How to install
Prerequisites:
- Go 1.16 or newer installed
- Git installed
- A Git repository containing your codebase, with a .gitignore that lists the .sourcerer/ directory so locally stored embeddings are not committed
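The prerequisites above can be checked with a small script before installing. This is a minimal sketch, not part of Sourcerer: `check_prerequisites` is a hypothetical helper, it only checks that `go` and `git` are on the PATH (it does not verify the Go version), and it uses a simple line match for the `.sourcerer/` entry in `.gitignore`.

```python
import shutil
from pathlib import Path


def check_prerequisites(repo_root: str) -> list[str]:
    """Return a list of problems found; an empty list means every check passed."""
    root = Path(repo_root)
    problems = []

    # Both tools must be discoverable on PATH.
    for tool in ("go", "git"):
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")

    # The workspace should be the root of a Git repository.
    if not (root / ".git").exists():
        problems.append(f"{root} is not the root of a Git repository")

    # .gitignore should contain an entry for the .sourcerer/ directory.
    gitignore = root / ".gitignore"
    entries = gitignore.read_text().splitlines() if gitignore.is_file() else []
    ignored = any(
        line.strip().rstrip("/") in (".sourcerer", "/.sourcerer") for line in entries
    )
    if not ignored:
        problems.append(".sourcerer/ is not listed in .gitignore")

    return problems


for problem in check_prerequisites("."):
    print(problem)
```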
Installation steps:
- Install the Sourcerer MCP binary:
  go install github.com/st3v3nmw/sourcerer-mcp/cmd/sourcerer@latest
- Ensure the binary is available in your PATH:
  export PATH=$PATH:$(go env GOPATH)/bin
- Create an mcp.json configuration (see example below) to run Sourcerer as an MCP server:
  {
    "mcpServers": {
      "sourcerer": {
        "command": "sourcerer",
        "env": {
          "OPENAI_API_KEY": "your-openai-api-key",
          "SOURCERER_WORKSPACE_ROOT": "/path/to/your/project"
        }
      }
    }
  }
- Start or initialize indexing as needed via the provided MCP tools once the server is running.
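If you would rather generate the configuration than write it by hand, the sketch below produces the same mcp.json structure shown in the steps above. The helper names `build_mcp_config` and `write_mcp_config` are hypothetical, and the PATH check is only a convenience warning, assuming the binary is installed under the name `sourcerer`.

```python
import json
import shutil
from pathlib import Path


def build_mcp_config(api_key: str, workspace_root: str) -> dict:
    """Build the mcp.json structure for running Sourcerer as an MCP server."""
    return {
        "mcpServers": {
            "sourcerer": {
                "command": "sourcerer",
                "env": {
                    "OPENAI_API_KEY": api_key,
                    "SOURCERER_WORKSPACE_ROOT": workspace_root,
                },
            }
        }
    }


def write_mcp_config(path: str, api_key: str, workspace_root: str) -> None:
    """Write the configuration to `path`, warning if the binary is not on PATH."""
    if shutil.which("sourcerer") is None:
        print("warning: 'sourcerer' is not on PATH; add $(go env GOPATH)/bin to PATH")
    config = build_mcp_config(api_key, workspace_root)
    Path(path).write_text(json.dumps(config, indent=2) + "\n")


# Preview the configuration without touching the filesystem.
print(json.dumps(build_mcp_config("your-openai-api-key", "/path/to/your/project"), indent=2))
```

Calling `write_mcp_config("mcp.json", key, root)` writes the file; keep a real API key out of version control.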
Notes:
- If you prefer using Homebrew, you can also install via the provided taps and then ensure the binary is on your PATH.
- If you want to automate startup, you can wrap the command in a script or a container invocation that uses the mcp.json configuration.
Additional notes
Tips and considerations:
- OPENAI_API_KEY is required for generating embeddings; ensure it has appropriate access and usage limits.
- The workspace root (SOURCERER_WORKSPACE_ROOT) should point to the root of the git repository you want to index.
- The vector database is stored in .sourcerer/db/; ensure you have sufficient disk space for large codebases.
- Sourcerer respects .gitignore and can watch filesystem changes to re-index automatically via fsnotify; ensure your environment allows filesystem events.
- The available MCP tools are:
  - semantic_search: perform semantic queries over code chunks
  - get_chunk_code: retrieve a specific chunk by its ID
  - find_similar_chunks: locate chunks similar to a given chunk
  - index_workspace: manually trigger re-indexing of the workspace
  - get_index_status: monitor indexing progress
- Language support relies on Tree-sitter queries per language; current support includes Go, JavaScript, Markdown, Python, and TypeScript with planned expansion.
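To make the idea of "code chunks" concrete, here is a toy sketch of splitting a source file into function, method, and class chunks. It is not Sourcerer's implementation: it swaps Tree-sitter for Python's standard `ast` module, handles only Python source, skips definitions nested inside blocks such as `if`, and uses a dotted-name ID scheme that is an assumption made for illustration.

```python
import ast


def extract_chunks(source: str) -> list[dict]:
    """Split Python source into class/function/method chunks using the stdlib ast module."""
    chunks = []

    def visit(node: ast.AST, prefix: str, in_class: bool) -> None:
        for child in ast.iter_child_nodes(node):
            is_class = isinstance(child, ast.ClassDef)
            is_func = isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef))
            if not (is_class or is_func):
                continue
            name = f"{prefix}{child.name}"
            kind = "class" if is_class else ("method" if in_class else "function")
            chunks.append({
                "id": name,  # hypothetical ID scheme: dotted qualified name
                "kind": kind,
                "lines": (child.lineno, child.end_lineno),
                "code": ast.get_source_segment(source, child),
            })
            # Recurse so methods and nested definitions become their own chunks.
            visit(child, f"{name}.", is_class)

    visit(ast.parse(source), "", False)
    return chunks


SAMPLE = '''\
class Index:
    def add(self, chunk):
        return chunk

def search(query):
    return []
'''

for chunk in extract_chunks(SAMPLE):
    print(chunk["id"], chunk["kind"], chunk["lines"])
```

Each chunk carries its own source text and line range, which is what lets a retrieval tool return one method instead of the whole file.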
Related MCP Servers
mcp-local-rag
Local-first RAG server for developers using MCP. Semantic + keyword search for code and technical docs. Fully private, zero setup.
mcp-codebase-index
17 MCP query tools for codebase navigation — functions, classes, imports, dependency graphs, change impact. Zero dependencies. 87% token reduction.
cco
Real-time audit and approval system for Claude Code tool calls.
mcp-ragex
MCP server for intelligent code search: semantic (RAG), symbolic (tree-sitter), and regex (ripgrep) search modes. Built for Claude Code and AI coding assistants.
vibe-workspace
Manage a vibe workspace with many repos
voice-status-report
A Model Context Protocol (MCP) server that provides voice status updates using OpenAI's text-to-speech API.