code-memory
MCP server with local vector search for your codebase. Smart indexing, semantic search, Git history — all offline.
claude mcp add --transport stdio kapillamba4-code-memory uvx code-memory \
  --env EMBEDDING_MODEL="BAAI/bge-small-en-v1.5"
How to use
code-memory is a local, deterministic code intelligence MCP server that indexes and semantically searches your codebase. It provides three specialized tool paths for retrieving relevant code context:
- search_code: definitions, references, and structural queries via a hybrid BM25 + dense-vector index.
- search_docs: architectural explanations and workflow patterns via semantic/fuzzy matching.
- search_history: debugging context drawn from Git history combined with vector-based retrieval.
Before using the search tools, you typically index your codebase with index_codebase, which parses languages via tree-sitter and generates embeddings with sentence-transformers, entirely on your machine. Once indexed, you can query for specific code constructs, relationships, or documentation-level explanations and get precise results that respect your repository structure and language specifics.
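As a sketch of how an MCP client drives these tools, a standard MCP `tools/call` request for search_code might look like the following. The method and envelope follow the MCP JSON-RPC protocol; the argument name `query` is an illustrative assumption, not the server's documented schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_code",
    "arguments": {
      "query": "where is the retry backoff implemented?"
    }
  }
}
```

In practice your MCP host (Claude, Cursor, etc.) constructs these requests for you; you only describe what you want in natural language.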
How to install
Prerequisites: Python 3.13 or newer, uv (preferred) or pip, and network access for initial dependencies.
- Install the MCP server from PyPI (recommended):
pip install code-memory
- Run the MCP server using uvx (recommended):
uvx code-memory
- (Optional) If you want to run directly with Python tooling without uvx, ensure you have the package installed and use the Python entrypoint as appropriate for your MCP host configuration. See the Host Configuration section for examples.
- Prebuilt binaries are available for standalone usage if you prefer not to install Python dependencies:
- Linux: code-memory-linux-x86_64
- macOS Intel: code-memory-macos-x86_64
- macOS ARM: code-memory-macos-arm64
- Windows: code-memory-windows-x86_64.exe
For standalone usage, make the binary executable and invoke it according to your operating system instructions. The first run downloads the embedding model (~600MB) to ~/.cache/huggingface/; subsequent runs reuse the cached model.
- Optional: Configure your MCP host to point at this server using the appropriate command/args and environment variables as shown in the Config section.
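For hosts that read a JSON configuration file (for example, Claude Desktop's mcpServers block), a minimal entry mirroring the uvx command above could look like this; the server key name is arbitrary:

```json
{
  "mcpServers": {
    "code-memory": {
      "command": "uvx",
      "args": ["code-memory"],
      "env": {
        "EMBEDDING_MODEL": "BAAI/bge-small-en-v1.5"
      }
    }
  }
}
```

If you use a standalone binary instead, set "command" to the binary's absolute path and drop the "args" entry.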
Additional notes
Tips and caveats:
- The embedding model is configurable via EMBEDDING_MODEL. Changing the model invalidates existing indexes; you’ll need to re-run index_codebase after switching models.
- The first run may download a sizable embedding model (~600MB); ensure you have stable disk space and network access.
- You can run index_codebase, search_code, search_docs, and search_history locally; all data remains on your machine unless you explicitly share it.
- If using standalone binaries, ensure the binary has execute permissions on Linux/macOS and that your MCP host configuration points to the correct path.
- For best results, run index_codebase prior to querying with search_code/search_docs to ensure the vector store is up to date.
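As a configuration sketch tying the tips above together, switching embedding models is just a matter of setting EMBEDDING_MODEL when launching the server; the model id shown is the one from the install command, and any other sentence-transformers model id should work the same way. Remember to re-run index_codebase afterward, since changing the model invalidates existing indexes:

```shell
# Launch with an explicit embedding model.
# Changing this value invalidates existing indexes; re-run index_codebase after switching.
EMBEDDING_MODEL="BAAI/bge-small-en-v1.5" uvx code-memory
```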
Related MCP Servers
grepai
Semantic Search & Call Graphs for AI Agents (100% Local)
VectorCode
A code repository indexing tool to supercharge your LLM experience.
Mantic.sh
A structural code search engine for AI agents.
octocode
Semantic code searcher and codebase utility
mie
Persistent memory graph for AI agents. Facts, decisions, entities, and relationships that survive across sessions, tools, and providers. MCP server — works with Claude, Cursor, ChatGPT, and any MCP client.
heuristic
Enhanced MCP server for semantic code search with call-graph proximity, recency ranking, and find-similar-code. Built for AI coding assistants.