code-memory
MCP server with local vector search for your codebase. Smart indexing, semantic search, Git history — all offline.
claude mcp add --transport stdio kapillamba4-code-memory uvx code-memory \
  --env EMBEDDING_MODEL="BAAI/bge-small-en-v1.5"
How to use
code-memory is a local, deterministic code intelligence MCP server that indexes and semantically searches your codebase. It provides three specialized tool paths for retrieving relevant code context:
- search_code: definitions, references, and structural queries via a hybrid BM25 + dense-vector index.
- search_docs: architectural explanations and workflow patterns via semantic/fuzzy matching.
- search_history: debugging context drawn from Git history combined with vector-based retrieval.
Before using the search tools, you typically index your codebase with index_codebase, which parses languages via tree-sitter and generates embeddings with sentence-transformers, entirely on your machine. Once indexed, you can query for specific code constructs, relationships, or documentation-level explanations and get precise results that respect your repository structure and language specifics.
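As a sketch of how an MCP client drives these tools, a standard MCP `tools/call` request for search_code might look like the following. The method and envelope follow the MCP JSON-RPC protocol; the argument name `query` is an illustrative assumption, not the server's documented schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_code",
    "arguments": {
      "query": "where is the retry backoff implemented?"
    }
  }
}
```

In practice your MCP host (Claude, Cursor, etc.) constructs these requests for you; you only describe what you want in natural language.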
How to install
Prerequisites: Python 3.13 or newer, uv (preferred) or pip, and network access for initial dependencies.
- Install the MCP server from PyPI (recommended):
pip install code-memory
- Run the MCP server using uvx (recommended):
uvx code-memory
- (Optional) If you want to run directly with Python tooling without uvx, ensure you have the package installed and use the Python entrypoint as appropriate for your MCP host configuration. See the Host Configuration section for examples.
- Prebuilt binaries are available for standalone usage if you prefer not to install Python dependencies:
- Linux: code-memory-linux-x86_64
- macOS Intel: code-memory-macos-x86_64
- macOS ARM: code-memory-macos-arm64
- Windows: code-memory-windows-x86_64.exe
For standalone usage, make the binary executable and invoke it according to your operating system instructions. The first run downloads the embedding model (~600MB) to ~/.cache/huggingface/; subsequent runs reuse the cached model.
- Optional: Configure your MCP host to point at this server using the appropriate command/args and environment variables as shown in the Config section.
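For hosts that read a JSON configuration file (for example, Claude Desktop's mcpServers block), a minimal entry mirroring the uvx command above could look like this; the server key name is arbitrary:

```json
{
  "mcpServers": {
    "code-memory": {
      "command": "uvx",
      "args": ["code-memory"],
      "env": {
        "EMBEDDING_MODEL": "BAAI/bge-small-en-v1.5"
      }
    }
  }
}
```

If you use a standalone binary instead, set "command" to the binary's absolute path and drop the "args" entry.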
Additional notes
Tips and caveats:
- The embedding model is configurable via EMBEDDING_MODEL. Changing the model invalidates existing indexes; you’ll need to re-run index_codebase after switching models.
- The first run may download a sizable embedding model (~600MB); ensure you have stable disk space and network access.
- You can run index_codebase, search_code, search_docs, and search_history locally; all data remains on your machine unless you explicitly share it.
- If using standalone binaries, ensure the binary has execute permissions on Linux/macOS and that your MCP host configuration points to the correct path.
- For best results, run index_codebase prior to querying with search_code/search_docs to ensure the vector store is up to date.
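As a configuration sketch tying the tips above together, switching embedding models is just a matter of setting EMBEDDING_MODEL when launching the server; the model id shown is the one from the install command, and any other sentence-transformers model id should work the same way. Remember to re-run index_codebase afterward, since changing the model invalidates existing indexes:

```shell
# Launch with an explicit embedding model.
# Changing this value invalidates existing indexes; re-run index_codebase after switching.
EMBEDDING_MODEL="BAAI/bge-small-en-v1.5" uvx code-memory
```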
Related MCP Servers
grepai
Semantic Search & Call Graphs for AI Agents (100% Local)
VectorCode
A code repository indexing tool to supercharge your LLM experience.
Mantic.sh
A structural code search engine for AI agents.
octocode
Semantic code searcher and codebase utility
mie
Persistent memory graph for AI agents. Facts, decisions, entities, and relationships that survive across sessions, tools, and providers. MCP server — works with Claude, Cursor, ChatGPT, and any MCP client.
heuristic
Enhanced MCP server for semantic code search with call-graph proximity, recency ranking, and find-similar-code. Built for AI coding assistants.