mcp-context
MCP Context Server — a FastMCP-based server providing persistent multimodal context storage for LLM agents.
claude mcp add --transport stdio alex-feel-mcp-context-server uvx --python 3.12 --with mcp-context-server[embeddings-ollama,reranking] mcp-context-server \
  --env DB_PATH="${DB_PATH:-~/.mcp/context_storage.db}" \
  --env LOG_LEVEL="INFO"

How to use
The MCP Context Server offers a persistent multimodal context store designed for use with MCP-compatible clients. It can store and retrieve both text and image data, supports thread-based scoping so multiple agents working on the same task share a common context, and provides full-text, semantic, and hybrid search.

By default, the server uses a SQLite database, but it can be configured to use PostgreSQL for high-concurrency production setups. Optional cross-encoder reranking can refine search results, and features such as semantic search or chunking can be enabled or disabled through environment variables and configuration.

To use the server with Claude Code or other MCP clients, add it as an MCP backend via the provided CLI command or by editing your .mcp.json file, pointing to the uvx-based stdio integration that invokes the mcp-context-server package.
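As a sketch, the equivalent .mcp.json entry for the uvx-based stdio integration might look like the following (the server key and env defaults are illustrative; the package name, extras, and flags come from the CLI command above):

```json
{
  "mcpServers": {
    "mcp-context-server": {
      "type": "stdio",
      "command": "uvx",
      "args": [
        "--python", "3.12",
        "--with", "mcp-context-server[embeddings-ollama,reranking]",
        "mcp-context-server"
      ],
      "env": {
        "DB_PATH": "${DB_PATH:-~/.mcp/context_storage.db}",
        "LOG_LEVEL": "INFO"
      }
    }
  }
}
```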
How to install
Prerequisites:
- uv package manager (https://docs.astral.sh/uv/getting-started/installation/)
- Python 3.12 (as used in examples)
- Ollama for embedding generation (optional but recommended): install from ollama.com/download and pull the embedding model (ollama pull qwen3-embedding:0.6b)
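If you opt into Ollama-backed embeddings, the model pull from the prerequisites can be done and verified up front (a sketch; the model tag is the one named above):

```shell
# Pull the embedding model referenced in the prerequisites
ollama pull qwen3-embedding:0.6b

# Confirm the model is available locally before starting the server
ollama list | grep qwen3-embedding
```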
Installation steps:
- Install uv (if not already installed) following the official guide.
- Ensure Python 3.12 is available in your environment.
- Install the MCP Context Server package from PyPI:
- pip install mcp-context-server
- Verify installation by running a basic startup command (example using uvx as in the README):
- uvx --python 3.12 --with mcp-context-server[embeddings-ollama,reranking] mcp-context-server
- Create an MCP config to connect your MCP clients (e.g., Claude Code) to this server.
- Start the server using your preferred orchestration (CLI, docker, or directly via your MCP manager).
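The installation steps above can be run end to end roughly as follows (a sketch; the uv installer URL is from the official uv guide, the other commands are taken from this README):

```shell
# Install uv via the official installer, if not already present
curl -LsSf https://astral.sh/uv/install.sh | sh

# Optional but recommended: pull the embedding model for semantic search
ollama pull qwen3-embedding:0.6b

# Verify the server starts (quote the extras so the shell does not glob the brackets)
uvx --python 3.12 --with 'mcp-context-server[embeddings-ollama,reranking]' mcp-context-server
```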
Notes:
- If you want to use a persistent PostgreSQL backend, configure STORAGE_BACKEND and DB connection details via environment variables in your MCP configuration.
- For embedding and reranking features, ensure Ollama and the specified embedding model are installed and ready.
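For the PostgreSQL note above, the backend switch is set through environment variables in your MCP configuration. STORAGE_BACKEND is named in this README; the connection-variable names below are hypothetical placeholders, so check the package documentation for the exact keys:

```shell
# STORAGE_BACKEND appears in this README; the connection variables
# below are hypothetical placeholders -- verify the exact key names
# in the mcp-context-server documentation before relying on them.
export STORAGE_BACKEND=postgresql
export DATABASE_URL=postgresql://user:password@localhost:5432/mcp_context  # hypothetical
```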
Additional notes
Tips and common issues:
- Environment variables: You can tune core and search settings via environment variables (e.g., LOG_LEVEL, DB_PATH, MAX_IMAGE_SIZE_MB, MAX_TOTAL_SIZE_MB). Use the ${VAR:-default} syntax in your MCP config to provide defaults.
- CHUNKING and RERANKING: Enabling chunking (ENABLE_CHUNKING) and cross-encoder reranking (ENABLE_RERANKING) can improve semantic search quality, but may increase latency. Adjust CHUNK_SIZE, CHUNK_OVERLAP, and RERANKING_OVERFETCH to balance performance and accuracy.
- Semantic vs. full-text: If you enable semantic search (ENABLE_SEMANTIC_SEARCH), ensure your embedding provider (Ollama, OpenAI, etc.) is correctly configured and that the embedding model's output dimension matches EMBEDDING_DIM.
- Backends: For production, consider PostgreSQL (high concurrency) over SQLite. Ensure backends are properly migrated when changing EMBEDDING_DIM or schema.
- Troubleshooting: If MCP tools are not available in your CLI, verify the uvx invocation in your .mcp.json and ensure the mcp-context-server package is installed in the same Python environment referenced by --python.
- Licensing and compatibility: This server aims to be MCP-standard compliant and works with Claude Code, LangGraph, and other MCP clients. Check compatibility with your MCP version if upgrading.
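The tuning knobs listed above can be grouped in the env block of your MCP config. The variable names below are the ones mentioned in this README; the values (and the use of ${VAR:-default} fallbacks) are illustrative, not recommended settings:

```json
"env": {
  "LOG_LEVEL": "${LOG_LEVEL:-INFO}",
  "DB_PATH": "${DB_PATH:-~/.mcp/context_storage.db}",
  "MAX_IMAGE_SIZE_MB": "${MAX_IMAGE_SIZE_MB:-10}",
  "ENABLE_SEMANTIC_SEARCH": "true",
  "ENABLE_CHUNKING": "true",
  "ENABLE_RERANKING": "true",
  "CHUNK_SIZE": "512",
  "CHUNK_OVERLAP": "64",
  "RERANKING_OVERFETCH": "3"
}
```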
Related MCP Servers
context-space
Ultimate Context Engineering Infrastructure, starting from MCPs and Integrations
aser
Aser is a lightweight, self-assembling AI agent framework.
sudocode
Lightweight agent orchestration dev tool that lives in your repo
systemprompt-code-orchestrator
MCP server for orchestrating AI coding agents (Claude Code CLI & Gemini CLI). Features task management, process execution, Git integration, and dynamic resource discovery. Full TypeScript implementation with Docker support and Cloudflare Tunnel integration.
mem0-selfhosted
Self-hosted mem0 MCP server for Claude Code. Run a complete memory server against self-hosted Qdrant + Neo4j + Ollama while using Claude as the main LLM.
engram-rs
Memory engine for AI agents — time axis (3-layer decay/promotion) + space axis (self-organizing topic tree). Hybrid search, LLM consolidation. Single Rust binary.