mcp-tool-filter
Ultra-fast semantic tool filtering for MCP (Model Context Protocol) servers using embedding similarity. Reduce your tool context from 1000+ tools down to the most relevant 10-20 tools in under 10ms.
claude mcp add --transport stdio portkey-ai-mcp-tool-filter npx -y @portkey-ai/mcp-tool-filter
How to use
The @portkey-ai/mcp-tool-filter library semantically filters the tools exposed by MCP (Model Context Protocol) servers, ranking them by embedding similarity to a query. It can run with local embeddings for very low latency or with API-based embeddings for higher accuracy. After you initialize it with your MCP server definitions, call the filter method with a user query (or a piece of context) and it returns the top-K most relevant tools along with timing metrics. This lets you shrink large tool sets (often 1000+ tools) to the 10-20 most relevant ones, typically in under 10 milliseconds with local embeddings. The library also supports smart top-K selection, optional server descriptions in embeddings, and caching to further improve performance.
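To make the ranking idea concrete, here is a minimal, self-contained sketch of top-K selection by cosine similarity. It is not the library's implementation or API: the names ToolVector, cosineSimilarity, and rankTools are hypothetical, and the toy 3-dimensional vectors stand in for real embeddings.

```typescript
// Illustrative sketch only: not the library's actual code or API.
// All type and function names here are hypothetical.
interface ToolVector {
  name: string;
  embedding: number[]; // assumed to be precomputed for each tool
}

interface ScoredTool {
  name: string;
  score: number;
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every tool against the query embedding and keep the best topK.
function rankTools(query: number[], tools: ToolVector[], topK: number): ScoredTool[] {
  return tools
    .map((tool) => ({ name: tool.name, score: cosineSimilarity(query, tool.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy data: 3-dimensional vectors instead of real embeddings.
const toyTools: ToolVector[] = [
  { name: "send_email", embedding: [0.9, 0.1, 0.0] },
  { name: "create_issue", embedding: [0.1, 0.9, 0.1] },
  { name: "search_inbox", embedding: [0.8, 0.2, 0.1] },
];
const toyQuery = [1, 0, 0];
const top2 = rankTools(toyQuery, toyTools, 2);
console.log(top2.map((t) => t.name)); // the two email-related tools rank first
```

In a real setup the query and tool vectors would come from the same embedding model, and the tool vectors would be computed once at initialization so that each query costs only one embedding plus a linear scan.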
How to install
Prerequisites:
- Node.js v14+ (recommended latest LTS)
- npm or pnpm
Install the MCP tool filter package:
npm install @portkey-ai/mcp-tool-filter
If you plan to use API embeddings (e.g., OpenAI), obtain an API key and export it as an environment variable:
export OPENAI_API_KEY=sk-...
Optionally, verify your setup by running a small TypeScript/JavaScript snippet that initializes the filter and processes a sample MCP server list as shown in the Quick Start section of the README.
Additional notes
Tips and caveats:
- If you choose local embeddings, ensure the model you load is available on first run. The initial load may download a model (~25MB) and take some time, after which subsequent requests are sub-millisecond to a few milliseconds depending on hardware.
- When using API embeddings, account for latency (typically 400-800ms per request) and per-token costs.
- You can adjust defaults like topK, minScore, and alwaysInclude via the MCPToolFilter configuration to tailor results to your use case.
- If you enable includeServerDescription, embeddings may include additional context about the server, which can help with high-level intent queries but may affect accuracy for precise tool-specific queries.
- Ensure your MCP servers array includes rich tool definitions (name, description, keywords) to maximize matching quality.
- Cache behavior matters: tune cache lifetimes and plan for cache invalidation if your tool list changes frequently.
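The selection and caching tips above can be sketched as follows. This is a hedged illustration, not the library's code: the option names topK, minScore, and alwaysInclude come from the notes above, but selectTools, EmbeddingCache, and the choice to add pinned tools on top of the top-K results are assumptions made for the example.

```typescript
// Hypothetical sketch: the logic below is an assumption, not the library's behavior.
interface Scored {
  name: string;
  score: number;
}

interface SelectOptions {
  topK: number;            // maximum number of ranked tools to keep
  minScore: number;        // drop ranked tools scoring below this
  alwaysInclude: string[]; // tool names to keep regardless of score
}

function selectTools(scored: Scored[], opts: SelectOptions): string[] {
  const ranked = scored
    .filter((tool) => tool.score >= opts.minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, opts.topK)
    .map((tool) => tool.name);
  // Assumption: pinned tools are added on top of the topK ranked tools.
  const known = new Set(scored.map((tool) => tool.name));
  const pinned = opts.alwaysInclude.filter((n) => known.has(n) && !ranked.includes(n));
  return [...pinned, ...ranked];
}

// A small TTL cache that is cleared whenever the set of tool names changes.
class EmbeddingCache {
  private entries = new Map<string, { vector: number[]; expiresAt: number }>();
  private fingerprint = "";
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Invalidate all entries if the tool list differs from the last one seen.
  syncToolList(toolNames: string[]): void {
    const next = [...toolNames].sort().join("\n");
    if (next !== this.fingerprint) {
      this.entries.clear();
      this.fingerprint = next;
    }
  }

  get(key: string): number[] | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired entry
      return undefined;
    }
    return hit.vector;
  }

  set(key: string, vector: number[]): void {
    this.entries.set(key, { vector, expiresAt: this.now() + this.ttlMs });
  }
}

const picked = selectTools(
  [
    { name: "send_email", score: 0.91 },
    { name: "create_issue", score: 0.22 },
    { name: "search_inbox", score: 0.74 },
    { name: "get_help", score: 0.05 },
  ],
  { topK: 2, minScore: 0.3, alwaysInclude: ["get_help"] },
);
console.log(picked);
```

Fingerprinting by sorted tool names is the simplest invalidation trigger; if tool descriptions can change while names stay the same, the fingerprint would need to cover the descriptions as well.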