
mcp-tool-filter

Ultra-fast semantic tool filtering for MCP (Model Context Protocol) servers using embedding similarity. Reduce your tool context from 1000+ tools down to the most relevant 10-20 tools in under 10ms.

Installation
Run this command in your terminal to add the MCP server to Claude Code:

claude mcp add --transport stdio portkey-ai-mcp-tool-filter npx -y @portkey-ai/mcp-tool-filter

How to use

The @portkey-ai/mcp-tool-filter library provides ultra-fast semantic filtering for MCP (Model Context Protocol) servers by ranking tools on embedding similarity. It can run with local embeddings for very low latency or with API-based embeddings for higher accuracy. After initializing the filter with your MCP server definitions, call its filter method with a user query (or a piece of context) and it returns the top-K most relevant tools along with timing metrics. This shrinks large tool sets (often 1000+ tools) down to the most relevant 10-20 in under 10 milliseconds in typical local setups. The library also supports smart top-K selection, optionally including server descriptions in embeddings, and caching to further improve performance.
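The core technique — embed the query, score every tool by similarity, keep the top K — can be sketched in self-contained TypeScript. Note the `cosine` and toy bag-of-words `embed` functions below are stand-ins so the example runs without a model; the real library uses local or API embedding models instead.

```typescript
interface Tool {
  name: string;
  description: string;
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Toy bag-of-words "embedding" so the sketch runs without a model.
function embed(text: string, vocab: string[]): number[] {
  const words = text.toLowerCase().split(/\W+/);
  return vocab.map(v => words.filter(w => w === v).length);
}

// Rank all tools against the query and keep the top K.
function filterTools(query: string, tools: Tool[], topK: number): Tool[] {
  const vocab = Array.from(new Set(
    (query + " " + tools.map(t => t.name + " " + t.description).join(" "))
      .toLowerCase().split(/\W+/).filter(Boolean)
  ));
  const q = embed(query, vocab);
  return tools
    .map(t => ({
      tool: t,
      score: cosine(q, embed(t.name + " " + t.description, vocab)),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(r => r.tool);
}
```

With a real embedding model in place of `embed`, the same rank-and-slice loop is what makes sub-10ms local filtering feasible: the tool embeddings are computed once, so each query costs one embedding plus N cheap similarity comparisons.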

How to install

Prerequisites:

  • Node.js v14+ (latest LTS recommended)
  • npm or pnpm

Install the MCP tool filter package:

npm install @portkey-ai/mcp-tool-filter

If you plan to use API embeddings (e.g., OpenAI), obtain an API key and export it as an environment variable:

export OPENAI_API_KEY=sk-...

Optionally, verify your setup by running a small TypeScript/JavaScript snippet that initializes the filter and processes a sample MCP server list as shown in the Quick Start section of the README.
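The initialize-then-filter flow described above can be mocked in a few lines. The class and option names here (`topK`, `minScore`, the `filter` method) mirror the README's description, but this is an illustrative sketch, not the library's actual API — consult the package's Quick Start for the real signatures.

```typescript
interface ToolDef { name: string; description: string }
interface ServerDef { name: string; tools: ToolDef[] }
interface FilterResult { tools: ToolDef[]; elapsedMs: number }

// Mock of the initialize-then-filter flow; NOT the real API.
class MockToolFilter {
  private tools: ToolDef[] = [];
  constructor(private opts: { topK: number; minScore?: number }) {}

  // Flatten all server tool definitions into one searchable list.
  initialize(servers: ServerDef[]): void {
    this.tools = servers.flatMap(s => s.tools);
  }

  // Stand-in scoring: word overlap between query and tool description.
  filter(query: string): FilterResult {
    const start = Date.now();
    const qWords = new Set(query.toLowerCase().split(/\W+/));
    const ranked = this.tools
      .map(t => ({
        tool: t,
        score: t.description.toLowerCase().split(/\W+/)
          .filter(w => qWords.has(w)).length,
      }))
      .filter(r => r.score >= (this.opts.minScore ?? 0))
      .sort((a, b) => b.score - a.score)
      .slice(0, this.opts.topK);
    return { tools: ranked.map(r => r.tool), elapsedMs: Date.now() - start };
  }
}
```

The shape to notice: servers go in once at initialization, and each subsequent query is a cheap rank-and-slice that returns both the surviving tools and how long filtering took.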

Additional notes

Tips and caveats:

  • If you choose local embeddings, ensure the model you load is available on first run. The initial load may download a model (~25MB) and take some time, after which subsequent requests are sub-millisecond to a few milliseconds depending on hardware.
  • When using API embeddings, be aware of latency (typically 400-800 ms per request) and per-token costs.
  • You can adjust defaults like topK, minScore, and alwaysInclude via the MCPToolFilter configuration to tailor results to your use case.
  • If you enable includeServerDescription, embeddings may include additional context about the server, which can help with high-level intent queries but may affect accuracy for precise tool-specific queries.
  • Ensure your MCP servers array includes rich tool definitions (name, description, keywords) to maximize matching quality.
  • Cache behavior matters: tune cache lifetimes and plan a cache-invalidation strategy if your tool list changes frequently.
