
mcp-tool-filter

Ultra-fast semantic tool filtering for MCP (Model Context Protocol) servers using embedding similarity. Reduce your tool context from 1000+ tools down to the most relevant 10-20 tools in under 10ms.

Installation
Run this command in your terminal to add the MCP server to Claude Code:

claude mcp add --transport stdio portkey-ai-mcp-tool-filter npx -y @portkey-ai/mcp-tool-filter

How to use

The @portkey-ai/mcp-tool-filter library provides ultra-fast semantic filtering for MCP (Model Context Protocol) servers by ranking tools on embedding similarity. It can run with local embeddings for very low latency or with API-based embeddings for higher accuracy. After initializing the filter with your MCP server definitions, call its filter method with a user query (or a piece of context) and it returns the top-K most relevant tools along with timing metrics. This shrinks large tool sets (often 1000+ tools) down to the most relevant 10-20 in under 10 milliseconds in typical local setups. The library also supports smart top-K selection, optionally including server descriptions in embeddings, and caching to further improve performance.
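The core technique — embed the query, score every tool by similarity, keep the top K — can be sketched in self-contained TypeScript. Note the `cosine` and toy bag-of-words `embed` functions below are stand-ins so the example runs without a model; the real library uses local or API embedding models instead.

```typescript
interface Tool {
  name: string;
  description: string;
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Toy bag-of-words "embedding" so the sketch runs without a model.
function embed(text: string, vocab: string[]): number[] {
  const words = text.toLowerCase().split(/\W+/);
  return vocab.map(v => words.filter(w => w === v).length);
}

// Rank all tools against the query and keep the top K.
function filterTools(query: string, tools: Tool[], topK: number): Tool[] {
  const vocab = Array.from(new Set(
    (query + " " + tools.map(t => t.name + " " + t.description).join(" "))
      .toLowerCase().split(/\W+/).filter(Boolean)
  ));
  const q = embed(query, vocab);
  return tools
    .map(t => ({
      tool: t,
      score: cosine(q, embed(t.name + " " + t.description, vocab)),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(r => r.tool);
}
```

With a real embedding model in place of `embed`, the same rank-and-slice loop is what makes sub-10ms local filtering feasible: the tool embeddings are computed once, so each query costs one embedding plus N cheap similarity comparisons.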

How to install

Prerequisites:

  • Node.js v14+ (latest LTS recommended)
  • npm or pnpm

Install the MCP tool filter package:

npm install @portkey-ai/mcp-tool-filter

If you plan to use API embeddings (e.g., OpenAI), obtain an API key and export it as an environment variable:

export OPENAI_API_KEY=sk-...

Optionally, verify your setup by running a small TypeScript/JavaScript snippet that initializes the filter and processes a sample MCP server list as shown in the Quick Start section of the README.
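The initialize-then-filter flow described above can be mocked in a few lines. The class and option names here (`topK`, `minScore`, the `filter` method) mirror the README's description, but this is an illustrative sketch, not the library's actual API — consult the package's Quick Start for the real signatures.

```typescript
interface ToolDef { name: string; description: string }
interface ServerDef { name: string; tools: ToolDef[] }
interface FilterResult { tools: ToolDef[]; elapsedMs: number }

// Mock of the initialize-then-filter flow; NOT the real API.
class MockToolFilter {
  private tools: ToolDef[] = [];
  constructor(private opts: { topK: number; minScore?: number }) {}

  // Flatten all server tool definitions into one searchable list.
  initialize(servers: ServerDef[]): void {
    this.tools = servers.flatMap(s => s.tools);
  }

  // Stand-in scoring: word overlap between query and tool description.
  filter(query: string): FilterResult {
    const start = Date.now();
    const qWords = new Set(query.toLowerCase().split(/\W+/));
    const ranked = this.tools
      .map(t => ({
        tool: t,
        score: t.description.toLowerCase().split(/\W+/)
          .filter(w => qWords.has(w)).length,
      }))
      .filter(r => r.score >= (this.opts.minScore ?? 0))
      .sort((a, b) => b.score - a.score)
      .slice(0, this.opts.topK);
    return { tools: ranked.map(r => r.tool), elapsedMs: Date.now() - start };
  }
}
```

The shape to notice: servers go in once at initialization, and each subsequent query is a cheap rank-and-slice that returns both the surviving tools and how long filtering took.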

Additional notes

Tips and caveats:

  • If you choose local embeddings, ensure the model you load is available on first run. The initial load may download a model (~25MB) and take some time, after which subsequent requests are sub-millisecond to a few milliseconds depending on hardware.
  • When using API embeddings, be aware of latency (typically 400-800 ms per request) and per-token costs.
  • You can adjust defaults like topK, minScore, and alwaysInclude via the MCPToolFilter configuration to tailor results to your use case.
  • If you enable includeServerDescription, embeddings may include additional context about the server, which can help with high-level intent queries but may affect accuracy for precise tool-specific queries.
  • Ensure your MCP servers array includes rich tool definitions (name, description, keywords) to maximize matching quality.
  • Cache behavior matters: tune cache lifetimes and plan a cache-invalidation strategy if your tool list changes frequently.
