ToolRAG
Unlimited LLM tools, zero context penalties — ToolRAG serves exactly the LLM tools your user query demands.
```shell
claude mcp add --transport stdio antl3x-toolrag npx -y @antl3x/toolrag
```
How to use
ToolRAG is an MCP-enabled tool discovery and execution layer designed to manage an unlimited number of tool definitions for large language models. It uses semantic search over embedded tool descriptions to select the most relevant tools for a given user query, reducing context window usage and token costs while preserving performance. Tool definitions are registered to MCP servers and exposed in a way that LLMs can call them as standard OpenAI function calls. With ToolRAG, you can build multi-tool AI assistants that can query, filter, and execute actions across a large catalog of tools without overwhelming the model with every available function.
To use ToolRAG, start the server and configure your MCP servers (tool sources) during initialization: initialize ToolRAG in your client and pass it a list of MCP server URLs. When a user query arrives, ToolRAG retrieves the most relevant tools, constructs a tool list for the LLM to consider, and then executes the selected tool calls against the appropriate MCP servers. This enables seamless tool orchestration in which the model only interacts with the subset of tools that are contextually relevant to the current task.
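The retrieval step described above can be sketched in a self-contained TypeScript example. This is not ToolRAG's actual API; the tool names, toy 3-dimensional embeddings, and the `selectTools` helper are all hypothetical stand-ins (in production, an embedding model such as OpenAI's or Google's produces the vectors):

```typescript
// Hypothetical in-memory tool index with toy embedding vectors.
type Tool = { name: string; description: string; embedding: number[] };

const tools: Tool[] = [
  { name: "get_weather", description: "Fetch the weather forecast", embedding: [0.9, 0.1, 0.0] },
  { name: "send_email", description: "Send an email message", embedding: [0.1, 0.9, 0.1] },
  { name: "create_event", description: "Create a calendar event", embedding: [0.1, 0.2, 0.9] },
];

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Rank tools by similarity to the query embedding and keep the top k,
// so the LLM only ever sees the most relevant subset of the catalog.
function selectTools(queryEmbedding: number[], k: number): Tool[] {
  return [...tools]
    .sort((a, b) => cosine(b.embedding, queryEmbedding) - cosine(a.embedding, queryEmbedding))
    .slice(0, k);
}

// A weather-like query embedding retrieves get_weather first.
const top = selectTools([1, 0, 0], 1);
```

The selected subset is then formatted as an OpenAI-style tool list and handed to the model, which keeps the context window small no matter how many tools are registered.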
How to install
Prerequisites:
- Node.js (LTS) and npm or yarn installed on your system.
- Access to the internet to install npm packages.
Installation steps:
- Install Node.js if not already installed. Visit https://nodejs.org/ and install the LTS version for your platform.
- Install the ToolRAG package (as a local dev dependency, or run it directly via npx):
  - Using npm: npm install @antl3x/toolrag
  - Alternatively, run directly with npx (no installation required): npx -y @antl3x/toolrag
- Initialize ToolRAG in your project code according to your environment (Node.js example shown in the repository's quick start):

```shell
# If installed locally
npm install @antl3x/toolrag
```

```typescript
import { ToolRAG } from "@antl3x/toolrag";
import OpenAI from "openai";

const toolRag = await ToolRAG.init({
  mcpServers: [
    "https://mcp.example.com/token/tool-a",
    "https://mcp.example.com/token/tool-b",
  ],
});

// Use toolRag as described in the README
```
- Configure and connect to your MCP servers as needed by your deployment. Ensure that the MCP endpoints are reachable from the environment running ToolRAG.
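If you prefer a config file over the `claude mcp add` one-liner, ToolRAG can be registered as a stdio MCP server in a Claude Desktop-style JSON configuration. The `toolrag` key name below is arbitrary; only the `command` and `args` matter:

```json
{
  "mcpServers": {
    "toolrag": {
      "command": "npx",
      "args": ["-y", "@antl3x/toolrag"]
    }
  }
}
```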
Additional notes
Tips:
- ToolRAG relies on vector embeddings for semantic retrieval. Configure embedding providers (e.g., OpenAI or Google) and update your tool metadata accordingly.
- Ensure your MCP endpoints expose tool definitions in the expected OpenAI function definition format for smooth execution.
- Manage relevance thresholds and persistence settings to balance recall vs. precision in tool selection.
- Monitor token usage and latency: ToolRAG’s strength is reducing the number of tools considered by the LLM; fine-tuning thresholds helps optimize performance for your workload.
- If you run into connectivity issues with MCP servers, verify network access, authentication tokens, and CORS or API gateway settings as applicable.
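As a reference for the second tip above, here is a minimal example of a tool definition in the OpenAI function-calling format. The `get_weather` tool and its `city` parameter are illustrative; the field names (`type`, `function`, `parameters`) follow the OpenAI Chat Completions `tools` schema:

```typescript
// One entry of the `tools` array passed to an OpenAI chat completion.
// `parameters` is a JSON Schema describing the tool's arguments.
const weatherTool = {
  type: "function",
  function: {
    name: "get_weather",
    description: "Fetch the current weather for a city",
    parameters: {
      type: "object",
      properties: {
        city: { type: "string", description: "City name, e.g. Berlin" },
      },
      required: ["city"],
    },
  },
};
```

MCP tool definitions that map cleanly onto this shape can be forwarded to the model without any translation layer.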
Related MCP Servers
osaurus
AI edge infrastructure for macOS. Run local or cloud models, share tools across apps via MCP, and power AI workflows with a native, always-on runtime.
mcp-router
A Unified MCP Server Management App (MCP Manager).
Matryoshka
MCP server for token-efficient large document analysis via the use of REPL state
mcp-llm
An MCP server that provides LLMs access to other LLMs
kanban
MCP Kanban is a specialized middleware designed to facilitate interaction between Large Language Models (LLMs) and Planka, a Kanban board application. It serves as an intermediary layer that provides LLMs with a simplified and enhanced API to interact with Planka's task management system.
ContextPods
Model Context Protocol management suite/factory. An MCP that can generate and manage other local MCPs in multiple languages. Uses the official SDKs for code gen.