mcp-testing-framework
Testing framework for Model Context Protocol (MCP)
claude mcp add --transport stdio l-qun-mcp-testing-framework npx -y mcp-server
How to use
The MCP Testing Framework is an evaluation tool designed to help you test and measure how well your MCP servers integrate with a variety of AI models. It supports batch testing across multiple model providers, including OpenAI, Google Gemini, Anthropic, Deepseek, and custom models, so you can assess how model outputs align with your server definitions and tool calls.

Use it to run predefined test cases against one or more MCP servers, collect results, and generate a structured report that highlights pass rates and potential tool-call issues. The framework can run against multiple MCP servers in parallel, allowing you to compare performance and behavior across environments or configurations.

To start, initialize a project with the framework, then configure test rounds, pass thresholds, and the models to test in your project configuration. You can then run evaluations to produce a report saved in the mcp-report directory.
How to install
Prerequisites:
- Node.js (recommended: LTS version) and npm installed
- Git (optional, for cloning examples)
Install the MCP Testing Framework and initialize a project:
- Create a new project directory: mkdir my-mcp-tests && cd my-mcp-tests
- Initialize with a ready-made example: npx mcp-testing-framework@latest init [target directory] --example getting-started
Configure the project:
- Edit mcp-testing-framework.yaml to set testRound, passThreshold, modelsToTest, and testCases as needed.
- Define one or more MCP servers in the mcpServers section of the config (see the README for examples).
- Create a .env file to provide API keys for required models (e.g., OPENAI_API_KEY, GEMINI_API_KEY, ANTHROPIC_API_KEY, DEEPSEEK_API_KEY).
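Putting the pieces above together, a configuration might look like the sketch below. The field names testRound, passThreshold, modelsToTest, mcpServers, and testCases come from this guide; the nesting, values, and the test-case fields shown are assumptions for illustration — consult the project README for the authoritative schema.

```yaml
# mcp-testing-framework.yaml — hypothetical layout; see README for the real schema
testRound: 3            # number of evaluation rounds per model (assumed numeric)
passThreshold: 0.8      # minimum success rate for a passing test (assumed 0–1 scale)

modelsToTest:
  - openai:gpt-4o       # provider:model format inferred from the custom-model example
  - anthropic:claude-3-5-sonnet

mcpServers:
  example-server:       # server name and launch command are placeholders
    command: npx
    args: ["-y", "my-mcp-server"]

testCases:
  - prompt: "What is the weather in Paris?"   # placeholder prompt
    expectedToolCall: get_weather             # hypothetical field and tool name
```

The corresponding API keys (for example OPENAI_API_KEY and ANTHROPIC_API_KEY) would go in the .env file rather than in this YAML.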
Run evaluations:
- Ensure your MCP servers are running or available as configured.
- Execute the evaluation process: npx mcp-testing-framework@latest evaluate
- The framework will run test cases, collect results, and write a report to the mcp-report directory.
Additional notes
Tips and common considerations:
- Environment variables: Many model providers require API keys. Add these to a .env file as described in the configuration sections (e.g., OPENAI_API_KEY, GEMINI_API_KEY, ANTHROPIC_API_KEY, DEEPSEEK_API_KEY).
- Test configuration: Use mcp-testing-framework.yaml to adjust testRound (number of rounds per model), passThreshold (minimum success rate), and the list of models to test. You can mix OpenAI, Gemini, Anthropic, Deepseek, and custom providers.
- Custom models: If you implement your own model provider, register it with registerProvider and reference it in modelsToTest as my-custom:my-model-name.
- Output: After evaluation, review the generated mcp-report to understand which tool calls passed, failed, or behaved unexpectedly. This helps you refine MCP server definitions and tool parameters.
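As a concrete illustration of mixing providers, a custom provider registered via registerProvider could appear alongside built-in providers in modelsToTest. The provider:model format for the built-in entries is extrapolated from the my-custom:my-model-name example above, and the specific model identifiers are placeholders.

```yaml
modelsToTest:
  - openai:gpt-4o            # built-in provider; model identifier is a placeholder
  - deepseek:deepseek-chat
  - my-custom:my-model-name  # custom provider registered with registerProvider
```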
Related MCP Servers
iterm
A Model Context Protocol server that executes commands in the current iTerm session - useful for REPL and CLI assistance
mcp
Octopus Deploy Official MCP Server
furi
CLI & API for MCP management
editor
MCP Server for Phaser Editor
DoorDash
MCP server from JordanDalton/DoorDash-MCP-Server
mcp
MCP server for automatically creating and deploying applications in Timeweb Cloud