mcp-coder-bench
Benchmark tool for measuring MCP server effectiveness in LLM-assisted development
claude mcp add --transport stdio mpecan-mcp-coder-bench /usr/local/bin/boarder mcp
How to use
mcp-coder-bench is a Rust-based benchmarking tool that evaluates how different MCP servers affect token usage, cost, and task completion when used with Claude Code. It supports parallel execution, multiple container runtimes, and several output formats, so you can quantify the benefit of an MCP-enabled workflow rather than guess at it. For each run the tool collects token usage (input, output, and cache operations), estimated cost, wall time, tool calls, and task success rate. It then compares scenarios statistically using confidence intervals, Welch's t-test, and effect sizes, so different MCP configurations can be evaluated side by side. A typical workflow, as covered in the project's Quick Start, is: initialize a configuration, validate it, run the benchmarks, analyze the results, and compare scenarios.
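To make the statistical comparison concrete, here is a minimal Python sketch of Welch's t statistic, its Welch-Satterthwaite degrees of freedom, and Cohen's d as an effect size. It is an illustration of the named techniques only, not the tool's own implementation: the function names and the token counts are hypothetical, and p-values and confidence intervals are left out because they need a t-distribution that the standard library does not provide.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    # Squared standard errors of each sample mean (sample variance / n).
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def cohens_d(a, b):
    """Effect size using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled = math.sqrt(
        ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    )
    return (mean(a) - mean(b)) / pooled


# Hypothetical total-token counts per run, made up for illustration.
baseline = [1200, 1350, 1280, 1400, 1310]
with_mcp = [900, 950, 1020, 880, 970]

t, df = welch_t(baseline, with_mcp)
d = cohens_d(baseline, with_mcp)
print(f"t = {t:.2f}, df = {df:.1f}, d = {d:.2f}")
```

Welch's variant is the usual choice when the two scenarios may have unequal variances, which is plausible when one configuration makes runs more consistent than the other.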
How to install
Prerequisites:
- Rust 1.75+ (for building from source)
- Docker or Podman (for containerized benchmarking)
- ANTHROPIC_API_KEY environment variable (for Claude access)
Install from source:
Option A: Build from source
# Ensure you have Rust and cargo installed
cargo install --path .
Option B: Build a release binary locally without installing it (optional; cargo writes the binary under target/release/)
cargo build --release
Verify installation:
mcp-coder-bench --version
Environment setup (example):
export ANTHROPIC_API_KEY=your_api_key_here
export RUST_LOG=info
Notes:
- The tool can auto-detect container runtimes (Docker/Podman) if installed.
- If you prefer a containerized workflow, ensure your environment has access to the required images.
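The prerequisites and notes above can be checked before a run with a small preflight script. The sketch below is a guess at what such a check might look like, not a description of how mcp-coder-bench detects runtimes: it simply looks for the docker and podman executables on PATH (the preference order is an assumption) and confirms that ANTHROPIC_API_KEY is set. The function names are hypothetical.

```python
import os
import shutil


def detect_runtime(candidates=("docker", "podman")):
    """Return the first candidate executable found on PATH, or None."""
    for name in candidates:
        if shutil.which(name):
            return name
    return None


def preflight(env=os.environ):
    """Collect human-readable problems instead of failing on the first one."""
    problems = []
    if not env.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")
    if detect_runtime() is None:
        problems.append("neither docker nor podman found on PATH")
    return problems


for problem in preflight():
    print("warning:", problem)
```

Finding an executable on PATH does not prove that the daemon is running or that the current user may start containers, so treat a clean result as a necessary check rather than a sufficient one.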
Additional notes
Tips and gotchas:
- Ensure ANTHROPIC_API_KEY is set to enable Claude access; without it, benchmark tasks may fail when invoking MCP-enabled prompts.
- For reproducible results, use the workspace isolation and reset strategies described in the configuration (e.g., copy strategy).
- The tool supports multiple output formats (JSON, CSV, Markdown, HTML) and can generate visualizations; pass the --format and --stats options to the analyze command to tailor reports.
- If you run in parallel, monitor resource usage (CPU/memory) to avoid contention that could skew timing metrics.
- When relying on container runtime auto-detection, make sure your user can run containers without interactive prompts (for example, without needing sudo).
- Always validate configuration with mcp-coder-bench validate before a full run to catch connectivity or image issues early.
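The copy strategy mentioned in the tips can be pictured as follows: every run receives its own fresh copy of a pristine template workspace, so changes made by one run never leak into the next. This Python sketch only illustrates the idea under that assumption; the helper name and directory layout are hypothetical, and the tool's actual reset behavior is defined by its configuration.

```python
import shutil
import tempfile
from pathlib import Path


def fresh_workspace(template: Path) -> Path:
    """Copy the template into a new temporary directory and return the copy."""
    target = Path(tempfile.mkdtemp(prefix="bench-ws-")) / template.name
    shutil.copytree(template, target)
    return target


# Demonstration with a throwaway template directory.
template = Path(tempfile.mkdtemp(prefix="bench-template-")) / "project"
template.mkdir()
(template / "main.txt").write_text("original\n")

run1 = fresh_workspace(template)
(run1 / "main.txt").write_text("modified by run 1\n")  # a run mutates its copy

run2 = fresh_workspace(template)  # the next run starts from the pristine template
print((run2 / "main.txt").read_text().strip())
```

Copying is slower than reusing a directory in place, but it keeps the template untouched, which is what makes repeated runs comparable.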