mcp-coder-bench

Benchmark tool for measuring MCP server effectiveness in LLM-assisted development

Installation

Run this command in your terminal to add the MCP server to Claude Code.

claude mcp add --transport stdio mpecan-mcp-coder-bench /usr/local/bin/boarder mcp

How to use

mcp-coder-bench is a Rust-based benchmarking tool that measures how different MCP servers affect token usage, cost, and task completion when used with Claude Code. It supports parallel execution, multiple runtimes, and several output formats, so you can quantify the benefits of MCP-enabled workflows.

For each run, the tool collects token usage (input, output, and cache operations), estimated cost, wall time, tool calls, and task success rate. It then applies statistical analyses (confidence intervals, Welch's t-test, and effect sizes) so that you can compare scenarios and MCP configurations side by side.

The Quick Start covers the typical workflow: initialize a configuration, validate it, run the benchmarks, analyze the results, and compare scenarios to determine how MCP servers affect your development tasks.
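As a rough illustration of the statistics named above, the following Rust sketch computes Welch's t statistic, the Welch-Satterthwaite degrees of freedom, and Cohen's d (one common effect-size measure) for two samples, such as token counts with and without an MCP server. It is not taken from mcp-coder-bench: the function names are hypothetical, and the tool may define its effect size differently. Turning t into a p-value or a confidence interval also requires the Student t distribution, which is omitted here.

```rust
/// Sample mean and unbiased sample variance (n - 1 denominator).
/// Assumes the slice holds at least two values.
fn mean_var(xs: &[f64]) -> (f64, f64) {
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    let var = xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, var)
}

/// Welch's t statistic and the Welch-Satterthwaite degrees of freedom.
/// Unlike Student's t-test, this does not assume equal variances.
/// Assumes at least one sample has non-zero variance.
fn welch_t(a: &[f64], b: &[f64]) -> (f64, f64) {
    let (mean_a, var_a) = mean_var(a);
    let (mean_b, var_b) = mean_var(b);
    let (n_a, n_b) = (a.len() as f64, b.len() as f64);
    // Squared standard error contributed by each sample.
    let (se_a, se_b) = (var_a / n_a, var_b / n_b);
    let t = (mean_a - mean_b) / (se_a + se_b).sqrt();
    let df = (se_a + se_b).powi(2)
        / (se_a * se_a / (n_a - 1.0) + se_b * se_b / (n_b - 1.0));
    (t, df)
}

/// Cohen's d using the pooled standard deviation of both samples.
fn cohens_d(a: &[f64], b: &[f64]) -> f64 {
    let (mean_a, var_a) = mean_var(a);
    let (mean_b, var_b) = mean_var(b);
    let (n_a, n_b) = (a.len() as f64, b.len() as f64);
    let pooled_var = ((n_a - 1.0) * var_a + (n_b - 1.0) * var_b) / (n_a + n_b - 2.0);
    (mean_a - mean_b) / pooled_var.sqrt()
}
```

For example, welch_t(&[10.0, 12.0, 14.0], &[20.0, 22.0, 24.0, 26.0]) gives t ≈ -6.35 with about 4.96 degrees of freedom.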

How to install

Prerequisites:

Rust 1.75+ (for building from source)
Docker or Podman (for containerized benchmarking)
ANTHROPIC_API_KEY environment variable (for Claude access)

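Install from source (run these commands from the repository root):

Option A: Build and install with cargo

# Ensure you have Rust and cargo installed
cargo install --path .

Option B: Build a release binary locally without installing it (optional; cargo writes the binary under target/release/)

cargo build --release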

Verify installation:

mcp-coder-bench --version

Environment setup (example):

export ANTHROPIC_API_KEY=your_api_key_here
export RUST_LOG=info

Notes:

The tool can auto-detect container runtimes (Docker/Podman) if installed.
If you prefer a containerized workflow, ensure your environment has access to the required images.
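The idea behind runtime auto-detection can be sketched as a search of the PATH directories for a known executable name. The Rust snippet below is an illustration only, not the tool's actual logic: the function names are hypothetical, the Docker-before-Podman preference is an assumption, and the sketch ignores Windows .exe suffixes and executable permission bits.

```rust
use std::env;
use std::path::PathBuf;

/// Return the first candidate name that exists as a regular file in one of
/// the given directories. Candidates are tried in order, so earlier names
/// take priority over later ones.
fn find_in_dirs(candidates: &[&str], dirs: &[PathBuf]) -> Option<PathBuf> {
    for name in candidates {
        for dir in dirs {
            let full = dir.join(name);
            if full.is_file() {
                return Some(full);
            }
        }
    }
    None
}

/// Look for a container runtime on PATH.
/// Assumption: Docker is preferred over Podman when both are installed.
fn detect_runtime() -> Option<PathBuf> {
    let dirs: Vec<PathBuf> = env::var_os("PATH")
        .map(|p| env::split_paths(&p).collect())
        .unwrap_or_default();
    find_in_dirs(&["docker", "podman"], &dirs)
}
```

Calling detect_runtime() returns the full path of the first runtime found, or None when neither is on PATH, in which case a containerized run cannot start.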

Additional notes

Tips and gotchas:

Ensure ANTHROPIC_API_KEY is set to enable Claude access; without it, benchmark tasks may fail when invoking MCP-enabled prompts.
For reproducible results, use the workspace isolation and reset strategies described in the configuration (e.g., copy strategy).
The tool supports multiple output formats (JSON, CSV, Markdown, HTML) and can generate visualizations; use --format and --stats options during analyze to tailor reports.
If you run in parallel, monitor resource usage (CPU/memory) to avoid contention that could skew timing metrics.
When relying on container runtime auto-detection, make sure your user can run containers without interactive prompts (for example, without needing sudo).
Always validate configuration with mcp-coder-bench validate before a full run to catch connectivity or image issues early.
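The first tip can be turned into a small preflight check that runs before any benchmark starts. The Rust sketch below is hypothetical (the type and function names are not part of mcp-coder-bench); it only distinguishes a missing key from an empty one and does not verify that the key is valid.

```rust
/// Result of inspecting an environment variable before a benchmark run.
#[derive(Debug, PartialEq)]
enum KeyStatus {
    Present,
    Empty,
    Missing,
}

/// Classify a raw value: unset, set but blank, or set to something.
fn classify_key(value: Option<&str>) -> KeyStatus {
    match value {
        None => KeyStatus::Missing,
        Some(v) if v.trim().is_empty() => KeyStatus::Empty,
        Some(_) => KeyStatus::Present,
    }
}

/// Check ANTHROPIC_API_KEY in the current process environment.
fn check_api_key() -> KeyStatus {
    let value = std::env::var("ANTHROPIC_API_KEY").ok();
    classify_key(value.as_deref())
}
```

Keeping the classification separate from the environment lookup makes the check easy to test without modifying the real environment.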
