MCPCorpus
MCPCorpus is a comprehensive dataset for analyzing the Model Context Protocol (MCP) ecosystem, containing ~14K MCP servers and 300 MCP clients with 20+ normalized metadata attributes.
claude mcp add --transport stdio snakinya-mcpcorpus python Website/server.py \
  --env GITHUB_TOKEN="<your GitHub token>"  # optional; used to enrich metadata during updates
How to use
MCPCorpus ships with a lightweight local web interface for exploring the dataset and accessing it programmatically. To get started, run the included Python web server, which serves both the search interface and the dataset files, then open http://localhost:8000 in your browser to search, filter, and browse the stored MCP artifacts. You can also load the underlying JSON datasets (located under Crawler/Servers and Crawler/Clients) directly in your own code for programmatic analysis, for example by converting them into DataFrames for research workflows. To update the dataset, run the provided data collection scripts to fetch new servers and clients, then refresh the GitHub metadata with the included tooling.
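As a minimal sketch, the JSON artifacts can be loaded with the standard library alone; the exact file layout under Crawler/Servers is an assumption here, so adjust the glob pattern to match the repository:

```python
import json
from pathlib import Path
from typing import Dict, List

def load_artifacts(root: str) -> List[Dict]:
    """Load every JSON file under `root` into a single list of records."""
    records: List[Dict] = []
    for path in sorted(Path(root).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
            # A file may contain a single record or a list of records.
            records.extend(data if isinstance(data, list) else [data])
    return records

servers = load_artifacts("Crawler/Servers")
print(f"Loaded {len(servers)} server records")
```

From here the records can be handed to `pandas.DataFrame(servers)` or similar tooling for analysis.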
How to install
Prerequisites:
- Python 3.8+ installed on your machine
- Access to the repository containing MCPCorpus
Installation steps:
- Clone the repository (or ensure you have the MCPCorpus directory structure locally).
- Install Python dependencies if a requirements.txt is present (the server may also run on the standard library alone):
pip install -r requirements.txt
- Start the local web server for exploring the dataset:
python Website/server.py
- Open http://localhost:8000 in your web browser to interact with the MCPCorpus interface.
Optional: If you intend to update the dataset or collect new metadata, ensure you can run the data collection scripts located under Crawler/Servers and Crawler/Clients, and have a GitHub token if you plan to enrich metadata via github_info_collector.py.
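A hypothetical orchestration of the update workflow (collect servers, collect clients, enrich via GitHub) might look like the following sketch; only github_info_collector.py is named in this README, so the other two script paths are placeholders:

```python
import subprocess
from typing import List

# Hypothetical update pipeline: the two collector script names are
# placeholders and should be replaced with the actual scripts found
# under Crawler/Servers and Crawler/Clients.
UPDATE_STEPS = [
    ["python", "Crawler/Servers/collect_servers.py"],   # (a) collect new server data
    ["python", "Crawler/Clients/collect_clients.py"],   # (b) collect new client data
    ["python", "Crawler/github_info_collector.py"],     # (c) enrich with GitHub metadata
]

def run_update(dry_run: bool = True) -> List[str]:
    """Run each update step in order; with dry_run, only report the commands."""
    executed = []
    for cmd in UPDATE_STEPS:
        executed.append(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return executed

print("\n".join(run_update(dry_run=True)))
```

Running the steps sequentially with `check=True` stops the pipeline early if any collection stage fails.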
Additional notes
Notes and tips:
- The dataset and web interface are designed for local exploration; the repository includes scripts to collect server/client data and to enrich metadata via GitHub. If you plan to run the enrichment step, you may need a GitHub token and appropriate API access quotas.
- Update steps typically involve: (a) collecting new server data, (b) collecting new client data, (c) optionally enriching with GitHub metadata.
- The environment variable GITHUB_TOKEN is optional but recommended for larger updates to avoid rate limits when querying GitHub during enrichment.
- The mcp_config shown here assumes the server is started with a Python script at Website/server.py; adjust paths if you relocate the server script.
- If port 8000 is already in use, you can modify the server script to listen on a different port or run a local proxy as needed.
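To illustrate how GITHUB_TOKEN raises the rate limit during enrichment, here is a minimal sketch of building an authenticated GitHub REST API request with the standard library; the `/repos` endpoint shown is the standard metadata endpoint, and how github_info_collector.py actually queries GitHub is an assumption:

```python
import os
import urllib.request

def build_github_request(owner: str, repo: str) -> urllib.request.Request:
    """Build a GitHub REST API request for repository metadata.

    If GITHUB_TOKEN is set in the environment, it is sent as a Bearer
    token, raising the rate limit from 60 to 5,000 requests per hour.
    """
    req = urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}",
        headers={"Accept": "application/vnd.github+json"},
    )
    token = os.environ.get("GITHUB_TOKEN")
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    return req

# To execute the request:
#   with urllib.request.urlopen(build_github_request("owner", "repo")) as resp:
#       metadata = json.load(resp)
```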