mineru
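MinerU MCP Server: a complete document processing solution. Supports PDF, PPTX, DOCX, and images, with true asynchronous concurrency, batch asynchronous parallel processing, natural-language interaction through MCP, and a stated 10x performance improvement.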
claude mcp add --transport stdio neosun100-mineru-mcp-server /path/to/mineru-mcp-server/.venv/bin/python3 /path/to/mineru-mcp-server/src/mineru_mcp_server.py \
  --env PYTHONPATH="/path/to/mineru-mcp-server/src"
How to use
MinerU MCP Server provides a document processing pipeline with native MCP integration. It handles PDF, Word, PowerPoint, image, and HTML files, and manages asynchronous processing, account tokens, and natural-language interaction through MCP. The server exposes tools for single-document processing, directory-wide batch processing with asynchronous concurrency, and token status queries. Typical workflows are processing one local file or a whole directory through the MCP tools, or running the CLI tools headlessly. The built-in batch and UI-enhanced processors report progress, speed, and per-file status in real time.
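As a rough illustration of what an MCP client sends when it invokes one of these tools, the sketch below builds `tools/call` requests in the JSON-RPC 2.0 shape used by MCP. The tool names (`process_document`, `process_directory`, `get_token_status`) are the ones mentioned on this page; the argument names (`file_path`, `directory`) are assumptions made for illustration and may not match the server's actual tool schemas.

```python
import json
from itertools import count

# Monotonically increasing request ids, as JSON-RPC expects a unique id per request.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # Argument names below are hypothetical; check the server's tool listing.
    print(build_tool_call("process_document", {"file_path": "/path/to/report.pdf"}))
    print(build_tool_call("process_directory", {"directory": "/path/to/docs"}))
    print(build_tool_call("get_token_status", {}))
```

In practice an MCP client such as Claude or Kiro builds these messages for you from a natural-language prompt; the sketch only shows the wire format.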
How to install
Prerequisites:
- Python 3.10+
- Git
- uv (Python package manager)
- Basic development environment with network access to install dependencies
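The prerequisites above can be checked before installing. The following is a minimal sketch using only the standard library; the function name `check_prerequisites` is made up for this example and is not part of the project.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)            # minimum version from the prerequisites list
REQUIRED_TOOLS = ("git", "uv")  # executables expected on PATH


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version_info[0]}.{version_info[1]}"
        )
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("MISSING:", issue)
    print("OK" if not issues else f"{len(issues)} problem(s) found")
```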
- Clone the repository
git clone https://github.com/neosun100/mineru-mcp-server.git
cd mineru-mcp-server
- Create and activate a virtual environment
uv venv
source .venv/bin/activate
- Install dependencies
uv pip install niquests PyPDF2 python-pptx python-docx mcp rich playwright pyyaml
playwright install chromium
- Configure accounts (example)
cp accounts.yaml.example accounts.yaml
vi accounts.yaml
- Run the MCP server (directly, as shown here, or through the MCP client configuration in the next step)
# Start the MCP server via Python entrypoint (adjust path as needed)
./.venv/bin/python3 src/mineru_mcp_server.py
- MCP server configuration (example in ~/.kiro/settings/mcp.json)
{
"mcpServers": {
"mineru": {
"command": "/path/to/mineru-mcp-server/.venv/bin/python3",
"args": ["/path/to/mineru-mcp-server/src/mineru_mcp_server.py"],
"env": {
"PYTHONPATH": "/path/to/mineru-mcp-server/src"
}
}
}
}
- Optional: set up the Kiro Skill for natural-language control
cp -r skills/mineru-token-manager ~/.kiro/skills/
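The `mcpServers` entry in the configuration step differs between machines only in the checkout path, so it can be generated rather than edited by hand. This is a sketch, assuming the repository layout shown in the steps above (`.venv/bin/python3` and `src/mineru_mcp_server.py` under the checkout); `build_mcp_config` is a hypothetical helper, not something the project ships.

```python
import json
from pathlib import PurePosixPath


def build_mcp_config(repo_dir: str, server_name: str = "mineru") -> dict:
    """Build the mcpServers entry for a checkout located at repo_dir."""
    repo = PurePosixPath(repo_dir)
    src = repo / "src"
    return {
        "mcpServers": {
            server_name: {
                # Interpreter inside the virtual environment created by `uv venv`.
                "command": str(repo / ".venv" / "bin" / "python3"),
                "args": [str(src / "mineru_mcp_server.py")],
                # PYTHONPATH must point at src so the server's modules resolve.
                "env": {"PYTHONPATH": str(src)},
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON; merge it into your MCP client's settings file as needed.
    print(json.dumps(build_mcp_config("/path/to/mineru-mcp-server"), indent=2))
```

Printing rather than writing the file avoids overwriting other servers that may already be configured in the same settings file.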
Additional notes
Tips:
- Ensure PYTHONPATH points to the src directory so MCP components load correctly.
- If you use batch login and token management without headless mode, run them in an environment with graphical support.
- For large documents, the server can split the input and process the parts in parallel; adjust the concurrency in the server's internal settings if needed.
- Use natural-language prompts in your MCP client to invoke the server's tools (e.g., process_document, process_directory, get_token_status).
- If you upgrade dependencies, re-run the installation steps so that binaries such as Playwright's Chromium stay in sync.
Common issues:
- Missing dependencies can cause an ImportError during startup; verify that the virtual environment is active and all required packages are installed.
- Token expiration checks may require network access; ensure accounts.yaml is correctly configured and accessible.
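To narrow down an ImportError at startup, one option is to check which of the packages from the install step are importable in the active environment. The sketch below maps each distribution name from the `uv pip install` line to its import name (several differ, for example `python-pptx` imports as `pptx`); `find_missing` is a name chosen for this example.

```python
import importlib.util

# Distribution name -> top-level import name for the packages in the install step.
REQUIRED_MODULES = {
    "niquests": "niquests",
    "PyPDF2": "PyPDF2",
    "python-pptx": "pptx",
    "python-docx": "docx",
    "mcp": "mcp",
    "rich": "rich",
    "playwright": "playwright",
    "pyyaml": "yaml",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the distribution names whose import module cannot be located."""
    return [
        dist
        for dist, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Run it with the virtual environment's interpreter (for example `./.venv/bin/python3`); running it with a different Python would report on the wrong environment.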