mineru_mcp_project
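Wraps Mineru's official PDF parsing capability as a FastMCP service so that any client supporting the MCP protocol can call it, enabling batch PDF-to-HTML conversion (including organizing and naming image assets).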
claude mcp add --transport stdio harmonychen110-mineru_mcp_project python -m mineru_mcp_project
How to use
This MCP server wraps Mineru's PDF parsing capability in a FastMCP tool named convert_pdfs_with_mineru. The tool scans a folder for PDF files, uploads them to Mineru for OCR and conversion, and then downloads the resulting ZIP containing HTML and image assets. It supports a configurable language, table recognition, extra output formats, and optional renaming of image assets so that references in the HTML stay consistent.

To use it, start the service locally and connect with any MCP-compatible client. The convert_pdfs_with_mineru tool accepts parameters such as pdf_folder, output_folder, api_token (or it falls back to the MINERU_API_TOKEN environment variable), language, enable_table, extra_formats, and more. When the job completes, you receive a structured summary that includes the upload status, the conversion results, and the location of the output HTML and assets.
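A rough sketch of what a call from Python might look like. It assumes the server is already running at its default address and that the `fastmcp` package's `Client` class is installed. The folder paths are placeholders, and treating `enable_table` as a boolean is an assumption; check the parameter schema the server reports before relying on it. `build_arguments` is a hypothetical helper, not part of the project.

```python
import asyncio

SERVER_URL = "http://127.0.0.1:4399/mcp/"  # default address from the install notes
TOOL_NAME = "convert_pdfs_with_mineru"


def build_arguments(pdf_folder, output_folder, api_token=None, **extra):
    """Assemble tool arguments, leaving out api_token when none is given.

    Hypothetical helper: when api_token is omitted, the server is expected
    to fall back to the MINERU_API_TOKEN environment variable.
    """
    args = {"pdf_folder": pdf_folder, "output_folder": output_folder}
    if api_token:
        args["api_token"] = api_token
    args.update(extra)
    return args


async def main():
    # Imported lazily so build_arguments works without fastmcp installed.
    from fastmcp import Client

    async with Client(SERVER_URL) as client:
        # List the tools first to confirm the name and inspect the schema.
        tools = await client.list_tools()
        print([tool.name for tool in tools])

        # enable_table=True is an assumed value type; placeholder paths.
        arguments = build_arguments("./pdfs", "./out", enable_table=True)
        result = await client.call_tool(TOOL_NAME, arguments)
        print(result)


# With the server running, start the call with:
# asyncio.run(main())
```

Listing the tools before calling one is a cheap way to verify that the client reached the right server and to see the actual parameter names.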
How to install
Prerequisites:
- Python 3.10 or newer
- Internet access to Mineru (mineru.net) and the ability to reach Mineru APIs

1) Install the package in editable mode (from the repo root):
    pip install -e .

2) Provide Mineru API Token (either via env var or per-call):
    # Option A: set environment variable for the session
    export MINERU_API_TOKEN=your_token_here
    # Option B: pass api_token in the call to convert_pdfs_with_mineru

3) Start the MCP server:
    # If installed with the console script, you can use the wrapper:
    run-mineru-mcp
    # Alternatively, run as a Python module (equivalent):
    python -m mineru_mcp_project

The service will listen by default at http://127.0.0.1:4399/mcp/.
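Before starting the server, the two prerequisites above (Python version and API token) can be checked with a short script. This is a minimal sketch using only the standard library. The function names are hypothetical, and the precedence shown (a per-call `api_token` wins over the environment variable) is an assumption about how the tool resolves the token.

```python
import os
import sys

MIN_PYTHON = (3, 10)
TOKEN_ENV = "MINERU_API_TOKEN"


def resolve_token(api_token=None, env=None):
    """Return the per-call token if given, else the env var, else None.

    The precedence is an assumption, not confirmed server behavior.
    """
    env = os.environ if env is None else env
    return api_token or env.get(TOKEN_ENV) or None


def preflight(version=None, api_token=None, env=None):
    """Collect readable problems; an empty list means the checks passed."""
    version = sys.version_info[:2] if version is None else tuple(version)
    problems = []
    if version < MIN_PYTHON:
        problems.append(f"Python {version[0]}.{version[1]} found, 3.10+ required")
    if resolve_token(api_token, env) is None:
        problems.append(f"{TOKEN_ENV} is not set and no api_token was passed")
    return problems


# Check the current interpreter and environment.
for problem in preflight():
    print("problem:", problem)
```

The `version` and `env` parameters exist only so the checks can be exercised without changing the real environment.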
Additional notes
Environment variables and configuration:
- MINERU_API_TOKEN: Mineru API token. Can be provided via this env var or per-call via api_token.
- Ensure network access to https://mineru.net.

Common issues and tips:
- If upload or conversion fails, verify the API token and network connectivity.
- For long-running tasks, adjust max_wait and poll_interval accordingly in the convert_pdfs_with_mineru parameters.
- If assets are not renamed in the output HTML, enable rename_assets_flag in the tool parameters and ensure the ZIP contains figure/images folders.
- The server exposes convert_pdfs_with_mineru only after the service starts; ensure the MCP client calls the correct tool name and parameter schema.
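To make the max_wait and poll_interval tip more concrete, here is a generic polling loop showing how two such parameters typically interact. It is an illustrative sketch, not the server's actual implementation: `wait_for_result`, the `check` callback, and the default values of 600 and 5 seconds are all assumptions.

```python
import time


def wait_for_result(check, max_wait=600.0, poll_interval=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns a non-None value or max_wait elapses.

    check         -- callable returning the result, or None if not ready yet
    max_wait      -- total time budget in seconds (placeholder default)
    poll_interval -- pause between checks in seconds (placeholder default)
    """
    deadline = clock() + max_wait
    while True:
        result = check()
        if result is not None:
            return result
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError(f"no result within {max_wait} seconds")
        # Never sleep past the deadline, so the timeout stays accurate.
        sleep(min(poll_interval, remaining))
```

In a loop of this shape, a larger max_wait gives big batches more time to finish, while a larger poll_interval reduces the number of status requests at the cost of noticing completion later.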
Related MCP Servers
mineru-tianshu
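Tianshu (天枢): enterprise-grade, all-in-one AI data preprocessing platform | PDF/Office to Markdown | MCP protocol support for AI assistant integration | Vue3 + FastAPI full-stack solution | document parsing | multimodal information extraction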
ebook
A MCP server that supports mainstream eBook formats including EPUB, PDF and more. Simplify your eBook user experience with LLM.
nautex
MCP server for guiding Coding Agents via end-to-end requirements to implementation plan pipeline
mcp-yfinance
Real-time stock API with Python, MCP server example, yfinance stock analysis dashboard
cloudwatch-logs
MCP server from serkanh/cloudwatch-logs-mcp
servicenow-api
ServiceNow MCP Server and API Wrapper