mineru-tianshu
Tianshu (天枢) - Enterprise-grade all-in-one AI data preprocessing platform | PDF/Office to Markdown | MCP-protocol AI assistant integration | Vue3 + FastAPI full-stack solution | Document parsing | Multimodal information extraction
claude mcp add --transport stdio magicyuan876-mineru-tianshu python -m backend.mcp_server \
  --env MCP_HOST="0.0.0.0 (bind to all interfaces) or specific host" \
  --env MCP_PORT="5000 (MCP listening port)" \
  --env JWT_SECRET="Secret key for JWT tokens" \
  --env GPU_ENABLED="true/false to indicate GPU acceleration" \
  --env DATABASE_URL="Database connection string (PostgreSQL/MySQL) if authentication/permissions are enabled" \
  --env RUSTFS_PUBLIC_URL="URL of public RustFS object storage (if used)"
How to use
Tianshu is an enterprise-grade AI data preprocessing platform that exposes an MCP (Model Context Protocol) server for orchestrating AI-assisted data extraction and structuring tasks. It combines a front-end dashboard, a FastAPI back-end, and a GPU-accelerated worker pool to convert unstructured inputs (PDFs, Office documents, images, audio, video) into structured Markdown/JSON and other outputs. Through the MCP interface you can submit tasks, monitor progress, and retrieve structured results using standardized MCP commands. The server supports multiple data formats (documents, images, audio, video, biological formats) and routes tasks to specialized engines (MinerU, PaddleOCR-VL, SenseVoice, and custom plugins). It also provides authentication, queue management, and a containerized Docker deployment path for scalable production use.
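MCP messages are JSON-RPC 2.0, and tool invocations go through the standard tools/call method. The sketch below frames such a request; the tool name submit_task and its arguments are hypothetical placeholders — query the server's tools/list endpoint for the actual tool names and schemas Tianshu exposes.

```python
import json

def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Frame an MCP tools/call request as a JSON-RPC 2.0 message string."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(payload)

# Hypothetical tool name and arguments -- check tools/list for the real schema.
msg = build_tool_call(1, "submit_task", {"file_url": "https://example.com/report.pdf"})
print(msg)
```

In practice an MCP client library handles this framing for you; the payload shape is shown only to make clear what "standardized MCP commands" means on the wire.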
How to install
Prerequisites:
- Docker and Docker Compose (recommended for production) or a Python 3.8+ environment
- NVIDIA GPU drivers and CUDA toolkit if you intend to use GPU acceleration
- Access to a compatible object store if you plan to publish outputs (optional)
Option A – Docker-based deployment (recommended):
- Ensure Docker is installed and running.
- Prepare environment variables in a .env file or via your orchestration system, e.g. MCP_HOST=0.0.0.0, MCP_PORT=5000, JWT_SECRET=your-secure-secret, RUSTFS_PUBLIC_URL=https://your-rustfs.example.com
- Deploy with docker-compose (if a compose file is provided) or with the repo's make targets (e.g., make setup).
- Access the MCP server endpoint at the configured host/port.
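Collected into a .env file, the example variables above would look like the following (all values are placeholders; the DATABASE_URL format is an assumption — use whichever PostgreSQL/MySQL connection string matches your setup):

```
MCP_HOST=0.0.0.0
MCP_PORT=5000
JWT_SECRET=your-secure-secret
GPU_ENABLED=true
DATABASE_URL=postgresql://user:pass@db:5432/tianshu
RUSTFS_PUBLIC_URL=https://your-rustfs.example.com
```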
Option B – Python environment installation:
- Install Python 3.8+ and create a virtual environment: python3 -m venv venv && source venv/bin/activate
- Install requirements: pip install -r backend/requirements.txt
- Run the MCP server module: python -m backend.mcp_server
- Ensure the server binds to the desired host/port and that prerequisites (DB, object storage) are reachable.
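After starting the server, a quick way to confirm it is listening on the configured host/port is a plain TCP probe — a generic sketch, not a Tianshu-specific health check:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    host, port = "127.0.0.1", 5000  # match your MCP_HOST / MCP_PORT
    print("listening" if port_open(host, port) else "not reachable")
```

A successful TCP connect only proves the port is bound; it does not verify that back-end dependencies (DB, object storage) are healthy.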
Notes:
- If you enable GPU acceleration, ensure CUDA toolkit is compatible with your driver and container runtime.
- Configure environment variables for authentication, database access, and storage as needed.
Additional notes
Tips and common issues:
- Check that MCP_PORT is open in your firewall and not blocked by another service.
- When using Docker, ensure NVIDIA runtime is installed and configured if GPU is required.
- For JWT-based authentication, securely manage JWT_SECRET and rotate it periodically.
- If outputs reference object storage URLs, ensure RUSTFS_PUBLIC_URL (or equivalent) is reachable from downstream systems.
- The MCP server may rely on back-end services (DB, auth service, storage); verify all dependencies are healthy before starting MCP.
- If you see task queuing delays, inspect worker pool settings and GPU availability; adjust WORKER_MEMORY_LIMIT and related env vars if needed.
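For the JWT_SECRET tip above, a cryptographically strong secret can be generated with Python's standard secrets module (a generic sketch — Tianshu does not mandate a particular secret format or length):

```python
import secrets

def new_jwt_secret(nbytes: int = 32) -> str:
    """Generate a URL-safe random secret with nbytes of entropy."""
    return secrets.token_urlsafe(nbytes)

print(new_jwt_secret())  # a 43-character URL-safe string for nbytes=32
```

When rotating, generate the new secret this way, update JWT_SECRET in your environment, and restart the server so newly issued tokens use the new key.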
Related MCP Servers
nautex
MCP server for guiding Coding Agents via end-to-end requirements to implementation plan pipeline
mcp-yfinance
Real-time stock API with Python, MCP server example, yfinance stock analysis dashboard
cloudwatch-logs
MCP server from serkanh/cloudwatch-logs-mcp
servicenow-api
ServiceNow MCP Server and API Wrapper
the-mcp-company
TheMCPCompany: Creating General-purpose Agents with Task-specific Tools
Convert-Markdown-PDF
Markdown To PDF Conversion MCP