
any2markdown

A high-performance document conversion server that supports both the Model Context Protocol (MCP) and a RESTful API. It converts PDF, Word, and Excel documents to Markdown, with advanced features such as image extraction, header/footer removal, and batch processing.

Installation
Run this command in your terminal to add the MCP server to Claude Code:
claude mcp add --transport stdio ww-ai-lab-any2markdown python run_server.py \
  --env MCP_SERVER_HOST="0.0.0.0" \
  --env MCP_SERVER_PORT="3000" \
  --env USE_GPU="false" \
  --env MAX_FILE_SIZE="100MB" \
  --env MAX_CONCURRENT_JOBS="3" \
  --env TEMP_IMAGE_DIR="./temp_images" \
  --env MODEL_CACHE_DIR="~/.cache/marker" \
  --env HF_HOME="~/.cache/huggingface" \
  --env HF_HUB_CACHE="~/.cache/huggingface/hub" \
  --env HF_ASSETS_CACHE="~/.cache/huggingface/assets" \
  --env TORCH_HOME="~/.cache/torch" \
  --env TRANSFORMERS_CACHE="~/.cache/transformers" \
  --env HF_HUB_ENABLE_HF_TRANSFER="false" \
  --env HF_HUB_DISABLE_PROGRESS_BARS="false"

Environment variables (defaults shown above):

  • MCP_SERVER_HOST / MCP_SERVER_PORT — host and port for the MCP and REST endpoints.
  • USE_GPU — set to "true" to enable GPU acceleration.
  • MAX_FILE_SIZE — maximum upload size.
  • MAX_CONCURRENT_JOBS — maximum number of simultaneous conversions.
  • TEMP_IMAGE_DIR — directory for temporary extracted images.
  • MODEL_CACHE_DIR — cache path for the AI/marker models.
  • HF_HOME, HF_HUB_CACHE, HF_ASSETS_CACHE, TORCH_HOME, TRANSFORMERS_CACHE — Hugging Face and PyTorch cache locations.
  • HF_HUB_ENABLE_HF_TRANSFER — enable the hf_transfer download optimization.
  • HF_HUB_DISABLE_PROGRESS_BARS — hide model-download progress bars.

How to use

Any2Markdown is a high-performance document conversion server that exposes two interfaces: the Model Context Protocol (MCP), for streaming conversion requests from an MCP client, and a RESTful API for converting documents to Markdown, HTML, or JSON. It converts PDFs, Word documents, and Excel sheets into structured Markdown, with support for image extraction, header/footer removal, and batch processing. REST clients can upload files directly or send base64-encoded content, and the server publishes an OpenAPI/Swagger document for API exploration alongside tools/endpoints for converting documents and analyzing document structure. A typical workflow is either to upload a document to the REST API for one-shot conversion, or to request conversion through the MCP client and receive results as they are produced.
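As a minimal sketch of the base64 REST workflow described above: the snippet below reads a document and packages it as a JSON payload. The endpoint path and field names (`/api/v1/convert`, `filename`, `content`, `output_format`) are assumptions for illustration; check the server's OpenAPI docs at `/api/v1/docs` for the real schema.

```python
import base64
import json

# Hypothetical endpoint -- consult /api/v1/docs for the actual path and schema.
API_URL = "http://localhost:3000/api/v1/convert"

def build_conversion_payload(path: str, output_format: str = "markdown") -> str:
    """Read a document from disk and wrap it as a base64 JSON payload.

    Field names here are illustrative, not the server's confirmed schema.
    """
    with open(path, "rb") as f:
        content = base64.b64encode(f.read()).decode("ascii")
    return json.dumps({
        "filename": path,
        "content": content,
        "output_format": output_format,
    })

# The payload could then be POSTed with any HTTP client, e.g.:
# requests.post(API_URL, data=build_conversion_payload("report.pdf"),
#               headers={"Content-Type": "application/json"})
```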

How to install

Prerequisites:

  • Python 3.10–3.13 installed on the host (as validated by the project).
  • 4 GB+ RAM and 10 GB+ disk space for model caches and temporary files.
  • Network access for downloading models and dependencies.

Installation steps:

  1. Clone the repository:
git clone https://github.com/WW-AI-Lab/any2markdown.git
cd any2markdown
  2. Create and activate a virtual environment (recommended):
python3.13 -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
  3. Prepare the environment and install dependencies:
cp env.example .env
# Install dependencies (optionally via the mirror configured in .pip/pip.conf)
PIP_CONFIG_FILE=.pip/pip.conf pip install -r requirements.txt
  4. Or run the provided deployment scripts (if available):
./scripts/setup_venv.sh
./scripts/deploy.sh source
  5. Start the server (from the project root):
# Ensure the virtual environment is active
python run_server.py
  6. Verify that the server is accessible on the configured host and port.

Notes:

  • If using Docker, follow the Docker deployment instructions in the README to run the server in a container.
  • Provide environment variables in a .env file or via the mcp_config.env mapping as needed.

Additional notes

Tips and common issues:

  • Ensure that the model cache directories exist and are writable, e.g., ~/.cache/marker and the related HF cache paths.
  • When enabling GPU acceleration, install the appropriate CUDA drivers and ensure the host has a compatible NVIDIA GPU.
  • For large documents, consider increasing MAX_CONCURRENT_JOBS and ensuring enough CPU/GPU resources are available.
  • If images should be served from a static URL, configure the image extraction feature and hosting path accordingly.
  • Review the API docs at /api/v1/docs to understand request payloads for file uploads and base64-encoded content.
  • If you run into port conflicts, adjust MCP_SERVER_PORT in the environment or the command invocation.
