mcp-ocr
Recognizes the content of images through natural-language requests.
claude mcp add --transport stdio ricardo-m-l-mcp-ocr-server --env CONFIG_PATH="/app/configs/config.yaml" -- docker run -i mcp-ocr-server:latest
How to use
This MCP OCR server provides production-grade optical character recognition with intelligent image preprocessing. Built on GoCV and Tesseract, it offers multi-language recognition (English, Simplified Chinese, Traditional Chinese, Japanese), automatic image quality assessment, adaptive preprocessing, a worker pool for high throughput, result caching, and full Model Context Protocol (MCP) support. You interact with it through the included MCP tools: ocr_recognize_text, ocr_recognize_text_base64, ocr_batch_recognize, and ocr_get_supported_languages. The server is designed to be deployed via Docker and configured with a YAML file; once it is running, you pass image data and a language preference and receive the recognized text, confidence scores, and processing duration.
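As a rough illustration of what a call to one of these tools looks like on the wire, the sketch below builds an MCP `tools/call` request for ocr_recognize_text_base64. The JSON-RPC envelope follows the MCP specification, but the argument names `image_base64` and `language` are assumptions made for this example; the server's actual input schema (for instance as returned by `tools/list`) is the authority.

```python
import base64
import json


def build_ocr_request(image_bytes: bytes, language: str = "eng", request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for the base64 OCR tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "ocr_recognize_text_base64",
            # Hypothetical argument names; check the tool's real input schema.
            "arguments": {
                "image_base64": base64.b64encode(image_bytes).decode("ascii"),
                "language": language,
            },
        },
    }


if __name__ == "__main__":
    # Placeholder bytes standing in for the contents of a real image file.
    request = build_ocr_request(b"\x89PNG placeholder", language="chi_sim")
    print(json.dumps(request, indent=2))
```

In practice an MCP client such as Claude builds this request for you; the sketch is only meant to show which pieces of information (tool name, image data, language code) travel to the server.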
How to install
Prerequisites:
- Docker, plus Docker Compose if you want it for local development (optional)
- Go tooling, only if you choose to build from source; the recommended route is the provided Docker image.
- Optional: tessdata language packs for Tesseract if not included in the image
Installation steps (Docker-based):
- Pull or build the Docker image:
  - Build locally (if you have a Dockerfile): docker build -t mcp-ocr-server:latest .
  - Or pull from a registry (if published): docker pull mcp-ocr-server:latest
- Prepare configuration:
  - Create a config file, for example configs/config.yaml, with the OCR, preprocessing, and performance settings described in the README.
- Run the container:
  - Simple run mounting config:
    docker run --rm -it \
      -v "$(pwd)/configs:/app/configs" \
      -v "$(pwd)/test:/app/test" \
      mcp-ocr-server:latest
- Validate the server is running by calling an MCP tool (see below) or by checking logs.
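One way to validate a running server by calling MCP, sketched under assumptions: the messages below are the opening sequence an MCP client sends over the stdio transport (newline-delimited JSON-RPC), ending with `tools/list`. The client name `smoke-test` and the protocol version string are illustrative choices, and a strict client would wait for the `initialize` response before sending the remaining two messages.

```python
import json


def handshake_messages(protocol_version: str = "2024-11-05") -> list[str]:
    """Return the newline-delimited JSON-RPC messages a client sends first."""
    messages = [
        {
            "jsonrpc": "2.0",
            "id": 1,
            "method": "initialize",
            "params": {
                "protocolVersion": protocol_version,
                "capabilities": {},
                # Illustrative client identity, not something the server requires.
                "clientInfo": {"name": "smoke-test", "version": "0.1.0"},
            },
        },
        # Notifications carry no "id" because no response is expected.
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
    ]
    return [json.dumps(message) + "\n" for message in messages]


if __name__ == "__main__":
    for line in handshake_messages():
        print(line, end="")
```

If these lines are written to the standard input of a container started with `docker run -i`, a healthy server would be expected to answer the `tools/list` request with the OCR tools named in the "How to use" section.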
If you prefer building from source:
- Ensure Go 1.21+ and dependencies are installed.
- In the project root, run: make deps && make build
- Run the binary with a config file if required, e.g.: ./bin/mcp-ocr-server -config configs/config.yaml
Additional notes
Tips and caveats:
- Ensure Tesseract data is accessible (TESSDATA_PREFIX or tessdata path) and that the languages you request are installed (eng, chi_sim, chi_tra, jpn).
- OpenCV libraries must be present on the host or included in the image; for macOS/Linux, use the recommended installation commands from the README.
- If you encounter memory or performance issues, tune the worker_pool_size and cache_size in the YAML config as suggested in the performance tuning section.
- For Docker deployments, mount your configs directory into /app/configs inside the container to enable config-driven behavior; you can override via environment variables if supported by your image.
- If language data cannot be found, check TESSDATA_PREFIX and tessdata path in the container environment and ensure the data files are accessible to the running process.
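The language-data checks in the tips above can be automated with a small script. This is a minimal sketch that assumes TESSDATA_PREFIX points directly at the directory holding the `<lang>.traineddata` files (the convention in recent Tesseract releases; older versions expected the parent directory). The function name `missing_traineddata` is made up for this example.

```python
import os
from pathlib import Path


def missing_traineddata(tessdata_dir, languages=("eng", "chi_sim", "chi_tra", "jpn")):
    """Return the language codes with no <code>.traineddata file in tessdata_dir."""
    directory = Path(tessdata_dir)
    return [lang for lang in languages if not (directory / f"{lang}.traineddata").is_file()]


if __name__ == "__main__":
    # Assumes TESSDATA_PREFIX is the tessdata directory itself (see note above).
    prefix = os.environ.get("TESSDATA_PREFIX")
    if prefix is None:
        print("TESSDATA_PREFIX is not set")
    else:
        missing = missing_traineddata(prefix)
        print("missing language data:", ", ".join(missing) or "none")
```

Running it inside the container (or wherever the server process runs) checks the same filesystem the server sees, which matters when the data is mounted from the host.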
Related MCP Servers
trpc-agent-go
trpc-agent-go is a powerful Go framework for building intelligent agent systems using large language models (LLMs) and tools.
station
Station is our open-source runtime that lets teams deploy agents on their own infrastructure with full control.
tiger-cli
Tiger CLI is the command-line interface for Tiger Cloud. It includes an MCP server for helping coding agents write production-level Postgres code.
gopls
MCP server for Go project development: it extends what an AI code agent can do by giving it semantic understanding of, and deterministic information about, Go projects.
kubernetes
A Model Context Protocol (MCP) server for the Kubernetes API.
gcp-cost
💰 An MCP server that enables AI assistants to estimate Google Cloud costs, powered by Cloud Billing Catalog API and built with Genkit for Go