mcp-ocr

Recognizes the content of images through natural-language requests.

Installation

Run this command in your terminal to add the MCP server to Claude Code.

claude mcp add --transport stdio ricardo-m-l-mcp-ocr-server -- \
  docker run -i -e CONFIG_PATH="/app/configs/config.yaml" mcp-ocr-server:latest

How to use

This MCP OCR server provides production-grade optical character recognition with intelligent image preprocessing. It is built on GoCV and Tesseract and offers multi-language recognition (English, Simplified Chinese, Traditional Chinese, Japanese), automatic image quality assessment, adaptive preprocessing, a worker pool for high throughput, result caching, and full Model Context Protocol (MCP) support. You interact with it through the included MCP tools: ocr_recognize_text, ocr_recognize_text_base64, ocr_batch_recognize, and ocr_get_supported_languages. The server is designed to be deployed via Docker and configured with a YAML file. Once it is running, you pass image data and language preferences to a tool and receive the recognized text, confidence scores, and processing duration.
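To make the tool interface above more concrete, here is a minimal sketch of how a client might build a request for ocr_recognize_text_base64. The JSON-RPC 2.0 envelope and the tools/call method come from the MCP specification; the argument keys image_base64 and language are assumptions for illustration, so check the tool's published input schema before relying on them.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// rpcRequest is a JSON-RPC 2.0 request envelope, which MCP uses on the wire.
type rpcRequest struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  callParams `json:"params"`
}

// callParams is the parameter shape of the MCP "tools/call" method.
type callParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// buildBase64Call base64-encodes raw image bytes and wraps them in a
// tools/call request for ocr_recognize_text_base64.
// ASSUMPTION: the argument keys "image_base64" and "language" are
// hypothetical; the real names are defined by the server's tool schema.
func buildBase64Call(id int, image []byte, language string) (string, error) {
	req := rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params: callParams{
			Name: "ocr_recognize_text_base64",
			Arguments: map[string]any{
				"image_base64": base64.StdEncoding.EncodeToString(image),
				"language":     language,
			},
		},
	}
	out, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// Placeholder bytes stand in for a real image file read from disk.
	line, err := buildBase64Call(1, []byte("fake image bytes"), "eng")
	if err != nil {
		panic(err)
	}
	fmt.Println(line)
}
```

With the stdio transport, each request like this is written to the server's standard input as a single line, and the response arrives on standard output.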

How to install

Prerequisites:

Docker and Docker Compose (optional for local development)
Go tooling, only if you choose to build from source; the recommended route is the provided Docker image.
Optional: tessdata language packs for Tesseract, if they are not included in the image

Installation steps (Docker-based):

Pull or build the Docker image:
- Build locally (if you have a Dockerfile): docker build -t mcp-ocr-server:latest .
- Or pull from registry (if published): docker pull mcp-ocr-server:latest
Prepare configuration:
- Create configs/config.yaml with the OCR, preprocessing, and performance settings described in the README.
Run the container:
- Simple run that mounts the config and test directories:
  docker run --rm -it \
    -v "$(pwd)/configs:/app/configs" \
    -v "$(pwd)/test:/app/test" \
    mcp-ocr-server:latest
Validate the server is running by calling an MCP tool (see below) or by checking logs.
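One way to validate the server through MCP is to send the standard handshake followed by a tools/list request and confirm that the four OCR tools appear in the reply. The sketch below only builds those three messages. The method names and the initialize parameters follow the MCP specification; the protocol version string and the client name are assumptions that may need adjusting for your server.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message is a JSON-RPC 2.0 envelope. ID is nil for notifications,
// which must not carry an id.
type message struct {
	JSONRPC string `json:"jsonrpc"`
	ID      *int   `json:"id,omitempty"`
	Method  string `json:"method"`
	Params  any    `json:"params,omitempty"`
}

// handshake returns the three newline-delimited messages a client sends
// to open an MCP session and list the server's tools.
// ASSUMPTION: protocol version "2024-11-05" is accepted by the server.
func handshake(clientName, clientVersion string) ([]string, error) {
	one, two := 1, 2
	msgs := []message{
		{JSONRPC: "2.0", ID: &one, Method: "initialize", Params: map[string]any{
			"protocolVersion": "2024-11-05",
			"capabilities":    map[string]any{},
			"clientInfo":      map[string]any{"name": clientName, "version": clientVersion},
		}},
		{JSONRPC: "2.0", Method: "notifications/initialized"},
		{JSONRPC: "2.0", ID: &two, Method: "tools/list"},
	}
	lines := make([]string, 0, len(msgs))
	for _, m := range msgs {
		b, err := json.Marshal(m)
		if err != nil {
			return nil, err
		}
		lines = append(lines, string(b))
	}
	return lines, nil
}

func main() {
	// "smoke-test" is a hypothetical client name used only for illustration.
	lines, err := handshake("smoke-test", "0.0.1")
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

A real client would wait for the initialize response before sending the remaining two messages; writing all three in one go is only suitable for a quick manual check.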

If you prefer building from source:

Ensure Go 1.21+ and dependencies are installed.
In the project root, run: make deps && make build
Run the binary with a config file if required, e.g.: ./bin/mcp-ocr-server -config configs/config.yaml

Additional notes

Tips and caveats:

Ensure Tesseract language data is accessible (via TESSDATA_PREFIX or the tessdata path) and that every language you request is installed (eng, chi_sim, chi_tra, jpn). If language data cannot be found, check those settings in the container environment and confirm that the data files are readable by the running process.
OpenCV libraries must be present on the host or included in the image; on macOS/Linux, use the installation commands recommended in the README.
If you encounter memory or performance issues, tune worker_pool_size and cache_size in the YAML config as suggested in the performance tuning section.
For Docker deployments, mount your configs directory at /app/configs inside the container so the server picks up your config file; individual settings can be overridden through environment variables if your image supports it.

Related MCP Servers

trpc-agent-go

trpc-agent-go is a powerful Go framework for building intelligent agent systems using large language models (LLMs) and tools.

station

Station is our open-source runtime that lets teams deploy agents on their own infrastructure with full control.

tiger-cli

Tiger CLI is the command-line interface for Tiger Cloud. It includes an MCP server for helping coding agents write production-level Postgres code.

gopls

MCP server for Go project development: it extends the capabilities of AI code agents with semantic understanding and deterministic information about Go projects.

kubernetes

A Model Context Protocol (MCP) server for the Kubernetes API.

gcp-cost

💰 An MCP server that enables AI assistants to estimate Google Cloud costs, powered by Cloud Billing Catalog API and built with Genkit for Go