ocr-captcha

Advanced OCR and CAPTCHA recognition MCP server that gives AI agents image recognition capabilities

Installation

Run this command in your terminal to add the MCP server to Claude Code.

claude mcp add --transport stdio ymeng98-ocr-captcha-mcp-server npx -y @smithery/cli run your-deployment-url

How to use

This OCR and CAPTCHA MCP server gives MCP agents image-understanding tools: optical character recognition, text-region detection, image preprocessing to improve OCR accuracy, and slider CAPTCHA solving through template matching.

Core tools:

ocr_recognize: extracts text from images
text_detection: locates text regions
image_preprocessing: enhances images before recognition
slide_captcha_match: solves common sliding CAPTCHA puzzles

The server is designed to run on Smithery as well as locally in development mode, and it emphasizes non-root operation and safe handling of user images.

To call a tool, send a properly structured MCP request with the required parameters (for example, a Base64-encoded image payload plus any optional language or configuration parameters).
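As a rough illustration of the request shape described above, the sketch below builds an MCP tools/call message (JSON-RPC 2.0) for ocr_recognize. The tools/call method and the name/arguments layout come from the MCP specification; the argument names image and language are assumptions made for this example, so check the schema the server reports via tools/list before relying on them.

```python
import base64
import json


def build_ocr_request(image_bytes: bytes, request_id: int = 1, language: str = "eng") -> str:
    """Build a JSON-RPC 2.0 `tools/call` request for the ocr_recognize tool."""
    if not image_bytes:
        raise ValueError("image_bytes must not be empty")
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "ocr_recognize",
            # Assumed argument names; the real ones are defined by the
            # server's tool schema.
            "arguments": {
                "image": base64.b64encode(image_bytes).decode("ascii"),
                "language": language,
            },
        },
    }
    # json.dumps emits a single line, which suits the newline-delimited
    # framing used by the stdio transport.
    return json.dumps(payload)


# Example: the bytes here are only the PNG signature, standing in for a real image.
print(build_ocr_request(b"\x89PNG\r\n\x1a\n", language="eng"))
```

In a real session the client first completes the MCP initialize handshake; an MCP client library normally handles that step and the message framing for you.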

How to install

Prerequisites:

Node.js 18+ and npm installed
Access to a terminal/command prompt
Optional: Docker for containerized deployment

Local development and deployment steps:

Clone the repository:
    git clone https://github.com/ymeng98/ocr-captcha-mcp-server.git
    cd ocr-captcha-mcp-server
Install dependencies:
    npm install
Build the project (if applicable):
    npm run build
Run in development mode:
    npm run dev
Run tests (optional):
    npm test
Docker deployment (optional):
    docker build -t ocr-captcha-mcp .
    docker run -p 8080:8080 ocr-captcha-mcp

Note: If you’re deploying via Smithery, follow Smithery-specific deployment steps and configure the deployment URL in the MCP config (see the Smithery-related example in this repository).

Additional notes

Tips and common considerations:

Use Node.js 18 or later, which the project requires.
The server runs as a non-root user in Docker for security; do not rely on persistent local storage for user images.
Monitor memory usage: image processing can be memory-intensive, depending on image size and the language data that Tesseract.js loads.
When using the ocr_recognize tool, provide the image as Base64 data and optionally specify language and whitelist parameters to improve accuracy.
The slide_captcha_match tool expects background and piece images as Base64 and uses a threshold to determine a match; tune it if you encounter false positives.
If deploying via Smithery, ensure the deployment URL is correctly referenced in the MCP config and that the environment has access to any needed assets or models.
Validate input parameters strictly before sending them, and handle tool errors robustly in MCP clients.
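To make the threshold advice for slide_captcha_match concrete, here is a minimal sketch of threshold-based template matching. It is not the server's implementation: it scores each candidate position by mean absolute pixel difference on small grayscale grids, and the names similarity and match_slider and the default threshold of 0.9 are hypothetical.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]  # grayscale pixels in 0..255, row-major


def similarity(background: Grid, piece: Grid, x: int, y: int) -> float:
    """Score the piece placed at (x, y): 1.0 is identical, 0.0 is maximally different."""
    total = 0
    for row in range(len(piece)):
        for col in range(len(piece[0])):
            total += abs(background[y + row][x + col] - piece[row][col])
    return 1.0 - total / (255 * len(piece) * len(piece[0]))


def match_slider(
    background: Grid, piece: Grid, threshold: float = 0.9
) -> Optional[Tuple[int, int, float]]:
    """Return (x, y, score) of the best position, or None if it scores below threshold."""
    best: Optional[Tuple[int, int, float]] = None
    for y in range(len(background) - len(piece) + 1):
        for x in range(len(background[0]) - len(piece[0]) + 1):
            score = similarity(background, piece, x, y)
            if best is None or score > best[2]:
                best = (x, y, score)
    if best is None or best[2] < threshold:
        return None  # rejecting weak matches is what limits false positives
    return best


# A 3x5 dark background with a bright 2x2 patch whose top-left corner is at x=2, y=1.
background = [[0] * 5 for _ in range(3)]
for r in (1, 2):
    for c in (2, 3):
        background[r][c] = 200
piece = [[200, 200], [200, 200]]
print(match_slider(background, piece))
```

Raising the threshold rejects more weak matches and so reduces false positives, at the cost of missing genuine matches on noisy or compressed images.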
