image-description-mcp_server
Model Context Protocol (MCP) server that enables AI assistants to analyze images using xAI's Grok vision API. Supports URL and local file processing with OCR capabilities.
claude mcp add --transport stdio 7etsuo-image-description-mcp_server python image-description-mcp_server.py \
  --env GROK_API_KEY="your-grok-api-key"
How to use
This MCP server provides AI-powered image analysis using Grok's vision capabilities. It exposes tools to describe images from both web URLs and local files, and to extract text via OCR. Specifically, you can describe an image from a URL with describe_image_url, analyze a local image with describe_image_file, and run OCR to extract readable text from an image with extract_text_from_image. To interact with the server, supply JSON-RPC requests that target these tools, and the server will return structured descriptions, metadata, and extracted text as provided by Grok.
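As a rough illustration of what such a request can look like, the sketch below builds a JSON-RPC 2.0 `tools/call` message for describe_image_url. The `tools/call` method and the `name`/`arguments` layout come from the MCP specification. The argument name `image_url` and the example URL are assumptions made for illustration; the real parameter names are whatever the server reports in its `tools/list` response.

```python
import json


def build_tool_call(tool_name, arguments, request_id=1):
    """Return a JSON-RPC 2.0 'tools/call' request as a single-line string."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps emits no newlines by default, which suits
    # newline-delimited messages on a stdio transport.
    return json.dumps(request)


# "image_url" is a hypothetical argument name; check the tool's input
# schema from tools/list for the actual one.
line = build_tool_call(
    "describe_image_url",
    {"image_url": "https://example.com/cat.png"},
)
print(line)
```

The same helper can target describe_image_file or extract_text_from_image by changing the tool name and passing whichever arguments those tools declare.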
Typical workflows include describing a public image URL to obtain a detailed description and metadata, analyzing a local image file stored on your machine, or performing OCR on a screenshot or document image. For local testing, you can start the Python MCP server and issue test requests with the CLI example in the installation steps. The server processes images on the machine where it runs, and its calls to Grok are authenticated with your Grok API key.
How to install
Prerequisites:
- Python 3.8+ and pip
- Grok API key from https://console.x.ai/
- (Optional) Docker Desktop with MCP Toolkit if you plan to run via Docker
Step-by-step installation:
- Clone or download the repository containing image-description-mcp_server.py
- Create a Python virtual environment (optional but recommended):
    python -m venv venv
    source venv/bin/activate      # Linux/macOS
    venv\Scripts\activate.bat     # Windows
- Install required dependencies (adjust if a requirements.txt is provided):
    pip install httpx Pillow opencv-python
- Configure your Grok API key:
  - Set GROK_API_KEY in your environment or provide via the runtime config
  - Example (Unix):
      export GROK_API_KEY=your-grok-api-key
- Run the MCP server directly (local testing):
    python image-description-mcp_server.py
- Optional: if you want to test the MCP protocol explicitly, use the provided test command example:
    echo '{"jsonrpc":"2.0","method":"tools/list","id":1}' | python image-description-mcp_server.py
- If you prefer Docker, build and run using the provided Dockerfile and MCP CLI per the repository's Docker setup instructions, ensuring GROK_API_KEY is supplied to the container as an environment variable.
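Some MCP server implementations answer tools/list only after the initialize handshake defined by the protocol. If the one-line echo test returns an error or nothing, a sketch like the following may help: it prints the handshake messages followed by tools/list, one JSON object per line. The protocol version string, the client name, and the script name used in the usage note are assumptions for illustration, not values taken from this repository.

```python
import json
import sys

# Assumption: a published MCP protocol revision; the server may
# negotiate a different one in its initialize response.
PROTOCOL_VERSION = "2024-11-05"


def handshake_then_list_tools(client_name="readme-example"):
    """Return newline-delimited JSON-RPC messages: initialize,
    the initialized notification, then tools/list."""
    messages = [
        {
            "jsonrpc": "2.0",
            "id": 1,
            "method": "initialize",
            "params": {
                "protocolVersion": PROTOCOL_VERSION,
                "capabilities": {},
                "clientInfo": {"name": client_name, "version": "0.0.1"},
            },
        },
        # Notifications carry no "id" and expect no response.
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
    ]
    return "".join(json.dumps(m) + "\n" for m in messages)


if __name__ == "__main__":
    sys.stdout.write(handshake_then_list_tools())
```

Saved under a hypothetical name such as handshake.py, the output can be piped into the server in the same way as the echo example: python handshake.py | python image-description-mcp_server.py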
Additional notes
Environment variables and configuration:
- GROK_API_KEY must be set to a valid Grok API key for image analysis and OCR features.
- When running in Docker, consider using Docker secrets for GROK_API_KEY and ensure the key is accessible to the container at runtime.
- The server processes images locally for metadata extraction and OCR; no image data is stored permanently.
- If you update tools or add new capabilities, re-run any build or deployment steps described in the repository (e.g., Docker image rebuild).
- Supported image sources include web URLs (HTTP/HTTPS) and local filesystem paths; ensure URLs are accessible and file paths are readable by the running process.
- If authentication or rate limits are encountered with Grok, verify API key validity and examine Grok usage quotas in the Grok console.
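The notes on GROK_API_KEY and on supported image sources can be turned into a small pre-flight check run before sending any request. This is a minimal sketch under stated assumptions, not the server's own validation logic: the function names `classify_image_source` and `require_api_key` are hypothetical, and the server may apply different or additional rules.

```python
import os
from urllib.parse import urlparse


def classify_image_source(source):
    """Return 'url' for HTTP/HTTPS URLs or 'file' for readable local
    files; raise ValueError for anything else."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    # Anything that is not an HTTP(S) URL is treated as a local path.
    if os.path.isfile(source) and os.access(source, os.R_OK):
        return "file"
    raise ValueError(f"not an HTTP(S) URL or readable file: {source!r}")


def require_api_key(env=os.environ):
    """Return the GROK_API_KEY value, or raise if it is missing or blank."""
    key = env.get("GROK_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GROK_API_KEY is not set")
    return key


if __name__ == "__main__":
    print(classify_image_source("https://example.com/photo.jpg"))
```

A check like this only confirms that a URL is well-formed and that a file is readable by the current process; it does not confirm that the URL is reachable or that the API key is valid.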