
ai-vision

A Model Context Protocol (MCP) server that provides vision capabilities for analyzing images and videos.

Installation
Run the following command in your terminal to add the MCP server to Claude Code:
claude mcp add --transport stdio tan-yong-sheng-ai-vision-mcp npx ai-vision-mcp \
  --env GEMINI_API_KEY="your-gemini-api-key" \
  --env IMAGE_PROVIDER="google" \
  --env VIDEO_PROVIDER="google"

How to use

AI Vision MCP Server provides an extensible MCP service to analyze images and videos using AI models hosted on Google Gemini and Vertex AI. It supports dual provider configuration, multimodal content analysis, and flexible file handling (URLs, local files, or base64 data). The server exposes four MCP tools under the ai-vision-mcp package, including analyze_image and compare_images, enabling you to perform rich image understanding tasks such as detailed scene descriptions and cross-image comparisons. To start, configure the environment to select either the Google Gemini or Vertex AI provider, then run the MCP server via npx ai-vision-mcp. Once running, you can interact with the MCP tools through standard MCP requests, providing image sources and prompts to guide the AI analysis.
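As a rough sketch, an MCP `tools/call` request to `analyze_image` might look like the following JSON-RPC message (the argument names `image_source` and `prompt` are assumptions for illustration; check the tool schema exposed by the server for the exact fields):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "analyze_image",
    "arguments": {
      "image_source": "https://example.com/photo.jpg",
      "prompt": "Describe the scene in detail."
    }
  }
}
```

In practice your MCP client constructs this message for you; you only supply the image source and the prompt.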

How to install

Prerequisites:

  • Node.js and npm installed on your machine
  • Access credentials for Google Gemini API if you choose the google provider (GEMINI_API_KEY)
  • Optional: Vertex AI credentials if you switch to vertex_ai provider
Steps:

  1. Install the MCP server globally or locally via your preferred MCP client (the examples assume Claude/Cursor/Cline setups as in the README).

  2. Configure the Google AI Studio provider (default). Ensure you have a valid Gemini API key and set the environment variables before starting the MCP:

export IMAGE_PROVIDER="google"
export VIDEO_PROVIDER="google"
export GEMINI_API_KEY="your-gemini-api-key"
  3. Start the MCP server (for example, using npx):
npx ai-vision-mcp
  4. Alternative: configure the Vertex AI provider instead:
export IMAGE_PROVIDER="vertex_ai"
export VIDEO_PROVIDER="vertex_ai"
export VERTEX_CLIENT_EMAIL="your-service-account@project.iam.gserviceaccount.com"
export VERTEX_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"
export VERTEX_PROJECT_ID="your-gcp-project-id"
export GCS_BUCKET_NAME="your-gcs-bucket"

Then run the MCP with npx ai-vision-mcp as above.

  5. Integrate with MCP clients (Claude Desktop, Claude Code, Cursor, Cline) as demonstrated in the README: paste the provided JSON configuration into the client's MCP settings, or use the npx command with the appropriate environment variables.
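For a client such as Claude Desktop, the MCP settings entry might look like the sketch below (the server key name "ai-vision" is arbitrary, and the structure follows common MCP client conventions; consult the README for the exact snippet for your client):

```json
{
  "mcpServers": {
    "ai-vision": {
      "command": "npx",
      "args": ["ai-vision-mcp"],
      "env": {
        "IMAGE_PROVIDER": "google",
        "VIDEO_PROVIDER": "google",
        "GEMINI_API_KEY": "your-gemini-api-key"
      }
    }
  }
}
```

For the Vertex AI provider, swap the `env` block for the Vertex credentials shown above.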

Additional notes


  • If using Vertex AI, ensure your service account has access to Vertex AI and the GCS bucket is correctly configured.
  • For long-running analysis, consider increasing the MCP timeout settings in your client configuration as recommended (at least a 1-minute startup timeout and several minutes for tool-call timeouts).
  • The MCP tools (analyze_image and compare_images) accept image sources as URLs, base64 data, or file paths; ensure your prompts are clear to guide the AI analysis.
  • If you encounter authentication errors, double-check that environment variables (GEMINI_API_KEY or Vertex credentials) are correctly exported in the shell or defined in your MCP client config.
  • The server is designed to work with multiple MCP clients via stdio transport; you can customize per-client timeouts and environment variables as shown in the README examples.
