DINO-X

Official DINO-X Model Context Protocol (MCP) server that empowers LLMs with real-world visual perception through image object detection, localization, and captioning APIs.

Installation

Run this command in your terminal to add the MCP server to Claude Code.

claude mcp add --transport stdio idea-research-dino-x-mcp \
  --env DINOX_API_KEY="your-api-key-here" \
  --env IMAGE_STORAGE_DIRECTORY="/path/to/your/image/directory" \
  -- npx -y @deepdataspace/dinox-mcp

How to use

DINO-X MCP Server provides multimodal computer vision capabilities built on the DINO-X models. It supports fine-grained object detection, region-level descriptions, and structured outputs that include object categories, counts, locations, and attributes.

The server exposes these tools through the MCP interface:

detect-all-objects: full-scene object detection
detect-objects-by-text: text-prompted object detection
detect-human-pose-keypoints: human pose estimation
visualize-detection-result: visualization utility that saves annotated images locally

You can run the server locally over STDIO or expose it over HTTP in streamable mode, and combine it with other MCP servers to form end-to-end visual agents or automation pipelines.
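To make the tool interface more concrete, here is a minimal sketch of the JSON-RPC message an MCP client sends when it invokes one of the tools listed above. The `tools/call` method and its `name`/`arguments` parameters come from the MCP specification, and the tool names come from the description above. The argument keys (`imageFileUri`, `textPrompt`) and the `buildToolCall` helper are assumptions for illustration only; the real input schema should be read from the server's `tools/list` response.

```typescript
// Shape of a JSON-RPC 2.0 request for the MCP "tools/call" method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Tool names as listed in the server description.
type DinoXTool =
  | "detect-all-objects"
  | "detect-objects-by-text"
  | "detect-human-pose-keypoints"
  | "visualize-detection-result";

// Hypothetical helper: wraps a tool name and its arguments in a tools/call request.
function buildToolCall(
  id: number,
  tool: DinoXTool,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
}

const request = buildToolCall(1, "detect-objects-by-text", {
  imageFileUri: "file:///path/to/your/image/directory/street.jpg", // assumed key
  textPrompt: "bicycle", // assumed key
});

console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library builds this message for you; the sketch only shows what travels over the STDIO or HTTP transport.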

How to install

Prerequisites:

Node.js (LTS version) and npm installed on your machine
A valid DINO-X API key (set as DINOX_API_KEY), needed for any feature that calls the DINO-X API

Option B — Use the NPM package locally (STDIO)

Install Node.js from https://nodejs.org/ or ensure npm is available
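Run the MCP server via npx (the same command and environment used in the client configuration example that follows):

# Quick run using npx and the official package.
# Replace the placeholder values with your own API key and image directory.
DINOX_API_KEY="your-api-key-here" \
IMAGE_STORAGE_DIRECTORY="/path/to/your/image/directory" \
npx -y @deepdataspace/dinox-mcp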

Configure your MCP client with the following example (mcpServers.dinox-mcp):

{
  "mcpServers": {
    "dinox-mcp": {
      "command": "npx",
      "args": ["-y", "@deepdataspace/dinox-mcp"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}
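Many MCP clients keep several servers in one `mcpServers` map, so the entry above usually has to be merged into an existing file rather than pasted over it. The following is a small sketch of that merge. The `withDinoX` helper, the `McpConfig` and `ServerEntry` types, and the `other-server` entry are hypothetical names used for illustration; only the `dinox-mcp` entry itself mirrors the example above.

```typescript
interface ServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface McpConfig {
  mcpServers: Record<string, ServerEntry>;
}

// Return a new config with the DINO-X entry added, leaving other servers untouched.
function withDinoX(config: McpConfig, apiKey: string, imageDir: string): McpConfig {
  const entry: ServerEntry = {
    command: "npx",
    args: ["-y", "@deepdataspace/dinox-mcp"],
    env: { DINOX_API_KEY: apiKey, IMAGE_STORAGE_DIRECTORY: imageDir },
  };
  return { mcpServers: { ...config.mcpServers, "dinox-mcp": entry } };
}

// Hypothetical existing configuration with one unrelated server.
const existing: McpConfig = {
  mcpServers: {
    "other-server": { command: "node", args: ["/path/to/other/index.js"], env: {} },
  },
};

const merged = withDinoX(existing, "your-api-key-here", "/path/to/your/image/directory");
console.log(JSON.stringify(merged, null, 2));
```

The function returns a new object instead of mutating its input, so the original configuration can still be written back unchanged if something goes wrong.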

Option C — Run from source locally

Ensure Node.js is installed
Clone the repository and install dependencies:

git clone https://github.com/IDEA-Research/DINO-X-MCP.git
cd DINO-X-MCP
npm install

Build the project:

npm run build

Run the server via Node.js and point your MCP client to the built index.js:

{
  "mcpServers": {
    "dinox-mcp": {
      "command": "node",
      "args": ["/path/to/DINO-X-MCP/build/index.js"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}
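A common slip with the source-build configuration is passing a relative path to index.js, which may resolve against whatever working directory the MCP client happens to launch from. The sketch below derives the entry file from the clone directory and rejects relative paths. The `isAbsolutePath` and `entryPoint` helpers are hypothetical and not part of the DINO-X MCP project; the check is deliberately simple and string-based.

```typescript
// True for POSIX absolute paths ("/...") and Windows drive paths ("C:\..." or "C:/...").
function isAbsolutePath(p: string): boolean {
  return /^(\/|[A-Za-z]:[\\/])/.test(p);
}

// Hypothetical helper: derive the built entry file from the clone directory.
function entryPoint(repoDir: string): string {
  if (!isAbsolutePath(repoDir)) {
    throw new Error(`Expected an absolute path, got: ${repoDir}`);
  }
  // Strip trailing separators so the joined path has no doubled slash.
  const trimmed = repoDir.replace(/[\\/]+$/, "");
  return `${trimmed}/build/index.js`;
}

console.log(entryPoint("/path/to/DINO-X-MCP/"));
```

The returned string is what would go into the "args" array of the configuration above.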

Note: If you prefer a hosted or containerized setup, you can adapt these steps to a Docker workflow or a cloud deployment, but the readme primarily demonstrates the npm-based local run and the source build workflow.

Additional notes

Tips and caveats:

The DINO-X MCP supports two transport modes: STDIO (default) and Streamable HTTP. Choose the mode that fits your workflow via command-line flags (e.g., --http for HTTP mode).
Environment variables: DINOX_API_KEY is required for API-enabled features. IMAGE_STORAGE_DIRECTORY is optional in STDIO mode and controls where annotated images are saved.
If you enable HTTP mode behind a gateway, you may also set AUTH_TOKEN to restrict access.
The npm package to use is @deepdataspace/dinox-mcp. When using npx, the package name is passed as an argument as shown in the examples.
When the server is started in HTTP mode from the build/index.js entry, clients are assumed to reach it at localhost:3020 by default (adjust the port as needed). The STDIO configurations shown above do not use a port.
If you use the Option B or Option C configuration, replace the placeholders /path/to/your/image/directory and /path/to/DINO-X-MCP/build/index.js with real paths on your machine.
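The environment-variable notes above can be turned into a quick pre-flight check before the server is launched. This is a sketch under stated assumptions: the `checkEnv` function, the `Transport` and `EnvReport` types, and the wording of the messages are hypothetical, and the rules encode only what the notes say (DINOX_API_KEY needed, IMAGE_STORAGE_DIRECTORY optional in STDIO mode, AUTH_TOKEN optional in HTTP mode).

```typescript
type Transport = "stdio" | "http";

interface EnvReport {
  errors: string[];
  warnings: string[];
}

// Check the variables named in the notes for the chosen transport mode.
function checkEnv(
  env: Record<string, string | undefined>,
  transport: Transport
): EnvReport {
  const result: EnvReport = { errors: [], warnings: [] };

  const apiKey = env["DINOX_API_KEY"];
  if (!apiKey || apiKey === "your-api-key-here") {
    result.errors.push("DINOX_API_KEY is missing or still the placeholder value");
  }
  if (transport === "stdio" && !env["IMAGE_STORAGE_DIRECTORY"]) {
    result.warnings.push("IMAGE_STORAGE_DIRECTORY is unset; the save location is left to the server");
  }
  if (transport === "http" && !env["AUTH_TOKEN"]) {
    result.warnings.push("AUTH_TOKEN is unset; access to the HTTP endpoint is not restricted by a token");
  }
  return result;
}

// Example: placeholder key in HTTP mode yields one error and one warning.
const sample = checkEnv({ DINOX_API_KEY: "your-api-key-here" }, "http");
console.log(sample);
```

In a Node.js launcher script the first argument would typically be `process.env`; the function takes a plain record so it can be exercised without touching the real environment.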
