DINO-X
Official DINO-X Model Context Protocol (MCP) server that empowers LLMs with real-world visual perception through image object detection, localization, and captioning APIs.
claude mcp add --transport stdio idea-research-dino-x-mcp npx -y @deepdataspace/dinox-mcp \
  --env DINOX_API_KEY="your-api-key-here" \
  --env IMAGE_STORAGE_DIRECTORY="/path/to/your/image/directory"
How to use
DINO-X MCP Server provides multimodal computer vision capabilities built on top of the DINO-X models. It supports fine-grained object detection, region-level descriptions, and structured outputs that include object categories, counts, locations, and attributes. The server exposes several tools through the MCP interface: full-scene object detection (detect-all-objects), text-prompted object detection (detect-objects-by-text), human pose estimation (detect-human-pose-keypoints), and a visualization utility (visualize-detection-result) that saves annotated images locally. You can run the MCP locally via STDIO or expose it over HTTP in streamable mode, enabling integration with other MCP servers to form end-to-end visual agents or automation pipelines.
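To make the tool interface more concrete, here is a minimal sketch of the JSON-RPC message an MCP client could send to invoke one of these tools. The `tools/call` method and the `name`/`arguments` envelope follow the MCP specification; the argument keys `imageFileUri` and `textPrompt` are assumptions for illustration only, so check the schema the server reports through `tools/list` before relying on them.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical argument names: the real ones are defined by the server's tool schema.
request = build_tool_call(
    1,
    "detect-objects-by-text",
    {"imageFileUri": "file:///path/to/image.jpg", "textPrompt": "person . dog"},
)

print(json.dumps(request, indent=2))
```

In practice an MCP client library builds and sends this message for you; the sketch only shows which tool name from the list above ends up in the request.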
How to install
Prerequisites:
- Node.js (LTS version) and npm installed on your machine
- A valid DINO-X API key (DINOX_API_KEY), needed for the API-key protected features
Option B — Use the NPM package locally (STDIO)
- Install Node.js from https://nodejs.org/ or ensure npm is available
- Run the MCP server via npx (no separate install step is needed; npx fetches the package on demand):
# Quick run using npx and the official package
# Replace the API key placeholder with your own key
DINOX_API_KEY="your-api-key-here" npx -y @deepdataspace/dinox-mcp
- Configure your MCP client with the following example (mcpServers.dinox-mcp):
{
  "mcpServers": {
    "dinox-mcp": {
      "command": "npx",
      "args": ["-y", "@deepdataspace/dinox-mcp"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}
Option C — Run from source locally
- Ensure Node.js is installed
- Clone the repository and install dependencies:
git clone https://github.com/IDEA-Research/DINO-X-MCP.git
cd DINO-X-MCP
npm install
- Build the project:
npm run build
- Run the server via Node.js and point your MCP client to the built index.js:
{
  "mcpServers": {
    "dinox-mcp": {
      "command": "node",
      "args": ["/path/to/DINO-X-MCP/build/index.js"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}
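The two client configurations (npx and source build) differ only in the command and its arguments. As a rough sketch, they can be generated from one helper so the environment block stays in sync. The helper names `make_server_entry` and `make_config` are hypothetical and not part of the DINO-X MCP package; the keys and values mirror the examples in this section.

```python
import json


def make_server_entry(command, args, api_key, image_dir):
    """Return one entry for the "mcpServers" map (hypothetical helper)."""
    return {
        "command": command,
        "args": list(args),
        "env": {
            "DINOX_API_KEY": api_key,
            "IMAGE_STORAGE_DIRECTORY": image_dir,
        },
    }


def make_config(entry, server_name="dinox-mcp"):
    """Wrap a server entry in the top-level client configuration."""
    return {"mcpServers": {server_name: entry}}


# Placeholders copied from the examples; replace them with real values.
API_KEY = "your-api-key-here"
IMAGE_DIR = "/path/to/your/image/directory"

# npx-based configuration
npx_config = make_config(
    make_server_entry("npx", ["-y", "@deepdataspace/dinox-mcp"], API_KEY, IMAGE_DIR)
)

# Source-build configuration
source_config = make_config(
    make_server_entry("node", ["/path/to/DINO-X-MCP/build/index.js"], API_KEY, IMAGE_DIR)
)

print(json.dumps(npx_config, indent=2))
```

The printed JSON can be pasted into the MCP client's configuration file; where that file lives depends on the client.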
Note: If you prefer a hosted or containerized setup, you can adapt these steps to a Docker workflow or a cloud deployment, but the readme primarily demonstrates the npm-based local run and the source build workflow.
Additional notes
Tips and caveats:
- The DINO-X MCP supports two transport modes: STDIO (default) and Streamable HTTP. Choose the mode that fits your workflow via command-line flags (e.g., --http for HTTP mode).
- Environment variables: DINOX_API_KEY is required for API-enabled features. IMAGE_STORAGE_DIRECTORY is optional in STDIO mode and controls where annotated images are saved.
- If you enable HTTP mode behind a gateway, you may also set AUTH_TOKEN to restrict access.
- The npm package to use is @deepdataspace/dinox-mcp. When using npx, the package name is passed as an argument as shown in the examples.
- In HTTP mode, the server is assumed to be reachable at localhost:3020 by default when started from the build/index.js entry (adjust the port as needed).
- If you use the Option B or C configurations, replace the placeholders /path/to/your/image/directory and /path/to/DINO-X-MCP/build/index.js with real paths on your machine.
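The environment-variable notes above can be turned into a small preflight check to run before starting the server. This is a sketch, not part of the DINO-X MCP package: the function name `check_dinox_env` is hypothetical, and treating a relative IMAGE_STORAGE_DIRECTORY as a problem is an assumption made here for safety rather than a documented requirement.

```python
import os


def check_dinox_env(env):
    """Return a list of problems found in an environment mapping (hypothetical helper)."""
    problems = []

    # DINOX_API_KEY is needed for API-enabled features.
    if not env.get("DINOX_API_KEY"):
        problems.append("DINOX_API_KEY is not set")

    # IMAGE_STORAGE_DIRECTORY is optional; validate it only when present.
    image_dir = env.get("IMAGE_STORAGE_DIRECTORY")
    if image_dir:
        # Assumption: an absolute path avoids surprises about the working directory.
        if not os.path.isabs(image_dir):
            problems.append("IMAGE_STORAGE_DIRECTORY should be an absolute path")
        elif not os.path.isdir(image_dir):
            problems.append("IMAGE_STORAGE_DIRECTORY does not exist: " + image_dir)

    return problems


if __name__ == "__main__":
    for problem in check_dinox_env(os.environ):
        print("warning:", problem)
```

An empty list means the settings look usable; the check does not verify that the API key itself is valid.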