OmniMCP
OmniMCP uses Microsoft OmniParser and Model Context Protocol (MCP) to provide AI models with rich UI context and powerful interaction capabilities.
claude mcp add --transport stdio openadaptai-omnimcp python cli.py \
  --env LOG_LEVEL="INFO" \
  --env OMNIPARSER_URL="Optional: URL to OmniParser server" \
  --env PYTHONWARNINGS="ignore" \
  --env ANTHROPIC_API_KEY="YOUR_ANTHROPIC_KEY" \
  --env AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY" \
  --env AWS_SECRET_ACCESS_KEY="YOUR_SECRET_KEY"
How to use
OmniMCP provides a robust MCP-enabled interface that connects a large language model (LLM) with a visual UI understanding loop. It captures the current screen, parses UI elements with OmniParser, plans actions using an LLM, and then executes those actions via mouse/keyboard controls. The CLI entry point runs a perceive-plan-act loop, enabling tasks such as performing calculator operations or interacting with synthetic UI scenarios. You can run a default goal (e.g., a calculator-like task) or supply a custom goal to steer the agent’s behavior. The system also supports an experimental MCP server implementation for advanced integration and experimentation. Using OmniMCP, you can observe how perception, planning, and action execution come together to autonomously accomplish UI-oriented goals.
How to install
Prerequisites:
- Python 3.10 to 3.12
- An active graphical desktop session for UI interaction (X11 or Wayland on Linux)
- pynput and other system dependencies (handled by install.sh)
- Optional: AWS credentials if you enable OmniParser deployment features
Installation steps:
- Clone the repository:
  git clone https://github.com/OpenAdaptAI/OmniMCP.git
  cd OmniMCP
- Run the installer to create a virtual environment and install dependencies:
  ./install.sh
- Copy the example environment configuration:
  cp .env.example .env
  Then edit .env with your keys (AWS, ANTHROPIC_API_KEY, OMNIPARSER_URL, etc.)
- Activate the virtual environment:
  Linux/macOS: source .venv/bin/activate
  Windows (PowerShell): .\.venv\Scripts\Activate.ps1
- Run the OmniMCP CLI to execute tasks:
  python cli.py
- Optional: if you plan to use AWS deployment features, ensure AWS credentials are configured in .env; you can then use the auto-deploy capabilities described in the docs.
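A filled-in .env might look like the following. The variable names are taken from the MCP registration command above; all values are placeholders you must replace with your own, and the OMNIPARSER_URL value is a hypothetical example (omit it to let OmniMCP handle the OmniParser server itself):

```shell
# Logging and warnings
LOG_LEVEL="INFO"
PYTHONWARNINGS="ignore"
# LLM planning (Anthropic)
ANTHROPIC_API_KEY="YOUR_ANTHROPIC_KEY"
# Optional: point at an already-running OmniParser server (placeholder URL)
OMNIPARSER_URL="http://localhost:8000"
# Optional: AWS credentials for OmniParser auto-deployment
AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY"
AWS_SECRET_ACCESS_KEY="YOUR_SECRET_KEY"
```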
Additional notes
- The project includes an experimental MCP server implementation (OmniMCP) that sits alongside the standard CLI/AgentExecutor workflow. It is intended for experimentation and higher-level API usage.
- Debug output is saved under runs/<timestamp>/ with per-step visuals and logs stored under logs/ for easier troubleshooting.
- Ensure a functioning graphical session is available for UI perception and input control (pynput relies on system libraries such as libx11-dev on Linux).
- When using AWS-driven auto deployment, be mindful of potential costs and clean up resources with the provided stop command: python -m omnimcp.omniparser.server stop.
- If you modify or extend the UI perception or action space, consider updating the environment and installation steps to reflect new dependencies and any required permissions.