ScreenPilot
A tool that lets an AI control your device the same way you do, enabling automation of almost any desktop task.
claude mcp add --transport stdio mtehabsim-screenpilot python pathToProject\ScreenPilot\main.py \
  --env PATH="pathToEnv\venv\Scripts;${PATH}" \
  --env VIRTUAL_ENV="pathToEnv\venv"
How to use
ScreenPilot exposes screen-automation capabilities to any MCP-enabled environment, letting a language model operate your desktop: capture the screen, move the mouse, type text, and press keyboard shortcuts. The server provides tools for Screen Capture, Mouse Control, Keyboard Actions, Scrolling, Element Detection, and Action Sequences. With these you can automate GUI tasks, gather visual context, and interact with applications programmatically. Once the MCP server is running, your LLM-driven workflow can invoke tools to capture screenshots, locate UI elements, perform clicks and drags, and send text or key presses as part of a larger automation script.
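Under the hood, an MCP client drives these capabilities by sending JSON-RPC 2.0 `tools/call` requests to the server over the stdio transport. A minimal sketch of what such a request looks like (the tool name `screen_capture` and its arguments are hypothetical placeholders for illustration; discover the server's real tool names with a `tools/list` request):

```python
import json

def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 'tools/call' request as used by MCP stdio servers."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)

# Hypothetical tool name and arguments -- query the server's actual
# tool catalog via 'tools/list' before calling anything.
msg = make_tool_call(1, "screen_capture", {"region": "full"})
print(msg)
```

In practice your MCP client (e.g. Claude AI Desktop) constructs and sends these messages for you; the sketch only shows the shape of the traffic on the wire.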
How to install
Prerequisites:
- Python 3.12 installed on your system
- Git installed
- Access to the ScreenPilot project repository
Installation steps:
- Clone the repository:
git clone https://github.com/Mtehabsim/ScreenPilot.git
- Create a Python virtual environment:
python -m venv venv
- Activate the virtual environment:
- Windows:
venv\Scripts\activate
- macOS/Linux:
source venv/bin/activate
- Install required packages:
pip install -r requirements.txt
- Prepare the MCP configuration file that points to the ScreenPilot main script. Example (adjust paths to your environment):
{
  "mcpServers": {
    "device-control": {
      "command": "pathToEnv\\venv\\Scripts\\python.exe",
      "args": [
        "pathToProject\\ScreenPilot\\main.py"
      ]
    }
  }
}
- Save the configuration file in a location accessible to Claude AI Desktop (or your MCP client) and ensure the paths reflect your setup.
- Run the MCP server as configured and connect your MCP client (e.g., Claude AI Desktop) to start issuing tool commands.
Additional notes
Tips and common issues:
- Ensure the Python virtual environment is activated before running the server to guarantee dependencies are available.
- The configuration example uses Windows-style paths. If you are on macOS/Linux, adapt paths accordingly (e.g., /home/user/... and forward slashes).
- If the server cannot locate main.py, verify the path in the args exactly matches the location of ScreenPilot/main.py.
- You may need additional system permissions for screen capture and GUI automation depending on your OS (e.g., macOS privacy settings for Screen Recording and Accessibility).
- The tool set includes Screen Capture, Mouse Control, Keyboard Actions, Scrolling, Element Detection, and Action Sequences. Combine these tools to implement robust automation pipelines.
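One way to think about combining these tools into a pipeline is as a declarative list of primitive steps that the model emits and the server replays, validating before executing. A toy sketch of that idea (the step schema and action names here are illustrative, not ScreenPilot's actual format):

```python
# Illustrative only: a declarative action sequence, validated before replay.
ALLOWED_ACTIONS = {"move", "click", "type", "key", "scroll", "screenshot"}

def validate_sequence(steps):
    """Return a list of error strings; an empty list means the sequence looks replayable."""
    errors = []
    for i, step in enumerate(steps):
        action = step.get("action")
        if action not in ALLOWED_ACTIONS:
            errors.append(f"step {i}: unknown action {action!r}")
        if action in ("move", "click") and not ("x" in step and "y" in step):
            errors.append(f"step {i}: {action} needs x/y coordinates")
        if action == "type" and "text" not in step:
            errors.append(f"step {i}: type needs a 'text' field")
    return errors

sequence = [
    {"action": "move", "x": 400, "y": 300},
    {"action": "click", "x": 400, "y": 300},
    {"action": "type", "text": "hello world"},
]
print(validate_sequence(sequence))
```

Validating a whole sequence up front, rather than executing step by step, avoids leaving the GUI in a half-finished state when a later step is malformed.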
Related MCP Servers
gpt-researcher
An autonomous agent that conducts deep research on any data using any LLM providers.
mcp-android-python
MCP Android agent: provides an MCP (Model Context Protocol) server for automating Android devices using uiautomator2. It's designed to be easily plugged into AI agents like GitHub Copilot Chat, Claude, or Open Interpreter to control Android devices through natural language.
Unified-Tool-Graph
Instead of dumping 1000+ tools into a model’s prompt and expecting it to choose wisely, the Unified MCP Tool Graph equips your LLM with structure, clarity, and relevance. It fixes tool confusion, prevents infinite loops, and enables modular, intelligent agent workflows.
mcp-ssh-orchestrator
Secure SSH access for AI agents via MCP. Execute commands across your server fleet with policy enforcement, network controls, and comprehensive audit logging.
mcp-pyautogui
An MCP server for PyAutoGUI
MIST
MCP server empowering AI assistants with real-world capabilities: Gmail, Calendar, Tasks, Git integration, and note management. Bridges AI assistants to external services through standardized protocol with secure authentication.