ScreenPilot
A tool that lets an AI control your device the same way you do, enabling automation of almost any desktop task.
claude mcp add --transport stdio mtehabsim-screenpilot python pathToProject\ScreenPilot\main.py \
  --env PATH="pathToEnv\venv\Scripts;${PATH}" \
  --env VIRTUAL_ENV="pathToEnv\venv"
How to use
ScreenPilot exposes screen-automation capabilities to any MCP-enabled environment, letting a language model operate your desktop: capture the screen, move the mouse, type text, and press keyboard shortcuts. The server provides tools for Screen Capture, Mouse Control, Keyboard Actions, Scrolling, Element Detection, and Action Sequences. With these you can automate GUI tasks, gather visual context, and interact with applications programmatically. Once the MCP server is running, your LLM-driven workflow can invoke tools to capture screenshots, locate UI elements, perform clicks and drags, and send text or key presses as part of a larger automation script.
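Under the hood, an MCP client drives these capabilities by sending JSON-RPC 2.0 `tools/call` requests to the server over the stdio transport. A minimal sketch of what such a request looks like (the tool name `screen_capture` and its arguments are hypothetical placeholders for illustration; discover the server's real tool names with a `tools/list` request):

```python
import json

def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 'tools/call' request as used by MCP stdio servers."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)

# Hypothetical tool name and arguments -- query the server's actual
# tool catalog via 'tools/list' before calling anything.
msg = make_tool_call(1, "screen_capture", {"region": "full"})
print(msg)
```

In practice your MCP client (e.g. Claude AI Desktop) constructs and sends these messages for you; the sketch only shows the shape of the traffic on the wire.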
How to install
Prerequisites:
- Python 3.12 installed on your system
- Git installed
- Access to the ScreenPilot project repository
Installation steps:
- Clone the repository:
git clone https://github.com/Mtehabsim/ScreenPilot.git
- Create a Python virtual environment:
python -m venv venv
- Activate the virtual environment:
- Windows:
venv\Scripts\activate
- macOS/Linux:
source venv/bin/activate
- Install required packages:
pip install -r requirements.txt
- Prepare the MCP configuration file that points to the ScreenPilot main script. Example (adjust paths to your environment):
{
  "mcpServers": {
    "device-control": {
      "command": "pathToEnv\\venv\\Scripts\\python.exe",
      "args": [
        "pathToProject\\ScreenPilot\\main.py"
      ]
    }
  }
}
- Save the configuration file in a location accessible to Claude AI Desktop (or your MCP client) and ensure the paths reflect your setup.
- Run the MCP server as configured and connect your MCP client (e.g., Claude AI Desktop) to start issuing tool commands.
Additional notes
Tips and common issues:
- Ensure the Python virtual environment is activated before running the server to guarantee dependencies are available.
- The configuration example uses Windows-style paths. If you are on macOS/Linux, adapt paths accordingly (e.g., /home/user/... and forward slashes).
- If the server cannot locate main.py, verify the path in the args exactly matches the location of ScreenPilot/main.py.
- You may need additional system permissions for screen capture and GUI automation depending on your OS (e.g., macOS privacy settings for Screen Recording and Accessibility).
- The tool set includes Screen Capture, Mouse Control, Keyboard Actions, Scrolling, Element Detection, and Action Sequences. Combine these tools to implement robust automation pipelines.
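One way to think about combining these tools into a pipeline is as a declarative list of primitive steps that the model emits and the server replays, validating before executing. A toy sketch of that idea (the step schema and action names here are illustrative, not ScreenPilot's actual format):

```python
# Illustrative only: a declarative action sequence, validated before replay.
ALLOWED_ACTIONS = {"move", "click", "type", "key", "scroll", "screenshot"}

def validate_sequence(steps):
    """Return a list of error strings; an empty list means the sequence looks replayable."""
    errors = []
    for i, step in enumerate(steps):
        action = step.get("action")
        if action not in ALLOWED_ACTIONS:
            errors.append(f"step {i}: unknown action {action!r}")
        if action in ("move", "click") and not ("x" in step and "y" in step):
            errors.append(f"step {i}: {action} needs x/y coordinates")
        if action == "type" and "text" not in step:
            errors.append(f"step {i}: type needs a 'text' field")
    return errors

sequence = [
    {"action": "move", "x": 400, "y": 300},
    {"action": "click", "x": 400, "y": 300},
    {"action": "type", "text": "hello world"},
]
print(validate_sequence(sequence))
```

Validating a whole sequence up front, rather than executing step by step, avoids leaving the GUI in a half-finished state when a later step is malformed.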
Related MCP Servers
gpt-researcher
An autonomous agent that conducts deep research on any data using any LLM providers.
mcp-android-python
MCP Android agent: provides an MCP (Model Context Protocol) server for automating Android devices using uiautomator2. It's designed to be easily plugged into AI agents like GitHub Copilot Chat, Claude, or Open Interpreter to control Android devices through natural language.
Unified-Tool-Graph
Instead of dumping 1000+ tools into a model’s prompt and expecting it to choose wisely, the Unified MCP Tool Graph equips your LLM with structure, clarity, and relevance. It fixes tool confusion, prevents infinite loops, and enables modular, intelligent agent workflows.
mcp-ssh-orchestrator
Secure SSH access for AI agents via MCP. Execute commands across your server fleet with policy enforcement, network controls, and comprehensive audit logging.
mcp-pyautogui
An MCP server for PyAutoGUI
MIST
MCP server empowering AI assistants with real-world capabilities: Gmail, Calendar, Tasks, Git integration, and note management. Bridges AI assistants to external services through standardized protocol with secure authentication.