human

Bringing Human Capabilities to AI Agents

Installation

Run this command in your terminal to add the MCP server to Claude Code.

claude mcp add --transport stdio mrgoonie-human-mcp node server.js

How to use

Human MCP is a comprehensive Model Context Protocol server that equips AI agents with human-like capabilities across four core areas: Eyes (visual analysis and document processing), Hands (content creation, image/video editing, and browser automation), Mouth (speech generation and narration), and Brain (advanced reasoning and problem solving). When you query the server, you can request tools such as eyes_analyze for analyzing images or videos, gemini_gen_image for image generation, mouth_speak for text-to-speech, and brain_reflect_enhanced for meta-cognitive analysis. The platform supports multiple providers per capability (e.g., Google Gemini, ZhipuAI, Minimax, ElevenLabs) and lets you override defaults per request with a provider field. This enables rich multimodal workflows, from identifying UI bugs in visuals to generating and narrating technical explanations or debugging code with structured reasoning.

To use the MCP, call a tool by name and pass its parameters, optionally selecting a provider. For example, you can request visual analysis with eyes_analyze, which uses the gemini provider by default, or switch to ZhipuAI for image inspection by setting the provider field on the request. For content creation and editing, you can generate images with gemini_gen_image or remove backgrounds with rmbg_remove_background. For spoken content, mouth_speak renders speech in 30+ voices and 24 languages, and brain_reflect_enhanced provides meta-cognitive analysis. Chained together, these tools cover end-to-end workflows such as extracting data from source documents and then narrating the results.
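
As a rough sketch of what such a request looks like on the wire, the helper below builds a JSON-RPC 2.0 "tools/call" message in the shape the MCP specification defines. The tool name and the provider field come from the description above; the source argument and the "zhipuai" provider string are assumptions made for illustration, so check the server's published tool schema for the real parameter names.

```javascript
// Builds a JSON-RPC 2.0 "tools/call" request in the shape defined by the MCP spec.
// The argument names used in the examples below (e.g. "source") are hypothetical.
let nextId = 1;

function buildToolCall(toolName, args = {}, provider) {
  // Attach "provider" only when the caller wants to override the server default.
  const toolArgs = provider ? { ...args, provider } : { ...args };
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name: toolName, arguments: toolArgs },
  };
}

// Default provider: no "provider" field is sent.
const analyzeDefault = buildToolCall("eyes_analyze", { source: "screenshot.png" });

// Per-request override ("zhipuai" is an assumed identifier for ZhipuAI).
const analyzeOverride = buildToolCall(
  "eyes_analyze",
  { source: "screenshot.png" },
  "zhipuai"
);

console.log(JSON.stringify(analyzeDefault, null, 2));
console.log(JSON.stringify(analyzeOverride, null, 2));
```

In practice an MCP client such as Claude Code constructs these messages for you; the sketch is only meant to show where the tool name, parameters, and optional provider end up.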

How to install

Prerequisites:

- Node.js (recommended) and npm installed on your machine
- Access to any required API keys or credentials for providers you plan to use (Google Gemini, ZhipuAI, Minimax, ElevenLabs, etc.)
- Optional: Docker if you prefer containerized deployment

Step 1: Prepare environment

Ensure Node.js and npm are installed. To verify, run node -v and npm -v.
Create a project directory and install the dependencies the MCP server needs. A package.json should ship with the project or be available in the repository; if one is present, follow its instructions.

Step 2: Install (Node.js path)

If this MCP ships as an npm package (e.g., npm install -g human-mcp or npm install), use the appropriate command. Since this repository references a runnable Node entry point, you can typically install dependencies and start the server:

# From project root
npm install

# Start the MCP server
node server.js

Step 3: Configure environment variables

Set API keys as environment variables. Provider defaults can also be set as environment variables and overridden per request. Example variables include:
- GOOGLE_GEMINI_API_KEY
- ZHIPUAI_API_KEY
- MINIMAX_API_KEY
- ELEVENLABS_API_KEY
- Other provider-specific keys as required by your deployment
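
A small preflight check can catch a missing key before the server is started. The sketch below uses the variable names listed above; the short provider labels (gemini, zhipuai, and so on) and the missingKeys helper are hypothetical names chosen for this example, not part of Human MCP.

```javascript
// Maps an illustrative provider label to the environment variable listed above.
const PROVIDER_KEYS = new Map([
  ["gemini", "GOOGLE_GEMINI_API_KEY"],
  ["zhipuai", "ZHIPUAI_API_KEY"],
  ["minimax", "MINIMAX_API_KEY"],
  ["elevenlabs", "ELEVENLABS_API_KEY"],
]);

// Returns the names of required variables that are unset or blank.
function missingKeys(providers, env = process.env) {
  const missing = [];
  for (const provider of providers) {
    const key = PROVIDER_KEYS.get(provider);
    if (key === undefined) {
      throw new Error(`Unknown provider: ${provider}`);
    }
    const value = env[key];
    if (typeof value !== "string" || value.trim() === "") {
      missing.push(key);
    }
  }
  return missing;
}

// List only the providers you actually plan to use.
const missing = missingKeys(["gemini", "elevenlabs"]);
if (missing.length > 0) {
  console.warn(`Missing API keys: ${missing.join(", ")}`);
} else {
  console.log("All required API keys are set.");
}
```

The check reports variable names only and never prints key values, which keeps secrets out of logs.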

Step 4: Run in Docker (optional)

If you prefer Docker:

# Build and run the container (adjust image name as needed)
docker build -t human-mcp .
docker run -i human-mcp

Step 5: Verify

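Ensure the server starts without errors. With the stdio transport used in the installation command, the server communicates over standard input and output rather than a network port, so there is a port to check only if you run it with a network transport. Test a few sample requests against eyes_analyze, gemini_gen_image, mouth_speak, and brain_reflect_enhanced to confirm that each capability works end to end.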
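
One way to make this check repeatable is to compare the server's "tools/list" response against the tools you expect. The sketch below assumes the response follows the MCP specification, with the tools under result.tools; the sample response is hand-written for illustration and is not real server output.

```javascript
// Tool names taken from the verification step above.
const EXPECTED_TOOLS = [
  "eyes_analyze",
  "gemini_gen_image",
  "mouth_speak",
  "brain_reflect_enhanced",
];

// Returns the expected tool names that are absent from a "tools/list" response.
function findMissingTools(listResponse, expected = EXPECTED_TOOLS) {
  const tools = listResponse?.result?.tools ?? [];
  const available = new Set(tools.map((tool) => tool.name));
  return expected.filter((name) => !available.has(name));
}

// Hand-written sample response, for illustration only.
const sampleResponse = {
  jsonrpc: "2.0",
  id: 1,
  result: { tools: [{ name: "eyes_analyze" }, { name: "mouth_speak" }] },
};

// Logs the expected tools that the sample response does not advertise.
console.log(findMissingTools(sampleResponse));
```

An empty result means every expected tool is advertised. It does not prove the provider behind each tool is configured correctly, so a sample call per tool is still worthwhile.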

Additional notes

Tips and common considerations:

- Provider configuration: You can set a default provider per capability using environment variables (SPEECH_PROVIDER, VIDEO_PROVIDER, VISION_PROVIDER, IMAGE_PROVIDER). You can override per-request by including { "provider": "minimax" } in the request payload.
- API keys safety: Do not commit API keys to version control. Use environment variables or secret management solutions in production.
- Rate limits: Be mindful of provider rate limits and implement retry/backoff logic in client requests.
- Debugging: If a tool returns unexpected results, try switching providers or explicitly setting a specific model within a provider to isolate issues.
- Documentation: Refer to the Gemini API docs and provider docs for model capabilities and parameter options to craft precise requests (e.g., document processing, 30+ TTS voices, image/video generation options).
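
The retry/backoff advice in the rate-limits tip can be sketched as a small client-side wrapper. This is a minimal example with exponential backoff and a delay cap; the withRetry and backoffDelay helpers and their default values are illustrative choices, not part of Human MCP or any provider SDK.

```javascript
// Delay before the retry that follows attempt number `attempt` (0-based):
// baseMs * 2^attempt, capped at maxMs.
function backoffDelay(attempt, baseMs = 500, maxMs = 8000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `fn` until it succeeds, `maxAttempts` is reached, or `shouldRetry`
// returns false for the error. Rethrows the last error on failure.
async function withRetry(fn, options = {}) {
  const {
    maxAttempts = 4,
    baseMs = 500,
    maxMs = 8000,
    shouldRetry = () => true,
  } = options;

  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn(attempt);
    } catch (error) {
      lastError = error;
      const isLastAttempt = attempt === maxAttempts - 1;
      if (isLastAttempt || !shouldRetry(error)) break;
      await sleep(backoffDelay(attempt, baseMs, maxMs));
    }
  }
  throw lastError;
}
```

You would wrap whatever function sends the tool request, for example withRetry(() => sendRequest(payload)), where sendRequest is a placeholder for your own client code. Passing a shouldRetry predicate lets you retry only on rate-limit errors, and adding random jitter to the delay helps when many clients retry at once.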