human
Bringing Human Capabilities to AI Agents
claude mcp add --transport stdio mrgoonie-human-mcp node server.js
How to use
Human MCP is a comprehensive Model Context Protocol server that equips AI agents with human-like capabilities across four core areas: Eyes (visual analysis and document processing), Hands (content creation, image/video editing, and browser automation), Mouth (speech generation and narration), and Brain (advanced reasoning and problem solving). When you query the server, you can request tools such as eyes_analyze for analyzing images or videos, gemini_gen_image for image generation, mouth_speak for text-to-speech, and brain_reflect_enhanced for meta-cognitive analysis. The platform supports multiple providers per capability (e.g., Google Gemini, ZhipuAI, Minimax, ElevenLabs) and lets you override defaults per request with a provider field. This enables rich multimodal workflows, from identifying UI bugs in visuals to generating and narrating technical explanations or debugging code with structured reasoning.
To use the MCP, specify the tool name and its parameters, and optionally select a provider. For example, eyes_analyze performs visual analysis with Gemini as the default provider, and you can switch to ZhipuAI for image inspection. For content creation and editing, you can generate images with gemini_gen_image or remove backgrounds with rmbg_remove_background. For spoken content, mouth_speak renders speech in 30+ voices and 24 languages, while brain_reflect_enhanced provides deeper meta-cognitive analysis. Combined, these tools support end-to-end multimodal workflows, from analyzing source documents and extracting data to producing narrations and debugging complex code with structured reasoning.
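As a rough illustration of what such a request could look like on the wire, the sketch below builds a standard MCP `tools/call` JSON-RPC message. The tool name and the `provider` field come from the description above; the helper `build_tool_call`, the argument names `source` and `prompt`, and the provider spelling `zhipuai` are assumptions for illustration and should be checked against the tool schema the server actually publishes.

```python
import json
from typing import Any, Optional


def build_tool_call(
    request_id: int,
    tool: str,
    arguments: dict[str, Any],
    provider: Optional[str] = None,
) -> dict[str, Any]:
    """Build an MCP `tools/call` JSON-RPC request (hypothetical helper)."""
    args = dict(arguments)  # copy so the caller's dict is not mutated
    if provider is not None:
        # Per-request override of the default provider, as described above.
        args["provider"] = provider
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": args},
    }


# Argument names and the provider value are assumptions, not a documented schema.
request = build_tool_call(
    1,
    "eyes_analyze",
    {"source": "screenshot.png", "prompt": "List any visible UI bugs"},
    provider="zhipuai",
)
print(json.dumps(request, indent=2))
```

In practice an MCP client such as Claude builds these messages for you; the sketch is only meant to show where the tool name, parameters, and optional provider end up.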
How to install
Prerequisites:
- Node.js (recommended) and npm installed on your machine
- Access to any required API keys or credentials for providers you plan to use (Google Gemini, ZhipuAI, Minimax, ElevenLabs, etc.)
- Optional: Docker if you prefer containerized deployment
Step 1: Prepare environment
- Ensure Node.js is installed. Verify: node -v and npm -v
- Create a project directory (or use your checkout of the repository) and install the dependencies the MCP server needs. The package.json should come with the project or be available in the repository; if the project includes a lockfile or other package manager file, use the matching package manager.
Step 2: Install (Node.js path)
- If this MCP ships as an npm package (e.g., npm install -g human-mcp or npm install), use the appropriate command. Since this repository references a runnable Node entry point, you can typically install dependencies and start the server:
# From project root
npm install
# Start the MCP server
node server.js
Step 3: Configure environment variables
- Set API keys and provider defaults as environment variables or via per-request overrides. Example variables include:
- GOOGLE_GEMINI_API_KEY
- ZHIPUAI_API_KEY
- MINIMAX_API_KEY
- ELEVENLABS_API_KEY
- Other provider-specific keys as required by your deployment
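The configuration step can be sanity-checked from a client script. The sketch below is a minimal illustration, not the server's own logic: it reports which of the example API key variables are unset and shows one plausible precedence rule, where a per-request `provider` wins over an environment default. The per-capability variable names (SPEECH_PROVIDER and so on) are the ones mentioned in the notes later in this document; the capability labels and both helper functions are hypothetical.

```python
import os
from typing import Mapping, Optional

# Example key names from the list above; confirm them for your deployment.
API_KEY_VARS = (
    "GOOGLE_GEMINI_API_KEY",
    "ZHIPUAI_API_KEY",
    "MINIMAX_API_KEY",
    "ELEVENLABS_API_KEY",
)

# Per-capability default variables; the dictionary keys are illustrative labels.
PROVIDER_VARS = {
    "speech": "SPEECH_PROVIDER",
    "video": "VIDEO_PROVIDER",
    "vision": "VISION_PROVIDER",
    "image": "IMAGE_PROVIDER",
}


def missing_keys(env: Mapping[str, str]) -> list[str]:
    """Names of API key variables that are unset or blank."""
    return [name for name in API_KEY_VARS if not env.get(name, "").strip()]


def resolve_provider(
    capability: str,
    request_args: Mapping[str, object],
    env: Mapping[str, str],
) -> Optional[str]:
    """Per-request `provider` wins; otherwise fall back to the env default.

    Returns None when neither is set, leaving the choice to the server.
    """
    override = request_args.get("provider")
    if override:
        return str(override)
    return env.get(PROVIDER_VARS[capability]) or None


for name in missing_keys(os.environ):
    print(f"warning: {name} is not set")
```

You only need keys for the providers you actually plan to use, so a warning here is informational rather than an error.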
Step 4: Run in Docker (optional)
- If you prefer Docker:
# Build and run the container (adjust image name as needed)
docker build -t human-mcp .
docker run -i human-mcp
Step 5: Verify
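- Ensure the server starts without errors. If you run it over stdio, as in the claude mcp add command above, there is no port to check: the client launches the process and exchanges messages over stdin/stdout. If your deployment uses an HTTP-based transport instead, confirm it is listening on the expected port. Then test a few sample requests against eyes_analyze, gemini_gen_image, mouth_speak, and brain_reflect_enhanced to confirm the end-to-end multimodal capabilities.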
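One way to check startup without a full MCP client is a small smoke test. The sketch below assumes the server is launched over stdio (as in the `claude mcp add --transport stdio` command near the top), where each JSON-RPC message is sent as a single newline-terminated line. The `protocolVersion` string is one published MCP revision and may need updating, the client name and version are placeholders, and `smoke_test` is a hypothetical helper rather than part of this project.

```python
import json
import subprocess
from typing import Any


def encode_message(message: dict[str, Any]) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the line stays intact.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def initialize_request(request_id: int = 1) -> dict[str, Any]:
    """MCP `initialize` request; the client name and version are placeholders."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": "smoke-test", "version": "0.0.1"},
        },
    }


def smoke_test(command: list[str], timeout: float = 10.0) -> dict[str, Any]:
    """Start the server, send `initialize`, and return its first reply."""
    proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    try:
        # communicate() writes the request, closes stdin, and waits for exit.
        out, _ = proc.communicate(encode_message(initialize_request()), timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # server kept running after stdin closed; collect its output
        out, _ = proc.communicate()
    lines = out.splitlines()
    if not lines:
        raise RuntimeError("server produced no output on stdout")
    return json.loads(lines[0])
```

Calling `smoke_test(["node", "server.js"])` from the project root should return a reply containing a `result` field if the server initialized; an `error` field or a `RuntimeError` suggests a startup or configuration problem worth checking in the server's stderr output.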
Additional notes
Tips and common considerations:
- Provider configuration: You can set a default provider per capability using environment variables (SPEECH_PROVIDER, VIDEO_PROVIDER, VISION_PROVIDER, IMAGE_PROVIDER). You can override per-request by including { "provider": "minimax" } in the request payload.
- API keys safety: Do not commit API keys to version control. Use environment variables or secret management solutions in production.
- Rate limits: Be mindful of provider rate limits and implement retry/backoff logic in client requests.
- Debugging: If a tool returns unexpected results, try switching providers or explicitly setting a specific model within a provider to isolate issues.
- Documentation: Refer to the Gemini API docs and provider docs for model capabilities and parameter options to craft precise requests (e.g., document processing, 30+ TTS voices, image/video generation options).
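The rate-limit tip above can be sketched as a small retry wrapper using exponential backoff with full jitter. This is a generic client-side pattern, not something the server provides: `RateLimitError` is a hypothetical exception standing in for whatever your client raises when a provider reports a rate limit, and the delay values are arbitrary starting points.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error a client wrapper raises on a provider rate limit."""


def call_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on RateLimitError with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            # Upper bound doubles each attempt, capped at max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
    raise ValueError("max_attempts must be at least 1")
```

Wrap any tool request in a zero-argument function and pass it as `call`. The `sleep` parameter exists so the wait can be replaced in tests; randomizing the delay helps keep several clients from retrying in lockstep.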
Related MCP Servers
mcp -chart
🤖 A visualization MCP server and skills with 25+ visual charts built on @antvis, used for chart generation and data analysis.
skills
Skills, MCP servers, Custom Agents, Agents.md for SDKs to ground Coding Agents
rulesync
A Utility CLI for AI Coding Agents
bm.md
A better Markdown typesetting assistant | One-click adaptation for WeChat Official Accounts, web pages, and images.
pinion-os
Client SDK, Claude plugin and skill framework for the Pinion protocol. x402 micropayments on Base.
codex-specialized-subagents
MCP server that lets Codex delegate to isolated codex exec sub-agents, selecting repo+global skills automatically