
multimodal-agents-course

An MCP Multimodal AI Agent with eyes and ears!

Installation
Run this command in your terminal to add the MCP server to Claude Code:
claude mcp add --transport stdio the-ai-merge-multimodal-agents-course node kubrick-mcp/server.js \
  --env GROQ_API_KEY="Your Groq API key (if required)" \
  --env OPIK_API_URL="URL for Opik API (if used)" \
  --env OPENAI_API_KEY="Your OpenAI API key"
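Alternatively, MCP clients that read a JSON configuration file (for example, Claude Desktop's claude_desktop_config.json) can register the server with an entry like the following sketch. The server name "kubrick" and the kubrick-mcp/server.js path mirror the command above; the env values are placeholders for your own keys:

```json
{
  "mcpServers": {
    "kubrick": {
      "command": "node",
      "args": ["kubrick-mcp/server.js"],
      "env": {
        "OPENAI_API_KEY": "your-openai-api-key",
        "GROQ_API_KEY": "your-groq-api-key",
        "OPIK_API_URL": "https://api.opik.example"
      }
    }
  }
}
```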

How to use

Kubrick is an MCP-based multimodal processing platform that exposes video, image, audio, and text processing capabilities as MCP resources, tools, and prompts. The server acts as the central hub: it coordinates multimodal pipelines, integrates with LLMs and vision models, and exposes endpoints that agents can call over the MCP protocol.

With this server, you can register resources (e.g., video indexing, image feature extraction, audio transcription), define prompts and tools for agents to use, and assemble pipelines that an agent orchestrates at runtime. The included modules focus on building a production-ready MCP server for video search and multimodal processing, with observability and prompt versioning through integrations such as Opik.

To use it, deploy the kubrick-mcp server, make sure your environment provides access to any required ML models or APIs, and connect your MCP clients to the server to start issuing prompts, reading resources, and invoking tools.

How to install

Prerequisites:

  • Node.js (recommended LTS) installed on your machine
  • Basic knowledge of MCP concepts (Resources, Prompts, Tools, and Agents)
  • Access to any APIs/models required by your pipelines (e.g., OpenAI, Groq) and corresponding API keys

Installation steps:

  1. Clone the repository:

     git clone https://github.com/your-org/multimodal-agents-course.git
     cd multimodal-agents-course

  2. Install dependencies for the MCP server (example for a Node.js-based server):

     npm install

  3. Configure environment variables (create a .env file or export them):

     OPENAI_API_KEY=your-openai-api-key
     GROQ_API_KEY=your-groq-api-key        # if using Groq
     OPIK_API_URL=https://api.opik.example # if using Opik

  4. Start the MCP server:

     npm run start
     # or, depending on the setup:
     node kubrick-mcp/server.js

  5. Verify the server is running by checking the console logs for MCP readiness (for HTTP-based transports, you can also check the local endpoint).
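The environment check in step 3 can be made explicit so the server fails fast on a missing credential instead of erroring mid-request. A minimal sketch — the variable names mirror the install command above, and the checkEnv helper is illustrative, not part of the kubrick-mcp codebase:

```javascript
// Required credentials in this sketch; adjust to your pipelines.
const REQUIRED = ["OPENAI_API_KEY"];
// Only needed if the corresponding services are used.
const OPTIONAL = ["GROQ_API_KEY", "OPIK_API_URL"];

function checkEnv(env = process.env) {
  // Fail fast if any required variable is missing or empty.
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  // Warn (but do not fail) on optional integrations.
  for (const name of OPTIONAL) {
    if (!env[name]) {
      console.warn(`Note: ${name} is not set; related features will be disabled.`);
    }
  }
}
```

Calling checkEnv() at the top of the server entry point turns a confusing runtime API error into a clear startup message.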

Notes:

  • If your setup uses a Python backend instead of Node, adjust the commands accordingly (e.g., python -m kubrick_mcp.server).
  • Ensure any required models or services (e.g., video processing pipelines, LLM endpoints) are accessible from the runtime environment.

Additional notes

Tips and common issues:

  • Ensure your API keys and endpoints are correctly configured; missing credentials are a common startup failure.
  • If you encounter port conflicts, change the default MCP server port in the configuration.
  • For multimodal pipelines, verify that all referenced resources (video processors, image/vision models, audio processing components) are installed and accessible.
  • Use Opik or similar observability tools to version prompts and monitor prompt/response history for debugging.
  • When debugging, run in a local development mode to inspect MCP inspector outputs and traces before deploying to production.
  • If using Docker or containerized deployments, ensure appropriate resource limits (CPU/GPU) are set for video and model workloads.
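For the port-conflict tip above, one common pattern is to read the port from an environment variable so a conflict can be resolved without code changes. A sketch under stated assumptions — MCP_PORT is an illustrative variable name, not necessarily the one kubrick-mcp uses; check the server's own configuration:

```javascript
// Resolve the server port from the environment, falling back to a default.
// Rejects values that are not valid TCP ports.
function resolvePort(env = process.env, fallback = 3000) {
  const raw = env.MCP_PORT;
  const port = raw === undefined ? fallback : Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid MCP_PORT value: ${raw}`);
  }
  return port;
}
```

With this in place, a conflicting default is worked around by restarting with, e.g., MCP_PORT=8080.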
