What is remotion-production?

A full video production workflow that orchestrates MCP servers to create Remotion videos from concept to render.

Which MCP tools are available?

Remotion media suite (generate_tts, generate_music, generate_sfx, generate_image, generate_video, generate_subtitles, list_assets), plus TwelveLabs, Pexels, ElevenLabs, and Replicate.

How do I start using it?

Review the rule files (production-pipeline.md, audio-integration.md, and the others under rules/), make sure all assets are in public/, generate the voiceover first, and then render locally or through CI (GitHub Actions).

remotion-production

npx machina-cli add skill DojoCodingLabs/remotion-superpowers/remotion-production --openclaw


Remotion Production Workflow

This skill teaches how to produce complete videos with Remotion by orchestrating multiple MCP tools together. It covers the full pipeline from concept to rendered MP4.

Available MCP Tools

You have access to these MCP servers for media production:

remotion-media (via KIE)

generate_tts — Text-to-speech voiceovers (ElevenLabs TTS)
generate_music — Background music (Suno V3.5–V5)
generate_sfx — Sound effects (ElevenLabs SFX V2)
generate_image — AI images (Nano Banana Pro)
generate_video — AI video clips (Veo 3.1)
generate_subtitles — Transcribe audio/video to SRT (Whisper)
list_assets — List all generated media in the project
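Since generate_subtitles produces SRT output and Remotion works in frames, caption timings usually need converting before they can drive animations. The following is a minimal sketch of that conversion, assuming standard SRT timestamps of the form HH:MM:SS,mmm. The helper names (srtTimeToMs, srtTimeToFrame, parseCueTiming) are hypothetical and are not part of remotion-media or @remotion/captions.

```typescript
// Hypothetical helpers: convert SRT timings into Remotion frame numbers.

// Convert an SRT timestamp such as "00:00:01,500" into milliseconds.
function srtTimeToMs(timestamp: string): number {
  const match = /^(\d{2}):(\d{2}):(\d{2})[,.](\d{3})$/.exec(timestamp.trim());
  if (!match) {
    throw new Error(`Invalid SRT timestamp: ${timestamp}`);
  }
  const [, hh, mm, ss, ms] = match;
  return ((Number(hh) * 60 + Number(mm)) * 60 + Number(ss)) * 1000 + Number(ms);
}

// Map an SRT timestamp to the nearest frame at the given frame rate.
function srtTimeToFrame(timestamp: string, fps: number): number {
  return Math.round((srtTimeToMs(timestamp) / 1000) * fps);
}

// Parse one cue timing line, e.g. "00:00:01,500 --> 00:00:03,000".
function parseCueTiming(
  line: string,
  fps: number,
): { startFrame: number; endFrame: number } {
  const [start, end] = line.split("-->");
  if (start === undefined || end === undefined) {
    throw new Error(`Invalid SRT timing line: ${line}`);
  }
  return {
    startFrame: srtTimeToFrame(start, fps),
    endFrame: srtTimeToFrame(end, fps),
  };
}
```

At 30 fps, a cue running from 00:00:01,500 to 00:00:03,000 would span frames 45 to 90. For production captions, rules/captions-workflow.md and @remotion/captions remain the reference; this only illustrates the arithmetic.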

TwelveLabs (video understanding)

Index and analyze video files
Semantic search within videos ("find the part where...")
Scene detection, object detection, speaker identification
Video summarization

Pexels (stock footage)

searchPhotos — Search free stock photos
searchVideos — Search free stock videos
getVideo / getPhoto — Get details by ID
downloadVideo — Download video to project

ElevenLabs (optional — advanced voice)

Voice cloning from audio samples
Advanced TTS with custom voices
Audio isolation and processing
Transcription

Replicate (optional — 100+ AI models)

replicate_run — Run a model synchronously (images)
replicate_create_prediction — Start async prediction (video)
replicate_get_prediction — Poll prediction status
Image models: FLUX 1.1 Pro, Imagen 4, Ideogram v3, FLUX Kontext
Video models: Wan 2.5 (T2V, I2V), Kling 2.6 Pro
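The async video path (replicate_create_prediction followed by replicate_get_prediction) amounts to a polling loop. The sketch below shows one way that loop might look. It assumes the prediction object exposes Replicate's usual status values, and it takes the actual tool call as an injected getPrediction function, because how an MCP tool is invoked depends on the agent runtime. The Prediction shape, the function names, and the default interval and attempt limit are all assumptions.

```typescript
// Assumed status values, following Replicate's prediction lifecycle.
type PredictionStatus =
  | "starting"
  | "processing"
  | "succeeded"
  | "failed"
  | "canceled";

// Assumed minimal shape of a prediction; the real tool response may carry more fields.
interface Prediction {
  id: string;
  status: PredictionStatus;
  output?: unknown;
}

// A prediction is finished once it has succeeded, failed, or been canceled.
function isTerminal(status: PredictionStatus): boolean {
  return status === "succeeded" || status === "failed" || status === "canceled";
}

// Poll until the prediction reaches a terminal state or the attempt budget runs out.
// getPrediction is a placeholder for whatever wraps replicate_get_prediction.
async function waitForPrediction(
  id: string,
  getPrediction: (id: string) => Promise<Prediction>,
  intervalMs = 2000,
  maxAttempts = 60,
): Promise<Prediction> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const prediction = await getPrediction(id);
    if (isTerminal(prediction.status)) {
      return prediction;
    }
    // Wait before the next poll so the API is not hammered.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Prediction ${id} did not finish after ${maxAttempts} polls`);
}
```

Capping the number of attempts keeps a stalled video job from blocking the rest of the pipeline; the caller still has to check whether the returned status is "succeeded" before using the output.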

Production Pipeline

Read individual rule files for detailed workflows:

rules/production-pipeline.md — End-to-end workflow from concept to final render
rules/audio-integration.md — How to integrate generated audio into Remotion compositions
rules/voiceover-sync.md — Syncing TTS voiceovers with animations and captions
rules/music-scoring.md — Generating and timing background music
rules/stock-footage-workflow.md — Searching, downloading, and using stock footage in Remotion
rules/video-analysis.md — Using TwelveLabs to analyze and select clips from existing footage
rules/captions-workflow.md — TikTok-style animated captions using @remotion/captions and Whisper
rules/animation-presets.md — Reusable animation patterns (fade, slide, scale, typewriter, stagger)
rules/3d-content.md — Three.js and React Three Fiber via @remotion/three
rules/data-visualization.md — Animated charts, dashboards, and number counters
rules/visual-effects.md — Light leaks, Lottie, film grain, vignettes, Ken Burns
rules/ci-rendering.md — GitHub Actions workflows for automated video rendering
rules/replicate-models.md — Replicate MCP model catalog, usage, and decision guide
rules/image-generation.md — AI image prompt engineering, provider selection, Remotion integration
rules/video-generation.md — AI video clip generation, I2V pipeline, sequencing in Remotion
rules/sound-effects.md — SFX generation, prompt engineering, timing to visual events
rules/elevenlabs-advanced.md — Voice cloning, custom TTS parameters, multi-voice scripts
rules/asset-management.md — File organization, naming conventions, staticFile() reference

Key Principles

Audio drives timing — Generate voiceover first, get its duration, then set composition length to match.
Assets go in public/ — All generated media files (audio, video, images) must be saved to the project's public/ directory so Remotion can access them via staticFile().
Use Remotion's audio components — Always use <Audio> component with staticFile() for audio. Never use HTML <audio> tags.
Frame-based timing — Remotion uses frames, not seconds. Convert with fps * seconds. At 30fps, 1 second = 30 frames.
Progressive composition — Build the video in layers: visuals first, then voiceover, then music, then SFX.
Preview frequently — Use npm run dev to preview after each major change. The Remotion player updates live.
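The first and fourth principles combine into a small piece of arithmetic: measure the voiceover, then express the composition length in frames. A minimal sketch follows, assuming the voiceover duration in seconds is already known. The function names and the optional tail padding are illustrative choices, not something the skill prescribes.

```typescript
// Convert a duration in seconds to a whole number of frames.
function secondsToFrames(seconds: number, fps: number): number {
  return Math.round(seconds * fps);
}

// Composition length that covers the whole voiceover plus an optional tail
// (for example, a short outro). Rounds up so the audio is never cut off.
function compositionLength(
  voiceoverSeconds: number,
  fps: number,
  tailSeconds = 0,
): number {
  return Math.ceil((voiceoverSeconds + tailSeconds) * fps);
}

// Example: a 12.5-second voiceover at 30 fps with half a second of tail.
const durationInFrames = compositionLength(12.5, 30, 0.5);
```

The resulting value is what would be passed as durationInFrames on the composition. Rounding up rather than to the nearest frame errs on the side of a slightly longer video instead of clipped narration.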

Source

git clone https://github.com/DojoCodingLabs/remotion-superpowers

The skill file is at skills/remotion-production/SKILL.md in that repository.

Overview

Orchestrates multiple MCP servers to produce complete Remotion videos from concept to final MP4. It explains how to combine TTS, music, SFX, stock footage, and video analysis into cohesive compositions, guided by a library of production rules.

How This Skill Works

It uses remotion-media to generate TTS, music, SFX, images, video clips, and subtitles; TwelveLabs to index and analyze existing footage; and Pexels to source stock photos and videos. ElevenLabs (advanced voice work) and Replicate (additional AI image and video models) are optional. The production flow is guided by the rule files (production-pipeline.md, audio-integration.md, voiceover-sync.md, and the others under rules/), which cover timing, asset sequencing, and captions. All generated media is placed in the project's public/ directory and accessed with Remotion's staticFile().
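Because staticFile() takes a path relative to public/, a file saved at public/audio/voiceover.mp3 is referenced as "audio/voiceover.mp3". The helper below sketches that mapping. The function name and the example file names are hypothetical, and the helper is only a convenience for turning a project-relative path into the argument staticFile() expects.

```typescript
// Hypothetical helper: strip the leading "public/" from a project-relative path
// so the result can be passed to Remotion's staticFile().
function toStaticFilePath(projectPath: string): string {
  // Normalize Windows separators and a leading "./".
  const normalized = projectPath.replace(/\\/g, "/").replace(/^\.\//, "");
  if (!normalized.startsWith("public/")) {
    throw new Error(`Expected a path under public/, got: ${projectPath}`);
  }
  return normalized.slice("public/".length);
}

// Example with an assumed file name:
const voiceoverPath = toStaticFilePath("public/audio/voiceover.mp3");
// In a Remotion component this would be used as:
//   <Audio src={staticFile(voiceoverPath)} />
```

Throwing on paths outside public/ surfaces a misplaced asset at build time rather than as a missing file during render.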

When to Use It

You need voiceovers or TTS narration synced to animations
You require timed background music and SFX aligned to scene actions
You want to incorporate stock footage or AI-generated clips into Remotion projects
You need to analyze existing footage to select clips or summarize scenes using TwelveLabs
You're building videos with TikTok-style animated captions that must stay synchronized with the audio

Quick Start

Step 1: Plan the concept and generate TTS to lock the video duration
Step 2: Gather media (music, SFX, stock footage) and place all assets in public/
Step 3: Build the Remotion composition, sync the voiceover with animations, and render the MP4
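When the voiceover is generated per scene, the three steps above reduce to laying the scenes end to end in frames. The sketch below assumes each scene has a known voiceover duration and computes the from and durationInFrames values that a Remotion Sequence would take. The Scene and ScenePlacement types, the function names, and the example scene ids are hypothetical.

```typescript
// Hypothetical input: one entry per scene, with its voiceover length in seconds.
interface Scene {
  id: string;
  voiceoverSeconds: number;
}

// Output: where each scene starts and how long it lasts, in frames.
interface ScenePlacement {
  id: string;
  from: number;
  durationInFrames: number;
}

// Place scenes back to back; each scene lasts as long as its voiceover.
function layoutScenes(scenes: Scene[], fps: number): ScenePlacement[] {
  let cursor = 0;
  return scenes.map((scene) => {
    // Guarantee at least one frame so a very short clip still renders.
    const durationInFrames = Math.max(1, Math.round(scene.voiceoverSeconds * fps));
    const placement = { id: scene.id, from: cursor, durationInFrames };
    cursor += durationInFrames;
    return placement;
  });
}

// Total composition length is the sum of all scene durations.
function totalFrames(placements: ScenePlacement[]): number {
  return placements.reduce((sum, p) => sum + p.durationInFrames, 0);
}

// Example: a 2-second intro followed by a 3.5-second demo at 30 fps.
const placements = layoutScenes(
  [
    { id: "intro", voiceoverSeconds: 2 },
    { id: "demo", voiceoverSeconds: 3.5 },
  ],
  30,
);
```

Each placement maps directly onto a Sequence's from and durationInFrames props, and totalFrames(placements) gives the composition's durationInFrames.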

Best Practices

Generate the voiceover first to determine duration and set composition length
Save all outputs to public/ so Remotion can access via staticFile()
Leverage the rule files (production-pipeline, audio-integration, voiceover-sync, music-scoring) to structure workflows
Use TwelveLabs to guide edits with scene/object detection and summarization
Test renders locally, then validate them in CI (GitHub Actions) before publishing

Example Use Cases

Marketing product launch video with TTS narration and synchronized captions
Tutorial video with background music, SFX cues, and stock footage overlays
Customer education clip using an AI-generated voiceover with a consistent tone
Tech explainer leveraging AI video clips and scene detection to pace segments
TikTok-style short with animated captions and rapid scene changes
