inSaiAI Intelligent Editing
Scanned@liudu2326526
npx machina-cli add skill @liudu2326526/insaiai-intelligent-editing --openclaw
Comprehensive guide for professional video and audio manipulation using FFmpeg and FFprobe.
Core Concepts
FFmpeg is the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play almost anything that humans and machines have created. It is a command-line tool that processes streams through a complex pipeline of demuxers, decoders, filters, encoders, and muxers.
Common Operations
# Basic Transcoding (MP4 to MKV)
ffmpeg -i input.mp4 output.mkv
# Change Video Codec (to H.265/HEVC)
ffmpeg -i input.mp4 -c:v libx265 -crf 28 -c:a copy output.mp4
# Extract Audio (No Video)
ffmpeg -i input.mp4 -vn -c:a libmp3lame -q:a 2 output.mp3
# Resize/Scale Video
ffmpeg -i input.mp4 -vf "scale=1280:720" output.mp4
# Cut Video (Start at 10s, Duration 30s; stream copy cuts on keyframes, so the actual start may shift slightly)
ffmpeg -i input.mp4 -ss 00:00:10 -t 00:00:30 -c copy output.mp4
# Fast Precise Cut (input seeking plus re-encoding; with -ss before -i the output timestamps restart at 0, so use -t for the 30s duration)
ffmpeg -ss 00:00:10 -i input.mp4 -t 00:00:30 -c:v libx264 -crf 23 -c:a aac output.mp4
# Concatenate Files (using demuxer)
# Create filelist.txt with one entry per line: file 'part1.mp4' on the first line, file 'part2.mp4' on the second
ffmpeg -f concat -safe 0 -i filelist.txt -c copy output.mp4
# Speed Up/Slow Down Video (2x speed)
ffmpeg -i input.mp4 -filter_complex "[0:v]setpts=0.5*PTS[v];[0:a]atempo=2.0[a]" -map "[v]" -map "[a]" output.mp4
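The list file used by the concat demuxer command above needs one file '...' entry per line. A minimal sketch of generating it from the shell follows. make_concat_list is a hypothetical helper name, and the sketch assumes the file names contain no single quotes (those would need escaping).

```shell
# make_concat_list LIST FILE...
# Hypothetical helper: writes one "file '<name>'" line per FILE into LIST.
# Assumption: no FILE name contains a single quote.
make_concat_list() {
  list=$1
  shift
  : > "$list"                      # create or truncate the list file
  for f in "$@"; do
    printf "file '%s'\n" "$f" >> "$list"
  done
}

# Usage (not run here):
#   make_concat_list filelist.txt part1.mp4 part2.mp4
#   ffmpeg -f concat -safe 0 -i filelist.txt -c copy output.mp4
```

Stream copy with the concat demuxer generally expects all parts to share the same codecs and parameters, so this fits clips that came from the same encoder settings.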
Processing Categories & When to Use
Codecs & Quality
| Option | Use When |
|---|---|
| -c:v libx264 | Standard H.264 encoding (best compatibility) |
| -c:v libx265 | H.265/HEVC encoding (best compression/quality) |
| -crf [0-51] | Constant Rate Factor (lower is higher quality, 18-28 recommended) |
| -preset | Encoding speed vs compression (ultrafast, medium, veryslow) |
| -c:a copy | Pass-through audio without re-encoding (saves time/quality) |
Filters & Manipulation
| Filter | Use When |
|---|---|
| scale | Changing resolution (e.g., scale=1920:-1 for a 1920-pixel width with the height chosen to keep the aspect ratio) |
| crop | Removing edges (e.g., crop=w:h:x:y) |
| transpose | Rotating video (1=90deg CW, 2=90deg CCW) |
| fps | Changing frame rate (e.g., fps=30) |
| drawtext | Adding text overlays/watermarks |
| overlay | Picture-in-picture or adding image watermarks |
| fade | Adding fade-in/out effects (e.g., fade=in:0:30 for first 30 frames) |
| volume | Adjusting audio levels (e.g., volume=1.5 for 150% volume) |
| setpts | Changing video speed (e.g., setpts=0.5*PTS for double speed) |
| atempo | Changing audio speed without pitch shift (0.5 to 2.0) |
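For speed factors outside the 0.5 to 2.0 range listed for atempo, a common approach is to chain several atempo stages whose product equals the target speed, while setpts takes the reciprocal of the speed. The sketch below builds both strings. atempo_chain and setpts_factor are hypothetical helper names, not FFmpeg features, and the per-stage range is taken from the table above.

```shell
# atempo_chain SPEED
# Hypothetical helper: prints a comma-separated atempo chain whose stages
# each stay within 0.5-2.0 and multiply out to SPEED.
atempo_chain() {
  awk -v s="$1" 'BEGIN {
    s = s + 0
    if (s <= 0) exit 1
    out = ""
    while (s > 2.0) { out = out "atempo=2.0,"; s /= 2.0 }
    while (s < 0.5) { out = out "atempo=0.5,"; s /= 0.5 }
    printf "%satempo=%g\n", out, s
  }'
}

# setpts_factor SPEED
# Prints 1/SPEED, the multiplier to use in setpts=<factor>*PTS.
setpts_factor() {
  awk -v s="$1" 'BEGIN { s = s + 0; if (s <= 0) exit 1; printf "%g\n", 1 / s }'
}

# Usage (not run here), 4x speed:
#   ffmpeg -i input.mp4 -filter_complex \
#     "[0:v]setpts=$(setpts_factor 4)*PTS[v];[0:a]$(atempo_chain 4)[a]" \
#     -map "[v]" -map "[a]" output.mp4
```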
Inspection & Metadata
| Tool/Option | Use When |
|---|---|
| ffprobe -v error -show_format -show_streams | Getting detailed technical info of a file |
| -metadata title="Name" | Setting global metadata tags |
| -map | Selecting specific streams (e.g., -map 0:v:0 -map 0:a:1) |
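When a script needs only one or two values rather than the full ffprobe report, -show_entries with a compact output format is usually easier to parse. The wrappers below are a sketch. probe_duration and probe_video_info are hypothetical names, and the second one assumes the file has at least one video stream.

```shell
# probe_duration FILE
# Prints the container duration in seconds (for example 12.500000).
probe_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# probe_video_info FILE
# Prints codec name, width and height of the first video stream
# as one comma-separated line.
probe_video_info() {
  ffprobe -v error -select_streams v:0 \
    -show_entries stream=codec_name,width,height -of csv=p=0 "$1"
}

# Usage (not run here):
#   probe_duration input.mp4
#   probe_video_info input.mp4
```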
Advanced: Complex Filtergraphs
Use filter_complex when you need to process multiple inputs or create non-linear filter chains.
# Example: Adding a watermark at the bottom right
ffmpeg -i input.mp4 -i watermark.png -filter_complex "overlay=main_w-overlay_w-10:main_h-overlay_h-10" output.mp4
# Example: Vertical Stack (2 videos)
ffmpeg -i top.mp4 -i bottom.mp4 -filter_complex "vstack=inputs=2" output.mp4
# Example: Side-by-Side (2 videos)
ffmpeg -i left.mp4 -i right.mp4 -filter_complex "hstack=inputs=2" output.mp4
# Example: Grid (4 videos 2x2)
ffmpeg -i v1.mp4 -i v2.mp4 -i v3.mp4 -i v4.mp4 -filter_complex "[0:v][1:v]hstack=inputs=2[top];[2:v][3:v]hstack=inputs=2[bottom];[top][bottom]vstack=inputs=2" output.mp4
# Example: Fade Transition (simple crossfade between two clips using xfade; video only)
# offset = first clip duration minus transition duration; offset=9 here assumes a 10s first clip and a 1s fade
ffmpeg -i input1.mp4 -i input2.mp4 -filter_complex "xfade=transition=fade:duration=1:offset=9" output.mp4
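The xfade offset is the time on the first clip at which the transition begins, so it can be derived from the first clip's duration and the transition length. A minimal sketch of that arithmetic follows. xfade_offset is a hypothetical helper name, and the duration is assumed to come from ffprobe or to be known in advance.

```shell
# xfade_offset FIRST_CLIP_SECONDS TRANSITION_SECONDS
# Hypothetical helper: prints the offset at which the transition should start.
# Fails (exit status 1) if the transition is longer than the clip.
xfade_offset() {
  awk -v d="$1" -v t="$2" 'BEGIN {
    d = d + 0; t = t + 0
    if (t > d) exit 1
    printf "%g\n", d - t
  }'
}

# Usage (not run here), reading the duration of input1.mp4 with ffprobe:
#   dur=$(ffprobe -v error -show_entries format=duration \
#           -of default=noprint_wrappers=1:nokey=1 input1.mp4)
#   ffmpeg -i input1.mp4 -i input2.mp4 -filter_complex \
#     "xfade=transition=fade:duration=1:offset=$(xfade_offset "$dur" 1)" output.mp4
```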
Pro Editing Tips & Techniques
1. High-Quality GIF Creation
Standard conversion often results in poor colors. Use a palette for best results:
ffmpeg -i input.mp4 -vf "fps=15,scale=480:-1:flags=lanczos,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" output.gif
2. Audio Mixing (Background Music + Voice)
Mix background music at 30% volume with the main audio:
ffmpeg -i voice.mp4 -i bgm.mp3 -filter_complex "[1:a]volume=0.3[bg];[0:a][bg]amix=inputs=2:duration=first" -c:v copy output.mp4
3. Video Stabilization
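Two-pass process to fix shaky footage (requires an FFmpeg build with libvidstab; pass 1 writes transforms.trf, which pass 2 reads by default):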
# Pass 1: Analyze
ffmpeg -i shaky.mp4 -vf vidstabdetect -f null -
# Pass 2: Transform
ffmpeg -i shaky.mp4 -vf vidstabtransform,unsharp=5:5:0.8:3:3:0.4 output.mp4
4. Color Correction & Enhancement
Adjust brightness, contrast, and saturation:
# brightness=0.05, contrast=1.1, saturation=1.2
ffmpeg -i input.mp4 -vf "eq=brightness=0.05:contrast=1.1:saturation=1.2" output.mp4
5. Automatic Thumbnail Sheet
Create a 3x3 grid of frames:
ffmpeg -i input.mp4 -vf "select='not(mod(n,100))',scale=320:-1,tile=3x3" -frames:v 1 preview.png
6. Remove Silence from Audio
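Trim leading silence, then stop at the first silent stretch after the audio begins (this suits clips without internal pauses; only the audio stream is filtered):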
ffmpeg -i input.mp4 -af silenceremove=start_periods=1:start_silence=0.1:start_threshold=-50dB:stop_periods=1:stop_silence=0.1:stop_threshold=-50dB output.mp4
7. Hardsub Burning
Burn SRT/ASS subtitles directly into the video stream:
# Burn SRT
ffmpeg -i input.mp4 -vf "subtitles=subs.srt" output.mp4
# Burn ASS (supports advanced styling)
ffmpeg -i input.mp4 -vf "ass=subs.ass" output.mp4
8. Target File Size Compression
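Calculate the bitrate needed to fit a specific file size (e.g., 50MB for a 60s video):
# Total bitrate = (TargetSize_in_bits) / (Duration_in_seconds)
# 50MB = 400,000,000 bits = 400,000 kbit. For 60s, total bitrate ≈ 6666k; minus 128k of audio leaves ≈ 6500k for video
# The 6000k used below stays under that limit, leaving room for container overhead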
ffmpeg -i input.mp4 -b:v 6000k -maxrate 6000k -bufsize 12000k -c:a aac -b:a 128k output.mp4
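The bitrate arithmetic above can be scripted so it is not redone by hand for each file. This is a sketch under stated assumptions: video_kbps is a hypothetical helper name, 1 MB is treated as 8000 kilobits (decimal megabytes), durations are whole seconds, and container overhead is ignored.

```shell
# video_kbps SIZE_MB DURATION_S AUDIO_KBPS
# Hypothetical helper: prints the video bitrate in kbit/s that fits SIZE_MB
# after reserving AUDIO_KBPS for the audio track.
# Assumptions: 1 MB = 8000 kbit, integer inputs, no container overhead.
video_kbps() {
  total=$(( $1 * 8000 / $2 ))      # total budget in kbit/s (integer division)
  echo $(( total - $3 ))
}

# Usage (not run here):
#   v=$(video_kbps 50 60 128)
#   ffmpeg -i input.mp4 -b:v "${v}k" -maxrate "${v}k" -bufsize "$(( v * 2 ))k" \
#     -c:a aac -b:a 128k output.mp4
```

Single-pass bitrate targeting is approximate, so picking a value somewhat below the computed figure is a reasonable safety margin.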
9. Scene Change Detection
Extract frames where a scene change is detected (threshold 0.4; on FFmpeg 5.1 and later, -fps_mode vfr replaces the deprecated -vsync vfr):
ffmpeg -i input.mp4 -filter_complex "select='gt(scene,0.4)',metadata=print:file=scenes.txt" -vsync vfr scene_%03d.png
10. Extracting Frames at Specific Intervals
Extract one frame every 5 seconds:
ffmpeg -i input.mp4 -vf "fps=1/5" img_%03d.jpg
11. Batch Processing (Shell Snippet)
Convert all .mov files in a directory to .mp4:
for f in *.mov; do ffmpeg -i "$f" "${f%.mov}.mp4"; done
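For repeated runs, the loop above can be wrapped so that it skips files that were already converted and keeps going after a failure. The sketch below is one way to do that. convert_mov_dir is a hypothetical helper name, and -nostdin is passed so ffmpeg does not read from the terminal inside the loop.

```shell
# convert_mov_dir [DIR]
# Hypothetical helper: converts every .mov in DIR (default: current directory)
# to .mp4, skipping files whose .mp4 already exists.
convert_mov_dir() {
  dir=${1:-.}
  for f in "$dir"/*.mov; do
    [ -e "$f" ] || continue        # the glob matched nothing
    out="${f%.mov}.mp4"
    [ -e "$out" ] && continue      # already converted, leave it alone
    ffmpeg -nostdin -i "$f" "$out" || echo "failed: $f" >&2
  done
}

# Usage (not run here):
#   convert_mov_dir ~/Videos
```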
12. Live Streaming (RTMP)
Push a local file to a streaming server (YouTube/Twitch):
ffmpeg -re -i input.mp4 -c:v libx264 -preset veryfast -b:v 3000k -maxrate 3000k -bufsize 6000k -pix_fmt yuv420p -g 60 -c:a aac -b:a 128k -f flv rtmp://a.rtmp.youtube.com/live2/YOUR_STREAM_KEY
Hardware Acceleration
| Platform | Codec | Command |
|---|---|---|
| NVIDIA (NVENC) | H.264 | -c:v h264_nvenc |
| Intel (QSV) | H.264 | -c:v h264_qsv |
| Apple (VideoToolbox) | H.265 | -c:v hevc_videotoolbox |
Constraints & Error Handling
- Stream Mapping: Always use -map for complex files to ensure you get the right audio/subtitle tracks.
- Seeking: Put -ss before -i for fast seeking (input seeking), or after -i for accurate seeking (output seeking).
- Format Support: Ensure the output container (extension) supports the codecs you've chosen.
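With -ss placed before -i, output timestamps restart at zero, so a cut is easiest to express as a start time plus a duration (-t) rather than an end time. The sketch below converts a start and end timestamp into that duration. hms_to_s and cut_duration are hypothetical helper names, and whole-second HH:MM:SS values are assumed.

```shell
# hms_to_s HH:MM:SS
# Hypothetical helper: prints the timestamp as a number of seconds.
hms_to_s() {
  echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# cut_duration START END
# Prints END minus START in seconds, suitable for -t.
# Assumption: both values are whole-second HH:MM:SS timestamps.
cut_duration() {
  echo $(( $(hms_to_s "$2") - $(hms_to_s "$1") ))
}

# Usage (not run here), cutting from 00:00:10 to 00:00:40:
#   ffmpeg -ss 00:00:10 -i input.mp4 -t "$(cut_duration 00:00:10 00:00:40)" \
#     -c:v libx264 -crf 23 -c:a aac output.mp4
```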
Overview
inSaiAI Intelligent Editing provides practical, command-line driven video and audio processing using FFmpeg and FFprobe. It covers transcoding, filtering, streaming, metadata manipulation, and complex filtergraph tasks to streamline professional media work.
How This Skill Works
FFmpeg acts as the core engine to decode, encode, transcode, filter, and mux streams through a chain of demuxers, decoders, filters, encoders, and muxers. Tasks are expressed as shell commands that combine input/output, codecs, filters, and metadata adjustments, with filter_complex enabling non-linear multi-input graphs.
When to Use It
- Transcode between formats or codecs (e.g., MP4 to MKV or H.265 encoding) for compatibility and size.
- Apply video/audio filters and effects (scale, crop, overlay, fade, volume changes) to edit media.
- Extract, replace, or rearrange streams using -map and audio/video separation.
- Edit and manage metadata for titles, track info, and embedded tags.
- Build complex, multi-input filtergraphs (watermarks, stacks, grids) with filter_complex.
Quick Start
- Step 1: Install FFmpeg/FFprobe on your system
- Step 2: Pick a task example (transcode, extract audio, scale) and note input/output
- Step 3: Run a basic command like: ffmpeg -i input.mp4 output.mkv
Best Practices
- Start with a short test clip to validate results before a full run.
- Select codecs and tuning parameters (-crf, -preset) to balance quality and speed.
- Use -c:a copy when you can to preserve audio quality and speed up processing.
- Validate outputs with ffprobe -v error -show_format -show_streams to confirm properties.
- Use filter_complex only when handling multiple inputs or non-linear graph chains to avoid unnecessary processing.
Example Use Cases
- ffmpeg -i input.mp4 output.mkv
- ffmpeg -i input.mp4 -c:v libx265 -crf 28 -c:a copy output.mp4
- ffmpeg -i input.mp4 -vn -c:a libmp3lame -q:a 2 output.mp3
- ffmpeg -i input.mp4 -vf scale=1280:720 output.mp4
- ffmpeg -i input.mp4 -ss 00:00:10 -t 00:00:30 -c copy output.mp4