youtube
A powerful Model Context Protocol (MCP) server for YouTube video transcription and metadata extraction.
claude mcp add --transport sse mourad-ghafiri-youtube-mcp-server http://127.0.0.1:8000/sse
How to use
This MCP server exposes a YouTube-focused transcription and metadata toolset. Clients can fetch video metadata without downloading the video, or request transcriptions with optional translation. The core tools are:
- get_video_info: returns title, description, view count, duration, uploader, upload date, and thumbnails.
- transcribe_video: returns a time-aligned transcription, with automatic language detection or translation into a specified language.
Transcription runs through an in-memory processing pipeline that uses Silero VAD for segmentation and OpenAI Whisper models (with optional GPU acceleration) for transcription, and supports a broad multilingual set of languages. Results are cached on disk to avoid redundant processing. You interact with the server through its SSE endpoint at http://127.0.0.1:8000/sse; configure your MCP client to connect to that endpoint.
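As a sketch of what a client sends over the wire, the two tools above are invoked with standard MCP `tools/call` requests. The tool names come from this README; the argument names (`url`, `language`) are assumptions for illustration and may differ in the actual server.

```python
import json

def build_tool_call(tool_name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 MCP tools/call request body."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Fetch metadata only (no video download); "url" is an assumed parameter name.
info_req = build_tool_call(
    "get_video_info",
    {"url": "https://www.youtube.com/watch?v=VIDEO_ID"},
)

# Request a transcription translated into French; "language" is an assumed parameter name.
tx_req = build_tool_call(
    "transcribe_video",
    {"url": "https://www.youtube.com/watch?v=VIDEO_ID", "language": "fr"},
    request_id=2,
)

print(json.dumps(info_req, indent=2))
```

The response to a `tools/call` request carries the tool's output in its `result` field, per the MCP specification.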
How to install
Prerequisites:
- Python 3.10+
- ffmpeg (for audio processing)
- git
- Clone the repository:
git clone https://github.com/mourad-ghafiri/youtube-mcp-server
cd youtube-mcp-server
- Install dependencies with uv, the package manager used by the project:
uv sync
- Run the server:
uv run main.py
- Verify the server is running and reachable at http://127.0.0.1:8000/sse
Prerequisite notes:
- Ensure ffmpeg is installed and available in your PATH.
- If you plan to use GPU acceleration for Whisper, ensure CUDA/MPS support is configured on your system and the appropriate Whisper model is selected in config (see the README for details).
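To illustrate the GPU note above, a typical device check before loading a Whisper model looks like the sketch below. This is a generic PyTorch availability probe, not code from this repository; it falls back to CPU when PyTorch is not installed at all.

```python
def pick_whisper_device():
    """Return the best available device string for Whisper inference.

    Prefers CUDA, then Apple MPS, and falls back to CPU (including
    when PyTorch itself is not installed).
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"

print(pick_whisper_device())
```

Larger Whisper models (medium, large) are impractical on CPU for long videos, so a check like this is worth doing before selecting a model size.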
Additional notes
Tips and common considerations:
- Configurable defaults live in src/youtube_mcp_server/config.py. You can adjust transcript caching, selected Whisper model, VAD model, sampling rate, padding, and max workers.
- Supported Whisper model names include: tiny, base, small, medium, large, turbo. Larger models require more RAM and a capable GPU.
- The VAD uses Silero models; ensure you have network access to download model files if not cached locally.
- The server uses yt-dlp for video metadata extraction; it does not download full videos when only metadata is requested.
- If you encounter performance issues, tweak MAX_WORKERS (default 4) and ensure your hardware matches the scale of concurrent transcription tasks.
- For MCP clients, configure the server URL and use the provided mcpServers.youtube entry to route requests.
- If you need to customize directories, update TRANSCRIPTIONS_DIR in the config to your preferred cache location.
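The disk cache mentioned above (TRANSCRIPTIONS_DIR) can be pictured as a simple mapping from video ID and language to a JSON file of segments. The sketch below is illustrative only; the class name, file layout, and segment schema are assumptions, and the real server's cache may differ.

```python
import json
import tempfile
from pathlib import Path

class TranscriptionCache:
    """Minimal disk cache keyed by (video_id, language).

    Hypothetical sketch of the caching approach; not the server's
    actual implementation.
    """

    def __init__(self, cache_dir):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, video_id, language):
        # One JSON file per (video, language) pair.
        return self.cache_dir / f"{video_id}.{language}.json"

    def get(self, video_id, language="auto"):
        """Return cached segments, or None on a cache miss."""
        path = self._path(video_id, language)
        if path.exists():
            return json.loads(path.read_text(encoding="utf-8"))
        return None

    def put(self, video_id, segments, language="auto"):
        """Store time-aligned segments for a video."""
        self._path(video_id, language).write_text(
            json.dumps(segments, ensure_ascii=False), encoding="utf-8"
        )

cache = TranscriptionCache(tempfile.mkdtemp())
cache.put("example_id", [{"start": 0.0, "end": 2.5, "text": "Hello world"}])
print(cache.get("example_id"))
```

A cache hit lets the server skip the download, VAD, and Whisper stages entirely, which is why redundant requests for the same video return quickly.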
Related MCP Servers
mcp-vegalite
MCP server from isaacwasserman/mcp-vegalite-server
github-chat
A Model Context Protocol (MCP) for analyzing and querying GitHub repositories using the GitHub Chat API.
nautex
MCP server for guiding Coding Agents via end-to-end requirements to implementation plan pipeline
pagerduty
PagerDuty's official local MCP (Model Context Protocol) server which provides tools to interact with your PagerDuty account directly from your MCP-enabled client.
futu-stock
MCP server for Futu NiuNiu stock
mcp-boilerplate
Boilerplate using one of the 'better' ways to build MCP servers. Written using FastMCP.