youtube
A powerful Model Context Protocol (MCP) server for YouTube video transcription and metadata extraction.
claude mcp add --transport sse mourad-ghafiri-youtube-mcp-server http://127.0.0.1:8000/sse
How to use
This MCP server exposes a YouTube-focused transcription and metadata toolset. Clients can fetch video metadata without downloading the video, or request transcriptions with optional translation. The core tools are:
- get_video_info: returns title, description, view count, duration, uploader, upload date, and thumbnails.
- transcribe_video: returns a time-aligned transcription, with automatic language detection or translation into a specified language.
Transcription runs through an in-memory processing pipeline that uses Silero VAD for segmentation and OpenAI Whisper models (with optional GPU acceleration) for transcription, and supports a broad multilingual set of languages. Results are cached on disk to avoid redundant processing. You interact with the server through its SSE endpoint at http://127.0.0.1:8000/sse; configure your MCP client to connect to that endpoint.
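As a sketch of what a client sends over the wire, the two tools above are invoked with standard MCP `tools/call` requests. The tool names come from this README; the argument names (`url`, `language`) are assumptions for illustration and may differ in the actual server.

```python
import json

def build_tool_call(tool_name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 MCP tools/call request body."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Fetch metadata only (no video download); "url" is an assumed parameter name.
info_req = build_tool_call(
    "get_video_info",
    {"url": "https://www.youtube.com/watch?v=VIDEO_ID"},
)

# Request a transcription translated into French; "language" is an assumed parameter name.
tx_req = build_tool_call(
    "transcribe_video",
    {"url": "https://www.youtube.com/watch?v=VIDEO_ID", "language": "fr"},
    request_id=2,
)

print(json.dumps(info_req, indent=2))
```

The response to a `tools/call` request carries the tool's output in its `result` field, per the MCP specification.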
How to install
Prerequisites:
- Python 3.10+
- ffmpeg (for audio processing)
- git
- Clone the repository:
git clone https://github.com/mourad-ghafiri/youtube-mcp-server
cd youtube-mcp-server
- Install dependencies with uv, the package manager used by the project:
uv sync
- Run the server:
uv run main.py
- Verify the server is running and reachable at http://127.0.0.1:8000/sse
Prerequisite notes:
- Ensure ffmpeg is installed and available in your PATH.
- If you plan to use GPU acceleration for Whisper, ensure CUDA/MPS support is configured on your system and the appropriate Whisper model is selected in config (see the README for details).
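To illustrate the GPU note above, a typical device check before loading a Whisper model looks like the sketch below. This is a generic PyTorch availability probe, not code from this repository; it falls back to CPU when PyTorch is not installed at all.

```python
def pick_whisper_device():
    """Return the best available device string for Whisper inference.

    Prefers CUDA, then Apple MPS, and falls back to CPU (including
    when PyTorch itself is not installed).
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"

print(pick_whisper_device())
```

Larger Whisper models (medium, large) are impractical on CPU for long videos, so a check like this is worth doing before selecting a model size.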
Additional notes
Tips and common considerations:
- Configurable defaults live in src/youtube_mcp_server/config.py. You can adjust transcript caching, selected Whisper model, VAD model, sampling rate, padding, and max workers.
- Supported Whisper model names include: tiny, base, small, medium, large, turbo. Larger models require more RAM and a capable GPU.
- The VAD uses Silero models; ensure you have network access to download model files if not cached locally.
- The server uses yt-dlp for video metadata extraction; it does not download full videos when only metadata is requested.
- If you encounter performance issues, tweak MAX_WORKERS (default 4) and ensure your hardware matches the scale of concurrent transcription tasks.
- For MCP clients, configure the server URL and use the provided mcpServers.youtube entry to route requests.
- If you need to customize directories, update TRANSCRIPTIONS_DIR in the config to your preferred cache location.
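The disk cache mentioned above (TRANSCRIPTIONS_DIR) can be pictured as a simple mapping from video ID and language to a JSON file of segments. The sketch below is illustrative only; the class name, file layout, and segment schema are assumptions, and the real server's cache may differ.

```python
import json
import tempfile
from pathlib import Path

class TranscriptionCache:
    """Minimal disk cache keyed by (video_id, language).

    Hypothetical sketch of the caching approach; not the server's
    actual implementation.
    """

    def __init__(self, cache_dir):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, video_id, language):
        # One JSON file per (video, language) pair.
        return self.cache_dir / f"{video_id}.{language}.json"

    def get(self, video_id, language="auto"):
        """Return cached segments, or None on a cache miss."""
        path = self._path(video_id, language)
        if path.exists():
            return json.loads(path.read_text(encoding="utf-8"))
        return None

    def put(self, video_id, segments, language="auto"):
        """Store time-aligned segments for a video."""
        self._path(video_id, language).write_text(
            json.dumps(segments, ensure_ascii=False), encoding="utf-8"
        )

cache = TranscriptionCache(tempfile.mkdtemp())
cache.put("example_id", [{"start": 0.0, "end": 2.5, "text": "Hello world"}])
print(cache.get("example_id"))
```

A cache hit lets the server skip the download, VAD, and Whisper stages entirely, which is why redundant requests for the same video return quickly.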
Related MCP Servers
mcp-vegalite
MCP server from isaacwasserman/mcp-vegalite-server
github-chat
A Model Context Protocol (MCP) for analyzing and querying GitHub repositories using the GitHub Chat API.
nautex
MCP server for guiding Coding Agents via end-to-end requirements to implementation plan pipeline
pagerduty
PagerDuty's official local MCP (Model Context Protocol) server which provides tools to interact with your PagerDuty account directly from your MCP-enabled client.
futu-stock
MCP server for Futu NiuNiu stock
mcp-boilerplate
Boilerplate using one of the 'better' ways to build MCP servers. Written using FastMCP.