
Fast-Whisper

A high-performance speech recognition MCP server based on Faster Whisper, providing efficient audio transcription capabilities.

Installation
Run this command in your terminal to add the MCP server to Claude Code:
claude mcp add --transport stdio biguncle-fast-whisper-mcp-server python whisper_server.py

How to use

Fast-Whisper is a high-performance MCP server built on Faster Whisper for efficient speech recognition. It exposes tools to transcribe audio files, individually or in batches, and to query model information:

  • get_model_info: lists the available Whisper models
  • transcribe: transcribes a single audio file
  • batch_transcribe: processes every audio file in a folder

The server takes advantage of CUDA acceleration when available and dynamically adjusts batch sizes to make the most of GPU memory, delivering output in VTT, SRT, or JSON format. To integrate it with clients or automation, configure it under an mcpServers entry in your MCP client profile and invoke the tools through MCP commands or direct server calls.
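As a sketch, a Claude Desktop-style mcpServers entry might look like the following (the server name, command, and script path here are placeholders to adapt to your setup, not values confirmed by the project):

```json
{
  "mcpServers": {
    "fast-whisper": {
      "command": "python",
      "args": ["whisper_server.py"]
    }
  }
}
```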

How to install

Prerequisites:

  • Python 3.10+
  • faster-whisper>=0.9.0
  • torch==2.6.0+cu126 (or appropriate CUDA build) and torchaudio with matching CUDA version
  • mcp[cli]>=1.2.0
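The prerequisites above can be captured in a requirements.txt along these lines (pins taken from the list; the CUDA-specific torch and torchaudio wheels are deliberately left out, since they are installed separately with an --index-url in step 4 below):

```
faster-whisper>=0.9.0
mcp[cli]>=1.2.0
```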

Installation steps:

  1. Clone or download the repository containing the Whisper MCP server.
  2. Create and activate a virtual environment (highly recommended):
    • python -m venv env
    • source env/bin/activate # on Unix/macOS
    • .\env\Scripts\activate # on Windows
  3. Install Python dependencies:
    pip install -r requirements.txt
    
  4. Install PyTorch and torchaudio matching your CUDA version. Example installations:
    • CUDA 12.6:
      pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
      
    • CUDA 12.1:
      pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
      
    • CPU:
      pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cpu
      
  5. Install the rest of the dependencies (if not using requirements.txt):
    pip install "faster-whisper>=0.9.0"
    pip install "mcp[cli]>=1.2.0"
    
  6. Ensure you have a working CUDA environment if GPUs are available (verify with nvcc --version or nvidia-smi).
  7. Run the server:
    python whisper_server.py
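Before launching the server, a quick sanity check of the prerequisites can save a failed start. This small helper is illustrative and not part of the server itself; it only checks the Python version and whether the NVIDIA driver tools are on PATH:

```python
import shutil
import sys

def check_environment() -> dict:
    """Sanity-check prerequisites before launching whisper_server.py.
    (Illustrative helper, not part of the server's own code.)"""
    return {
        # The server requires Python 3.10 or newer.
        "python_ok": sys.version_info >= (3, 10),
        # nvidia-smi on PATH suggests a working NVIDIA driver install.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }

print(check_environment())
```

If nvidia_smi_found is False, the server will still run, but transcription falls back to the CPU build of PyTorch.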
    

Additional notes

Tips and considerations:

  • CUDA acceleration can significantly speed up transcription, especially for larger models and batch processing.
  • The server automatically adjusts batch size based on GPU memory; for very large audio sets, batch processing improves throughput.
  • Use VAD filtering and correct language settings to improve accuracy for long or noisy audio.
  • Ensure the model cache is accessible and that the server has permission to read/write temporary files and outputs.
  • If you run into GPU memory errors, try lowering the model size or reducing concurrent transcriptions via configuration adjustments.
  • When integrating with Claude Desktop or MCP tooling, reference the server name you configured (fast-whisper) in the mcpServers section.
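The memory-aware batch sizing mentioned above can be sketched as a pure function: grow the batch with free VRAM, capped at a maximum. The per-item memory cost and the cap here are illustrative guesses, not values taken from the server's actual logic:

```python
def pick_batch_size(free_vram_bytes: int,
                    per_item_bytes: int = 512 * 1024 * 1024,
                    max_batch: int = 16) -> int:
    """Heuristic batch sizing: one batch slot per ~512 MiB of free VRAM,
    capped at max_batch. Both constants are assumptions for illustration."""
    if free_vram_bytes <= 0:
        # No GPU memory reported: fall back to sequential processing.
        return 1
    return int(max(1, min(max_batch, free_vram_bytes // per_item_bytes)))

# With 8 GiB free the heuristic hits the cap; with 1 GiB it stays small.
print(pick_batch_size(8 * 1024**3))  # 16
print(pick_batch_size(1 * 1024**3))  # 2
```

Lowering max_batch (or the model size) is the same lever suggested above for working around GPU memory errors.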
