local-stt
A high-performance Model Context Protocol (MCP) server providing local speech-to-text transcription using whisper.cpp, optimized for Apple Silicon.
claude mcp add --transport stdio --env HF_TOKEN="your_token_here" smartlittleapps-local-stt-mcp -- node path/to/local-stt-mcp/mcp-server/dist/index.js
How to use
local-stt is a local, privacy-first speech-to-text MCP server that runs entirely on your machine and is optimized for Apple Silicon. It uses whisper.cpp for fast transcription, with optional speaker diarization and automatic audio format conversion via ffmpeg. The server exposes MCP tools such as transcribe, transcribe_long, transcribe_with_speakers, list_models, health_check, and version, and returns transcripts in multiple formats (txt, json, vtt, srt, csv). To begin, run the server with Node.js and point your MCP client at the server's index.js entry, as shown in the installation steps. To use speaker diarization, supply a HuggingFace token (HF_TOKEN) so the server can access the diarization models.
Once the server is running, you can call tools like transcribe for standard transcription with automatic format conversion, transcribe_with_speakers for diarized transcripts with speaker labels, and transcribe_long for splitting long audio into manageable chunks. You can also query available models with list_models to see which whisper models are installed and supported, perform health checks with health_check to confirm system readiness, and fetch version information with version for debugging or compatibility checks.
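Under the hood, each of these tool invocations is a standard MCP tools/call JSON-RPC request. The sketch below shows the shape of such a request for the transcribe tool; the argument names (audio_path, output_format) are illustrative assumptions, so check the server's actual schema via tools/list before relying on them.

```typescript
// Sketch of the JSON-RPC 2.0 message an MCP client sends to invoke a tool.
// NOTE: the argument names below (audio_path, output_format) are assumed for
// illustration -- query the server's tools/list for the real parameter names.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildTranscribeRequest(
  id: number,
  audioPath: string,
  outputFormat: "txt" | "json" | "vtt" | "srt" | "csv" = "txt"
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "transcribe",
      arguments: { audio_path: audioPath, output_format: outputFormat },
    },
  };
}

// Example: request an SRT transcript of a local recording.
const req = buildTranscribeRequest(1, "/tmp/meeting.m4a", "srt");
console.log(JSON.stringify(req));
```

In practice your MCP client library builds this envelope for you; the sketch only shows what travels over the stdio transport.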
How to install
Prerequisites
- Node.js 18+ installed on your system
- whisper.cpp installed (brew install whisper-cpp on macOS with Homebrew)
- ffmpeg installed for audio format conversions (brew install ffmpeg)
- Optional: HuggingFace token for speaker diarization (free token at huggingface.co)
Installation steps
- Clone the repository and navigate to the MCP server folder
git clone https://github.com/your-username/local-stt-mcp.git
cd local-stt-mcp/mcp-server
- Install dependencies and build
npm install
npm run build
- Download whisper models (if the project provides a script)
npm run setup:models
- Set up HuggingFace token if you plan to use speaker diarization
export HF_TOKEN="your_token_here" # Get a free token from huggingface.co
- Run the MCP server (as described in the configuration and client usage)
npm run start
Note: If you’re integrating with an MCP client, configure the client to point to the Node entry file, for example:
{
  "mcpServers": {
    "local-stt": {
      "command": "node",
      "args": ["path/to/local-stt-mcp/mcp-server/dist/index.js"]
    }
  }
}
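If you plan to use speaker diarization, the token can be passed through the client config's env field rather than a shell export (the env key follows the common MCP client configuration convention; adjust the path and token for your setup):

```json
{
  "mcpServers": {
    "local-stt": {
      "command": "node",
      "args": ["path/to/local-stt-mcp/mcp-server/dist/index.js"],
      "env": { "HF_TOKEN": "your_token_here" }
    }
  }
}
```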
Additional notes
Tips and caveats:
- For best Apple Silicon performance, ensure you use the optimized whisper.cpp setup and keep ffmpeg up to date.
- If diarization is not needed, you can omit HF_TOKEN and related diarization models to save resources.
- The server supports multiple output formats; specify the desired format in the MCP client requests via the tool parameters.
- If you encounter memory issues, make sure you have sufficient free RAM or switch to a smaller model; whisper.cpp's memory use scales with model size.
- Ensure your environment paths are correctly resolved if running via custom paths; the dist/index.js file is the entry point after npm run build.