local-stt

A high-performance Model Context Protocol (MCP) server providing local speech-to-text transcription using whisper.cpp, optimized for Apple Silicon.

Installation
Run this command in your terminal to add the MCP server to Claude Code.
claude mcp add --transport stdio --env HF_TOKEN="your_token_here" smartlittleapps-local-stt-mcp \
  -- node path/to/local-stt-mcp/mcp-server/dist/index.js

How to use

local-stt is a privacy-first speech-to-text MCP server that runs entirely on your machine and is optimized for Apple Silicon. It uses whisper.cpp for fast transcription, with optional speaker diarization and automatic audio format conversion via ffmpeg. The server exposes MCP tools such as transcribe, transcribe_long, transcribe_with_speakers, list_models, health_check, and version, and can return transcripts in several formats (txt, json, vtt, srt, csv). To begin, build the server and point your MCP client at its dist/index.js entry point, as shown in the installation steps. To use speaker diarization, supply a Hugging Face token (HF_TOKEN) so the server can access the diarization models.

Once the server is running, you can call tools like transcribe for standard transcription with automatic format conversion, transcribe_with_speakers for diarized transcripts with speaker labels, and transcribe_long for splitting long audio into manageable chunks. You can also query available models with list_models to see which whisper models are installed and supported, perform health checks with health_check to confirm system readiness, and fetch version information with version for debugging or compatibility checks.
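As a rough illustration of what a tool call looks like on the wire, the sketch below builds an MCP JSON-RPC 2.0 tools/call request in the shell. The method name and envelope follow the MCP specification; the argument names audio_path and output_format are hypothetical placeholders, since the actual tool parameters are not listed here. A real client also performs an initialize handshake before calling tools, which this sketch leaves out.

```shell
# Build a JSON-RPC 2.0 "tools/call" request for an MCP server.
# $1 = request id (number), $2 = tool name, $3 = JSON object of tool arguments.
mcp_tool_call() {
  printf '{"jsonrpc":"2.0","id":%s,"method":"tools/call","params":{"name":"%s","arguments":%s}}\n' \
    "$1" "$2" "$3"
}

# Hypothetical argument names; check the tool's input schema for the real ones.
mcp_tool_call 1 transcribe '{"audio_path":"/path/to/audio.wav","output_format":"srt"}'
```

In practice an MCP client such as Claude Code assembles these messages for you; the sketch is only meant to show where the tool name and its parameters go.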

How to install

Prerequisites

  • Node.js 18+ installed on your system
  • whisper.cpp installed (brew install whisper-cpp on macOS with Homebrew)
  • ffmpeg installed for audio format conversions (brew install ffmpeg)
  • Optional: Hugging Face token for speaker diarization (free token at huggingface.co)
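The prerequisites above can be checked before installing with a small script. This is a minimal sketch, not part of the project: the helper names missing_tools and node_major_ok are made up for illustration, and the whisper.cpp binary is left out because its name depends on how it was installed.

```shell
# Print the names of any listed commands that are not on PATH (space-separated).
missing_tools() {
  missing=""
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  # Drop the leading space before printing.
  printf '%s\n' "${missing# }"
}

# Succeed only if a version string such as "v18.19.0" has major version 18 or higher.
node_major_ok() {
  major=${1#v}        # strip the leading "v"
  major=${major%%.*}  # keep only the part before the first dot
  [ "$major" -ge 18 ] 2>/dev/null
}
```

Typical use would be to run missing_tools node ffmpeg and then node_major_ok "$(node --version)", stopping if the first prints anything or the second fails.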

Installation steps

  1. Clone the repository and navigate to the MCP server folder
git clone https://github.com/your-username/local-stt-mcp.git
cd local-stt-mcp/mcp-server
  2. Install dependencies and build
npm install
npm run build
  3. Download whisper models (if the project provides a script)
npm run setup:models
  4. Set up a Hugging Face token if you plan to use speaker diarization
export HF_TOKEN="your_token_here"  # Get a free token from huggingface.co
  5. Run the MCP server
npm run start

Note: If you’re integrating with an MCP client, configure the client to point to the Node entry file, for example:

{
  "mcpServers": {
    "local-stt": {
      "command": "node",
      "args": ["path/to/local-stt-mcp/mcp-server/dist/index.js"]
    }
  }
}
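If you use speaker diarization, the token also has to reach the server process. Many MCP clients accept an "env" object next to "command" and "args" for this; the sketch below assumes your client does too, so check its documentation. The helper name print_mcp_config is hypothetical, and the function only prints a config for you to paste into the client's settings file.

```shell
# Print an mcpServers entry that passes HF_TOKEN to the server process.
# $1 = path to dist/index.js, $2 = Hugging Face token.
# Assumes the MCP client honors an "env" object in its server config.
print_mcp_config() {
  cat <<EOF
{
  "mcpServers": {
    "local-stt": {
      "command": "node",
      "args": ["$1"],
      "env": { "HF_TOKEN": "$2" }
    }
  }
}
EOF
}

print_mcp_config "path/to/local-stt-mcp/mcp-server/dist/index.js" "your_token_here"
```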

Additional notes

Tips and caveats:

  • For best Apple Silicon performance, ensure you use the optimized whisper.cpp setup and keep ffmpeg up to date.
  • If diarization is not needed, you can omit HF_TOKEN and related diarization models to save resources.
  • The server supports multiple output formats; specify the desired format in the MCP client requests via the tool parameters.
  • If you encounter memory issues, check that the system has enough free RAM or switch to a smaller model; whisper's memory use grows with model size.
  • Ensure your environment paths are correctly resolved if running via custom paths; the dist/index.js file is the entry point after npm run build.
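For the path caveat in the last tip, one option is to resolve the entry point to an absolute path before putting it in the client config. The helper below is a sketch with a hypothetical name, resolve_entry; it only checks that the file exists and prints its absolute location.

```shell
# Print the absolute path of an entry-point file, or fail if it does not exist.
# $1 = relative or absolute path to dist/index.js.
resolve_entry() {
  entry=$1
  if [ ! -f "$entry" ]; then
    echo "entry point not found: $entry (did npm run build succeed?)" >&2
    return 1
  fi
  # cd into the containing directory so pwd reports it as an absolute path.
  dir=$(cd "$(dirname "$entry")" && pwd) || return 1
  printf '%s/%s\n' "$dir" "$(basename "$entry")"
}
```

Run from the mcp-server folder after building, resolve_entry dist/index.js prints a path that can be used as the "args" value regardless of the client's working directory.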