🗣️ Edge-TTS Skill using uvx
Scanned @al-one
npx machina-cli add skill @al-one/edge-tts-uvx --openclaw
Edge-TTS Skill
Generate high-quality text-to-speech audio using Microsoft Edge's neural TTS service via the edge-tts Python package, run through uvx. Supports multiple languages and voices, adjustable rate (speed), volume, and pitch, and subtitle generation.
Usage
uvx edge-tts --text "{msg}" --write-media {tempdir}/{filename}.mp3
# With subtitles
uvx edge-tts --text "{msg}" --write-media {tempdir}/{filename}.mp3 --write-subtitles -
Changing rate (speed), volume, and pitch
uvx edge-tts --text "{msg}" --write-media {tempdir}/{filename}.mp3 --rate=+50%
uvx edge-tts --text "{msg}" --write-media {tempdir}/{filename}.mp3 --volume=+50% --pitch=-50Hz
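The adjustment values above all carry an explicit sign and a unit. As far as I know, edge-tts rejects unsigned values, and a negative value has to be attached with `=` (as in `--pitch=-50Hz`) so the argument parser does not read the leading `-` as a new option. The sketch below shows one way to build such values; `signed_arg` is a hypothetical helper name, not part of edge-tts.

```shell
# signed_arg VALUE UNIT
# Hypothetical helper: prints VALUE with an explicit sign and the given unit,
# e.g. "+50%" or "-10Hz". It does not validate that VALUE is an integer.
signed_arg() (
  value=$1   # integer such as 50, +50 or -10
  unit=$2    # "%" for --rate and --volume, "Hz" for --pitch
  case $value in
    -*|+*) printf '%s%s\n' "$value" "$unit" ;;   # sign already present
    *)     printf '+%s%s\n' "$value" "$unit" ;;  # add the implied plus sign
  esac
)

# Usage (not run here; needs network access):
#   uvx edge-tts --text "Hello" --write-media hello.mp3 "--rate=$(signed_arg -20 '%')"
```

Keeping the `--option=value` form for every adjustment means positive and negative values go through the same code path.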
Changing the voice
uvx edge-tts --text "{msg}" --write-media {tempdir}/{filename}.mp3 --voice=zh-CN-XiaoxiaoNeural
Available voices
| Name | Gender | ContentCategories | VoicePersonalities |
| --- | --- | --- | --- |
| en-GB-LibbyNeural | Female | General | Friendly, Positive |
| en-GB-RyanNeural | Male | General | Friendly, Positive |
| en-GB-SoniaNeural | Female | General | Friendly, Positive |
| en-GB-ThomasNeural | Male | General | Friendly, Positive |
| en-HK-SamNeural | Male | General | Friendly, Positive |
| en-HK-YanNeural | Female | General | Friendly, Positive |
| en-US-AnaNeural | Female | Cartoon, Conversation | Cute |
| en-US-AndrewMultilingualNeural | Male | Conversation, Copilot | Warm, Confident, Authentic, Honest |
| en-US-AndrewNeural | Male | Conversation, Copilot | Warm, Confident, Authentic, Honest |
| en-US-AriaNeural | Female | News, Novel | Positive, Confident |
| en-US-AvaMultilingualNeural | Female | Conversation, Copilot | Expressive, Caring, Pleasant, Friendly |
| en-US-AvaNeural | Female | Conversation, Copilot | Expressive, Caring, Pleasant, Friendly |
| en-US-BrianMultilingualNeural | Male | Conversation, Copilot | Approachable, Casual, Sincere |
| en-US-BrianNeural | Male | Conversation, Copilot | Approachable, Casual, Sincere |
| en-US-ChristopherNeural | Male | News, Novel | Reliable, Authority |
| en-US-EmmaMultilingualNeural | Female | Conversation, Copilot | Cheerful, Clear, Conversational |
| en-US-EmmaNeural | Female | Conversation, Copilot | Cheerful, Clear, Conversational |
| en-US-EricNeural | Male | News, Novel | Rational |
| en-US-GuyNeural | Male | News, Novel | Passion |
| en-US-JennyNeural | Female | General | Friendly, Considerate, Comfort |
| en-US-MichelleNeural | Female | News, Novel | Friendly, Pleasant |
| en-US-RogerNeural | Male | News, Novel | Lively |
| en-US-SteffanNeural | Male | News, Novel | Rational |
| fr-FR-DeniseNeural | Female | General | Friendly, Positive |
| fr-FR-HenriNeural | Male | General | Friendly, Positive |
| zh-CN-XiaoxiaoNeural | Female | News, Novel | Warm |
| zh-CN-YunjianNeural | Male | Sports, Novel | Passion |
| zh-CN-liaoning-XiaobeiNeural | Female | Dialect | Humorous |
| zh-CN-shaanxi-XiaoniNeural | Female | Dialect | Bright |
| zh-HK-HiuGaaiNeural | Female | General | Friendly, Positive |
| zh-HK-WanLungNeural | Male | General | Friendly, Positive |
| zh-TW-HsiaoChenNeural | Female | General | Friendly, Positive |
| zh-TW-YunJheNeural | Male | General | Friendly, Positive |
The voices listed above are only a selection. To print every available voice, run:
uvx edge-tts --list-voices
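The full listing is long, so it can help to narrow it to one locale. The sketch below assumes the voice name is the first whitespace-separated column of the listing, as in the voice list shown earlier; older edge-tts releases may print a different layout. `voices_for_locale` is a hypothetical helper name.

```shell
# voices_for_locale LOCALE
# Hypothetical helper: reads a voice listing on stdin and prints the voice
# names (first column) that begin with "LOCALE-", for example "en-GB-".
voices_for_locale() {
  awk -v prefix="$1-" 'index($1, prefix) == 1 { print $1 }'
}

# Usage (not run here; needs network access):
#   uvx edge-tts --list-voices | voices_for_locale en-GB
```

Because this is a plain prefix match, `zh-CN` also returns the dialect voices such as zh-CN-liaoning-XiaobeiNeural.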
Overview
Edge-TTS Skill generates audio from text using Microsoft Edge's neural TTS service through the edge-tts Python package, invoked with uvx. It supports many languages and voices, adjustable rate, volume, and pitch, and optional subtitles for captions.
How This Skill Works
The skill executes the uvx edge-tts CLI, feeding text with --text and saving audio via --write-media. You can tailor the voice with --voice and adjust speed, volume, and pitch using --rate, --volume, and --pitch; --write-subtitles enables subtitle generation for accessibility.
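The flow described above can be sketched as a small wrapper that assembles the argument list and then runs it. This is only an illustration: the function name `tts`, its argument order, and the `DRY_RUN` switch are assumptions made for this example, not part of the skill or of edge-tts.

```shell
# tts TEXT OUTFILE [VOICE] [RATE]
# Hypothetical wrapper around "uvx edge-tts". The body runs in a subshell, so
# its variables do not leak into the caller. With DRY_RUN=1 it prints the
# command, one argument per line, instead of running it.
tts() (
  text=$1
  out=$2
  voice=${3:-}
  rate=${4:-}

  # Required arguments first, optional flags appended only when given.
  set -- uvx edge-tts --text "$text" --write-media "$out"
  if [ -n "$voice" ]; then set -- "$@" "--voice=$voice"; fi
  if [ -n "$rate" ]; then set -- "$@" "--rate=$rate"; fi

  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$@"
  else
    "$@"
  fi
)

# Usage (needs network access unless DRY_RUN=1):
#   tts "Hello, world" /tmp/hello.mp3 en-US-AriaNeural +10%
```

Building the command with `set --` keeps each argument intact even when the text contains spaces, without relying on shell arrays.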
When to Use It
- When a user requests audio output via the 'tts' trigger or keyword
- When content should be spoken rather than read (multitasking, accessibility, driving, cooking)
- When you need a specific voice, rate, volume, or pitch
- When subtitles or captions are required alongside speech
- When producing multilingual content with appropriate language voices
Quick Start
- Step 1: Prepare your text and choose a target voice
- Step 2: Run uvx edge-tts --text "{msg}" --write-media {tempdir}/{filename}.mp3
- Step 3 (optional): Add --rate, --volume, --pitch, or --voice to customize the output, or append --write-subtitles - (the trailing hyphen is the option's argument) to generate subtitles
Best Practices
- Test multiple voices to match the content and audience
- Tune rate, volume, and pitch for natural-sounding speech
- Use --write-subtitles for accessibility and discoverability
- Cache or reuse common audio to reduce repeated TTS calls
- Verify language support and select the correct locale/voice for each target audience
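The caching advice above could be implemented by deriving the output path from the request itself, so that repeating a request reuses the existing file. The sketch below is one possible approach, not something the skill provides: `cache_path` and `speak_cached` are hypothetical names, the default cache directory is an assumption, and `sha256sum` is assumed to be installed (on macOS, `shasum -a 256` is the usual substitute).

```shell
# cache_path VOICE TEXT [DIR]
# Hypothetical helper: prints a deterministic MP3 path derived from the voice
# and the text, so the same request always maps to the same file.
cache_path() (
  voice=$1
  text=$2
  dir=${3:-${TMPDIR:-/tmp}/tts-cache}
  key=$(printf '%s\n%s' "$voice" "$text" | sha256sum | cut -d' ' -f1)
  printf '%s/%s.mp3\n' "$dir" "$key"
)

# speak_cached VOICE TEXT
# Calls edge-tts only when no non-empty cached file exists, then prints the path.
speak_cached() (
  voice=$1
  text=$2
  out=$(cache_path "$voice" "$text")
  mkdir -p "$(dirname "$out")"
  if [ ! -s "$out" ]; then
    # "exit" only leaves this function's subshell, not the calling shell.
    uvx edge-tts --text "$text" "--voice=$voice" --write-media "$out" || exit 1
  fi
  printf '%s\n' "$out"
)
```

If rate, volume, or pitch also vary between calls, they would need to be folded into the key as well.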
Example Use Cases
- Read a blog post aloud as an MP3 using en-US-AriaNeural with subtitles
- Narrate cooking steps with en-US-AvaNeural at a +10% rate for clarity
- Provide bilingual product descriptions using English and Chinese voices
- Accessibility read-aloud for visually impaired users with adjustable pitch
- Podcast intro narration with a friendly voice and subtle pacing
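For the bilingual use case, one simple approach is a lookup from a language code to a voice. The mapping below is an illustrative assumption: the voice names come from the voice list earlier in this document, but the choice of which voice represents each language, and the helper name `voice_for_lang`, are made up for this example.

```shell
# voice_for_lang LANG
# Hypothetical helper: prints a voice for a two-letter language code and
# returns non-zero for languages it has no mapping for.
voice_for_lang() {
  case $1 in
    en) echo en-US-AriaNeural ;;
    zh) echo zh-CN-XiaoxiaoNeural ;;
    fr) echo fr-FR-DeniseNeural ;;
    *)  return 1 ;;
  esac
}

# Usage (not run here; needs network access):
#   uvx edge-tts --text "你好" --write-media zh.mp3 "--voice=$(voice_for_lang zh)"
#   uvx edge-tts --text "Hello" --write-media en.mp3 "--voice=$(voice_for_lang en)"
```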