Zvukogram
@erview
npx machina-cli add skill @erview/zvukogram --openclaw
Zvukogram TTS
Speech generation via Zvukogram API with SSML markup support.
Requirements
To use this skill, you need:
- Zvukogram API token — get it at https://zvukogram.com/
- Zvukogram account email
Setup
Create the config directory, then save the following as ~/.config/zvukogram/config.json:
mkdir -p ~/.config/zvukogram
{
  "token": "your_api_token_here",
  "email": "your_email@example.com"
}
Or use environment variables:
export ZVUKOGRAM_TOKEN=your_api_token_here
export ZVUKOGRAM_EMAIL=your_email@example.com
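The two setup options could be resolved in code roughly as follows. This is a minimal sketch: the function name `load_credentials` is hypothetical, and the precedence (environment variables first, then the config file) is an assumption, since the skill's own scripts may resolve the two sources differently.

```python
import json
import os
from pathlib import Path

# Location taken from the Setup section above.
CONFIG_PATH = Path.home() / ".config" / "zvukogram" / "config.json"


def load_credentials(config_path=CONFIG_PATH, env=None):
    """Return (token, email), preferring environment variables.

    Assumption: environment variables override the config file. The real
    scripts may use a different order.
    """
    env = os.environ if env is None else env
    token = env.get("ZVUKOGRAM_TOKEN")
    email = env.get("ZVUKOGRAM_EMAIL")
    if token and email:
        return token, email

    # Fall back to the JSON file for whichever value is still missing.
    path = Path(config_path)
    if path.is_file():
        data = json.loads(path.read_text(encoding="utf-8"))
        token = token or data.get("token")
        email = email or data.get("email")

    if not token or not email:
        raise RuntimeError("Zvukogram token and email are not configured")
    return token, email
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.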
Quick Start
# Simple TTS
python3 scripts/tts.py --text "Hello, world!" --voice Алена --output hello.mp3
# With +20% speed
python3 scripts/tts.py --text "Fast text" --voice Алена --speed 1.2 --output fast.mp3
# Check balance
python3 scripts/balance.py
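When several fragments need to be generated in a row, the same command can be driven from Python. The sketch below only uses the flags shown above (`--text`, `--voice`, `--speed`, `--output`); the helper names `build_tts_command` and `synthesize` are hypothetical, and it assumes the script is run from the skill's root directory.

```python
import subprocess
import sys


def build_tts_command(text, voice, output, speed=None, script="scripts/tts.py"):
    """Build the argument list for scripts/tts.py using the flags shown above."""
    cmd = [sys.executable, script, "--text", text, "--voice", voice]
    if speed is not None:
        cmd += ["--speed", str(speed)]
    cmd += ["--output", output]
    return cmd


def synthesize(text, voice, output, speed=None):
    """Run the script; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_tts_command(text, voice, output, speed), check=True)
```

For example, `synthesize("Fast text", "Алена", "fast.mp3", speed=1.2)` would run the second command from the list above.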
Features
- TTS generation — text to speech
- SSML support — stress marks, pauses, speed
- Audio merging — combine fragments via ffmpeg
- Transcription — proper pronunciation of English words
SSML Markup
Stress Marks
Use + before stressed vowel:
З+амок — stress on "a"
зам+ок — stress on "o"
Aliases (Transcription)
<sub alias="Оупен Эй Ай">OpenAI</sub>
<sub alias="Самсунг">Samsung</sub>
<sub alias="Ал+ьтман">Альтман</sub>
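For texts with many English terms, the alias tags can be applied from a lookup table instead of by hand. This is a minimal sketch: the `ALIASES` table and the `apply_aliases` helper are hypothetical, and the entries are copied from the examples in this document.

```python
import re

# Illustrative mapping; entries come from the examples in this document.
ALIASES = {
    "OpenAI": "Оупен Эй Ай",
    "Samsung": "Самсунг",
}


def apply_aliases(text, aliases=ALIASES):
    """Wrap each known term in a <sub alias="..."> tag (whole words only)."""
    if not aliases:
        return text
    # Longest terms first so a longer term wins over a shorter prefix.
    terms = sorted(aliases, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, terms)) + r")\b")
    return pattern.sub(
        lambda m: f'<sub alias="{aliases[m.group(1)]}">{m.group(1)}</sub>', text
    )
```

The function expects plain text: running it on text that already contains `<sub>` tags would wrap the inner term a second time.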
Speed
<prosody rate="1.2">20% faster</prosody>
<prosody rate="fast">Fast text</prosody>
Pauses
<break time="500ms"/>
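The speed and pause tags above are simple enough to generate with small helpers. The function names below are hypothetical; the tag shapes are exactly those shown in this section.

```python
def prosody(text, rate):
    """Wrap text in <prosody>; rate is a multiplier (1.2) or a keyword ("fast")."""
    return f'<prosody rate="{rate}">{text}</prosody>'


def pause(milliseconds):
    """Return a <break> tag for a pause of the given length."""
    if milliseconds < 0:
        raise ValueError("pause length must be non-negative")
    return f'<break time="{int(milliseconds)}ms"/>'


def join_with_pauses(sentences, milliseconds=500):
    """Join sentences with a pause tag between each pair."""
    return pause(milliseconds).join(sentences)
```

For example, `join_with_pauses(["First.", "Second."], 300)` puts a 300 ms break between the two sentences.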
Available Voices
- Алена — female, neutral (recommended)
- Андрей — male, neutral (recommended)
- Александра — female, soft
- Антон — male, business
Full list: see references/VOICES.md
Examples
See references/EXAMPLES.md for:
- Dialogs and podcasts
- News voiceover
- Voice notifications
- Long texts
Transcription
See references/TRANSCRIPTION.md for proper pronunciation:
- OpenAI → Оупен Эй Ай
- GPT → Джи Пи Ти
- Samsung → Самсунг
- Altman → Ал+ьтман
SSML Reference
See references/SSML_CHEATSHEET.md for quick tag lookup.
Troubleshooting
See references/TROUBLESHOOTING.md for:
- API errors
- Audio issues
- Diagnostics
API Limitations
- Max 1000 characters per request (/text)
- Up to 1M characters via /longtext
- SSML with <voice> not supported via API (web only)
- For multi-voice audio, merge fragments
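To stay under the 1000-character limit of /text, a long script can be split at sentence boundaries before it is sent. This is a sketch for plain text only: the helper `split_text` is hypothetical, it does not account for SSML tags spanning a split point, and whether markup counts toward the limit is not stated here, so the sketch simply counts every character.

```python
import re

MAX_CHARS = 1000  # per-request limit for /text, as stated above


def split_text(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters.

    Chunks end after sentence punctuation where possible; a single sentence
    longer than the limit is cut at the limit.
    """
    sentences = re.split(r"(?<=[.!?…])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        while len(sentence) > limit:  # hard-cut an oversized sentence
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence.strip()
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be synthesized separately and the resulting audio fragments merged.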
Links
- API docs: https://zvukogram.com/node/api/
- Voice rating: https://zvukogram.com/rating/
- Support: https://t.me/zvukogram
Overview
Zvukogram TTS generates speech from text via the Zvukogram API, with SSML markup for speed, pauses, and stress marks, plus alias-based transcription of English words. Audio fragments can be merged with ffmpeg for longer or multi-voice content such as podcasts and voice notifications.
How This Skill Works
Send text and SSML to the Zvukogram API using your token and account email; the API returns synthesized audio. For longer content, merge fragments with ffmpeg. SSML enables stress marks, pauses, and speed adjustments to fine-tune pronunciation and prosody.
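The merge step can be sketched with ffmpeg's concat demuxer, which reads a list of input files and copies the streams without re-encoding. The helper names are hypothetical, and the sketch assumes all fragments share the same codec, sample rate, and channel layout; the skill's own scripts may merge differently.

```python
import subprocess
from pathlib import Path


def write_concat_list(fragments, list_path):
    """Write the file list that ffmpeg's concat demuxer reads."""
    lines = []
    for fragment in fragments:
        # A single quote inside a path is written as '\'' in the list file.
        escaped = str(Path(fragment).resolve()).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def build_merge_command(list_path, output):
    """Concatenate without re-encoding; fragments must share codec settings."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", str(list_path), "-c", "copy", str(output)]


def merge_fragments(fragments, output, list_path="fragments.txt"):
    """Write the list file, then run ffmpeg (must be installed and on PATH)."""
    write_concat_list(fragments, list_path)
    subprocess.run(build_merge_command(list_path, output), check=True)
```

For example, `merge_fragments(["part1.mp3", "part2.mp3"], "full.mp3")` would join two fragments in order. If the fragments differ in encoding settings, re-encoding instead of `-c copy` is likely needed.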
When to Use It
- Create podcasts or voiceover content from scripts
- Send voice notifications or alerts with precise timing
- Voice text that mixes in English names or terms, using transcription aliases to control pronunciation
- Assemble long-form audio by merging multiple fragments
- Prototype and test different voices and prosody before final production
Quick Start
- Step 1: Create ~/.config/zvukogram/config.json with your token and email or export ZVUKOGRAM_TOKEN and ZVUKOGRAM_EMAIL
- Step 2: Run a simple TTS: python3 scripts/tts.py --text "Hello, world!" --voice Алена --output hello.mp3
- Step 3: Check balance (optional): python3 scripts/balance.py
Best Practices
- Use SSML stress marks and pauses to improve naturalness
- Leverage transcription aliases for tricky English words
- Experiment with <prosody> rate values to match desired pacing
- Split long scripts into fragments and merge them with ffmpeg
- Validate character limits: <=1000 chars per /text, up to 1M via /longtext
Example Use Cases
- Dialogs and podcasts with scripted narration
- News-like voiceover segments for quick updates
- Timed voice notifications for apps and devices
- Long-form audio projects created by merging fragments
- Pronunciation tweaks for English terms using transcription aliases