Aliyun TTS
@guang384
npx machina-cli add skill @guang384/aliyun-tts --openclaw
aliyun-tts
Alibaba Cloud Text-to-Speech synthesis service.
Configuration
Set the following environment variables:
- ALIYUN_APP_KEY - Application Key
- ALIYUN_ACCESS_KEY_ID - Access Key ID
- ALIYUN_ACCESS_KEY_SECRET - Access Key Secret (sensitive)
Option 1: CLI configuration (recommended)
# Configure App Key
clawdbot skills config aliyun-tts ALIYUN_APP_KEY "your-app-key"
# Configure Access Key ID
clawdbot skills config aliyun-tts ALIYUN_ACCESS_KEY_ID "your-access-key-id"
# Configure Access Key Secret (sensitive)
clawdbot skills config aliyun-tts ALIYUN_ACCESS_KEY_SECRET "your-access-key-secret"
Option 2: Manual configuration
Edit ~/.clawdbot/clawdbot.json:
{
  "skills": {
    "entries": {
      "aliyun-tts": {
        "env": {
          "ALIYUN_APP_KEY": "your-app-key",
          "ALIYUN_ACCESS_KEY_ID": "your-access-key-id",
          "ALIYUN_ACCESS_KEY_SECRET": "your-access-key-secret"
        }
      }
    }
  }
}
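When the CLI is run directly from a shell rather than through clawdbot, it can help to confirm that all three variables are present before synthesizing. The following is a minimal sketch, assuming the variables are expected in the current shell; `check_aliyun_env` is a hypothetical helper name and is not part of the skill.

```shell
# Hypothetical helper (not part of the skill): report any missing credential
# variable by name, without ever printing its value.
check_aliyun_env() {
  missing=0
  for name in ALIYUN_APP_KEY ALIYUN_ACCESS_KEY_ID ALIYUN_ACCESS_KEY_SECRET; do
    # Indirect lookup via eval keeps this compatible with POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  unset value
  # Exit status 0 when everything is set, 1 otherwise.
  return "$missing"
}
```

Calling `check_aliyun_env || exit 1` at the top of a script stops early with a list of the missing names instead of failing later inside the TTS request.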
Usage
# Basic usage
{baseDir}/bin/aliyun-tts "Hello, this is Aliyun TTS"
# Specify output file
{baseDir}/bin/aliyun-tts -o /tmp/voice.mp3 "Hello"
# Specify voice
{baseDir}/bin/aliyun-tts -v siyue "Use siyue voice"
# Specify format and sample rate
{baseDir}/bin/aliyun-tts -f mp3 -r 16000 "Audio parameters"
Options
| Flag | Description | Default |
|---|---|---|
| -o, --output | Output file path | tts.mp3 |
| -v, --voice | Voice name | siyue |
| -f, --format | Audio format | mp3 |
| -r, --sample-rate | Sample rate (Hz) | 16000 |
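The four flags can be combined in a single call. The sketch below wraps them in a shell function that passes every option explicitly, falling back to the defaults listed in the table. `tts_say`, `ALIYUN_TTS_BIN`, and the `TTS_*` variables are illustrative names introduced here; the CLI itself is not known to read them.

```shell
# Assumption: ALIYUN_TTS_BIN points at {baseDir}/bin/aliyun-tts.
# The TTS_* variables are hypothetical overrides for this wrapper only.
tts_say() {
  text=$1
  "${ALIYUN_TTS_BIN:-aliyun-tts}" \
    -o "${TTS_OUTPUT:-tts.mp3}" \
    -v "${TTS_VOICE:-siyue}" \
    -f "${TTS_FORMAT:-mp3}" \
    -r "${TTS_SAMPLE_RATE:-16000}" \
    "$text"
}
```

For example, `TTS_OUTPUT=/tmp/voice.mp3 tts_say "Hello"` would write the default siyue voice to /tmp/voice.mp3.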
Available Voices
Common voices: siyue, xiaoxuan, xiaoyun, etc. See Alibaba Cloud documentation for the full list.
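To compare voices side by side, one option is to render the same sentence once per voice. This is a rough sketch, not part of the skill: `tts_voice_samples` and `ALIYUN_TTS_BIN` are hypothetical names, and the function assumes the CLI exits with a non-zero status when synthesis fails.

```shell
# Assumption: ALIYUN_TTS_BIN points at {baseDir}/bin/aliyun-tts.
# Usage: tts_voice_samples "text" output-dir voice [voice ...]
tts_voice_samples() {
  text=$1
  out_dir=$2
  shift 2
  mkdir -p "$out_dir" || return 1
  for voice in "$@"; do
    # One file per voice, named after the voice for easy comparison.
    "${ALIYUN_TTS_BIN:-aliyun-tts}" -o "$out_dir/sample-$voice.mp3" -v "$voice" "$text" || return 1
    echo "$out_dir/sample-$voice.mp3"
  done
}
```

For example, `tts_voice_samples "Hello" /tmp/voices siyue xiaoxuan xiaoyun` prints the path of each generated sample.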
Chat Voice Replies
When a user requests a voice reply:
# Generate audio
{baseDir}/bin/aliyun-tts -o /tmp/voice-reply.mp3 "Your reply content"
# Include in your response:
# MEDIA:/tmp/voice-reply.mp3
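The two steps above can be folded into one helper that prints the MEDIA line only when an audio file was actually produced. This is a sketch under assumptions: `tts_voice_reply` and `ALIYUN_TTS_BIN` are hypothetical names, and the CLI is assumed to exit with a non-zero status on failure.

```shell
# Assumption: ALIYUN_TTS_BIN points at {baseDir}/bin/aliyun-tts.
# Usage: tts_voice_reply "reply text" [output-path]
tts_voice_reply() {
  text=$1
  out=${2:-/tmp/voice-reply.mp3}
  "${ALIYUN_TTS_BIN:-aliyun-tts}" -o "$out" "$text" || return 1
  # Guard against an empty or missing file even if the CLI reported success.
  [ -s "$out" ] || return 1
  echo "MEDIA:$out"
}
```

The printed line can then be included in the response as-is; if synthesis fails, nothing is printed and the caller can fall back to a text reply.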
Overview
Aliyun TTS is Alibaba Cloud's Text-to-Speech service that converts text into natural-sounding audio. It supports multiple voices such as siyue, xiaoxuan, and xiaoyun, and outputs in common formats with configurable sample rates. This skill wraps the CLI usage and environment-based configuration to help you generate audio programmatically.
How This Skill Works
Credentials are provided via the environment variables ALIYUN_APP_KEY, ALIYUN_ACCESS_KEY_ID, and ALIYUN_ACCESS_KEY_SECRET. You can configure them with the recommended CLI method (clawdbot skills config aliyun-tts ...) or manually in the ~/.clawdbot/clawdbot.json file. Once configured, use the aliyun-tts CLI to generate speech, specifying the output path, voice, format, and sample rate, and passing the text as a single quoted argument, for example: {baseDir}/bin/aliyun-tts -o /tmp/voice.mp3 -v siyue -f mp3 -r 16000 "text to synthesize"
When to Use It
- Generate audio responses for a chat bot or customer support assistant.
- Create IVR prompts or automated phone system greetings.
- Provide voice onboarding or tutorial narration in a mobile app.
- Add accessibility features by reading articles or content aloud.
- Produce audio for chat voice replies and reference it in the response with a MEDIA:/path line.
Quick Start
- Step 1: Configure credentials using the recommended CLI or manual config file.
- Step 2: Generate audio with the CLI by specifying output, voice, format, and sample rate, plus the text to synthesize.
- Step 3: Use the resulting audio file in your app or response (for example via MEDIA:/path).
Best Practices
- Store and protect ALIYUN_APP_KEY, ALIYUN_ACCESS_KEY_ID, and ALIYUN_ACCESS_KEY_SECRET securely; avoid logging them.
- Test multiple voices (for example siyue, xiaoxuan, xiaoyun) to pick the most natural fit.
- Set the format and sample rate explicitly when the defaults (mp3, 16000 Hz) do not suit the target platform.
- Set a dedicated output path with -o rather than relying on the default tts.mp3, so generated files are easy to track and clean up.
- Cache or reuse generated audio where possible to reduce repeated TTS requests and costs.
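The caching advice in the list above could look like the following sketch, which keys each file on the voice and text so that a repeated request reuses the existing audio. `tts_cached`, `ALIYUN_TTS_BIN`, and `TTS_CACHE_DIR` are hypothetical names, not part of the skill, and the CLI is assumed to exit with a non-zero status on failure.

```shell
# Assumption: ALIYUN_TTS_BIN points at {baseDir}/bin/aliyun-tts.
# Usage: tts_cached "text" [voice]  -> prints the path of the audio file
tts_cached() {
  text=$1
  voice=${2:-siyue}
  cache_dir=${TTS_CACHE_DIR:-/tmp/aliyun-tts-cache}
  mkdir -p "$cache_dir" || return 1
  # cksum is POSIX; its CRC plus byte count is enough for a small local
  # cache, though it is not collision-proof.
  key=$(printf '%s\n%s' "$voice" "$text" | cksum | tr ' ' '-')
  out="$cache_dir/$key.mp3"
  # Only call the TTS service when no non-empty cached file exists.
  if [ ! -s "$out" ]; then
    "${ALIYUN_TTS_BIN:-aliyun-tts}" -o "$out" -v "$voice" "$text" || return 1
  fi
  echo "$out"
}
```

A stronger hash could replace cksum if the cache grows large; the format and sample rate would also need to be part of the key if they vary between calls.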
Example Use Cases
- Generate a chat response audio file at /tmp/chat-reply.mp3 using the siyue voice.
- Create an IVR greeting at /tmp/ivr-hello.mp3 with the siyue voice for a phone system.
- Produce onboarding narration for a mobile app in mp3 format with a 16000 Hz sample rate.
- Enable accessibility by reading a long article aloud and saving as /tmp/article.mp3.
- Prepare a chat voice reply and reference it with MEDIA:/tmp/voice-reply.mp3 in the response.