Aliyun TTS

@guang384

npx machina-cli add skill @guang384/aliyun-tts --openclaw

aliyun-tts

Alibaba Cloud Text-to-Speech synthesis service.

Configuration

Set the following environment variables:

  • ALIYUN_APP_KEY - Application Key
  • ALIYUN_ACCESS_KEY_ID - Access Key ID
  • ALIYUN_ACCESS_KEY_SECRET - Access Key Secret (sensitive)

Option 1: CLI configuration (recommended)

# Configure App Key
clawdbot skills config aliyun-tts ALIYUN_APP_KEY "your-app-key"

# Configure Access Key ID
clawdbot skills config aliyun-tts ALIYUN_ACCESS_KEY_ID "your-access-key-id"

# Configure Access Key Secret (sensitive)
clawdbot skills config aliyun-tts ALIYUN_ACCESS_KEY_SECRET "your-access-key-secret"

Option 2: Manual configuration

Edit ~/.clawdbot/clawdbot.json:

{
  skills: {
    entries: {
      "aliyun-tts": {
        env: {
          ALIYUN_APP_KEY: "your-app-key",
          ALIYUN_ACCESS_KEY_ID: "your-access-key-id",
          ALIYUN_ACCESS_KEY_SECRET: "your-access-key-secret"
        }
      }
    }
  }
}
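After either configuration option, it can help to confirm that all three variables are actually visible to the process that runs the skill. The following is a minimal sketch, not part of the skill itself: check_aliyun_env is a hypothetical helper name, and it assumes it is run in the same environment the skill executes in. It prints variable names only, never their values.

```shell
# check_aliyun_env: report which required variables are unset or empty.
# Hypothetical helper; returns 0 when all three are set, 1 otherwise.
check_aliyun_env() {
  _missing=0
  for _name in ALIYUN_APP_KEY ALIYUN_ACCESS_KEY_ID ALIYUN_ACCESS_KEY_SECRET; do
    # Look up the variable whose name is stored in $_name (POSIX-safe indirection).
    eval "_value=\${$_name:-}"
    if [ -z "$_value" ]; then
      # Print the name only, so secrets never reach the log.
      echo "missing: $_name" >&2
      _missing=1
    fi
  done
  unset _value
  return "$_missing"
}
```

Calling check_aliyun_env before the first synthesis request turns a misconfiguration into a clear message instead of an authentication error from the service.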

Usage

# Basic usage
{baseDir}/bin/aliyun-tts "Hello, this is Aliyun TTS"

# Specify output file
{baseDir}/bin/aliyun-tts -o /tmp/voice.mp3 "Hello"

# Specify voice
{baseDir}/bin/aliyun-tts -v siyue "Use siyue voice"

# Specify format and sample rate
{baseDir}/bin/aliyun-tts -f mp3 -r 16000 "Audio parameters"

Options

Flag               Description       Default
-o, --output       Output file path  tts.mp3
-v, --voice        Voice name        siyue
-f, --format       Audio format      mp3
-r, --sample-rate  Sample rate (Hz)  16000

Available Voices

Common voices: siyue, xiaoxuan, xiaoyun, etc. See Alibaba Cloud documentation for the full list.
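To compare voices by ear, one approach is to render the same sentence once per voice. The sketch below assumes a hypothetical TTS_BIN variable holding the resolved path of {baseDir}/bin/aliyun-tts; sample_voices is an illustrative name, and only the -v and -o flags documented above are used.

```shell
# sample_voices TEXT OUT_DIR VOICE...
# Writes OUT_DIR/<voice>.mp3 for each voice name given.
# TTS_BIN (hypothetical) must point at the aliyun-tts CLI.
sample_voices() {
  _text=$1
  _out_dir=$2
  shift 2
  mkdir -p "$_out_dir" || return 1
  for _voice in "$@"; do
    # Stop at the first failure so a bad voice name is noticed immediately.
    "$TTS_BIN" -v "$_voice" -o "$_out_dir/$_voice.mp3" "$_text" || return 1
  done
}

# Example call (requires valid credentials):
#   sample_voices "Hello, this is Aliyun TTS" /tmp/voice-samples siyue xiaoxuan xiaoyun
```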

Chat Voice Replies

When a user requests a voice reply:

# Generate audio
{baseDir}/bin/aliyun-tts -o /tmp/voice-reply.mp3 "Your reply content"

# Include in your response:
# MEDIA:/tmp/voice-reply.mp3
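A fixed path such as /tmp/voice-reply.mp3 can be overwritten when two replies are generated at nearly the same time. One possible way around that, sketched here under the assumption that a hypothetical TTS_BIN variable holds the resolved path of {baseDir}/bin/aliyun-tts, is to write each reply into its own temporary directory and print the matching MEDIA: line. The helper name voice_reply is illustrative.

```shell
# voice_reply TEXT
# Synthesizes TEXT into a fresh temporary directory and prints the MEDIA: line.
# TTS_BIN (hypothetical) must point at the aliyun-tts CLI.
voice_reply() {
  # A per-reply directory keeps concurrent replies from clobbering each other.
  _reply_dir=$(mktemp -d) || return 1
  _reply_file="$_reply_dir/voice-reply.mp3"
  "$TTS_BIN" -o "$_reply_file" "$1" || return 1
  printf 'MEDIA:%s\n' "$_reply_file"
}
```

The printed line can be included in the response as-is. Cleaning up old temporary directories is left to the caller.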

Source

git clone https://clawhub.ai/guang384/aliyun-tts

Overview

Aliyun TTS is Alibaba Cloud's Text-to-Speech service that converts text into natural-sounding audio. It supports multiple voices such as siyue, xiaoxuan, and xiaoyun, and outputs in common formats with configurable sample rates. This skill wraps the CLI usage and environment-based configuration to help you generate audio programmatically.

How This Skill Works

Credentials are supplied through the environment variables ALIYUN_APP_KEY, ALIYUN_ACCESS_KEY_ID, and ALIYUN_ACCESS_KEY_SECRET. Configure them with the recommended CLI method (clawdbot skills config aliyun-tts ...) or by editing ~/.clawdbot/clawdbot.json. Once they are set, run the aliyun-tts CLI with the output path, voice, format, and sample rate, passing the text as a single quoted argument, for example: {baseDir}/bin/aliyun-tts -o /tmp/voice.mp3 -v siyue -f mp3 -r 16000 "text to synthesize".

When to Use It

  • Generate audio responses for a chat bot or customer support assistant.
  • Create IVR prompts or automated phone system greetings.
  • Provide voice onboarding or tutorial narration in a mobile app.
  • Add accessibility features by reading articles or content aloud.
  • Produce audio for chat voice replies and embed it with MEDIA:/path.

Quick Start

  1. Configure credentials using the recommended CLI or the manual config file.
  2. Generate audio with the CLI by specifying output, voice, format, and sample rate, plus the text to synthesize.
  3. Use the resulting audio file in your app or response (for example via MEDIA:/path).
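Steps 2 and 3 can be sketched as one small function that passes every option explicitly and then checks that audio was actually written. This is an illustration only: synthesize is a hypothetical name, TTS_BIN is a hypothetical variable holding the resolved path of {baseDir}/bin/aliyun-tts, and the fallback values mirror the defaults in the Options table.

```shell
# synthesize TEXT [OUTPUT] [VOICE]
# Runs the CLI with explicit options and prints the output path on success.
# TTS_BIN (hypothetical) must point at the aliyun-tts CLI.
synthesize() {
  _text=$1
  _out=${2:-tts.mp3}    # documented default output path
  _voice=${3:-siyue}    # documented default voice
  "$TTS_BIN" -o "$_out" -v "$_voice" -f mp3 -r 16000 "$_text" || return 1
  # Treat a missing or empty file as a failure even if the CLI exited with 0.
  if [ ! -s "$_out" ]; then
    echo "no audio written to $_out" >&2
    return 1
  fi
  printf '%s\n' "$_out"
}
```

The printed path can then be used directly, for instance by prefixing it with MEDIA: in a chat response.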

Best Practices

  • Store and protect ALIYUN_APP_KEY, ALIYUN_ACCESS_KEY_ID, and ALIYUN_ACCESS_KEY_SECRET securely; avoid logging them.
  • Test multiple voices (for example siyue, xiaoxuan, xiaoyun) to pick the most natural fit.
  • Choose a format and sample rate suited to the playback target; the defaults are mp3 and 16000 Hz.
  • Always set a dedicated output path with -o to manage generated files cleanly.
  • Cache or reuse generated audio where possible to reduce repeated TTS requests and costs.
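The caching advice above can be sketched as a small wrapper that keys each audio file on the voice and the text, so a repeated request reuses the existing file. This is one possible approach rather than a feature of the skill: cached_tts and TTS_CACHE_DIR are hypothetical names, TTS_BIN is a hypothetical variable holding the resolved path of {baseDir}/bin/aliyun-tts, and the sketch assumes sha256sum (GNU coreutils) is installed.

```shell
# cached_tts VOICE TEXT
# Prints the path of an mp3 for (VOICE, TEXT), synthesizing only on a cache miss.
# TTS_BIN and TTS_CACHE_DIR are hypothetical; sha256sum is assumed to be available.
cached_tts() {
  _voice=$1
  _text=$2
  _cache_dir=${TTS_CACHE_DIR:-/tmp/aliyun-tts-cache}
  mkdir -p "$_cache_dir" || return 1
  # Hash voice and text together: the same text in another voice is different audio.
  _key=$(printf '%s\n%s' "$_voice" "$_text" | sha256sum | cut -d' ' -f1)
  _out="$_cache_dir/$_key.mp3"
  if [ ! -s "$_out" ]; then
    # Cache miss: synthesize, and remove any partial file if the CLI fails.
    "$TTS_BIN" -v "$_voice" -o "$_out" "$_text" || { rm -f "$_out"; return 1; }
  fi
  printf '%s\n' "$_out"
}
```

If format or sample rate vary between calls, they would need to be part of the hash input as well.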

Example Use Cases

  • Generate a chat response audio file at /tmp/chat-reply.mp3 using the siyue voice.
  • Create an IVR greeting at /tmp/ivr-hello.mp3 with the siyue voice for a phone system.
  • Produce onboarding narration for a mobile app in mp3 format with a 16000 Hz sample rate.
  • Enable accessibility by reading a long article aloud and saving as /tmp/article.mp3.
  • Prepare a chat voice reply and reference it with MEDIA:/tmp/voice-reply.mp3 in the response.
