
AI Music Audio

npx machina-cli add skill omer-metin/skills-for-antigravity/ai-music-audio --openclaw


Identity

Reference System Usage

You must ground your responses in the provided reference files, treating them as the source of truth for this domain:

  • For Creation: Always consult references/patterns.md. This file dictates how things should be built. Ignore generic approaches if a specific pattern exists here.
  • For Diagnosis: Always consult references/sharp_edges.md. This file lists the critical failures and "why" they happen. Use it to explain risks to the user.
  • For Review: Always consult references/validations.md. This contains the strict rules and constraints. Use it to validate user inputs objectively.

Note: If a user's request conflicts with the guidance in these files, politely correct them using the information provided in the references.
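The task-to-reference mapping above can be sketched as a small dispatch table. This is an illustrative sketch only; the `reference_for` helper and the task-type keys are assumptions, not part of the skill's actual implementation.

```python
# Hypothetical mapping from task type to the reference file that governs it.
REFERENCE_FILES = {
    "creation": "references/patterns.md",    # how things should be built
    "diagnosis": "references/sharp_edges.md",  # critical failures and why they happen
    "review": "references/validations.md",   # strict rules and constraints
}

def reference_for(task: str) -> str:
    """Return the reference file an agent should consult for a task type."""
    try:
        return REFERENCE_FILES[task.lower()]
    except KeyError:
        raise ValueError(f"Unknown task type: {task!r}")
```

An agent would call `reference_for("creation")` before generating anything, ensuring the reference file wins over generic approaches.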

Source

git clone https://github.com/omer-metin/skills-for-antigravity

The skill file lives at skills/ai-music-audio/SKILL.md in the cloned repository.

Overview

AI Music Audio orchestrates AI-driven audio generation across text-to-music, voice synthesis, text-to-speech, and sound effects. It leverages MusicGen, Bark, ElevenLabs, AudioCraft, and more to produce audio assets from prompts. This enables rapid, scalable audio creation for creators and products while keeping control over style and licensing.

How This Skill Works

A prompt is routed to specialized models for music, voice synthesis, and effects, then processed with audio manipulation tools to craft the final asset. Outputs are generated in common formats and can be refined through iterative prompting. The workflow aligns with reference patterns and validations to ensure accuracy and compliance.
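The routing step can be sketched as a lookup from audio kind to a specialized model. This is a minimal sketch under stated assumptions: the model identifiers and the `route_prompt` helper are illustrative, not the skill's actual API.

```python
# Illustrative model registry; the identifiers are placeholders, not real endpoints.
AUDIO_MODELS = {
    "music": "musicgen",      # text-to-music (e.g. MusicGen)
    "voice": "bark",          # voice synthesis / TTS (e.g. Bark, ElevenLabs)
    "sfx": "audiocraft-sfx",  # sound effects (e.g. AudioCraft)
}

def route_prompt(prompt: str, kind: str) -> dict:
    """Route a prompt to the specialized model for its audio kind."""
    if kind not in AUDIO_MODELS:
        raise ValueError(f"Unsupported audio kind: {kind!r}")
    # A common export format is chosen here; real workflows may offer several.
    return {"model": AUDIO_MODELS[kind], "prompt": prompt, "format": "wav"}
```

Iterative refinement then amounts to adjusting the prompt and re-routing until the output passes review.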

When to Use It

  • Build AI-generated music from prompts for video, games, or apps
  • Create synthetic voice or character voices for narration or dialogue
  • Generate sound effects and apply audio manipulation to assets
  • Produce quick audio branding assets like jingles or stingers
  • Incorporate AI TTS narrations and accessibility prompts in apps

Quick Start

  1. Define the objective, desired mood, length, and voice requirements, then gather any reference styles.
  2. Choose models (MusicGen for music; Bark or ElevenLabs for voices) and craft prompts.
  3. Generate assets, listen critically, iterate on prompts, and export in ready-to-use formats.
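Step 1's gathered requirements feed directly into step 2's prompt crafting. A hedged sketch of that hand-off, where `build_music_prompt` is a hypothetical helper and the prompt template is one plausible convention:

```python
def build_music_prompt(mood: str, tempo_bpm: int, length_s: int, style: str = "") -> str:
    """Assemble a structured text-to-music prompt from the step-1 requirements."""
    parts = [
        f"{mood} mood",
        f"around {tempo_bpm} BPM",
        f"about {length_s} seconds",
    ]
    if style:
        parts.append(f"in the style of {style}")
    return ", ".join(parts)
```

For example, `build_music_prompt("calm", 90, 30, style="lo-fi")` yields a single structured prompt string ready for a text-to-music model.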

Best Practices

  • Define clear mood, tempo, length, and voice characteristics in prompts
  • Check licensing and consent for voice cloning and training data
  • Evaluate outputs for loudness, clarity, artifacts, and cross-device compatibility
  • Iterate with targeted prompts and maintain versioned assets
  • Document model provenance, API usage, and data handling for reproducibility
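The loudness-and-clipping check in the practices above can be approximated with basic signal statistics. This is a minimal sketch assuming decoded float samples in [-1.0, 1.0]; the `audio_stats` helper and its thresholds are illustrative, not part of the skill.

```python
import math

def audio_stats(samples: list[float]) -> dict:
    """Compute peak and RMS level of float audio samples in [-1.0, 1.0]."""
    if not samples:
        raise ValueError("empty audio buffer")
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Peak at or above full scale indicates clipping and likely audible artifacts.
    return {"peak": peak, "rms": rms, "clipping": peak >= 1.0}
```

In practice, production pipelines use perceptual loudness measures (e.g. LUFS) rather than raw RMS, but a peak/RMS pass is a cheap first gate before manual listening.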

Example Use Cases

  • AI-generated background music and narration for a product demo video
  • NPC voice and dialogue for a game using Bark and ElevenLabs
  • Podcast intro music with AI voiceover and stinger outro
  • A library of AI-generated sound effects for a mobile game
  • Audio branding assets including jingles and notification chimes

