
Meeting Transcription

npx machina-cli add skill naity/professional-growth-agent/meeting-transcription --openclaw

Meeting Transcription Skill

This skill enables transcription of meeting audio files to text using AWS Transcribe.

When to Use This Skill

Use this skill when the user:

  • Provides an audio recording of a meeting
  • Asks to transcribe a meeting
  • Wants to analyze meeting content from an audio file
  • Mentions audio files with extensions like .mp3, .wav, .m4a, .mp4, .flac

How It Works

  1. User provides path to an audio file
  2. This skill calls the transcribe_audio.py script with optional language parameter
  3. The script uploads audio to S3 and uses AWS Transcribe
  4. Returns the full transcript as plain text with speaker labels (if applicable)
  5. You (the agent) can then analyze the transcript
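The flow above can be sketched with boto3 (the AWS SDK for Python). This is a minimal illustration of the described steps, not the skill's actual transcribe_audio.py; the bucket name and function names are placeholders.

```python
# Minimal sketch of the upload-and-transcribe flow, assuming boto3 is
# installed and AWS credentials are configured. BUCKET and the function
# names are illustrative placeholders, not part of the skill itself.
import json
import time
import urllib.request

BUCKET = "my-transcribe-temp"  # hypothetical S3 bucket for temporary audio


def job_name_for(path: str) -> str:
    """Timestamped job name so repeated runs never collide."""
    stem = path.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    return f"{stem}-{int(time.time())}"


def transcribe(path: str, language: str = "en-US") -> str:
    import boto3  # AWS SDK for Python

    s3 = boto3.client("s3")
    tr = boto3.client("transcribe")
    key = path.rsplit("/", 1)[-1]

    s3.upload_file(path, BUCKET, key)  # 1. upload audio to S3
    job = job_name_for(path)
    tr.start_transcription_job(        # 2. start the AWS Transcribe job
        TranscriptionJobName=job,
        Media={"MediaFileUri": f"s3://{BUCKET}/{key}"},
        LanguageCode=language,
        Settings={"ShowSpeakerLabels": True, "MaxSpeakerLabels": 10},
    )
    while True:                        # 3. poll until the job finishes
        status = tr.get_transcription_job(TranscriptionJobName=job)
        state = status["TranscriptionJob"]["TranscriptionJobStatus"]
        if state in ("COMPLETED", "FAILED"):
            break
        time.sleep(5)
    s3.delete_object(Bucket=BUCKET, Key=key)  # 4. clean up temporary audio
    if state == "FAILED":
        raise RuntimeError("transcription job failed")
    uri = status["TranscriptionJob"]["Transcript"]["TranscriptFileUri"]
    with urllib.request.urlopen(uri) as resp:  # 5. fetch the transcript JSON
        data = json.load(resp)
    return data["results"]["transcripts"][0]["transcript"]
```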

Usage

IMPORTANT: Always specify the language; speaker labels (spk_0, spk_1) require a known language code.

# English meeting (default)
python transcribe_audio.py /path/to/audio.mp3

# Chinese meeting
python transcribe_audio.py /path/to/audio.mp3 --language zh-CN

# Other languages
python transcribe_audio.py /path/to/audio.mp3 --language es-ES

Supported Languages:

  • en-US: English (US) - default
  • zh-CN: Mandarin Chinese (Simplified)
  • zh-TW: Traditional Chinese (Taiwan)
  • es-ES: Spanish (Spain)
  • fr-FR: French
  • de-DE: German
  • ja-JP: Japanese
  • ko-KR: Korean

Speaker Labels: Transcriptions include speaker labels (spk_0, spk_1, spk_2, etc.) to distinguish the speakers in the conversation. Speaker labeling requires the language to be known in advance, so always pass the correct language code.
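A speaker-labeled transcript can be split into per-speaker turns for analysis. This sketch assumes one "spk_N: text" utterance per line, which is one plausible plain-text layout; adjust the parsing if the script emits a different format.

```python
# Group a speaker-labeled transcript into (speaker, utterance) turns.
# Assumes one "spk_N: text" line per utterance -- an assumed layout,
# not a documented guarantee of the skill's output.
def parse_turns(transcript: str) -> list[tuple[str, str]]:
    turns = []
    for line in transcript.splitlines():
        speaker, sep, text = line.partition(":")
        if sep and speaker.strip().startswith("spk_"):
            turns.append((speaker.strip(), text.strip()))
    return turns


sample = "spk_0: Let's review the roadmap.\nspk_1: Sounds good."
# parse_turns(sample) -> [("spk_0", "Let's review the roadmap."),
#                         ("spk_1", "Sounds good.")]
```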

Supported Audio Formats

  • MP3
  • MP4
  • WAV
  • FLAC
  • M4A
  • OGG
  • WebM

Usage Example

When the user says: "Analyze my 1:1 meeting recording at ./recordings/meeting.mp3"

  1. Use this skill to transcribe the audio first
  2. Once you have the transcript, analyze it for insights
  3. Provide actionable feedback to the user

What to Do After Transcription

After getting the transcript, analyze it for:

  • Key discussion topics: What were the main themes?
  • Action items: What tasks were assigned or agreed upon?
  • Speaking balance: Who spoke more? Is it balanced?
  • Questions: What questions were asked? Were they answered?
  • Communication patterns: Any interruptions, pauses, or unclear moments?
  • Tone and engagement: Is the conversation collaborative or one-sided?
  • Constructive feedback: What could be improved for next time?
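The speaking-balance check above can be computed directly from the speaker labels. A rough sketch, again assuming one "spk_N: text" utterance per line and using word counts as a crude proxy for talk time:

```python
from collections import Counter

# Rough speaking-balance check from a speaker-labeled transcript.
# Assumes one "spk_N: text" utterance per line (an assumed layout);
# word counts stand in for actual talk time.
def speaking_balance(transcript: str) -> dict[str, float]:
    words = Counter()
    for line in transcript.splitlines():
        speaker, sep, text = line.partition(":")
        if sep and speaker.strip().startswith("spk_"):
            words[speaker.strip()] += len(text.split())
    total = sum(words.values()) or 1
    return {spk: round(100 * n / total, 1) for spk, n in words.items()}


sample = "spk_0: one two three four\nspk_1: five six\nspk_0: seven eight"
# speaking_balance(sample) -> {"spk_0": 75.0, "spk_1": 25.0}
```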

Technical Details

  • Requires AWS credentials configured
  • Requires S3 bucket for temporary audio storage
  • Audio files are automatically cleaned up after transcription
  • Transcription job names are timestamped to avoid conflicts
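The requirements above can be verified before running a job. A hypothetical pre-flight check, assuming boto3; the bucket name is whatever you configured for temporary storage:

```python
# Hypothetical pre-flight check for the requirements listed above.
# Assumes boto3; short timeouts keep the check fast.
def check_aws_setup(bucket: str) -> list[str]:
    """Return a list of problems; an empty list means transcription can proceed."""
    problems: list[str] = []
    try:
        import boto3
        from botocore.config import Config
    except ImportError:
        return ["boto3 is not installed (pip install boto3)"]
    cfg = Config(connect_timeout=3, retries={"max_attempts": 1})
    try:
        boto3.client("sts", config=cfg).get_caller_identity()  # credentials valid?
    except Exception:
        problems.append("AWS credentials are missing or invalid")
    try:
        boto3.client("s3", config=cfg).head_bucket(Bucket=bucket)  # bucket reachable?
    except Exception:
        problems.append(f"cannot access S3 bucket {bucket!r}")
    return problems
```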

Source

git clone https://github.com/naity/professional-growth-agent

The skill file is at .claude/skills/meeting-transcription/SKILL.md in the repository.

Overview

This skill converts meeting audio files into text using AWS Transcribe. It supports common formats (MP3, WAV, M4A, MP4, FLAC) and returns a transcript with speaker labels to identify participants. Specifying the language improves the accuracy of both transcription and speaker labeling.

How This Skill Works

User provides a path to the audio file. The skill runs transcribe_audio.py with an optional language parameter. The script uploads the audio to S3, starts an AWS Transcribe job, and returns the full transcript with speaker labels. The transcript can then be analyzed for insights and action items.

When to Use It

  • You have an audio recording of a meeting and need a text transcript.
  • You want to analyze meeting content from an audio file (topics, actions, questions).
  • You need speaker-labeled transcripts to identify who spoke when.
  • You’re working with audio files in formats like .mp3, .wav, .m4a, .mp4, or .flac.
  • You require transcripts in a specific language and want explicit language selection for accuracy.

Quick Start

  1. Provide the path to your audio file, e.g. /path/to/meeting.mp3
  2. Run python transcribe_audio.py /path/to/meeting.mp3 --language en-US
  3. Retrieve the transcript and proceed to analysis (topics, actions, tone)

Best Practices

  • Always specify the language to ensure speaker labels (spk_0, spk_1) are accurate.
  • Use only supported audio formats and ensure audio quality is clear for better transcription.
  • Verify AWS credentials and configure an S3 bucket for temporary storage before running.
  • Job names are timestamped automatically, avoiding conflicts and making past jobs easy to find.
  • After transcription, review for speaker label accuracy, interruptions, and action items.

Example Use Cases

  • Transcribe a 45-minute English team meeting (standup) with speaker labels for follow-up summaries.
  • Transcribe a Mandarin (zh-CN) sales call to capture key requirements and decisions.
  • Transcribe a product demo recorded in MP4 to generate minutes and a requirements list.
  • Transcribe a client discovery call in es-ES to extract questions and next steps.
  • Transcribe a training workshop audio (M4A) and extract action items and topics for recap.
