transcription

npx machina-cli add skill CaseMark/legal-plugin/transcription --openclaw

case.dev Transcription

Audio and video transcription with speaker diarization, ideal for depositions, hearings, and recorded proceedings. Supports MP3, WAV, M4A, FLAC, OGG, WEBM, and MP4 up to 5GB.

Requires the casedev CLI. See the setup skill for installation and authentication.
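
Because unsupported or oversized files are rejected only after upload, a local pre-flight check can save a round trip. The sketch below is one possible way to do that in POSIX shell. It is not part of the casedev CLI: the function names are hypothetical, the format is judged by file extension only, and "5GB" is assumed to mean 5 * 1024^3 bytes, which may differ from how the service counts.

```shell
# Pre-flight checks for a local media file before uploading it to a vault.
# Assumption: 5GB is read as 5 * 1024^3 bytes; the service may count differently.
MAX_BYTES=$((5 * 1024 * 1024 * 1024))

# Succeeds if the file name ends in one of the documented extensions.
is_supported_media() {
  # Lower-case the name so "MP3" and "mp3" are treated alike.
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    *.mp3|*.wav|*.m4a|*.flac|*.ogg|*.webm|*.mp4) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds if the file exists and is no larger than MAX_BYTES.
within_size_limit() {
  [ -f "$1" ] || return 1
  # Strip the padding some wc implementations add around the count.
  [ "$(wc -c < "$1" | tr -d '[:space:]')" -le "$MAX_BYTES" ]
}
```

A caller might run both checks on a recording and skip the upload if either one fails. An extension match says nothing about the codec inside the file, so the service can still reject it.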

Start a Transcription

The source file must be in a vault:

casedev transcribe run --vault VAULT_ID --object OBJECT_ID --json

The object must be an audio or video file. Non-media files are rejected.

Flags:

  • --speaker-labels — enable speaker diarization
  • --speakers-expected N — hint for number of speakers
  • --language — language code (e.g., en, es)
  • --format — output format
  • --auto-highlights — extract key phrases

If --vault is omitted, the command falls back to the focused vault, when one has been set with casedev focus set --vault.
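
When the optional flags vary from one recording to the next, a small wrapper keeps the invocation consistent. The following is a minimal sketch, not an official helper: start_transcription and the CASEDEV override variable are hypothetical names introduced here, and the wrapper only assembles flags that are listed above.

```shell
# Build and run "casedev transcribe run" with optional diarization and language.
# CASEDEV is a hypothetical override so the command can be swapped for a stub.
start_transcription() {
  # $1 vault ID, $2 object ID, $3 expected speakers (optional), $4 language code (optional)
  st_vault=$1
  st_object=$2
  st_speakers=${3:-}
  st_language=${4:-}

  # Reuse the function's own argument list to collect the command line.
  set -- transcribe run --vault "$st_vault" --object "$st_object"
  if [ -n "$st_speakers" ]; then
    set -- "$@" --speaker-labels --speakers-expected "$st_speakers"
  fi
  if [ -n "$st_language" ]; then
    set -- "$@" --language "$st_language"
  fi
  "${CASEDEV:-casedev}" "$@" --json
}
```

Setting CASEDEV=echo before calling start_transcription VAULT_ID OBJECT_ID 4 en prints the assembled command instead of running it, which is a cheap way to review the flags first.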

Check Status

casedev transcribe status TRANSCRIPTION_ID --json

Returns: ID, status, vault, source/result object IDs, audio duration, word count, confidence.

Statuses: queued -> processing -> completed or failed.
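
In scripts it helps to separate the two terminal statuses from the two in-progress ones. A minimal sketch of that distinction follows; is_terminal_status is a hypothetical helper, and treating any other word as an error is a choice made here, not documented behavior.

```shell
# Classify a status word from the lifecycle above.
# Returns 0 for a terminal status, 1 for an in-progress one, 2 for anything else.
is_terminal_status() {
  case "$1" in
    completed|failed) return 0 ;;
    queued|processing) return 1 ;;
    *)
      echo "unknown status: $1" >&2
      return 2
      ;;
  esac
}
```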

Get Result

casedev transcribe result TRANSCRIPTION_ID --json

Returns the transcript text. If the result is stored as a vault object, the command fetches its text automatically. The command fails if the job has not completed yet.

Watch Until Complete

casedev transcribe watch TRANSCRIPTION_ID --json

Polls until done, then prints the result. Flags:

  • --interval / -i — poll interval in seconds (default: 3)
  • --timeout / -t — max wait in seconds (default: 900)
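
For scripts that need their own handling between polls, the same interval and timeout behavior can be approximated by hand. This is a rough sketch under stated assumptions, not how watch is implemented: get_status and wait_for_transcription are hypothetical helpers, CASEDEV is a hypothetical override variable, and the JSON output is assumed to contain a string field named "status". The sed extraction is brittle, so a real JSON parser is preferable where one is available.

```shell
# Print the current status word for a transcription ID.
# Assumption: the --json output contains a string field named "status".
get_status() {
  "${CASEDEV:-casedev}" transcribe status "$1" --json |
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' |
    head -n 1
}

# Poll until the job completes (0), fails (1), or the timeout elapses (124).
wait_for_transcription() {
  # $1 transcription ID, $2 poll interval in seconds, $3 timeout in seconds
  wf_id=$1
  wf_interval=${2:-3}
  wf_timeout=${3:-900}
  wf_waited=0
  while :; do
    wf_status=$(get_status "$wf_id")
    case "$wf_status" in
      completed) return 0 ;;
      failed) return 1 ;;
    esac
    if [ "$wf_waited" -ge "$wf_timeout" ]; then
      return 124
    fi
    sleep "$wf_interval"
    wf_waited=$((wf_waited + wf_interval))
  done
}
```

The defaults mirror the documented ones (3 and 900 seconds). The elapsed time counts only the sleeps, so the real wall-clock wait is slightly longer than the timeout.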

Common Workflow

Transcribe a deposition recording

# 1. Upload to vault
casedev vault object upload ./deposition-2024-01-15.mp3 --vault VAULT_ID --json

# 2. Start transcription with speaker labels (OBJECT_ID is the ID of the uploaded object)
casedev transcribe run --vault VAULT_ID --object OBJECT_ID --speaker-labels --speakers-expected 4 --json

# 3. Watch until complete (TRANSCRIPTION_ID identifies the job started in step 2)
casedev transcribe watch TRANSCRIPTION_ID --json

# 4. Or check status + get result separately
casedev transcribe status TRANSCRIPTION_ID --json
casedev transcribe result TRANSCRIPTION_ID --json
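
The steps above can also be chained so that each ID is passed along automatically. The sketch below shows one way this might look. It rests on an unverified assumption: that the --json output of both upload and transcribe run contains a string field named "id". The helpers json_field and transcribe_file and the CASEDEV override variable are hypothetical, so check the real output shape before relying on this.

```shell
# Extract the first string value of a named field from JSON on stdin.
# Brittle by design; prefer a real JSON parser if one is installed.
json_field() {
  sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p" | head -n 1
}

# Upload a file, start a diarized transcription, and watch it to completion.
transcribe_file() {
  # $1 local media file, $2 vault ID
  # Assumption: upload prints JSON with an "id" field holding the object ID.
  tf_object=$("${CASEDEV:-casedev}" vault object upload "$1" --vault "$2" --json | json_field id)
  if [ -z "$tf_object" ]; then
    echo "no object ID found in upload output" >&2
    return 1
  fi

  # Assumption: run prints JSON with an "id" field holding the transcription ID.
  tf_job=$("${CASEDEV:-casedev}" transcribe run --vault "$2" --object "$tf_object" --speaker-labels --json | json_field id)
  if [ -z "$tf_job" ]; then
    echo "no transcription ID found in run output" >&2
    return 1
  fi

  "${CASEDEV:-casedev}" transcribe watch "$tf_job" --json
}
```

Calling transcribe_file ./deposition-2024-01-15.mp3 VAULT_ID would then stand in for steps 1 through 3. The function stops early with an error if either ID cannot be extracted.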

Using the job tracker

# Transcription jobs are auto-tracked
casedev jobs list --type transcribe --json
casedev jobs watch JOB_ID --type transcribe --json

Troubleshooting

"Invalid file type for transcribe": Only audio/video MIME types are accepted. Check the object with casedev vault object list --vault VAULT_ID. Upload audio or video files, not documents such as PDFs.

"Invalid object ID for this vault": Run casedev vault object list --vault VAULT_ID to see valid IDs.

"Transcription is not complete yet": Wait for the job to finish. Use casedev transcribe watch or casedev jobs watch.

Transcription failed: Check casedev transcribe status TRANSCRIPTION_ID --json for the error message. Common causes: corrupted file, unsupported codec, file too large.

Long audio (2+ hours): Increase the watch timeout, for example casedev transcribe watch TRANSCRIPTION_ID --timeout 3600. Large files take proportionally longer.

Source

https://github.com/CaseMark/legal-plugin/blob/main/transcription/SKILL.md

Overview

case.dev transcription converts audio and video to text with speaker diarization, ideal for depositions, hearings, and recorded proceedings. It supports MP3, WAV, M4A, FLAC, OGG, WEBM, and MP4 up to 5GB. Requires the casedev CLI to run.

How This Skill Works

Upload your media to a vault, then start a transcription with casedev transcribe run using --vault and --object. Enable speaker diarization with --speaker-labels and estimate the number of speakers with --speakers-expected; optionally set language, output format, and auto-highlights. Use status or watch to monitor progress and result to fetch the transcript.

When to Use It

  • When you need a verbatim transcript of a deposition or hearing
  • When speaker labels are required to identify who spoke
  • When your media is stored in a vault as MP3, WAV, M4A, FLAC, OGG, WEBM, or MP4 (up to 5GB)
  • When you want an auditable workflow with status and result retrieval
  • When you need an automated extract of key phrases with auto-highlights

Quick Start

  1. Upload the media to a vault (example: casedev vault object upload ./video.mp4 --vault VAULT_ID --json)
  2. Start transcription with speaker labels (example: casedev transcribe run --vault VAULT_ID --object OBJECT_ID --speaker-labels --speakers-expected 4 --json)
  3. Watch until complete (example: casedev transcribe watch TRANSCRIPTION_ID --json)

Best Practices

  • Store media in a vault before transcribing for secure access
  • Enable --speaker-labels and, when the number of speakers is known, pass it as a hint with --speakers-expected
  • Specify the language with --language (e.g., en, es) when it is known in advance
  • Use --auto-highlights to capture key phrases for quick review
  • Use transcribe status or transcribe watch to manage long transcriptions

Example Use Cases

  • Deposition transcription with four speakers using speaker labels
  • Hearing recording in English with two speakers
  • MP4 video transcript with diarization for a court proceeding
  • Large file transcription (up to 5GB) with timeout considerations
  • Retrieve transcript via transcribe result after completion
