npx machina-cli add skill CaseMark/legal-plugin/ocr --openclaw

case.dev OCR

Production-grade document OCR with table extraction and word-level positional data. Processes PDFs and images (PNG, JPG, TIFF, BMP, WEBP) up to 500MB.

Requires the casedev CLI. See the setup skill for installation and authentication.

Process a Document

Submit a document URL for OCR:

casedev ocr process --document-url "https://example.com/contract.pdf" --json

Flags:

  • --document-url / --url (required) — publicly accessible URL or presigned vault URL
  • --document-id — optional identifier to tag the job
  • --engine — OCR engine override

Returns a job ID and initial status.

Check Job Status

casedev ocr status JOB_ID --json

Returns: ID, status, page count, created/completed timestamps.

Statuses: queued -> processing -> completed or failed.
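When scripting around the status command, it can help to treat the lifecycle above as a small state check. The following is a minimal sketch; the function names `is_terminal_status` and `status_exit_code` are hypothetical helpers, not part of the casedev CLI, and the exit-code mapping is only one possible convention.

```shell
#!/bin/sh
# Hypothetical helpers built only on the documented status values:
# queued -> processing -> completed or failed.

# Returns 0 if the status is terminal (no further polling needed).
is_terminal_status() {
  case "${1:-}" in
    completed|failed) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints a numeric code for use in scripts (an arbitrary convention):
# 0 = completed, 1 = failed, 2 = still running, 3 = unrecognized.
status_exit_code() {
  case "${1:-}" in
    completed) echo 0 ;;
    failed) echo 1 ;;
    queued|processing) echo 2 ;;
    *) echo 3 ;;
  esac
}
```

A caller would pass in whatever status string it read from the `--json` output; how that string is extracted depends on the response shape, which is not specified here.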

Watch Until Complete

casedev ocr watch JOB_ID --json

Polls until the job finishes. Flags:

  • --interval / -i — poll interval in seconds (default: 3)
  • --timeout / -t — max wait in seconds (default: 900)
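The two flags above can be wrapped so that the defaults are explicit in a script. This is a sketch only: `watch_job` is a hypothetical wrapper name, and placing the flags after the job ID is an assumption about argument order.

```shell
#!/bin/sh
# Hypothetical wrapper around `casedev ocr watch`; only flags documented
# above are used.
# Usage: watch_job JOB_ID [INTERVAL_SECONDS] [TIMEOUT_SECONDS]
watch_job() {
  job_id=${1:-}
  interval=${2:-3}    # documented default poll interval
  timeout=${3:-900}   # documented default max wait
  if [ -z "$job_id" ]; then
    echo "usage: watch_job JOB_ID [INTERVAL_SECONDS] [TIMEOUT_SECONDS]" >&2
    return 2
  fi
  casedev ocr watch "$job_id" --interval "$interval" --timeout "$timeout" --json
}
```

For example, `watch_job JOB_ID 10 1800` would poll every 10 seconds for up to 30 minutes.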

Word-Level Data

Retrieve word-level OCR output for a vault object:

casedev ocr words --vault VAULT_ID --object OBJECT_ID --json

This requires the document to be in a vault and have completed OCR ingestion. The object must be a PDF or image (audio/video files are rejected).

Flags:

  • --page — specific page number
  • --word-start — starting word index
  • --word-end — ending word index

Returns per-page word arrays with text, word index, and confidence scores.

Uses the focused vault if one has been set with casedev focus set --vault.
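The optional filters can be assembled conditionally so that only the flags you actually need are passed. The sketch below assumes a hypothetical helper named `ocr_words`; whether `--page` may be combined with `--word-start` and `--word-end` in a single call is also an assumption, not something the text above confirms.

```shell
#!/bin/sh
# Hypothetical helper that builds a `casedev ocr words` call.
# Usage: ocr_words VAULT_ID OBJECT_ID [PAGE] [WORD_START] [WORD_END]
# Pass an empty string ("") to skip an optional filter.
ocr_words() {
  vault_id=${1:-}
  object_id=${2:-}
  page=${3:-}
  word_start=${4:-}
  word_end=${5:-}
  if [ -z "$vault_id" ] || [ -z "$object_id" ]; then
    echo "usage: ocr_words VAULT_ID OBJECT_ID [PAGE] [WORD_START] [WORD_END]" >&2
    return 2
  fi
  # Reuse the function's positional parameters as the argument list.
  set -- ocr words --vault "$vault_id" --object "$object_id"
  if [ -n "$page" ]; then set -- "$@" --page "$page"; fi
  if [ -n "$word_start" ]; then set -- "$@" --word-start "$word_start"; fi
  if [ -n "$word_end" ]; then set -- "$@" --word-end "$word_end"; fi
  casedev "$@" --json
}
```

For instance, `ocr_words VAULT_ID OBJECT_ID 3` requests only page 3, while `ocr_words VAULT_ID OBJECT_ID "" 10 20` passes only the word-index bounds.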

Common Workflow

OCR a document from a vault

# 1. Upload to vault (triggers automatic ingestion + OCR)
casedev vault object upload ./scanned-contract.pdf --vault VAULT_ID --json

# 2. Check ingestion status
casedev vault object list --vault VAULT_ID --json

# 3. Get word-level data
casedev ocr words --vault VAULT_ID --object OBJECT_ID --json

# 4. Get a specific page
casedev ocr words --vault VAULT_ID --object OBJECT_ID --page 3 --json
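Step 4 fetches one page per call, so covering several pages means repeating it. A minimal sketch of that loop follows; `fetch_pages` and the `page-N.json` file naming are hypothetical, and the caller is assumed to know the valid page numbers for the object.

```shell
#!/bin/sh
# Hypothetical helper: save word-level data for a run of pages, one file
# per page, by calling `casedev ocr words --page N` repeatedly.
# Usage: fetch_pages VAULT_ID OBJECT_ID FIRST_PAGE LAST_PAGE [OUT_DIR]
fetch_pages() {
  if [ "$#" -lt 4 ]; then
    echo "usage: fetch_pages VAULT_ID OBJECT_ID FIRST_PAGE LAST_PAGE [OUT_DIR]" >&2
    return 2
  fi
  vault_id=$1
  object_id=$2
  page=$3
  last=$4
  out_dir=${5:-.}
  mkdir -p "$out_dir" || return 1
  while [ "$page" -le "$last" ]; do
    # Stop at the first failing page rather than leaving silent gaps.
    casedev ocr words --vault "$vault_id" --object "$object_id" \
      --page "$page" --json > "$out_dir/page-$page.json" || return 1
    page=$((page + 1))
  done
}
```

Writing each page to its own file keeps individual results small and lets a failed run be resumed from the last page that was saved.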

OCR an external document

# 1. Submit
casedev ocr process --document-url "https://storage.example.com/doc.pdf" --json

# 2. Watch
casedev ocr watch JOB_ID --json
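The two steps above can be chained in a script by reading the job ID from the submit response. The sketch below rests on a loud assumption: that the `--json` output contains the job ID as a string field named `"id"`. The actual field name is not specified here, so check a real response first. `extract_job_id` and `ocr_external` are hypothetical helper names.

```shell
#!/bin/sh
# ASSUMPTION: the submit response contains the job ID as "id": "...".
# Adjust the pattern if the real field name differs.
extract_job_id() {
  # Print the first string value of an "id" key found on stdin.
  sed -n 's/.*"id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Hypothetical helper: submit a document URL, then watch the job.
# Usage: ocr_external DOCUMENT_URL [TIMEOUT_SECONDS]
ocr_external() {
  url=${1:-}
  timeout=${2:-900}   # documented default for watch
  if [ -z "$url" ]; then
    echo "usage: ocr_external DOCUMENT_URL [TIMEOUT_SECONDS]" >&2
    return 2
  fi
  submit_json=$(casedev ocr process --document-url "$url" --json) || return 1
  job_id=$(printf '%s\n' "$submit_json" | extract_job_id)
  if [ -z "$job_id" ]; then
    echo "could not find a job ID in the submit response" >&2
    return 1
  fi
  casedev ocr watch "$job_id" --timeout "$timeout" --json
}
```

The `sed` pattern keeps the sketch dependency-free; a dedicated JSON tool such as `jq` would be a more robust choice once the response shape is known.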

Troubleshooting

"Invalid file type for OCR": OCR only supports PDFs and images (application/pdf, image/*). Check the object's content type with casedev vault object list.

"Invalid object ID for this vault": Run casedev vault object list --vault VAULT_ID to see valid object IDs.

Job stuck in "processing": Increase watch timeout with --timeout 1800. Large documents (100+ pages) take longer.

"OCR job failed": The document may be corrupted or in an unsupported format. Re-upload and retry.
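For the "Invalid file type for OCR" case, the content type can be checked before requesting OCR output. This is a minimal sketch using only the two patterns named in that error description (application/pdf and image/*); `is_ocr_content_type` is a hypothetical helper, and how you obtain the content type string from `casedev vault object list` is left to the caller.

```shell
#!/bin/sh
# Hypothetical pre-check: returns 0 for application/pdf or image/*,
# 1 for anything else (for example audio/* or video/*).
is_ocr_content_type() {
  # Drop parameters such as "; charset=binary", lowercase, strip spaces.
  ctype=$(printf '%s' "${1:-}" | cut -d';' -f1 \
    | tr '[:upper:]' '[:lower:]' | tr -d '[:space:]')
  case "$ctype" in
    application/pdf|image/?*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A script might call `is_ocr_content_type "$content_type" || echo "skipping: not a PDF or image"` before running `casedev ocr words`.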

Source

https://github.com/CaseMark/legal-plugin/blob/main/ocr/SKILL.md

Overview

case.dev OCR delivers production-grade text extraction and table parsing with word-level positional data. It processes PDFs and images up to 500MB and returns per-page text, tables, and per-word metadata to enable precise search and data capture.

How This Skill Works

Submit a document URL with casedev ocr process, or upload the file to a vault, which triggers ingestion and OCR automatically. Monitor the job until it completes. For vault objects that have finished ingestion, retrieve per-page word data (text, word index, and confidence), optionally limited to a single page or a word-index range.

When to Use It

  • Digitize and index paper documents to make them searchable
  • Extract text and tables from contracts and other PDFs
  • Obtain word-level positional data for precise highlighting, redaction, or analytics
  • Process documents stored in a vault or via public URLs for OCR ingestion
  • Handle large, multi-page documents up to 500MB

Quick Start

  1. Submit the document with casedev ocr process --document-url "https://example.com/doc.pdf" --json, or upload it to a vault to trigger ingestion
  2. Watch the job with casedev ocr watch JOB_ID --json, or check it once with casedev ocr status JOB_ID --json
  3. For vault objects, retrieve word-level data with casedev ocr words --vault VAULT_ID --object OBJECT_ID --json

Best Practices

  • Use a publicly accessible URL or presigned vault URL with --document-url / --url when submitting
  • Prefer retrieving word-level data with casedev ocr words for detailed analysis
  • Verify input types are PDF or image/* before processing to avoid errors
  • For long-running jobs, use watch with a --timeout sized to the document (the default is 900 seconds), or check progress with status
  • For large or multi-page documents, fetch word data one page at a time with --page, or by word-index range with --word-start and --word-end

Example Use Cases

  • OCR a scanned contract from a vault to extract terms and table data
  • Digitize an external PDF invoice and capture line items with positions
  • Archive multi-page statements with per-page word arrays for search
  • Extract redaction-ready text from regulatory documents with word-level data
  • Process external documents and compare the extracted text across document versions
