ocr-web-service-automation
npx machina-cli add skill ComposioHQ/awesome-claude-skills/ocr-web-service-automation --openclaw
OCR Web Service Automation via Rube MCP
Automate OCR Web Service operations through Composio's OCR Web Service toolkit via Rube MCP.
Toolkit docs: composio.dev/toolkits/ocr_web_service
Prerequisites
- Rube MCP must be connected (RUBE_SEARCH_TOOLS available)
- Active OCR Web Service connection via RUBE_MANAGE_CONNECTIONS with toolkit ocr_web_service
- Always call RUBE_SEARCH_TOOLS first to get current tool schemas
Setup
Get Rube MCP: Add https://rube.app/mcp as an MCP server in your client configuration. No API keys needed — just add the endpoint and it works.
- Verify Rube MCP is available by confirming RUBE_SEARCH_TOOLS responds
- Call RUBE_MANAGE_CONNECTIONS with toolkit ocr_web_service
- If connection is not ACTIVE, follow the returned auth link to complete setup
- Confirm connection status shows ACTIVE before running any workflows
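The ACTIVE check in the last two steps can be expressed as a small guard. The sketch below is illustrative only: the response shape (a mapping from toolkit name to an entry with `status` and `auth_link` keys) is an assumption, not the documented output of RUBE_MANAGE_CONNECTIONS, so adapt the key names to what your client actually returns.

```python
from typing import Any, Mapping


def is_connection_active(
    response: Mapping[str, Any], toolkit: str = "ocr_web_service"
) -> bool:
    """Return True only when the toolkit entry reports status ACTIVE.

    ASSUMPTION: `response` maps toolkit names to dicts with a "status" key.
    """
    entry = response.get(toolkit, {})
    return str(entry.get("status", "")).upper() == "ACTIVE"


def require_active(
    response: Mapping[str, Any], toolkit: str = "ocr_web_service"
) -> None:
    """Raise before any workflow runs if the connection is not ACTIVE."""
    if not is_connection_active(response, toolkit):
        # ASSUMPTION: the auth link, if any, is returned under "auth_link".
        link = response.get(toolkit, {}).get("auth_link", "<no auth link returned>")
        raise RuntimeError(
            f"{toolkit} connection is not ACTIVE; complete setup via: {link}"
        )
```

Calling `require_active(...)` at the top of a workflow turns a forgotten setup step into an immediate, readable error instead of a failed tool call later on.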
Tool Discovery
Always discover available tools before executing workflows:
RUBE_SEARCH_TOOLS
  queries: [{use_case: "OCR Web Service operations", known_fields: ""}]
  session: {generate_id: true}
This returns available tool slugs, input schemas, recommended execution plans, and known pitfalls.
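For clients that assemble tool-call parameters programmatically, the discovery request above might be built like this. It is a minimal sketch: the field names (`queries`, `use_case`, `known_fields`, `session`, `generate_id`, `id`) are taken from the calls shown in this document, while the helper name `build_search_request` is hypothetical.

```python
from typing import Any, Optional


def build_search_request(
    use_case: str,
    session_id: Optional[str] = None,
    known_fields: str = "",
) -> dict[str, Any]:
    """Build RUBE_SEARCH_TOOLS parameters as a plain dict.

    With no session_id, ask for a new session (generate_id: true);
    otherwise reuse the existing session.
    """
    query: dict[str, Any] = {"use_case": use_case}
    if known_fields:
        query["known_fields"] = known_fields
    session: dict[str, Any] = (
        {"id": session_id} if session_id else {"generate_id": True}
    )
    return {"queries": [query], "session": session}


# First call in a new workflow: no session yet, so request a generated ID.
first_request = build_search_request("OCR Web Service operations")
```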
Core Workflow Pattern
Step 1: Discover Available Tools
RUBE_SEARCH_TOOLS
  queries: [{use_case: "your specific OCR Web Service task"}]
  session: {id: "existing_session_id"}
Step 2: Check Connection
RUBE_MANAGE_CONNECTIONS
  toolkits: ["ocr_web_service"]
  session_id: "your_session_id"
Step 3: Execute Tools
RUBE_MULTI_EXECUTE_TOOL
  tools: [{
    tool_slug: "TOOL_SLUG_FROM_SEARCH",
    arguments: {/* schema-compliant args from search results */}
  }]
  memory: {}
  session_id: "your_session_id"
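The three steps can be chained in one function. The sketch below assumes a `call_tool(name, params)` callable supplied by your MCP client. That callable, along with the `choose_tool` and `is_active` callbacks, is hypothetical and injected so that no response format is hardcoded. Only the tool names and parameter names come from the steps above.

```python
from typing import Any, Callable

# ASSUMPTION: call_tool(name, params) is whatever your MCP client exposes for
# invoking a tool; it is injected so the sketch has no client dependency.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def run_ocr_workflow(
    call_tool: ToolCaller,
    use_case: str,
    session_id: str,
    choose_tool: Callable[[dict[str, Any]], tuple[str, dict[str, Any]]],
    is_active: Callable[[dict[str, Any]], bool],
) -> dict[str, Any]:
    """Run discover -> check connection -> execute, in that order."""
    # Step 1: discover current tool slugs and schemas.
    search = call_tool(
        "RUBE_SEARCH_TOOLS",
        {"queries": [{"use_case": use_case}], "session": {"id": session_id}},
    )
    # Step 2: refuse to execute unless the connection is ACTIVE.
    connection = call_tool(
        "RUBE_MANAGE_CONNECTIONS",
        {"toolkits": ["ocr_web_service"], "session_id": session_id},
    )
    if not is_active(connection):
        raise RuntimeError("ocr_web_service connection is not ACTIVE")
    # Step 3: execute with a slug and arguments derived from the search result.
    tool_slug, arguments = choose_tool(search)
    return call_tool(
        "RUBE_MULTI_EXECUTE_TOOL",
        {
            "tools": [{"tool_slug": tool_slug, "arguments": arguments}],
            "memory": {},  # always present, even when empty
            "session_id": session_id,
        },
    )
```

Passing `choose_tool` as a callback keeps slug selection tied to the fresh search result, so no slug is hardcoded in the workflow itself.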
Known Pitfalls
- Always search first: Tool schemas change. Never hardcode tool slugs or arguments without calling RUBE_SEARCH_TOOLS
- Check connection: Verify RUBE_MANAGE_CONNECTIONS shows ACTIVE status before executing tools
- Schema compliance: Use exact field names and types from the search results
- Memory parameter: Always include memory in RUBE_MULTI_EXECUTE_TOOL calls, even if empty ({})
- Session reuse: Reuse session IDs within a workflow. Generate new ones for new workflows
- Pagination: Check responses for pagination tokens and continue fetching until complete
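The pagination pitfall can be handled with a generic loop. This is a sketch under assumptions: the key names `items` and `next_page_token` are placeholders, since the real names depend on the tool's response schema. `fetch_page` is a hypothetical callable that wraps the actual tool call, and `max_pages` is a safety limit added for this example.

```python
from typing import Any, Callable, Optional


def fetch_all_pages(
    fetch_page: Callable[[Optional[str]], dict[str, Any]],
    items_key: str = "items",            # ASSUMPTION: placeholder key name
    token_key: str = "next_page_token",  # ASSUMPTION: placeholder key name
    max_pages: int = 100,
) -> list[Any]:
    """Collect items from every page until no pagination token is returned."""
    items: list[Any] = []
    token: Optional[str] = None
    for _ in range(max_pages):
        page = fetch_page(token)
        items.extend(page.get(items_key, []))
        token = page.get(token_key)
        if not token:
            return items
    # Guard against a token that never runs out.
    raise RuntimeError(f"pagination did not finish within {max_pages} pages")
```

The `max_pages` cap is a design choice for this sketch: it stops a runaway loop if a response keeps returning a token.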
Quick Reference
| Operation | Approach |
|---|---|
| Find tools | RUBE_SEARCH_TOOLS with OCR Web Service-specific use case |
| Connect | RUBE_MANAGE_CONNECTIONS with toolkit ocr_web_service |
| Execute | RUBE_MULTI_EXECUTE_TOOL with discovered tool slugs |
| Bulk ops | RUBE_REMOTE_WORKBENCH with run_composio_tool() |
| Full schema | RUBE_GET_TOOL_SCHEMAS for tools with schemaRef |
Powered by Composio
Source
https://github.com/ComposioHQ/awesome-claude-skills/blob/master/composio-skills/ocr-web-service-automation/SKILL.md
Overview
Automate OCR Web Service operations using Composio's OCR Web Service toolkit through Rube MCP. The workflow starts by discovering current tool schemas via RUBE_SEARCH_TOOLS, then confirms an ACTIVE connection with RUBE_MANAGE_CONNECTIONS, and finally executes tools with RUBE_MULTI_EXECUTE_TOOL. Refetching schemas and confirming the connection is ACTIVE before each run helps keep OCR tasks reliable.
How This Skill Works
The workflow begins by adding the MCP at rube.app/mcp and verifying availability with RUBE_SEARCH_TOOLS. Next, connect to the OCR toolkit via RUBE_MANAGE_CONNECTIONS and select a tool slug from the discovered schemas. Finally, execute the tool with RUBE_MULTI_EXECUTE_TOOL, passing memory:{} and a session_id to maintain state.
When to Use It
- When you need to automate OCR tasks and rely on current tool schemas
- To batch OCR jobs in a workflow with session reuse
- Before executing any tool, verify RUBE_MANAGE_CONNECTIONS shows ACTIVE
- When tool schemas change frequently, always refetch with RUBE_SEARCH_TOOLS
- For bulk OCR tasks using RUBE_MULTI_EXECUTE_TOOL or RUBE_REMOTE_WORKBENCH
Quick Start
- Step 1: Add MCP endpoint at https://rube.app/mcp and verify availability with RUBE_SEARCH_TOOLS
- Step 2: Discover OCR tools using RUBE_SEARCH_TOOLS for your use_case (e.g., OCR Web Service operations)
- Step 3: Check ACTIVE connection with RUBE_MANAGE_CONNECTIONS and run a tool via RUBE_MULTI_EXECUTE_TOOL (include memory and a session_id)
Best Practices
- Always call RUBE_SEARCH_TOOLS first to fetch current tool schemas
- Verify RUBE_MANAGE_CONNECTIONS reports ACTIVE before execution
- Use exact field names and types from search results
- Include memory:{} in RUBE_MULTI_EXECUTE_TOOL calls
- Reuse session IDs within a workflow and fetch pagination until complete
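One way to apply the "exact field names and types" practice is to check arguments locally before executing. The sketch below assumes the input schema returned by the search is JSON-Schema-like, with `properties`, `required`, and `type` keys. That shape is an assumption, and the checker covers only top-level fields.

```python
from typing import Any

# Mapping from JSON Schema type names to Python types.
_JSON_TYPES: dict[str, Any] = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def check_arguments(arguments: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the arguments look valid.

    ASSUMPTION: `schema` is JSON-Schema-like ("properties", "required", "type").
    """
    problems: list[str] = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required field: {name}")
    for name, value in arguments.items():
        if name not in properties:
            problems.append(f"unknown field: {name}")
            continue
        type_name = properties[name].get("type")
        expected = _JSON_TYPES.get(type_name)
        if expected is None:
            continue  # no type declared, or one this sketch does not cover
        # bool is a subclass of int in Python, so reject it for numeric fields.
        bool_as_number = isinstance(value, bool) and type_name in ("integer", "number")
        if bool_as_number or not isinstance(value, expected):
            problems.append(
                f"wrong type for {name}: expected {type_name}, "
                f"got {type(value).__name__}"
            )
    return problems
```

A misspelled field such as `fileUrl` in place of a hypothetical `file_url` would be reported as both a missing required field and an unknown field, which makes the typo easy to spot before the tool call is made.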
Example Use Cases
- Batch OCR on invoices: discover the relevant tool slug with RUBE_SEARCH_TOOLS (invoice_ocr is an illustrative name; use whatever the search returns), ensure the connection is ACTIVE, then execute with a batch of image paths using a single session_id.
- Process scanned receipts: find a receipt OCR tool via RUBE_SEARCH_TOOLS (receipt_ocr is illustrative), verify connectivity, and run per-receipt arguments in one workflow.
- Ingest new document types: search for a document OCR tool (doc_ocr is illustrative), confirm ACTIVE, and run multiple documents in a session with RUBE_MULTI_EXECUTE_TOOL.
- Bulk archival OCR: use RUBE_REMOTE_WORKBENCH with run_composio_tool to parallelize many OCR tasks across docs.
- Pagination-safe fetch: for tools returning paged results, repeat the same tool call with the returned pagination token until no token remains.
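The batch use cases above amount to building one RUBE_MULTI_EXECUTE_TOOL request per group of documents while reusing a single session_id. A minimal sketch, assuming a `batch_size` limit chosen by the caller (the document does not state a maximum number of tools per call) and a hypothetical helper name `build_batches`:

```python
from typing import Any, Iterator, Sequence


def build_batches(
    tool_slug: str,
    per_document_arguments: Sequence[dict[str, Any]],
    session_id: str,
    batch_size: int = 10,  # ASSUMPTION: caller-chosen, not a documented limit
) -> Iterator[dict[str, Any]]:
    """Yield RUBE_MULTI_EXECUTE_TOOL parameters, one dict per batch of documents."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(per_document_arguments), batch_size):
        chunk = per_document_arguments[start : start + batch_size]
        yield {
            # One tool entry per document, all using the discovered slug.
            "tools": [
                {"tool_slug": tool_slug, "arguments": dict(args)} for args in chunk
            ],
            "memory": {},              # always included, even when empty
            "session_id": session_id,  # reused across the whole workflow
        }
```

The slug passed in should come from a fresh RUBE_SEARCH_TOOLS result, and each arguments dict should follow that tool's schema.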