
big-data-cloud-automation

npx machina-cli add skill ComposioHQ/awesome-claude-skills/big-data-cloud-automation --openclaw

Big Data Cloud Automation via Rube MCP

Automate Big Data Cloud operations through Composio's Big Data Cloud toolkit via Rube MCP.

Toolkit docs: composio.dev/toolkits/big_data_cloud

Prerequisites

  • Rube MCP must be connected (RUBE_SEARCH_TOOLS available)
  • Active Big Data Cloud connection via RUBE_MANAGE_CONNECTIONS with toolkit big_data_cloud
  • Always call RUBE_SEARCH_TOOLS first to get current tool schemas

Setup

Get Rube MCP: Add https://rube.app/mcp as an MCP server in your client configuration. No API keys are needed; just add the endpoint and it works.
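
For reference, a minimal client configuration sketch. The exact key names and config file location vary by MCP client, and "rube" is just an illustrative label for the server entry:

{
  "mcpServers": {
    "rube": {
      "url": "https://rube.app/mcp"
    }
  }
}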

  1. Verify Rube MCP is available by confirming RUBE_SEARCH_TOOLS responds
  2. Call RUBE_MANAGE_CONNECTIONS with toolkit big_data_cloud
  3. If the connection is not ACTIVE, follow the returned auth link to complete setup
  4. Confirm the connection status shows ACTIVE before running any workflows (a hypothetical response sketch follows this list)
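
A hypothetical response sketch for step 4, for illustration only; the real field names and values come from the live RUBE_MANAGE_CONNECTIONS response:

{
  toolkit: "big_data_cloud",
  status: "ACTIVE"   // anything other than ACTIVE means setup is incomplete
}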

Tool Discovery

Always discover available tools before executing workflows:

RUBE_SEARCH_TOOLS
queries: [{use_case: "Big Data Cloud operations", known_fields: ""}]
session: {generate_id: true}

This returns available tool slugs, input schemas, recommended execution plans, and known pitfalls.
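
As a rough illustration (field names are hypothetical, not the exact Rube response format), a returned entry might look like:

{
  tool_slug: "BIG_DATA_CLOUD_EXAMPLE_LOOKUP",   // hypothetical slug
  input_schema: {/* required fields and types */},
  notes: "known pitfalls for this tool"
}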

Core Workflow Pattern

Step 1: Discover Available Tools

RUBE_SEARCH_TOOLS
queries: [{use_case: "your specific Big Data Cloud task"}]
session: {id: "existing_session_id"}
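
For instance, a filled-in discovery call, assuming a reverse-geocoding task (a common Big Data Cloud use case); the use_case text is free-form:

RUBE_SEARCH_TOOLS
queries: [{use_case: "reverse geocode coordinates to a city name", known_fields: "latitude, longitude"}]
session: {id: "existing_session_id"}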

Step 2: Check Connection

RUBE_MANAGE_CONNECTIONS
toolkits: ["big_data_cloud"]
session_id: "your_session_id"

Step 3: Execute Tools

RUBE_MULTI_EXECUTE_TOOL
tools: [{
  tool_slug: "TOOL_SLUG_FROM_SEARCH",
  arguments: {/* schema-compliant args from search results */}
}]
memory: {}
session_id: "your_session_id"
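
An illustrative execution, assuming discovery returned a reverse-geocoding tool. The slug and argument names below are hypothetical; always copy the real ones from the RUBE_SEARCH_TOOLS results:

RUBE_MULTI_EXECUTE_TOOL
tools: [{
  tool_slug: "BIG_DATA_CLOUD_REVERSE_GEOCODE",   // hypothetical; use the slug returned by search
  arguments: {latitude: 40.7128, longitude: -74.006}
}]
memory: {}
session_id: "your_session_id"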

Known Pitfalls

  • Always search first: Tool schemas change. Never hardcode tool slugs or arguments without calling RUBE_SEARCH_TOOLS
  • Check connection: Verify RUBE_MANAGE_CONNECTIONS shows ACTIVE status before executing tools
  • Schema compliance: Use exact field names and types from the search results
  • Memory parameter: Always include memory in RUBE_MULTI_EXECUTE_TOOL calls, even if empty ({})
  • Session reuse: Reuse session IDs within a workflow. Generate new ones for new workflows
  • Pagination: Check responses for pagination tokens and continue fetching until complete (see the sketch after this list)
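
A pagination sketch in the same call notation; the cursor field name is hypothetical, so use whatever token the actual response returns:

RUBE_MULTI_EXECUTE_TOOL   // first page
tools: [{tool_slug: "TOOL_SLUG_FROM_SEARCH", arguments: {/* schema-compliant args */}}]
memory: {}
session_id: "your_session_id"

RUBE_MULTI_EXECUTE_TOOL   // next page; repeat until the response returns no token
tools: [{tool_slug: "TOOL_SLUG_FROM_SEARCH", arguments: {/* same args, plus: */ cursor: "token_from_previous_response"}}]
memory: {}
session_id: "your_session_id"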

Quick Reference

  • Find tools: RUBE_SEARCH_TOOLS with a Big Data Cloud-specific use case
  • Connect: RUBE_MANAGE_CONNECTIONS with toolkit big_data_cloud
  • Execute: RUBE_MULTI_EXECUTE_TOOL with discovered tool slugs
  • Bulk ops: RUBE_REMOTE_WORKBENCH with run_composio_tool() (see the sketch below)
  • Full schema: RUBE_GET_TOOL_SCHEMAS for tools with schemaRef
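
For bulk operations, a hedged workbench sketch: run_composio_tool() is named in the reference above, but its exact signature is not documented here, so the call shape and slug below are illustrative:

RUBE_REMOTE_WORKBENCH
code: |
  # illustrative only; adapt to the real run_composio_tool signature
  for lat, lon in [(40.7128, -74.006), (51.5074, -0.1278)]:
      result = run_composio_tool("BIG_DATA_CLOUD_REVERSE_GEOCODE",  # hypothetical slug
                                 {"latitude": lat, "longitude": lon})
      print(result)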

Powered by Composio

Source

git clone https://github.com/ComposioHQ/awesome-claude-skills.git

The skill file is at composio-skills/big-data-cloud-automation/SKILL.md.

Overview

This skill automates Big Data Cloud operations through Composio's Big Data Cloud toolkit, accessed via Rube MCP. It directs you to fetch current tool schemas with RUBE_SEARCH_TOOLS before every workflow so you never hardcode tool slugs or arguments, and it covers setup, tool discovery, and the core discover-check-execute workflow pattern.

How This Skill Works

Connect Rube MCP and confirm it is available by checking that RUBE_SEARCH_TOOLS responds. Then discover the tools for your Big Data Cloud task, verify the big_data_cloud connection shows ACTIVE with RUBE_MANAGE_CONNECTIONS, and execute the chosen tools with RUBE_MULTI_EXECUTE_TOOL, passing the required memory object and the session id generated during discovery.
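
Putting it together, an end-to-end sketch showing one session id threading through the whole workflow (the slug and arguments are placeholders, as throughout this doc):

RUBE_SEARCH_TOOLS
queries: [{use_case: "your Big Data Cloud task"}]
session: {generate_id: true}   // note the session id in the response; reuse it below

RUBE_MANAGE_CONNECTIONS
toolkits: ["big_data_cloud"]
session_id: "session_id_from_search"   // proceed only once status is ACTIVE

RUBE_MULTI_EXECUTE_TOOL
tools: [{tool_slug: "SLUG_FROM_SEARCH", arguments: {/* schema-compliant args */}}]
memory: {}
session_id: "session_id_from_search"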

When to Use It

  • You are automating a Big Data Cloud workflow and need to adapt to changing tool schemas without manual reconfiguration.
  • You want to validate the Big Data Cloud connection is ACTIVE before running any workflows.
  • You follow a core pattern of discover, check, and execute for each data task.
  • You plan to perform bulk or multi-tool operations in a single workflow.
  • You need to reuse session IDs within a workflow instead of creating new ones for every step.

Quick Start

  1. Run RUBE_SEARCH_TOOLS with your Big Data Cloud use case to discover available tools and schemas.
  2. Run RUBE_MANAGE_CONNECTIONS for toolkit big_data_cloud and verify the status is ACTIVE.
  3. Run RUBE_MULTI_EXECUTE_TOOL with the discovered tool_slug, memory, and a valid session_id.

Best Practices

  • Always search first: tool schemas change, so never hardcode tool slugs or arguments.
  • Check connection: ensure RUBE_MANAGE_CONNECTIONS shows ACTIVE before executing tools.
  • Schema compliance: use exact field names and types from the search results.
  • Memory parameter: always include memory in RUBE_MULTI_EXECUTE_TOOL calls, even if empty.
  • Session reuse: reuse session IDs within a workflow to maintain context.

Example Use Cases

  • Schedule a nightly ETL by discovering tools with RUBE_SEARCH_TOOLS and then executing the appropriate tool slug with memory and a session_id.
  • Validate and establish a big_data_cloud connection before a data load, ensuring the connection remains ACTIVE.
  • Run a multi-step workflow in a single run by reusing the same session_id across tools.
  • Perform bulk data operations using RUBE_REMOTE_WORKBENCH and run_composio_tool for multiple tasks.
  • Fetch full tool schemas with RUBE_GET_TOOL_SCHEMAS to verify required fields before execution.
