execute
npx machina-cli add skill synaptiai/agent-capability-standard/execute --openclawIntent
Execute code or commands in a controlled manner, capturing output for verification. Unlike `mutate`, this capability is for operations that don't permanently change state (tests, queries, builds, analysis tools).
Success criteria:
- Code/command executed successfully
- Output captured completely
- Exit code recorded
- Errors properly surfaced
Compatible schemas:
schemas/output_schema.yaml
Inputs
| Parameter | Required | Type | Description |
|---|---|---|---|
| code | Yes | string | Code or command to execute |
| language | No | string | Programming language or shell (bash, python, ruby, etc.) |
| timeout | No | string | Maximum execution time (default: "60s") |
| environment | No | object | Environment variables to set |
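The `timeout` parameter is a duration string such as the default `"60s"`. A minimal Python sketch of parsing it into seconds (the `ms` and `m` suffixes are an assumption; the skill only documents `"60s"`):

```python
import re

def parse_timeout(value: str = "60s") -> float:
    """Parse a duration string like '60s', '2m', or '500ms' into seconds.

    Supported suffixes are illustrative; only '60s' appears in the spec.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m)", value.strip())
    if not match:
        raise ValueError(f"unrecognized timeout: {value!r}")
    number, unit = float(match.group(1)), match.group(2)
    return {"ms": number / 1000, "s": number, "m": number * 60}[unit]
```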
Procedure
1. Validate execution request: Ensure code is safe to run
   - Check for mutation operations (if found, suggest `mutate` instead)
   - Verify timeout is reasonable
   - Confirm execution environment
2. Prepare environment: Set up execution context
   - Set required environment variables
   - Ensure dependencies are available
   - Create isolated context if needed
3. Execute code: Run the code/command
   - Capture stdout and stderr
   - Record start time
   - Monitor for timeout
4. Capture results: Collect execution output
   - Record exit code
   - Capture complete stdout
   - Capture complete stderr
   - Note execution duration
5. Analyze output: Interpret results
   - Identify success/failure from exit code
   - Extract key information from output
   - Note warnings or anomalies
6. Return results: Structure output for consumption
   - Include all captured data
   - Provide execution summary
   - Reference evidence for assertions
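Steps 2 through 4 above can be sketched in Python. This is an illustrative helper, not the skill's actual implementation; it assumes a POSIX shell and discards partial output on timeout for simplicity:

```python
import os
import subprocess
import time
from datetime import datetime, timezone

def execute(code, timeout_s=60.0, environment=None):
    """Run a shell command, capturing output, exit code, and duration.

    Sketch only: runs everything through the shell and merges
    `environment` over the parent process environment.
    """
    env = {**os.environ, **environment} if environment else None
    started = datetime.now(timezone.utc)
    t0 = time.monotonic()
    try:
        proc = subprocess.run(code, shell=True, capture_output=True,
                              text=True, timeout=timeout_s, env=env)
        exit_code, stdout, stderr = proc.returncode, proc.stdout, proc.stderr
    except subprocess.TimeoutExpired:
        exit_code = -1  # the process was killed; no real exit code exists
        stdout = ""     # partial output discarded in this sketch
        stderr = f"timed out after {timeout_s}s"
    duration = time.monotonic() - t0
    return {
        "result": {
            "success": exit_code == 0,
            "exit_code": exit_code,
            "stdout": stdout,
            "stderr": stderr,
            "duration": f"{duration:.2f}s",
        },
        "execution": {
            "command": code,
            "language": "bash",
            "started_at": started.isoformat(),
            "completed_at": datetime.now(timezone.utc).isoformat(),
        },
    }
```

Note that `success` is derived purely from the exit code, matching the Output Contract below; a timed-out process is reported as a failure with a sentinel exit code.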
Output Contract
Return a structured object:
result:
  success: boolean          # Exit code == 0
  exit_code: integer        # Process exit code
  stdout: string            # Standard output
  stderr: string            # Standard error
  duration: string          # Execution time
execution:
  command: string           # What was executed
  language: string          # Execution environment
  started_at: string        # ISO timestamp
  completed_at: string      # ISO timestamp
analysis:
  summary: string           # One-line result summary
  warnings: array[string]   # Notable warnings
  errors: array[string]     # Extracted error messages
evidence_anchors: ["command:output"]
Field Definitions
| Field | Type | Description |
|---|---|---|
| result.success | boolean | Whether execution succeeded |
| result.exit_code | integer | Process exit code |
| result.stdout | string | Standard output |
| result.stderr | string | Standard error |
| result.duration | string | How long execution took |
| execution | object | Execution metadata |
| analysis | object | Interpreted results |
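A consumer can check a returned object against this contract. The following validator is a sketch, assuming the contract above is authoritative; the helper name is hypothetical:

```python
# Expected sections and field types, transcribed from the Output Contract.
CONTRACT = {
    "result": {"success": bool, "exit_code": int, "stdout": str,
               "stderr": str, "duration": str},
    "execution": {"command": str, "language": str,
                  "started_at": str, "completed_at": str},
    "analysis": {"summary": str, "warnings": list, "errors": list},
}

def validate_result(obj: dict) -> list[str]:
    """Return a list of contract violations; an empty list means valid."""
    problems = []
    for section, fields in CONTRACT.items():
        block = obj.get(section)
        if not isinstance(block, dict):
            problems.append(f"missing section: {section}")
            continue
        for field, expected in fields.items():
            if field not in block:
                problems.append(f"missing field: {section}.{field}")
            elif not isinstance(block[field], expected):
                problems.append(f"wrong type: {section}.{field}")
    return problems
```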
Examples
Example 1: Run Tests
Input:
code: "npm test -- --grep 'UserService'"
language: "bash"
timeout: "120s"
Output:
result:
  success: true
  exit_code: 0
  stdout: |
    > project@1.0.0 test
    > jest --grep 'UserService'
    PASS src/services/__tests__/UserService.test.ts
      UserService
        ✓ creates user with valid data (45ms)
        ✓ validates email format (12ms)
        ✓ hashes password on save (23ms)
    Test Suites: 1 passed, 1 total
    Tests: 3 passed, 3 total
  stderr: ""
  duration: "2.3s"
execution:
  command: "npm test -- --grep 'UserService'"
  language: "bash"
  started_at: "2024-01-15T10:30:00Z"
  completed_at: "2024-01-15T10:30:02Z"
analysis:
  summary: "3 tests passed in UserService"
  warnings: []
  errors: []
evidence_anchors:
  - "command:npm test:output"
Example 2: Execute Query
Input:
code: "SELECT COUNT(*) as user_count FROM users WHERE created_at > '2024-01-01'"
language: "sql"
environment:
  DATABASE_URL: "${DATABASE_URL}"
Output:
result:
  success: true
  exit_code: 0
  stdout: |
    user_count
    ----------
    15423
    (1 row)
  stderr: ""
  duration: "0.15s"
execution:
  command: "psql -c \"SELECT COUNT(*)...\""
  language: "sql"
  started_at: "2024-01-15T10:35:00Z"
  completed_at: "2024-01-15T10:35:00Z"
analysis:
  summary: "Query returned 15423 users created in 2024"
  warnings: []
  errors: []
evidence_anchors:
  - "command:psql:query_result"
Example 3: Execution Failure
Input:
code: "python -c 'import nonexistent_module'"
language: "bash"
Output:
result:
  success: false
  exit_code: 1
  stdout: ""
  stderr: |
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    ModuleNotFoundError: No module named 'nonexistent_module'
  duration: "0.05s"
execution:
  command: "python -c 'import nonexistent_module'"
  language: "bash"
  started_at: "2024-01-15T10:40:00Z"
  completed_at: "2024-01-15T10:40:00Z"
analysis:
  summary: "Import failed: module 'nonexistent_module' not found"
  warnings: []
  errors:
    - "ModuleNotFoundError: No module named 'nonexistent_module'"
evidence_anchors:
  - "command:python:error"
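The `analysis.errors` entries in Example 3 can be pulled from stderr with a pattern-based extractor. This sketch is illustrative; the patterns cover common Python and compiler-style messages, and a production extractor would be language-aware:

```python
import re

# Matches lines like "ModuleNotFoundError: ...", "error: ...", "ERROR ...".
ERROR_PATTERN = re.compile(
    r"^\s*(?:[A-Za-z_][\w.]*(?:Error|Exception):.*|error:.*|ERROR.*)$",
    re.MULTILINE,
)

def extract_errors(stderr: str) -> list[str]:
    """Pull likely error messages out of stderr for the analysis block."""
    return [line.strip() for line in ERROR_PATTERN.findall(stderr)]
```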
Verification
- Exit code is captured
- Stdout and stderr are both recorded
- Duration is reasonable
- Analysis summary matches output
- No mutations occurred
Verification tools: Bash (to verify command execution environment)
Safety Constraints
mutation: false
requires_checkpoint: false
requires_approval: true
risk: medium
Capability-specific rules:
- Verify code does not mutate persistent state
- Enforce timeout limits
- Capture all output for audit trail
- Do not execute code that requires elevated privileges without approval
- Prefer `mutate` for state-changing operations
Composition Patterns
Commonly follows:
- `plan` - Execute planned actions
- `generate` - Execute generated code
- `transform` - Execute transformation scripts
Commonly precedes:
- `verify` - Verify execution results
- `detect` - Detect patterns in output
- `audit` - Record execution for audit
Anti-patterns:
- Never use execute for persistent state changes (use `mutate`)
- Avoid execute without timeout limits
- Never execute untrusted code without sandboxing
Workflow references:
- See `reference/workflow_catalog.yaml#debug_code_change` for execute in testing
- See `reference/workflow_catalog.yaml#digital_twin_sync_loop` for execute in sync loops
Source
git clone https://github.com/synaptiai/agent-capability-standard
View on GitHub: https://github.com/synaptiai/agent-capability-standard/blob/main/skills/execute/SKILL.md
Overview
Execute code or commands in a controlled, deterministic way and capture all output for verification. This is ideal for tests, builds, analysis tools, or read-only operations that must produce verifiable results.
How This Skill Works
The tool validates the execution request, prepares an isolated environment, runs the code or command, and captures stdout, stderr, exit code, and duration. It then analyzes the output to determine success or failure and returns a structured result for consumption.
When to Use It
- Running unit or integration tests to verify behavior without mutating state
- Executing build commands or scripts as part of CI pipelines
- Invoking CLI tools or analysis utilities to extract verifiable results
- Performing read-only queries or analyses that should not alter system state
- Generating auditable execution records with timestamps for debugging and traceability
Quick Start
- Step 1: Provide the code or command to execute along with language and optional timeout (e.g., { code, language, timeout }).
- Step 2: The system validates safety, sets up the environment, and prepares dependencies as needed.
- Step 3: The command runs; review stdout, stderr, exit_code, and duration in the structured result.
Best Practices
- Validate the request to ensure no unintended mutations and confirm a reasonable timeout
- Prepare an isolated execution environment and inject necessary environment variables
- Capture both stdout and stderr and always record the exit code
- Set a sensible timeout and monitor duration to detect hangs
- Return a complete result object with an execution summary and evidence for assertions
Example Use Cases
- code: "npm test -- --grep 'UserService'", language: "bash"
- code: "make build" or "gradle build", language: "bash"
- code: "eslint . --format compact", language: "bash"
- code: "python generate_report.py", language: "python"
- code: "curl -s https://api.example.com/status | jq .uptime", language: "bash"