Get the FREE Ultimate OpenClaw Setup Guide →

step-functions

Scanned
npx machina-cli add skill itsmostafa/aws-agent-skills/step-functions --openclaw
Files (1)
SKILL.md
9.5 KB

AWS Step Functions

AWS Step Functions is a serverless orchestration service that lets you build and run workflows using state machines. Coordinate multiple AWS services into business-critical applications.

Table of Contents

Core Concepts

Workflow Types

TypeDescriptionPricing
StandardLong-running, durable, exactly-oncePer state transition
ExpressHigh-volume, short-durationPer execution (time + memory)

State Types

StateDescription
TaskExecute work (Lambda, API call)
ChoiceConditional branching
ParallelExecute branches concurrently
MapIterate over array
WaitDelay execution
PassPass input to output
SucceedEnd successfully
FailEnd with failure

Amazon States Language (ASL)

JSON-based language for defining state machines.

Common Patterns

Simple Lambda Workflow

{
  "Comment": "Process order workflow",
  "StartAt": "ValidateOrder",
  "States": {
    "ValidateOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ValidateOrder",
      "Next": "ProcessPayment"
    },
    "ProcessPayment": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ProcessPayment",
      "Next": "FulfillOrder"
    },
    "FulfillOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:FulfillOrder",
      "End": true
    }
  }
}

Create State Machine

AWS CLI:

aws stepfunctions create-state-machine \
  --name OrderWorkflow \
  --definition file://workflow.json \
  --role-arn arn:aws:iam::123456789012:role/StepFunctionsRole \
  --type STANDARD

boto3:

import boto3
import json

sfn = boto3.client('stepfunctions')

definition = {
    "Comment": "Order workflow",
    "StartAt": "ProcessOrder",
    "States": {
        "ProcessOrder": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:...",
            "End": True
        }
    }
}

response = sfn.create_state_machine(
    name='OrderWorkflow',
    definition=json.dumps(definition),
    roleArn='arn:aws:iam::123456789012:role/StepFunctionsRole',
    type='STANDARD'
)

Start Execution

import boto3
import json

sfn = boto3.client('stepfunctions')

response = sfn.start_execution(
    stateMachineArn='arn:aws:states:us-east-1:123456789012:stateMachine:OrderWorkflow',
    name='order-12345',
    input=json.dumps({
        'order_id': '12345',
        'customer_id': 'cust-789',
        'items': [{'product_id': 'prod-1', 'quantity': 2}]
    })
)

execution_arn = response['executionArn']

Choice State (Conditional Logic)

{
  "StartAt": "CheckOrderValue",
  "States": {
    "CheckOrderValue": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.total",
          "NumericGreaterThan": 1000,
          "Next": "HighValueOrder"
        },
        {
          "Variable": "$.priority",
          "StringEquals": "rush",
          "Next": "RushOrder"
        }
      ],
      "Default": "StandardOrder"
    },
    "HighValueOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:...:function:ProcessHighValue",
      "End": true
    },
    "RushOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:...:function:ProcessRush",
      "End": true
    },
    "StandardOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:...:function:ProcessStandard",
      "End": true
    }
  }
}

Parallel Execution

{
  "StartAt": "ProcessInParallel",
  "States": {
    "ProcessInParallel": {
      "Type": "Parallel",
      "Branches": [
        {
          "StartAt": "UpdateInventory",
          "States": {
            "UpdateInventory": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:...:function:UpdateInventory",
              "End": true
            }
          }
        },
        {
          "StartAt": "SendNotification",
          "States": {
            "SendNotification": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:...:function:SendNotification",
              "End": true
            }
          }
        },
        {
          "StartAt": "UpdateAnalytics",
          "States": {
            "UpdateAnalytics": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:...:function:UpdateAnalytics",
              "End": true
            }
          }
        }
      ],
      "Next": "Complete"
    },
    "Complete": {
      "Type": "Succeed"
    }
  }
}

Map State (Iteration)

{
  "StartAt": "ProcessItems",
  "States": {
    "ProcessItems": {
      "Type": "Map",
      "ItemsPath": "$.items",
      "MaxConcurrency": 10,
      "Iterator": {
        "StartAt": "ProcessItem",
        "States": {
          "ProcessItem": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:...:function:ProcessItem",
            "End": true
          }
        }
      },
      "ResultPath": "$.processedItems",
      "End": true
    }
  }
}

Error Handling

{
  "StartAt": "ProcessWithRetry",
  "States": {
    "ProcessWithRetry": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:...:function:Process",
      "Retry": [
        {
          "ErrorEquals": ["Lambda.ServiceException", "Lambda.TooManyRequestsException"],
          "IntervalSeconds": 2,
          "MaxAttempts": 6,
          "BackoffRate": 2
        },
        {
          "ErrorEquals": ["States.Timeout"],
          "IntervalSeconds": 5,
          "MaxAttempts": 3,
          "BackoffRate": 1.5
        }
      ],
      "Catch": [
        {
          "ErrorEquals": ["CustomError"],
          "ResultPath": "$.error",
          "Next": "HandleCustomError"
        },
        {
          "ErrorEquals": ["States.ALL"],
          "ResultPath": "$.error",
          "Next": "HandleAllErrors"
        }
      ],
      "End": true
    },
    "HandleCustomError": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:...:function:HandleCustom",
      "End": true
    },
    "HandleAllErrors": {
      "Type": "Fail",
      "Error": "ProcessingFailed",
      "Cause": "An error occurred during processing"
    }
  }
}

CLI Reference

State Machine Management

CommandDescription
aws stepfunctions create-state-machineCreate state machine
aws stepfunctions update-state-machineUpdate definition
aws stepfunctions delete-state-machineDelete state machine
aws stepfunctions list-state-machinesList state machines
aws stepfunctions describe-state-machineGet details

Executions

CommandDescription
aws stepfunctions start-executionStart execution
aws stepfunctions stop-executionStop execution
aws stepfunctions describe-executionGet execution details
aws stepfunctions list-executionsList executions
aws stepfunctions get-execution-historyGet execution history

Best Practices

Design

  • Keep states focused — one purpose per state
  • Use meaningful state names
  • Implement comprehensive error handling
  • Use Parallel for independent tasks
  • Use Map for batch processing

Performance

  • Use Express workflows for high-volume, short tasks
  • Set appropriate timeouts
  • Limit Map concurrency to avoid throttling
  • Use SDK integrations when possible (avoid Lambda wrapper)

Reliability

  • Retry transient errors
  • Catch and handle specific errors
  • Use idempotent operations
  • Enable X-Ray tracing

Cost Optimization

  • Use Express for short workflows (< 5 minutes)
  • Combine related operations to reduce transitions
  • Use Wait states instead of Lambda delays

Troubleshooting

Execution Failed

# Get execution history
aws stepfunctions get-execution-history \
  --execution-arn arn:aws:states:us-east-1:123456789012:execution:MyWorkflow:exec-123 \
  --query 'events[?type==`TaskFailed` || type==`ExecutionFailed`]'

Lambda Timeout

Causes:

  • Lambda running too long
  • Task timeout too short

Fix:

{
  "Type": "Task",
  "Resource": "arn:aws:lambda:...",
  "TimeoutSeconds": 300,
  "HeartbeatSeconds": 60
}

State Stuck

Check:

  • Task state waiting for callback
  • Wait state not yet elapsed
  • Activity worker not responding

Invalid State Machine

# Validate definition
aws stepfunctions validate-state-machine-definition \
  --definition file://workflow.json

References

Source

git clone https://github.com/itsmostafa/aws-agent-skills/blob/main/skills/step-functions/SKILL.mdView on GitHub

Overview

AWS Step Functions provides serverless workflow orchestration by coordinating AWS services through state machines. It supports Standard long-running flows and Express high-volume tasks, with built-in error handling, retries, and parallel execution. This skill helps you design, implement, and debug end-to-end workflows.

How This Skill Works

Define a workflow in Amazon States Language (ASL) as a JSON state machine. Step Functions executes states such as Task, Choice, Parallel, Map, and Wait, while managing retries, backoffs, and data flow between steps. It integrates with Lambda, API Gateway, and other AWS services, and you can monitor executions in the console.

When to Use It

  • Coordinating multiple AWS services into a business process (e.g., order processing, data pipelines).
  • Implementing error handling, retries with backoff, and failure paths.
  • Running long-lived workflows (Standard) or high-volume, short-duration tasks (Express).
  • Orchestrating parallel branches to improve throughput and efficiency.
  • Debugging and auditing executions with step-by-step state history.

Quick Start

  1. Step 1: Define your ASL state machine JSON that models the workflow.
  2. Step 2: Deploy the state machine using AWS CLI or boto3 to create it.
  3. Step 3: Start an execution with input data and monitor via the console or CloudWatch.

Best Practices

  • Choose Standard vs Express based on durability, latency, and scale needs.
  • Design idempotent tasks and implement retries with exponential backoff.
  • Keep state machine definitions focused; use Map/Parallel to handle batches.
  • Filter sensitive data from inputs/outputs and minimize payloads.
  • Enable CloudWatch logs, metrics, and alarms to monitor executions.

Example Use Cases

  • Simple Lambda Workflow: ValidateOrder → ProcessPayment → FulfillOrder.
  • Create and deploy a state machine via AWS CLI (create-state-machine) or boto3 (create_state_machine).
  • Start an execution with an input payload like an order and track its progress.
  • Routing with Choice state based on total value or priority (high-value or rush orders).
  • Processing items in parallel using Parallel or Map states to speed up workloads.

Frequently Asked Questions

Add this skill to your agents
Sponsor this space

Reach thousands of developers