Designing Workflow Orchestration with AWS Step Functions - Choosing Between Standard and Express

This article clarifies how to choose between Standard and Express workflows, then covers declarative error handling with Retry/Catch and large-scale parallel processing with the Distributed Map state.

Core Concepts of Step Functions

Step Functions is a serverless workflow orchestration service. Workflows are defined as state machines, where each state performs one step such as an AWS service call, a conditional branch, or parallel processing. Without an orchestrator, chaining Lambda functions means implementing invocation order, error handling, and retry logic inside each function. With Step Functions, these concerns are described declaratively in the workflow definition, written in Amazon States Language (ASL). The visual editor (Workflow Studio) lets you build workflows with drag-and-drop, and the console shows execution status in real time through a graphical view.
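As a rough illustration of what a declarative definition looks like, the sketch below builds a two-step ASL state machine as a Python dictionary and serializes it to JSON. The state names and Lambda ARNs are hypothetical placeholders, not values from any real account.

```python
import json

# Hypothetical Lambda ARNs, used for illustration only.
VALIDATE_ARN = "arn:aws:lambda:us-east-1:123456789012:function:validate-order"
CHARGE_ARN = "arn:aws:lambda:us-east-1:123456789012:function:charge-payment"

# Invocation order lives in the definition ("Next"), not in the function code.
definition = {
    "Comment": "Two Lambda tasks run in sequence",
    "StartAt": "ValidateOrder",
    "States": {
        "ValidateOrder": {
            "Type": "Task",
            "Resource": VALIDATE_ARN,
            "Next": "ChargePayment",
        },
        "ChargePayment": {
            "Type": "Task",
            "Resource": CHARGE_ARN,
            "End": True,
        },
    },
}

# ASL is plain JSON, so the dictionary can be serialized directly.
asl_json = json.dumps(definition, indent=2)
print(asl_json)
```

Neither function needs to know that the other exists. The ordering is expressed entirely by `StartAt` and `Next`.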

Choosing Between Standard and Express Workflows

Standard workflows persist execution history at every state transition and support long-running executions of up to 1 year. Pricing is approximately $0.000025 USD per state transition. They suit order processing, approval flows, and data pipelines, where execution history must be auditable and execution frequency is relatively low. Express workflows do not persist execution history within Step Functions (history is available only if you enable logging to CloudWatch Logs) and are optimized for short executions of up to 5 minutes. Pricing is based on execution count and duration, which makes them significantly cheaper than Standard workflows for high-volume, short-lived executions. They are well suited to IoT device event processing, API Gateway backends, and streaming data transformations that need high throughput and short execution times.
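The selection criteria above can be condensed into a small decision helper. This is a simplified sketch, not an official rule: the function name and its two parameters are hypothetical, and real decisions usually weigh cost and throughput as well.

```python
EXPRESS_MAX_SECONDS = 5 * 60  # Express executions are limited to 5 minutes


def choose_workflow_type(max_duration_seconds: float,
                         needs_persistent_history: bool) -> str:
    """Return "STANDARD" or "EXPRESS" from the two criteria in the text."""
    # Anything that can run longer than 5 minutes cannot be Express.
    if max_duration_seconds > EXPRESS_MAX_SECONDS:
        return "STANDARD"
    # Auditable, persisted execution history is a Standard feature.
    if needs_persistent_history:
        return "STANDARD"
    # Short-lived work without audit requirements fits Express.
    return "EXPRESS"


# An approval flow that may wait days for a human response.
print(choose_workflow_type(3 * 24 * 3600, needs_persistent_history=True))
# An IoT event handler that finishes within a few seconds.
print(choose_workflow_type(3, needs_persistent_history=False))
```

The approval flow resolves to Standard and the IoT handler to Express.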

Error Handling and Parallel Processing

Step Functions' Retry field allows you to declaratively define retry count, initial wait time, and backoff rate for each error type. For example, you can retry Lambda.ServiceException up to 3 times with exponential backoff, while routing business logic errors directly to another state via Catch without retrying. Errors caught by the Catch field can trigger fallback states for sending notifications or running cleanup tasks. The Map state executes the same processing in parallel for each element of an array. In distributed map mode, it can process CSV or JSON files in an S3 bucket as input, running up to 10,000 parallel executions for large-scale batch processing.
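The Retry/Catch behavior described above might look like the following Task state, again written as a Python dictionary. The Lambda ARN, the `PaymentDeclined` error name, and the `NotifyFailure` and `Cleanup` target states are hypothetical. The `retry_delays` helper is only an illustration of how the wait times grow, and it ignores optional settings such as a maximum delay or jitter.

```python
import json

task_state = {
    "Type": "Task",
    # Hypothetical Lambda ARN, for illustration only.
    "Resource": "arn:aws:lambda:us-east-1:123456789012:function:charge-payment",
    "Retry": [
        {
            # Transient service error: retry with exponential backoff.
            "ErrorEquals": ["Lambda.ServiceException"],
            "IntervalSeconds": 2,   # initial wait time
            "MaxAttempts": 3,       # retry count
            "BackoffRate": 2.0,     # multiplier applied after each retry
        }
    ],
    "Catch": [
        {
            # Hypothetical business error: no retrier matches it,
            # so it goes straight to the fallback state.
            "ErrorEquals": ["PaymentDeclined"],
            "Next": "NotifyFailure",
        },
        {
            # Anything else, including exhausted retries, runs cleanup.
            "ErrorEquals": ["States.ALL"],
            "Next": "Cleanup",
        },
    ],
    "End": True,
}


def retry_delays(retrier: dict) -> list:
    """Seconds waited before each retry attempt, ignoring delay caps and jitter."""
    return [
        retrier["IntervalSeconds"] * retrier["BackoffRate"] ** attempt
        for attempt in range(retrier["MaxAttempts"])
    ]


print(json.dumps(task_state, indent=2))
print(retry_delays(task_state["Retry"][0]))  # [2.0, 4.0, 8.0]
```

This is a state fragment rather than a full state machine: `NotifyFailure` and `Cleanup` would need to be defined alongside it in the `States` block.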

Step Functions Pricing

Standard workflows cost approximately $0.000025 per state transition, or about $0.10 for 4,000 state transitions. Express workflows are priced based on a combination of execution count (approximately $1.00 per million executions) and execution duration (approximately $0.00001667 per GB-second). For high-throughput workloads running thousands of times per second, Standard workflow state transition charges escalate rapidly, making Express significantly cheaper. Conversely, for long-running workflows executed only a few times per day, Standard is more economical. The free tier includes 4,000 state transitions per month for Standard and 25,000 executions per month for Express.
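To make the comparison concrete, the sketch below applies the approximate rates quoted above to a hypothetical workload of one million executions, each with 5 state transitions, 1 second of duration, and 64 MB of memory. It is a simplification: it ignores the free tier and any rounding of billed duration or memory, and the workload numbers are assumptions chosen only for illustration.

```python
# Approximate rates taken from the text above (USD).
STANDARD_PER_TRANSITION = 0.000025
EXPRESS_PER_REQUEST = 1.00 / 1_000_000
EXPRESS_PER_GB_SECOND = 0.00001667


def standard_cost(executions: int, transitions_per_execution: int) -> float:
    """Standard workflows are billed per state transition."""
    return executions * transitions_per_execution * STANDARD_PER_TRANSITION


def express_cost(executions: int, seconds_per_execution: float,
                 memory_gb: float) -> float:
    """Express workflows are billed per request plus per GB-second."""
    request_charge = executions * EXPRESS_PER_REQUEST
    gb_seconds = executions * seconds_per_execution * memory_gb
    return request_charge + gb_seconds * EXPRESS_PER_GB_SECOND


# Hypothetical workload: 1M executions, 5 transitions, 1 s, 64 MB each.
executions = 1_000_000
std = standard_cost(executions, transitions_per_execution=5)
exp = express_cost(executions, seconds_per_execution=1.0, memory_gb=64 / 1024)

print(f"Standard: ${std:.2f}")
print(f"Express:  ${exp:.2f}")
```

Under these assumptions Standard comes to roughly $125 and Express to roughly $2. For a workflow run a handful of times per day with long waits between steps, the Standard charge stays tiny because waiting adds no state transitions.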

Summary

Step Functions is a service that manages workflow complexity in serverless architectures. It separates orchestration logic from Lambda function code, enabling visualization and management through declarative workflow definitions. Choosing between Standard and Express according to execution duration, volume, and audit requirements keeps costs under control, and standardizing error handling with Retry/Catch makes serverless applications more robust.