Troubleshoot states in AWS Step Functions by using Amazon Bedrock
Created by Aniket Kurzadkar (AWS) and Sangam Kushwaha (AWS)
Summary
AWS Step Functions error handling capabilities can help you see an error that occurs during a state in a workflow, but it can still be a challenge to find the root cause of an error and debug it. This pattern addresses that challenge and shows how Amazon Bedrock can help you resolve errors that occur during states in Step Functions.
Step Functions provides workflow orchestration, making it easier for developers to automate processes. Step Functions also provides error handling functionality that provides the following benefits:
Developers can create more resilient applications that don't fail completely when something goes wrong.
Workflows can include conditional logic to handle different types of errors differently.
The system can automatically retry failed operations, perhaps with exponential backoff.
Alternative execution paths can be defined for error scenarios, allowing the workflow to adapt and continue processing.
When an error occurs in a Step Functions workflow, this pattern shows how the error message and context can be sent to a foundation model (FM) like Claude 3 that’s supported by Step Functions. The FM can analyze the error, categorize it, and suggest potential remediation steps.
Prerequisites and limitations
Prerequisites
An active AWS account
Basic understanding of AWS Step Functions and workflows
Amazon Bedrock API connectivity
Limitations
You can use this pattern’s approach for various AWS services. However, the results might vary according to the prompt created by AWS Lambda that’s subsequently evaluated by Amazon Bedrock.
Some AWS services aren’t available in all AWS Regions. For Region availability, see AWS services by Region
. For specific endpoints, see Service endpoints and quotas, and choose the link for the service.
Architecture
The following diagram shows the workflow and architecture components for this pattern.

The diagram shows the automated workflow for error handling and notification in a Step Functions state machine:
The developer starts a state machine’s execution.
The Step Functions state machine begins processing its states. There are two possible outcomes:
(a) If all states execute successfully, the workflow proceeds directly to Amazon SNS for an email success notification.
(b) If any state fails, the workflow moves to the error handling Lambda function.
In case of an error, the following occurs:
(a) The Lambda function (error handler) is triggered. The Lambda function extracts the error message from the event data that the Step Functions state machine passed to it. Then the Lambda function prepares a prompt based on this error message and sends the prompt to Amazon Bedrock. The prompt requests solutions and suggestions related to the specific error encountered.
(b) Amazon Bedrock, which hosts the generative AI model, processes the input prompt. (This pattern uses the Anthropic Claude 3 foundation model (FM), which is one of many FMs that Amazon Bedrock supports.) The AI model analyses the error context. Then the model generates a response that can include explanations of why the error occurred, potential solutions to resolve the error, and suggestions to avoid making the same mistakes in the future.
Amazon Bedrock returns its AI-generated response to the Lambda function. The Lambda function processes the response, potentially formatting it or extracting key information. Then the Lambda function sends the response to the state machine output.
After error handling or successful execution, the workflow concludes by triggering Amazon SNS to send an email notification.
Tools
AWS services
Amazon Bedrock is a fully managed service that makes high-performing foundation models (FMs) from leading AI startups and Amazon available for your use through a unified API.
AWS Lambda is a compute service that helps you run code without needing to provision or manage servers. It runs your code only when needed and scales automatically, so you pay only for the compute time that you use.
Amazon Simple Notification Service (Amazon SNS) helps you coordinate and manage the exchange of messages between publishers and clients, including web servers and email addresses.
AWS Step Functions is a serverless orchestration service that helps you combine AWS Lambda functions and other AWS services to build business-critical applications.
Best practices
Given that Amazon Bedrock is a generative AI model that learns from trained data, it also uses that data to train and generate context. As a best practice, conceal any private information that might lead to data leak problems.
Although generative AI can provide valuable insights, critical error-handling decisions should still involve human oversight, especially in production environments.
Epics
Task | Description | Skills required |
---|---|---|
Create a state machine. | To create a state machine that’s appropriate for your workflow, do the following:
| AWS DevOps |
Task | Description | Skills required |
---|---|---|
Create a Lambda function. | To create a Lambda function, do the following:
| AWS DevOps |
Set up the required logic in the Lambda code. |
| AWS DevOps |
Task | Description | Skills required |
---|---|---|
Set up Lambda to handle errors in Step Functions. | To set up Step Functions to handle errors without disrupting the workflow, do the following:
| AWS DevOps |
Troubleshooting
Issue | Solution |
---|---|
Lambda cannot access the Amazon Bedrock API (Not authorized to perform) | This error occurs when the Lambda role doesn’t have permission to access the Amazon Bedrock API. To resolve this issue, add the |
Lambda timeout error | Sometimes it might take more than 30 seconds to generate a response and send it back, depending on the prompt. To resolve this issue, increase the configuration time. For more information, see Configure Lambda function timeout in the AWS Lambda Developer Guide. |