Troubleshoot states in AWS Step Functions by using Amazon Bedrock - AWS Prescriptive Guidance

Created by Aniket Kurzadkar (AWS) and Sangam Kushwaha (AWS)

Summary

AWS Step Functions error handling capabilities can help you see an error that occurs during a state in a workflow, but it can still be a challenge to find the root cause of an error and debug it. This pattern addresses that challenge and shows how Amazon Bedrock can help you resolve errors that occur during states in Step Functions.

Step Functions provides workflow orchestration, making it easier for developers to automate processes. Step Functions also includes error handling functionality that offers the following benefits:

  • Developers can create more resilient applications that don't fail completely when something goes wrong.

  • Workflows can include conditional logic to handle different types of errors differently.

  • The system can automatically retry failed operations, optionally with exponential backoff.

  • Alternative execution paths can be defined for error scenarios, allowing the workflow to adapt and continue processing.
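As a minimal sketch of the retry behavior described above, the following Python snippet builds a hypothetical Retry block in Amazon States Language and computes the wait before each retry attempt. The values (three attempts, a 2-second interval, a backoff rate of 2.0) are illustrative assumptions, not values from this pattern, and the calculation ignores optional settings such as jitter or a maximum delay.

```python
import json

# Hypothetical retry policy for a Task state, written in Amazon States Language.
# The numbers are illustrative assumptions.
retry_policy = {
    "ErrorEquals": ["States.TaskFailed"],
    "IntervalSeconds": 2,
    "MaxAttempts": 3,
    "BackoffRate": 2.0,
}


def retry_delays(policy):
    """Return the wait, in seconds, before each retry attempt.

    The wait before retry n is IntervalSeconds * BackoffRate ** (n - 1).
    """
    return [
        policy["IntervalSeconds"] * policy["BackoffRate"] ** attempt
        for attempt in range(policy["MaxAttempts"])
    ]


if __name__ == "__main__":
    # Show the Retry block as it would appear inside a state definition.
    print(json.dumps({"Retry": [retry_policy]}, indent=2))
    print(retry_delays(retry_policy))  # [2.0, 4.0, 8.0]
```

Each wait doubles because the backoff rate is 2.0. A rate of 1.0 would produce a fixed interval instead.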

When an error occurs in a Step Functions workflow, this pattern shows how the error message and context can be sent to a foundation model (FM), such as Anthropic Claude 3, that Amazon Bedrock supports. The FM can analyze the error, categorize it, and suggest potential remediation steps.

Prerequisites and limitations

Prerequisites

Limitations

  • You can use this pattern’s approach for various AWS services. However, the results might vary depending on the prompt that the AWS Lambda function creates and that Amazon Bedrock then evaluates.

  • Some AWS services aren’t available in all AWS Regions. For Region availability, see AWS services by Region. For specific endpoints, see Service endpoints and quotas, and choose the link for the service.

Architecture

The following diagram shows the workflow and architecture components for this pattern.

Workflow for error handling and notification using Step Functions, Amazon Bedrock, and Amazon SNS.

The diagram shows the automated workflow for error handling and notification in a Step Functions state machine:

  1. The developer starts a state machine’s execution.

  2. The Step Functions state machine begins processing its states. There are two possible outcomes:

    • (a) If all states execute successfully, the workflow proceeds directly to Amazon SNS for an email success notification.

    • (b) If any state fails, the workflow moves to the error handling Lambda function.

  3. In case of an error, the following occurs:

    • (a) The Lambda function (error handler) is triggered. The Lambda function extracts the error message from the event data that the Step Functions state machine passed to it. Then the Lambda function prepares a prompt based on this error message and sends the prompt to Amazon Bedrock. The prompt requests solutions and suggestions related to the specific error encountered.

    • (b) Amazon Bedrock, which hosts the generative AI model, processes the input prompt. This pattern uses the Anthropic Claude 3 foundation model (FM), which is one of many FMs that Amazon Bedrock supports. The model analyzes the error context and then generates a response that can include explanations of why the error occurred, potential solutions to resolve the error, and suggestions to avoid the same error in the future.

      Amazon Bedrock returns its AI-generated response to the Lambda function. The Lambda function processes the response, potentially formatting it or extracting key information. Then the Lambda function sends the response to the state machine output.

  4. After error handling or successful execution, the workflow concludes by triggering Amazon SNS to send an email notification.
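Step 3(a) of the workflow might be sketched as follows. This example assumes that the event passed to the Lambda function is the error output of a Step Functions catcher, which contains the fields Error and Cause. The function names and the prompt wording are hypothetical, and the call to Amazon Bedrock is left out so that the sketch stays self-contained.

```python
def build_prompt(event):
    """Build a troubleshooting prompt from a Step Functions error event.

    Assumes the event carries the catcher's error output, which contains
    the fields "Error" (error name) and "Cause" (error details).
    """
    error = event.get("Error", "Unknown error")
    cause = event.get("Cause", "No cause provided")
    return (
        "An AWS Step Functions state failed.\n"
        f"Error: {error}\n"
        f"Cause: {cause}\n"
        "Explain the likely root cause and suggest remediation steps."
    )


def lambda_handler(event, context):
    # Sending the prompt to Amazon Bedrock is omitted in this sketch.
    prompt = build_prompt(event)
    return {"prompt": prompt}
```

Keeping the prompt construction in its own function makes it possible to test the wording locally before the function is deployed.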

Tools

AWS services

  • Amazon Bedrock is a fully managed service that makes high-performing foundation models (FMs) from leading AI startups and Amazon available for your use through a unified API.

  • AWS Lambda is a compute service that helps you run code without needing to provision or manage servers. It runs your code only when needed and scales automatically, so you pay only for the compute time that you use.

  • Amazon Simple Notification Service (Amazon SNS) helps you coordinate and manage the exchange of messages between publishers and clients, including web servers and email addresses.

  • AWS Step Functions is a serverless orchestration service that helps you combine AWS Lambda functions and other AWS services to build business-critical applications.

Best practices

  • The error messages that you send to Amazon Bedrock become part of the model’s prompt and can reappear in the generated response. As a best practice, conceal any private information that might lead to data leak problems before you send the prompt.

  • Although generative AI can provide valuable insights, critical error-handling decisions should still involve human oversight, especially in production environments.
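One way to apply the first practice is to mask identifiers in the error text before it is placed in a prompt. The following sketch masks AWS account IDs only. The function name and the sample error text are hypothetical, and other kinds of private data would need their own rules.

```python
import re

# A 12-digit number enclosed in colons, as account IDs appear in ARNs.
ACCOUNT_ID_PATTERN = re.compile(r":\d{12}:")


def mask_account_ids(text):
    """Replace 12-digit account IDs enclosed in colons with a placeholder."""
    return ACCOUNT_ID_PATTERN.sub(":123456789012:", text)


# Hypothetical error text for illustration.
cause = "Not authorized: arn:aws:lambda:us-east-1:111122223333:function:demo"
print(mask_account_ids(cause))
# Not authorized: arn:aws:lambda:us-east-1:123456789012:function:demo
```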

Epics

Create a state machine for your workflow

Task | Description | Skills required

Create a state machine.

To create a state machine that’s appropriate for your workflow, do the following:

  1. Sign in to the AWS Management Console, and open the AWS Step Functions console.

  2. From the left navigation pane, choose State machines.

  3. Choose Create state machine.

  4. Choose a template according to your use case, or choose Blank to create a template according to your requirements.

AWS DevOps
Create a Lambda function

Task | Description | Skills required

Create a Lambda function.

To create a Lambda function, do the following:

  1. In the AWS Management Console, navigate to the AWS Lambda console.

  2. In the left navigation pane, choose Functions and then choose Create function.

  3. On the Create function page, choose from the options to create a function. Then, enter a name in Function name and choose the appropriate language from the dropdown list in Runtime.

  4. Choose Create function.

AWS DevOps

Set up the required logic in the Lambda code.

  • To connect to the Amazon Bedrock API by using the AWS SDK for Python (Boto3), use the following code.

    This code sets up a client for Amazon Bedrock, prepares the necessary parameters, and then sends a request to the Claude 3 model with a specified prompt.

    This pattern invokes the Claude 3 model. For more information about all the supported foundation models including related model IDs, see Supported foundation models in Amazon Bedrock in the Amazon Bedrock documentation.

    import json

    import boto3
    from botocore.exceptions import ClientError

    # Create a client for the Amazon Bedrock runtime API.
    client = boto3.client(service_name="bedrock-runtime", region_name="selected-region")

    # Select your model ID. The body format depends on the model.
    model_id = "your-model-id"

    try:
        # Invoke Claude 3 with the text prompt that was prepared earlier.
        response = client.invoke_model(
            modelId=model_id,
            body=json.dumps(
                {
                    "anthropic_version": "bedrock-2023-05-31",
                    "max_tokens": 1024,
                    "messages": [
                        {"role": "user", "content": [{"type": "text", "text": prompt}]}
                    ],
                }
            ),
        )
    except ClientError as err:
        print(f"Can't invoke '{model_id}'. Reason: {err}")
        raise

  • (Optional) Replace the AWS account IDs with placeholder account IDs. For security purposes, this approach can be useful for sanitizing logs, error messages, or other output that might contain sensitive account information.

    The following code will find any occurrence of a 12-digit number enclosed in colons (which is the format of AWS account IDs in Amazon Resource Names (ARNs) and some other AWS identifiers) and replace it with the placeholder account ID ":123456789012:".

    import re

    def replace_account_id(input_string):
        # Use a regular expression to find the AWS account ID pattern.
        account_id_pattern = r'(:\d{12}:)'
        # Replace the matched pattern with ":123456789012:".
        modified_string = re.sub(account_id_pattern, ":123456789012:", input_string)
        return modified_string

AWS DevOps
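After the model is invoked, the Lambda function still has to pull the generated text out of the response. The following sketch assumes the Anthropic Claude Messages API response format, in which the body is a JSON document with a content list of text blocks. The function name and the sample body are hypothetical.

```python
import json


def extract_text(raw_body):
    """Return the generated text from a Claude Messages API response body.

    raw_body is the JSON string or bytes read from response["body"].
    """
    payload = json.loads(raw_body)
    # Join all text blocks; other block types are ignored.
    return "".join(
        block["text"]
        for block in payload.get("content", [])
        if block.get("type") == "text"
    )


# Hypothetical response body for illustration.
sample = json.dumps({"content": [{"type": "text", "text": "Check the IAM role."}]})
print(extract_text(sample))  # Check the IAM role.
```

In the Lambda function, the input would typically come from `response["body"].read()`.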
Integrate Step Functions with Lambda

Task | Description | Skills required

Set up Lambda to handle errors in Step Functions.

To set up Step Functions to handle errors without disrupting the workflow, do the following:

  1. In the Step Functions console, navigate to the state machine that you created earlier.

  2. Choose Edit, choose the service that you want to set up error handling for, and then choose Error Handling.

  3. Choose Add new catcher, and for Fallback state, choose Lambda and then choose the Lambda function that you created earlier. For more information, see Catch errors in the Step Functions documentation.

AWS DevOps
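The catcher that the console steps above create corresponds to a Catch field in the state machine definition. The following sketch builds a minimal definition as a Python dictionary and prints it as JSON. The state names, function names, and ARNs are hypothetical placeholders.

```python
import json

# Hypothetical state names and ARNs; replace them with your own.
definition = {
    "StartAt": "ProcessOrder",
    "States": {
        "ProcessOrder": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:process-order",
            # Route any error to the error-handling Lambda function.
            "Catch": [
                {
                    "ErrorEquals": ["States.ALL"],
                    "Next": "AnalyzeError",
                }
            ],
            "End": True,
        },
        "AnalyzeError": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:error-handler",
            "End": True,
        },
    },
}

print(json.dumps(definition, indent=2))
```

Because this catcher doesn't set a ResultPath, the fallback state receives only the error output, which contains the Error and Cause fields. Setting a ResultPath on the catcher would keep the original state input alongside the error.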

Troubleshooting

Issue | Solution

Lambda cannot access the Amazon Bedrock API (Not authorized to perform)

This error occurs when the Lambda function’s execution role doesn’t have permission to access the Amazon Bedrock API. To resolve this issue, attach the AmazonBedrockFullAccess policy to the Lambda role. For more information, see AmazonBedrockFullAccess in the AWS Managed Policy Reference Guide.

Lambda timeout error

Depending on the prompt, it can take more than 30 seconds to generate a response and send it back. To resolve this issue, increase the function’s timeout setting. For more information, see Configure Lambda function timeout in the AWS Lambda Developer Guide.
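The timeout can also be changed programmatically. The following sketch wraps the Boto3 Lambda client call update_function_configuration. The helper name, the function name in the usage comment, and the 120-second value are assumptions for illustration.

```python
def set_function_timeout(lambda_client, function_name, timeout_seconds):
    """Set the timeout of a Lambda function by using a Boto3 Lambda client."""
    # Lambda accepts timeouts from 1 second up to 900 seconds (15 minutes).
    if not 1 <= timeout_seconds <= 900:
        raise ValueError("Lambda timeouts must be between 1 and 900 seconds")
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        Timeout=timeout_seconds,
    )


# Example usage (requires AWS credentials and the boto3 package):
#   import boto3
#   set_function_timeout(boto3.client("lambda"), "error-handler", 120)
```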

Related resources

© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.