Retrying Failed Tasks

The AWS Flow Framework for Ruby provides several ways to retry failed activity or workflow tasks, or even arbitrary methods or blocks of code.

Configuring a Task for Automatic Exponential Retries

To configure an activity or workflow to retry automatically when it fails, pass a set of ExponentialRetryOptions under the exponential_retry key of the options block when you declare the type. For example:

activity :unreliable_activity_with_retry_options do
  {
    version: "1.0",
    default_task_list: "activity_tasklist",
    default_task_schedule_to_start_timeout: 30,
    default_task_start_to_close_timeout: 30,
    exponential_retry: { maximum_attempts: 5 }
  }
end

In the preceding snippet, the activity will automatically be retried (up to 5 times) using an exponential retry algorithm if any exception occurs.
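
To make the "exponential" part concrete, the following plain-Ruby sketch shows how the wait between attempts typically grows. It does not call the framework: the helper name backoff_delays, the 2-second initial interval, and the 2.0 multiplier are assumptions chosen for illustration, not defaults confirmed by this topic.

```ruby
# Hypothetical helper: compute the wait (in seconds) before each retry.
# initial_interval and backoff_coefficient are assumed values, not
# framework defaults confirmed by this topic.
def backoff_delays(maximum_attempts, initial_interval: 2, backoff_coefficient: 2.0)
  (0...maximum_attempts).map do |attempt|
    initial_interval * (backoff_coefficient**attempt)
  end
end

puts backoff_delays(5).inspect  # => [2.0, 4.0, 8.0, 16.0, 32.0]
```

Each delay is the previous one multiplied by the coefficient, so a task that keeps failing is retried less and less often.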

You can also pass exponential retry options when scheduling the task:

client.send(:unreliable_activity_without_retry_options) do
  {
    exponential_retry: {
      maximum_attempts: 5,
      exceptions_to_include: [ArgumentError],
    }
  }
end

In this example, the optional parameter exceptions_to_include was specified, restricting retry attempts to occur only in the case of an ArgumentError. This overrides the default behavior, which attempts a retry after any exception.
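
As a rough mental model, an include list acts as a filter on the exception class. The sketch below is plain Ruby and is not the framework's implementation. The method name retryable? and the keyword names only and except are hypothetical, and two behaviors are assumptions: that subclasses of a listed class also match, and that an exclude list takes priority over an include list.

```ruby
# Hypothetical predicate: decide whether an exception qualifies for a retry.
# Assumptions: an empty include list means "retry on anything" (the default
# described above), subclasses match via is_a?, and exclusions win.
def retryable?(error, only: [], except: [])
  return false if except.any? { |klass| error.is_a?(klass) }

  only.empty? || only.any? { |klass| error.is_a?(klass) }
end

puts retryable?(ArgumentError.new, only: [ArgumentError])  # => true
puts retryable?(RuntimeError.new, only: [ArgumentError])   # => false
puts retryable?(RuntimeError.new)                          # => true
```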

Exponential Retry Attempts and Jitter Logic

When you specify exponential retry options, the AWS Flow Framework for Ruby applies a jitter function by default, to add some randomization to the retry attempts. This helps to reduce the chance that many activities will be retried at exactly the same time.

To use your own jitter logic with exponential retries, supply it through the jitter_function option:

activity_client(:client) do
  {
    from_class: "RetryActivities",
    exponential_retry: {
      should_jitter: true,
      jitter_function: lambda do |seed, max_value|
        Random.new(seed.to_i).rand(max_value)
      end,
      maximum_attempts: 5,
      exceptions_to_include: [StandardError],
    }
  }
end

Tip

If you don't want any jitter function applied to exponential retry attempts, you can turn it off by setting the should_jitter option to false.
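
You can try a jitter function on its own before handing it to the framework. The sketch below calls a lambda with the same shape as the one shown above. The seed and max_value inputs are made up for the demonstration; this topic does not specify what the framework actually passes as the seed.

```ruby
# Same shape as the jitter_function shown above: (seed, max_value) -> Integer.
jitter = lambda do |seed, max_value|
  Random.new(seed.to_i).rand(max_value)
end

# Hypothetical inputs: the real seed supplied by the framework is not
# specified in this topic.
first  = jitter.call("12345", 10)
second = jitter.call("12345", 10)

same_offset = (first == second)        # a given seed always yields the same offset
in_range    = (0...10).cover?(first)   # the offset is always below max_value

puts same_offset  # => true
puts in_range     # => true
```

Because the generator is seeded, the function is deterministic for a given seed. Note that to_i returns 0 for a string that does not start with digits, so such seeds would all produce the same offset.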

Retrying Methods Using an Activity or Workflow Client

You can retry tasks that were not configured for retries at declaration by using the client methods.

You can add exponential retry options using send, as described in Configuring a Task for Automatic Exponential Retries, or by using the exponential_retry or retry methods of the GenericClient class (inherited by both GenericActivityClient and WorkflowClient). These methods retry the named method either with the built-in exponential retry algorithm or with a retry function that you supply.

To use the exponential retry algorithm, use exponential_retry with a method to retry, arguments for the method, and a block of ExponentialRetryOptions:

activity_client.exponential_retry(:my_activity_method, activity_input) do
  {
    exponential_retry: {
      maximum_attempts: 2,
      exceptions_to_include: [ArgumentError],
    }
  }
end

To define your own retry algorithm, call the retry method with the name of the method to retry, your own retry function, the arguments for the method to retry, and a block of RetryOptions:

retry_time_secs = lambda do |first_attempt_time, failure_time, num_retries|
  secs_in_day = 3600 * 24
  if ((failure_time - first_attempt_time) > secs_in_day) then
    return -1
  else
    # Constant rate: just divide the total number of seconds by the number of
    # retries.
    return secs_in_day / num_retries
  end
end

activity_client.retry(:my_activity_method, retry_time_secs, activity_input) do
  {
    exponential_retry: {
      maximum_attempts: 2,
      exceptions_to_include: [ArgumentError],
    }
  }
end
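
A retry function like the one above can be exercised in isolation. The sketch below repeats the lambda, written without explicit return statements, and calls it with sample values. Passing Time objects is an assumption made for the demonstration, as is reading -1 as the signal to stop retrying; this topic does not state either explicitly.

```ruby
# Copy of the retry function above, relying on implicit return values.
retry_time_secs = lambda do |first_attempt_time, failure_time, num_retries|
  secs_in_day = 3600 * 24
  if (failure_time - first_attempt_time) > secs_in_day
    -1
  else
    # Constant rate: divide the seconds in a day by the number of retries.
    secs_in_day / num_retries
  end
end

start = Time.at(0) # arbitrary reference time for the demonstration

puts retry_time_secs.call(start, start + 60, 4)             # => 21600
puts retry_time_secs.call(start, start + 3600 * 24 + 1, 4)  # => -1
```

Integer division raises ZeroDivisionError in Ruby, so a function written this way assumes num_retries is never 0.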

Retrying an Arbitrary Block of Code

Using the with_retry method, you can execute any block of code with retries in the AWS Flow context, by supplying a set of RetryOptions and the block of code to execute.

In this example, with_retry is used to add retry options to an activity that was registered without them:

def handle_unreliable_activity
  retry_options = {
    exponential_retry: {
      maximum_attempts: 5,
      exceptions_to_include: [ArgumentError],
    }
  }

  AWS::Flow::with_retry(retry_options) do
    client.unreliable_activity_without_retry_options
  end
end
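
For comparison, the general idea of wrapping a block in retries can be sketched in plain Ruby with the retry keyword. This is not AWS::Flow::with_retry and does not run in the AWS Flow context. The helper name retry_block is hypothetical, it waits no time between attempts, and it treats maximum_attempts as the total number of calls, which is an assumption about how the count works.

```ruby
# Hypothetical stand-in for a retrying wrapper; not AWS::Flow::with_retry.
# Assumption: maximum_attempts counts every call, including the first.
def retry_block(maximum_attempts:, exceptions_to_include: [StandardError])
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue *exceptions_to_include
    retry if attempts < maximum_attempts
    raise # out of attempts: re-raise the last error
  end
end

calls = 0
result = retry_block(maximum_attempts: 5, exceptions_to_include: [ArgumentError]) do
  calls += 1
  raise ArgumentError, "transient failure" if calls < 3

  "succeeded after #{calls} calls"
end

puts result  # => succeeded after 3 calls
```

Exceptions that are not in the include list propagate immediately, without a second attempt.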

Providing Your Own Retry Logic

Although you can list the errors to retry automatically in the exceptions_to_include option of RetryOptions, and the errors to exclude from retry attempts in the exceptions_to_exclude option, there might be times when you want more control over which conditions initiate a retry attempt.

To provide your own retry logic, use an exception handling strategy and initiate your own retries in a function called by the code that handles the exception.

Synchronous Example

In a synchronous workflow, you can use the standard begin/rescue/ensure pattern:

def handle_unreliable_activity
  begin
    client.unreliable_activity_without_retry_options
  rescue StandardError => e
    retry_on_failure(e)
  end
end

def retry_on_failure(ex)
  handle_unreliable_activity if should_retry(ex)
end

def should_retry(ex)
  # custom logic to decide to retry the activity or not according to 'ex'
  return true
end
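
Because should_retry above always returns true, that example retries for as long as the activity keeps failing. The sketch below shows one possible shape for the custom logic, as a variant that also receives an attempt count. The MAX_ATTEMPTS limit, the extra attempt parameter, and the choice of IOError or a "timeout" message as the retryable conditions are all assumptions made for illustration.

```ruby
MAX_ATTEMPTS = 3 # assumed limit, chosen for illustration

# Hypothetical variant of should_retry that also receives the attempt count.
# Retries only errors that look transient, and gives up at MAX_ATTEMPTS.
def should_retry(ex, attempt)
  return false if attempt >= MAX_ATTEMPTS

  ex.is_a?(IOError) || ex.message.include?("timeout")
end

puts should_retry(IOError.new("disk"), 1)            # => true
puts should_retry(RuntimeError.new("bad input"), 1)  # => false
puts should_retry(IOError.new("disk"), 3)            # => false
```

To use a predicate like this, the calling code would need to track the attempt number and pass it along with the exception.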

Asynchronous Example

For an asynchronous workflow, you can use a similar technique: error_handler handles errors raised by the asynchronous task, and wait_for_all waits for its result.

def handle_unreliable_activity
  failure = Future.new
  error_handler do |t|
    t.begin do
      client.send_async(:unreliable_activity_without_retry_options)
    end
    t.rescue Exception do |e|
      failure.set(e)
    end
    t.ensure do
      failure.set unless failure.set?
    end
  end
  wait_for_all(failure)
  retry_on_failure(failure)
end

def retry_on_failure(failure)
  ex = failure.get
  handle_unreliable_activity if !ex.nil? && should_retry(ex)
end

def should_retry(ex)
  # insert your custom logic here.
  return true
end

Additional Information and Examples

Refer to the following resources for more information about the subjects in this topic:

  • For more information about error handling in the AWS Flow Framework for Ruby, see Handling Errors.
  • For more information about programming asynchronous tasks, see Executing Tasks Asynchronously.
  • To view additional examples of retrying tasks, see the retry_activity recipe in the public AWS Flow Framework for Ruby Samples project on GitHub.