Amazon SWF Timeout Types
This topic provides information about the various timeouts that you can set in your workflows and activities to control your workflow behavior.
To ensure that workflow executions run correctly, Amazon Simple Workflow Service enables you to set different types of timeouts. Some timeouts specify how long the workflow can run in its entirety. Other timeouts specify how long activity tasks can take before being assigned to a worker and how long they can take to complete from the time they are scheduled. All timeouts in the Amazon SWF API are specified in seconds. Amazon SWF also supports the string “NONE” as a timeout value, which indicates no timeout.
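Because timeouts are expressed in seconds, with "NONE" meaning no timeout, a small conversion helper can keep calling code readable. The following is a minimal sketch; `to_swf_timeout` is a hypothetical helper name, and it assumes the API expects the value as a string.

```python
from datetime import timedelta
from typing import Optional, Union


def to_swf_timeout(value: Optional[Union[int, timedelta]]) -> str:
    """Format a duration as a timeout string (hypothetical helper).

    None maps to "NONE" (no timeout); anything else becomes whole seconds.
    """
    if value is None:
        return "NONE"
    if isinstance(value, timedelta):
        seconds = int(value.total_seconds())
    else:
        seconds = int(value)
    if seconds < 0:
        raise ValueError("timeout must be non-negative")
    return str(seconds)
```

For example, `to_swf_timeout(timedelta(minutes=5))` yields `"300"`, and `to_swf_timeout(None)` yields `"NONE"`.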
For timeouts related to decision tasks and activity tasks, Amazon SWF adds an event to the workflow execution history. The attributes of the event identify the type of timeout that occurred and the decision task or activity task that was affected. Amazon SWF also schedules a decision task. When the decider receives the new decision task, it sees the timeout event in the history and can respond with the appropriate decisions by calling the RespondDecisionTaskCompleted action.
A task is considered open from the time that it is scheduled until it is closed. Therefore, a task is reported as open while a worker is processing it. A task is closed when a worker reports it as completed, canceled, or failed. A task may also be closed by Amazon SWF as the result of a timeout.
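As an illustration of how a decider might look for these events, the sketch below scans a list of history events for task timeouts. It assumes each event is a plain dictionary shaped like the history events returned with a decision task (`eventId`, `eventType`, and a matching `...EventAttributes` entry); `find_timeout_events` is a hypothetical helper, not part of any SDK.

```python
from typing import Dict, List

# Event types that record a decision task or activity task timeout.
TASK_TIMEOUT_EVENT_TYPES = ("DecisionTaskTimedOut", "ActivityTaskTimedOut")


def find_timeout_events(events: List[Dict]) -> List[Dict]:
    """Return a flat summary of each task timeout event in the history."""
    summaries = []
    for event in events:
        event_type = event.get("eventType")
        if event_type not in TASK_TIMEOUT_EVENT_TYPES:
            continue
        # Attributes live under a key such as "activityTaskTimedOutEventAttributes".
        attr_key = event_type[0].lower() + event_type[1:] + "EventAttributes"
        attrs = event.get(attr_key, {})
        summaries.append({
            "eventId": event.get("eventId"),
            "eventType": event_type,
            "timeoutType": attrs.get("timeoutType"),
            "scheduledEventId": attrs.get("scheduledEventId"),
            "startedEventId": attrs.get("startedEventId"),
        })
    return summaries
```

The decider can then choose its decisions based on the summaries, for example rescheduling an activity or failing the workflow.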
The following diagram shows how workflow execution and workflow (decider) timeouts are related to the lifetime of a workflow:
There are two timeout types that are relevant to workflow and decision tasks:
- Execution Start to Close
This timeout specifies the maximum time that a workflow execution can take to complete. It is set as a default during workflow registration, but it can be overridden with a different value when the workflow is started. If this timeout is exceeded, Amazon SWF closes the workflow execution and adds an event of type WorkflowExecutionTimedOut to the workflow execution history.
In addition to the timeoutType, the event attributes specify the childPolicy that is in effect for this workflow execution. The child policy specifies how child workflow executions are handled if the parent workflow execution times out or otherwise terminates. For example, if the childPolicy is set to TERMINATE, then child workflow executions will be terminated.
Once a workflow execution has timed out, you cannot take any action on it other than visibility calls.
- Task Start to Close
This timeout specifies the maximum time that the corresponding decider can take to complete a decision task. It is set during workflow type registration. If this timeout is exceeded, Amazon SWF marks the task as timed out by adding an event of type DecisionTaskTimedOut to the workflow execution history.
The event attributes will include the IDs for the events that correspond to when this decision task was scheduled (scheduledEventId) and when it was started (startedEventId). In addition to adding the event, Amazon SWF also schedules a new decision task to alert the decider that this decision task timed out.
After this timeout occurs, an attempt to complete the timed-out decision task using RespondDecisionTaskCompleted will fail.
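The two workflow-level timeouts above could be expressed as request parameters along the following lines. This is a sketch under assumptions: the function names are hypothetical, the numeric values are arbitrary, and the parameter keys follow the RegisterWorkflowType and StartWorkflowExecution request fields as they appear in the AWS SDK for Python (for example, passed as keyword arguments to `register_workflow_type` and `start_workflow_execution`).

```python
from typing import Dict, Optional


def registration_timeouts(execution_seconds: int,
                          decision_seconds: int,
                          child_policy: str = "TERMINATE") -> Dict[str, str]:
    """Default timeouts supplied when the workflow type is registered."""
    return {
        "defaultExecutionStartToCloseTimeout": str(execution_seconds),
        "defaultTaskStartToCloseTimeout": str(decision_seconds),
        "defaultChildPolicy": child_policy,
    }


def start_overrides(execution_seconds: Optional[int] = None) -> Dict[str, str]:
    """Per-execution override; an empty dict keeps the registered default."""
    if execution_seconds is None:
        return {}
    return {"executionStartToCloseTimeout": str(execution_seconds)}


# Example values only: a one-hour default, shortened to ten minutes for one run.
defaults = registration_timeouts(execution_seconds=3600, decision_seconds=30)
overrides = start_overrides(execution_seconds=600)
```

Keeping the defaults and the overrides in separate dictionaries mirrors the split described above: defaults belong to the registered type, while overrides apply to a single execution.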
The following diagram shows how timeouts are related to the lifetime of an activity task:
There are four timeout types that are relevant to activity tasks:
- Activity Task Start to Close (START_TO_CLOSE)
This timeout specifies the maximum time that an activity worker can take to process a task after the worker has received the task. Attempts to close a timed-out activity task using RespondActivityTaskCanceled, RespondActivityTaskCompleted, or RespondActivityTaskFailed will fail.
- Activity Task Heartbeat (HEARTBEAT)
This timeout specifies the maximum time that a task can run before reporting its progress through the RecordActivityTaskHeartbeat action.
- Activity Task Schedule to Start (SCHEDULE_TO_START)
This timeout specifies how long Amazon SWF waits before timing out the activity task if no workers are available to perform the task. Once timed out, the expired task will not be assigned to another worker.
- Activity Task Schedule to Close (SCHEDULE_TO_CLOSE)
This timeout specifies how long the task can take from the time it is scheduled to the time it is complete. As a best practice, this value should not be greater than the sum of the task schedule-to-start timeout and the task start-to-close timeout.
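To illustrate the heartbeat timeout from the list above, the following sketch splits an activity into short steps and reports progress after each one. It is a simplified model, not a full worker: `run_with_heartbeats` and the `send_heartbeat` callable are hypothetical. In a real worker, `send_heartbeat` might wrap the RecordActivityTaskHeartbeat action and return its cancel-requested flag, and each step would need to finish within the heartbeat timeout.

```python
from typing import Callable, Iterable


def run_with_heartbeats(steps: Iterable[Callable[[], None]],
                        send_heartbeat: Callable[[str], bool]) -> int:
    """Run steps in order, reporting progress after each one.

    send_heartbeat(details) is assumed to return True when cancellation
    has been requested. Returns the number of steps that were run.
    """
    completed = 0
    for step in steps:
        step()
        completed += 1
        cancel_requested = send_heartbeat(f"completed step {completed}")
        if cancel_requested:
            # Stop early so the worker can report the task as canceled.
            break
    return completed
```

Injecting the heartbeat function keeps the loop testable without a live service connection.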
Each of the timeout types has a default value, which is generally set to NONE (infinite). However, the maximum time for any activity execution is limited to one year.
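A quick sanity check can catch activity timeout combinations that conflict with the guidance above before they are registered. This is a minimal sketch: `check_activity_timeouts` is a hypothetical helper, `None` stands in for NONE, and treating one year as 365 days is an assumption made here for illustration.

```python
from typing import List, Optional

# Assumption: "one year" is approximated as 365 days, in seconds.
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def check_activity_timeouts(schedule_to_start: Optional[int],
                            start_to_close: Optional[int],
                            schedule_to_close: Optional[int]) -> List[str]:
    """Return warnings for timeout values (in seconds); None means NONE."""
    warnings = []
    named = {
        "schedule-to-start": schedule_to_start,
        "start-to-close": start_to_close,
        "schedule-to-close": schedule_to_close,
    }
    for name, value in named.items():
        if value is not None and value > ONE_YEAR_SECONDS:
            warnings.append(f"{name} exceeds one year")
    # Best practice: schedule-to-close should not exceed the sum of the other two.
    if None not in (schedule_to_start, start_to_close, schedule_to_close):
        if schedule_to_close > schedule_to_start + start_to_close:
            warnings.append(
                "schedule-to-close exceeds schedule-to-start + start-to-close")
    return warnings
```

For example, values of 60, 300, and 360 seconds produce no warnings, while raising the schedule-to-close value to 400 seconds produces one.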
You set default values for the activity task timeouts during activity type registration, but you can override them with new values when you schedule the activity task. When one of these timeouts occurs, Amazon SWF adds an event of type ActivityTaskTimedOut to the workflow history. The timeoutType attribute of this event specifies which timeout occurred: START_TO_CLOSE, HEARTBEAT, SCHEDULE_TO_START, or SCHEDULE_TO_CLOSE. The event attributes also include the IDs for the events that correspond to when the activity task was scheduled (scheduledEventId) and when it was started (startedEventId). In addition to adding the event, Amazon SWF schedules a new decision task to alert the decider that the timeout occurred.
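One way a decider might react to an ActivityTaskTimedOut event is sketched below. The retry policy, the function name, and the override values are all hypothetical; the dictionaries follow the decision structure accepted by RespondDecisionTaskCompleted, and the sketch assumes the activity type was registered with a default task list.

```python
from typing import Dict

# Hypothetical policy: retry when no worker picked up the task or a heartbeat was missed.
RETRYABLE_TIMEOUTS = {"SCHEDULE_TO_START", "HEARTBEAT"}


def decide_on_activity_timeout(event: Dict,
                               activity_type: Dict[str, str],
                               retry_activity_id: str) -> Dict:
    """Build one decision in response to an ActivityTaskTimedOut event."""
    attrs = event["activityTaskTimedOutEventAttributes"]
    timeout_type = attrs["timeoutType"]
    if timeout_type in RETRYABLE_TIMEOUTS:
        return {
            "decisionType": "ScheduleActivityTask",
            "scheduleActivityTaskDecisionAttributes": {
                "activityType": activity_type,
                "activityId": retry_activity_id,
                # Example overrides of the registered defaults, in seconds.
                "scheduleToStartTimeout": "120",
                "heartbeatTimeout": "60",
            },
        }
    return {
        "decisionType": "FailWorkflowExecution",
        "failWorkflowExecutionDecisionAttributes": {
            "reason": f"activity timed out: {timeout_type}",
        },
    }
```

The returned dictionary would be placed in the list of decisions sent back when the decision task is completed.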