Introduction to Amazon SWF
A growing number of applications rely on asynchronous and distributed processing. Scalability is the primary motivation for this approach: by designing autonomous distributed components, developers can deploy and scale out parts of the application independently as the load on the application increases. Another motivation is the availability of cloud services. As application developers start taking advantage of cloud computing, they need to combine their existing on-premises assets with additional cloud-based assets. Yet another motivation is the inherently distributed nature of the process being modeled by the application; for example, the automation of an order fulfillment business process may span several systems and human tasks.
Developing such applications can be complicated. It requires that you coordinate the execution of multiple distributed components and deal with the increased latencies and unreliability inherent in remote communication. To accomplish this, you would typically need to write complicated infrastructure involving message queues and databases, along with the complex logic to synchronize them.
The Amazon Simple Workflow Service (Amazon SWF) makes it easier to develop asynchronous and distributed applications by providing a programming model and infrastructure for coordinating distributed components and maintaining their execution state in a reliable way. By relying on Amazon SWF, you are freed to focus on building the aspects of your application that differentiate it.
Simple Workflow Concepts
The basic concepts necessary for understanding Amazon SWF workflows are introduced below and are explained further in the subsequent sections of this guide. The following discussion is a high-level overview of the structure and components of a workflow.
The fundamental concept in Amazon SWF is the workflow. A workflow is a set of activities that carry out some objective, together with logic that coordinates the activities. For example, a workflow could receive a customer order and take whatever actions are necessary to fulfill it. Each workflow runs in an AWS resource called a domain, which controls the workflow's scope. An AWS account can have multiple domains, each of which can contain multiple workflows, but workflows in different domains cannot interact.
When designing an Amazon SWF workflow, you precisely define each of the required activities. You then register each activity with Amazon SWF as an activity type. When you register the activity, you provide information such as a name and version, and some timeout values based on how long you expect the activity to take. For example, a customer may have an expectation that an order will ship within 24 hours. Such expectations would inform the timeout values that you specify when registering your activities.
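As a rough illustration of the registration step, the sketch below assembles the parameters for one activity type. The parameter names follow the RegisterActivityType request as exposed by the boto3 `swf` client. The domain name `OrderDomain`, the activity name `ShipOrder`, the helper function, and the way the 24-hour expectation is split across two timeouts are all assumptions made for this example.

```python
def activity_type_request(domain, name, version, total_seconds, run_seconds):
    """Build the keyword arguments for registering one activity type."""
    return {
        "domain": domain,
        "name": name,
        "version": version,
        # Amazon SWF expects durations as strings holding a number of seconds.
        # Longest a task may take from being scheduled to being closed.
        "defaultTaskScheduleToCloseTimeout": str(total_seconds),
        # Longest a worker may spend on a task once it has started it.
        "defaultTaskStartToCloseTimeout": str(run_seconds),
    }


# Hypothetical values: the order should ship within 24 hours overall,
# and the shipping step itself should need at most 2 hours of work.
request = activity_type_request(
    "OrderDomain", "ShipOrder", "1.0",
    total_seconds=24 * 60 * 60,
    run_seconds=2 * 60 * 60,
)
print(request)

# With boto3 installed and credentials configured, the request could be sent
# with: boto3.client("swf").register_activity_type(**request)
```

Keeping the request as plain data makes it easy to review the timeout values before anything is sent to the service.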
In the process of carrying out the workflow, some activities may need to be performed more than once, perhaps with varying inputs. For example, in a customer-order workflow, you might have an activity that handles purchased items. If the customer purchases multiple items, then this activity would have to run multiple times. Amazon SWF has the concept of an activity task that represents one invocation of an activity. In our example, the processing of each item would be represented by a single activity task.
An activity worker is a program that receives activity tasks, performs them, and returns the results. Note that the task itself might actually be performed by a person, in which case the person would use the activity worker software to receive the task and report its outcome. An example might be a statistical analyst who receives sets of data, analyzes them, and then sends back the analysis.
Activity tasks, and the activity workers that perform them, can run synchronously or asynchronously. They can be distributed across multiple computers, potentially in different geographic regions, or they can all run on the same computer. Different activity workers can be written in different programming languages and run on different operating systems. For example, one activity worker might be running on a desktop computer in Asia, whereas a different activity worker might be running on a handheld device in North America.
The coordination logic in a workflow is contained in a software program called a decider. The decider schedules activity tasks, provides input data to the activity workers, processes events that arrive while the workflow is in progress, and ultimately ends (or closes) the workflow when the objective has been completed.
The role of the Amazon SWF service is to function as a reliable central hub through which data is exchanged between the decider, the activity workers, and other relevant entities such as the person administering the workflow. Amazon SWF also maintains the state of each workflow execution, which saves your application from having to store the state in a durable way.
The decider directs the workflow by receiving decision tasks from Amazon SWF and responding with decisions. A decision represents an action or set of actions that form the next steps in the workflow. A typical decision would be to schedule an activity task. Decisions can also be used to set timers to delay the execution of an activity task, to request cancellation of activity tasks already in progress, and to complete or close the workflow.
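To make the idea of a decision more concrete, here is a minimal sketch of three decisions written as plain Python dictionaries. The key names follow the Decision structure accepted by the boto3 `swf` client; the helper functions, the activity name `ProcessItem`, and the identifiers are hypothetical.

```python
import json


def schedule_activity(activity_id, name, version, payload):
    """Decision that asks Amazon SWF to schedule one activity task."""
    return {
        "decisionType": "ScheduleActivityTask",
        "scheduleActivityTaskDecisionAttributes": {
            "activityType": {"name": name, "version": version},
            "activityId": activity_id,
            "input": json.dumps(payload),  # task input travels as a string
        },
    }


def start_timer(timer_id, seconds):
    """Decision that sets a timer, for example to delay a later activity."""
    return {
        "decisionType": "StartTimer",
        "startTimerDecisionAttributes": {
            "timerId": timer_id,
            "startToFireTimeout": str(seconds),
        },
    }


def complete_workflow(result):
    """Decision that closes the workflow execution as completed."""
    return {
        "decisionType": "CompleteWorkflowExecution",
        "completeWorkflowExecutionDecisionAttributes": {"result": result},
    }


# One activity task per purchased item (hypothetical order contents).
items = ["book", "lamp"]
decisions = [
    schedule_activity(f"process-item-{index}", "ProcessItem", "1.0", {"item": item})
    for index, item in enumerate(items)
]
print(json.dumps(decisions, indent=2))
```

A decider would return a list like `decisions` in its response to a decision task; several decisions can be sent together in one response.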
Both the activity workers and the decider receive their tasks (activity tasks and decision tasks, respectively) by polling the Amazon SWF service.
Amazon SWF informs the decider of the state of the workflow by including a copy of the current workflow execution history with each decision task. The workflow execution history is composed of events, where an event represents a significant change in the state of the workflow execution. Examples of events include the completion of a task, notification that a task has timed out, and the expiration of a timer that was set earlier in the workflow execution. The history is a complete, consistent, and authoritative record of the workflow's progress.
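The polling pattern for an activity worker might look like the sketch below. The method names (`poll_for_activity_task`, `respond_activity_task_completed`, `respond_activity_task_failed`) match the boto3 `swf` client, but the code runs against `FakeSwf`, an in-memory stand-in invented for this example so that it can be tried without an AWS account. The domain and task list names are also hypothetical.

```python
def run_activity_worker(swf, domain, task_list, handler, max_polls):
    """Poll for activity tasks and report each outcome back to Amazon SWF."""
    completed = 0
    for _ in range(max_polls):
        task = swf.poll_for_activity_task(domain=domain, taskList={"name": task_list})
        token = task.get("taskToken")
        if not token:
            continue  # the poll returned without a task; poll again
        try:
            result = handler(task.get("input", ""))
        except Exception as exc:
            swf.respond_activity_task_failed(
                taskToken=token, reason=type(exc).__name__, details=str(exc)
            )
        else:
            swf.respond_activity_task_completed(taskToken=token, result=result)
            completed += 1
    return completed


class FakeSwf:
    """Hypothetical in-memory stand-in for the service, for illustration only."""

    def __init__(self, tasks):
        self.tasks = list(tasks)
        self.completed = []
        self.failed = []

    def poll_for_activity_task(self, domain, taskList):
        return self.tasks.pop(0) if self.tasks else {}

    def respond_activity_task_completed(self, taskToken, result):
        self.completed.append((taskToken, result))

    def respond_activity_task_failed(self, taskToken, reason, details):
        self.failed.append((taskToken, reason))


fake = FakeSwf([{"taskToken": "t1", "input": "book"}, {}])
done = run_activity_worker(fake, "OrderDomain", "order-tasks", str.upper, max_polls=3)
print(done, fake.completed)
```

A real worker would usually loop indefinitely and pass `boto3.client("swf")` instead of the stand-in; the bounded `max_polls` is only there to keep the example finite.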
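One way to picture how a decider uses the history is the sketch below: a pure function that takes the list of events from a decision task and returns the next decisions. The event type and decision type names are real Amazon SWF names, but the workflow itself (one hypothetical `ProcessItem` activity per purchased item), the function name, and the sample histories are assumptions for illustration.

```python
def decide(events, item_count):
    """Return the next decisions for the workflow, given its event history."""
    event_types = [event["eventType"] for event in events]
    scheduled = event_types.count("ActivityTaskScheduled")
    completed = event_types.count("ActivityTaskCompleted")

    if "ActivityTaskTimedOut" in event_types or "ActivityTaskFailed" in event_types:
        # Simplest possible policy: give up as soon as any task goes wrong.
        return [{
            "decisionType": "FailWorkflowExecution",
            "failWorkflowExecutionDecisionAttributes": {"reason": "activity did not complete"},
        }]
    if scheduled == 0:
        # Nothing scheduled yet: schedule one activity task per item.
        return [{
            "decisionType": "ScheduleActivityTask",
            "scheduleActivityTaskDecisionAttributes": {
                "activityType": {"name": "ProcessItem", "version": "1.0"},
                "activityId": f"process-item-{index}",
            },
        } for index in range(item_count)]
    if completed >= item_count:
        # Every item has been processed: close the workflow.
        return [{
            "decisionType": "CompleteWorkflowExecution",
            "completeWorkflowExecutionDecisionAttributes": {"result": "order fulfilled"},
        }]
    return []  # tasks are still in progress; no new decisions yet


# Hypothetical histories at three points in a two-item order.
started = [
    {"eventId": 1, "eventType": "WorkflowExecutionStarted"},
    {"eventId": 2, "eventType": "DecisionTaskScheduled"},
    {"eventId": 3, "eventType": "DecisionTaskStarted"},
]
in_progress = started + [
    {"eventId": 5, "eventType": "ActivityTaskScheduled"},
    {"eventId": 6, "eventType": "ActivityTaskScheduled"},
    {"eventId": 7, "eventType": "ActivityTaskCompleted"},
]
finished = in_progress + [{"eventId": 8, "eventType": "ActivityTaskCompleted"}]

for history in (started, in_progress, finished):
    print([decision["decisionType"] for decision in decide(history, item_count=2)])
```

Because the function depends only on the history it is given, the decider itself needs no durable state of its own, which matches the role Amazon SWF plays in maintaining execution state.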
A user must have authorized AWS access keys to run workflows in your account. However, access keys provide full access to all of the resources in your account and are difficult to revoke, so they are not appropriate for all applications. Amazon SWF access control uses AWS Identity and Access Management (IAM), which allows you to provide access to AWS resources in a controlled and limited way that does not expose your access keys. For example, you can allow a user to access your account, but only to run certain workflows in a particular domain.
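As an illustration of this kind of restriction, the sketch below builds an IAM policy document that allows starting executions of one workflow type in one domain. The account ID, domain name, and workflow type name are hypothetical, and the resource ARN format and the `swf:workflowType.name` condition key reflect the Amazon SWF access control documentation as best understood here; check them against the current IAM documentation before relying on them.

```python
import json

# Hypothetical account ID, domain name, and workflow type name.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "swf:StartWorkflowExecution",
            # Limit the permission to a single domain.
            "Resource": "arn:aws:swf:*:123456789012:/domain/OrderDomain",
            # Limit the permission further to a single workflow type.
            "Condition": {
                "StringEquals": {"swf:workflowType.name": "FulfillOrder"}
            },
        }
    ],
}

print(json.dumps(policy, indent=2))
```

The printed JSON is the form in which a policy like this would be attached to an IAM user, group, or role.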
Bringing together the ideas discussed in the preceding sections, here is an overview of the steps to develop and run a workflow in Amazon SWF:
1. Write activity workers that implement the processing steps in your workflow.
2. Write a decider to implement the coordination logic of your workflow.
3. Register your activities and workflow with Amazon SWF.
   You can do this step programmatically or by using the AWS Management Console.
4. Start your activity workers and decider.
   These actors can run on any computing device that can access an Amazon SWF endpoint. For example, you could use compute instances in the cloud, such as Amazon Elastic Compute Cloud (Amazon EC2); servers in your data center; or even a mobile device, to host a decider or activity worker. Once started, the decider and activity workers should start polling Amazon SWF for tasks.
5. Start one or more executions of your workflow.
   Executions can be initiated either programmatically or via the AWS Management Console.
   Each execution runs independently, and you can provide each with its own set of input data. When an execution is started, Amazon SWF schedules the initial decision task. In response, your decider begins generating decisions that initiate activity tasks. Execution continues until your decider makes a decision to close the execution.
6. View workflow executions using the AWS Management Console.
   You can filter and view complete details of running as well as completed executions. For example, you can select an open execution to see which tasks have completed and what their results were.
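Starting an execution programmatically could be sketched as follows. The method name `start_workflow_execution` and its parameters follow the boto3 `swf` client; the workflow type `FulfillOrder`, the task list name, the timeout values, and the `RecordingSwf` stand-in (used here so the example runs without an AWS account) are assumptions for illustration.

```python
import json


def start_order_workflow(swf, domain, order):
    """Start one workflow execution for an order and return its run ID."""
    response = swf.start_workflow_execution(
        domain=domain,
        workflowId=f"order-{order['order_id']}",  # caller-chosen identifier
        workflowType={"name": "FulfillOrder", "version": "1.0"},
        taskList={"name": "order-decisions"},  # where decision tasks are queued
        input=json.dumps(order),  # each execution gets its own input data
        executionStartToCloseTimeout=str(24 * 60 * 60),
        taskStartToCloseTimeout="60",
        childPolicy="TERMINATE",
    )
    return response["runId"]


class RecordingSwf:
    """Hypothetical stand-in that records requests instead of calling AWS."""

    def __init__(self):
        self.calls = []

    def start_workflow_execution(self, **kwargs):
        self.calls.append(kwargs)
        return {"runId": f"run-{len(self.calls)}"}


client = RecordingSwf()
run_id = start_order_workflow(client, "OrderDomain", {"order_id": 1001, "items": ["book", "lamp"]})
print(run_id, client.calls[0]["workflowId"])
```

In a real deployment, `client` would be `boto3.client("swf")`, and the workflow type would need to be registered in the domain before the call.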