Amazon Elastic MapReduce
Developer Guide (API Version 2009-03-31)

Life Cycle of a Cluster

The following diagram shows the life cycle of a cluster and how each stage maps to a particular cluster state.

[Figure: Amazon EMR Cluster Life Cycle]

A successful Amazon Elastic MapReduce (Amazon EMR) cluster follows this process. Amazon EMR first provisions a Hadoop cluster; during this phase, the cluster state is STARTING. Next, any user-defined bootstrap actions run; during this phase, the cluster state is BOOTSTRAPPING. After all bootstrap actions complete, the cluster state is RUNNING, and the cluster runs all of its steps sequentially during this phase.
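The ordering described above can be illustrated with a minimal Python sketch. It is a local simulation only and makes no calls to Amazon EMR. The function name `simulate_startup` and the placeholder bootstrap action and steps are hypothetical, introduced purely for illustration.

```python
from typing import Callable, Iterator, List


def simulate_startup(
    bootstrap_actions: List[Callable[[], None]],
    steps: List[Callable[[], None]],
) -> Iterator[str]:
    """Yield each state as a hypothetical, successful cluster enters it."""
    yield "STARTING"        # Amazon EMR provisions the Hadoop cluster
    yield "BOOTSTRAPPING"   # user-defined bootstrap actions run
    for action in bootstrap_actions:
        action()
    yield "RUNNING"         # steps run one at a time, in order
    for step in steps:
        step()


# Record states and work items in the order they occur.
log: List[str] = []
for state in simulate_startup(
    bootstrap_actions=[lambda: log.append("bootstrap: install-tools")],
    steps=[lambda: log.append("step: 1"), lambda: log.append("step: 2")],
):
    log.append(state)

print(log)
```

The log shows that bootstrap actions finish before the state becomes RUNNING, and that steps execute in the order they were supplied.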

If you configured your cluster as a long-running cluster by enabling keep alive, the cluster enters the WAITING state after processing is done and waits for the next set of instructions. For more information, see How to Send Work to a Cluster and Choose the Cluster Lifecycle: Long-Running or Transient. You must manually terminate the cluster when you no longer need it.

If you configured your cluster as a transient cluster, it shuts down automatically after all of the steps complete.
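The difference between the two configurations comes down to which state follows RUNNING once every step has finished. The sketch below expresses that decision in Python. The function `state_after_steps` and its `keep_alive` parameter are hypothetical names chosen for this example, not part of the Amazon EMR API.

```python
def state_after_steps(keep_alive: bool) -> str:
    """Return the state a cluster enters once all of its steps have finished.

    keep_alive=True models a long-running cluster, which waits for more work.
    keep_alive=False models a transient cluster, which begins shutting down.
    """
    return "WAITING" if keep_alive else "SHUTTING_DOWN"


print(state_after_steps(keep_alive=True))
print(state_after_steps(keep_alive=False))
```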

When a cluster terminates without encountering an error, the state transitions to SHUTTING_DOWN and the cluster shuts down, terminating the virtual server instances. All data stored on the cluster is deleted. Information stored elsewhere, such as in your Amazon S3 bucket, persists. Finally, when all cluster activity is complete, the cluster state is marked as COMPLETED.

Unless termination protection is enabled, any failure during the cluster life cycle terminates the cluster and all of its virtual server instances. Any data stored on the cluster is deleted. The cluster state is marked as FAILED. For more information, see Managing Cluster Termination.
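The states and transitions described in this section can be collected into a small lookup table, which is sometimes handy for checking a recorded sequence of states (for example, in monitoring scripts). The following Python sketch rests on several assumptions: termination protection is disabled, a failure moves the cluster directly to FAILED, and a WAITING cluster returns to RUNNING when it receives new work. Manual termination is left out because this section does not name the resulting state. `TRANSITIONS` and `is_valid_path` are hypothetical names.

```python
from typing import Dict, List, Set

# Assumed transitions, based only on the states named in this section.
TRANSITIONS: Dict[str, Set[str]] = {
    "STARTING": {"BOOTSTRAPPING", "FAILED"},
    "BOOTSTRAPPING": {"RUNNING", "FAILED"},
    "RUNNING": {"WAITING", "SHUTTING_DOWN", "FAILED"},
    "WAITING": {"RUNNING"},          # assumption: new work resumes processing
    "SHUTTING_DOWN": {"COMPLETED"},
    "COMPLETED": set(),              # terminal state
    "FAILED": set(),                 # terminal state
}


def is_valid_path(path: List[str]) -> bool:
    """Check that a sequence of states starts at STARTING and follows the table."""
    if not path or path[0] != "STARTING":
        return False
    return all(
        nxt in TRANSITIONS.get(cur, set())
        for cur, nxt in zip(path, path[1:])
    )


transient_ok = is_valid_path(
    ["STARTING", "BOOTSTRAPPING", "RUNNING", "SHUTTING_DOWN", "COMPLETED"]
)
bootstrap_failure = is_valid_path(["STARTING", "BOOTSTRAPPING", "FAILED"])
skipped_bootstrap = is_valid_path(["STARTING", "RUNNING"])
print(transient_ok, bootstrap_failure, skipped_bootstrap)
```

For the authoritative list of states, the JobFlowExecutionStatusDetail data type in the API Reference remains the source of truth; this table is only a reading aid.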

For a complete list of cluster states, see the JobFlowExecutionStatusDetail data type in the Amazon Elastic MapReduce (Amazon EMR) API Reference.