Amazon EMR
Developer Guide

Life Cycle of a Cluster

The following diagram shows the life cycle of a cluster and how each stage maps to a particular cluster state.

Amazon EMR Cluster Life Cycle

A successful Amazon EMR cluster follows this process: Amazon EMR first provisions a Hadoop cluster. During this phase, the cluster state is STARTING. Next, Amazon EMR runs any user-defined bootstrap actions. During this phase, the cluster state is BOOTSTRAPPING.


Once the cluster reaches this phase, you are billed for the provisioned EC2 instances.

After all bootstrap actions are completed, the cluster state is RUNNING. The job flow sequentially runs all cluster steps during this phase.

If you configured your cluster as a long-running cluster by enabling keep alive, the cluster goes into a WAITING state after processing is done and waits for the next set of instructions. You must manually terminate the cluster when you no longer require it. For more information, see "How to Send Work to a Cluster" and "Choose the Cluster Lifecycle: Long-Running or Transient".

If you configured your cluster as a transient cluster, it shuts down automatically after all of the steps complete.
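
As a rough illustration of the two lifecycle choices above, the following Python sketch builds only the lifecycle-related fragment of a cluster launch request. The KeepJobFlowAliveWhenNoSteps field belongs to the Instances structure of the RunJobFlow API; the helper name lifecycle_instances_fragment is hypothetical, and a real request needs further fields (instance types, counts, roles) that are omitted here.

```python
def lifecycle_instances_fragment(long_running: bool) -> dict:
    """Return the lifecycle-related part of a RunJobFlow `Instances` structure.

    Hypothetical helper for illustration; a real request needs more fields.
    """
    return {
        # True: keep alive, so the cluster waits for more work after its steps.
        # False: transient, so the cluster shuts down once all steps complete.
        "KeepJobFlowAliveWhenNoSteps": long_running,
    }


transient = lifecycle_instances_fragment(long_running=False)
long_running = lifecycle_instances_fragment(long_running=True)
print(transient)
print(long_running)
```

The fragment would be merged into the full Instances structure before the request is sent; nothing in this sketch contacts AWS.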

When a cluster terminates without encountering an error, the state transitions to SHUTTING_DOWN and the cluster shuts down, terminating the virtual server instances. All data stored on the cluster is deleted. Information stored elsewhere, such as in your Amazon S3 bucket, persists. Finally, when all cluster activity is complete, the cluster state is marked as COMPLETED.
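
The successful path described so far can be summarized as an ordered list of states. The sketch below is a simplified model, not an Amazon EMR API: the function name successful_state_sequence is hypothetical, and for a long-running cluster the model stops at WAITING because what follows depends on when you send more work or terminate the cluster manually.

```python
from typing import List


def successful_state_sequence(long_running: bool) -> List[str]:
    """Model the states an error-free cluster passes through, in order."""
    # Provisioning, then bootstrap actions, then sequential step execution.
    states = ["STARTING", "BOOTSTRAPPING", "RUNNING"]
    if long_running:
        # Keep alive is enabled: the cluster idles until it receives more
        # work or is terminated manually (not modeled here).
        states.append("WAITING")
    else:
        # Transient cluster: shuts down on its own once all steps complete.
        states.extend(["SHUTTING_DOWN", "COMPLETED"])
    return states


print(" -> ".join(successful_state_sequence(long_running=False)))
print(" -> ".join(successful_state_sequence(long_running=True)))
```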

Unless termination protection is enabled, any failure during the cluster process terminates the cluster and all its virtual server instances. Any data stored on the cluster is deleted. The cluster state is marked as FAILED. For more information, see Managing Cluster Termination.
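
The failure rule above can be expressed as a small decision function. This is a sketch under stated assumptions: the function name state_after_failure is hypothetical, and because this section does not say which state a protected cluster enters after a failure, the model simply leaves the state unchanged in that case.

```python
def state_after_failure(current_state: str, termination_protected: bool) -> str:
    """Return the modeled cluster state after a failure occurs."""
    if termination_protected:
        # Assumption: the behavior of a protected cluster is not described
        # in this section, so the model keeps the current state as-is.
        return current_state
    # Without termination protection, the cluster and its instances are
    # terminated, on-cluster data is deleted, and the state becomes FAILED.
    return "FAILED"


print(state_after_failure("RUNNING", termination_protected=False))
print(state_after_failure("RUNNING", termination_protected=True))
```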

For a complete list of cluster states, see the JobFlowExecutionStatusDetail data type in the Amazon EMR API Reference.