Amazon Elastic MapReduce
Developer Guide (API Version 2009-03-31)

Protect a Cluster from Termination

Termination protection ensures that the EC2 instances in your job flow are not shut down by accident or error. This protection is especially useful if your cluster contains data in instance storage that you need to recover before those instances are terminated.

By default, termination protection is disabled on clusters. When termination protection is not enabled, you can terminate clusters through calls to the TerminateJobFlows API, through the Amazon EMR console, or by using the command line interface. In addition, the master node may terminate a task node that has become unresponsive or has returned an error.

When termination protection is enabled, you must explicitly remove termination protection from the cluster before you can terminate the cluster. With termination protection enabled, TerminateJobFlows can't terminate the cluster and users can't terminate the cluster using the CLI. Users terminating the cluster using the Amazon EMR console receive an extra confirmation box asking if they want to remove termination protection before terminating the cluster.

If you attempt to terminate a protected cluster with the API or CLI, the API returns an error, and the CLI exits with a non-zero return code.
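The behavior described above can be pictured with a small in-memory model. This is only an illustrative sketch: the Cluster class, the TerminationProtectedError exception, and the job flow ID j-EXAMPLE are hypothetical names, not part of any Amazon EMR API or SDK.

```python
class TerminationProtectedError(Exception):
    """Hypothetical stand-in for the error returned for a protected cluster."""


class Cluster:
    """Toy model of a cluster; not an Amazon EMR SDK class."""

    def __init__(self, job_flow_id, termination_protected=False):
        self.job_flow_id = job_flow_id
        # Termination protection is disabled by default.
        self.termination_protected = termination_protected
        self.running = True

    def set_termination_protection(self, enabled):
        self.termination_protected = enabled

    def terminate(self):
        # A protected cluster rejects termination until protection is removed.
        if self.termination_protected:
            raise TerminationProtectedError(
                f"{self.job_flow_id} has termination protection enabled"
            )
        self.running = False


cluster = Cluster("j-EXAMPLE", termination_protected=True)
try:
    cluster.terminate()
except TerminationProtectedError as err:
    print("rejected:", err)

# Protection must be removed explicitly before termination succeeds.
cluster.set_termination_protection(False)
cluster.terminate()
print("running:", cluster.running)
```

In this model the failed attempt leaves the cluster running, which mirrors the API error and the non-zero CLI return code described above.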

The ActionOnFailure setting of a step determines what the cluster does when that step fails. The possible values for this setting are:

  • TERMINATE_JOB_FLOW: If the step fails, terminate the job flow. If the job flow has termination protection enabled AND keep alive enabled, it will not terminate.

  • CANCEL_AND_WAIT: If the step fails, cancel the remaining steps. If the cluster has keep alive enabled, the cluster will not terminate.

  • CONTINUE: If the step fails, continue to the next step.
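The three values can be summarized as a small decision function. This is a sketch of one reading of the list above, not the service's implementation. The function name and the return labels "terminate", "wait", and "continue" are made up for illustration, and the case of CANCEL_AND_WAIT without keep alive is an assumption (the cluster is taken to shut down once no steps remain).

```python
def outcome_after_step_failure(action_on_failure, termination_protected, keep_alive):
    """Return 'terminate', 'wait', or 'continue' for a failed step (illustrative)."""
    if action_on_failure == "TERMINATE_JOB_FLOW":
        # Only the combination of both settings keeps the job flow running.
        if termination_protected and keep_alive:
            return "wait"
        return "terminate"
    if action_on_failure == "CANCEL_AND_WAIT":
        # Remaining steps are cancelled; keep alive decides whether the
        # cluster stays up. Assumption: without keep alive it terminates.
        return "wait" if keep_alive else "terminate"
    if action_on_failure == "CONTINUE":
        # The failure is ignored and the next step runs.
        return "continue"
    raise ValueError(f"unknown ActionOnFailure value: {action_on_failure!r}")


for action in ("TERMINATE_JOB_FLOW", "CANCEL_AND_WAIT", "CONTINUE"):
    print(action, outcome_after_step_failure(action, True, True))
```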

Note

Use cluster termination protection judiciously because it can lead to additional charges for the persistent EC2 instances.

Termination Protection in Amazon EMR and Amazon EC2

Termination protection of clusters in Amazon EMR is analogous to setting the disableAPITermination flag on an EC2 instance. In the event of a conflict between the termination protection set in Amazon EC2 and that set in Amazon EMR, the Amazon EMR cluster protection status overrides that set by Amazon EC2 on the given instance. For example, if you use the Amazon EC2 console to enable termination protection on an EC2 instance in an Amazon EMR cluster that has termination protection disabled, Amazon EMR turns off termination protection on that EC2 instance and shuts down the instance when the rest of the cluster terminates.

Termination Protection and Spot Instances

Amazon EMR termination protection does not prevent an EC2 Spot Instance from terminating when the Spot price rises above the maximum bid price. For more information about the behavior of EC2 Spot Instances in Amazon EMR, see Lower Costs with Spot Instances (Optional).

Termination Protection and Keep Alive

Enabling termination protection on a cluster is similar to enabling keep alive on a cluster (using the --alive argument in the CLI), but the protections each offers are different. Keep alive causes instances in a cluster to persist after the cluster has successfully completed, but still allows the cluster to be terminated by calls to TerminateJobFlows and by errors. Termination protection allows the cluster to terminate after successful completion, but keeps it running in the case of user actions, errors, and TerminateJobFlows calls.

The following table compares the protections offered by termination protection and keep alive.

Protects against termination from...   Termination Protection   Keep Alive
Successful completion                  No                       Yes
User actions                           Yes                      No
TerminateJobFlows API                  Yes                      No
Errors                                 Yes                      No
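The comparison between termination protection and keep alive can also be written as a lookup. This is a rough sketch that encodes only the comparison in this section; the PROTECTS_AGAINST mapping and the survives function are hypothetical names, and the sketch does not model the ActionOnFailure combinations described earlier.

```python
# Which setting guards against which cause of termination,
# following the comparison in this section.
PROTECTS_AGAINST = {
    "successful completion": "keep alive",
    "user actions": "termination protection",
    "TerminateJobFlows API": "termination protection",
    "errors": "termination protection",
}


def survives(cause, termination_protected=False, keep_alive=False):
    """Return True if the enabled settings guard against the given cause."""
    enabled = set()
    if termination_protected:
        enabled.add("termination protection")
    if keep_alive:
        enabled.add("keep alive")
    return PROTECTS_AGAINST[cause] in enabled


for cause in PROTECTS_AGAINST:
    print(cause, survives(cause, termination_protected=True))
```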

Protecting a New Cluster

You can specify that a new cluster be protected from termination when you create it.

Launch a cluster with termination protection using the Amazon EMR console

  1. Open the Amazon Elastic MapReduce console at https://console.aws.amazon.com/elasticmapreduce/.

  2. Click Create cluster.

  3. In the Cluster Configuration section, set the Termination protection switch to Yes.

  4. Continue through the configuration sections, following the directions for the type of cluster you are launching. For more information, see Plan an Amazon EMR Cluster.

Launch a cluster with termination protection using the CLI

  • Specify --with-termination-protection during the cluster creation call. The following example shows setting termination protection on the WordCount sample application.

    In the directory where you installed the Amazon EMR CLI, run the following from the command line. For more information, see the Command Line Interface Reference for Amazon EMR.

    Note

    The Hadoop streaming syntax is different between Hadoop 1.x and Hadoop 2.x.

    For Hadoop 2.x, use the following command:

    • Linux, UNIX, and Mac OS X users:

      ./elastic-mapreduce --create --alive --ami-version 3.0.3 \
      --instance-type m1.xlarge --num-instances 2 \
      --stream --arg "-files" --arg "s3://elasticmapreduce/samples/wordcount/wordSplitter.py" \
      --input s3://elasticmapreduce/samples/wordcount/input \
      --output s3://myawsbucket/output/2014-01-16 --mapper wordSplitter.py --reducer aggregate \
      --with-termination-protection
    • Windows users:

      ruby elastic-mapreduce --create --alive --ami-version 3.0.3 --instance-type m1.xlarge --num-instances 2 --stream --arg "-files" --arg "s3://elasticmapreduce/samples/wordcount/wordSplitter.py" --input s3://elasticmapreduce/samples/wordcount/input --output s3://myawsbucket/output/2014-01-16 --mapper wordSplitter.py --reducer aggregate --with-termination-protection

    For Hadoop 1.x, use the following command:

    • Linux, UNIX, and Mac OS X users:

      ./elastic-mapreduce --create --alive \
      --instance-type m1.xlarge --num-instances 2 --stream \
      --input s3://elasticmapreduce/samples/wordcount/input \
      --output s3://myawsbucket/wordcount/output/2011-03-25 \
      --mapper s3://elasticmapreduce/samples/wordcount/wordSplitter.py --reducer aggregate \
      --with-termination-protection
    • Windows users:

      ruby elastic-mapreduce --create --alive --instance-type m1.xlarge --num-instances 2 --stream --input s3://elasticmapreduce/samples/wordcount/input --output s3://myawsbucket/wordcount/output/2011-03-25 --mapper s3://elasticmapreduce/samples/wordcount/wordSplitter.py --reducer aggregate --with-termination-protection

    For more information about launching clusters using the CLI, see Plan an Amazon EMR Cluster.
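For readers who launch clusters through the API rather than the CLI, the following sketch builds a partial set of RunJobFlow request parameters with termination protection enabled. The helper name build_run_job_flow_params and the cluster name are hypothetical, and the use of the boto3 SDK mentioned in the comment is an assumption, since this guide does not cover it. A real request would typically also need steps, roles, and log settings.

```python
def build_run_job_flow_params(name, termination_protected=True, keep_alive=True):
    """Build a partial RunJobFlow parameter dict (illustrative helper)."""
    return {
        "Name": name,
        "AmiVersion": "3.0.3",
        "Instances": {
            "MasterInstanceType": "m1.xlarge",
            "SlaveInstanceType": "m1.xlarge",
            "InstanceCount": 2,
            # Counterpart of the --alive CLI argument.
            "KeepJobFlowAliveWhenNoSteps": keep_alive,
            # Counterpart of --with-termination-protection.
            "TerminationProtected": termination_protected,
        },
    }


params = build_run_job_flow_params("WordCount with termination protection")
print(params["Instances"])

# Assumption: with boto3 installed and credentials configured, the request
# might be sent as boto3.client("emr").run_job_flow(**params).
```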

Protecting an Existing Cluster

You can add termination protection to an already running cluster using either the CLI or the API.

Note

You cannot currently add termination protection to a running cluster using the Amazon EMR console.

To enable termination protection for an existing cluster using the CLI

  • Set the --set-termination-protection flag to true. This is shown in the following example, where JobFlowID is the identifier of the cluster on which to enable termination protection.

    In the directory where you installed the Amazon EMR CLI, run the following from the command line. For more information, see the Command Line Interface Reference for Amazon EMR.

    • Linux, UNIX, and Mac OS X users:

      ./elastic-mapreduce --set-termination-protection true --jobflow JobFlowID
    • Windows users:

      ruby elastic-mapreduce --set-termination-protection true --jobflow JobFlowID
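The API route mentioned above is not shown in this section, so here is a minimal sketch of it. It assumes a client object that exposes a set_termination_protection method taking JobFlowIds and TerminationProtected, as the boto3 EMR client does. The protect_cluster helper, the RecordingClient stand-in, and the ID j-EXAMPLE are hypothetical and exist only so the example runs without contacting AWS.

```python
def protect_cluster(emr_client, job_flow_id, enabled=True):
    """Enable (or disable) termination protection on one running cluster."""
    emr_client.set_termination_protection(
        JobFlowIds=[job_flow_id],
        TerminationProtected=enabled,
    )


class RecordingClient:
    """Stand-in client that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def set_termination_protection(self, **kwargs):
        self.calls.append(("set_termination_protection", kwargs))


client = RecordingClient()
protect_cluster(client, "j-EXAMPLE")
print(client.calls)
```

Passing the client in as an argument keeps the helper testable; a real client would be substituted for RecordingClient.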


Terminating a Protected Cluster

To terminate a protected cluster, you must first disable termination protection. After termination protection is disabled, you can terminate the cluster from the Amazon EMR console, from the CLI, or programmatically using the TerminateJobFlows API.

To terminate a cluster with termination protection set using the Amazon EMR console

  1. Sign in to the AWS Management Console and open the Amazon Elastic MapReduce console at https://console.aws.amazon.com/elasticmapreduce/.

  2. Select the cluster to terminate.

  3. Click Terminate.

  4. Click Terminate in the confirmation dialog box to confirm that you wish to disable termination protection and terminate the cluster.

To terminate a cluster with termination protection set using the CLI

  1. Disable termination protection by setting --set-termination-protection to false. This is shown in the following example, where JobFlowID is the identifier of the cluster on which to disable termination protection.

    elastic-mapreduce --set-termination-protection false --jobflow JobFlowID
  2. Terminate the cluster using the --terminate parameter and specifying the cluster identifier of the cluster to terminate.

    In the directory where you installed the Amazon EMR CLI, run the following from the command line. For more information, see the Command Line Interface Reference for Amazon EMR.

    • Linux, UNIX, and Mac OS X users:

      ./elastic-mapreduce --terminate JobFlowID
    • Windows users:

      ruby elastic-mapreduce --terminate JobFlowID
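The same two-step order applies when terminating programmatically. The sketch below assumes a client with set_termination_protection and terminate_job_flows methods, as the boto3 EMR client provides. The terminate_protected_cluster helper, the FakeEmrClient class, and the ID j-EXAMPLE are hypothetical; the fake client simply rejects termination while protection is on, so the ordering can be exercised without AWS access.

```python
def terminate_protected_cluster(emr_client, job_flow_id):
    """Disable termination protection, then terminate the cluster."""
    # Step 1: remove termination protection.
    emr_client.set_termination_protection(
        JobFlowIds=[job_flow_id], TerminationProtected=False
    )
    # Step 2: terminate the now-unprotected cluster.
    emr_client.terminate_job_flows(JobFlowIds=[job_flow_id])


class FakeEmrClient:
    """In-memory stand-in that refuses to terminate protected clusters."""

    def __init__(self):
        self.protected = {"j-EXAMPLE": True}
        self.terminated = []

    def set_termination_protection(self, JobFlowIds, TerminationProtected):
        for job_flow_id in JobFlowIds:
            self.protected[job_flow_id] = TerminationProtected

    def terminate_job_flows(self, JobFlowIds):
        for job_flow_id in JobFlowIds:
            if self.protected.get(job_flow_id, False):
                raise RuntimeError(f"{job_flow_id} is termination protected")
            self.terminated.append(job_flow_id)


client = FakeEmrClient()
terminate_protected_cluster(client, "j-EXAMPLE")
print(client.terminated)
```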