Configure Transient and Long-Running Clusters

You can run your cluster as a transient process: the cluster launches, loads the input data, processes the data, stores the output results, and then shuts down automatically. This is the standard model for a cluster that performs a periodic processing task. Because the cluster shuts down automatically, you are billed only for the time required to process your data.

The other model for running a cluster is as a long-running cluster. In this model, the cluster launches and loads the input data. From there you might interactively query the data, use the cluster as a data warehouse, or do periodic processing on a data set so large that it would be inefficient to load the data into new clusters each time. In this model, the cluster persists even when there are no tasks queued for processing.
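As a rough sketch of how the two models map onto the AWS CLI: the create-cluster subcommand accepts --auto-terminate for a transient cluster and --no-auto-terminate for a long-running one. The termination_flag helper below is hypothetical and only picks the flag. The command is printed rather than run so that it can be reviewed first. A transient cluster would normally also be given its processing work through --steps, which is omitted here.

```shell
# Hypothetical helper: map a cluster lifecycle model to the matching
# create-cluster flag. Both flags exist in the AWS CLI.
termination_flag() {
  case "$1" in
    transient)    echo "--auto-terminate" ;;
    long-running) echo "--no-auto-terminate" ;;
    *)            echo "unknown model: $1" >&2; return 1 ;;
  esac
}

# Print the command instead of running it; copy it to a prompt to launch.
echo "aws emr create-cluster --name \"Test cluster\" --release-label emr-4.0.0 --use-default-roles --instance-type m3.xlarge --instance-count 3 $(termination_flag long-running)"
```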

If you want your cluster to be long-running, you must disable auto-termination when you launch the cluster. You can do this using the console, the AWS CLI, or programmatically. You might also want to enable termination protection on a long-running cluster. Termination protection keeps the cluster from being terminated accidentally or when an error occurs. For more information, see Managing Cluster Termination.
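Termination protection can also be switched on or off for a cluster that is already running, using the modify-cluster-attributes subcommand. The sketch below is illustrative only: protection_cmd is a hypothetical helper and j-EXAMPLE12345 is a placeholder cluster ID. It prints the command rather than running it.

```shell
# Hypothetical helper: print the command that turns termination protection
# on or off for an existing cluster.
protection_cmd() {
  local cluster_id="$1" state="$2"
  case "$state" in
    on)  echo "aws emr modify-cluster-attributes --cluster-id $cluster_id --termination-protected" ;;
    off) echo "aws emr modify-cluster-attributes --cluster-id $cluster_id --no-termination-protected" ;;
    *)   echo "state must be on or off" >&2; return 1 ;;
  esac
}

# j-EXAMPLE12345 is a placeholder; substitute your own cluster ID.
protection_cmd j-EXAMPLE12345 on
```

The same --termination-protected flag can be passed to create-cluster to enable protection at launch.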

To launch a long-running cluster using the console

  1. Open the Amazon EMR console at https://console.aws.amazon.com/elasticmapreduce/.

  2. Choose Create cluster.

  3. Choose Go to advanced options.

  4. In the Steps section, in the Auto-terminate field, choose No, which runs the cluster until you terminate it.

    Remember to terminate the cluster when you no longer need it so that you do not continue to accrue charges for an idle cluster.

  5. Proceed with creating the cluster as described in Plan and Configure Clusters.
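The reminder in step 4 about terminating an idle cluster can also be acted on from the AWS CLI. This is a minimal sketch: terminate_cmd is a hypothetical helper, and the cluster IDs are placeholders that you would look up with aws emr list-clusters --active. The helper prints the terminate-clusters command instead of running it.

```shell
# Hypothetical helper: print the command that terminates one or more clusters.
terminate_cmd() {
  if [ "$#" -eq 0 ]; then
    echo "usage: terminate_cmd CLUSTER_ID..." >&2
    return 1
  fi
  echo "aws emr terminate-clusters --cluster-ids $*"
}

# Find running clusters first with: aws emr list-clusters --active
# j-EXAMPLE12345 is a placeholder cluster ID.
terminate_cmd j-EXAMPLE12345
```

If termination protection is enabled on the cluster, it has to be disabled before the terminate request can succeed.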

To launch a long-running cluster using the AWS CLI

By default, clusters created using the AWS CLI are long-running. You can optionally specify the --no-auto-terminate parameter with the create-cluster subcommand to make this explicit.

  • To launch a long-running cluster using the --no-auto-terminate parameter, type the following command, replacing myKey with the name of your EC2 key pair.

    aws emr create-cluster --name "Test cluster" --release-label emr-4.0.0 --applications Name=Hive Name=Pig --use-default-roles --ec2-attributes KeyName=myKey --instance-type m3.xlarge --instance-count 3 --no-auto-terminate

Note

If you have not previously created the default EMR service role and EC2 instance profile, run aws emr create-default-roles to create them before you run the create-cluster subcommand.
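One way to make this check repeatable is sketched below. It assumes that the default service role carries the name EMR_DefaultRole, which is the name create-default-roles normally assigns. The function name ensure_default_roles is hypothetical. The function is only defined here and contacts AWS only when you call it.

```shell
# Hypothetical helper: create the default EMR roles only if they are missing.
# Assumes the default service role is named EMR_DefaultRole.
ensure_default_roles() {
  if aws iam get-role --role-name EMR_DefaultRole >/dev/null 2>&1; then
    echo "default roles already exist"
  else
    aws emr create-default-roles
  fi
}

# Usage (requires configured AWS credentials):
#   ensure_default_roles
```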

For more information on using Amazon EMR commands in the AWS CLI, see http://docs.aws.amazon.com/cli/latest/reference/emr.