Configure Transient and Long-Running Clusters
You can run your cluster as a transient process: one that launches the cluster, loads the input data, processes the data, stores the output results, and then automatically shuts down. This is the standard model for a cluster that is performing a periodic processing task. Shutting down the cluster automatically ensures that you are only billed for the time required to process your data.
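As a rough illustration of the transient model, the following shell sketch assembles an AWS CLI call that submits a single step and passes --auto-terminate, so the cluster shuts down once the step finishes. The bucket, script path, step name, release label, and instance settings are hypothetical placeholders rather than values from this guide. The command is only printed unless you set RUN=1.

```shell
# Sketch of a transient cluster launch. Every value below is an assumed placeholder.
BUCKET="s3://my-example-bucket"   # hypothetical bucket that holds the Pig script
RELEASE_LABEL="emr-5.36.0"        # assumed release label; choose one available to you
STEP="Type=PIG,Name=NightlyReport,ActionOnFailure=TERMINATE_CLUSTER,Args=[-f,${BUCKET}/scripts/report.pig]"

launch_transient() {
  # Assemble the call as positional parameters so argument boundaries are preserved.
  set -- aws emr create-cluster \
    --name "Nightly report" \
    --release-label "$RELEASE_LABEL" \
    --applications Name=Pig \
    --use-default-roles \
    --instance-type m5.xlarge \
    --instance-count 3 \
    --steps "$STEP" \
    --auto-terminate
  # Print the command for review; it runs only when RUN=1 (launching incurs charges).
  printf '%s\n' "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

launch_transient
```

Because the cluster terminates by itself after the last step, the step should write its output to durable storage such as Amazon S3.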
The other model for running a cluster is as a long-running cluster. In this model, the cluster launches and loads the input data. From there you might interactively query the data, use the cluster as a data warehouse, or do periodic processing on a data set so large that it would be inefficient to load the data into new clusters each time. In this model, the cluster persists even when there are no tasks queued for processing.
If you want your cluster to be long-running, you must disable auto-termination when you launch the cluster. You can do this when you launch a cluster using the console, the AWS CLI, or programmatically. Another option you might want to enable on a long-running cluster is termination protection. This protects your cluster from being terminated accidentally or in the event that an error occurs. For more information, see Managing Cluster Termination.
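Termination protection can also be turned on for a cluster that is already running. The sketch below assumes a hypothetical cluster ID and uses the modify-cluster-attributes subcommand of the AWS CLI. It prints the command and runs it only when RUN=1.

```shell
# Sketch: enable termination protection on an existing cluster.
CLUSTER_ID="j-XXXXXXXXXXXXX"   # hypothetical cluster ID; substitute your own

protect_cluster() {
  set -- aws emr modify-cluster-attributes \
    --cluster-id "$CLUSTER_ID" \
    --termination-protected
  # Print the command for review; it runs only when RUN=1.
  printf '%s\n' "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

protect_cluster
```

Passing --no-termination-protected instead turns protection off again, which is required before the cluster can be terminated.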
To launch a long-running cluster using the console
Open the Amazon EMR console at https://console.aws.amazon.com/elasticmapreduce/.
Choose Create cluster.
Choose Go to advanced options.
In the Steps section, in the Auto-terminate field, choose No, which runs the cluster until you terminate it.
Remember to terminate the cluster when it is done so you do not continue to accrue charges on an idle cluster.
Proceed with creating the cluster as described in Plan and Configure Clusters.
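The reminder in the steps above to terminate the cluster when it is done can also be handled from the AWS CLI. This is a minimal sketch that assumes a hypothetical cluster ID. It prints the terminate-clusters call and runs it only when RUN=1.

```shell
# Sketch: terminate a long-running cluster that is no longer needed.
CLUSTER_ID="j-XXXXXXXXXXXXX"   # hypothetical cluster ID; substitute your own

terminate_cluster() {
  set -- aws emr terminate-clusters --cluster-ids "$CLUSTER_ID"
  # Print the command for review; it runs only when RUN=1.
  printf '%s\n' "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

terminate_cluster
```

If termination protection is enabled on the cluster, disable it first, or the termination request will not succeed.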
To launch a long-running cluster using the AWS CLI
By default, clusters created using the AWS CLI are long-running. You can optionally specify the --no-auto-terminate parameter when you use the create-cluster subcommand.
To launch a long-running cluster using the --no-auto-terminate parameter, type the following command and replace myKey with the name of your EC2 key pair.
aws emr create-cluster --name "Test cluster" --release-label ... Pig --use-default-roles --ec2-attributes KeyName=myKey ... --no-auto-terminate
If you have not previously created the default EMR service role and EC2 instance profile, type aws emr create-default-roles to create them before typing the create-cluster subcommand.
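A complete long-running launch might look like the sketch below. The release label, instance type, and instance count are assumptions added for illustration, and installing only Pig is also an assumption. Replace myKey with the name of your EC2 key pair. The command is printed for review and runs only when RUN=1.

```shell
# Sketch only: values marked "assumed" are illustrative placeholders.
KEY_NAME="myKey"             # replace with the name of your EC2 key pair
RELEASE_LABEL="emr-5.36.0"   # assumed release label; choose one available to you
INSTANCE_TYPE="m5.xlarge"    # assumed instance type
INSTANCE_COUNT=3             # assumed instance count

launch_long_running() {
  # Assemble the call as positional parameters so argument boundaries are preserved.
  set -- aws emr create-cluster \
    --name "Test cluster" \
    --release-label "$RELEASE_LABEL" \
    --applications Name=Pig \
    --use-default-roles \
    --ec2-attributes "KeyName=$KEY_NAME" \
    --instance-type "$INSTANCE_TYPE" \
    --instance-count "$INSTANCE_COUNT" \
    --no-auto-terminate
  # Print the command for review; it runs only when RUN=1 (launching incurs charges).
  printf '%s\n' "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

launch_long_running
```

The printed line loses the quoting around the cluster name, so treat it as a review aid rather than something to paste back into a shell.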
For more information on using Amazon EMR commands in the AWS CLI, see http://docs.aws.amazon.com/cli/latest/reference/emr.