Menu
Amazon EMR
Management Guide

Manually Resizing a Running Cluster

You can resize the core instance group in a running cluster by adding Amazon EC2 instances using the AWS Management Console, AWS CLI, or the Amazon EMR API. Applications can use newly provisioned Amazon EC2 instance capacity to host nodes as soon as the new nodes become available. You can also shrink the size of the instance group. A cluster can either be configured to terminate Amazon EC2 instances at the instance-hour boundary (the default in clusters created with Amazon EMR version 5.1.0 and later), or to terminate Amazon EC2 instances only after all processes on the nodes have completed. For more information, see Configure Cluster Scale-Down.

Task nodes also run your Hadoop jobs. After a cluster is running, you can increase or decrease the number of task nodes, and you can add additional task instance groups using the AWS Management Console, AWS CLI, or the Amazon EMR API.

When your cluster runs, Hadoop determines the number of mapper and reducer tasks needed to process the data. Larger clusters should have more tasks for better resource use and shorter processing time. Typically, an EMR cluster remains the same size during the entire cluster; you set the number of tasks when you create the cluster. When you resize a running cluster, you can vary the processing during the cluster execution. Therefore, instead of using a fixed number of tasks, you can vary the number of tasks during the life of the cluster. There are two configuration options to help set the ideal number of tasks:

  • mapred.map.tasksperslot

  • mapred.reduce.tasksperslot

You can set both options in the mapred-conf.xml file. When you submit a job to the cluster, the job client checks the current total number of map and reduce slots available clusterwide. The job client then uses the following equations to set the number of tasks:

  • mapred.map.tasks = mapred.map.tasksperslot * map slots in cluster

  • mapred.reduce.tasks = mapred.reduce.tasksperslot * reduce slots in cluster

The job client only reads the tasksperslot parameter if the number of tasks is not configured. You can override the number of tasks at any time, either for all clusters via a bootstrap action or individually per job by adding a step to change the configuration.

Amazon EMR withstands slave node failures and continues cluster execution even if a slave node becomes unavailable. Amazon EMR automatically provisions additional slave nodes to replace those that fail.

You can have a different number of slave nodes for each cluster step. You can also add a step to a running cluster to modify the number of slave nodes. Because all steps are guaranteed to run sequentially by default, you can specify the number of running slave nodes for any step.

Resize a Cluster Using the Console

You can use the Amazon EMR console to resize a running cluster.

To resize a running cluster using the console

  1. From the Cluster List page, choose a cluster to resize.

  2. On the Cluster Details page, choose Resize. Alternatively, you can expand the Hardware Configuration section, and choose the Resize button adjacent to the core or task groups.

  3. To add nodes to the core group or to one or more task groups, choose the Resize link in the Count column, change the number of instances, and choose the green check mark.

    To add additional task instance groups, choose Add task instance group, choose the task node type, the number of task nodes, and whether the task nodes are Spot instances. If you attempt to add more than 48 task groups, you receive an error message. If your cluster was launched without a task group, choose Add task nodes to add one.

Note

You can only increase the number of core nodes using the console, but you can both increase and decrease the number of task nodes.

When you make a change to the number of nodes, the Amazon EMR console updates the status of the instance group through the Provisioning and Resizing states until they are ready and indicate in brackets the newly requested number of nodes. When the change to the node count finishes, the instance groups return to the Running state.

Resize a Cluster Using the AWS CLI

You can use the AWS CLI to resize a running cluster. You can increase or decrease the number of task nodes, and you can increase the number of core nodes in a running cluster. It is also possible to terminate an instance in the core instance group using the AWS CLI or the API. This should be done with caution. Terminating an instance in the core instance group risks data loss, and the instance is not automatically replaced.

In addition to resizing the core and task groups, you can also add one or more task instance groups to a running cluster using the AWS CLI.

To resize a cluster by changing the instance count using the AWS CLI

You can add instances to the core group or task group, and you can remove instances from the task group using the AWS CLI modify-instance-groups subcommand with the InstanceCount parameter. To add instances to the core or task groups, increase the InstanceCount. To reduce the number of instances in the task group, decrease the InstanceCount. Changing the instance count of the task group to 0 removes all instances but not the instance group.

  • To increase the number of instances in the task instance group from 3 to 4, type the following command and replace ig-31JXXXXXXBTO with the instance group ID.

    aws emr modify-instance-groups --instance-groups InstanceGroupId=ig-31JXXXXXXBTO,InstanceCount=4

    To retrieve the InstanceGroupId, use the describe-cluster subcommand. The output is a JSON object called Cluster that contains the ID of each instance group. To use this command, you need the cluster ID (which you can retrieve using the aws emr list-clusters command or the console). To retrieve the instance group ID, type the following command and replace j-2AXXXXXXGAPLF with the cluster ID.

    aws emr describe-cluster --cluster-id j-2AXXXXXXGAPLF

    Using the AWS CLI, you can also terminate an instance in the core instance group with the --modify-instance-groups subcommand. This should be done with caution. Terminating an instance in the core instance group risks data loss, and the instance is not automatically replaced. To terminate a specific instance you need the instance group ID (returned by the aws emr describe-cluster --cluster-id subcommand) and the instance ID (returned by the aws emr list-instances --cluster-id subcommand), type the following command, replace ig-6RXXXXXX07SA with the instance group ID and replace i-f9XXXXf2 with the instance ID.

    aws emr modify-instance-groups --instance-groups InstanceGroupId=ig-6RXXXXXX07SA,EC2InstanceIdsToTerminate=i-f9XXXXf2

    For more information on using Amazon EMR commands in the AWS CLI, see http://docs.aws.amazon.com/cli/latest/reference/emr.

To resize a cluster by adding task instance groups using the AWS CLI

Using the AWS CLI, you can add between one and 48 task instance groups to a cluster with the --add-instance-groups subcommand. Task instances groups can only be added to a cluster containing a master instance group and a core instance group. When using the AWS CLI, you can add up to 5 task instance groups each time you use the --add-instance-groups subcommand.

  1. To add a single task instance group to a cluster, type the following command and replace j-JXBXXXXXX37R with the cluster ID.

    aws emr add-instance-groups --cluster-id j-JXBXXXXXX37R --instance-groups InstanceCount=6,InstanceGroupType=task,InstanceType=m1.large
  2. To add multiple task instance groups to a cluster, type the following command and replace j-JXBXXXXXX37R with the cluster ID. You can add up to 5 task instance groups in a single command.

    aws emr add-instance-groups --cluster-id j-JXBXXXXXX37R --instance-groups InstanceCount=6,InstanceGroupType=task,InstanceType=m1.large InstanceCount=10,InstanceGroupType=task,InstanceType=m3.xlarge

    For more information on using Amazon EMR commands in the AWS CLI, see http://docs.aws.amazon.com/cli/latest/reference/emr.

Interrupting a Resize

Note

This feature is for Amazon EMR releases 4.1.0 or greater.

You can issue a resize in the midst of an existing resize operation. Additionally, you can stop a previously submitted resize request or submit a new request to override a previous request without waiting for it to finish. You can also stop an existing resize from the console or using the ModifyInstanceGroups API call with the current count as the target count of the cluster.

The following screenshot shows a task instance group that is resizing but can be stopped by choosing Stop.

To interrupt a resize using the CLI

You can use the AWS CLI to stop a resize by using the modify-instance-groups subcommand. Assume you have six instances in your instance group and you want to increase this to 10. You later decide that you would like to cancel this request:

  • The initial request:

    aws emr modify-instance-groups --instance-groups InstanceGroupId=ig-myInstanceGroupId,InstanceCount=10

    The second request to stop the first request:

    aws emr modify-instance-groups --instance-groups InstanceGroupId=ig-myInstanceGroupId,InstanceCount=6

Note

Because this process is asynchronous, you may see instance counts change with respect to previous API requests before subsequent requests are honored. In the case of shrinking, it is possible that if you have work running on the nodes, the instance group may not shrink until nodes have completed their work.

Arrested State

An instance group goes into arrested state if it encounters too many errors while trying to start the new cluster nodes. For example, if new nodes fail while performing bootstrap actions, the instance group goes into an ARRESTED state, rather than continuously provisioning new nodes. After you resolve the underlying issue, reset the desired number of nodes on the cluster's instance group, and then the instance group resumes allocating nodes. Modifying an instance group instructs Amazon EMR to attempt to provision nodes again. No running nodes are restarted or terminated.

In the AWS CLI, the list-instances subcommand returns all instances and their states as does the describe-cluster subcommand. If Amazon EMR detects a fault with an instance group, it changes the group's state to ARRESTED.

To reset a cluster in an ARRESTED state using the AWS CLI

Type the describe-cluster subcommand with the --cluster-id parameter to view the state of the instances in your cluster.

  • To view information on all instances and instance groups in a cluster, type the following command and replace j-3KVXXXXXXY7UG with the cluster ID.

    aws emr describe-cluster --cluster-id j-3KVXXXXXXY7UG

    The output displays information about your instance groups and the state of the instances:

    {
        "Cluster": {
            "Status": {
                "Timeline": {
                    "ReadyDateTime": 1413187781.245,
                    "CreationDateTime": 1413187405.356
                },
                "State": "WAITING",
                "StateChangeReason": {
                    "Message": "Waiting after step completed"
                }
            },
            "Ec2InstanceAttributes": {
                "Ec2AvailabilityZone": "us-west-2b"
            },
            "Name": "Development Cluster",
            "Tags": [],
            "TerminationProtected": false,
            "RunningAmiVersion": "3.2.1",
            "NormalizedInstanceHours": 16,
            "InstanceGroups": [
                {
                    "RequestedInstanceCount": 1,
                    "Status": {
                        "Timeline": {
                            "ReadyDateTime": 1413187775.749,
                            "CreationDateTime": 1413187405.357
                        },
                        "State": "RUNNING",
                        "StateChangeReason": {
                            "Message": ""
                        }
                    },
                    "Name": "MASTER",
                    "InstanceGroupType": "MASTER",
                    "InstanceType": "m1.large",
                    "Id": "ig-3ETXXXXXXFYV8",
                    "Market": "ON_DEMAND",
                    "RunningInstanceCount": 1
                },
                {
                    "RequestedInstanceCount": 1,
                    "Status": {
                        "Timeline": {
                            "ReadyDateTime": 1413187781.301,
                            "CreationDateTime": 1413187405.357
                        },
                        "State": "RUNNING",
                        "StateChangeReason": {
                            "Message": ""
                        }
                    },
                    "Name": "CORE",
                    "InstanceGroupType": "CORE",
                    "InstanceType": "m1.large",
                    "Id": "ig-3SUXXXXXXQ9ZM",
                    "Market": "ON_DEMAND",
                    "RunningInstanceCount": 1
                }
    ...
    }

    To view information on a particular instance group, type the list-instances subcommand with the --cluster-id and --instance-group-types parameters. You can view information for the MASTER, CORE, or TASK groups.

    aws emr list-instances --cluster-id j-3KVXXXXXXY7UG --instance-group-types "CORE"

    Use the modify-instance-groups subcommand with the --instance-groups parameter to reset a cluster in the ARRESTED state. The instance group id is returned by the describe-cluster subcommand.

    aws emr modify-instance-groups --instance-groups InstanceGroupId=ig-3SUXXXXXXQ9ZM,InstanceCount=3