You can increase or decrease the number of nodes in a running cluster. A cluster contains a single master node, which controls any slave nodes that are present. There are two types of slave nodes: core nodes, which run your Hadoop jobs and hold the data to process in the Hadoop Distributed File System (HDFS), and task nodes, which also run your Hadoop jobs but do not contain HDFS. After a cluster is running, you can increase, but not decrease, the number of core nodes. You can both increase and decrease the number of task nodes.
You can modify the size of a running cluster using either the API or the CLI. The Amazon EMR console allows you to monitor clusters that you resized, but it does not provide the option to resize clusters.
Nodes within a cluster are managed by instance groups. All clusters require a master instance group containing a single master node. Clusters using slave nodes require a core instance group that contains at least one core node. Additionally, if a cluster has a core instance group, it can also have a task instance group containing one or more task nodes.
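The instance group rules above can be expressed as a small check. The following is a minimal sketch in Python, not part of Amazon EMR; the function name `validate_instance_groups` and the `{role: node_count}` layout are assumptions made for illustration.

```python
def validate_instance_groups(groups):
    """Return a list of problems found in a {role: node_count} layout.

    `groups` is a hypothetical structure such as
    {"MASTER": 1, "CORE": 2, "TASK": 2}; missing roles count as 0 nodes.
    """
    problems = []
    # Every cluster needs a master instance group with a single master node.
    if groups.get("MASTER", 0) != 1:
        problems.append("master instance group must contain exactly one node")
    # A task instance group is only allowed alongside a core instance group.
    if groups.get("TASK", 0) > 0 and groups.get("CORE", 0) < 1:
        problems.append("a task instance group requires a core instance group")
    return problems


print(validate_instance_groups({"MASTER": 1, "CORE": 2, "TASK": 2}))
print(validate_instance_groups({"MASTER": 1, "TASK": 2}))
```

The first layout passes with no problems reported; the second is flagged because it has task nodes without a core instance group.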
When your cluster runs, Hadoop determines the number of mapper and reducer tasks needed to process the data. Larger clusters should have more tasks for better resource use and shorter processing time. Typically, an Amazon EMR cluster remains the same size for its entire lifetime, and you set the number of tasks when you create the cluster. When you resize a running cluster, the processing capacity changes during cluster execution. Therefore, instead of using a fixed number of tasks, you can vary the number of tasks during the life of the cluster. There are two configuration options to help set the ideal number of tasks:
mapred.map.tasksperslot
mapred.reduce.tasksperslot
You can set both options in the mapred-conf.xml file. When you submit a job
flow to the cluster, the job client checks the current total number of map and reduce slots
available cluster wide. The job client then uses the following equations to set the number of
tasks:
mapred.map.tasks = mapred.map.tasksperslot * map slots in cluster
mapred.reduce.tasks = mapred.reduce.tasksperslot * reduce slots in cluster
The job client only reads the tasksperslot parameter if the number
of tasks is not configured. You can override the number of tasks at any time, either for all
clusters via a bootstrap action or individually per job by adding a step to change the
configuration.
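As an illustration of the equations above, the following Python sketch computes the task count that a job submitted before and after a resize would receive. It is not the job client's actual code; the function name `default_task_count` and the slot counts are hypothetical values chosen for the example.

```python
def default_task_count(tasks_per_slot, slots_in_cluster, configured_tasks=None):
    """Apply the equation above for one task type (map or reduce)."""
    if configured_tasks is not None:
        # tasksperslot is only read when the number of tasks is not configured.
        return configured_tasks
    return tasks_per_slot * slots_in_cluster


# Hypothetical cluster: 4 slave nodes with 2 map slots each, tasksperslot = 2.
before_resize = default_task_count(2, 4 * 2)
# After growing to 6 slave nodes, a newly submitted job sees more map slots.
after_resize = default_task_count(2, 6 * 2)
# An explicitly configured task count is used unchanged.
overridden = default_task_count(2, 6 * 2, configured_tasks=10)
print(before_resize, after_resize, overridden)
```

Because the slot total is read when a job is submitted, the same tasksperslot setting yields more tasks for jobs submitted after the cluster grows.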
Amazon EMR withstands slave node failures and continues cluster execution even if a slave node becomes unavailable. Amazon EMR automatically provisions additional slave nodes to replace those that fail.
You can have a different number of slave nodes for each cluster step. You can also add a step to a running cluster to modify the number of its slave nodes. Because all steps are guaranteed to run sequentially, you can specify the number of running slave nodes for any job flow step.
The Amazon EMR CLI provides parameters so you can control how you resize a running cluster.
You can increase or decrease the number of nodes in a running cluster. The parameters are listed in the following table.
| Parameter | Description |
|---|---|
| --modify-instance-group | Modify an existing instance group. |
| --instance-count | Set the count of nodes for an instance group. Note: You are only allowed to increase the number of nodes in a core instance group. You can increase or decrease the number of nodes in a task instance group. Master instance groups cannot be modified. |
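The resizing rules in the note above can be checked before you issue a --modify-instance-group request. The following Python sketch is illustrative only; `check_resize` is a hypothetical helper, not part of the Amazon EMR CLI.

```python
def check_resize(role, current_count, new_count):
    """Return (allowed, reason) for a requested instance-count change."""
    role = role.upper()
    if role == "MASTER":
        return False, "master instance groups cannot be modified"
    if role == "CORE":
        if new_count < current_count:
            return False, "core instance groups can only be increased"
        return True, "ok"
    if role == "TASK":
        # Task instance groups can grow or shrink.
        return True, "ok"
    raise ValueError("unknown instance group role: " + role)


print(check_resize("CORE", 2, 4))
print(check_resize("CORE", 4, 2))
print(check_resize("TASK", 4, 2))
print(check_resize("MASTER", 1, 2))
```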
You can add an instance group to your running cluster. The parameters are listed in the following table.
| Parameter | Description |
|---|---|
| --add-instance-group | Add an instance group to an existing cluster. The role may be TASK only. Currently, Amazon Elastic MapReduce (Amazon EMR) does not permit adding core or master instance groups to a running cluster. |
| --instance-count | Set the count of nodes for an instance group. |
| --instance-type | Set the type of EC2 instance to create nodes for an instance group. |
You can specify instance groups when you create a cluster. The parameters are listed in the following table.
| Parameter | Description |
|---|---|
| --instance-group | Set the instance group type. A type is MASTER, CORE, or TASK. |
| --instance-count | Set the count of nodes for an instance group. |
| --instance-type | Set the type of EC2 instance for nodes in an instance group. |
The --describe command describes all instance groups and node types. If you run elastic-mapreduce --jobflow JobFlowID --describe, you see a section called InstanceGroups. You can see that your cluster contains a master instance group and, potentially, core and task instance groups.
The following CLI commands show how to display information about the instance groups of a running cluster. In the directory where you installed the Amazon EMR CLI, run the following from the command line. For more information, see the Command Line Interface Reference for Amazon EMR.
Linux, UNIX, and Mac OS X users:
./elastic-mapreduce --jobflow JobFlowID --describe
Windows users:
ruby elastic-mapreduce --jobflow JobFlowID --describe
This returns information about the cluster similar to the following:
{
"JobFlows": [
{
"Name": "Development Job Flow (requires manual termination)",
"LogUri": "s3n:\/\/myawsbucket\/FileName\/",
"ExecutionStatusDetail": {
"StartDateTime": null,
"EndDateTime": null,
"LastStateChangeReason": "Starting instances",
"CreationDateTime": DateTimeStamp,
"State": "STARTING",
"ReadyDateTime": null
},
"Steps": [],
"Instances": {
"MasterInstanceId": null,
"Ec2KeyName": "KeyName",
"NormalizedInstanceHours": 0,
"InstanceCount": 5,
"Placement": {
"AvailabilityZone": "us-east-1a"
},
"SlaveInstanceType": "m1.small",
"HadoopVersion": "0.20",
"MasterPublicDnsName": null,
"KeepJobFlowAliveWhenNoSteps": true,
"InstanceGroups": [
{
"StartDateTime": null,
"SpotPrice": null,
"Name": "Master Instance Group",
"InstanceRole": "MASTER",
"EndDateTime": null,
"LastStateChangeReason": "",
"CreationDateTime": DateTimeStamp,
"LaunchGroup": null,
"InstanceGroupId": "InstanceGroupID",
"State": "PROVISIONING",
"Market": "ON_DEMAND",
"ReadyDateTime": null,
"InstanceType": "m1.small",
"InstanceRunningCount": 0,
"InstanceRequestCount": 1
},
{
"StartDateTime": null,
"SpotPrice": null,
"Name": "Task Instance Group",
"InstanceRole": "TASK",
"EndDateTime": null,
"LastStateChangeReason": "",
"CreationDateTime": DateTimeStamp,
"LaunchGroup": null,
"InstanceGroupId": "InstanceGroupID",
"State": "PROVISIONING",
"Market": "ON_DEMAND",
"ReadyDateTime": null,
"InstanceType": "m1.small",
"InstanceRunningCount": 0,
"InstanceRequestCount": 2
},
{
"StartDateTime": null,
"SpotPrice": null,
"Name": "Core Instance Group",
"InstanceRole": "CORE",
"EndDateTime": null,
"LastStateChangeReason": "",
"CreationDateTime": DateTimeStamp,
"LaunchGroup": null,
"InstanceGroupId": "InstanceGroupID",
"State": "PROVISIONING",
"Market": "ON_DEMAND",
"ReadyDateTime": null,
"InstanceType": "m1.small",
"InstanceRunningCount": 0,
"InstanceRequestCount": 2
}
],
"MasterInstanceType": "m1.small"
},
"bootstrapActions": [],
"JobFlowId": "JobFlowID"
}
]
}
An instance group goes into an arrested state if it encounters too many errors while trying to start the new cluster nodes. For example, if new nodes fail while performing bootstrap actions, the instance group goes into an arrested state rather than continuously provisioning new nodes. After you resolve the underlying issue, reset the desired number of nodes on the cluster's instance group, and the instance group resumes allocating nodes.
The command --describe returns all instance groups and node
types, and so you can see the state of the instance groups for the cluster. If
Amazon EMR detects any kind of fault with an instance group, it changes the group's
state to ARRESTED.
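To find arrested instance groups without reading the --describe output by hand, you can parse the JSON. The following Python sketch assumes the output has the structure of the sample shown earlier (JobFlows, Instances, InstanceGroups, State); the function name `arrested_groups` and the IDs in the sample string are hypothetical.

```python
import json


def arrested_groups(describe_output):
    """Return (job flow ID, instance group ID, role) for each ARRESTED group."""
    doc = json.loads(describe_output)
    found = []
    for flow in doc.get("JobFlows", []):
        groups = flow.get("Instances", {}).get("InstanceGroups", [])
        for group in groups:
            if group.get("State") == "ARRESTED":
                found.append((flow.get("JobFlowId"),
                              group.get("InstanceGroupId"),
                              group.get("InstanceRole")))
    return found


# Trimmed, hypothetical output; real output contains many more fields.
sample = """
{"JobFlows": [{"JobFlowId": "j-EXAMPLE", "Instances": {"InstanceGroups": [
  {"InstanceGroupId": "ig-MASTER", "InstanceRole": "MASTER", "State": "RUNNING"},
  {"InstanceGroupId": "ig-TASK", "InstanceRole": "TASK", "State": "ARRESTED"}
]}}]}
"""
print(arrested_groups(sample))
```

Note that placeholders such as DateTimeStamp in the sample output are not valid JSON; actual CLI output is expected to contain real values in their place.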
Use the --modify-instance-group command to reset a cluster in
the ARRESTED state.
Modifying the instance group instructs Amazon EMR to attempt to provision nodes again. No running nodes are restarted or terminated.
To reset a cluster in an arrested state
Enter the --modify-instance-group command as follows:
In the directory where you installed the Amazon EMR CLI, run the following from the command line. For more information, see the Command Line Interface Reference for Amazon EMR.
Linux, UNIX, and Mac OS X users:
./elastic-mapreduce --modify-instance-group InstanceGroupID \
  --instance-count COUNT
Windows users:
ruby elastic-mapreduce --modify-instance-group InstanceGroupID --instance-count COUNT
InstanceGroupID is the ID of the arrested instance group and COUNT is the number of nodes you want in the instance group.
Tip
You do not need to change the number of nodes from the original configuration to free
a running cluster. Set --instance-count to the same count as the
original setting.
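The reset can also be scripted. The following Python sketch builds the reset command for an instance group entry taken from the --describe output, reusing the group's existing count as the tip suggests. It assumes that InstanceRequestCount reflects the original setting; the helper name `reset_command` and the group ID are hypothetical, and the sketch only prints the command rather than running it.

```python
import shlex


def reset_command(group, cli="./elastic-mapreduce"):
    """Build the argument list that resets one arrested instance group.

    `group` is one entry from the InstanceGroups section of the
    --describe output. Assumption: InstanceRequestCount holds the
    originally requested node count.
    """
    return [cli,
            "--modify-instance-group", group["InstanceGroupId"],
            "--instance-count", str(group["InstanceRequestCount"])]


group = {"InstanceGroupId": "ig-EXAMPLE", "InstanceRequestCount": 2}
print(shlex.join(reset_command(group)))
```

To run the command instead of printing it, you could pass the returned list to `subprocess.run` from the directory where the CLI is installed.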
Before October 2010, Amazon EMR did not have the concept of instance
groups. Clusters developed for Amazon EMR that were built before the
option to resize running clusters was available are considered legacy
clusters. Previously, the Amazon EMR architecture did not use instance groups to
manage nodes and only one type of slave node existed. Legacy clusters reference
slaveInstanceType and other now deprecated fields. Amazon EMR
continues to support the legacy clusters; you do not need to modify them to run them
correctly.
If you run a legacy cluster and only configure master and slave nodes, you observe
a slaveInstanceType and other deprecated fields associated with your clusters.
Before October 2010, all cluster nodes were either master nodes or slave nodes. An Amazon EMR configuration could typically be represented like the following diagram.

Old Amazon EMR Model
| 1 | A legacy cluster launches and a request is sent to Amazon EMR to start the cluster. |
| 2 | Amazon EMR creates a Hadoop cluster. |
| 3 | The legacy cluster runs on a cluster consisting of a single master node and the specified number of slave nodes. |
Clusters created using the older model are fully supported and function as originally designed. The Amazon EMR API and commands map directly to the new model. Master nodes remain master nodes and become part of the master instance group. Slave nodes still run HDFS and become core nodes and join the core instance group.
Note
No task instance group or task nodes are created as part of a legacy cluster; however, you can add them to a running cluster at any time.
The following diagram illustrates how a legacy cluster now maps to master and core instance groups.

Old Amazon EMR Model Remapped to Current Architecture
| 1 | A request is sent to Amazon EMR to start a cluster. |
| 2 | Amazon EMR creates a Hadoop cluster with a master instance group and core instance group. |
| 3 | The master node is added to the master instance group. |
| 4 | The slave nodes are added to the core instance group. |
Amazon EMR provides a JAR file that you can run as a cluster step to resize a cluster programmatically instead of directly through the CLI.
The JAR file to programmatically resize a running cluster is available at
s3://elasticmapreduce/libs/resize-job-flow/0.1/resize-job-flow.jar and
supports the optional arguments described in the following table.
| Option | Description |
|---|---|
| | List all help information. |
| --modify-instance-group | Apply changes to the named instance group, specified by either role or Instance Group ID. Instance group roles: MASTER, CORE, or TASK. |
| --set-instance-count | Change the number of nodes of the named instance group. |
| --add-instance-group | Apply operations to the named instance group. Instance group roles: TASK. Currently, Amazon EMR does not permit adding core or master instance groups to a running cluster. |
| --instance-count | Specify the number of nodes for the named instance group. |
| --instance-type | Specify the type of EC2 instances used to create nodes in the new instance group. |
| --no-wait | The cluster continues in the RUNNING state after the step makes a request to create or resize an instance group. |
| --on-failure | Step state if one of the resizing actions fails: FAIL or CONTINUE. |
| --on-arrested | Cluster state if an instance group enters the ARRESTED state: FAIL, WAIT, or CONTINUE. |
The JAR file is configured to write to stderr. Only error and fatal
messages are reported. The JAR file includes the source code.
The cluster step looks similar to:
s3://elasticmapreduce/libs/resize-job-flow/0.1/resize-job-flow.jar \
--add-instance-group task --instance-type InstanceType --instance-count 10
For more information about how to add a cluster step, see Add Steps to a Cluster.
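If you generate steps from a script, you can assemble the JAR location and its arguments in one place. The following Python sketch is an illustration under stated assumptions: the dictionary layout and the function names are hypothetical, and the argument order for --modify-instance-group with --set-instance-count is assumed to mirror the --add-instance-group example above.

```python
RESIZE_JAR = "s3://elasticmapreduce/libs/resize-job-flow/0.1/resize-job-flow.jar"


def add_task_group_step(instance_type, instance_count):
    """Describe a step that adds a task instance group (the example above)."""
    return {"jar": RESIZE_JAR,
            "args": ["--add-instance-group", "task",
                     "--instance-type", instance_type,
                     "--instance-count", str(instance_count)]}


def resize_group_step(role_or_id, new_count):
    """Describe a step that resizes an existing instance group.

    Assumption: --set-instance-count follows the group name in the same
    way that --instance-count follows it in the documented example.
    """
    return {"jar": RESIZE_JAR,
            "args": ["--modify-instance-group", role_or_id,
                     "--set-instance-count", str(new_count)]}


step = add_task_group_step("m1.small", 10)
print(step["jar"], " ".join(step["args"]))
```

How the resulting JAR location and arguments are submitted depends on the tool you use to add steps to the cluster.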