Menu
Amazon EMR
Developer Guide

Scaling Cluster Resources

This documentation is for AMI versions 2.x and 3.x of Amazon EMR. For information about Amazon EMR releases 4.0.0 and above, see the Amazon EMR Release Guide. For information about managing the Amazon EMR service in 4.x releases, see the Amazon EMR Management Guide.

You can adjust the number of Amazon EC2 instances available to an EMR cluster automatically or manually in response to workloads that have varying demands. The following options are available:

  • You can configure automatic scaling for the core instance group and task instance groups when you first create them or after it's running. Amazon EMR automatically configures Auto Scaling parameters according to rules you specify, and then adds and removes instances based on a CloudWatch metric.

  • You can manually resize the core instance group and task instance groups by manually adding or removing Amazon EC2 instances.

  • You can add a new task instance group to the cluster.

The option to specify the Amazon EC2 instance type is only available during initial configuration of an instance group, so you can change the Amazon EC2 instance type only by adding a new task. A cluster-wide configuration allows you to specify whether Amazon EC2 instances removed from a cluster are terminated at the instance-hour boundary, or when tasks on the Amazon EC2 instance are complete. For more information, see Configure Cluster Scale-Down.

Before you choose one of the methods for scaling described in this section, you should be familiar with some important concepts. First, you should understand the role of node types in an EMR cluster and how instance groups are used to manage them. For more information about the function of node types, see What is Amazon EMR?, and for more information about instance groups, see Instance Groups. You should also develop a strategy for right-sizing cluster resources based on the nature of your workload. For more information, see Cluster Configuration Guidelines.

Note

The master instance group in an EMR cluster always consists of a single node running on a single Amazon EC2 instance, so it can't scale after you initially configure it. You work with the core instance groups and task instance groups to scale out and scale in a cluster. It's possible to have a cluster with only a master node, and no core or task nodes. You must have at least one core node at cluster creation in order to scale the cluster. In other words, single node clusters cannot be resized.