Amazon EMR
Developer Guide

Submit Work to a Cluster

This documentation is for AMI versions 2.x and 3.x of Amazon EMR. For information about Amazon EMR releases 4.0.0 and above, see the Amazon EMR Release Guide. For information about managing the Amazon EMR service in 4.x releases, see the Amazon EMR Management Guide.

This section describes the methods for submitting work to an Amazon EMR cluster. You can submit work to a cluster by adding steps or by interactively submitting Hadoop jobs to the master node. You can add steps to a cluster using the AWS CLI, the Amazon EMR API, or the console. You can submit Hadoop jobs interactively by establishing an SSH connection to the master node (using an SSH client such as PuTTY or OpenSSH) or by using the ssh subcommand in the AWS CLI.
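As a rough illustration of adding a step through the API, the sketch below builds a step definition and hands it to an EMR client. It assumes the AWS SDK for Python (boto3), which this guide does not name. `build_jar_step` and `submit_steps` are hypothetical helpers, and the cluster ID, bucket, and JAR path in the usage comments are placeholders.

```python
def build_jar_step(name, jar, args, action_on_failure="CONTINUE"):
    """Return a step definition in the shape expected by AddJobFlowSteps."""
    return {
        "Name": name,
        "ActionOnFailure": action_on_failure,
        "HadoopJarStep": {"Jar": jar, "Args": list(args)},
    }


def submit_steps(emr_client, cluster_id, steps):
    """Submit steps to a cluster and return the step IDs reported by the service."""
    response = emr_client.add_job_flow_steps(JobFlowId=cluster_id, Steps=list(steps))
    return response["StepIds"]


# Usage (assumes boto3 is installed and AWS credentials are configured;
# all identifiers below are placeholders):
#   import boto3
#   emr = boto3.client("emr")
#   step = build_jar_step(
#       "word-count",
#       "s3://my-bucket/wordcount.jar",
#       ["s3://my-bucket/input", "s3://my-bucket/output"],
#   )
#   submit_steps(emr, "j-XXXXXXXXXXXXX", [step])
```

Passing the client in as an argument keeps the helpers free of any SDK import, so they can be exercised with a stand-in object before being pointed at a real cluster.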

The maximum number of PENDING and ACTIVE steps allowed in a cluster is 256 (this includes system steps such as install Pig, install Hive, install HBase, and configure debugging). You can submit an unlimited number of steps over the lifetime of a long-running cluster created using these AMIs, but only 256 steps can be ACTIVE or PENDING at any given time.
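Because only 256 steps can be PENDING or ACTIVE at once, a client that queues many steps may want to check the remaining headroom before submitting. The following is a minimal sketch of that check, not part of any AWS tool. The function names are hypothetical, and the state strings follow the wording of this section; the tool you use to list steps may report different state names, so adjust `IN_FLIGHT_STATES` accordingly.

```python
MAX_IN_FLIGHT_STEPS = 256  # limit on PENDING plus ACTIVE steps, from the text above
IN_FLIGHT_STATES = {"PENDING", "ACTIVE"}  # assumption: match these to the states your tool reports


def available_step_slots(step_states, limit=MAX_IN_FLIGHT_STEPS):
    """Return how many more steps can be submitted right now."""
    in_flight = sum(1 for state in step_states if state in IN_FLIGHT_STATES)
    return max(0, limit - in_flight)


def split_for_submission(new_steps, step_states, limit=MAX_IN_FLIGHT_STEPS):
    """Split new_steps into (submit_now, hold_back) based on the free slots."""
    slots = available_step_slots(step_states, limit)
    return new_steps[:slots], new_steps[slots:]
```

Steps in the `hold_back` list can be retried later, once earlier steps have reached COMPLETED or FAILED and freed their slots.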

The total number of step records you can view (regardless of status) is 1,000. This total includes both user-submitted and system steps. As the status of user-submitted steps changes to COMPLETED or FAILED, additional user-submitted steps can be added to the cluster until the 1,000-step limit is reached. After 1,000 steps have been added to a cluster, submitting additional steps causes the oldest user-submitted step records to be removed. These records are not removed from the log files, but they are removed from the console display, and they do not appear when you use the CLI or API to retrieve cluster information. System step records are never removed.
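The retention rule described above can be modeled in a few lines. This is only a toy model of the described behavior, not the service's implementation; `add_step_record` and the record layout are hypothetical.

```python
MAX_STEP_RECORDS = 1000  # visible step records, from the text above


def add_step_record(records, name, is_system=False, limit=MAX_STEP_RECORDS):
    """Append a record, then drop the oldest user-submitted records over the limit.

    records is a list of (name, is_system) tuples ordered oldest first.
    System step records are never dropped.
    """
    records.append((name, is_system))
    while len(records) > limit:
        oldest_user = next(
            (i for i, (_, system) in enumerate(records) if not system), None
        )
        if oldest_user is None:
            break  # only system records remain, and those are never removed
        del records[oldest_user]
    return records
```

In this model, adding two system steps followed by 1,000 user steps leaves 1,000 visible records: both system records remain, and the two oldest user records drop out of view.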

The step information you can view depends on the mechanism used to retrieve cluster information. The following table indicates the step information returned by each of the available options.

Option          DescribeJobFlow or --describe --jobflow   ListSteps or list-steps
SDK             256 steps                                 1,000 steps
Amazon EMR CLI  256 steps                                 NA
AWS CLI         NA                                        1,000 steps
API             256 steps                                 1,000 steps

For clusters created using AMI version 3.1.0 and earlier (Hadoop 2.x) or AMI version 2.4.7 and earlier (Hadoop 1.x), the total number of steps available over the lifetime of a cluster is limited to 256. For more information about how to overcome this limitation, see Add More than 256 Steps to a Cluster.