Amazon Elastic MapReduce
Developer Guide (API Version 2009-03-31)

Add Steps to a Cluster

This section describes the methods for adding steps to a cluster.

You can add steps to a running cluster only if you set the KeepJobFlowAliveWhenNoSteps parameter to True when you create the cluster. This setting keeps the Hadoop cluster running even after all of its steps have completed.
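As a rough illustration of where this parameter sits, the following sketch builds the Instances portion of a cluster creation (RunJobFlow) request as a plain Python dictionary. The helper names are hypothetical, and the instance type and count are placeholder values chosen for illustration; they do not come from this guide. The sketch only constructs data and does not call Amazon EMR.

```python
def build_instances_config(keep_alive=True):
    """Return the Instances part of a RunJobFlow request.

    The instance type and count below are illustrative placeholders.
    """
    return {
        "MasterInstanceType": "m1.small",  # placeholder value (assumption)
        "SlaveInstanceType": "m1.small",   # placeholder value (assumption)
        "InstanceCount": 1,                # placeholder value (assumption)
        # Keeps the cluster running after its steps have completed.
        "KeepJobFlowAliveWhenNoSteps": keep_alive,
    }


def stays_alive_when_idle(instances):
    """Report whether the configuration keeps the cluster alive with no steps."""
    return bool(instances.get("KeepJobFlowAliveWhenNoSteps", False))


instances = build_instances_config(keep_alive=True)
print(stays_alive_when_idle(instances))
```

A configuration built with keep_alive=False describes a cluster that shuts down once its steps finish, which leaves nothing to add further steps to.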

The following procedure creates a simple cluster and then adds a step to the cluster.

To add a step to a cluster using the CLI

  1. Create a cluster:

    In the directory where you installed the Amazon EMR CLI, run the following from the command line. For more information, see the Command Line Interface Reference for Amazon EMR.

    • Linux, UNIX, and Mac OS X users:

      ./elastic-mapreduce --create --alive 
    • Windows users:

      ruby elastic-mapreduce --create --alive 

    The --alive parameter keeps the cluster running even when all steps have been completed, unless you explicitly terminate it.

    The output looks similar to the following.

    Created cluster JobFlowID
  2. Add a step:

    • Linux, UNIX, and Mac OS X users:

      ./elastic-mapreduce -j JobFlowID \
      --jar s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar \
      --arg s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br \
      --arg s3n://elasticmapreduce/samples/cloudburst/input/100k.br \
      --arg hdfs:///cloudburst/output/1 \
      --arg 36 --arg 3 --arg 0 --arg 1 --arg 240 --arg 48 --arg 24 --arg 24 --arg 128 --arg 16
    • Windows users:

      ruby elastic-mapreduce -j JobFlowID --jar s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar --arg s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br --arg s3n://elasticmapreduce/samples/cloudburst/input/100k.br --arg hdfs:///cloudburst/output/1 --arg 36 --arg 3 --arg 0 --arg 1 --arg 240 --arg 48 --arg 24 --arg 24 --arg 128 --arg 16

This command adds an example step to the cluster. The step downloads and runs the JAR file, and the arguments are passed to the main function in the JAR file. If your JAR file does not have a manifest, specify the JAR file's main class using the --main-class option.
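The same step can also be described as data. The following sketch, which is an illustration rather than part of the CLI, builds a step definition in the shape used by the AddJobFlowSteps API action (Name, ActionOnFailure, and a HadoopJarStep with Jar, Args, and an optional MainClass). The build_jar_step helper and the step name are hypothetical, and the CANCEL_AND_WAIT failure action is a choice made for this example, not necessarily what the CLI uses. The sketch does not contact Amazon EMR.

```python
def build_jar_step(name, jar, args, main_class=None,
                   action_on_failure="CANCEL_AND_WAIT"):
    """Build a step definition for a custom JAR step.

    main_class is only needed when the JAR has no manifest, which mirrors
    the --main-class option of the CLI.
    """
    hadoop_jar_step = {
        "Jar": jar,
        # Arguments are passed to the JAR's main function as strings.
        "Args": [str(arg) for arg in args],
    }
    if main_class is not None:
        hadoop_jar_step["MainClass"] = main_class
    return {
        "Name": name,
        "ActionOnFailure": action_on_failure,  # assumption for this example
        "HadoopJarStep": hadoop_jar_step,
    }


# The CloudBurst sample step from the procedure above.
cloudburst_step = build_jar_step(
    name="CloudBurst example",  # hypothetical step name
    jar="s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar",
    args=[
        "s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br",
        "s3n://elasticmapreduce/samples/cloudburst/input/100k.br",
        "hdfs:///cloudburst/output/1",
        36, 3, 0, 1, 240, 48, 24, 24, 128, 16,
    ],
)
print(cloudburst_step["HadoopJarStep"]["Args"])
```

A dictionary of this shape could be passed, together with the cluster's JobFlowID, to an SDK call that wraps AddJobFlowSteps.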

Note

The maximum number of steps allowed in a cluster is 256. The debugging option adds steps of its own, so enabling it can cause you to reach the step limit sooner. For more information about how to overcome this limitation, see Add More than 256 Steps to a Cluster.
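Before adding a batch of steps, it can help to check how much room is left under this limit. The following sketch shows the arithmetic. The function names are hypothetical, and it assumes that steps_already_added counts every step submitted to the cluster so far, including any steps added by the debugging option.

```python
MAX_STEPS_PER_CLUSTER = 256  # limit stated in this guide


def remaining_step_capacity(steps_already_added):
    """Return how many more steps fit under the 256-step limit."""
    if steps_already_added < 0:
        raise ValueError("steps_already_added must not be negative")
    return max(0, MAX_STEPS_PER_CLUSTER - steps_already_added)


def can_add_steps(steps_already_added, new_steps):
    """Return True if new_steps more steps stay within the limit."""
    return new_steps <= remaining_step_capacity(steps_already_added)


print(remaining_step_capacity(250))
print(can_add_steps(250, 7))
```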