Amazon Elastic MapReduce
Developer Guide (API Version 2009-03-31)

Wait for Steps to Complete

When you submit steps to a cluster using the command line interface (CLI), you can specify that the CLI should wait until the cluster has completed all pending steps before accepting additional commands. This can be useful, for example, if you are using a step to copy data from Amazon S3 into HDFS and need to be sure that the copy operation is complete before you run the next step in the cluster. You do this by specifying the --wait-for-steps parameter after you submit the copy step.

The --wait-for-steps parameter does not ensure that the step completes successfully, only that it has finished running. If, as in the earlier example, you need to ensure that the step was successful before you submit the next step, check the cluster status. If the step failed, the cluster will be in the FAILED status.
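The status check described above can be scripted. The following Python sketch shows one possible approach and is not part of the Amazon EMR CLI. It assumes that you have captured the JSON printed by ./elastic-mapreduce --describe -j JobFlowID, and that the JSON follows the DescribeJobFlows response layout (JobFlows, ExecutionStatusDetail, State, Steps, StepConfig, Name). The sample document, the j-EXAMPLE identifier, and the function names are hypothetical. Reading the step-level states is an extra check beyond the cluster status mentioned above.

```python
import json

# Hypothetical, heavily trimmed sample of the JSON assumed to be printed by
#   ./elastic-mapreduce --describe -j JobFlowID
# Field names are assumed to follow the DescribeJobFlows response layout.
SAMPLE_DESCRIBE_OUTPUT = """
{
  "JobFlows": [
    {
      "JobFlowId": "j-EXAMPLE",
      "ExecutionStatusDetail": {"State": "FAILED"},
      "Steps": [
        {
          "StepConfig": {"Name": "Copy data to HDFS"},
          "ExecutionStatusDetail": {"State": "FAILED"}
        }
      ]
    }
  ]
}
"""


def cluster_state(describe_output):
    """Return the cluster-level state of the first cluster in the JSON."""
    job_flow = json.loads(describe_output)["JobFlows"][0]
    return job_flow["ExecutionStatusDetail"]["State"]


def step_states(describe_output):
    """Return a list of (step name, step state) pairs, in step order."""
    job_flow = json.loads(describe_output)["JobFlows"][0]
    return [
        (step["StepConfig"]["Name"], step["ExecutionStatusDetail"]["State"])
        for step in job_flow.get("Steps", [])
    ]


if __name__ == "__main__":
    # Only submit the next step if nothing has failed so far.
    if cluster_state(SAMPLE_DESCRIBE_OUTPUT) == "FAILED":
        print("Cluster failed; do not submit the next step.")
    for name, state in step_states(SAMPLE_DESCRIBE_OUTPUT):
        print(name, state)
```

In practice you would pass the captured CLI output to these functions in place of the sample string, after the --wait-for-steps command has returned.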

Although you can add the --wait-for-steps parameter in the same CLI command that adds a step to the cluster, it is best to add it in a separate CLI command. This ensures that the --wait-for-steps argument is parsed and applied after the step is created. This is illustrated in the example that follows.

To wait until a step completes

  • Run the CLI with the --wait-for-steps parameter for the cluster, as in the following example, where JobFlowID is the cluster identifier that Amazon EMR returned when you created the cluster. The first CLI command adds a step to the cluster; its JAR, main class, and arguments are from the Word Count sample application. The second CLI command causes the CLI to wait until all of the currently pending steps have finished before accepting additional commands.

    In the directory where you installed the Amazon EMR CLI, run the following from the command line. For more information, see the Command Line Interface Reference for Amazon EMR.

    • Linux, UNIX, and Mac OS X users:

      ./elastic-mapreduce -j JobFlowID \
          --jar s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar \
          --main-class org.myorg.WordCount \
          --arg s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br \
          --arg s3n://elasticmapreduce/samples/cloudburst/input/100k.br \
          --arg hdfs:///cloudburst/output/1 \
          --arg 36 --arg 3 --arg 0 --arg 1 --arg 240 --arg 48 --arg 24 \
          --arg 24 --arg 128 --arg 16

      ./elastic-mapreduce -j JobFlowID \
          --wait-for-steps

    • Windows users:

      ruby elastic-mapreduce -j JobFlowID --jar s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar --main-class org.myorg.WordCount --arg s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br --arg s3n://elasticmapreduce/samples/cloudburst/input/100k.br --arg hdfs:///cloudburst/output/1 --arg 36 --arg 3 --arg 0 --arg 1 --arg 240 --arg 48 --arg 24 --arg 24 --arg 128 --arg 16

      ruby elastic-mapreduce -j JobFlowID --wait-for-steps
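
If you drive the CLI from a script, you can keep the same two-command pattern: add the step first, then wait in a separate invocation. The following Python sketch is one way to do that under stated assumptions. It assumes the Linux form of the command (./elastic-mapreduce in the current directory). The function names, the run parameter, and the j-EXAMPLE identifier are hypothetical, and the step arguments are copied from the example above.

```python
import subprocess


def build_commands(job_flow_id, cli="./elastic-mapreduce"):
    """Return the two CLI invocations: add the step, then wait for steps."""
    add_step = [
        cli, "-j", job_flow_id,
        "--jar", "s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar",
        "--main-class", "org.myorg.WordCount",
        "--arg", "s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br",
        "--arg", "s3n://elasticmapreduce/samples/cloudburst/input/100k.br",
        "--arg", "hdfs:///cloudburst/output/1",
    ]
    # Remaining numeric arguments, in the same order as the CLI example.
    for value in ["36", "3", "0", "1", "240", "48", "24", "24", "128", "16"]:
        add_step += ["--arg", value]

    # The wait is deliberately a separate command, so it is applied
    # only after the step has been created.
    wait = [cli, "-j", job_flow_id, "--wait-for-steps"]
    return add_step, wait


def add_step_then_wait(job_flow_id, run=subprocess.run):
    """Run both commands in order; `run` can be replaced for dry runs."""
    for command in build_commands(job_flow_id):
        # check=True only reports a non-zero exit code from the CLI process.
        # It does not tell you whether the step itself succeeded.
        run(command, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    add_step_then_wait("j-EXAMPLE", run=lambda cmd, check: print(" ".join(cmd)))
```

Because --wait-for-steps reports only that the steps have finished, you still need to check the cluster status before submitting a step that depends on the earlier one.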