Enhanced step debugging - Amazon EMR

Enhanced step debugging

If an Amazon EMR step fails and you submitted your work using the Step API operation with an AMI of version 5.x or later, Amazon EMR can identify and return the root cause of the step failure in some cases, along with the name of the relevant log file and a portion of the application stack trace via API. For example, the following failures can be identified:

  • A common Hadoop error such as the output directory already exists, the input directory does not exist, or an application runs out of memory.

  • Java errors such as an application that was compiled with an incompatible version of Java or run with a main class that is not found.

  • An issue accessing objects stored in Amazon S3.

This information is available using the DescribeStep and ListSteps API operations. The FailureDetails field of the StepSummary returned by those operations. To access the FailureDetails information, use the AWS CLI, console, or AWS SDK.

Note

We’ve redesigned the Amazon EMR console to make it easier to use. See Amazon EMR console to learn about the differences between the old and new console experiences.

New console

The new Amazon EMR console doesn't offer step debugging. However, you can view cluster termination details with the following steps.

To view failure details with the new console
  1. Sign in to the AWS Management Console, and open the Amazon EMR console at https://console.aws.amazon.com/emr.

  2. Under EMR on EC2 in the left navigation pane, choose Clusters, and then select the cluster that you want to view.

  3. Note the Status value in the Summary section of the cluster details page. If the status is Terminated with errors, hover over the text to view cluster failure details.

Old console
To view failure details with the old console
  1. Navigate to the new Amazon EMR console and select Switch to the old console from the side navigation. For more information on what to expect when you switch to the old console, see Using the old console.

  2. Choose Cluster List and select a cluster.

  3. Select the arrow icon next to each step to view more details. If the step has failed and Amazon EMR can identify the root cause, you see the details of the failure.

CLI
To view failure details with the AWS CLI
  • To get failure details for a step with the AWS CLI, use the describe-step command.

    aws emr describe-step --cluster-id j-1K48XXXXXHCB --step-id s-3QM0XXXXXM1W

    The output will look similar to the following:

    { "Step": { "Status": { "FailureDetails": { "LogFile": "s3://myBucket/logs/j-1K48XXXXXHCB/steps/s-3QM0XXXXXM1W/stderr.gz", "Message": "org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory s3://myBucket/logs/beta already exists", "Reason": "Output directory already exists." }, "Timeline": { "EndDateTime": 1469034209.143, "CreationDateTime": 1469033847.105, "StartDateTime": 1469034202.881 }, "State": "FAILED", "StateChangeReason": {} }, "Config": { "Args": [ "wordcount", "s3://myBucket/input/input.txt", "s3://myBucket/logs/beta" ], "Jar": "s3://myBucket/jars/hadoop-mapreduce-examples-2.7.2-amzn-1.jar", "Properties": {} }, "Id": "s-3QM0XXXXXM1W", "ActionOnFailure": "CONTINUE", "Name": "ExampleJob" } }