Step 1: Gather data about the issue - Amazon EMR

The first step in troubleshooting a cluster is to gather information about what went wrong and about the cluster's current status and configuration. You use this information in the following steps to confirm or rule out possible causes of the issue.

Define the problem

Start with a clear definition of the problem. Some questions to ask yourself:

  • What did I expect to happen? What happened instead?

  • When did this problem first occur? How often has it happened since?

  • Has anything changed in how I configure or run my cluster?

Cluster details

The following cluster details help in tracking down issues. For information on how to gather them, see View cluster status and details.

  • Identifier of the cluster. (Also called a job flow identifier.)

  • AWS Region and Availability Zone the cluster was launched into.

  • State of the cluster, including details of the last state change.

  • Type and number of EC2 instances specified for the master, core, and task nodes.
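
The details listed above can also be collected programmatically. The following is a minimal sketch, not an official procedure. It assumes the AWS SDK for Python (boto3) is installed, that credentials are already configured, and that the cluster uses uniform instance groups rather than instance fleets. The function names `summarize_cluster` and `gather_cluster_details` and the keys of the returned dictionary are hypothetical, chosen only for this illustration.

```python
def summarize_cluster(cluster, instance_groups, region):
    """Condense raw EMR API responses into the details listed above.

    `cluster` is the "Cluster" object from DescribeCluster and
    `instance_groups` is the "InstanceGroups" list from ListInstanceGroups.
    The output layout is an assumption made for this sketch.
    """
    status = cluster.get("Status", {})
    reason = status.get("StateChangeReason", {})
    ec2_attrs = cluster.get("Ec2InstanceAttributes", {})
    return {
        "cluster_id": cluster.get("Id"),
        "region": region,
        "availability_zone": ec2_attrs.get("Ec2AvailabilityZone"),
        "state": status.get("State"),
        # Details of the last state change, if the service reported any.
        "state_change_code": reason.get("Code"),
        "state_change_message": reason.get("Message"),
        # One entry per instance group; a cluster can have several task groups.
        "nodes": [
            {
                "role": group.get("InstanceGroupType"),
                "instance_type": group.get("InstanceType"),
                "requested_count": group.get("RequestedInstanceCount"),
            }
            for group in instance_groups
        ],
    }


def gather_cluster_details(cluster_id, region):
    """Fetch the cluster description and instance groups, then summarize them."""
    import boto3  # third-party dependency, imported here so the helper above stays standalone

    emr = boto3.client("emr", region_name=region)
    cluster = emr.describe_cluster(ClusterId=cluster_id)["Cluster"]
    # For brevity this reads only the first page of results.
    groups = emr.list_instance_groups(ClusterId=cluster_id)["InstanceGroups"]
    return summarize_cluster(cluster, groups, region)
```

Calling `gather_cluster_details` with your own cluster identifier and AWS Region returns a single dictionary that you can save alongside your problem definition. Keeping the summarizing logic separate from the API calls means it can also be applied to responses saved earlier, for example from a cluster that has since terminated.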