What tools are available for troubleshooting an Amazon EMR cluster?
To identify and fix cluster errors, you can use the tools described on this page. You might need to initialize some of the tools when you launch the cluster. Other tools are available for every cluster by default.
View EMR cluster details
You can use the AWS Management Console, AWS CLI, or EMR API to retrieve detailed information about an EMR cluster and job execution. For more information about using the AWS Management Console and AWS CLI, see View Amazon EMR cluster status and details.
Amazon EMR console details pane
In the Clusters list on the Amazon EMR console, you can see high-level information about the status of each cluster in your account and AWS Region. The list displays all active and terminated clusters that you launched in the past two months. From the Clusters list, you can select a cluster Name to view cluster details. This information is organized in different categories to make it easy to navigate.
The Application user interfaces available on the cluster details page can be useful for troubleshooting clusters. They show the status of YARN applications, and for some, such as Spark applications, you can drill into different metrics and facets such as jobs, stages, and executors. For more information, see View Amazon EMR application history. This feature is available only for Amazon EMR releases 5.8.0 and higher.
Amazon EMR command line interface
You can retrieve details about a cluster from the AWS CLI with the aws emr describe-cluster command.
Amazon EMR API
You can retrieve details about a cluster from the API using the DescribeCluster action.
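As an illustration of retrieving cluster details programmatically, the following is a minimal sketch that calls the DescribeCluster operation through an EMR client object and reduces the response to a few status fields. It assumes the AWS SDK for Python (boto3) and configured credentials; the function name summarize_cluster and the cluster ID shown in the comments are placeholders, not names from this guide.

```python
from typing import Any, Dict


def summarize_cluster(emr_client: Any, cluster_id: str) -> Dict[str, Any]:
    """Return a compact status summary for one cluster.

    emr_client is expected to behave like boto3.client("emr").
    """
    cluster = emr_client.describe_cluster(ClusterId=cluster_id)["Cluster"]
    status = cluster.get("Status", {})
    reason = status.get("StateChangeReason", {})
    return {
        "id": cluster.get("Id"),
        "name": cluster.get("Name"),
        "state": status.get("State"),
        "reason_code": reason.get("Code"),
        "reason_message": reason.get("Message"),
    }


# Example usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   emr = boto3.client("emr")
#   print(summarize_cluster(emr, "j-XXXXXXXXXXXXX"))  # placeholder cluster ID
```

Passing the client in as a parameter keeps the function easy to exercise with a stub object when no AWS account is available.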
View EMR cluster error details
When an EMR cluster terminates with an error, the DescribeCluster and ListClusters APIs return an error code and an error message. For select cluster errors, the ErrorDetail data array can help you troubleshoot the failure.
For a list of error codes that include ErrorDetail data, see Error codes with ErrorDetail information in Amazon EMR.
Note
We continuously refine our error messages so that you receive the most recent and pertinent information. We don't recommend that you parse the text from ErrorMessage because this text is subject to change.
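In line with the note above, a troubleshooting script can key on the error code and the structured error data rather than on the message text. The following is a minimal sketch under the assumption that the DescribeCluster response carries the details in Cluster.Status.ErrorDetails, with each entry holding ErrorCode, ErrorData (a list of key/value maps), and ErrorMessage; verify these field names against the current API reference. The function name and the sample values are hypothetical.

```python
from typing import Any, Dict, List


def extract_error_details(describe_cluster_response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect error codes and structured error data from a response dict.

    The ErrorMessage text is deliberately ignored because it can change.
    """
    status = describe_cluster_response.get("Cluster", {}).get("Status", {})
    details: List[Dict[str, Any]] = []
    for item in status.get("ErrorDetails", []):
        # ErrorData is assumed to be a list of string-to-string maps;
        # merge them into a single dict for easier lookup.
        data: Dict[str, str] = {}
        for entry in item.get("ErrorData", []):
            data.update(entry)
        details.append({"code": item.get("ErrorCode"), "data": data})
    return details


# Example with a hand-written response (all values are placeholders):
sample = {
    "Cluster": {
        "Status": {
            "State": "TERMINATED_WITH_ERRORS",
            "ErrorDetails": [
                {
                    "ErrorCode": "EXAMPLE_ERROR_CODE",
                    "ErrorData": [{"example-key": "example-value"}],
                    "ErrorMessage": "Text that may change; not parsed.",
                }
            ],
        }
    }
}
for detail in extract_error_details(sample):
    print(detail["code"], detail["data"])
```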
Run scripts and configure Amazon EMR processes
As part of your troubleshooting process, you might find it helpful to run custom scripts on your cluster or view and configure cluster processes.
View and restart application processes
It can be helpful to view running processes on your cluster in order to diagnose potential issues. You can stop and restart cluster processes by connecting to the master node of your cluster. For more information, see View and restart Amazon EMR and application processes (daemons).
Run commands and scripts without an SSH connection
To run a command or a script on your cluster as a step, you can use the command-runner.jar or script-runner.jar tools without establishing an SSH connection to the master node. For more information, see Run commands and scripts on an Amazon EMR cluster.
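To make the step-based approach concrete, the following is a minimal sketch that builds a step definition for command-runner.jar in the shape accepted by the AddJobFlowSteps operation. The helper name command_runner_step, the step name, and the example command are hypothetical; submitting the step assumes boto3 and configured credentials.

```python
from typing import Any, Dict, List


def command_runner_step(
    name: str,
    command: List[str],
    action_on_failure: str = "CONTINUE",
) -> Dict[str, Any]:
    """Build a step definition that runs `command` through command-runner.jar."""
    return {
        "Name": name,
        # CONTINUE keeps the cluster running if the diagnostic command fails.
        "ActionOnFailure": action_on_failure,
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": list(command),
        },
    }


step = command_runner_step("check-disk-usage", ["bash", "-c", "df -h"])

# Example submission (assumes boto3; the cluster ID is a placeholder):
#   import boto3
#   emr = boto3.client("emr")
#   emr.add_job_flow_steps(JobFlowId="j-XXXXXXXXXXXXX", Steps=[step])
```

The output of a command run this way is written to the step's log files rather than returned by the API call.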
View log files
Amazon EMR and Hadoop both generate log files as the cluster runs. You can access these log files from several different tools, depending on the configuration that you specified when you launched the cluster. For more information, see Configure Amazon EMR cluster logging and debugging.
Log files on the master node
Every cluster publishes log files to the /mnt/var/log/ directory on the master node. These log files are available only while the cluster is running.
Log files archived to Amazon S3
If you launch the cluster and specify an Amazon S3 log path, the cluster copies the log files stored in /mnt/var/log/ on the master node to Amazon S3 at 5-minute intervals. This ensures that you have access to the log files even after the cluster is terminated. Because the files are archived at 5-minute intervals, logs from the last few minutes of a suddenly terminated cluster might not be available.
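For archived logs, a script can list the objects that a cluster has written under its log path. The following is a minimal sketch that assumes logs are stored under a per-cluster prefix of the form log path, then cluster ID; confirm that layout for your own clusters before relying on it. The function names, bucket, and cluster ID are hypothetical, and the listing function expects an object that behaves like boto3.client("s3").

```python
from typing import Any, List, Tuple
from urllib.parse import urlparse


def cluster_log_location(log_uri: str, cluster_id: str) -> Tuple[str, str]:
    """Split an S3 log URI into (bucket, key prefix) for one cluster.

    Assumes logs land under <log path>/<cluster ID>/.
    """
    parsed = urlparse(log_uri)
    if parsed.scheme not in ("s3", "s3n", "s3a"):
        raise ValueError(f"not an S3 URI: {log_uri!r}")
    base = parsed.path.strip("/")
    prefix = f"{base}/{cluster_id}/" if base else f"{cluster_id}/"
    return parsed.netloc, prefix


def list_log_keys(s3_client: Any, log_uri: str, cluster_id: str) -> List[str]:
    """Return every object key stored under the cluster's log prefix."""
    bucket, prefix = cluster_log_location(log_uri, cluster_id)
    paginator = s3_client.get_paginator("list_objects_v2")
    keys: List[str] = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys


# Example usage (assumes boto3; bucket and cluster ID are placeholders):
#   import boto3
#   s3 = boto3.client("s3")
#   for key in list_log_keys(s3, "s3://amzn-s3-demo-bucket/logs/", "j-XXXXXXXXXXXXX"):
#       print(key)
```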
Monitor EMR cluster performance
Amazon EMR provides several tools to monitor the performance of your cluster.
Hadoop web interfaces
Every cluster publishes a set of web interfaces on the master node that contain information about the cluster. You can access these web pages by using an SSH tunnel to connect to them on the master node. For more information, see View web interfaces hosted on Amazon EMR clusters.
CloudWatch metrics
Every cluster reports metrics to CloudWatch. CloudWatch is a web service that tracks metrics and lets you set alarms on them. For more information, see Monitoring Amazon EMR metrics with CloudWatch.
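As an example of setting an alarm on a cluster metric, the following is a minimal sketch that builds the parameters for a CloudWatch PutMetricAlarm call on the IsIdle metric. It assumes the AWS/ElasticMapReduce namespace, the JobFlowId dimension, and a 5-minute reporting period; the helper name idle_alarm_params, the alarm name format, and the 30-minute default are illustrative choices, not recommendations from this guide.

```python
from typing import Any, Dict


def idle_alarm_params(cluster_id: str, idle_minutes: int = 30) -> Dict[str, Any]:
    """Build PutMetricAlarm parameters that fire after sustained idleness."""
    period_seconds = 300  # assumes metrics arrive at 5-minute intervals
    periods = max(1, idle_minutes * 60 // period_seconds)
    return {
        "AlarmName": f"emr-{cluster_id}-idle",
        "Namespace": "AWS/ElasticMapReduce",
        "MetricName": "IsIdle",
        "Dimensions": [{"Name": "JobFlowId", "Value": cluster_id}],
        "Statistic": "Average",
        "Period": period_seconds,
        "EvaluationPeriods": periods,
        # IsIdle is assumed to report 1 while the cluster is idle.
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }


params = idle_alarm_params("j-XXXXXXXXXXXXX")  # placeholder cluster ID

# Example submission (assumes boto3 and configured credentials):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   cloudwatch.put_metric_alarm(**params)
```

To be notified when the alarm fires, an AlarmActions list (for example, an SNS topic ARN) can be added to the parameters before the call.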