Amazon Elastic MapReduce
Developer Guide (API Version 2009-03-31)

View Web Interfaces Hosted on Amazon EMR Clusters

Hadoop and other applications you install on your Amazon EMR cluster publish user interfaces as web sites hosted on the master node. For security reasons, these web sites are available only on the master node's local web server and are not publicly available over the Internet. Hadoop also publishes user interfaces as web sites hosted on the core and task (slave) nodes. These web sites are likewise available only on the local web servers of those nodes.

The following table lists web interfaces you can view on the master node. The Hadoop interfaces are available on all clusters. Other web interfaces, such as Ganglia and HBase, are available only if you install the corresponding applications on your cluster. To access the following interfaces, first create an SSH tunnel to the master node, then replace master-public-dns-name in the URI with the public DNS name of the master node. For more information about retrieving the master public DNS name, see Retrieve the Public DNS Name of the Master Node. For more information about creating an SSH tunnel, see Option 2, Part 1: Set Up an SSH Tunnel to the Master Node Using Dynamic Port Forwarding.

Name of Interface                URI

Hadoop version 2.x
  Hadoop ResourceManager         http://master-public-dns-name:9026/
  Hadoop HDFS NameNode           http://master-public-dns-name:9101/
  Ganglia Metrics Reports        http://master-public-dns-name/ganglia/
  HBase Interface                http://master-public-dns-name:60010/master-status
  Impala Statestore              http://master-public-dns-name:25000
  Impalad                        http://master-public-dns-name:25010
  Impala Catalog                 http://master-public-dns-name:25020

Hadoop version 1.x
  Hadoop MapReduce JobTracker    http://master-public-dns-name:9100/
  Hadoop HDFS NameNode           http://master-public-dns-name:9101/
  Ganglia Metrics Reports        http://master-public-dns-name/ganglia/
  HBase Interface                http://master-public-dns-name:60010/master-status

For more information about the Ganglia web interface, see Monitor Performance with Ganglia. For more information about Impala web interfaces, see Accessing Impala Web User Interfaces.
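
As a small illustration of the substitution described above, the following sketch builds a few of the Hadoop 2.x master-node URIs from a DNS name. The function name emr_master_uri and its interface keys are hypothetical helpers, not part of Amazon EMR. The ports are taken from the table above.

```shell
#!/bin/sh
# Sketch: build a master-node web interface URI (Hadoop 2.x ports from the table).
# emr_master_uri and the interface keys below are hypothetical names.
emr_master_uri() {
  # $1 = master public DNS name, $2 = interface key
  case "$2" in
    resourcemanager) printf 'http://%s:9026/\n' "$1" ;;
    namenode)        printf 'http://%s:9101/\n' "$1" ;;
    ganglia)         printf 'http://%s/ganglia/\n' "$1" ;;
    hbase)           printf 'http://%s:60010/master-status\n' "$1" ;;
    *)               echo "unknown interface: $2" >&2; return 1 ;;
  esac
}

# Replace the placeholder with the public DNS name of your master node.
emr_master_uri "master-public-dns-name" resourcemanager
```

The printed URI resolves in a browser only after an SSH tunnel to the master node is in place.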

The following table lists web interfaces you can view on the core and task nodes. These Hadoop interfaces are available on all clusters. To access the following interfaces, replace slave-public-dns-name in the URI with the public DNS name of the node. For more information about retrieving the public DNS name of a core or task node instance, see Connecting to Your Linux/Unix Instances Using SSH in the Amazon Elastic Compute Cloud User Guide for Linux. In addition to retrieving the public DNS name of the core or task node, you must also edit the ElasticMapReduce-slave security group to allow SSH access over TCP port 22. For more information about modifying security group rules, see Adding Rules to a Security Group in the Amazon Elastic Compute Cloud User Guide for Linux.

Name of Interface                           URI

Hadoop version 2.x
  Hadoop NodeManager                        http://slave-public-dns-name:9035/
  Hadoop HDFS DataNode                      http://slave-public-dns-name:9102/

Hadoop version 1.x
  Hadoop HDFS DataNode (core nodes only)    http://slave-public-dns-name:9102/
  Hadoop MapReduce TaskTracker              http://slave-public-dns-name:9103/
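
The security group change mentioned above can also be made from the command line. The following sketch prints, rather than runs, an AWS CLI command that adds an inbound rule for TCP port 22 to the ElasticMapReduce-slave security group, so you can review it first. It assumes the AWS CLI is installed and that the group can be addressed by name (in a non-default VPC you would likely need --group-id instead). The function name emr_allow_ssh_cmd and the example CIDR are hypothetical.

```shell
#!/bin/sh
# Sketch: print an AWS CLI command that allows inbound SSH (TCP port 22)
# to the ElasticMapReduce-slave security group. Nothing is executed here.
# emr_allow_ssh_cmd is a hypothetical helper name.
emr_allow_ssh_cmd() {
  # $1 = source CIDR allowed to connect, for example your workstation as x.x.x.x/32
  printf 'aws ec2 authorize-security-group-ingress --group-name ElasticMapReduce-slave --protocol tcp --port 22 --cidr %s\n' "$1"
}

# 203.0.113.10/32 is a documentation-range address used only as a placeholder.
emr_allow_ssh_cmd "203.0.113.10/32"
```

Restricting the CIDR to a single address, rather than 0.0.0.0/0, keeps SSH closed to the rest of the Internet.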

Note

You can change the configuration of the Hadoop version 2.x web interfaces by editing the conf/hdfs-site.xml file. You can change the configuration of the Hadoop version 1.x web interfaces by editing the conf/hadoop-default.xml file.

Because several application-specific interfaces are available on the master node but not on the core and task nodes, the instructions in this document are specific to the Amazon EMR master node. You can access the web interfaces on the core and task nodes in the same way as those on the master node.

There are several ways you can access the web interfaces on the master node. The easiest and quickest method is to use SSH to connect to the master node and view the web sites in your SSH client with Lynx, a text-based browser. However, Lynx has a limited user interface and cannot display graphics. The following example shows how to open the Hadoop ResourceManager interface using Lynx on AMI 3.1.1 and later (Lynx URLs are also provided when you log in to the master node using SSH).

lynx http://ip-###-##-##-###.us-west-2.compute.internal:9026/

There are two remaining options for accessing web interfaces on the master node that provide full browser functionality. Choose one of the following:

  • Option 1 (recommended for more technical users): Use an SSH client to connect to the master node, configure SSH tunneling with local port forwarding, and use an Internet browser to open web interfaces hosted on the master node. This method allows you to configure web interface access without using a SOCKS proxy.

  • Option 2 (recommended for new users): Use an SSH client to connect to the master node, configure SSH tunneling with dynamic port forwarding, and configure your Internet browser to use an add-on such as FoxyProxy or SwitchySharp to manage your SOCKS proxy settings. This method allows you to automatically filter URLs based on text patterns and to limit the proxy settings to domains that match the form of the master node's DNS name. The browser add-on automatically turns the proxy on and off as you switch between websites hosted on the master node and those on the Internet. For more information about how to configure FoxyProxy for Firefox and Google Chrome, see Option 2, Part 2: Configure Proxy Settings to View Websites Hosted on the Master Node.
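
To make the difference between the two options concrete, the following sketch prints the kind of ssh command each one uses: -L for local port forwarding (Option 1) and -D for dynamic port forwarding, which opens a local SOCKS proxy (Option 2). The key file name, the hadoop login name, and local port 8157 are assumptions for illustration, and the two function names are hypothetical. The commands are printed rather than run, and the sketch does not quote paths that contain spaces.

```shell
#!/bin/sh
# Sketch: print the ssh commands behind the two tunneling options.
# Assumptions: login name "hadoop", local port 8157, key file mykey.pem.

# Option 1: local port forwarding. One local port maps to one port on the master.
emr_local_forward_cmd() {
  # $1 = key file, $2 = master public DNS name, $3 = local port, $4 = remote port
  printf 'ssh -i %s -N -L %s:%s:%s hadoop@%s\n' "$1" "$3" "$2" "$4" "$2"
}

# Option 2: dynamic port forwarding. ssh listens as a SOCKS proxy on a local port.
emr_dynamic_forward_cmd() {
  # $1 = key file, $2 = master public DNS name, $3 = local SOCKS port
  printf 'ssh -i %s -N -D %s hadoop@%s\n' "$1" "$3" "$2"
}

# Replace the placeholders with your own key file and master DNS name.
emr_local_forward_cmd "$HOME/mykey.pem" "master-public-dns-name" 8157 9026
emr_dynamic_forward_cmd "$HOME/mykey.pem" "master-public-dns-name" 8157
```

With the Option 1 command running, the ResourceManager interface would be reachable at http://localhost:8157/. With Option 2, the browser add-on is pointed at the SOCKS proxy on localhost port 8157 and the master-node URIs are used unchanged.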