View Web Interfaces Hosted on Amazon EMR Clusters
Hadoop and other applications you install on your Amazon EMR cluster publish user interfaces as web sites hosted on the master node. For security reasons, these web sites are only available on the master node's local web server and are not publicly available over the Internet. Hadoop also publishes user interfaces as web sites hosted on the core and task (slave) nodes. These web sites are likewise available only on the local web servers of those nodes.
The following table lists web interfaces that you can view on the core and task nodes. These Hadoop interfaces are available on all clusters. To access the following interfaces, replace slave-public-dns-name in the URI with the public DNS name of the node. For more information about retrieving the public DNS name of a core or task node instance, see Connecting to Your Linux/Unix Instances Using SSH in the Amazon EC2 User Guide for Linux Instances. In addition to retrieving the public DNS name of the core or task node, you must also edit the ElasticMapReduce-slave security group to allow SSH access over TCP port 22. For more information about modifying security group rules, see Adding Rules to a Security Group in the Amazon EC2 User Guide for Linux Instances.
|Name of interface||URI|
|Hadoop HDFS NameNode||http://|
|Hadoop HDFS DataNode||http://|
Because several application-specific interfaces are available on the master node but not on the core and task nodes, the instructions in this document are specific to the Amazon EMR master node. You can access the web interfaces on the core and task nodes in the same manner as those on the master node.
There are several ways you can access the web interfaces on the master node. The easiest and quickest method is to use SSH to connect to the master node and view the web sites in your SSH client with Lynx, a text-based browser. However, Lynx has a limited user interface and cannot display graphics. For example, you can open the Hadoop ResourceManager interface using Lynx; the Lynx URLs are also provided when you log in to the master node using SSH.
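As a minimal sketch of what opening the ResourceManager interface in Lynx might look like, the following Python helper assembles the command you would run in the SSH session on the master node. The helper name `lynx_command` and the host name are hypothetical, and port 8088 is assumed to be the ResourceManager web port; use the Lynx URL shown at login if it differs.

```python
import shlex
from typing import List

# Assumption: 8088 is the web port of the Hadoop ResourceManager interface.
RESOURCE_MANAGER_PORT = 8088


def lynx_command(master_dns: str, port: int = RESOURCE_MANAGER_PORT) -> List[str]:
    """Build the argument list that opens a master node web interface in Lynx."""
    return ["lynx", f"http://{master_dns}:{port}/"]


if __name__ == "__main__":
    # Hypothetical internal DNS name; substitute the name of your own master node.
    command = lynx_command("ip-10-0-0-1.us-west-2.compute.internal")
    # Print the command line to type in the SSH session on the master node.
    print(shlex.join(command))
```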
There are two remaining options for accessing web interfaces on the master node that provide full browser functionality. Choose one of the following:
Option 1 (recommended for more technical users): Use an SSH client to connect to the master node, configure SSH tunneling with local port forwarding, and use an Internet browser to open web interfaces hosted on the master node. This method allows you to configure web interface access without using a SOCKS proxy.
Option 2 (recommended for new users): Use an SSH client to connect to the master node, configure SSH tunneling with dynamic port forwarding, and configure your Internet browser to use an add-on such as FoxyProxy or SwitchySharp to manage your SOCKS proxy settings. This method allows you to automatically filter URLs based on text patterns and to limit the proxy settings to domains that match the form of the master node's DNS name. The browser add-on automatically turns the proxy on and off when you switch between viewing websites hosted on the master node and those on the Internet. For more information about how to configure FoxyProxy for Firefox and Google Chrome, see Option 2, Part 2: Configure Proxy Settings to View Websites Hosted on the Master Node.
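The two options differ mainly in the forwarding flag passed to the SSH client. The sketch below, written under several assumptions, builds an OpenSSH command line for each option and mimics the kind of pattern filter a proxy add-on applies. The function names, the key file name, the local port 8157, the remote port 8088, the `hadoop` login name, and the two URL patterns are illustrative choices rather than values taken from this page.

```python
import shlex
from fnmatch import fnmatchcase
from typing import List


def local_forward_command(key_file: str, master_dns: str, local_port: int = 8157,
                          remote_port: int = 8088, user: str = "hadoop") -> List[str]:
    """Option 1: forward one local port to one web interface port on the master node."""
    # -N: run no remote command; -L: local port forwarding.
    return ["ssh", "-i", key_file, "-N",
            "-L", f"{local_port}:{master_dns}:{remote_port}",
            f"{user}@{master_dns}"]


def dynamic_forward_command(key_file: str, master_dns: str, socks_port: int = 8157,
                            user: str = "hadoop") -> List[str]:
    """Option 2: open a local SOCKS proxy that tunnels traffic through the master node."""
    # -D: dynamic (SOCKS) port forwarding on the given local port.
    return ["ssh", "-i", key_file, "-N", "-D", str(socks_port), f"{user}@{master_dns}"]


# Illustrative patterns only: send hosts that look like cluster nodes through the proxy.
PROXY_PATTERNS = ["*ec2*.amazonaws.com*", "*.compute.internal*"]


def should_use_proxy(host: str) -> bool:
    """Return True when the host matches one of the proxy patterns."""
    return any(fnmatchcase(host, pattern) for pattern in PROXY_PATTERNS)


if __name__ == "__main__":
    # Hypothetical public DNS name and key file; substitute your own values.
    dns = "ec2-203-0-113-10.compute-1.amazonaws.com"
    print(shlex.join(local_forward_command("mykeypair.pem", dns)))
    print(shlex.join(dynamic_forward_command("mykeypair.pem", dns)))
    print(should_use_proxy(dns), should_use_proxy("www.example.com"))
```

With the local forwarding tunnel from Option 1 open, the forwarded interface would be reachable in a browser at `http://localhost:8157/` under these assumed ports. With the dynamic tunnel from Option 2, the browser add-on decides per URL whether to route a request through the SOCKS proxy, which is the role `should_use_proxy` plays in this sketch.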
- Option 1: Set Up an SSH Tunnel to the Master Node Using Local Port Forwarding
- Option 2, Part 1: Set Up an SSH Tunnel to the Master Node Using Dynamic Port Forwarding
- Option 2, Part 2: Configure Proxy Settings to View Websites Hosted on the Master Node
- Access the Web Interfaces on the Master Node Using the Console