Amazon Elastic MapReduce
Developer Guide (API Version 2009-03-31)

Step 3: SSH into the Master Node

When the cluster’s status is WAITING, the master node is ready for you to connect to it. With an active SSH session on the master node, you can run command-line operations.

To locate the public DNS name of the master node

  • In the Amazon EMR console, select the cluster from the list of running clusters in the WAITING state. Details about the cluster appear in the lower pane.

    Get the DNS Name

    The DNS name you use to connect to the instance is listed on the Description tab as Master Public DNS Name.
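
If the AWS CLI is installed and configured, the same name can probably also be read from a shell. This is an assumption beyond this guide, which describes only the console. The helper name master_dns and the cluster ID j-EXAMPLE are hypothetical placeholders.

```shell
# Sketch only: assumes the AWS CLI is installed and configured with
# credentials that are allowed to describe the cluster.
# This guide itself locates the name in the console.

# master_dns is a hypothetical helper; $1 is the cluster ID.
master_dns() {
    aws emr describe-cluster \
        --cluster-id "$1" \
        --query 'Cluster.MasterPublicDnsName' \
        --output text
}
```

For example, master_dns j-EXAMPLE should print the Master Public DNS Name once the cluster has a master node.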

To connect to the master node using Linux/Unix/Mac OS X

  1. Open a terminal window. This is found at Applications/Utilities/Terminal on Mac OS X and at Applications/Accessories/Terminal on many Linux distributions.

  2. Set the permissions on the PEM file for your Amazon EC2 key pair so that only the key owner has permissions to access the key. For example, if you saved the file as mykeypair.pem in the user's home directory, the command is:

    chmod og-rwx ~/mykeypair.pem 

    If you do not perform this step, SSH returns an error saying that your private key file is unprotected and rejects the key. You only need to perform this step the first time you use the private key to connect.

  3. To establish the connection to the master node, enter the following command line, which assumes the PEM file is in the user's home directory. Replace master-public-dns-name with the Master Public DNS Name of your cluster and replace ~/mykeypair.pem with the location and filename of your PEM file.

    ssh hadoop@master-public-dns-name -i ~/mykeypair.pem

    A warning states that the authenticity of the host you are connecting to can't be verified.

  4. Type yes to continue.

    Note

    If you are asked to log in, enter hadoop.
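
The steps above can be sketched as two small shell helpers. This is a minimal sketch, not part of the original procedure: the function names protect_key and ssh_command are hypothetical, and the key path and DNS name are the same placeholders used in the steps.

```shell
# Sketch only: protect_key and ssh_command are hypothetical helper names.

# Step 2: remove all group and other permissions from the key file ($1),
# so that only the owner can access it.
protect_key() {
    chmod og-rwx "$1"
}

# Step 3: print the ssh command for key file $1 and master DNS name $2.
# The command is printed rather than run so that you can review it first.
ssh_command() {
    printf 'ssh hadoop@%s -i %s\n' "$2" "$1"
}
```

For example, run protect_key ~/mykeypair.pem once, then ssh_command ~/mykeypair.pem master-public-dns-name to print the command from step 3 with your own values filled in.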

To install and configure PuTTY on Windows

  1. Download PuTTYgen.exe and PuTTY.exe to your computer from http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html.

  2. Launch PuTTYgen.

  3. Click Load.

  4. Select the PEM file you created earlier. Note that you may have to change the file type filter from “PuTTY Private Key Files (*.ppk)” to “All Files (*.*)”.

  5. Click Open.

  6. Click OK on the PuTTYgen notice telling you the key was successfully imported.

  7. Click Save private key to save the key in the PPK format.

  8. When PuTTYgen prompts you to save the key without a pass phrase, click Yes.

  9. Enter a name for your PuTTY private key, such as mykeypair.ppk.

  10. Click Save.

  11. Close PuTTYgen.

To connect to the master node using PuTTY on Windows

  1. Start PuTTY.

  2. Select Session in the Category list. Enter hadoop@DNS in the Host Name field. The input looks similar to hadoop@ec2-184-72-128-177.compute-1.amazonaws.com.

  3. In the Category list, expand Connection, expand SSH, and then select Auth. The Options controlling SSH authentication pane appears.

    SSH Options in PuTTY

  4. For Private key file for authentication, click Browse and select the private key file you generated earlier. If you are following this guide, the file name is mykeypair.ppk.

  5. Click Open.

    A PuTTY Security Alert pops up.

  6. Click Yes for the PuTTY Security Alert.

    Note

    If you are asked to log in, enter hadoop.

After you connect to the master node using either the ssh command or PuTTY, you should see a Hadoop command prompt. You are then ready to start a Hive interactive session.