Implementation - Teaching Big Data Skills with Amazon EMR

Implementation

To make an EMR cluster multi-tenant, create users and SSH keys on the EMR master node so that each user has their own profile on the cluster. To do this, create a Linux user for each person and give each one their own SSH key pair. For more information on how to connect to the EMR cluster, see Use an Amazon EC2 Key Pair for SSH Credentials.

Once the EMR cluster is ready and available to use, log in to the master node as the hadoop user with the SSH key pair associated with the cluster. After logging in, use the following steps to create the users and install an SSH key for each one. Two usernames appear as examples in the code and command samples:

  • hadoop: the administrative user used to access and manage the EMR cluster.

  • student01: an example test user, created in the steps below.
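As a rough illustration of how either user might connect, the sketch below wraps the standard ssh client invocation in a small function. The name emr_ssh, the DRY_RUN switch, the key file path, and the host name are all hypothetical and chosen only for this example; substitute the key pair and master node DNS name of your own cluster.

```shell
#!/usr/bin/env bash
# emr_ssh is a hypothetical helper, not an EMR or OpenSSH command.
# Arguments: user name, path to the private key file, master node DNS name.
# Set DRY_RUN=1 to print the ssh command instead of running it.
emr_ssh() {
  local user="$1"
  local key_file="$2"
  local master_dns="$3"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'ssh -i %s %s@%s\n' "$key_file" "$user" "$master_dns"
  else
    # -i selects the private key that matches the public key on the cluster.
    ssh -i "$key_file" "$user@$master_dns"
  fi
}
```

For example, an administrator might run emr_ssh hadoop ~/.ssh/emr-cluster-key.pem followed by the master node DNS name, while a student would pass student01 and their own private key. Note that ssh typically refuses a private key file that other users can read, so the key file may first need chmod 400.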

Table 2: Sample command flow to show user creation and SSH key association:

Step  Command                          Description
1     sudo adduser student01           Create the new user student01.
2     sudo su - student01              Switch to the newly created user student01.
3     mkdir .ssh                       Create a directory to store the SSH key.
4     chmod 700 .ssh                   Restrict the directory to its owner (read, write, and execute).
5     touch .ssh/authorized_keys       Create an empty file called authorized_keys.
6     chmod 600 .ssh/authorized_keys   Restrict the authorized_keys file to its owner (read and write).
7     cat >> .ssh/authorized_keys      Paste the public SSH key that you would like to use, press Enter, and then press Ctrl+D to save the file.
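
The directory and key steps above can also be scripted so that they are repeatable for many students. The following is a minimal sketch, not part of the original command flow: install_pubkey is a hypothetical helper name, and it assumes the user's home directory already exists (that is, sudo adduser has already been run) and that the function is executed as that user.

```shell
#!/usr/bin/env bash
# install_pubkey is a hypothetical helper for illustration only.
# Arguments: the user's home directory, and one public key line.
install_pubkey() {
  local home_dir="$1"
  local pubkey="$2"
  local ssh_dir="$home_dir/.ssh"
  local auth_file="$ssh_dir/authorized_keys"

  # Create the .ssh directory and restrict it to its owner.
  mkdir -p "$ssh_dir"
  chmod 700 "$ssh_dir"

  # Create authorized_keys if needed and restrict it to its owner.
  touch "$auth_file"
  chmod 600 "$auth_file"

  # Append the key only if that exact line is not already present,
  # so running the function twice does not duplicate the entry.
  grep -qxF -- "$pubkey" "$auth_file" || printf '%s\n' "$pubkey" >> "$auth_file"
}
```

After switching to student01, a call might look like install_pubkey "$HOME" "$(cat /tmp/student01.pub)", where /tmp/student01.pub is an assumed location for the student's public key file. Unlike the interactive cat >> step, this variant takes the key as an argument, which makes it easier to loop over a class roster.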