
Configure Kerberos

Set up Kerberos on Amazon EMR by following these steps.

Step 1: Create a Security Configuration that Enables Kerberos and Optional Cross-Realm Trust Configuration

You can create a security configuration that specifies Kerberos attributes using the EMR console, the AWS CLI, or the EMR API. The security configuration can also contain other security options, such as encryption. For more information, see Create a Security Configuration. When you create a Kerberized cluster, you specify the security configuration together with Kerberos attributes that are specific to the cluster. If you specify one without the other, an error occurs.

The following Kerberos parameters are set using the security configuration:

Parameter Description

Enable Kerberos

Specifies that Kerberos is enabled for clusters that use this security configuration. If a cluster uses this security configuration, the cluster must also have Kerberos settings specified or an error occurs.

Ticket Lifetime

The period for which a Kerberos ticket issued by the cluster-dedicated KDC is valid. Ticket lifetimes are limited for security reasons. Cluster applications and services auto-renew tickets after they expire. Users who connect to the cluster over SSH using Kerberos credentials need to run kinit from the master node command line to renew after a ticket expires.

Cross-realm trust

If you provide a cross-realm trust configuration, principals (typically users) from another realm are authenticated to clusters that use this configuration. Additional configuration in the other Kerberos realm is also required. For more information, see Configure a Cross-Realm Trust.

Admin server

The fully qualified domain name (FQDN) of the Kerberos admin server in the other realm of the trust relationship. The admin server and KDC typically run on the same server. Optionally, you can specify the port used to communicate with the Kerberos admin server. If not specified, port 749 is used, which is the Kerberos default.

KDC server

The fully qualified domain name (FQDN) of the KDC in the other realm of the trust relationship. Optionally, you can specify the port used to communicate with the KDC server. If not specified, port 88 is used, which is the Kerberos default.

Domain name

The domain name of the other realm in the trust relationship.

The following examples demonstrate the same configuration specified in the EMR console and as a JSON structure for the create-security-configuration command from the AWS CLI. The KDC and admin services in the other realm of the trust are hosted on the same server, ad.domain.com, and the default Kerberos ports are used: 88 for the KDC and 749 for administrative services. If your application uses customized ports, use the form ad.domain.com:portnumber.
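As an illustration of the optional port form, the following minimal Python sketch shows how a server value might be interpreted, using the default ports listed in the parameter descriptions (749 for the admin server, 88 for the KDC). The function name split_host_port and the port 8749 are hypothetical and are not part of Amazon EMR; the sketch only illustrates the host[:port] convention.

```python
from typing import Tuple

# Kerberos default ports, as listed in the parameter descriptions.
DEFAULT_ADMIN_PORT = 749
DEFAULT_KDC_PORT = 88


def split_host_port(server: str, default_port: int) -> Tuple[str, int]:
    """Split 'host' or 'host:port' into (host, port), applying a default port.

    Hypothetical helper for illustration only; Amazon EMR parses these
    values itself.
    """
    host, sep, port = server.partition(":")
    if not host:
        raise ValueError(f"missing host name in {server!r}")
    if not sep:
        # No port given, so the Kerberos default applies.
        return host, default_port
    return host, int(port)


# Admin server without a port uses 749; the KDC without a port uses 88.
print(split_host_port("ad.domain.com", DEFAULT_ADMIN_PORT))
print(split_host_port("ad.domain.com", DEFAULT_KDC_PORT))
# A customized (hypothetical) admin port is taken from the value itself.
print(split_host_port("ad.domain.com:8749", DEFAULT_ADMIN_PORT))
```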

Example Console Configuration

Example JSON Snippet For create-security-configuration

{
  "AuthenticationConfiguration": {
    "KerberosConfiguration": {
      "Provider": "ClusterDedicatedKdc",
      "ClusterDedicatedKdcConfiguration": {
        "TicketLifetimeInHours": 24,
        "CrossRealmTrustConfiguration": {
          "Realm": "AD.DOMAIN.COM",
          "Domain": "ad.domain.com",
          "AdminServer": "ad.domain.com",
          "KdcServer": "ad.domain.com"
        }
      }
    }
  }
}
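If you generate security configurations programmatically, a structure like the JSON snippet above can be assembled in code. The following is a minimal Python sketch, not an official implementation. The helper name build_kerberos_security_config is hypothetical, and the check that all four cross-realm trust fields are present is an assumption made for this sketch.

```python
import json
from typing import Dict, Optional


def build_kerberos_security_config(
    ticket_lifetime_hours: int,
    cross_realm_trust: Optional[Dict[str, str]] = None,
) -> dict:
    """Return a dict shaped like the JSON snippet above (hypothetical helper)."""
    kdc_config: dict = {"TicketLifetimeInHours": ticket_lifetime_hours}
    if cross_realm_trust is not None:
        # Assumption for this sketch: all four trust fields must be supplied.
        required = ("Realm", "Domain", "AdminServer", "KdcServer")
        missing = [key for key in required if not cross_realm_trust.get(key)]
        if missing:
            raise ValueError(f"cross-realm trust is missing: {', '.join(missing)}")
        kdc_config["CrossRealmTrustConfiguration"] = {
            key: cross_realm_trust[key] for key in required
        }
    return {
        "AuthenticationConfiguration": {
            "KerberosConfiguration": {
                "Provider": "ClusterDedicatedKdc",
                "ClusterDedicatedKdcConfiguration": kdc_config,
            }
        }
    }


config = build_kerberos_security_config(
    24,
    {
        "Realm": "AD.DOMAIN.COM",
        "Domain": "ad.domain.com",
        "AdminServer": "ad.domain.com",
        "KdcServer": "ad.domain.com",
    },
)
config_json = json.dumps(config, indent=2)
print(config_json)
```

The resulting string could then be saved to a file and supplied to the create-security-configuration command, or passed to the equivalent EMR API call.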

Step 2: Configure Kerberos Attributes for a Cluster

Specify Kerberos attributes for a particular cluster along with the Kerberos security configuration when you create the cluster. You must specify cluster Kerberos settings and a Kerberos security configuration together or an error occurs. You can use the EMR Console, the AWS CLI, or the EMR API.

The following Kerberos attributes are specified using the cluster configuration:

Attribute Description

Realm

The Kerberos realm name for the cluster. The Kerberos convention is to set this to the domain name in uppercase. For example, for the domain ec2.internal, use EC2.INTERNAL as the realm name.

KDC admin password

The password used within the cluster for kadmin or kadmin.local. These are command-line interfaces to the Kerberos V5 administration system, which maintains Kerberos principals, password policies, and keytabs for the cluster.

Cross-realm trust principal password (optional)

Required when establishing a cross-realm trust. This is the cross-realm principal password, which must be identical across realms. Use a very strong password.

AD domain join user (optional)

Required when establishing a cross-realm trust with an Active Directory (AD) domain. This is the user logon name of an AD account with sufficient privileges to join computers to the domain. Amazon EMR uses this identity to join the cluster to the domain. For more information, see Step 3: Add User Accounts to the Domain for the EMR Cluster.

AD domain join password (optional)

The password for the AD user that has sufficient privileges to join the cluster to the AD domain. For more information, see Step 3: Add User Accounts to the Domain for the EMR Cluster.

The following examples demonstrate the same configurations specified in the EMR console and using the create-cluster command from the AWS CLI.

Example Console Configuration

Note

Linux line continuation characters (\) are included for readability. They can be removed or used in Linux commands. For Windows, remove them or replace with a caret (^).

Example AWS CLI Command For create-cluster

aws emr create-cluster --name "MyKerberosCluster" \
--release-label emr-5.10.0 \
--instance-type m3.xlarge \
--instance-count 3 \
--use-default-roles \
--ec2-attributes KeyName=MyEC2KeyPair \
--security-configuration KerberosSecurityConfig.json \
--applications Name=Hadoop Name=Hive Name=Oozie Name=Hue Name=HCatalog Name=Spark \
--kerberos-attributes Realm=EC2.INTERNAL,KdcAdminPassword=MyVeryStrongPassword,\
CrossRealmTrustPrincipalPassword=MyVeryStrongMatchingPassword,\
ADDomainJoinUser=ADUser,ADDomainJoinPassword=MyADUserPassword
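The attribute rules described in this step can be checked before a cluster is created. The following minimal Python sketch derives the realm from the domain name, applies the dependencies between the optional attributes, and renders the value in the comma-separated form used by --kerberos-attributes in the command above. The helper names are hypothetical, and the sketch assumes that attribute values contain no commas.

```python
from typing import Dict, Optional


def realm_from_domain(domain: str) -> str:
    """Apply the Kerberos convention: the realm is the domain in uppercase."""
    return domain.upper()


def build_kerberos_attributes(
    domain: str,
    kdc_admin_password: str,
    cross_realm_trust_principal_password: Optional[str] = None,
    ad_domain_join_user: Optional[str] = None,
    ad_domain_join_password: Optional[str] = None,
) -> Dict[str, str]:
    """Assemble cluster Kerberos attributes (hypothetical helper)."""
    attrs = {
        "Realm": realm_from_domain(domain),
        "KdcAdminPassword": kdc_admin_password,
    }
    if cross_realm_trust_principal_password is not None:
        attrs["CrossRealmTrustPrincipalPassword"] = cross_realm_trust_principal_password
    # The AD join user and password only make sense as a pair.
    if (ad_domain_join_user is None) != (ad_domain_join_password is None):
        raise ValueError("specify the AD domain join user and password together")
    if ad_domain_join_user is not None:
        # Joining an AD domain implies a cross-realm trust, which needs its password.
        if cross_realm_trust_principal_password is None:
            raise ValueError("AD domain join requires the cross-realm trust principal password")
        attrs["ADDomainJoinUser"] = ad_domain_join_user
        attrs["ADDomainJoinPassword"] = ad_domain_join_password
    return attrs


def to_cli_value(attrs: Dict[str, str]) -> str:
    """Render attributes as Key=Value pairs joined by commas."""
    return ",".join(f"{key}={value}" for key, value in attrs.items())


attrs = build_kerberos_attributes(
    "ec2.internal",
    "MyVeryStrongPassword",
    "MyVeryStrongMatchingPassword",
    "ADUser",
    "MyADUserPassword",
)
print(to_cli_value(attrs))
```

In practice, read the passwords from a secure source rather than embedding them in scripts; the literal values here only mirror the example command.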

Step 3: Add Kerberos-Authenticated Users

Amazon EMR creates Kerberos-authenticated clients for applications that run on the cluster, for example, the Hadoop user, Spark user, and others. You can also add users who are authenticated to cluster processes using Kerberos. Authenticated users can connect to the cluster with their Kerberos credentials.

You can add users in either of the following ways:

  • Configure a cross-realm trust to authenticate users from a different Kerberos realm, such as an AD directory. For more information, see Configure a Cross-Realm Trust.

  • Add Linux accounts on the local cluster and add principals to the cluster-dedicated KDC for those accounts. For more information, see Configure a Cluster-Dedicated KDC.

    Important

    The KDC, along with the database of principals, is lost when the master node terminates because the master node uses ephemeral storage. If you create users for SSH connections, we recommend that you establish a cross-realm trust with an external KDC configured for high availability. Alternatively, if you create users for SSH connections using Linux user accounts, automate the account creation process using bootstrap actions and scripts so that it can be repeated when you create a new cluster.
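One way to make account creation repeatable is to generate the commands from a list of users and run them from a script each time a cluster is created. The following minimal Python sketch only prints such commands and does not run them. It assumes the standard adduser and kadmin.local tools are available on the master node, and the user name analyst1, the initial password, and the helper name account_commands are hypothetical.

```python
import re
import shlex
from typing import List

# Conservative pattern for Linux user names (an assumption of this sketch).
USER_NAME = re.compile(r"^[a-z_][a-z0-9_-]*$")


def account_commands(user: str, realm: str, initial_password: str) -> List[str]:
    """Return shell commands that create a Linux account and a KDC principal.

    Hypothetical helper: it builds command strings only and runs nothing.
    """
    if not USER_NAME.match(user):
        raise ValueError(f"unsupported user name: {user!r}")
    if not initial_password or any(ch.isspace() for ch in initial_password):
        raise ValueError("initial password must be non-empty and contain no whitespace")
    # +needchange asks the user to set a new password at first login.
    query = f"addprinc -pw {initial_password} +needchange {user}@{realm}"
    return [
        f"sudo adduser {shlex.quote(user)}",
        f"sudo kadmin.local -q {shlex.quote(query)}",
    ]


for command in account_commands("analyst1", "EC2.INTERNAL", "ChangeMe123"):
    print(command)
```

Because initial passwords appear in the generated commands, treat any script built this way as sensitive and store it accordingly.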