Create a security configuration - Amazon EMR

Create a security configuration

This topic covers general procedures to create a security configuration with the Amazon EMR console and the AWS CLI, followed by a reference for the parameters that comprise encryption, authentication, and IAM roles for EMRFS. For more information about these features, see the following topics:

To create a security configuration using the console
  1. Open the Amazon EMR console at https://console.aws.amazon.com/emr.

  2. In the navigation pane, choose Security Configurations, Create security configuration.

  3. Type a Name for the security configuration.

  4. Choose options for Encryption and Authentication as described in the sections below and then choose Create.

To create a security configuration using the AWS CLI
  • Use the create-security-configuration command as shown in the following example.

    • For SecConfigName, specify the name of the security configuration. This is the name you specify when you create a cluster that uses this security configuration.

    • For SecConfigDef, specify an inline JSON structure or the path to a local JSON file, such as file://MySecConfig.json. The JSON parameters define options for Encryption, IAM Roles for EMRFS access to Amazon S3, and Authentication as described in the sections below.

    aws emr create-security-configuration --name "SecConfigName" --security-configuration SecConfigDef

Configure data encryption

Before you configure encryption in a security configuration, create the keys and certificates that are used for encryption. For more information, see Providing keys for encrypting data at rest with Amazon EMR and Providing certificates for encrypting data in transit with Amazon EMR encryption.

When you create a security configuration, you specify two sets of encryption options: at-rest data encryption and in-transit data encryption. Options for at-rest data encryption include both Amazon S3 with EMRFS and local-disk encryption. In-transit encryption options enable the open-source encryption features for certain applications that support Transport Layer Security (TLS). At-rest options and in-transit options can be enabled together or separately. For more information, see Encrypt data at rest and in transit.

Note

When you use AWS KMS, charges apply for the storage and use of encryption keys. For more information, see AWS KMS Pricing.

Specifying encryption options using the console

Choose options under Encryption according to the following guidelines.

  • Choose options under At rest encryption to encrypt data stored within the file system.

    You can choose to encrypt data in Amazon S3, local disks, or both.

  • Under S3 data encryption, for Encryption mode, choose a value to determine how Amazon EMR encrypts Amazon S3 data with EMRFS.

    What you do next depends on the encryption mode you chose:

  • Under Local disk encryption, choose a value for Key provider type.

    • AWS KMS key

      Select this option to specify an AWS KMS key. For AWS KMS key, select a key. The key must exist in the same region as your EMR cluster. For more information about key requirements, see Using AWS KMS keys for encryption.

      EBS Encryption

      When you specify AWS KMS as your key provider, you can enable EBS encryption to encrypt EBS root device and storage volumes. To enable such option, you must grant the Amazon EMR service role EMR_DefaultRole with permissions to use the AWS KMS key that you specify. For more information about key requirements, see Enabling EBS encryption by providing additional permissions for KMS keys.

    • Custom

      Select this option to specify a custom key provider. For S3 object, enter the location in Amazon S3, or the Amazon S3 ARN, of your custom key-provider JAR file. For Key provider class, enter the full class name of a class declared in your application that implements the EncryptionMaterialsProvider interface. The class name you provide here must be different from the class name provided for CSE-Custom.

  • Choose In-transit encryption to enable the open-source TLS encryption features for in-transit data. Choose a Certificate provider type according to the following guidelines:

    • PEM

      Select this option to use PEM files that you provide within a zip file. Two artifacts are required within the zip file: privateKey.pem and certificateChain.pem. A third file, trustedCertificates.pem, is optional. See Providing certificates for encrypting data in transit with Amazon EMR encryption for details. For S3 object, specify the location in Amazon S3, or the Amazon S3 ARN, of the zip file field.

    • Custom

      Select this option to specify a custom certificate provider and then, for S3 object, enter the location in Amazon S3, or the Amazon S3 ARN, of your custom certificate-provider JAR file. For Key provider class, enter the full class name of a class declared in your application that implements the TLSArtifactsProvider interface.

Specifying encryption options using the AWS CLI

The sections that follow use sample scenarios to illustrate well-formed --security-configuration JSON for different configurations and key providers, followed by a reference for the JSON parameters and appropriate values.

Example in-transit data encryption options

The example below illustrates the following scenario:

aws emr create-security-configuration --name "MySecConfig" --security-configuration '{ "EncryptionConfiguration": { "EnableInTransitEncryption": true, "EnableAtRestEncryption": false, "InTransitEncryptionConfiguration": { "TLSCertificateConfiguration": { "CertificateProviderType": "PEM", "S3Object": "s3://MyConfigStore/artifacts/MyCerts.zip" } } } }'

The example below illustrates the following scenario:

aws emr create-security-configuration --name "MySecConfig" --security-configuration '{ "EncryptionConfiguration": { "EnableInTransitEncryption": true, "EnableAtRestEncryption": false, "InTransitEncryptionConfiguration": { "TLSCertificateConfiguration": { "CertificateProviderType": "Custom", "S3Object": "s3://MyConfig/artifacts/MyCerts.jar", "CertificateProviderClass": "com.mycompany.MyCertProvider" } } } }'

Example at-rest data encryption options

The example below illustrates the following scenario:

  • In-transit data encryption is disabled and at-rest data encryption is enabled.

  • SSE-S3 is used for Amazon S3 encryption.

  • Local disk encryption uses AWS KMS as the key provider.

aws emr create-security-configuration --name "MySecConfig" --security-configuration '{ "EncryptionConfiguration": { "EnableInTransitEncryption": false, "EnableAtRestEncryption": true, "AtRestEncryptionConfiguration": { "S3EncryptionConfiguration": { "EncryptionMode": "SSE-S3" }, "LocalDiskEncryptionConfiguration": { "EncryptionKeyProviderType": "AwsKms", "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012" } } } }'

The example below illustrates the following scenario:

  • In-transit data encryption is enabled and references a zip file with PEM certificates in Amazon S3, using the ARN.

  • SSE-KMS is used for Amazon S3 encryption.

  • Local disk encryption uses AWS KMS as the key provider.

aws emr create-security-configuration --name "MySecConfig" --security-configuration '{ "EncryptionConfiguration": { "EnableInTransitEncryption": true, "EnableAtRestEncryption": true, "InTransitEncryptionConfiguration": { "TLSCertificateConfiguration": { "CertificateProviderType": "PEM", "S3Object": "arn:aws:s3:::MyConfigStore/artifacts/MyCerts.zip" } }, "AtRestEncryptionConfiguration": { "S3EncryptionConfiguration": { "EncryptionMode": "SSE-KMS", "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012" }, "LocalDiskEncryptionConfiguration": { "EncryptionKeyProviderType": "AwsKms", "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012" } } } }'

The example below illustrates the following scenario:

  • In-transit data encryption is enabled and references a zip file with PEM certificates in Amazon S3.

  • CSE-KMS is used for Amazon S3 encryption.

  • Local disk encryption uses a custom key provider referenced by its ARN.

aws emr create-security-configuration --name "MySecConfig" --security-configuration '{ "EncryptionConfiguration": { "EnableInTransitEncryption": true, "EnableAtRestEncryption": true, "InTransitEncryptionConfiguration": { "TLSCertificateConfiguration": { "CertificateProviderType": "PEM", "S3Object": "s3://MyConfigStore/artifacts/MyCerts.zip" } }, "AtRestEncryptionConfiguration": { "S3EncryptionConfiguration": { "EncryptionMode": "CSE-KMS", "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012" }, "LocalDiskEncryptionConfiguration": { "EncryptionKeyProviderType": "Custom", "S3Object": "arn:aws:s3:::artifacts/MyKeyProvider.jar", "EncryptionKeyProviderClass": "com.mycompany.MyKeyProvider" } } } }'

The example below illustrates the following scenario:

  • In-transit data encryption is enabled with a custom key provider.

  • CSE-Custom is used for Amazon S3 data.

  • Local disk encryption uses a custom key provider.

aws emr create-security-configuration --name "MySecConfig" --security-configuration '{ "EncryptionConfiguration": { "EnableInTransitEncryption": "true", "EnableAtRestEncryption": "true", "InTransitEncryptionConfiguration": { "TLSCertificateConfiguration": { "CertificateProviderType": "Custom", "S3Object": "s3://MyConfig/artifacts/MyCerts.jar", "CertificateProviderClass": "com.mycompany.MyCertProvider" } }, "AtRestEncryptionConfiguration": { "S3EncryptionConfiguration": { "EncryptionMode": "CSE-Custom", "S3Object": "s3://MyConfig/artifacts/MyCerts.jar", "EncryptionKeyProviderClass": "com.mycompany.MyKeyProvider" }, "LocalDiskEncryptionConfiguration": { "EncryptionKeyProviderType": "Custom", "S3Object": "s3://MyConfig/artifacts/MyCerts.jar", "EncryptionKeyProviderClass": "com.mycompany.MyKeyProvider" } } } }'

The example below illustrates the following scenario:

  • In-transit data encryption is disabled and at-rest data encryption is enabled.

  • Amazon S3 encryption is enabled with SSE-KMS.

  • Multiple AWS KMS keys are used, one per each S3 bucket, and encryption exceptions are applied to these individual S3 buckets.

  • Local disk encryption is disabled.

aws emr create-security-configuration --name "MySecConfig" --security-configuration '{ "EncryptionConfiguration": { "AtRestEncryptionConfiguration": { "S3EncryptionConfiguration": { "EncryptionMode": "SSE-KMS", "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012", "Overrides": [ { "BucketName": "sse-s3-bucket-name", "EncryptionMode": "SSE-S3" }, { "BucketName": "cse-kms-bucket-name", "EncryptionMode": "CSE-KMS", "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012" }, { "BucketName": "sse-kms-bucket-name", "EncryptionMode": "SSE-KMS", "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012" } ] } }, "EnableInTransitEncryption": false, "EnableAtRestEncryption": true } }'

The example below illustrates the following scenario:

  • In-transit data encryption is disabled and at-rest data encryption is enabled.

  • Amazon S3 encryption is enabled with SSE-S3 and local disk encryption is disabled.

aws emr create-security-configuration --name "MyS3EncryptionConfig" --security-configuration '{ "EncryptionConfiguration": { "EnableInTransitEncryption": false, "EnableAtRestEncryption": true, "AtRestEncryptionConfiguration": { "S3EncryptionConfiguration": { "EncryptionMode": "SSE-S3" } } } }'

The example below illustrates the following scenario:

  • In-transit data encryption is disabled and at-rest data encryption is enabled.

  • Local disk encryption is enabled with AWS KMS as the key provider and Amazon S3 encryption is disabled.

aws emr create-security-configuration --name "MyLocalDiskEncryptionConfig" --security-configuration '{ "EncryptionConfiguration": { "EnableInTransitEncryption": false, "EnableAtRestEncryption": true, "AtRestEncryptionConfiguration": { "LocalDiskEncryptionConfiguration": { "EncryptionKeyProviderType": "AwsKms", "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012" } } } }'

The example below illustrates the following scenario:

  • In-transit data encryption is disabled and at-rest data encryption is enabled.

  • Local disk encryption is enabled with AWS KMS as the key provider and Amazon S3 encryption is disabled.

  • EBS encryption is enabled.

aws emr create-security-configuration --name "MyLocalDiskEncryptionConfig" --security-configuration '{ "EncryptionConfiguration": { "EnableInTransitEncryption": false, "EnableAtRestEncryption": true, "AtRestEncryptionConfiguration": { "LocalDiskEncryptionConfiguration": { "EnableEbsEncryption": true, "EncryptionKeyProviderType": "AwsKms", "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012" } } } }'

The example below illustrates the following scenario:

SSE-EMR-WAL is used for EMR WAL encryption

aws emr create-security-configuration --name "MySecConfig" \ --security-configuration '{ "EncryptionConfiguration": { "EMRWALEncryptionConfiguration":{ }, "EnableInTransitEncryption":false, "EnableAtRestEncryption":false } }'

EnableInTransitEncryption and EnableAtRestEncryption still could be true, if want to enable related encryption.

The example below illustrates the following scenario:

  • SSE-KMS-WAL is used for EMR WAL encryption

  • Server side encryption uses AWS Key Management Service as the key provider

aws emr create-security-configuration --name "MySecConfig" \ --security-configuration '{ "EncryptionConfiguration": { "EMRWALEncryptionConfiguration":{ "AwsKmsKey":"arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012" }, "EnableInTransitEncryption":false, "EnableAtRestEncryption":false } }'

EnableInTransitEncryption and EnableAtRestEncryption still could be true, if want to enable related encryption.

JSON reference for encryption settings

The following table lists the JSON parameters for encryption settings and provides a description of acceptable values for each parameter.

Parameter Description
"EnableInTransitEncryption" : true | false Specify true to enable in-transit encryption and false to disable it. If omitted, false is assumed, and in-transit encryption is disabled.
"EnableAtRestEncryption": true | false Specify true to enable at-rest encryption and false to disable it. If omitted, false is assumed and at-rest encryption is disabled.
In-transit encryption parameters
"InTransitEncryptionConfiguration" : Specifies a collection of values used to configure in-transit encryption when EnableInTransitEncryption is true.
"CertificateProviderType": "PEM" | "Custom" Specifies whether to use PEM certificates referenced with a zipped file, or a Custom certificate provider. If PEM is specified, S3Object must be a reference to the location in Amazon S3 of a zip file containing the certificates. If Custom is specified, S3Object must be a reference to the location in Amazon S3 of a JAR file, followed by a CertificateProviderClass entry.
"S3Object" : "ZipLocation" | "JarLocation" Provides the location in Amazon S3 to a zip file when PEM is specified, or to a JAR file when Custom is specified. The format can be a path (for example, s3://MyConfig/artifacts/CertFiles.zip) or an ARN (for example, arn:aws:s3:::Code/MyCertProvider.jar). If a zip file is specified, it must contain files named exactly privateKey.pem and certificateChain.pem. A file named trustedCertificates.pem is optional.
"CertificateProviderClass" : "MyClassID" Required only if Custom is specified for CertificateProviderType. MyClassID specifies a full class name declared in the JAR file, which implements the TLSArtifactsProvider interface. For example, com.mycompany.MyCertProvider.
At-rest encryption parameters
"AtRestEncryptionConfiguration" : Specifies a collection of values for at-rest encryption when EnableAtRestEncryption is true, including Amazon S3 encryption and local disk encryption.
Amazon S3 encryption parameters
"S3EncryptionConfiguration" : Specifies a collection of values used for Amazon S3 encryption with the Amazon EMR File System (EMRFS).
"EncryptionMode": "SSE-S3" | "SSE-KMS" | "CSE-KMS" | "CSE-Custom" Specifies the type of Amazon S3 encryption to use. If SSE-S3 is specified, no further Amazon S3 encryption values are required. If either SSE-KMS or CSE-KMS is specified, an AWS KMS key ARN must be specified as the AwsKmsKey value. If CSE-Custom is specified, S3Object and EncryptionKeyProviderClass values must be specified.
"AwsKmsKey" : "MyKeyARN" Required only when either SSE-KMS or CSE-KMS is specified for EncryptionMode. MyKeyARN must be a fully specified ARN to a key (for example, arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012).
"S3Object" : "JarLocation" Required only when CSE-Custom is specified for CertificateProviderType. JarLocation provides the location in Amazon S3 to a JAR file. The format can be a path (for example, s3://MyConfig/artifacts/MyKeyProvider.jar) or an ARN (for example, arn:aws:s3:::Code/MyKeyProvider.jar).
"EncryptionKeyProviderClass" : "MyS3KeyClassID" Required only when CSE-Custom is specified for EncryptionMode. MyS3KeyClassID specifies a full class name of a class declared in the application that implements the EncryptionMaterialsProvider interface; for example, com.mycompany.MyS3KeyProvider.
Local disk encryption parameters
"LocalDiskEncryptionConfiguration" Specifies the key provider and corresponding values to be used for local disk encryption.
"EnableEbsEncryption": true | false Specify true to enable EBS encryption. EBS encryption encrypts the EBS root device volume and attached storage volumes. To use EBS encryption, you must specify AwsKms as your EncryptionKeyProviderType.
"EncryptionKeyProviderType": "AwsKms" | "Custom" Specifies the key provider. If AwsKms is specified, an KMS key ARN must be specified as the AwsKmsKey value. If Custom is specified, S3Object and EncryptionKeyProviderClass values must be specified.
"AwsKmsKey : "MyKeyARN" Required only when AwsKms is specified for Type. MyKeyARN must be a fully specified ARN to a key (for example, arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-456789012123).
"S3Object" : "JarLocation" Required only when CSE-Custom is specified for CertificateProviderType. JarLocation provides the location in Amazon S3 to a JAR file. The format can be a path (for example, s3://MyConfig/artifacts/MyKeyProvider.jar) or an ARN (for example, arn:aws:s3:::Code/MyKeyProvider.jar).

"EncryptionKeyProviderClass" : "MyLocalDiskKeyClassID"

Required only when Custom is specified for Type. MyLocalDiskKeyClassID specifies a full class name of a class declared in the application that implements the EncryptionMaterialsProvider interface; for example, com.mycompany.MyLocalDiskKeyProvider.
EMR WAL encryption parameters
"EMRWALEncryptionConfiguration" Specifies the value for EMR WAL encryption.
"AwsKmsKey" Specifies the CMK Key Id Arn.

Configure Kerberos authentication

A security configuration with Kerberos settings can only be used by a cluster that is created with Kerberos attributes or an error occurs. For more information, see Use Kerberos for authentication with Amazon EMR. Kerberos is only available in Amazon EMR release version 5.10.0 and later.

Specifying Kerberos settings using the console

Choose options under Kerberos authentication according to the following guidelines.

Parameter Description

Kerberos

Specifies that Kerberos is enabled for clusters that use this security configuration. If a cluster uses this security configuration, the cluster must also have Kerberos settings specified or an error occurs.

Provider

Cluster-dedicated KDC

Specifies that Amazon EMR creates a KDC on the primary node of any cluster that uses this security configuration. You specify the realm name and KDC admin password when you create the cluster.

You can reference this KDC from other clusters, if required. Create those clusters using a different security configuration, specify an external KDC, and use the realm name and KDC admin password that you specify for the cluster-dedicated KDC.

External KDC

Available only with Amazon EMR 5.20.0 and later. Specifies that clusters using this security configuration authenticate Kerberos principals using a KDC server outside the cluster. A KDC is not created on the cluster. When you create the cluster, you specify the realm name and KDC admin password for the external KDC.

Ticket Lifetime

Optional. Specifies the period for which a Kerberos ticket issued by the KDC is valid on clusters that use this security configuration.

Ticket lifetimes are limited for security reasons. Cluster applications and services auto-renew tickets after they expire. Users who connect to the cluster over SSH using Kerberos credentials need to run kinit from the primary node command line to renew after a ticket expires.

Cross-realm trust

Specifies a cross-realm trust between a cluster-dedicated KDC on clusters that use this security configuration and a KDC in a different Kerberos realm.

Principals (typically users) from another realm are authenticated to clusters that use this configuration. Additional configuration in the other Kerberos realm is required. For more information, see Tutorial: Configure a cross-realm trust with an Active Directory domain.

Cross-realm trust properties

Realm

Specifies the Kerberos realm name of the other realm in the trust relationship. By convention, Kerberos realm names are the same as the domain name but in all capital letters.

Domain

Specifies the domain name of the other realm in the trust relationship.

Admin server

Specifies the fully qualified domain name (FQDN) or IP address of the admin server in the other realm of the trust relationship. The admin server and KDC server typically run on the same machine with the same FQDN, but communicate on different ports.

If no port is specified, port 749 is used, which is the Kerberos default. Optionally, you can specify the port (for example, domain.example.com:749).

KDC server

Specifies the fully qualified domain name (FQDN) or IP address of the KDC server in the other realm of the trust relationship. The KDC server and admin server typically run on the same machine with the same FQDN, but use different ports.

If no port is specified, port 88 is used, which is the Kerberos default. Optionally, you can specify the port (for example, domain.example.com:88).

External KDC

Specifies that clusters external KDC is used by the cluster.

External KDC properties

Admin server

Specifies the fully qualified domain name (FQDN) or IP address of the external admin server. The admin server and KDC server typically run on the same machine with the same FQDN, but communicate on different ports.

If no port is specified, port 749 is used, which is the Kerberos default. Optionally, you can specify the port (for example, domain.example.com:749).

KDC server

Specifies the fully qualified domain name (FQDN) of the external KDC server. The KDC server and admin server typically run on the same machine with the same FQDN, but use different ports.

If no port is specified, port 88 is used, which is the Kerberos default. Optionally, you can specify the port (for example, domain.example.com:88).

Active Directory Integration

Specifies that Kerberos principal authentication is integrated with a Microsoft Active Directory domain.

Active Directory integration properties

Active Directory realm

Specifies the Kerberos realm name of the Active Directory domain. By convention, Kerberos realm names are typically the same as the domain name but in all capital letters.

Active Directory domain

Specifies the Active Directory domain name.

Active Directory server

Specifies the fully qualified domain name (FQDN) of the Microsoft Active Directory domain controller.

Specifying Kerberos settings using the AWS CLI

The following reference table shows JSON parameters for Kerberos settings in a security configuration. For example configurations, see, Configuration examples.

Parameter Description

"AuthenticationConfiguration": {

Required for Kerberos. Specifies that an authentication configuration is part of this security configuration.

"KerberosConfiguration": {

Required for Kerberos. Specifies Kerberos configuration properties.

"Provider": "ClusterDedicatedKdc",

or

"Provider: "ExternalKdc",

ClusterDedicatedKdc specifies that Amazon EMR creates a KDC on the primary node of any cluster that uses this security configuration. You specify the realm name and KDC admin password when you create the cluster. You can reference this KDC from other clusters, if required. Create those clusters using a different security configuration, specify an external KDC, and use the realm name and KDC admin password that you specified when you created the cluster with the cluster-dedicated KDC.

ExternalKdc specifies that the cluster uses an external KDC. Amazon EMR does not create a KDC on the primary node. A cluster that uses this security configuration must specify the realm name and KDC admin password of the external KDC.

"ClusterDedicatedKdcConfiguration": {

Required when ClusterDedicatedKdc is specified.

"TicketLifetimeInHours": 24,

Optional. Specifies the period for which a Kerberos ticket issued by the KDC is valid on clusters that use this security configuration.

Ticket lifetimes are limited for security reasons. Cluster applications and services auto-renew tickets after they expire. Users who connect to the cluster over SSH using Kerberos credentials need to run kinit from the primary node command line to renew after a ticket expires.

"CrossRealmTrustConfiguration": {

Specifies a cross-realm trust between a cluster-dedicated KDC on clusters that use this security configuration and a KDC in a different Kerberos realm.

Principals (typically users) from another realm are authenticated to clusters that use this configuration. Additional configuration in the other Kerberos realm is required. For more information, see Tutorial: Configure a cross-realm trust with an Active Directory domain.

"Realm": "KDC2.COM",

Specifies the Kerberos realm name of the other realm in the trust relationship. By convention, Kerberos realm names are the same as the domain name but in all capital letters.

"Domain": "kdc2.com",

Specifies the domain name of the other realm in the trust relationship.

"AdminServer": "kdc.com:749",

Specifies the fully qualified domain name (FQDN) or IP address of the admin server in the other realm of the trust relationship. The admin server and KDC server typically run on the same machine with the same FQDN, but communicate on different ports.

If no port is specified, port 749 is used, which is the Kerberos default. Optionally, you can specify the port (for example, domain.example.com:749).

"KdcServer": "kdc.com:88"

Specifies the fully qualified domain name (FQDN) or IP address of the KDC server in the other realm of the trust relationship. The KDC server and admin server typically run on the same machine with the same FQDN, but use different ports.

If no port is specified, port 88 is used, which is the Kerberos default. Optionally, you can specify the port (for example, domain.example.com:88).

}

}

"ExternalKdcConfiguration": {

Required when ExternalKdc is specified.

"TicketLifetimeInHours": 24,

Optional. Specifies the period for which a Kerberos ticket issued by the KDC is valid on clusters that use this security configuration.

Ticket lifetimes are limited for security reasons. Cluster applications and services auto-renew tickets after they expire. Users who connect to the cluster over SSH using Kerberos credentials need to run kinit from the primary node command line to renew after a ticket expires.

"KdcServerType": "Single",

Specifies that a single KDC server is referenced. Single is currently the only supported value.

"AdminServer": "kdc.com:749",

Specifies the fully qualified domain name (FQDN) or IP address of the external admin server. The admin server and KDC server typically run on the same machine with the same FQDN, but communicate on different ports.

If no port is specified, port 749 is used, which is the Kerberos default. Optionally, you can specify the port (for example, domain.example.com:749).

"KdcServer": "kdc.com:88",

Specifies the fully qualified domain name (FQDN) of the external KDC server. The KDC server and admin server typically run on the same machine with the same FQDN, but use different ports.

If no port is specified, port 88 is used, which is the Kerberos default. Optionally, you can specify the port (for example, domain.example.com:88).

"AdIntegrationConfiguration": {

Specifies that Kerberos principal authentication is integrated with a Microsoft Active Directory domain.

"AdRealm": "AD.DOMAIN.COM",

Specifies the Kerberos realm name of the Active Directory domain. By convention, Kerberos realm names are typically the same as the domain name but in all capital letters.

"AdDomain": "ad.domain.com"

Specifies the Active Directory domain name.

"AdServer": "ad.domain.com"

Specifies the fully qualified domain name (FQDN) of the Microsoft Active Directory domain controller.

}

}

}

}

Configure IAM roles for EMRFS requests to Amazon S3

IAM roles for EMRFS allow you to provide different permissions to EMRFS data in Amazon S3. You create mappings that specify an IAM role that is used for permissions when an access request contains an identifier that you specify. The identifier can be a Hadoop user or role, or an Amazon S3 prefix.

For more information, see Configure IAM roles for EMRFS requests to Amazon S3.

Specifying IAM roles for EMRFS using the AWS CLI

The following is an example JSON snippet for specifying custom IAM roles for EMRFS within a security configuration. It demonstrates role mappings for the three different identifier types, followed by a parameter reference.

{ "AuthorizationConfiguration": { "EmrFsConfiguration": { "RoleMappings": [{ "Role": "arn:aws:iam::123456789101:role/allow_EMRFS_access_for_user1", "IdentifierType": "User", "Identifiers": [ "user1" ] },{ "Role": "arn:aws:iam::123456789101:role/allow_EMRFS_access_to_MyBuckets", "IdentifierType": "Prefix", "Identifiers": [ "s3://MyBucket/","s3://MyOtherBucket/" ] },{ "Role": "arn:aws:iam::123456789101:role/allow_EMRFS_access_for_AdminGroup", "IdentifierType": "Group", "Identifiers": [ "AdminGroup" ] }] } } }
Parameter Description

"AuthorizationConfiguration":

Required.

"EmrFsConfiguration":

Required. Contains role mappings.

  "RoleMappings":

Required. Contains one or more role mapping definitions. Role mappings are evaluated in the top-down order that they appear. If a role mapping evaluates as true for an EMRFS call for data in Amazon S3, no further role mappings are evaluated and EMRFS uses the specified IAM role for the request. Role mappings consist of the following required parameters:

   "Role":

Specifies the ARN identifier of an IAM role in the format arn:aws:iam::account-id:role/role-name. This is the IAM role that Amazon EMR assumes if the EMRFS request to Amazon S3 matches any of the Identifiers specified.

   "IdentifierType":

Can be one of the following:

  • "User" specifies that the identifiers are one or more Hadoop users, which can be Linux account users or Kerberos principals. When the EMRFS request originates with the user or users specified, the IAM role is assumed.

  • "Prefix" specifies that the identifier is an Amazon S3 location. The IAM role is assumed for calls to the location or locations with the specified prefixes. For example, the prefix s3://mybucket/ matches s3://mybucket/mydir and s3://mybucket/yetanotherdir.

  • "Group" specifies that the identifiers are one or more Hadoop groups. The IAM role is assumed if the request originates from a user in the specified group or groups.

   "Identifiers":

Specifies one or more identifiers of the appropriate identifier type. Separate multiple identifiers by commas with no spaces.

Configure metadata service requests to Amazon EC2 instances

Instance metadata is data about your instance that you can use to configure or manage the running instance. You can access instance metadata from a running instance using one of the following methods:

  • Instance Metadata Service Version 1 (IMDSv1) - a request/response method

  • Instance Metadata Service Version 2 (IMDSv2) - a session-oriented method

While Amazon EC2 supports both IMDSv1 and IMDSv2, Amazon EMR supports IMDSv2 in Amazon EMR 5.23.1, 5.27.1, 5.32 or later, and 6.2 or later. In these releases, Amazon EMR components use IMDSv2 for all IMDS calls. For IMDS calls in your application code, you can use both IMDSv1 and IMDSv2, or configure the IMDS to use only IMDSv2 for added security. When you specify that IMDSv2 must be used, IMDSv1 no longer works.

For more information, see Configure the instance metadata service in the Amazon EC2 User Guide.

Note

In earlier Amazon EMR 5.x or 6.x releases, turning off IMDSv1 causes cluster startup failure as Amazon EMR components use IMDSv1 for all IMDS calls. When turning off IMDSv1, please ensure that any custom software that utilizes IMDSv1 is updated to IMDSv2.

Specifying instance metadata service configuration using the AWS CLI

The following is an example JSON snippet for specifying Amazon EC2 instance metadata service (IMDS) within a security configuration. Using a custom security configuration is optional.

{ "InstanceMetadataServiceConfiguration" : { "MinimumInstanceMetadataServiceVersion": integer, "HttpPutResponseHopLimit": integer } }
Parameter Description

"InstanceMetadataServiceConfiguration":

If you don't specify IMDS within a security configuration and use an Amazon EMR release that requires IMDSv1, Amazon EMR defaults to using IMDSv1 as the minimum instance metadata service version. If you want to use your own configuration, both of the following parameters are required.

"MinimumInstanceMetadataServiceVersion":

Required. Specify 1 or 2. A value of 1 allows IMDSv1 and IMDSv2. A value of 2 allows only IMDSv2.

"HttpPutResponseHopLimit":

Required. The desired HTTP PUT response hop limit for instance metadata requests. The larger the number, the further instance metadata requests can travel. Default: 1. Specify an integer from 1 to 64.

Specifying instance metadata service configuration using the console

You can configure the use of IMDS for a cluster when you launch it from the Amazon EMR console.

To configure the use of IMDS using the console:
  1. When creating a new security configuration on the Security configurations page, select Configure EC2 Instance metadata service under the EC2 Instance Metadata Service setting. This configuration is supported only in Amazon EMR 5.23.1, 5.27.1, 5.32 or later, and 6.2 or later.

  2. For the Minimum Instance Metadata Service Version option, select either:

    • Turn off IMDSv1 and only allow IMDSv2, if you want to allow only IMDSv2 on this cluster. See Transition to using instance metadata service version 2 in the Amazon EC2 User Guide.

    • Allow both IMDSv1 and IMDSv2 on cluster, if you want to allow IMDSv1 and session-oriented IMDSv2 on this cluster.

  3. For IMDSv2, you can also configure the allowable number of network hops for the metadata token by setting the HTTP put response hop limit to an integer between 1 and 64.

For more information, see Configure the instance metadata service in the Amazon EC2 User Guide.

See Configure instance details and Configure the instance metadata service in the Amazon EC2 User Guide.