Menu
Amazon EMR
Amazon EMR Release Guide

Providing Keys for At-Rest Data Encryption with Amazon EMR

You can use AWS Key Management Service (AWS KMS) or a custom key provider for at-rest data encryption in Amazon EMR. When you use AWS KMS, charges apply for the storage and use of encryption keys. For more information, see AWS KMS Pricing.

This topic provides key policy details for an AWS KMS CMK to be used with Amazon EMR, as well as guidelines and code examples for writing a custom key provider class for Amazon S3 encryption. For more information about creating keys, see Creating Keys in the AWS Key Management Service Developer Guide.

Using AWS KMS Customer Master Keys (CMKs) for Encryption

When using SSE-KMS or CSE-KMS, the AWS KMS encryption key must be created in the same region as your Amazon EMR cluster instance and the Amazon S3 buckets used with EMRFS. In addition, the Amazon EC2 instance role used for a cluster must have permission to use the CMK you specify. The default instance role is EMR_EC2_DefaultRole. If you have assigned a different instance role to a cluster, make sure that the role is added as a key user, which gives the role permission to use the CMK. For more information, see Using Key Policies in the AWS Key Management Service Developer Guide and Create and Use IAM Roles for Amazon EMR in the Amazon EMR Management Guide. Although you may use the same AWS KMS customer master key (CMK) for Amazon S3 data encryption as you use for local disk encryption, using separate keys is recommended.

You can use the AWS Management Console to add your instance profile or EC2 role to the list of key users for the specified AWS KMS CMK, or you can use the AWS CLI or an AWS SDK to attach an appropriate key policy.

Add the EMR Instance Role to an AWS KMS CMK

The following procedure describes how to add the default EMR instance role, EMR_EC2_DefaultRole as a key user using the AWS Management Console. It assumes that you have already created a CMK. To create a new CMK, see Creating Keys in the AWS Key Management Service Developer Guide.

To add the default instance role for Amazon EC2 to the list of encryption key users

  1. Open the Encryption keys section of the AWS Identity and Access Management (IAM) console at https://console.aws.amazon.com/iam/home#encryptionKeys.

  2. For Region, choose the appropriate AWS Region. Do not use the region selector in the navigation bar (top right corner).

  3. Select the alias of the CMK to modify.

  4. On the key details page under Key Users, choose Add.

  5. In the Attach dialog box, select the appropriate role. The default name for the role is EMR_EC2_DefaultRole.

  6. Choose Attach.

Creating a Custom Key Provider

When you create a custom key provider, the application is expected to implement the EncryptionMaterialsProvider interface, which is available in the AWS SDK for Java version 1.11.0 and later. The implementation can use any strategy to provide encryption materials. You may, for example, choose to provide static encryption materials or integrate with a more complex key management system. You must use a different provider class name for local disk encryption and Amazon S3 encryption.

The EncryptionMaterialsProvider class gets encryption materials by encryption context, which is used when Amazon EMR fetches encryption materials to encrypt data. Amazon EMR populates encryption context information at runtime to help the caller determine which encryption materials to return.

Example: Using a Custom Key Provider for Amazon S3 Encryption with EMRFS

When Amazon EMR fetches the encryption materials from the EncryptionMaterialsProvider class to perform encryption, EMRFS optionally populates the materialsDescription argument with two fields: the Amazon S3 URI for the object and the JobFlowId of the cluster, which can be used by the EncryptionMaterialsProvider class to return encryption materials selectively.

For example, the provider may return different keys for different Amazon S3 URI prefixes. It is the description of the returned encryption materials that is eventually stored with the Amazon S3 object rather than the materialsDescription value that is generated by EMRFS and passed to the provider. While decrypting an Amazon S3 object, the encryption materials description is passed to the EncryptionMaterialsProvider class, so that it can, again, selectively return the matching key to decrypt the object.

An EncryptionMaterialsProvider reference implementation is provided below. Another custom provider, EMRFSRSAEncryptionMaterialsProvider, is available from GitHub.

Copy
import com.amazonaws.services.s3.model.EncryptionMaterials; import com.amazonaws.services.s3.model.EncryptionMaterialsProvider; import com.amazonaws.services.s3.model.KMSEncryptionMaterials; import org.apache.hadoop.conf.Configurable; import org.apache.hadoop.conf.Configuration; import java.util.Map; /** * Provides KMSEncryptionMaterials according to Configuration */ public class MyEncryptionMaterialsProviders implements EncryptionMaterialsProvider, Configurable{ private Configuration conf; private String kmsKeyId; private EncryptionMaterials encryptionMaterials; private void init() { this.kmsKeyId = conf.get("my.kms.key.id"); this.encryptionMaterials = new KMSEncryptionMaterials(kmsKeyId); } @Override public void setConf(Configuration conf) { this.conf = conf; init(); } @Override public Configuration getConf() { return this.conf; } @Override public void refresh() { } @Override public EncryptionMaterials getEncryptionMaterials(Map<String, String> materialsDescription) { return this.encryptionMaterials; } @Override public EncryptionMaterials getEncryptionMaterials() { return this.encryptionMaterials; } }