Amazon Elastic MapReduce
Amazon EMR Release Guide

Hadoop Key Management Server

Hadoop KMS is a key management server that lets you implement cryptographic services for Hadoop clusters, and it can serve as the key provider for Transparent Encryption in HDFS. In Amazon EMR, Hadoop KMS is installed and enabled by default when you select the Hadoop application while launching a cluster. Hadoop KMS does not store keys itself, except for temporary caching. It acts as a proxy between the client and the key provider for a backing keystore; it is not itself a keystore.

The default keystore created for Hadoop KMS is the Java Cryptography Extension KeyStore (JCEKS). The JCE unlimited strength policy is also included, so you can create keys of the desired length; the default key length in Amazon EMR is 256 bits. Hadoop KMS also supports a range of ACLs that control access to keys and key operations independently of other client applications such as HDFS.

To configure Hadoop KMS settings, use the hadoop-kms-site classification. To configure ACLs, use the hadoop-kms-acls classification.

For more information, see the Hadoop KMS documentation. Hadoop KMS is used in HDFS transparent encryption. To learn more about HDFS transparent encryption in Amazon EMR, see Transparent Encryption in HDFS and the HDFS Transparent Encryption topic in the Apache Hadoop documentation.

Note

In Amazon EMR, KMS over HTTPS is not enabled by default with Hadoop KMS. To learn how to enable KMS over HTTPS, see the Hadoop documentation.

Important

Hadoop KMS requires your key names to be lowercase. If you use a key name that contains uppercase characters, your cluster fails during launch.
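Because an uppercase key name causes the launch to fail, it can be worth checking names before you use them. The following is a minimal sketch, not part of Hadoop KMS or Amazon EMR: the helper name is_valid_kms_key_name is hypothetical, and it checks only the lowercase rule stated above, not any other constraint that Hadoop KMS might apply.

```python
def is_valid_kms_key_name(name: str) -> bool:
    """Return True if the name is non-empty and has no uppercase characters.

    Hypothetical helper: it encodes only the lowercase requirement from the
    note above, nothing else.
    """
    return bool(name) and name == name.lower()


for candidate in ["mykey", "MyKey", "my_key_01"]:
    status = "ok" if is_valid_kms_key_name(candidate) else "contains uppercase characters"
    print(f"{candidate}: {status}")
```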

Configuring Hadoop KMS in Amazon EMR

Important

The Hadoop KMS ports changed in Amazon EMR release 4.6 and later: kms-http-port is now 9700 and kms-admin-port is 9701.
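Clients that address Hadoop KMS directly need the new HTTP port. The sketch below builds a key provider URI in the kms://http@host:port/kms form described in the Apache Hadoop KMS documentation. Using that URI form here is an assumption carried over from the Hadoop documentation, and the function name and the host name in the example are hypothetical.

```python
KMS_HTTP_PORT = 9700   # kms-http-port in Amazon EMR release 4.6 or later
KMS_ADMIN_PORT = 9701  # kms-admin-port in Amazon EMR release 4.6 or later


def kms_provider_uri(host: str, port: int = KMS_HTTP_PORT, scheme: str = "http") -> str:
    """Build a KMS key provider URI (hypothetical helper).

    The scheme defaults to http because KMS over HTTPS is not enabled by
    default in Amazon EMR.
    """
    return f"kms://{scheme}@{host}:{port}/kms"


# "my-master-node" is a placeholder host name, not a real endpoint.
print(kms_provider_uri("my-master-node"))
```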

You can configure Hadoop KMS at cluster creation time using the configuration API for Amazon EMR releases. The following are the configuration object classifications available for Hadoop KMS:

Hadoop KMS Configuration Classifications

Classification      Filename
hadoop-kms-site     kms-site.xml
hadoop-kms-acls     kms-acls.xml
hadoop-kms-env      kms-env.sh
hadoop-kms-log4j    kms-log4j.properties
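If you generate configuration files from scripts, the table above can be kept as a lookup so that a misspelled classification is caught before you launch a cluster. This is only an illustrative sketch; the names KMS_CLASSIFICATIONS and target_file are hypothetical and are not part of any AWS tool.

```python
# Classification -> file it configures, copied from the table above.
KMS_CLASSIFICATIONS = {
    "hadoop-kms-site": "kms-site.xml",
    "hadoop-kms-acls": "kms-acls.xml",
    "hadoop-kms-env": "kms-env.sh",
    "hadoop-kms-log4j": "kms-log4j.properties",
}


def target_file(classification: str) -> str:
    """Return the file a Hadoop KMS classification maps to, or raise ValueError."""
    try:
        return KMS_CLASSIFICATIONS[classification]
    except KeyError:
        raise ValueError(f"not a Hadoop KMS classification: {classification!r}") from None


print(target_file("hadoop-kms-acls"))
```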

To set Hadoop KMS ACLs using the CLI

  • Create a cluster with Hadoop KMS ACLs using the following command:

    aws emr create-cluster --release-label emr-4.7.2 --instance-type m3.xlarge --instance-count 2 \
    --applications Name=App1 Name=App2 --configurations https://s3.amazonaws.com/mybucket/myfolder/myConfig.json

    Note

    For Windows, replace the above Linux line continuation character (\) with the caret (^).

    myConfig.json:

    [
      {
        "Classification": "hadoop-kms-acls",
        "Properties": {
          "hadoop.kms.blacklist.CREATE": "hdfs,foo,myBannedUser",
          "hadoop.kms.acl.ROLLOVER": "myAllowedUser"
        }
      }
    ]
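When the user lists come from another system, it may be easier to generate the ACL configuration than to edit it by hand. The following sketch produces the same JSON structure as the myConfig.json example above. The function name kms_acl_config and its parameters are hypothetical; the property name prefixes are the ones used in the example.

```python
import json


def kms_acl_config(blacklist=None, acl=None):
    """Build a hadoop-kms-acls configuration list (hypothetical helper).

    blacklist and acl map an operation name such as "CREATE" to a list of
    user names; each list is joined with commas as in the example above.
    """
    properties = {}
    for operation, users in (blacklist or {}).items():
        properties[f"hadoop.kms.blacklist.{operation}"] = ",".join(users)
    for operation, users in (acl or {}).items():
        properties[f"hadoop.kms.acl.{operation}"] = ",".join(users)
    return [{"Classification": "hadoop-kms-acls", "Properties": properties}]


config = kms_acl_config(
    blacklist={"CREATE": ["hdfs", "foo", "myBannedUser"]},
    acl={"ROLLOVER": ["myAllowedUser"]},
)
print(json.dumps(config, indent=2))
```

The printed JSON can be saved as myConfig.json and stored at the location that you pass to --configurations.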
    

To disable Hadoop KMS cache using the CLI

  • Create a cluster with the Hadoop KMS property hadoop.kms.cache.enable set to false using the following command:

    aws emr create-cluster --release-label emr-4.7.2 --instance-type m3.xlarge --instance-count 2 \
    --applications Name=App1 Name=App2 --configurations https://s3.amazonaws.com/mybucket/myfolder/myConfig.json

    Note

    For Windows, replace the above Linux line continuation character (\) with the caret (^).

    myConfig.json:

    [
      {
        "Classification": "hadoop-kms-site",
        "Properties": {
          "hadoop.kms.cache.enable": "false"
        }
      }
    ]
    

To set environment variables in the kms-env.sh script using the CLI

  • You can change settings in kms-env.sh via the hadoop-kms-env configuration. Create a cluster with Hadoop KMS using the following:

    aws emr create-cluster --release-label emr-4.7.2 --instance-type m3.xlarge --instance-count 2 \
    --applications Name=App1 Name=App2 --configurations https://s3.amazonaws.com/mybucket/myfolder/myConfig.json

    Note

    For Windows, replace the above Linux line continuation character (\) with the caret (^).

    myConfig.json:

    [
      {
        "Classification": "hadoop-kms-env",
        "Properties": {     
        },
        "Configurations": [
          {
            "Classification": "export",
            "Properties": {
              "JAVA_LIBRARY_PATH": "/path/to/files",
              "KMS_SSL_KEYSTORE_FILE": "/non/Default/Path/.keystore",
              "KMS_SSL_KEYSTORE_PASS": "myPass"
            },
            "Configurations": [        
            ]
          }
        ]
      }
    ]
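The hadoop-kms-env example nests an export classification inside the outer one, which is easy to get wrong by hand. The sketch below builds that nested structure from a plain dictionary of environment variables. The function name kms_env_config is hypothetical, and the variable values are the placeholders from the example above.

```python
import json


def kms_env_config(exports):
    """Build a hadoop-kms-env configuration list (hypothetical helper).

    exports maps environment variable names to values; they are placed in
    the nested "export" classification, as in the example above.
    """
    return [
        {
            "Classification": "hadoop-kms-env",
            "Properties": {},
            "Configurations": [
                {
                    "Classification": "export",
                    "Properties": dict(exports),
                    "Configurations": [],
                }
            ],
        }
    ]


config = kms_env_config(
    {
        "JAVA_LIBRARY_PATH": "/path/to/files",
        "KMS_SSL_KEYSTORE_FILE": "/non/Default/Path/.keystore",
        "KMS_SSL_KEYSTORE_PASS": "myPass",  # placeholder value from the example
    }
)
print(json.dumps(config, indent=2))
```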
    

For information about configuring Hadoop KMS, see http://hadoop.apache.org/docs/current/hadoop-kms/index.html.