AWS GovCloud (US) User Guide

Amazon EMR

Amazon EMR is a web service for processing large amounts of data. It combines Hadoop processing with several AWS products to perform tasks such as web indexing, data mining, log file analysis, machine learning, scientific simulation, and data warehousing.

The following list describes how using this service in AWS GovCloud (US) Regions differs from using it in other AWS Regions:

  • MapR distributions are currently not supported in AWS GovCloud (US) Regions.

  • In AWS GovCloud (US) Regions, you launch all Amazon EMR job flows in Amazon Virtual Private Cloud (Amazon VPC). For information about configuring an Amazon VPC that can run a job flow, see Select an Amazon VPC and Subnet for the Cluster.

  • Launching a job flow by using Spot instances is not currently supported in AWS GovCloud (US) Regions.

  • Launching a job flow with debugging is not currently supported in AWS GovCloud (US) Regions.
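The VPC and Spot restrictions above can be checked before a launch request is sent. The following is a minimal sketch, not an official tool: it assumes the request is a Python dictionary shaped like the parameters of the Amazon EMR RunJobFlow API (`Instances`, `Ec2SubnetId`, `InstanceGroups`, `Market`). The function name `check_govcloud_launch`, the subnet ID, and the instance types are hypothetical placeholders. It does not check for debugging or MapR.

```python
from typing import Any, Dict, List


def check_govcloud_launch(request: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a RunJobFlow-style request dict."""
    problems: List[str] = []
    instances = request.get("Instances", {})

    # Job flows in AWS GovCloud (US) launch in a VPC, so a subnet is required.
    if not instances.get("Ec2SubnetId"):
        problems.append("Instances.Ec2SubnetId is missing; launch in an Amazon VPC subnet")

    # Spot instances are not supported, so every instance group must be On-Demand.
    for group in instances.get("InstanceGroups", []):
        if group.get("Market", "ON_DEMAND") == "SPOT":
            role = group.get("InstanceRole", "?")
            problems.append(f"Instance group {role} requests Spot instances")

    return problems


# Hypothetical request; the subnet ID and instance types are placeholders.
example_request = {
    "Name": "nightly-log-analysis",
    "Instances": {
        "Ec2SubnetId": "subnet-EXAMPLE",
        "InstanceGroups": [
            {"InstanceRole": "MASTER", "Market": "ON_DEMAND",
             "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"InstanceRole": "CORE", "Market": "SPOT",
             "InstanceType": "m5.xlarge", "InstanceCount": 2},
        ],
    },
}

for problem in check_govcloud_launch(example_request):
    print(problem)
```

An empty result means only that these two checks passed; it does not guarantee that the service accepts the request.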

For more information about Amazon EMR, see the Amazon EMR documentation.

ITAR Boundary

AWS GovCloud (US) has an ITAR boundary, which defines where customers are allowed to store ITAR-controlled data for this service in AWS GovCloud (US) Regions. To maintain ITAR compliance, you must place ITAR-controlled data on the applicable part of the ITAR boundary. If you do not have any ITAR-controlled data in AWS GovCloud (US) Regions, this section does not apply to you. The following information identifies the ITAR boundary for this service:

ITAR-Regulated Data Permitted

  • All input and output data that is entered, stored, and processed in Amazon EMR can contain ITAR-regulated data.

ITAR-Regulated Data Not Permitted

  • Amazon EMR metadata is not permitted to contain ITAR-regulated data. This metadata includes all configuration data that you enter when creating and maintaining your job flows.

  • Do not enter ITAR-regulated data in Amazon EMR when doing the following:

    • Naming a job flow

    • Specifying a file location

    • Naming a bootstrap action

    • Providing arguments

    • Adding resource tags

  • Do not print ITAR-regulated data to your logs. Amazon EMR logs, like Amazon EMR metadata, are not permitted to contain ITAR-regulated data.
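One way to reduce the risk of controlled data slipping into the metadata fields listed above is to screen them before submitting a job flow. The sketch below is illustrative only. The patterns in `CONTROLLED_PATTERNS`, the field names in `metadata`, and the helper names are all hypothetical, and a text match cannot determine whether data is actually ITAR-regulated. Treat it as a reminder for human review, not as a compliance control.

```python
import re
from typing import Any, Dict, List

# Hypothetical markers; replace with whatever identifies controlled data
# in your organization.
CONTROLLED_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"\bITAR\b", r"\bexport[- ]controlled\b")
]


def _strings(value: Any) -> List[str]:
    """Flatten nested dicts, lists, and tuples into the strings they contain."""
    if isinstance(value, str):
        return [value]
    if isinstance(value, dict):
        # Dictionary keys (for example, tag names) are metadata too.
        return [s for k, v in value.items() for s in _strings(k) + _strings(v)]
    if isinstance(value, (list, tuple)):
        return [s for item in value for s in _strings(item)]
    return []


def find_flagged_metadata(metadata: Dict[str, Any]) -> List[str]:
    """Return the names of metadata fields whose text matches a marker."""
    flagged = []
    for field, value in metadata.items():
        texts = _strings(value)
        if any(p.search(text) for text in texts for p in CONTROLLED_PATTERNS):
            flagged.append(field)
    return flagged


# Hypothetical metadata covering the fields named in the list above.
metadata = {
    "job_flow_name": "weekly-aggregation",
    "file_location": "s3://example-bucket/itar/input/",
    "bootstrap_action_name": "install-libs",
    "arguments": ["--mode", "batch"],
    "resource_tags": {"project": "export-controlled-sim"},
}

print(find_flagged_metadata(metadata))
```

In this example the file location and the resource tags are flagged, so they would be renamed before the job flow is created.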

If you are processing ITAR-regulated data with this service, use the SSL (HTTPS) endpoint to maintain ITAR compliance. For a list of endpoints, see Endpoints for the AWS GovCloud (US) Regions.
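A client-side guard can make the HTTPS requirement explicit in code. The sketch below assumes you obtain the endpoint URL from the endpoint list referenced above. The function name `require_https` is hypothetical, and the URL shown is a placeholder, not a real Amazon EMR endpoint.

```python
from urllib.parse import urlparse


def require_https(endpoint_url: str) -> str:
    """Return endpoint_url unchanged if it is an HTTPS URL, else raise ValueError."""
    parsed = urlparse(endpoint_url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"Endpoint must be an HTTPS URL, got: {endpoint_url!r}")
    return endpoint_url


# Placeholder URL; substitute the endpoint for your AWS GovCloud (US) Region.
endpoint = require_https("https://emr.example.invalid")
print(endpoint)
```

A validated URL can then be handed to whichever SDK or tool you use, for example as the `endpoint_url` argument of `boto3.client`.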
