Amazon EMR
Management Guide

Amazon VPC Options

When you launch an EMR cluster within a VPC, you can launch it in either a public or a private subnet. The configuration differs in small but important ways depending on the subnet type you choose for the cluster.

Public Subnets

EMR clusters in a public subnet require an attached internet gateway, because Amazon EMR clusters must access AWS services and Amazon EMR. If a service, such as Amazon S3, supports VPC endpoints, you can access that service through the endpoint instead of through a public endpoint over the internet gateway. Additionally, Amazon EMR cannot communicate with clusters in public subnets through a network address translation (NAT) device. An internet gateway is required for this purpose, but you can still use a NAT instance or gateway for other traffic in more complex scenarios.

All instances in a cluster connect to Amazon S3 through either a VPC endpoint or an internet gateway. Other AWS services that do not currently support VPC endpoints are reached only through an internet gateway.
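
The distinction above can be checked from a subnet's route table. The following is a minimal Python sketch, not an official Amazon EMR tool. It assumes the routes are plain dictionaries shaped like the `Routes` entries returned by the Amazon EC2 DescribeRouteTables API (keys such as `GatewayId` and `State`), and that internet gateway IDs start with `igw-` and gateway VPC endpoint IDs start with `vpce-`. The function names are hypothetical.

```python
from typing import Iterable, Mapping


def _active_gateway_ids(routes: Iterable[Mapping[str, str]]) -> list:
    """Collect the GatewayId of every route that is not marked as a blackhole."""
    return [
        route.get("GatewayId", "")
        for route in routes
        # Assumption: a missing State is treated as active.
        if route.get("State", "active") == "active"
    ]


def classify_subnet(routes: Iterable[Mapping[str, str]]) -> str:
    """Return "public" if any active route targets an internet gateway, else "private"."""
    gateway_ids = _active_gateway_ids(routes)
    return "public" if any(g.startswith("igw-") for g in gateway_ids) else "private"


def s3_access_path(routes: Iterable[Mapping[str, str]]) -> str:
    """Report which path to Amazon S3 the route table offers.

    A gateway VPC endpoint is preferred when present; otherwise the
    internet gateway is used; otherwise no direct path is visible here.
    """
    gateway_ids = _active_gateway_ids(routes)
    if any(g.startswith("vpce-") for g in gateway_ids):
        return "vpc-endpoint"
    if any(g.startswith("igw-") for g in gateway_ids):
        return "internet-gateway"
    return "none"
```

For example, a route table holding only the `local` route and a `0.0.0.0/0` route to an `igw-` target is classified as public, and `s3_access_path` reports `internet-gateway` for it.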

If you have additional AWS resources that you do not want connected to the internet gateway, you can launch those components in a private subnet that you create within your VPC.

Clusters running in a public subnet use two security groups: one for the master node and another for core and task nodes. For more information, see Control Network Traffic with Security Groups.

The following diagram shows how an Amazon EMR cluster runs in a VPC using a public subnet. The cluster is able to connect to other AWS resources, such as Amazon S3 buckets, through the internet gateway.

						Cluster on a VPC

The following diagram shows how to set up a VPC so that a cluster in the VPC can access resources in your own network, such as an Oracle database.

						Set up a VPC and cluster to access local VPN resources

Private Subnets

Private subnets allow you to launch AWS resources without requiring the subnet to have an attached internet gateway. This might be useful, for example, in an application that uses these private resources in the backend. Those resources can then initiate outbound traffic using a NAT instance located in another subnet that has an internet gateway attached. For more information about this scenario, see Scenario 2: VPC with Public and Private Subnets (NAT).


Amazon EMR supports launching clusters in private subnets only in release 4.2 or later.
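
The release requirement can be checked before a launch is attempted. The sketch below is illustrative only. It assumes release labels follow the `emr-x.y.z` form (for example, `emr-4.2.0`), and the function names are hypothetical rather than part of any AWS SDK.

```python
import re
from typing import Tuple

# Minimum (major, minor) release that supports private subnets, per the text above.
MIN_PRIVATE_SUBNET_RELEASE = (4, 2)


def parse_release_label(label: str) -> Tuple[int, int, int]:
    """Parse a label such as "emr-4.2.0" into (4, 2, 0)."""
    match = re.fullmatch(r"emr-(\d+)\.(\d+)\.(\d+)", label)
    if match is None:
        raise ValueError(f"not an emr-x.y.z release label: {label!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def supports_private_subnet(label: str) -> bool:
    """Return True if the release is 4.2 or later."""
    return parse_release_label(label)[:2] >= MIN_PRIVATE_SUBNET_RELEASE
```

Comparing integer tuples rather than strings avoids ordering mistakes such as treating release 4.10 as earlier than 4.2.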

Private subnets differ from public subnets in the following ways:

  • To access AWS services that do not provide a VPC endpoint, you still must use a NAT instance or an internet gateway.

  • At a minimum, you must provide a route to the Amazon EMR service logs bucket and Amazon Linux repository in Amazon S3. For more information, see Minimum Amazon S3 Policy for Private Subnet.

  • If you use EMRFS features, you need to have an Amazon S3 VPC endpoint and a route from your private subnet to DynamoDB.

  • Debugging only works if you provide a route from your private subnet to a public Amazon SQS endpoint.

  • Creating a private subnet configuration with a NAT instance or gateway in a public subnet is only supported using the AWS Management Console. The easiest way to add and configure NAT instances and Amazon S3 VPC endpoints for EMR clusters is to use the VPC Subnets List page in the Amazon EMR console. To configure NAT gateways, see NAT Gateways in the Amazon VPC User Guide.

  • You cannot change a subnet that contains an existing EMR cluster from public to private or vice versa. To run an EMR cluster in a private subnet, you must launch the cluster in that private subnet.
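
The connectivity requirements in the list above can be expressed as a small pre-launch checklist. This is a hedged sketch under stated assumptions, not an AWS API: the `PrivateSubnetConnectivity` class, its field names, and `missing_requirements` are all hypothetical, and the caller is assumed to already know which routes and endpoints the private subnet has.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PrivateSubnetConnectivity:
    """What the private subnet can reach (hypothetical model for illustration)."""
    s3_route: bool = False         # any route to Amazon S3, for example through NAT
    s3_vpc_endpoint: bool = False  # an Amazon S3 VPC endpoint is attached
    dynamodb_route: bool = False   # a route to DynamoDB
    sqs_route: bool = False        # a route to a public Amazon SQS endpoint


def missing_requirements(
    conn: PrivateSubnetConnectivity,
    uses_emrfs: bool = False,
    uses_debugging: bool = False,
) -> List[str]:
    """List the requirements from the bullets above that the subnet does not meet."""
    missing = []
    # Minimum requirement: reach the EMR service logs bucket and the
    # Amazon Linux repository in Amazon S3 (an S3 VPC endpoint also counts).
    if not (conn.s3_route or conn.s3_vpc_endpoint):
        missing.append("route to Amazon S3")
    # EMRFS features need an S3 VPC endpoint and a route to DynamoDB.
    if uses_emrfs:
        if not conn.s3_vpc_endpoint:
            missing.append("Amazon S3 VPC endpoint")
        if not conn.dynamodb_route:
            missing.append("route to DynamoDB")
    # Debugging needs a route to a public Amazon SQS endpoint.
    if uses_debugging and not conn.sqs_route:
        missing.append("route to a public Amazon SQS endpoint")
    return missing
```

An empty list means the subnet satisfies every requirement modeled here. The sketch does not cover access to other AWS services that lack a VPC endpoint, which still need a NAT instance or an internet gateway.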

Amazon EMR creates and uses different default security groups for the clusters in a private subnet: ElasticMapReduce-Master-Private, ElasticMapReduce-Slave-Private, and ElasticMapReduce-ServiceAccess. For more information, see Control Network Traffic with Security Groups.

For a complete listing of the security group rules that apply to your cluster, choose Security groups for Master and Security groups for Core & Task on the Amazon EMR console Cluster Details page.

The following image shows how an EMR cluster is configured within a private subnet. The only communication outside the subnet is to Amazon EMR.

						Launch an EMR cluster in a private subnet

The following image shows a sample configuration for an EMR cluster within a private subnet connected to a NAT instance that is residing in a public subnet.

						Private subnet with NAT