Launch an Amazon EMR cluster with Lake Formation - Amazon EMR

Launch an Amazon EMR cluster with Lake Formation

Before you launch an Amazon EMR cluster with AWS Lake Formation, you must complete the prerequisites in Before you begin.

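When you launch an Amazon EMR cluster with Lake Formation, you must specify the following items, each of which appears in the AWS CLI example later on this page: Kerberos attributes (--kerberos-attributes), a security configuration (--security-configuration), and a custom EC2 instance profile (InstanceProfile).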

After you launch your cluster, ensure you update the callback or single sign-on URL with your identity provider to direct users back to the master node of the cluster.

Example: Create an EMR Cluster with Lake Formation using the AWS CLI

The following example AWS CLI command launches an Amazon EMR cluster with Zeppelin integrated with AWS Lake Formation. It includes specified values for --kerberos-attributes, --security-configuration, and InstanceProfile as required for EMR integration with Lake Formation.

    aws emr create-cluster --region us-east-1 \
      --release-label emr-5.31.0 \
      --use-default-roles \
      --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.xlarge \
        InstanceGroupType=CORE,InstanceCount=1,InstanceType=m4.xlarge \
      --applications Name=Zeppelin Name=Livy \
      --kerberos-attributes Realm=EC2.INTERNAL,KdcAdminPassword=MyClusterKDCAdminPassword \
      --ec2-attributes KeyName=EC2_KEY_PAIR,SubnetId=subnet-00xxxxxxxxxxxxx11,InstanceProfile=MyCustomEC2InstanceProfile \
      --security-configuration security-configuration \
      --name cluster-name
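The create-cluster command returns as soon as the request is accepted, so the master node's DNS name is not available right away. The following is a minimal sketch of one way to wait for the cluster and then read that name, assuming the AWS CLI is configured for the cluster's account and Region. The function name master_public_dns is hypothetical, and the cluster ID is whatever create-cluster printed (j-EXAMPLE below is a placeholder).

```shell
#!/usr/bin/env bash

# Hypothetical helper: wait for a cluster, then print the master node's DNS name.
# $1 is the cluster ID returned by "aws emr create-cluster".
master_public_dns() {
  local cluster_id="$1"

  # Block until the cluster reaches the RUNNING state (fails if it terminates).
  aws emr wait cluster-running --cluster-id "$cluster_id" || return 1

  # Read the master node's DNS name from the cluster description.
  aws emr describe-cluster --cluster-id "$cluster_id" \
    --query 'Cluster.MasterPublicDnsName' --output text
}
```

For example, master_public_dns j-EXAMPLE prints one DNS name once the cluster is running. For a cluster in a private subnet, the value reported may be a private DNS name, so check that it is reachable by your users before registering it with the identity provider.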

Update the callback or single sign-on URL with your identity provider

After you launch an Amazon EMR cluster with Lake Formation, you must update the callback or single sign-on URL with your identity provider to direct users back to the master node of your cluster.

  1. Locate the public IP address of the master node and the master instance ID for your cluster in the EMR console (https://console.aws.amazon.com/elasticmapreduce/) or using the AWS CLI.

  2. Set up a callback URL in your identity provider (IdP) account:

    • When using AD FS as your IdP, complete the following steps:

      1. From the AD FS Management Console, go to Relying Party Trusts.

      2. Right-click the display name of your relying party trust, and choose Properties.

      3. In the Properties window, choose the Endpoints tab.

      4. Select the temporary URL that you provided previously, then choose Edit.

      5. In the Edit Endpoint window, update the Trusted URL with the correct DNS name for your master node.

      6. In the Add an Endpoint window, fill in the Trusted URL box with your master node public DNS. For example,

        https://ec2-11-111-11-111.compute-1.amazonaws.com:8442/gateway/knoxsso/api/v1/websso?pac4jCallback=true&client_name=SAML2Client

      7. Choose OK.

    • When using Auth0 as your IdP, complete the following steps:

      1. Go to https://auth0.com/ and log in.

      2. In the left panel, choose Applications.

      3. Select your previously created application.

      4. On the Settings tab, update Allowed Callback URLs with your master node public DNS.

    • When using Okta as your IdP, complete the following steps:

      1. Go to https://developer.okta.com/ and log in.

      2. In the top right corner, choose Admin, then choose the Applications tab.

      3. Select your application name.

      4. On the General tab under your application name, choose SAML Settings, then choose Edit.

      5. On the Configure SAML tab, update Single sign on URL with your master node public DNS.
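Step 1 above, and the URL format shown in the AD FS example, can be sketched with the AWS CLI as follows. This is an illustration under stated assumptions, not a definitive procedure: the function names master_node_info and knox_callback_url are hypothetical, the AWS CLI is assumed to be configured for the cluster's Region, and the URL path is copied from the AD FS example. Whether Auth0 and Okta expect the same full path is not stated on this page, so confirm it against your identity provider setup.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the master node's EC2 instance ID, public IP
# address, and public DNS name (tab-separated). $1 is the cluster ID.
master_node_info() {
  local cluster_id="$1"
  aws emr list-instances --cluster-id "$cluster_id" \
    --instance-group-types MASTER \
    --query 'Instances[0].[Ec2InstanceId,PublicIpAddress,PublicDnsName]' \
    --output text
}

# Hypothetical helper: build a callback URL in the form shown in the AD FS
# example. $1 is the master node's public DNS name.
knox_callback_url() {
  local master_dns="$1"
  printf 'https://%s:8442/gateway/knoxsso/api/v1/websso?pac4jCallback=true&client_name=SAML2Client\n' "$master_dns"
}
```

For example, knox_callback_url ec2-11-111-11-111.compute-1.amazonaws.com prints the same URL as the AD FS example above. Keeping the lookup and the URL construction in separate functions lets you paste a DNS name found in the EMR console into the second function without calling the AWS CLI at all.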