Data Lake Solution

Appendix A: Federated Template

For customers who want to integrate with their existing SAML identity provider, such as Microsoft Active Directory, this data lake solution includes another AWS CloudFormation template that deploys the same workflow with an Active Directory (AD) Federation configuration.

AWS CloudFormation Template

This solution includes the following AWS CloudFormation template, which you can download before deployment:


data-lake-deploy-federated.template: Use this template to launch a version of the solution that is ready to integrate with your existing SAML identity provider, such as Microsoft Active Directory.

This template, in turn, launches the following nested stacks:

  • data-lake-storage.template: This template deploys the Amazon S3, Amazon Elasticsearch Service, and Amazon DynamoDB components of the solution.

  • data-lake-services.template: This template deploys the AWS Lambda microservices and the necessary IAM roles and policies. In addition, it deploys the AWS KMS resources for the solution.

  • data-lake-api.template: This template deploys the Amazon API Gateway resources.

Automated Deployment

Before you launch the automated deployment, please review the architecture, configuration, network security, and other considerations discussed in this guide. Follow the step-by-step instructions in this section to configure and deploy the data lake solution with AD Federation into your account.

Time to deploy: Approximately 30 minutes

What We'll Cover

The procedure for deploying this architecture on AWS consists of the following steps. For detailed instructions, follow the links for each step.

Step 1. Launch the Stack

  • Launch the AWS CloudFormation template into your AWS account.

  • Enter values for the required parameters: Stack Name, Cognito Domain, and AD FS Hostname.

  • Review the other template parameters, and adjust if necessary.

Step 2. Complete AD Federation

  • Note the federation values in the stack outputs.

  • Manually configure the AD Federation in AD FS.

Step 1. Launch the Stack

This automated AWS CloudFormation template deploys the data lake solution on the AWS Cloud.

Note

You are responsible for the cost of the AWS services used while running this solution. See the Cost section for more details. For full details, see the pricing webpage for each AWS service you will be using in this solution.

  1. Log in to the AWS Management Console and click the button below to launch the data-lake-deploy-federated AWS CloudFormation template.

    
    [Data lake solution launch button]

    You can also download the template as a starting point for your own implementation.

  2. The template is launched in the US East (N. Virginia) Region by default. To launch the data lake solution in a different AWS Region, use the region selector in the console navigation bar.

    Note

    This solution uses Amazon Cognito, Amazon Athena, and AWS Glue, which are currently available in specific AWS Regions only. Therefore, you must launch this solution in an AWS Region where these services are available. For the most current service availability by Region, see the AWS service offerings by Region.

  3. On the Select Template page, verify that you selected the correct template and choose Next.

  4. On the Specify Details page, assign a name to your data lake solution stack.

  5. Under Parameters, review the parameters for the template, and modify them as necessary. This solution uses the following default values.

    Parameter         Default             Description
    Cognito Domain    <Requires input>    Choose an available domain prefix for your Amazon Cognito hosted domain. The solution uses Amazon Cognito to protect the solution's Kibana with a user name and password, and a domain name for the user pool is a prerequisite for that.
    AD FS Hostname    <Requires input>    Insert the hostname of your AD FS endpoint.

  6. Choose Next.

  7. On the Options page, you can specify tags (key-value pairs) for resources in your stack and set additional options, and then choose Next.

  8. On the Review page, review and confirm the settings. Be sure to check the box acknowledging that the template will create AWS Identity and Access Management (IAM) resources with custom names.

  9. Choose Create to deploy the stack.

    After the stack launches, the three nested stacks will be launched in the same AWS Region. Once all of the stacks and stack resources have successfully launched, you will see the message CREATE_COMPLETE. This can take 30 minutes or longer.
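The console steps above can also be scripted. The following is a minimal sketch, not part of the solution, that assembles the arguments you might pass to the CloudFormation CreateStack API. The parameter keys CognitoDomain and AdfsHostname, the function name, and the template_url argument are assumptions made for illustration; check the downloaded template for the actual parameter names. The prefix check is a simplified reading of the Amazon Cognito domain prefix rules, and the hostname check assumes the parameter takes a bare hostname.

```python
import re

# Simplified check for an Amazon Cognito domain prefix: lowercase letters,
# digits, and hyphens, 1-63 characters, no leading or trailing hyphen.
_PREFIX_RE = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?")


def build_create_stack_args(stack_name, cognito_domain, adfs_hostname, template_url):
    """Return keyword arguments for a CloudFormation CreateStack call.

    The parameter keys below are hypothetical; use the names defined in
    data-lake-deploy-federated.template.
    """
    if not _PREFIX_RE.fullmatch(cognito_domain):
        raise ValueError(f"invalid Cognito domain prefix: {cognito_domain!r}")
    # Assumption: the parameter expects a bare hostname, not a URL.
    if not adfs_hostname or "/" in adfs_hostname:
        raise ValueError("AD FS Hostname must be a bare hostname, not a URL")
    return {
        "StackName": stack_name,
        "TemplateURL": template_url,
        "Parameters": [
            {"ParameterKey": "CognitoDomain", "ParameterValue": cognito_domain},
            {"ParameterKey": "AdfsHostname", "ParameterValue": adfs_hostname},
        ],
        # Mirrors the Review page acknowledgement that the template creates
        # IAM resources with custom names.
        "Capabilities": ["CAPABILITY_NAMED_IAM"],
    }
```

With boto3 installed and credentials configured, the result could be passed as boto3.client("cloudformation").create_stack(**args) in the AWS Region you selected in step 2.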

Step 2. Complete AD Federation

After the data lake stack launch completes, you must complete the AD Federation configuration. Note that this step is required only if you deployed the federated template.

  1. Log in to the AWS Management Console and navigate to the stack Outputs tab.

  2. Note the values of the RelyingPartyURL, RelyingPartyTrustedIdentifier, and LogoutTrustedURL keys.

  3. Navigate to the IdentityProvidersUrl link.

  4. On the Federation console, in the Identity providers section under Active SAML Providers, select Show signing certificate.

  5. Copy the certificate containing the public key to a .cer file (for example, datalake.cer) on your AD FS server. The identity provider uses this key to verify the signed logout request.
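If you prefer to collect these values programmatically, the sketch below shows one way to pull them out of a stack description. It assumes the input has the shape of one element of the "Stacks" list returned by the CloudFormation DescribeStacks API; the function name is illustrative, and the output keys are the ones named in the steps above.

```python
REQUIRED_OUTPUTS = (
    "RelyingPartyURL",
    "RelyingPartyTrustedIdentifier",
    "LogoutTrustedURL",
    "IdentityProvidersUrl",
)


def federation_outputs(stack):
    """Map the federation-related output keys to their values.

    `stack` is expected to look like one element of the "Stacks" list in a
    DescribeStacks response: {"Outputs": [{"OutputKey": ..., "OutputValue": ...}]}.
    """
    outputs = {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}
    missing = [key for key in REQUIRED_OUTPUTS if key not in outputs]
    if missing:
        # A missing key usually means the stack is not the federated one
        # or has not finished creating.
        raise KeyError("stack is missing outputs: " + ", ".join(missing))
    return {key: outputs[key] for key in REQUIRED_OUTPUTS}
```

With boto3, the input could come from boto3.client("cloudformation").describe_stacks(StackName=...)["Stacks"][0].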

Add Amazon Cognito as a relying party in AD FS:

AD FS federation occurs with the participation of two parties: the identity or claims provider (Active Directory) and the relying party (Amazon Cognito). The relying party is a federation partner that is represented by a relying party trust in the federation service. Use the following procedure to configure a new relying party in Active Directory Federation Services:

  1. In the AD FS Management Console, right-click AD FS, and select Add Relying Party Trust.

  2. In the Add Relying Party Trust wizard, select Start.

  3. Select Select Data Source. Then, select the Enter data about the relying party manually radio button, and choose Next.

  4. Select Specify Display Name and set a Display Name (for example, Data Lake Solution on AWS).

  5. Select Configure Certificate and select Next to accept the default values.

  6. Select Configure URL. Then, select Enable support for the SAML 2.0 WebSSO protocol and set the relying party URL (use the value you noted from the RelyingPartyURL output parameter). Then, select Next.

  7. Select Configure Identifiers and set the Relying party trust identifier (use the value you noted from the RelyingPartyTrustedIdentifier output parameter). Then, select Next.

  8. Select Next until you reach the end of the wizard.
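As a sanity check on the values entered in the wizard, note that Amazon Cognito user pools use fixed formats for their SAML endpoints and service provider identifier. The sketch below composes them from a domain prefix, Region, and user pool ID. The function name and sample inputs are illustrative, the mapping of each format to a stack output key is an assumption, and the sketch assumes a Cognito-hosted domain prefix rather than a custom domain. The stack outputs remain the authoritative source.

```python
def cognito_relying_party_values(domain_prefix, region, user_pool_id):
    """Compose the SAML values a Cognito user pool typically exposes.

    Assumes a Cognito-hosted domain (prefix + Region), not a custom domain.
    """
    base = f"https://{domain_prefix}.auth.{region}.amazoncognito.com"
    return {
        # Endpoint that receives the SAML response from the identity provider.
        "RelyingPartyURL": f"{base}/saml2/idpresponse",
        # Service provider entity ID of the user pool.
        "RelyingPartyTrustedIdentifier": f"urn:amazon:cognito:sp:{user_pool_id}",
        # Endpoint that consumes SAML logout responses.
        "LogoutTrustedURL": f"{base}/saml2/logout",
    }
```

If a value from the stack Outputs tab does not match the corresponding format, recheck the Region and Cognito Domain you used before editing the relying party trust.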

Enable Sign Out Flow

Enabling this flow sends a signed logout request to Active Directory when logout is called. AD processes the signed logout request and logs your user out of the Amazon Cognito session. Because the AD FS server expects a signed logout request, you must configure the signing certificate provided by Amazon Cognito in your AD FS.

Use the following procedure to configure this endpoint for consuming logout responses from your Active Directory Federation Services:

  1. In the AD FS Management Console, double-click on the relying party, select the Endpoints tab, and select Add SAML.

  2. Set the Endpoint type to SAML Logout and the Binding to POST, and set the Trusted URL value (use the value you noted from the LogoutTrustedURL output parameter). Then, select OK.

  3. Select the Signature tab, add the certificate you copied from the Federation console (datalake.cer), and select OK.
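The certificate file used in this procedure can be prepared with a small helper. The sketch below assumes the signing certificate copied from the Federation console is bare Base64 text without PEM header lines; if your copy already includes them, it is returned unchanged. The function name is illustrative, and the 64-character line width follows the common PEM convention.

```python
import base64


def to_pem_certificate(copied_text):
    """Wrap Base64 certificate text in PEM armor suitable for a .cer file."""
    text = copied_text.strip()
    if text.startswith("-----BEGIN CERTIFICATE-----"):
        return text + "\n"
    # Remove any line breaks or spaces introduced while copying.
    b64 = "".join(text.split())
    if not b64:
        raise ValueError("no certificate text provided")
    # Raises binascii.Error (a ValueError) if the text is not valid Base64.
    base64.b64decode(b64, validate=True)
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    body = "\n".join(lines)
    return f"-----BEGIN CERTIFICATE-----\n{body}\n-----END CERTIFICATE-----\n"
```

Writing the returned string to datalake.cer produces a Base64-encoded X.509 file. The helper only checks that the text is valid Base64, not that it is a valid certificate.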

Custom Claim Rules

Microsoft AD FS uses Claims Rule Language to issue and transform claims between claims providers and relying parties. A claim is information about a user from a trusted source. The trusted source is asserting that the information is true, and that source has authenticated the user. The claims provider is the source of the claim. This can be information pulled from an attribute store such as Active Directory (AD).

Amazon Cognito user pools support SAML 2.0 federation with post-binding endpoints. This eliminates the need for your app to retrieve or parse SAML assertion responses, because the user pool directly receives the SAML response from your identity provider via a user agent. Your user pool acts as a service provider on behalf of your application.

Use the following procedure to configure custom claim rules for the relying party in Active Directory Federation Services:

Note that this procedure sets the Role outgoing claim type to Admin for all members of the DataLake Admins group.

  1. In the AD FS Management Console, right-click the relying party, and select Edit Claim Issuance Policy.

  2. Specify a claim rule name.

  3. Select Attribute store. Note that this can be Active Directory if your users are in Active Directory.

  4. Map an LDAP Attribute (for example, E-Mail-Addresses) to an Outgoing Claim Type (for example, E-Mail Address).

    Make sure that your AD FS populates the following required attributes for your user pool in the SAML assertion: fullName, email, nameId, groups, and isAdmin.

  5. Log in to the AWS Management Console and navigate to the stack Outputs tab.

  6. Select the Value of the ConsoleUrl key. You will be redirected to the AD FS Management Console.
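To confirm that the claim rules emit everything the user pool requires, you can inspect a captured SAML assertion. The following is a minimal sketch that reports which of the required attributes listed in step 4 are absent. It assumes the attributes appear in the assertion under exactly those names and that nameId corresponds to the NameID element of the Subject; your claim rules may emit different names or claim type URIs, in which case adjust REQUIRED_ATTRIBUTES. It does not verify signatures.

```python
import xml.etree.ElementTree as ET

SAML_NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}
# Attribute names from the procedure; nameId is checked separately below.
REQUIRED_ATTRIBUTES = {"fullName", "email", "groups", "isAdmin"}


def missing_claims(assertion_xml):
    """Return the sorted required claims that are absent from the assertion."""
    root = ET.fromstring(assertion_xml)
    names = {
        attr.get("Name") for attr in root.iterfind(".//saml:Attribute", SAML_NS)
    }
    missing = sorted(REQUIRED_ATTRIBUTES - names)
    # nameId is carried in the Subject's NameID element, not as an Attribute.
    name_id = root.find(".//saml:Subject/saml:NameID", SAML_NS)
    if name_id is None or not (name_id.text or "").strip():
        missing.append("nameId")
    return missing
```

An empty list means every required claim was found under the assumed names.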