Architecture details - Amazon Marketing Cloud Insights on AWS

Architecture details

This section describes the components and AWS services that make up this solution and how they work together.

AWS services in this solution

AWS service | Category | Description
Amazon Athena | Core | Accesses the AWS Glue Data Catalog and queries the transformed data in the stage Amazon S3 bucket.
AWS Glue | Core | Applies heavy transformations in the data lake, including partitioning pre-stage data, and writes the output as Parquet files.
AWS Lambda | Core | Adds AMC instances as part of the microservices and registers provisioned customers with the data lake. Lambda also processes workflow requests, checks responses, notifies users, transforms raw data, partitions pre-stage data, and manages metadata stored in Amazon S3 files.
AWS Lake Formation | Core | Provides data lake governance and security.
Amazon S3 | Core | Stores reports from the Amazon Ads API and Selling Partner API, pre-stage data, and post-stage data.
AWS Step Functions | Core | Orchestrates the Lambda functions and user notifications in the Tenant Provisioning Service, Workflow Manager, and data lake.
Amazon DynamoDB | Supporting | Stores details of tenants, workflows, and data lake transformations in DynamoDB tables.
Amazon EventBridge | Supporting | Captures the raw data landing in Amazon S3 buckets and invokes the data lake on a recurring basis.
AWS KMS | Supporting | Provides the KMS keys that encrypt and decrypt the data in Amazon S3 buckets, Amazon SQS queues, and DynamoDB tables.
Amazon SNS | Supporting | Publishes the execution status of the workflow management service.
Amazon SQS | Supporting | Sends, stores, and receives messages between tenants, workflows, and the data lake.
AWS Systems Manager | Supporting | Provides application-level resource monitoring and visualization of resource operations and cost data.
AWS Secrets Manager | Supporting | Stores the user-specified OAuth credentials.
Amazon QuickSight | Optional | Provides business intelligence, analytics, interactive dashboards, and visualizations for business stakeholders.
Amazon SageMaker Jupyter notebook | Optional | Provides sample Jupyter notebooks that analysts can use to provision tenants and manage workflows.

Microservices

This solution deploys six microservices: Platform Management Notebooks, Tenant Provisioning Service, Workflow Manager, Amazon Ads Reporting, Selling Partner Reporting, and the Serverless Data Lake.

Platform Management Notebooks

The Platform Management Notebooks serve as sample code for interfacing with the Tenant Provisioning Service, Workflow Manager, Amazon Ads Reporting, and Selling Partner Reporting microservices.

Tenant Provisioning Service

The Tenant Provisioning Service manages AMC customers onboarded through the solution. Each onboarded AMC customer is mapped to an AMC instance and deployed as a stack in the solution.

Workflow Manager

The Workflow Manager manages requests sent to the AMC API. In addition to synchronizing data between the solution and a customer's AMC instance, it supports cron-based scheduling of AMC workflows and uses queue-based routing to ensure that all requests are processed.

Figure: Workflow Manager
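The combination of cron-based scheduling and queue-based routing described above can be illustrated with a small sketch. This is not the solution's implementation: the `Schedule` record, the two-field cron subset (`*`, `N`, `*/N` for minute and hour), and the function names are all assumptions made for illustration, and an in-memory `deque` stands in for a real message queue.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Deque, Dict, List


@dataclass(frozen=True)
class Schedule:
    """Hypothetical schedule record: one workflow for one AMC customer."""
    workflow_id: str
    customer_id: str
    minute: str  # simplified cron field: "*", "N", or "*/N"
    hour: str    # simplified cron field: "*", "N", or "*/N"


def field_matches(field: str, value: int) -> bool:
    """Match one simplified cron field against a clock value."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return value % int(field[2:]) == 0
    return int(field) == value


def enqueue_due(schedules: List[Schedule], now: datetime,
                queue: Deque[Dict[str, str]]) -> None:
    """Append one request to the queue for every schedule due at `now`."""
    for s in schedules:
        if field_matches(s.minute, now.minute) and field_matches(s.hour, now.hour):
            queue.append({"workflowId": s.workflow_id, "customerId": s.customer_id})


def drain(queue: Deque[Dict[str, str]],
          handler: Callable[[Dict[str, str]], None]) -> List[str]:
    """Process requests in arrival order until the queue is empty."""
    processed = []
    while queue:
        request = queue.popleft()
        handler(request)  # stand-in for submitting the request to the AMC API
        processed.append(request["workflowId"])
    return processed


schedules = [
    Schedule("daily-report", "customer-a", minute="0", hour="6"),
    Schedule("quarter-hourly", "customer-b", minute="*/15", hour="*"),
]
queue: Deque[Dict[str, str]] = deque()
enqueue_due(schedules, datetime(2024, 1, 1, 6, 0), queue)
processed = drain(queue, handler=lambda request: None)
print(processed)  # ['daily-report', 'quarter-hourly']
```

Separating "decide what is due" from "process what is queued" is what lets a queue absorb bursts: every due request is recorded first, and the worker handles them one at a time.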

Amazon Ads Reporting

The Amazon Ads Reporting microservice schedules and fetches reports from the Amazon Ads reporting API endpoint.

Figure: Amazon Ads Reporting

Selling Partner Reporting

The Selling Partner Reporting microservice schedules and fetches reports from the Selling Partner API.

Figure: Selling Partner Reporting

Serverless Data Lake

The Serverless Data Lake transforms the data that the other microservices deliver to any of the intake S3 buckets deployed by the solution: the reporting bucket for Amazon Ads and Selling Partner reports, the AMC buckets for AMC data, and the general-purpose Raw bucket for custom data uploaded by an external provider or AWS service. The data lake detects objects created in these buckets and starts the transformations if the dataset is configured. It routes the data to the corresponding pipeline and applies the custom transformation for the dataset provided by customers. The transformed data is stored in the Amazon S3 stage buckets and can be accessed through the AWS Glue Data Catalog.

Figure: Data Lake
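The detect-and-route behavior described above can be sketched as a lookup from an object-created event to a pipeline. The event shape follows the Amazon S3 "Object Created" notification delivered through EventBridge. Everything else is an assumption for illustration: the bucket names, the `PIPELINES` registry, the pipeline names, and the convention that the first key segment names the dataset.

```python
from typing import Dict, Optional, Tuple

# Hypothetical bucket names mapped to the intake roles named in the text.
BUCKET_ROLES: Dict[str, str] = {
    "example-reporting-bucket": "reporting",
    "example-amc-bucket": "amc",
    "example-raw-bucket": "raw",
}

# Hypothetical registry of configured datasets: (bucket role, dataset) -> pipeline.
PIPELINES: Dict[Tuple[str, str], str] = {
    ("reporting", "ads-report"): "ads-reporting-pipeline",
    ("reporting", "sp-report"): "selling-partner-pipeline",
    ("amc", "workflow-results"): "amc-pipeline",
    ("raw", "custom-upload"): "custom-pipeline",
}


def route(event: dict) -> Optional[str]:
    """Return the pipeline for a new object, or None if its dataset is not configured."""
    detail = event["detail"]
    role = BUCKET_ROLES.get(detail["bucket"]["name"])
    # Assumption: the first key segment identifies the dataset.
    dataset = detail["object"]["key"].split("/", 1)[0]
    return PIPELINES.get((role, dataset))


event = {
    "source": "aws.s3",
    "detail-type": "Object Created",
    "detail": {
        "bucket": {"name": "example-reporting-bucket"},
        "object": {"key": "ads-report/2024/01/01/report.json"},
    },
}
print(route(event))  # ads-reporting-pipeline
```

Returning `None` for an unconfigured dataset mirrors the condition in the text: transformations start only if the dataset is configured.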

Orchestration

AWS Step Functions is the orchestration service used in the Tenant Provisioning Service, Workflow Manager, Amazon Ads Reporting, Selling Partner Reporting, and data lake to coordinate multiple activities in this solution.

  • Step Functions in the Tenant Provisioning Service orchestrates the Lambda functions that add AMC instances and register the provisioned customer with the data lake.

  • The Workflow Manager uses Step Functions to coordinate Lambda functions for processing workflow requests, creating workflow runs, checking workflow status, and notifying the user.

  • Step Functions in the data lake automates transformations after data is delivered to any of the intake S3 buckets.

  • Step Functions in the Amazon Ads Reporting and Selling Partner Reporting microservices orchestrates the Lambda functions that schedule and handle report requests, check the status of reports, and download the completed reports to the S3 bucket.
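As an illustration of the request, check, and download pattern in the last bullet, the following sketch builds a state machine definition in Amazon States Language as a Python dictionary. It is a minimal example under stated assumptions, not the solution's actual definition: the state names, the Lambda function names in the placeholder ARNs, the `$.status` field, and the `COMPLETED` and `FAILED` values are all hypothetical.

```python
import json
from typing import Dict, List


def build_report_state_machine(wait_seconds: int = 60) -> Dict:
    """Build a polling loop: request a report, wait, check status, then download."""
    # Placeholder ARN template; REGION and ACCOUNT_ID are not real values.
    arn = "arn:aws:lambda:REGION:ACCOUNT_ID:function:{}"
    return {
        "Comment": "Sketch: request a report, poll its status, download it",
        "StartAt": "RequestReport",
        "States": {
            "RequestReport": {"Type": "Task", "Resource": arn.format("request-report"),
                              "Next": "WaitForReport"},
            "WaitForReport": {"Type": "Wait", "Seconds": wait_seconds,
                              "Next": "CheckStatus"},
            "CheckStatus": {"Type": "Task", "Resource": arn.format("check-report-status"),
                            "Next": "IsReportReady"},
            "IsReportReady": {
                "Type": "Choice",
                "Choices": [
                    {"Variable": "$.status", "StringEquals": "COMPLETED",
                     "Next": "DownloadReport"},
                    {"Variable": "$.status", "StringEquals": "FAILED",
                     "Next": "ReportFailed"},
                ],
                # Any other status loops back to wait and check again.
                "Default": "WaitForReport",
            },
            "DownloadReport": {"Type": "Task", "Resource": arn.format("download-report"),
                               "End": True},
            "ReportFailed": {"Type": "Fail", "Error": "ReportFailed",
                             "Cause": "The report request did not complete"},
        },
    }


def unresolved_transitions(definition: Dict) -> List[str]:
    """List transition targets that do not name a defined state."""
    states = definition["States"]
    targets = [definition["StartAt"]]
    for state in states.values():
        targets.extend(state[k] for k in ("Next", "Default") if k in state)
        targets.extend(choice["Next"] for choice in state.get("Choices", []))
    return sorted({t for t in targets if t not in states})


definition = build_report_state_machine()
print(json.dumps(definition, indent=2))
```

The `Wait` and `Choice` states form the polling loop, so no Lambda function has to stay running while a report is being generated. The `unresolved_transitions` helper is only a local sanity check on the dictionary and is not part of any AWS API.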