Architecture overview - AWS Content Analysis

Architecture overview

Deploying this solution builds the following environment in the AWS Cloud.


Figure 1: AWS Content Analysis architecture on AWS

The AWS CloudFormation template deploys the following infrastructure:

  1. An Amazon CloudFront distribution to serve the static Content Analysis web application.

  2. An Amazon Simple Storage Service (Amazon S3) web source bucket for hosting the static web application.

  3. An Amazon Cognito user pool to provide a user directory.

  4. An Amazon Cognito identity pool to provide federation with AWS Identity and Access Management (IAM) for authentication and authorization to the web UI.

  5. An Amazon API Gateway REST API for the control plane to proxy file uploads and orchestrate workflow operations from the web UI to Amazon S3 and AWS Step Functions. The template also creates the IAM roles that the API needs to operate.

  6. An AWS Lambda API handler function to support the control plane REST API.

  7. Amazon DynamoDB tables to store system parameters, workflow definitions, workflow status, workflow execution history, and other workflow-related data.

  8. Amazon Simple Queue Service (Amazon SQS) resources to limit the total number of concurrently running workflows to a configurable maximum.

  9. A Lambda function for checking and recording the run status of workflows in DynamoDB.

  10. Two AWS Step Functions workflows: CasVideoWorkflow and CasImageWorkflow. These workflows consist of Lambda functions that run media analysis jobs in Amazon Rekognition, Amazon Transcribe, Amazon Translate, AWS Elemental MediaConvert, and Amazon Comprehend. These Lambda functions also interact with the data plane to store and retrieve media objects and metadata returned by media analysis jobs.

  11. An API Gateway REST API for CRUD functionality in the data plane.

  12. A Lambda API handler function to support the data plane REST API.

  13. A DynamoDB table to record relationships between metadata, media objects, and user-specified media files.

  14. An Amazon S3 bucket to store uploaded video files, derived metadata results, and derived media objects like thumbnails, audio files, and transcoded video files.

  15. Amazon Kinesis Data Streams resources to provide an interface for Amazon OpenSearch Service to access media metadata through a change data capture stream that reflects CRUD operations on the data plane DynamoDB table.

  16. A Lambda function to extract, transform, and load media metadata from the data plane DynamoDB table into an Amazon OpenSearch Service cluster.

  17. An Amazon OpenSearch Service cluster to index media metadata.
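
The path described in items 15 through 17 (change data capture stream, ETL Lambda function, OpenSearch index) can be sketched roughly as follows. This is a minimal illustration, not the solution's actual code. It assumes the standard Kinesis-triggered Lambda event layout, in which each record carries a base64-encoded payload under `kinesis.data`. The payload shape (`action`, `id`, `item`), the index name `media-metadata`, and the function names are hypothetical. Sending the request to the cluster is left out.

```python
import base64
import json

# Hypothetical index name; the solution's real index naming is not stated here.
INDEX_NAME = "media-metadata"


def decode_kinesis_records(event):
    """Yield the JSON payload of each record in a Kinesis-triggered Lambda event."""
    for record in event.get("Records", []):
        raw = base64.b64decode(record["kinesis"]["data"])
        yield json.loads(raw)


def to_bulk_lines(changes, index_name=INDEX_NAME):
    """Translate change records into OpenSearch bulk API action lines.

    Assumed change shape (illustrative only):
    {"action": "INSERT" | "MODIFY" | "REMOVE", "id": str, "item": dict}
    """
    lines = []
    for change in changes:
        meta = {"_index": index_name, "_id": change["id"]}
        if change["action"] == "REMOVE":
            # A delete action has no document line after it.
            lines.append(json.dumps({"delete": meta}))
        else:
            # Inserts and updates both re-index the full item.
            lines.append(json.dumps({"index": meta}))
            lines.append(json.dumps(change["item"]))
    return lines


def handler(event, context=None):
    """Build the newline-delimited bulk request body; sending it is omitted."""
    lines = to_bulk_lines(decode_kinesis_records(event))
    return "\n".join(lines) + "\n" if lines else ""
```

Treating both inserts and modifications as a full re-index keeps the sketch idempotent: replaying the same stream records leaves the index in the same state.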