Media Analysis Solution

Solution Components

Image Analysis

When you upload a PNG, JPG, or JPEG image to the solution’s encrypted Amazon S3 bucket, a Lambda function invokes the Step Functions state machine. The state machine enters a parallel state that simultaneously executes three branches, each responsible for orchestrating a different type of image analysis: label detection, celebrity recognition, and face search. The metadata results are stored in the Amazon S3 bucket and returned to the state machine as the task output.

When the analysis is complete, the state machine enters a final state that indexes the results in the Elasticsearch cluster. You can search and retrieve the image metadata using the solution API or web interface.
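The flow described above (one parallel state with three branches, followed by a final indexing state) can be sketched as an Amazon States Language definition built in Python. This is a minimal illustration, not the solution's actual definition: the state names, the `build_image_state_machine` helper, and the Lambda ARN prefix are assumptions, and each branch is reduced to a single task.

```python
import json

# Branch names follow the three analysis types in the text.
# The state names themselves are hypothetical.
IMAGE_BRANCHES = ["LabelDetection", "CelebrityRecognition", "FaceSearch"]


def task_state(resource_arn, next_state=None):
    """Build a single Task state that invokes one Lambda function."""
    state = {"Type": "Task", "Resource": resource_arn}
    if next_state is None:
        state["End"] = True
    else:
        state["Next"] = next_state
    return state


def build_image_state_machine(arn_prefix):
    """Return a definition with one Parallel state followed by indexing."""
    branches = [
        {"StartAt": name, "States": {name: task_state(arn_prefix + name)}}
        for name in IMAGE_BRANCHES
    ]
    return {
        "StartAt": "AnalyzeImage",
        "States": {
            "AnalyzeImage": {
                "Type": "Parallel",
                "Branches": branches,
                "Next": "IndexResults",
            },
            "IndexResults": task_state(arn_prefix + "IndexResults"),
        },
    }


# Placeholder account ID and Region; replace with real function ARNs.
definition = build_image_state_machine(
    "arn:aws:lambda:us-east-1:123456789012:function:"
)
print(json.dumps(definition, indent=2))
```

A Parallel state passes the outputs of all its branches, as an array, to the next state, which matches the description of results being returned to the state machine before indexing.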

For more information on the image analysis state machine, see Appendix B.

Video Analysis

When you upload an MOV or MP4 video, a Lambda function starts the state machine, which enters a parallel state that executes five branches. Each branch is responsible for orchestrating a different type of video analysis: label detection, celebrity recognition, face detection, face search, and pathing. The metadata results are stored in Amazon S3 and returned to the state machine as the task output.

When the analysis is complete, the results are indexed in the Elasticsearch cluster where they can be searched and retrieved using the solution API or web interface.

Note that MP4 video files will also be processed by the audio analysis state machine.

Media Conversion

When an MP4 video file is uploaded, a Lambda function triggers an AWS Elemental MediaConvert job to create a separate audio file. When the job is complete, MediaConvert uploads the audio file to Amazon S3, and the file is then passed to the audio analysis state machine.

For more information on the video analysis state machine, see Appendix B.

Audio Analysis

When a FLAC, MP3, WAV, or MP4 audio file is uploaded, a Lambda function starts the state machine, which orchestrates the audio transcription process using Amazon Transcribe.

Once the transcription is complete, the state machine leverages the resulting transcript to perform natural language processing using Amazon Comprehend. The state machine enters a parallel state that simultaneously executes two branches to perform key phrase and key entity detection.
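The two natural language processing branches can be illustrated outside of Step Functions with a small Python sketch. It uses a thread pool in place of the parallel state, and it assumes a boto3 Amazon Comprehend client is passed in. The function names and the default language code are assumptions. Long transcripts may need to be split to stay within Comprehend request size limits, which this sketch does not handle.

```python
from concurrent.futures import ThreadPoolExecutor


def detect_key_phrases(comprehend, text, language_code="en"):
    """Branch 1: key phrase detection on the transcript text."""
    response = comprehend.detect_key_phrases(
        Text=text, LanguageCode=language_code
    )
    return response["KeyPhrases"]


def detect_entities(comprehend, text, language_code="en"):
    """Branch 2: entity detection on the same transcript text."""
    response = comprehend.detect_entities(
        Text=text, LanguageCode=language_code
    )
    return response["Entities"]


def analyze_transcript(comprehend, text, language_code="en"):
    """Run both branches concurrently and collect their results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        phrases = pool.submit(
            detect_key_phrases, comprehend, text, language_code
        )
        entities = pool.submit(
            detect_entities, comprehend, text, language_code
        )
        return {
            "key_phrases": phrases.result(),
            "entities": entities.result(),
        }
```

With boto3 installed and credentials configured, a call might look like `analyze_transcript(boto3.client("comprehend"), transcript_text)`.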

When the analysis is complete, the results are indexed in the Elasticsearch cluster where they can be searched and retrieved using the solution API or web interface.

For more information on the audio analysis state machine, see Appendix B.
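Taken together, the image, video, and audio sections imply a mapping from file extension to the analyses that run. The following Python sketch restates that mapping. The `analyses_for` function and the case-insensitive extension matching are assumptions for illustration, not the solution's actual dispatch code.

```python
import os

# Extensions as listed in the image, video, and audio sections.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}
VIDEO_EXTENSIONS = {".mov", ".mp4"}
AUDIO_EXTENSIONS = {".flac", ".mp3", ".wav", ".mp4"}


def analyses_for(filename):
    """Return the analyses that apply to a file, based on its extension."""
    # Lowercasing the extension is an assumption; the text does not say
    # whether matching is case-sensitive.
    ext = os.path.splitext(filename)[1].lower()
    analyses = []
    if ext in IMAGE_EXTENSIONS:
        analyses.append("image")
    if ext in VIDEO_EXTENSIONS:
        analyses.append("video")
    if ext in AUDIO_EXTENSIONS:
        # For MP4 video, the audio track is first extracted by MediaConvert.
        analyses.append("audio")
    return analyses
```

In this sketch an MP4 file maps to both video and audio analysis, while an MOV file maps to video analysis only, as the text describes.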

Face Indexing

The Media Analysis Solution allows you to upload images of faces to be indexed in an Amazon Rekognition collection. These faces are automatically used during face searches on images and videos.

The first time you upload a face image to the solution, a Lambda function creates a collection for you and then indexes the face. Each face image you upload after that is added to the same collection.
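The create-then-index behavior can be sketched with the Amazon Rekognition API as exposed by boto3. This is a hedged illustration rather than the solution's Lambda code: the `index_face` function, its parameters, and the choice to check for the collection with `describe_collection` are assumptions.

```python
def index_face(rekognition, collection_id, bucket, key, external_image_id):
    """Create the collection on first use, then index the face image.

    `rekognition` is expected to be a boto3 Rekognition client.
    """
    try:
        rekognition.describe_collection(CollectionId=collection_id)
    except rekognition.exceptions.ResourceNotFoundException:
        # First upload: the collection does not exist yet.
        rekognition.create_collection(CollectionId=collection_id)

    # Index the face from the image stored in Amazon S3.
    return rekognition.index_faces(
        CollectionId=collection_id,
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        ExternalImageId=external_image_id,
    )
```

Passing the client in as an argument keeps the function easy to test with a stand-in object, and lets the caller decide which Region and credentials to use.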

Web Interface

The Media Analysis Solution features a simple static web interface that makes it easier to upload files, index faces, browse media files, and view detailed search results. This web interface can be used as a reference for building your own media analysis applications. The interface uses Amazon Cognito for user authentication, an Amazon API Gateway RESTful API for search and metadata retrieval, and AWS Amplify for interacting with cloud services. Its web assets are hosted in an Amazon S3 bucket.

When authenticated users upload files through the interface, the files are stored in private folders that correspond to their unique Amazon Cognito identifier to ensure fine-grained access control using AWS Identity and Access Management (IAM) policies.
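One common way to scope access to a per-user folder is an IAM policy that uses the `${cognito-identity.amazonaws.com:sub}` policy variable, which resolves to the caller's Amazon Cognito identity ID. The sketch below builds such a policy in Python. It is an assumption about how this kind of control can be expressed, not the solution's actual policy: the bucket name, the `private_folder_policy` helper, and the chosen actions are hypothetical.

```python
import json


def private_folder_policy(bucket_name):
    """Build a policy limiting access to the caller's own private folder."""
    # The policy variable is left unresolved on purpose; IAM substitutes
    # the caller's Cognito identity ID when it evaluates the policy.
    resource = (
        "arn:aws:s3:::"
        + bucket_name
        + "/private/${cognito-identity.amazonaws.com:sub}/*"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [resource],
            }
        ],
    }


# Hypothetical bucket name, for illustration only.
policy = private_folder_policy("example-media-analysis-bucket")
print(json.dumps(policy, indent=2))
```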

Customization

To upload files with existing tools and applications, or with tools you develop yourself, follow these Amazon S3 naming conventions.

File Type | Convention
Media files | private/<Amazon Cognito Identity ID>/media/<UUID v4>/content/filename.ext
Raw extracted metadata | private/<Amazon Cognito Identity ID>/media/<UUID v4>/results/filename.ext
Face images (to be added to your Amazon Rekognition collection) | private/<Amazon Cognito Identity ID>/collection/<UUID v4>/content/externalImageId.ext
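For custom upload tooling, the conventions above can be captured in a few small helpers. This is a minimal Python sketch; the function names and parameters are assumptions, and the UUID is generated with the standard library's `uuid.uuid4()` when one is not supplied.

```python
import uuid


def media_key(identity_id, filename, object_id=None):
    """Key for an uploaded media file."""
    object_id = object_id or str(uuid.uuid4())
    return f"private/{identity_id}/media/{object_id}/content/{filename}"


def results_key(identity_id, object_id, filename):
    """Key for raw extracted metadata belonging to an existing media object."""
    return f"private/{identity_id}/media/{object_id}/results/{filename}"


def face_image_key(identity_id, external_image_id, extension, object_id=None):
    """Key for a face image to be added to the Rekognition collection."""
    object_id = object_id or str(uuid.uuid4())
    return (
        f"private/{identity_id}/collection/{object_id}"
        f"/content/{external_image_id}.{extension}"
    )
```

Note that in the face image convention the file name doubles as the external image ID, so `face_image_key` takes the ID and extension separately rather than a full file name.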