Media Analysis Solution

Appendix B: Media Analysis State Machine

The Media Analysis Solution coordinates the analysis of media files using an AWS Step Functions state machine that triggers an AWS Lambda function to orchestrate the analysis and extraction of metadata using managed Artificial Intelligence (AI) services. When a new media file is uploaded to Amazon Simple Storage Service (Amazon S3), a Lambda function is invoked that parses the event details and starts the state machine using the following input:

{
  "Records": [
    { "eventSource": "media-analysis" }
  ],
  "upload_time": "2018-05-16T10:00:00.000Z",
  "key": "private/<Cognito-Identity-Id>/media/<object-id>/content/filename.ext",
  "file_type": "ext",
  "size": 50000,
  "owner_id": "<Cognito-Identity-Id>",
  "object_id": "<object-id>",
  "file_name": "image_name.ext"
}
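As a rough illustration of how the invoking Lambda function might derive these fields, the sketch below builds the input from an S3 object key. The function name build_state_machine_input, the strict six-segment key check, and the lowercasing of the extension are assumptions made for this example; they are not taken from the solution's source code.

```python
from datetime import datetime, timezone
from typing import Optional


def build_state_machine_input(key: str, size: int, upload_time: Optional[str] = None) -> dict:
    """Derive the state machine input fields from an S3 object key (hypothetical helper)."""
    parts = key.split("/")
    # Assumed layout: private/<owner-id>/media/<object-id>/content/<file name>
    if len(parts) != 6 or parts[0] != "private" or parts[2] != "media" or parts[4] != "content":
        raise ValueError(f"unexpected key layout: {key}")
    file_name = parts[5]
    if "." not in file_name:
        raise ValueError(f"file name has no extension: {file_name}")
    # Assumption: the extension is normalized to lowercase before routing.
    file_type = file_name.rsplit(".", 1)[1].lower()
    if upload_time is None:
        # ISO 8601 UTC timestamp with millisecond precision, e.g. 2018-05-16T10:00:00.000Z
        now = datetime.now(timezone.utc)
        upload_time = now.isoformat(timespec="milliseconds").replace("+00:00", "Z")
    return {
        "Records": [{"eventSource": "media-analysis"}],
        "upload_time": upload_time,
        "key": key,
        "file_type": file_type,
        "size": size,
        "owner_id": parts[1],
        "object_id": parts[3],
        "file_name": file_name,
    }
```

The resulting dictionary could then be serialized with json.dumps and supplied as the execution input when the state machine is started.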

Depending on the format of the file uploaded, the state machine will execute image, video, and/or audio analysis.
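The routing decision can be sketched as a simple lookup on file_type, using the formats listed in the Images, Videos, and Audio sections of this appendix. The function name analyses_for and the set names are hypothetical. In the solution itself, this choice is made inside the state machine rather than in standalone code.

```python
from typing import List

# File types taken from the Images, Videos, and Audio sections of this appendix.
IMAGE_TYPES = {"png", "jpg", "jpeg"}
VIDEO_TYPES = {"mov", "mp4"}
AUDIO_TYPES = {"mp3", "mp4", "wav", "flac"}


def analyses_for(file_type: str) -> List[str]:
    """Return the analyses that apply to a file type (hypothetical helper)."""
    ext = file_type.lower()
    selected = []
    if ext in IMAGE_TYPES:
        selected.append("image")
    if ext in VIDEO_TYPES:
        selected.append("video")
    if ext in AUDIO_TYPES:
        selected.append("audio")
    return selected
```

Because MP4 appears in both the video and the audio lists, an MP4 file selects both analyses, which is why the text says "and/or".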

The solution uses Pass states to tell Task states which type of analysis the Lambda function should perform. By default, a Pass state passes its input to its output. Here, each Pass state also uses its Result and ResultPath fields to write parameters for the Lambda function to the $.lambda field of its output. When invoked, the Lambda function reads this field and runs the analysis specified by $.lambda.service_name and $.lambda.function_name. The following snippet coordinates label detection in the image analysis state machine:

{
  "Image-Label Params": {
    "Type": "Pass",
    "Result": {
      "service_name": "image",
      "function_name": "get_labels"
    },
    "ResultPath": "$.lambda",
    "Next": "Image-Get Labels"
  },
  "Image-Get Labels": {
    "Type": "Task",
    "Resource": "arn:aws:lambda:us-east-1:xxxxxxxxxxxx:function:name",
    "InputPath": "$",
    "ResultPath": "$.results.labels",
    "End": true
  }
}
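To make the hand-off concrete, the following sketch mimics the two steps in plain Python: a helper that applies the Pass state's Result at $.lambda, and a handler that dispatches on service_name and function_name. The names apply_pass_state, HANDLERS, and handler, as well as the placeholder get_labels body, are assumptions for illustration and do not reproduce the solution's actual Lambda code.

```python
import copy


def apply_pass_state(state_input: dict, result: dict) -> dict:
    """Mimic a Pass state with ResultPath "$.lambda": copy the input and set its lambda field."""
    output = copy.deepcopy(state_input)
    output["lambda"] = result
    return output


def get_labels(event: dict) -> dict:
    """Placeholder analysis; a real implementation would call a label detection service."""
    return {"labels": [], "key": event.get("key")}


# Hypothetical registry mapping (service_name, function_name) to an analysis function.
HANDLERS = {
    ("image", "get_labels"): get_labels,
}


def handler(event: dict, context=None) -> dict:
    """Dispatch to the analysis named by the lambda field of the event."""
    params = event.get("lambda", {})
    route = (params.get("service_name"), params.get("function_name"))
    if route not in HANDLERS:
        raise ValueError(f"unsupported analysis: {route}")
    return HANDLERS[route](event)
```

In the snippet above, the Task state's ResultPath then places whatever the function returns at $.results.labels, so the original input is preserved alongside the result.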

Images

If the file_type is PNG, JPG, or JPEG, the state machine enters a parallel state that automatically coordinates the analysis of the image using the synchronous Amazon Rekognition API.



Figure 2: Image analysis state machine

Each branch of the image analysis state machine starts the analysis job and immediately processes and stores the results.
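The behavior of a parallel state can be approximated in Python as follows: every branch receives its own copy of the input, and the outputs are collected in branch order. The branch functions and their placeholder results are hypothetical. Only label detection is named in this appendix, and the faces branch is included purely to show a second branch.

```python
import copy
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def detect_labels_branch(state: dict) -> dict:
    """Hypothetical branch: analyze, then store the result immediately."""
    state.setdefault("results", {})["labels"] = ["placeholder-label"]
    return state


def detect_faces_branch(state: dict) -> dict:
    """Hypothetical second branch, included only for illustration."""
    state.setdefault("results", {})["faces"] = []
    return state


def run_parallel(branches: List[Callable[[dict], dict]], state_input: dict) -> List[dict]:
    """Run each branch on its own copy of the input; return outputs in branch order."""
    with ThreadPoolExecutor(max_workers=max(1, len(branches))) as pool:
        futures = [pool.submit(branch, copy.deepcopy(state_input)) for branch in branches]
        return [future.result() for future in futures]
```

Because the image API is synchronous, each branch can finish its work in a single step, with no waiting or polling between starting the analysis and storing the results.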

Videos

If the file_type is MOV or MP4, the state machine enters a parallel state that automatically coordinates the analysis of the video using the asynchronous Amazon Rekognition Video API.



Figure 3: Video analysis state machine

Each branch of the video analysis state machine starts the analysis job, then uses the job status poller pattern to check on the status of the video analysis job. Once the analysis is complete, the state machine retrieves, processes, and stores the results.
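The job status poller pattern amounts to a check, wait, and repeat loop. In a state machine it is typically built from Wait, Task, and Choice states, and the Python sketch below only illustrates the control flow. The function name wait_for_job, the status strings, and the default interval and attempt limits are assumptions for this example.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[str], str],
    job_id: str,
    interval_seconds: float = 10.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a job until it succeeds, fails, or the attempt limit is reached."""
    for _ in range(max_attempts):
        status = get_status(job_id)
        if status == "SUCCEEDED":   # assumed terminal success status
            return status
        if status == "FAILED":      # assumed terminal failure status
            raise RuntimeError(f"job {job_id} failed")
        sleep(interval_seconds)     # corresponds to a Wait state
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} checks")
```

Passing get_status and sleep as parameters keeps the loop independent of any particular service client and makes it straightforward to exercise without real delays.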

Audio

If the file_type is MP3, MP4, WAV, or FLAC, the state machine automatically coordinates the analysis of the audio using the asynchronous Amazon Transcribe API.



Figure 4: Audio analysis state machine

The audio analysis state machine starts the analysis job, then uses the job status poller pattern to check on the status of the audio analysis job. Once the analysis is complete, the state machine retrieves, processes, and stores the results. It then enters a parallel state that automatically coordinates the detection of key entities and phrases in the resulting transcript using the asynchronous Amazon Comprehend API. The results are immediately processed and stored.
