Architecture Overview - Live Streaming with Automated Multi-Language Subtitling

Architecture Overview

Deploying this solution builds the following environment in the AWS Cloud.


Figure 1: Live Streaming with Automated Multi-Language Subtitling solution architecture

The solution’s AWS CloudFormation template deploys Live Streaming on AWS, which includes AWS Elemental MediaLive, MediaPackage, and Amazon CloudFront; Amazon Simple Storage Service (Amazon S3) buckets; Amazon Transcribe; Amazon Translate; and two AWS Lambda functions: one function (TranscribeStreaming) that converts audio to text and one function (CaptionCreation) that generates WebVTT subtitles that are sent to MediaPackage.

The subtitle generation process starts when MediaLive output is sent to the solution’s Amazon S3 bucket. The CaptionCreation Lambda function reads the manifest files from the bucket, extracts unsigned pulse-code modulation (PCM) audio from the TS video segments, and saves the PCM audio to Amazon S3. Then, the CaptionCreation function invokes the TranscribeStreaming function and passes it the PCM audio.
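To make the manifest step concrete, the following is a minimal sketch of listing the TS segments named in a manifest. It assumes the manifests are HLS media playlists (which the TS segments suggest but this page does not state); the `Segment` type, the `parse_media_playlist` function, and the example file names are hypothetical and are not taken from the solution's source code. Audio extraction itself is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    uri: str         # segment path as written in the manifest
    duration: float  # segment length in seconds, taken from #EXTINF


def parse_media_playlist(text: str) -> List[Segment]:
    """Return the segments listed in an HLS media playlist, in order.

    Assumption: the manifest is a plain HLS media playlist in which each
    segment URI is preceded by an #EXTINF tag.
    """
    segments: List[Segment] = []
    pending_duration: Optional[float] = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXTINF:"):
            # Tag format: "#EXTINF:<duration>,[<title>]"
            value = line[len("#EXTINF:"):].split(",", 1)[0]
            pending_duration = float(value)
        elif line and not line.startswith("#") and pending_duration is not None:
            segments.append(Segment(uri=line, duration=pending_duration))
            pending_duration = None
    return segments


# Hypothetical manifest content for illustration only.
EXAMPLE_MANIFEST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:42
#EXTINF:6.000,
channel_00042.ts
#EXTINF:6.000,
channel_00043.ts
"""

for segment in parse_media_playlist(EXAMPLE_MANIFEST):
    print(segment.uri, segment.duration)
```

Each returned segment could then be fetched from the bucket and handed to an audio extraction step.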

The TranscribeStreaming function uses Amazon Transcribe streaming transcription to convert the audio stream to text in real time. The function then sends the transcript back to the CaptionCreation function. If multiple languages are required, the CaptionCreation function calls Amazon Translate to translate the transcript.
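The translation step can be sketched as follows. This is an illustration under stated assumptions, not the solution's implementation: `translate_transcript` and `FakeTranslateClient` are hypothetical names. The function expects a client object that exposes `translate_text` with the same keyword arguments and response key as the Amazon Translate client in boto3; a stand-in client is used here so the example runs without AWS credentials.

```python
from typing import Any, Dict, Iterable


def translate_transcript(client: Any, text: str, source_lang: str,
                         target_langs: Iterable[str]) -> Dict[str, str]:
    """Return {language_code: text}, including the untranslated source.

    Assumption: `client` behaves like boto3.client("translate").
    """
    results = {source_lang: text}
    for lang in target_langs:
        if lang == source_lang:
            continue  # nothing to translate for the source language
        response = client.translate_text(
            Text=text,
            SourceLanguageCode=source_lang,
            TargetLanguageCode=lang,
        )
        results[lang] = response["TranslatedText"]
    return results


class FakeTranslateClient:
    """Hypothetical offline stand-in that tags text instead of translating."""

    def translate_text(self, Text, SourceLanguageCode, TargetLanguageCode):
        return {"TranslatedText": f"[{TargetLanguageCode}] {Text}"}


captions = translate_transcript(
    FakeTranslateClient(), "Welcome to the stream", "en", ["es", "de"]
)
print(captions)
```

Passing the client in as a parameter keeps the function easy to test; in a deployed function, a real Amazon Translate client would take the place of the stand-in.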

The CaptionCreation function creates the WebVTT subtitle files and the manifests, and sends them, along with the video files, to MediaPackage.
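As an illustration of what a WebVTT subtitle file contains, the sketch below builds one from timed text. The function names and the cue tuples are hypothetical, and the sketch leaves out segment-level details such as aligning cue times with the video timeline; it is not the solution's own code.

```python
from typing import Iterable, Tuple


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_webvtt(cues: Iterable[Tuple[float, float, str]]) -> str:
    """Build a WebVTT document from (start_seconds, end_seconds, text) cues."""
    lines = ["WEBVTT", ""]  # required header followed by a blank line
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


# Hypothetical cues for illustration only.
print(build_webvtt([
    (0.0, 2.5, "Welcome to the stream."),
    (2.5, 5.0, "We will begin shortly."),
]))
```

One such file would be produced per language, using the transcript or its translation as the cue text.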

MediaPackage ingests the files and packages them into formats that are delivered to four MediaPackage custom endpoints.

An Amazon CloudFront distribution is configured to use the MediaPackage custom endpoints as its origin. The CloudFront distribution delivers your live stream to viewers with low latency and high transfer speeds.