Build a video processing pipeline by using Amazon Kinesis Video Streams and AWS Fargate - AWS Prescriptive Guidance

Created by Piotr Chotkowski (AWS) and Pushparaju Thangavel (AWS)

Environment: PoC or pilot

Technologies: Analytics; Media services

AWS services: AWS Fargate; Amazon Kinesis; Amazon S3

Summary

This pattern demonstrates how to use Amazon Kinesis Video Streams and AWS Fargate to extract frames from a video stream and store them as image files in Amazon Simple Storage Service (Amazon S3) for further processing.

The pattern provides a sample application in the form of a Java Maven project. This application defines the AWS infrastructure by using the AWS Cloud Development Kit (AWS CDK). Both the frame processing logic and the infrastructure definitions are written in the Java programming language. You can use this sample application as a basis for developing your own real-time video processing pipeline or to build the video preprocessing step of a machine learning pipeline. 

Prerequisites and limitations

Prerequisites 

Limitations 

This pattern is intended as a proof of concept, or as a basis for further development. It should not be used in its current form in production deployments.

Product versions

  • This pattern was tested with the AWS CDK version 1.77.0 (see AWS CDK versions)

  • JDK 11

  • AWS CLI version 2

Architecture

Target technology stack

  • Amazon Kinesis Video Streams

  • AWS Fargate task

  • Amazon Simple Queue Service (Amazon SQS) queue

  • Amazon S3 bucket

Target architecture

Architecture for using Kinesis Video Streams and Fargate to build a video processing pipeline.

The user creates a Kinesis video stream, uploads a video, and sends a JSON message that contains details about the input Kinesis video stream and the output S3 bucket to an SQS queue. AWS Fargate, which is running the main application in a container, pulls the message from the SQS queue and starts extracting frames. Each frame is saved in an image file and stored in the target S3 bucket.
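The flow described above (pull a message, extract frames, write one object per frame) can be sketched with in-memory stand-ins. This is a minimal illustration, not the sample application's code: the `FrameSource` and `ObjectStore` interfaces, the `Request` class, the `frame-N.png` key layout, and the example ARN and bucket name are all assumptions made for this sketch. In the real pipeline, the queue would be Amazon SQS, the frame source would read from the Kinesis video stream, and the store would be Amazon S3.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class FramePipelineSketch {

    /** The three fields carried by the JSON message in this pattern. */
    public static final class Request {
        final String streamArn;
        final String bucket;
        final String s3Directory;

        Request(String streamArn, String bucket, String s3Directory) {
            this.streamArn = streamArn;
            this.bucket = bucket;
            this.s3Directory = s3Directory;
        }
    }

    /** Hypothetical seam: yields the frames of a stream as encoded image bytes. */
    public interface FrameSource {
        List<byte[]> frames(String streamArn);
    }

    /** Hypothetical seam: stores one object under bucket/key. */
    public interface ObjectStore {
        void put(String bucket, String key, byte[] body);
    }

    /** Drains the queue; for each request, writes every frame as one object. */
    public static int drain(Queue<Request> queue, FrameSource source, ObjectStore store) {
        int written = 0;
        Request request;
        while ((request = queue.poll()) != null) {
            int index = 0;
            for (byte[] frame : source.frames(request.streamArn)) {
                // The key layout is an assumption made for this sketch only.
                String key = request.s3Directory + "/frame-" + index + ".png";
                store.put(request.bucket, key, frame);
                index++;
                written++;
            }
        }
        return written;
    }

    /** Runs the sketch against in-memory fakes and returns the stored object paths. */
    public static List<String> demo() {
        Queue<Request> queue = new ArrayDeque<>();
        queue.add(new Request("example-stream-arn", "example-bucket", "test-output"));

        List<String> storedPaths = new ArrayList<>();
        // Fake source: every stream yields three one-byte "frames".
        FrameSource threeFrames = arn -> List.of(new byte[] {1}, new byte[] {2}, new byte[] {3});
        // Fake store: records the path instead of uploading anything.
        ObjectStore inMemory = (bucket, key, body) -> storedPaths.add(bucket + "/" + key);

        drain(queue, threeFrames, inMemory);
        return storedPaths;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Keeping the source and the store behind small interfaces is one way to test the loop without AWS credentials; the sample project may be structured differently.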

Automation and scale

The sample application can scale both horizontally and vertically within a single AWS Region. Horizontal scaling can be achieved by increasing the number of deployed AWS Fargate tasks that read from the SQS queue. Vertical scaling can be achieved by increasing the number of frame-splitting and image-publishing threads in the application. These settings are passed as environment variables to the application in the definition of the QueueProcessingFargateService resource in the AWS CDK. Because the infrastructure is defined as an AWS CDK stack, you can also deploy the same application to other AWS Regions and accounts.
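As a rough illustration of the vertical-scaling idea, the following sketch reads a thread count from environment variables and uses it to size a fixed thread pool. The variable names `FRAME_SPLITTER_THREADS` and `IMAGE_PUBLISHER_THREADS` and the fallback values are hypothetical; check the sample project for the names that the application actually reads.

```java
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class WorkerPoolSketch {

    // Hypothetical variable names, chosen for this sketch only.
    public static final String SPLITTER_THREADS = "FRAME_SPLITTER_THREADS";
    public static final String PUBLISHER_THREADS = "IMAGE_PUBLISHER_THREADS";

    /** Reads a positive thread count from the given environment map, or returns the fallback. */
    public static int threadCount(Map<String, String> env, String name, int fallback) {
        String raw = env.get(name);
        if (raw == null || raw.trim().isEmpty()) {
            return fallback;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    /** Runs the given number of placeholder jobs on a fixed pool and returns how many completed. */
    public static int runOnPool(int threads, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            // Placeholder for "publish one image"; real work would go here.
            pool.execute(done::incrementAndGet);
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return done.get();
    }

    public static void main(String[] args) {
        int splitters = threadCount(System.getenv(), SPLITTER_THREADS, 1);
        int publishers = threadCount(System.getenv(), PUBLISHER_THREADS, 2);
        System.out.println("splitter threads: " + splitters);
        System.out.println("publisher threads: " + publishers);
        System.out.println("completed jobs: " + runOnPool(publishers, 100));
    }
}
```

Falling back to a default when a variable is missing or malformed keeps the task from failing at startup on a configuration typo. Whether that is preferable to failing fast depends on your deployment.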

Tools

Tools

  • AWS CDK is a software development framework for defining your cloud infrastructure and resources by using programming languages such as TypeScript, JavaScript, Python, Java, and C#/.NET.

  • Amazon Kinesis Video Streams is a fully managed AWS service that you can use to stream live video from devices to the AWS Cloud, or build applications for real-time video processing or batch-oriented video analytics.

  • AWS Fargate is a serverless compute engine for containers. Fargate removes the need to provision and manage servers, and lets you focus on developing your applications.

  • Amazon S3 is an object storage service that offers scalability, data availability, security, and performance.

  • Amazon SQS is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications.

Code

  • A .zip file of the sample application project (frame-splitter-code.zip) is attached.

Epics

Task | Description | Skills required

Start the Docker daemon.

Start the Docker daemon on your local system. The AWS CDK uses Docker to build the image that is used in the AWS Fargate task. You must run Docker before you proceed to the next step.

Developer, DevOps engineer

Build the project.

Download the frame-splitter-code sample application (attached) and extract its contents into a folder on your local machine. Before you can deploy the infrastructure, you have to build the Java Maven project. At a command prompt, navigate to the root directory of the project, and build the project by running the command: 

mvn clean install

Developer, DevOps engineer

Bootstrap the AWS CDK.

(First-time AWS CDK users only) If this is the first time you’re using the AWS CDK, you might have to bootstrap the environment by running the AWS CDK CLI command:

cdk bootstrap --profile "$AWS_PROFILE_NAME"

where $AWS_PROFILE_NAME holds the name of the AWS profile from your AWS credentials. Or, you can remove this parameter to use the default profile. For more information, see the AWS CDK documentation.

Developer, DevOps engineer

Deploy the AWS CDK stack.

In this step, you create the required infrastructure resources (SQS queue, S3 bucket, AWS Fargate task definition) in your AWS account, build the Docker image that is required for the AWS Fargate task, and deploy the application. At a command prompt, navigate to the root directory of the project, and run the command:

cdk deploy --profile "$AWS_PROFILE_NAME" --all

where $AWS_PROFILE_NAME holds the name of the AWS profile from your AWS credentials. Or, you can remove this parameter to use the default profile. Confirm the deployment. Note the QueueUrl and Bucket values from the CDK deployment output; you will need these in later steps. The AWS CDK creates the assets, uploads them to your AWS account, and creates all infrastructure resources. You can observe the resource creation process in the AWS CloudFormation console. For more information, see the AWS CloudFormation documentation and the AWS CDK documentation.

Developer, DevOps engineer

Create a video stream.

In this step, you create a Kinesis video stream that will serve as an input stream for video processing. Make sure that you have the AWS CLI installed and configured. In the AWS CLI, run:

aws kinesisvideo --profile "$AWS_PROFILE_NAME" create-stream --stream-name "$STREAM_NAME" --data-retention-in-hours "24"

where $AWS_PROFILE_NAME holds the name of the AWS profile from your AWS credentials (or remove this parameter to use the default profile) and $STREAM_NAME is any valid stream name. 

Alternatively, you can create a video stream in the Kinesis console by following the steps in the Kinesis Video Streams documentation. Note the Amazon Resource Name (ARN) of the created stream; you will need it later.

Developer, DevOps engineer

Task | Description | Skills required

Upload the video to the stream.

In the project folder for the sample frame-splitter-code application, open the ProcessingTaskTest.java file in the src/test/java/amazon/awscdk/examples/splitter folder. Replace the values of the profileName and streamName variables with the values you used in the previous steps. To upload the example video to the Kinesis video stream you created in the previous step, run the following test:

amazon.awscdk.examples.splitter.ProcessingTaskTest#testExample test

Alternatively, you can upload your video by using one of the methods described in the Kinesis Video Streams documentation.

Developer, DevOps engineer

Initiate video processing.

Now that you have uploaded a video to the Kinesis video stream, you can start processing it. To initiate the processing logic, you have to send a message with details to the SQS queue that the AWS CDK created during deployment. To send a message by using the AWS CLI, run:

aws sqs --profile "$AWS_PROFILE_NAME" send-message --queue-url QUEUE_URL --message-body MESSAGE

where $AWS_PROFILE_NAME holds the name of the AWS profile from your AWS credentials (remove this parameter to use the default profile), QUEUE_URL is the QueueUrl value from the AWS CDK output, and MESSAGE is a JSON string in the following format: 

{ "streamARN": "STREAM_ARN", "bucket": "BUCKET_NAME", "s3Directory": "test-output" }

where STREAM_ARN is the ARN of the video stream you created in an earlier step and BUCKET_NAME is the Bucket value from the AWS CDK output.

Sending this message initiates video processing. Alternatively, you can send a message by using the Amazon SQS console, as described in the Amazon SQS documentation.

Developer, DevOps engineer

View images of the video frames.

You can see the resulting images in the S3 output bucket s3://BUCKET_NAME/test-output where BUCKET_NAME is the Bucket value from the AWS CDK output.

Developer, DevOps engineer
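The JSON message from the "Initiate video processing" task and the output location from the "View images of the video frames" task can be assembled programmatically. The following is a minimal sketch that uses only the Java standard library. The class and method names are hypothetical; the field names (streamARN, bucket, s3Directory) and the s3://BUCKET_NAME/s3Directory layout come from the steps above.

```java
public class ProcessingMessage {

    /** Escapes the characters that must be escaped inside a JSON string. */
    public static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become four-digit hex escapes.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    /** Builds the message body with the three fields that the pattern describes. */
    public static String toJson(String streamArn, String bucket, String s3Directory) {
        return "{\"streamARN\":\"" + escape(streamArn)
            + "\",\"bucket\":\"" + escape(bucket)
            + "\",\"s3Directory\":\"" + escape(s3Directory) + "\"}";
    }

    /** Returns the S3 location where the frames for this message are expected. */
    public static String outputUri(String bucket, String s3Directory) {
        return "s3://" + bucket + "/" + s3Directory;
    }

    public static void main(String[] args) {
        // Placeholders: substitute the stream ARN and the Bucket value from the CDK output.
        System.out.println(toJson("STREAM_ARN", "BUCKET_NAME", "test-output"));
        System.out.println(outputUri("BUCKET_NAME", "test-output"));
    }
}
```

The string returned by `toJson` is what you would pass as the `--message-body` value. In a larger application, a JSON library would be a more robust choice than hand-built strings.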

Related resources

Additional information

Choosing an IDE

We recommend that you use your favorite Java IDE to build and explore this project.  

Cleaning up

After you finish running this example, remove all deployed resources to avoid incurring additional AWS infrastructure costs. 

To remove the infrastructure and the video stream, run these two commands at a command prompt (the first uses the AWS CDK CLI, and the second uses the AWS CLI):

cdk destroy --profile "$AWS_PROFILE_NAME" --all
aws kinesisvideo --profile "$AWS_PROFILE_NAME" delete-stream --stream-arn "$STREAM_ARN"

Alternatively, you can remove the resources manually by using the AWS CloudFormation console to remove the AWS CloudFormation stack, and the Kinesis console to remove the Kinesis video stream. Note that cdk destroy doesn’t remove the output S3 bucket or the images in Amazon Elastic Container Registry (Amazon ECR) repositories (aws-cdk/assets). You have to remove them manually.

Attachments

To access additional content that is associated with this document, unzip the following file: attachment.zip