Option 2: Deploy the aws-streaming-data-solution-for-kinesis-using-kpl-and-kinesis-data-analytics CloudFormation template - Streaming Data Solution for Amazon Kinesis

Before you launch this template, review the architecture and other considerations discussed in this guide. Follow the step-by-step instructions in this section to configure and deploy the solution into your account.

Time to deploy: Approximately 10 minutes

Deployment overview

Use the following steps to deploy this solution on AWS. For detailed instructions, follow the links for each step.

Step 1. Launch the Stack

  • Launch the AWS CloudFormation template into your AWS account.

  • Review the template parameters, and adjust them if necessary.

Step 2. Post-configuration steps

Step 1. Launch the Stack

Note

You are responsible for the cost of the AWS services used while running this solution. Refer to the Cost section for more details. For full details, refer to the pricing webpage for each AWS service used in this solution.

  1. Sign in to the AWS Management Console and use the button below to launch the streaming-data-solution-for-kinesis-using-kpl-and-kinesis-data-analytics AWS CloudFormation template.

    [Launch button: Streaming Data Solution for Amazon Kinesis]

    Alternatively, you can download the template as a starting point for your own implementation.

  2. The template launches in the US East (N. Virginia) Region by default. To launch this solution in a different AWS Region, use the Region selector in the console navigation bar.

    Note

    This template uses Amazon Managed Service for Apache Flink, which is not currently available in all AWS Regions. You must launch this solution in an AWS Region where Managed Service for Apache Flink is available. For the most current availability by Region, refer to the AWS Service Region Table.

  3. On the Create stack page, verify that the correct template URL shows in the Amazon S3 URL text box and choose Next.

  4. On the Specify stack details page, assign a name to your solution stack. For information about naming character limitations, refer to IAM and STS Limits in the AWS Identity and Access Management User Guide.

  5. Under Parameters, review the parameters for the template and modify them as necessary. This solution uses the following default values.

    Each entry lists the parameter label (with the template parameter name in parentheses), its default value, and a description.

    Amazon Kinesis Producer Library (KPL) configuration

    Parameter: VPC where the KPL instance should be launched (ProducerVpcId)
    Default: <Requires input>
    Description: VPC where the KPL instance is launched.

    Parameter: Subnet where the KPL instance should be launched (ProducerSubnetId)
    Default: <Requires input>
    Description: Subnet where the KPL instance is launched. The subnet requires access to Kinesis Data Streams, through either an internet gateway (IGW) or a NAT.

    Parameter: Amazon Machine Image for the KPL instance (ProducerAmId)
    Default: /aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2
    Description: Amazon Machine Image (AMI) ID for the KPL instance.

    Amazon Kinesis Data Streams configuration

    Parameter: Number of open shards (ShardCount)
    Default: 2
    Description: The number of shards that the stream uses. The allowed range is from 1 to 100 shards.

    Parameter: Data retention period (hours) (RetentionHours)
    Default: 24
    Description: The number of hours that data records stored in shards remain accessible. The allowed range is from 24 to 8760 hours.

    Parameter: Enable enhanced (shard-level) metrics (EnableEnhancedMonitoring)
    Default: false
    Description: Choose whether to activate enhanced monitoring for shard-level metrics. This function is turned off by default.

    Amazon Managed Service for Apache Flink configuration

    Parameter: Monitoring log level (LogLevel)
    Default: INFO
    Description: The level of detail of the CloudWatch Logs for an application. The available options are DEBUG, ERROR, INFO, and WARN. For information about choosing a log level, refer to Application Monitoring Levels in the Amazon Managed Service for Apache Flink Developer Guide.

    Parameter: Comma-separated list of subnet ids for VPC connectivity (ApplicationSubnetIds)
    Default: <Optional input>
    Description: If subnet IDs are provided, then security groups must also be included.

    Parameter: Comma-separated list of security groups ids for VPC connectivity (ApplicationSecurityGroupIds)
    Default: <Optional input>
    Description: If security group IDs are provided, then subnets must also be included.

  6. Choose Next.

  7. On the Configure stack options page, choose Next.

  8. On the Review page, review and confirm the settings. Check the box acknowledging that the template will create AWS Identity and Access Management (IAM) resources.

  9. Choose Create stack to deploy the stack.

    You can view the status of the stack in the AWS CloudFormation console in the Status column. You should receive a CREATE_COMPLETE status in approximately 10 minutes.
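
For scripted deployments, the parameter rules in step 5 can be checked before the stack is launched. The following is a minimal Python sketch, not part of the solution itself: the function name build_parameters and its argument names are hypothetical, while the parameter keys, defaults, and allowed ranges are taken from the parameter list above. ProducerAmId is omitted so that the template default applies.

```python
from typing import Dict, List, Optional

# Log levels listed for the LogLevel parameter in step 5.
LOG_LEVELS = {"DEBUG", "ERROR", "INFO", "WARN"}


def build_parameters(
    producer_vpc_id: str,
    producer_subnet_id: str,
    shard_count: int = 2,
    retention_hours: int = 24,
    enable_enhanced_monitoring: bool = False,
    log_level: str = "INFO",
    application_subnet_ids: Optional[List[str]] = None,
    application_security_group_ids: Optional[List[str]] = None,
) -> List[Dict[str, str]]:
    """Validate values against the documented ranges and return them in the
    ParameterKey/ParameterValue shape used by CloudFormation."""
    if not producer_vpc_id or not producer_subnet_id:
        raise ValueError("ProducerVpcId and ProducerSubnetId require input")
    if not 1 <= shard_count <= 100:
        raise ValueError("ShardCount must be between 1 and 100")
    if not 24 <= retention_hours <= 8760:
        raise ValueError("RetentionHours must be between 24 and 8760")
    if log_level not in LOG_LEVELS:
        raise ValueError("LogLevel must be one of DEBUG, ERROR, INFO, WARN")
    # Subnet IDs and security group IDs must be provided together or not at all.
    if bool(application_subnet_ids) != bool(application_security_group_ids):
        raise ValueError(
            "ApplicationSubnetIds and ApplicationSecurityGroupIds "
            "must be provided together"
        )

    values = {
        "ProducerVpcId": producer_vpc_id,
        "ProducerSubnetId": producer_subnet_id,
        "ShardCount": str(shard_count),
        "RetentionHours": str(retention_hours),
        "EnableEnhancedMonitoring": "true" if enable_enhanced_monitoring else "false",
        "LogLevel": log_level,
    }
    if application_subnet_ids and application_security_group_ids:
        values["ApplicationSubnetIds"] = ",".join(application_subnet_ids)
        values["ApplicationSecurityGroupIds"] = ",".join(
            application_security_group_ids
        )
    return [{"ParameterKey": k, "ParameterValue": v} for k, v in values.items()]
```

The returned list has the shape accepted by the Parameters argument of the boto3 CloudFormation client's create_stack call. The template URL, the stack name, and the IAM capability to acknowledge are not stated in this section and would need to be supplied for your own deployment.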

Note

This solution includes the solution-helper Lambda function, which runs only during initial configuration. This function is only created if you start the collection of operational metrics.
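
The CREATE_COMPLETE check described in step 9 can also be automated. The sketch below is one possible approach, not the solution's own tooling: wait_for_stack and its arguments are hypothetical names, and the status lookup is passed in as a callable so the polling logic does not depend on any particular SDK. The polling interval and timeout are illustrative values.

```python
import time
from typing import Callable

IN_PROGRESS = "CREATE_IN_PROGRESS"
SUCCESS = "CREATE_COMPLETE"


def wait_for_stack(
    get_status: Callable[[], str],
    poll_seconds: float = 15.0,
    timeout_seconds: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until the stack leaves CREATE_IN_PROGRESS.

    Returns CREATE_COMPLETE on success. Raises RuntimeError for any other
    status (for example CREATE_FAILED or a rollback status) and TimeoutError
    if the stack is still in progress after timeout_seconds.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status == SUCCESS:
            return status
        if status != IN_PROGRESS:
            raise RuntimeError(f"Stack creation did not complete: {status}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"Stack still {status} after {waited:.0f} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Example wiring with boto3 (assumes boto3 is installed and credentials are
# configured; "my-stack" is a placeholder stack name):
#
#   import boto3
#   cfn = boto3.client("cloudformation")
#   wait_for_stack(
#       lambda: cfn.describe_stacks(StackName="my-stack")["Stacks"][0]["StackStatus"]
#   )
```

If you use boto3, its built-in stack_create_complete waiter for CloudFormation covers the same need. The explicit loop is shown here only to make the status transitions visible.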

Step 2. Post-configuration steps

By default, the demo producer and Studio notebook will not run after the stacks are created. Follow these steps to enable them.

  1. Sign in to the Amazon Kinesis console and, from the left menu pane, select Analytics applications.

  2. On the Amazon Managed Service for Apache Flink page, go to the Studio tab, and select Kda<studio-notebook-name>.

  3. Choose Actions, then choose Run application.

  4. Navigate to the AWS Systems Manager console and, from the left menu pane under Instances and Nodes, select Session Manager.

  5. On the AWS Systems Manager page, choose Start session.

  6. On the Start a session page, select the <ec2-instance-id> for the KPL instance and choose Start session.

    Refer to the AWS CloudFormation Outputs tab for the Amazon EC2 instance ID.

  7. In the console window, run the following command to start the demo producer application. Replace <stream-name>, <aws-region>, and <seconds> with your specific information.

    sudo java -jar /tmp/aws-kpl-demo.jar <stream-name> <aws-region> <seconds>
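
If you generate the command in step 7 from a script, a small helper can catch malformed arguments before they reach the instance. This is a hedged sketch: producer_command is a hypothetical helper, the Region check is only a loose format test, and treating <seconds> as a positive integer is an assumption, because this section does not define that argument further. The stream name check follows the Kinesis Data Streams naming rule (1 to 128 letters, digits, underscores, periods, or hyphens).

```python
import re
import shlex
from typing import List

# JAR location taken from the command shown in step 7.
JAR_PATH = "/tmp/aws-kpl-demo.jar"


def producer_command(stream_name: str, aws_region: str, seconds: int) -> List[str]:
    """Return the demo producer command from step 7 as an argument list."""
    if not re.fullmatch(r"[a-zA-Z0-9_.-]{1,128}", stream_name):
        raise ValueError(f"Invalid stream name: {stream_name!r}")
    # Loose shape check only (for example us-east-1); it does not confirm
    # that the Region exists or that the stream is deployed there.
    if not re.fullmatch(r"[a-z]{2}(-[a-z]+)+-\d+", aws_region):
        raise ValueError(f"Invalid Region format: {aws_region!r}")
    # Assumption: <seconds> is a positive whole number.
    if seconds <= 0:
        raise ValueError("seconds must be a positive integer")
    return ["sudo", "java", "-jar", JAR_PATH, stream_name, aws_region, str(seconds)]


if __name__ == "__main__":
    # "my-stream", "us-east-1", and 60 are placeholder values.
    print(shlex.join(producer_command("my-stream", "us-east-1", 60)))
```

Returning an argument list rather than a single string avoids quoting problems if the command is later passed to a process runner. shlex.join turns it back into a shell-ready line for pasting into the Session Manager window.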

You can customize or replace the demo application that is included with this solution to meet your business needs. The source code is available from the solution’s GitHub repository. For information about the demo producer application, and about customizing it or replacing it with your own application, refer to the README.md file in the GitHub repository.