Option 1: Deploy the streaming-data-solution-for-msk CloudFormation template - Streaming Data Solution for Amazon MSK

Before you launch this template, review the architecture and other considerations in this guide. Follow the step-by-step instructions in this section to configure and deploy the solution into your account.

Time to deploy: Approximately 25-30 minutes

Deployment overview

Use the following steps to deploy this solution on AWS. For detailed instructions, follow the links for each step.

Step 1. Launch the stack

  1. Launch the AWS CloudFormation template into your AWS account.

  2. Review the template parameters, and adjust if necessary.

Step 2. (Optional) Create a topic that produces and consumes data

Step 1. Launch the stack

Note

You are responsible for the cost of the AWS services used while running this solution. Refer to the Cost section for more details. For full details, refer to the pricing webpage for each AWS service used in this solution.

  1. Sign in to the AWS Management Console and select the button below to launch the streaming-data-solution-for-msk.template AWS CloudFormation template.

    Streaming Data Solution for Amazon MSK launch button

  2. The template launches in the US East (N. Virginia) Region by default. To launch this solution in a different AWS Region, use the Region selector in the console navigation bar.

    Note

    This template uses Amazon MSK, which is not currently available in all AWS Regions. You must launch this solution in an AWS Region where Amazon MSK is available. For the most current availability by Region, refer to the AWS Regional Services List.

  3. On the Create stack page, verify that the correct template URL shows in the Amazon S3 URL text box and choose Next.

  4. On the Specify stack details page, assign a name to your solution stack. For information about naming character limitations, refer to IAM and STS Limits in the AWS Identity and Access Management User Guide.

  5. Under Parameters, review the parameters for the template and modify them as necessary. This solution uses the following default values.

    | Parameter | Default | Description |
    |---|---|---|
    | Broker configuration | | |
    | Apache Kafka version (KafkaVersion) | 2.8.1 | Apache Kafka version on the brokers. |
    | Number of broker nodes (NumberBrokerNodes) | 3 | Number of broker nodes you want in the cluster (must be a multiple of the number of subnets). |
    | Broker instance type (BrokerInstanceType) | kafka.m5.large | Amazon EC2 instance type that Amazon MSK uses when it creates your brokers. |
    | Monitoring level (MonitoringLevel) | DEFAULT | Level of monitoring for the cluster. The available options are DEFAULT, PER_BROKER, PER_TOPIC_PER_BROKER, and PER_TOPIC_PER_PARTITION. |
    | Amazon EBS storage volume per broker (in GiB) (EbsVolumeSize) | 1000 | Size (in GiB) of the storage volume in each broker node. The allowed range is from 1 to 16384. |
    | Access control configuration | | |
    | Method Amazon MSK uses to authenticate clients (AccessControlMethod) | IAM role-based authentication | The available options are Unauthenticated access, IAM role-based authentication, and SASL/SCRAM authentication. |
    | Networking configuration | | |
    | Cluster VPC (BrokerVpcId) | <Requires input> | VPC where the cluster launches. |
    | Cluster subnets (BrokerSubnetIds) | <Requires input> | List of subnets in which brokers are distributed (must contain 2 or 3 items). |
    | Client configuration | | |
    | Instance type (ClientInstanceType) | t3.small | Instance type for the client instance. |
    | Amazon Machine Image (ClientAmiId) | 1 | Amazon Machine Image (AMI) ID for the client instance. |

  6. Choose Next.

  7. On the Configure stack options page, choose Next.

  8. On the Review page, review and confirm the settings. Check the box acknowledging that the template will create AWS Identity and Access Management (IAM) resources.

  9. Choose Create stack to deploy the stack.

    You can view the status of the stack in the AWS CloudFormation console in the Status column. You should receive a CREATE_COMPLETE status in approximately 25 minutes.

Note

This solution includes the solution-helper Lambda function, which runs only during initial configuration. The function is created only if you turn on the collection of operational metrics. For details, refer to Anonymized data collection.
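
The parameter constraints listed in step 5 (2 or 3 subnets, a broker count that is a multiple of the subnet count, and an EBS volume size from 1 to 16384 GiB) can be checked before you launch the stack. The following is a minimal sketch of such a pre-flight check. The function name validate_broker_params and its messages are hypothetical and are not part of the solution.

```shell
#!/usr/bin/env bash
# Sketch only: pre-flight checks for the broker parameters described in step 5.
# validate_broker_params is a hypothetical helper, not part of the solution.

# Usage: validate_broker_params <number-of-broker-nodes> <ebs-volume-gib> <subnet-id>...
# Prints "ok" and returns 0 when the values satisfy the documented constraints.
validate_broker_params() {
  local nodes="$1" ebs_gib="$2"
  shift 2
  local subnet_count="$#"

  # BrokerSubnetIds must contain 2 or 3 items.
  if [ "$subnet_count" -lt 2 ] || [ "$subnet_count" -gt 3 ]; then
    echo "error: expected 2 or 3 subnets, got $subnet_count"
    return 1
  fi

  # NumberBrokerNodes must be a multiple of the number of subnets.
  if [ "$nodes" -lt 1 ] || [ $((nodes % subnet_count)) -ne 0 ]; then
    echo "error: $nodes broker nodes is not a multiple of $subnet_count subnets"
    return 1
  fi

  # EbsVolumeSize must be between 1 and 16384 GiB.
  if [ "$ebs_gib" -lt 1 ] || [ "$ebs_gib" -gt 16384 ]; then
    echo "error: EBS volume size $ebs_gib GiB is outside the range 1 to 16384"
    return 1
  fi

  echo "ok"
}
```

For example, validate_broker_params 3 1000 subnet-a subnet-b subnet-c passes with the default values, while the same call with only two subnets fails because 3 is not a multiple of 2. If you deploy from a terminal rather than the console, the AWS CLI command aws cloudformation wait stack-create-complete --stack-name <stack-name> is one way to block until the stack reaches CREATE_COMPLETE instead of watching the Status column.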

Step 2. (Optional) Create a topic that produces and consumes data

After the stack is created, you can use the Amazon EC2 client instance to interact with the Amazon MSK cluster.

  1. Sign in to the Amazon MSK console and, from the left menu pane, select Clusters.

  2. On the Amazon MSK page, select kafka-cluster-<account-id>.

  3. Choose View client information, and then copy the values for ZooKeeper connection and Bootstrap servers.

  4. Navigate to the AWS Systems Manager console and, from the left menu pane under Instances and Nodes, select Session Manager.

  5. On the AWS Systems Manager page, choose Start session.

  6. On the Start a session page, select the <KafkaClient> instance and choose Start session.

    Refer to the AWS CloudFormation Outputs tab for the Amazon EC2 instance ID.

  7. In the console window, run the following command to create a topic:

    sudo su
    cd /home/kafka/bin
    ./kafka-topics.sh --create --zookeeper <zookeeper-connection-string> --replication-factor 2 --partitions 2 --topic MyTopic

  8. Run the following command to start a console producer that writes to the topic:

    ./kafka-console-producer.sh --broker-list <broker-list> --producer.config <config-file> --topic MyTopic

Note

The client configuration file (<config-file>) depends on the access control method selected when launching the stack. For Unauthenticated access, use client-ssl.properties; for IAM role-based authentication, use client-iam.properties; and for SASL/SCRAM authentication, use client-sasl.properties.
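
The mapping from access control method to client configuration file can be expressed as a small helper, sketched below. The function names client_config_for, producer_command, and consumer_command are hypothetical. The functions print the commands rather than running them, and they assume the properties files can be referenced by file name from the directory where you run the Kafka scripts. The consumer command does not appear in this guide; it is included as an assumption, based on the standard kafka-console-consumer.sh options, to show how the data could be read back.

```shell
#!/usr/bin/env bash
# Sketch only: pick the client properties file for an access control method
# and print the matching console commands. All function names are hypothetical.

# Usage: client_config_for <access-control-method>
client_config_for() {
  case "$1" in
    "Unauthenticated access")        echo "client-ssl.properties" ;;
    "IAM role-based authentication") echo "client-iam.properties" ;;
    "SASL/SCRAM authentication")     echo "client-sasl.properties" ;;
    *) echo "error: unknown access control method: $1" >&2; return 1 ;;
  esac
}

# Usage: producer_command <broker-list> <access-control-method> <topic>
# Prints the producer command from step 8 with the config file filled in.
producer_command() {
  local config
  config="$(client_config_for "$2")" || return 1
  echo "./kafka-console-producer.sh --broker-list $1 --producer.config $config --topic $3"
}

# Usage: consumer_command <broker-list> <access-control-method> <topic>
# Assumption: not taken from this guide; uses standard console consumer flags.
consumer_command() {
  local config
  config="$(client_config_for "$2")" || return 1
  echo "./kafka-console-consumer.sh --bootstrap-server $1 --consumer.config $config --topic $3 --from-beginning"
}
```

For example, producer_command "<broker-list>" "IAM role-based authentication" MyTopic prints the producer command with client-iam.properties as the configuration file. Printing the command first makes it easy to review before pasting it into the Session Manager window.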