Amazon Managed Streaming for Apache Kafka
Developer Guide

What Is Amazon MSK?

Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service that enables you to build and run applications that use Apache Kafka to process streaming data. Amazon MSK provides the control-plane operations, such as those for creating, updating, and deleting clusters. It lets you use Apache Kafka data-plane operations, such as those for producing and consuming data. It runs open-source versions of Apache Kafka. This means existing applications, tooling, and plugins from partners and the Apache Kafka community are supported without requiring changes to application code. You can use Amazon MSK to create clusters that use Apache Kafka versions 1.1.1 and 2.2.1.

The following diagram provides an overview of how Amazon MSK works.


       Diagram showing the architecture of an example Amazon MSK Cluster.

The diagram demonstrates the interaction between the following components:

  • Broker nodes — When creating an Amazon MSK cluster, you specify how many broker nodes you want Amazon MSK to create in each Availability Zone. In the example cluster shown in this diagram, there's one broker per Availability Zone. Each Availability Zone has its own virtual private cloud (VPC) subnet.

  • ZooKeeper nodes — Amazon MSK also creates the Apache ZooKeeper nodes for you. Apache ZooKeeper is an open-source server that enables highly reliable distributed coordination.

  • Producers, consumers, and topic creators — Amazon MSK lets you use Apache Kafka data-plane operations to create topics and to produce and consume data.

  • AWS CLI — You can use the AWS Command Line Interface (AWS CLI) or the APIs in the SDK to perform control-plane operations. For example, you can use the AWS CLI or the SDK to create or delete an Amazon MSK cluster, list all the clusters in an account, or view the properties of a cluster.

Amazon MSK detects and automatically recovers from the most common failure scenarios for Multi-AZ clusters so that your producer and consumer applications can continue their write and read operations with minimal impact. Amazon MSK automatically detects the following failure scenarios:

  • Loss of network connectivity to a broker

  • Compute unit failure for a broker

When Amazon MSK detects one of these failures, it replaces the unhealthy or unreachable broker with a new broker. In addition, where possible, it reuses the storage from the older broker to reduce the data that Apache Kafka needs to replicate. Your availability impact is limited to the time required for Amazon MSK to complete the detection and recovery. After a recovery, your producer and consumer apps can continue to communicate with the same broker IP addresses that they used before the failure.

To get started using Amazon MSK, see Getting Started Using Amazon MSK.

To see the control-plane operations available through Amazon MSK, see the Amazon MSK API Reference.

After you create a cluster, you can use Amazon CloudWatch to monitor it. For more information about monitoring your cluster using metrics, see Monitoring Your Amazon MSK Cluster.