Amazon Kinesis Streams

Amazon Kinesis Streams ingests a large amount of data in real time, durably stores the data, and makes the data available for consumption. The unit of data stored by the Streams service is a data record. A stream represents an ordered sequence of data records. The data records in a stream are distributed into shards.

A shard is a group of data records in a stream. When you create a stream, you specify the number of shards for the stream. For reads, each shard supports up to 5 transactions per second, up to a maximum total data read rate of 2 MB per second. For writes, each shard supports up to 1,000 records per second, up to a maximum total data write rate of 1 MB per second (including partition keys). The total capacity of a stream is the sum of the capacities of its shards. You can increase or decrease the number of shards in a stream as needed, but note that you are charged on a per-shard basis.
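As a rough illustration of how stream capacity scales with shard count, the following sketch multiplies the per-shard limits quoted above by the number of shards. The function name stream_capacity and the dictionary keys are hypothetical, chosen only for this example.

```python
# Per-shard limits as stated in the paragraph above.
READ_TRANSACTIONS_PER_SHARD = 5    # read transactions per second
READ_MB_PER_SHARD = 2              # MB per second
WRITE_RECORDS_PER_SHARD = 1000     # records per second
WRITE_MB_PER_SHARD = 1             # MB per second, including partition keys


def stream_capacity(shard_count):
    """Return the aggregate per-second limits for a stream with shard_count shards."""
    if shard_count < 1:
        raise ValueError("a stream needs at least one shard")
    return {
        "read_transactions_per_second": shard_count * READ_TRANSACTIONS_PER_SHARD,
        "read_mb_per_second": shard_count * READ_MB_PER_SHARD,
        "write_records_per_second": shard_count * WRITE_RECORDS_PER_SHARD,
        "write_mb_per_second": shard_count * WRITE_MB_PER_SHARD,
    }


# Example: aggregate limits of a four-shard stream.
print(stream_capacity(4))
```

Keep in mind that these are aggregate figures: each individual shard is still bound by its own limits, so an uneven distribution of records across shards can throttle one shard while others sit idle.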

A producer puts data records into shards and a consumer gets data records from shards.

Determining the Initial Size of an Amazon Kinesis Stream

Before you create a stream, you need to determine an initial size for the stream. After you create the stream, you can dynamically scale your shard capacity up or down using the AWS Management Console or the UpdateShardCount API. You can make updates while there is an Amazon Kinesis Streams application consuming data from the stream.
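A minimal sketch of resizing a stream through the UpdateShardCount API might look like the following. It assumes the AWS SDK for Python (boto3), whose Kinesis client exposes the operation as update_shard_count; the helper name resize_stream and the stream name in the usage comment are hypothetical.

```python
def resize_stream(kinesis_client, stream_name, target_shard_count):
    """Request a new shard count for an existing stream.

    kinesis_client is expected to behave like boto3.client("kinesis").
    """
    return kinesis_client.update_shard_count(
        StreamName=stream_name,
        TargetShardCount=target_shard_count,
        ScalingType="UNIFORM_SCALING",
    )


# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   resize_stream(boto3.client("kinesis"), "example-stream", 4)
```

Passing the client in as an argument keeps the helper independent of any particular region or credential setup.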

To determine the initial size of a stream, you'll need the following input values:

  • The average size of the data records written to the stream, in kilobytes (KB), rounded up to the nearest 1 KB (average_data_size_in_KB).

  • The number of data records written to and read from the stream per second (records_per_second).

  • The number of Amazon Kinesis Streams applications that consume data concurrently and independently from the stream, that is, the consumers (number_of_consumers).

  • The incoming write bandwidth in KB per second (incoming_write_bandwidth_in_KB), which is equal to the average_data_size_in_KB multiplied by the records_per_second.

  • The outgoing read bandwidth in KB per second (outgoing_read_bandwidth_in_KB), which is equal to the incoming_write_bandwidth_in_KB multiplied by the number_of_consumers.

You can calculate the initial number of shards (number_of_shards) that your stream will need by using the input values in the following formula:

number_of_shards = max(incoming_write_bandwidth_in_KB/1000, outgoing_read_bandwidth_in_KB/2000)
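The calculation can be sketched in a few lines of Python. The function name initial_shard_count is hypothetical, and rounding the result up to a whole number of shards (with a minimum of one) is an assumption added here, since the formula itself can produce a fractional value.

```python
import math


def initial_shard_count(average_data_size_in_kb, records_per_second, number_of_consumers):
    """Estimate the initial number of shards from the input values above."""
    # Round the average record size up to the nearest 1 KB.
    size_kb = math.ceil(average_data_size_in_kb)
    incoming_write_bandwidth_in_kb = size_kb * records_per_second
    outgoing_read_bandwidth_in_kb = incoming_write_bandwidth_in_kb * number_of_consumers
    raw = max(
        incoming_write_bandwidth_in_kb / 1000,
        outgoing_read_bandwidth_in_kb / 2000,
    )
    # Assumption: shard counts are whole numbers, so round up and use at least one.
    return max(1, math.ceil(raw))


# Example: 2 KB records, 1,500 records per second, 3 consumers.
# Incoming bandwidth is 3,000 KB/s and outgoing is 9,000 KB/s,
# so the formula gives max(3.0, 4.5) = 4.5, rounded up to 5 shards.
print(initial_shard_count(2, 1500, 3))
```

In this example the read side dominates: with three consumers, outgoing bandwidth is the constraint even though each shard reads twice as fast as it writes.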

Creating a Stream

You can create a stream using the Streams console, the Streams API, or the AWS CLI.

To create a stream using the console

  1. Open the Streams console at https://console.aws.amazon.com/kinesis/.

  2. In the navigation bar, expand the region selector and select a region.

  3. Click Create Stream.

  4. On the Create Stream page, enter a name for your stream and the number of shards you need, and then click Create.

    On the Stream List page, your stream's Status is CREATING while the stream is being created. When the stream is ready to use, the Status changes to ACTIVE.

  5. Click the name of your stream. The Stream Details page displays a summary of your stream configuration, along with monitoring information.

To create a stream using the Streams API

For information about creating a stream using the Streams API, see Creating a Stream.
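As a hedged illustration, the sketch below creates a stream and polls until its status becomes ACTIVE, mirroring the CREATING to ACTIVE transition described for the console. It assumes the AWS SDK for Python (boto3), whose Kinesis client provides create_stream and describe_stream; the helper name create_stream_and_wait, its polling parameters, and the stream name in the usage comment are hypothetical.

```python
import time


def create_stream_and_wait(kinesis_client, stream_name, shard_count,
                           poll_seconds=5, max_polls=60):
    """Create a stream, then poll its status until it is ACTIVE.

    kinesis_client is expected to behave like boto3.client("kinesis").
    """
    kinesis_client.create_stream(StreamName=stream_name, ShardCount=shard_count)
    for _ in range(max_polls):
        description = kinesis_client.describe_stream(StreamName=stream_name)
        status = description["StreamDescription"]["StreamStatus"]
        if status == "ACTIVE":
            return status
        # The stream is typically still CREATING; wait before checking again.
        time.sleep(poll_seconds)
    raise TimeoutError("stream %s did not become ACTIVE in time" % stream_name)


# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   create_stream_and_wait(boto3.client("kinesis"), "example-stream", 2)
```

The polling interval and retry limit are arbitrary defaults for this sketch; adjust them to suit your environment.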

To create a stream using the AWS CLI

For information about creating a stream using the AWS CLI, see the create-stream command.