AWS Database Migration Service
User Guide (Version API Version 2016-01-01)

Using Amazon Kinesis Data Streams as a Target for AWS Database Migration Service

You can use AWS DMS to migrate data to an Amazon Kinesis data stream. Amazon Kinesis data streams are part of the Amazon Kinesis Data Streams service. You can use Kinesis data streams to collect and process large streams of data records in real time.


Support for Amazon Kinesis Data Streams as a target is available in AWS DMS versions 3.1.2 and later.

A Kinesis data stream is made up of shards. Shards are uniquely identified sequences of data records in a stream. For more information on shards in Amazon Kinesis Data Streams, see Shard in the Amazon Kinesis Data Streams Developer Guide.

AWS Database Migration Service publishes records to a Kinesis data stream using JSON. During conversion, AWS DMS serializes each record from the source database into an attribute-value pair in JSON format.

You use object mapping to migrate your data from any supported data source to a target stream. With object mapping, you determine how to structure the data records in the stream. You also define a partition key for each table, which Kinesis Data Streams uses to group the data into its shards.

When AWS DMS creates tables on an Amazon Kinesis Data Streams target endpoint, it creates as many tables as in the source database endpoint. AWS DMS also sets several Kinesis Data Streams parameter values. The cost for the table creation depends on the amount of data and the number of tables to be migrated.

To help increase the speed of the transfer, AWS DMS supports a multithreaded full load to an Kinesis Data Streams target instance. DMS supports this multithreading with task settings that include the following:


Support for multithreaded full loads to Amazon Kinesis Data Streams targets is available in AWS DMS versions 3.1.4 and later.

  • MaxFullLoadSubTasks – Use this option to indicate the maximum number of source tables to load in parallel. DMS loads each table into its corresponding Kinesis target table using a dedicated subtask. The default is 8; the maximum value is 49.

  • ParallelLoadThreads – Use this option to specify the number of threads that AWS DMS uses to load each table into its Kinesis target table. The maximum value for an Kinesis Data Streams target is 32. You can ask to have this maximum limit increased.

  • ParallelLoadBufferSize – Use this option to specify the maximum number of records to store in the buffer that the parallel load threads use to load data to the Kinesis target. The default value is 50. The maximum value is 1,000. Use this setting with ParallelLoadThreads. ParallelLoadBufferSize is valid only when there is more than one thread.

  • Table-mapping settings for individual tables – Use table-settings rules to identify the individual tables from the source that you want to load in parallel. Also use these rules to specify how to segment the rows of each table for multithreaded loading. For more information, see Table-Settings Rules and Operations.


    DMS assigns each segment of a table to its own thread for loading. Therefore, set ParallelLoadThreads to the maximum number of segments that you specify for a table in the source.

Prerequisites for Using a Kinesis Data Stream as a Target for AWS Database Migration Service

Before you set up a Kinesis data stream as a target for AWS DMS, make sure that you create an IAM role. This role must allow AWS DMS to assume and grant access to the Kinesis data streams that are being migrated into. The minimum set of access permissions is shown in the following example role policy.

{ "Version": "2012-10-17", "Statement": [ { "Sid": "1", "Effect": "Allow", "Principal": { "Service": "" }, "Action": "sts:AssumeRole" } ] }

The role that you use for the migration to a Kinesis data stream must have the following permissions.

{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "kinesis:DescribeStream", "kinesis:PutRecord", "kinesis:PutRecords" ], "Resource": "arn:aws:kinesis:region:accountID:stream/streamName" } ] }

Limitations When Using Kinesis Data Streams as a Target for AWS Database Migration Service

The following limitations apply when using Kinesis Data Streams as a target:

  • AWS DMS publishes each update to a single record in the source database as one data record in a given Kinesis data stream. As a result, applications consuming the data from the stream lose track of transaction boundaries.

  • Kinesis data streams don't support deduplication. Applications that consume data from a stream need to handle duplicate records. For more information, see Handling Duplicate Records in the Amazon Kinesis Data Streams Developer Guide.

  • AWS DMS supports the following two forms for partition keys:

    • SchemaName.TableName: A combination of the schema and table name.

    • ${AttributeName}: The value of one of the fields in the JSON, or the primary key of the table in the source database.

Using Object Mapping to Migrate Data to a Kinesis Data Stream

AWS DMS uses table-mapping rules to map data from the source to the target Kinesis data stream. To map data to a target stream, you use a type of table-mapping rule called object mapping. You use object mapping to define how data records in the source map to the data records published to the Kinesis data stream.

Kinesis data streams don't have a preset structure other than having a partition key.

To create an object-mapping rule, you specify rule-type as object-mapping. This rule specifies what type of object mapping you want to use.

The structure for the rule is as follows.

{ "rules": [ { "rule-type": "object-mapping", "rule-id": "id", "rule-name": "name", "rule-action": "valid object-mapping rule action", "object-locator": { "schema-name": "case-sensitive schema name", "table-name": "" } } ] }

AWS DMS currently supports map-record-to-record and map-record-to-document as the only valid values for the rule-action parameter. Map-record-to-record and map-record-to-document specify what AWS DMS does by default to records that aren't excluded as part of the exclude-columns attribute list. These values don't affect the attribute mappings in any way.

Use map-record-to-record when migrating from a relational database to a Kinesis data stream. This rule type uses the taskResourceId.schemaName.tableName value from the relational database as the partition key in the Kinesis data stream and creates an attribute for each column in the source database. When using map-record-to-record, for any column in the source table not listed in the exclude-columns attribute list, AWS DMS creates a corresponding attribute in the target stream. This corresponding attribute is created regardless of whether that source column is used in an attribute mapping.

One way to understand map-record-to-record is to see it in action. For this example, assume that you are starting with a relational database table row with the following structure and data.

FirstName LastName StoreId HomeAddress HomePhone WorkAddress WorkPhone DateofBirth




221B Baker Street


31 Spooner Street, Quahog



To migrate this information to a Kinesis data stream, you create rules to map the data to the target stream. The following rule illustrates the mapping.

{ "rules": [ { "rule-type": "selection", "rule-id": "1", "rule-name": "1", "rule-action": "include", "object-locator": { "schema-name": "Test", "table-name": "%" } }, { "rule-type": "object-mapping", "rule-id": "2", "rule-name": "DefaultMapToKinesis", "rule-action": "map-record-to-record", "object-locator": { "schema-name": "Test", "table-name": "Customers" } } ] }

The following illustrates the resulting record format in the Kinesis data stream.

  • StreamName: XXX

  • PartitionKey: Test.Customers //schmaName.tableName

  • Data: //The following JSON message

{ "FirstName": "Randy", "LastName": "Marsh", "StoreId": "5", "HomeAddress": "221B Baker Street", "HomePhone": "1234567890", "WorkAddress": "31 Spooner Street, Quahog", "WorkPhone": "9876543210", "DateOfBirth": "02/29/1988" }

Restructuring Data with Attribute Mapping

You can restructure the data while you are migrating it to a Kinesis data stream using an attribute map. For example, you might want to combine several fields in the source into a single field in the target. The following attribute map illustrates how to restructure the data.

{ "rules": [ { "rule-type": "selection", "rule-id": "1", "rule-name": "1", "rule-action": "include", "object-locator": { "schema-name": "Test", "table-name": "%" } }, { "rule-type": "object-mapping", "rule-id": "2", "rule-name": "TransformToKinesis", "rule-action": "map-record-to-record", "target-table-name": "CustomerData", "object-locator": { "schema-name": "Test", "table-name": "Customers" }, "mapping-parameters": { "partition-key-type": "attribute-name", "partition-key-name": "CustomerName", "exclude-columns": [ "firstname", "lastname", "homeaddress", "homephone", "workaddress", "workphone" ], "attribute-mappings": [ { "target-attribute-name": "CustomerName", "attribute-type": "scalar", "attribute-sub-type": "string", "value": "${lastname}, ${firstname}" }, { "target-attribute-name": "ContactDetails", "attribute-type": "document", "attribute-sub-type": "json", "value": { "Home": { "Address": "${homeaddress}", "Phone": "${homephone}" }, "Work": { "Address": "${workaddress}", "Phone": "${workphone}" } } } ] } } ] }

To set a constant value for partition-key, specify a partition-key value. For example, you might do this to force all the data to be stored in a single shard. The following mapping illustrates this approach.

{ "rules": [ { "rule-type": "selection", "rule-id": "1", "rule-name": "1", "object-locator": { "schema-name": "Test", "table-name": "%" }, "rule-action": "include" }, { "rule-type": "object-mapping", "rule-id": "1", "rule-name": "TransformToKinesis", "rule-action": "map-record-to-document", "object-locator": { "schema-name": "Test", "table-name": "Customer" }, "mapping-parameters": { "partition-key": { "value": "ConstantPartitionKey" }, "exclude-columns": [ "FirstName", "LastName", "HomeAddress", "HomePhone", "WorkAddress", "WorkPhone" ], "attribute-mappings": [ { "attribute-name": "CustomerName", "value": "${FirstName},${LastName}" }, { "attribute-name": "ContactDetails", "value": { "Home": { "Address": "${HomeAddress}", "Phone": "${HomePhone}" }, "Work": { "Address": "${WorkAddress}", "Phone": "${WorkPhone}" } } }, { "attribute-name": "DateOfBirth", "value": "${DateOfBirth}" } ] } } ] }

Message Format for Kinesis Data Streams

The JSON output is simply a list of key-value pairs. AWS DMS provides the following reserved fields to make it easier to consume the data from the Kinesis Data Streams:


The record type can be either data or control. Data records represent the actual rows in the source. Control records are for important events in the stream, for example a restart of the task.


For data records, the operation can be create,read, update, or delete.

For control records, the operation can be TruncateTable or DropTable.


The source schema for the record. This field can be empty for a control record.


The source table for the record. This field can be empty for a control record.


The timestamp for when the JSON message was constructed. The field is formatted with the ISO 8601 format.


The partition-key value for a control record that is for a specific table is TaskId.SchemaName.TableName. The partition-key value for a control record that is for a specific task is that record's TaskId. Specifying a partition-key value in the object mapping has no impact on the partition-key for a control record.