Using Amazon EMR

This solution resembles the Data Pipeline solution, because Data Pipeline itself runs the job on Amazon EMR clusters behind the scenes. An EMR cluster in the source account reads from the source Amazon DynamoDB table and writes to a destination S3 bucket. An EMR cluster in the target account then reads from that S3 bucket and writes to the target DynamoDB table.

To replicate DynamoDB tables using this approach, EMR clusters configured with Apache Hive must be launched in both the source and target accounts. Both EMR clusters must be configured with read/write permissions for the destination S3 bucket.
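The Hive side of this approach can be sketched as two short scripts, one per cluster. The Python below only generates the HiveQL text; it does not connect to AWS. It is a minimal sketch under stated assumptions: the table name, bucket path, column list, and the Hive table names `ddb_table` and `s3_stage` are hypothetical, and the comma-delimited text format is just one possible choice for the staged data. The storage handler class and the `dynamodb.table.name` and `dynamodb.column.mapping` properties are the ones used by the EMR DynamoDB connector for Hive.

```python
from typing import Dict, List

# Storage handler class provided by the EMR DynamoDB connector for Hive.
DDB_HANDLER = "org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler"


def _columns_ddl(columns: Dict[str, str]) -> str:
    """Render 'name type, name type' for a CREATE TABLE column list."""
    return ", ".join(f"{name} {hive_type}" for name, hive_type in columns.items())


def dynamodb_table_ddl(hive_table: str, ddb_table: str, columns: Dict[str, str]) -> str:
    """External Hive table mapped onto a DynamoDB table.

    Assumes each Hive column maps to a DynamoDB attribute of the same name.
    """
    mapping = ",".join(f"{name}:{name}" for name in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {hive_table} ({_columns_ddl(columns)}) "
        f"STORED BY '{DDB_HANDLER}' "
        f'TBLPROPERTIES ("dynamodb.table.name" = "{ddb_table}", '
        f'"dynamodb.column.mapping" = "{mapping}");'
    )


def s3_table_ddl(hive_table: str, s3_location: str, columns: Dict[str, str]) -> str:
    """External Hive table over the S3 location that stages the data."""
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {hive_table} ({_columns_ddl(columns)}) "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        f"LOCATION '{s3_location}';"
    )


def export_script(ddb_table: str, s3_location: str, columns: Dict[str, str]) -> List[str]:
    """Statements for the source cluster: DynamoDB -> S3."""
    return [
        dynamodb_table_ddl("ddb_table", ddb_table, columns),
        s3_table_ddl("s3_stage", s3_location, columns),
        "INSERT OVERWRITE TABLE s3_stage SELECT * FROM ddb_table;",
    ]


def import_script(ddb_table: str, s3_location: str, columns: Dict[str, str]) -> List[str]:
    """Statements for the target cluster: S3 -> DynamoDB."""
    return [
        s3_table_ddl("s3_stage", s3_location, columns),
        dynamodb_table_ddl("ddb_table", ddb_table, columns),
        "INSERT OVERWRITE TABLE ddb_table SELECT * FROM s3_stage;",
    ]


if __name__ == "__main__":
    # Hypothetical schema and bucket, for illustration only.
    cols = {"id": "string", "total": "bigint"}
    for stmt in export_script("Orders", "s3://example-bucket/orders/", cols):
        print(stmt)
```

The generated statements would be run in Hive on the respective cluster. Both scripts declare the same S3-backed table, which is why both clusters need access to the destination bucket.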

Advantages

This solution offers more options for customization and more control over the data migration process.

Drawbacks

The process is more involved, because it requires running Hive queries on both the source and the target clusters and creating an external table over the S3 location that holds the data.
It requires setting up the clusters and terminating them after the job completes.
