Using AWS Glue with Amazon DynamoDB as source and sink

This solution uses an AWS Glue ETL job to copy Amazon DynamoDB table data from the source account to the destination account. The solution doesn't require any intermediate storage: the ETL job reads directly from the source table and writes directly to the target table. The role assigned to the ETL job must have permissions to read from the table in the source account and to write to the table in the destination account. Because this process consumes provisioned capacity on both the source and target tables, it shouldn't be used for large datasets.
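As a rough sketch of the required permissions, the following example uses the AWS SDK for Python (Boto3) to attach an inline policy to the ETL job's role. The role name, account IDs, Regions, table names, and the exact set of DynamoDB actions are assumptions for illustration only; adjust them to your environment and to the cross-account setup described in the pattern linked below.

import json

import boto3

iam = boto3.client("iam")

# Hypothetical inline policy: read the source table, write the target table.
# Role name, account IDs, Regions, and table names are placeholders.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadSourceTable",
            "Effect": "Allow",
            "Action": ["dynamodb:Scan", "dynamodb:DescribeTable"],
            "Resource": "arn:aws:dynamodb:us-east-1:111111111111:table/source-table",
        },
        {
            "Sid": "WriteTargetTable",
            "Effect": "Allow",
            "Action": [
                "dynamodb:PutItem",
                "dynamodb:BatchWriteItem",
                "dynamodb:DescribeTable",
            ],
            "Resource": "arn:aws:dynamodb:us-east-2:222222222222:table/target-table",
        },
    ],
}

iam.put_role_policy(
    RoleName="GlueDynamoDBCopyJobRole",
    PolicyName="dynamodb-copy-access",
    PolicyDocument=json.dumps(policy),
)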

This solution requires creating an AWS Glue job that uses a DynamoDB table as both the source and the target. The Jobs page on the AWS Glue console doesn't support using a DynamoDB table as a target for an ETL job. Instead, you use the Apache Spark script editor to generate the boilerplate code for the ETL job, and then you update it to read from and write to DynamoDB tables.
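The following is a minimal sketch of such a script, assuming placeholder table names, Regions, and role ARN. The dynamodb.sts.roleArn connection option on the read side is one way to assume a role in the source account; the throughput percentages should be tuned for your tables.

import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read directly from the DynamoDB table in the source account.
# Table name, Region, and role ARN are placeholders.
source_frame = glue_context.create_dynamic_frame.from_options(
    connection_type="dynamodb",
    connection_options={
        "dynamodb.input.tableName": "source-table",
        "dynamodb.region": "us-east-1",
        "dynamodb.sts.roleArn": "arn:aws:iam::111111111111:role/SourceTableAccessRole",
        "dynamodb.throughput.read.percent": "0.5",
    },
)

# Write directly to the DynamoDB table in the destination account.
glue_context.write_dynamic_frame_from_options(
    frame=source_frame,
    connection_type="dynamodb",
    connection_options={
        "dynamodb.output.tableName": "target-table",
        "dynamodb.region": "us-east-2",
        "dynamodb.throughput.write.percent": "0.5",
    },
)

job.commit()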

For more information and instructions, see Cross-account cross-Region access to DynamoDB tables.

AWS Glue reads the table in the source account and writes to the table in the target account.

Advantages

  • It's a serverless solution.

  • AWS Glue is the only additional AWS service required, and it supports scheduling the ETL jobs.

  • Unlike the export solution, this solution does not require keeping up with schema changes.

Drawbacks

  • The solution consumes provisioned throughput on the source and target tables, which can affect performance and availability.