Getting started with Amazon RDS zero-ETL integrations with Amazon Redshift - Amazon Relational Database Service

This is prerelease documentation for Amazon RDS zero-ETL integrations with Amazon Redshift, which is in preview release. The documentation and the feature are both subject to change. We recommend that you use this feature only in test environments, and not in production environments. For preview terms and conditions, see Betas and Previews in AWS Service Terms.

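Before you create a zero-ETL integration with Amazon Redshift, configure your RDS database and your Amazon Redshift data warehouse with the required parameters and permissions. During setup, you'll complete the following steps:

  • Step 1: Create a custom DB parameter group

  • Step 2: Select or create a source database

  • Step 3: Create a target Amazon Redshift data warehouse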

After you complete these tasks, continue to Creating Amazon RDS zero-ETL integrations with Amazon Redshift.

Step 1: Create a custom DB parameter group

Amazon RDS zero-ETL integrations with Amazon Redshift require specific values for the DB parameters that control binary logging (binlog). To configure binary logging, you must first create a custom DB parameter group, and then associate it with the source database.

Create a custom DB parameter group with the following settings. For instructions to create a parameter group, see Working with DB parameter groups in a DB instance.

  • binlog_format=ROW

  • binlog_row_image=full

  • binlog_checksum=NONE

In addition, make sure that the binlog_row_value_options parameter is not set to PARTIAL_JSON.
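
The settings above can also be applied from the AWS CLI. The following is a minimal sketch, not an official procedure: the function names and the description text are hypothetical, the mysql8.0 parameter group family is an assumption based on the MySQL 8.0 requirement, and ApplyMethod=pending-reboot is chosen here because it is accepted for both static and dynamic parameters.

```shell
# Hypothetical helper: create a custom DB parameter group and apply the
# binary logging values listed above. $1 = parameter group name.
create_binlog_parameter_group() {
  aws rds create-db-parameter-group \
    --db-parameter-group-name "$1" \
    --db-parameter-group-family mysql8.0 \
    --description "Binlog settings for zero-ETL integrations" &&
  aws rds modify-db-parameter-group \
    --db-parameter-group-name "$1" \
    --parameters \
      "ParameterName=binlog_format,ParameterValue=ROW,ApplyMethod=pending-reboot" \
      "ParameterName=binlog_row_image,ParameterValue=full,ApplyMethod=pending-reboot" \
      "ParameterName=binlog_checksum,ParameterValue=NONE,ApplyMethod=pending-reboot"
}

# Hypothetical helper: print the current binlog_row_value_options value so
# you can confirm that it is not PARTIAL_JSON. $1 = parameter group name.
show_binlog_row_value_options() {
  aws rds describe-db-parameters \
    --db-parameter-group-name "$1" \
    --query "Parameters[?ParameterName=='binlog_row_value_options'].ParameterValue" \
    --output text
}

# Example (requires configured AWS credentials):
#   create_binlog_parameter_group my-zero-etl-params
#   show_binlog_row_value_options my-zero-etl-params
```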

Step 2: Select or create a source database

After you create a custom DB parameter group, choose or create an RDS for MySQL Single-AZ or Multi-AZ DB instance. This database will be the source of data replication to Amazon Redshift.

The database must be running RDS for MySQL version 8.0.32 or higher. For instructions to create a Single-AZ or Multi-AZ DB instance, see Creating an Amazon RDS DB instance.

When you create the database, under Additional configuration, change the default DB parameter group to the custom parameter group that you created in the previous step.

Note

If you associate the parameter group with the database after the database is already created, you must reboot the database to apply the changes before you can create a zero-ETL integration. For instructions, see Rebooting a DB instance.

In addition, make sure that automated backups are enabled on the database. For more information, see Enabling automated backups.
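
For an existing DB instance, the association, backup, and reboot requirements above could be handled from the AWS CLI roughly as follows. This is a sketch under assumptions: the function names are hypothetical, and the 7-day default retention is an arbitrary illustrative value (any retention period greater than 0 enables automated backups).

```shell
# Hypothetical helper: attach the custom parameter group, enable automated
# backups, and reboot so that the parameter changes take effect.
# $1 = DB instance identifier, $2 = parameter group name,
# $3 = backup retention in days (optional, assumed default of 7)
prepare_source_instance() {
  aws rds modify-db-instance \
    --db-instance-identifier "$1" \
    --db-parameter-group-name "$2" \
    --backup-retention-period "${3:-7}" \
    --apply-immediately &&
  # The wait can return before the modification starts. If the reboot is
  # rejected because the instance is still being modified, wait and retry.
  aws rds wait db-instance-available --db-instance-identifier "$1" &&
  aws rds reboot-db-instance --db-instance-identifier "$1"
}

# Hypothetical helper: print the engine version, backup retention period,
# parameter group name, and parameter apply status for review.
show_source_settings() {
  aws rds describe-db-instances \
    --db-instance-identifier "$1" \
    --query 'DBInstances[0].[EngineVersion,BackupRetentionPeriod,DBParameterGroups[0].DBParameterGroupName,DBParameterGroups[0].ParameterApplyStatus]' \
    --output text
}

# Example (requires configured AWS credentials):
#   prepare_source_instance my-source-db my-zero-etl-params
#   show_source_settings my-source-db
```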

Step 3: Create a target Amazon Redshift data warehouse

After you create your source database, you must create and configure a target data warehouse in Amazon Redshift. The data warehouse must meet the following requirements:

  • Created in preview

    • To create a provisioned cluster in preview, choose Create preview cluster from the banner on the provisioned clusters dashboard. For more information, see Creating a preview cluster.

      When creating the cluster, set the Preview track to preview_2023.

    • To create a Redshift Serverless workgroup in preview, choose Create preview workgroup from the banner on the Serverless dashboard. For more information, see Creating a preview workgroup.

  • Using an RA3 node type (ra3.xlplus, ra3.4xlarge, or ra3.16xlarge) with at least two nodes, or Redshift Serverless.

  • Encrypted (if using a provisioned cluster). For more information, see Amazon Redshift database encryption.

For instructions to create a data warehouse, see Creating a cluster for provisioned clusters, or Creating a workgroup with a namespace for Redshift Serverless.
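
To check an existing provisioned cluster against the node type, node count, and encryption requirements above, a sketch such as the following might help. The function names are hypothetical, and the check does not cover the preview track requirement or Redshift Serverless workgroups.

```shell
# Hypothetical helper: succeed only if the values match the requirements
# listed above. $1 = node type, $2 = number of nodes, $3 = encrypted flag
cluster_meets_requirements() {
  case "$1" in
    ra3.xlplus|ra3.4xlarge|ra3.16xlarge) ;;
    *) echo "unsupported node type: $1"; return 1 ;;
  esac
  [ "$2" -ge 2 ] 2>/dev/null || { echo "need at least two nodes, found: $2"; return 1; }
  case "$3" in
    True|true) ;;
    *) echo "cluster is not encrypted"; return 1 ;;
  esac
  echo "ok"
}

# Hypothetical helper: fetch the three fields for a cluster and check them.
# $1 = cluster identifier
check_cluster() {
  # With --output text, the three fields are printed on one line,
  # separated by tabs, so word splitting yields $1, $2, and $3.
  set -- $(aws redshift describe-clusters \
    --cluster-identifier "$1" \
    --query 'Clusters[0].[NodeType,NumberOfNodes,Encrypted]' \
    --output text)
  cluster_meets_requirements "$1" "$2" "$3"
}

# Example (requires configured AWS credentials):
#   check_cluster my-target-cluster
```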

Enable case sensitivity on the data warehouse

For the integration to be successful, the case sensitivity parameter (enable_case_sensitive_identifier) must be enabled for the data warehouse. By default, case sensitivity is disabled on all provisioned clusters and Redshift Serverless workgroups.

To enable case sensitivity, perform the following steps depending on your data warehouse type:

  • Provisioned cluster – To enable case sensitivity on a provisioned cluster, create a custom parameter group with the enable_case_sensitive_identifier parameter enabled. Then, associate the parameter group with the cluster. For instructions, see Managing parameter groups using the console or Configuring parameter values using the AWS CLI.

    Note

    Remember to reboot the cluster after you associate the custom parameter group with it.

  • Serverless workgroup – To enable case sensitivity on a Redshift Serverless workgroup, you must use the AWS CLI. The Amazon Redshift console doesn't currently support modifying Redshift Serverless parameter values. Send the following update-workgroup request:

    aws redshift-serverless update-workgroup \
      --workgroup-name target-workgroup \
      --config-parameters parameterKey=enable_case_sensitive_identifier,parameterValue=true

    You don't need to reboot a workgroup after you modify its parameter values.
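
For a provisioned cluster, the same change can be sketched with the AWS CLI as shown below. The function name and description text are hypothetical, and the redshift-1.0 parameter group family is an assumption; confirm the family that applies to your cluster before you run it.

```shell
# Hypothetical helper: create a parameter group with case sensitivity
# enabled, associate it with the cluster, and reboot the cluster.
# $1 = cluster identifier, $2 = name for the new parameter group
enable_case_sensitivity() {
  aws redshift create-cluster-parameter-group \
    --parameter-group-name "$2" \
    --parameter-group-family redshift-1.0 \
    --description "Case-sensitive identifiers for zero-ETL" &&
  aws redshift modify-cluster-parameter-group \
    --parameter-group-name "$2" \
    --parameters ParameterName=enable_case_sensitive_identifier,ParameterValue=true &&
  aws redshift modify-cluster \
    --cluster-identifier "$1" \
    --cluster-parameter-group-name "$2" &&
  # Wait until the cluster reports as available before rebooting it.
  aws redshift wait cluster-available --cluster-identifier "$1" &&
  aws redshift reboot-cluster --cluster-identifier "$1"
}

# Example (requires configured AWS credentials):
#   enable_case_sensitivity my-target-cluster my-case-sensitive-params
```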

Configure authorization for the data warehouse

After you create a data warehouse, you must configure the source RDS database as an authorized integration source. For instructions, see Configure authorization for your Amazon Redshift data warehouse.

Next steps

With a source RDS database and an Amazon Redshift target data warehouse, you can now create a zero-ETL integration and replicate data. For instructions, see Creating Amazon RDS zero-ETL integrations with Amazon Redshift.