Considerations when using data sharing in Amazon Redshift

The following are considerations for working with Amazon Redshift data sharing. For information about data sharing limitations, see Limitations for data sharing.

  • Cross-Region data sharing incurs additional cross-Region data transfer charges. There is no additional cost for data sharing within the same Region.

  • As a datashare user, you continue to connect to your local cluster database only. You can't connect to the databases created from a datashare but can read from those databases.

  • The consumer is charged for all compute and cross-Region data transfer fees required to query the producer's data. The producer is charged for the underlying storage of data in their provisioned cluster or serverless namespace.

  • The performance of the queries on shared data depends on the compute capacity of the consumer clusters.
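
The read-only access pattern described above (stay connected to the local database, read from the datashare database by name) can be sketched as a pair of SQL statements. The following is a minimal illustration that only builds the statement text; it does not connect to Amazon Redshift. The datashare name, namespace GUID, database name, schema, and table are hypothetical placeholders, and the same-account form of CREATE DATABASE ... FROM DATASHARE is assumed.

```python
def consumer_statements(share, namespace, consumer_db, schema, table):
    """Build the SQL a consumer might run while connected to its local database.

    All argument values are hypothetical placeholders. The identifiers are
    interpolated directly, so this is for illustration only and must not be
    used with untrusted input.
    """
    # Create a local database that refers to the datashare (same-account form assumed).
    create_db = (
        f"CREATE DATABASE {consumer_db} "
        f"FROM DATASHARE {share} OF NAMESPACE '{namespace}';"
    )
    # Read shared data with three-part notation: database.schema.table.
    query = f"SELECT COUNT(*) FROM {consumer_db}.{schema}.{table};"
    return [create_db, query]


if __name__ == "__main__":
    for statement in consumer_statements(
        share="salesshare",
        namespace="00000000-0000-0000-0000-000000000000",
        consumer_db="sales_db",
        schema="public",
        table="sales",
    ):
        print(statement)
```

The query runs on the consumer's compute, which is why query performance depends on the capacity of the consumer cluster.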

Managing cluster encryption

To share data across AWS accounts, both the producer and consumer clusters must be encrypted.

In Amazon Redshift, you can turn on database encryption for your clusters to help protect data at rest. When you turn on encryption for a cluster, the data blocks and system metadata are encrypted for the cluster and its snapshots. You can turn on encryption when you launch your cluster, or you can modify an unencrypted cluster to use AWS Key Management Service (AWS KMS) encryption. For more information about Amazon Redshift database encryption, see Amazon Redshift database encryption in the Amazon Redshift Management Guide.

To protect data in transit, all data is encrypted in transit through the encryption schema of the producer cluster. The consumer cluster adopts this encryption schema when data is loaded. The consumer cluster then operates as a normal encrypted cluster. Communications between the producer and consumer are also encrypted using a shared key schema. For more information about encryption in transit, see Encryption in transit.
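
Before setting up cross-account sharing, it can help to confirm that every cluster involved is encrypted at rest. The following minimal sketch assumes you already have a response dictionary shaped like the one the AWS SDK for Python (Boto3) returns from describe_clusters, with a "Clusters" list whose entries carry "ClusterIdentifier" and "Encrypted" keys. The function name and the sample data are hypothetical, and no AWS call is made here.

```python
def unencrypted_clusters(response):
    """Return identifiers of clusters whose "Encrypted" flag is not true.

    `response` is assumed to have the shape of a describe_clusters result.
    A missing "Encrypted" key is treated as unencrypted.
    """
    return [
        cluster["ClusterIdentifier"]
        for cluster in response.get("Clusters", [])
        if not cluster.get("Encrypted", False)
    ]


# Hypothetical sample data standing in for a real API response.
sample_response = {
    "Clusters": [
        {"ClusterIdentifier": "producer-cluster", "Encrypted": True},
        {"ClusterIdentifier": "consumer-cluster", "Encrypted": False},
    ]
}

if __name__ == "__main__":
    print(unencrypted_clusters(sample_response))  # ['consumer-cluster']
```

Any cluster the function reports would need encryption turned on before it can take part in cross-account data sharing.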

Limitations for data sharing

The following are limitations when working with datashares in Amazon Redshift:

  • Data sharing is supported for all provisioned RA3 cluster types (ra3.16xlarge, ra3.4xlarge, and ra3.xlplus) and Amazon Redshift Serverless. It isn't supported for other cluster types.

  • For cross-account and cross-Region data sharing, both the producer and consumer clusters and serverless namespaces must be encrypted. This is for security purposes. However, they don't need to share the same encryption key.

  • You can only share SQL UDFs through datashares. Python and Lambda UDFs aren't supported.

  • If the producer database has specific collation, use the same collation settings for the consumer database.

  • Amazon Redshift doesn't support adding external schemas, tables, or late-binding views on external tables to datashares.

  • Amazon Redshift doesn't support nested SQL user-defined functions on producer clusters.

  • Amazon Redshift doesn't support sharing tables with interleaved sort keys or views that refer to tables with interleaved sort keys.

  • Consumers can't add datashare objects to another datashare. Additionally, consumers can't add views referencing datashare objects to another datashare.

  • Amazon Redshift doesn't support accessing a datashare object if a concurrent DDL operation occurred on that object between the Prepare and Execute steps of the access.

  • Amazon Redshift doesn't support sharing stored procedures through datashares.
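
Two of the limitations above (supported node types, and encryption for cross-account or cross-Region sharing) lend themselves to a quick pre-flight check. The following is a minimal sketch under stated assumptions: the Endpoint class, its fields, and the check_data_sharing function are hypothetical names invented for illustration, and a node_type of None stands for an Amazon Redshift Serverless namespace. It does not cover the other limitations in the list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    """Hypothetical description of a producer or consumer."""
    name: str
    node_type: Optional[str]  # for example "ra3.4xlarge"; None means serverless
    encrypted: bool
    account_id: str
    region: str


def check_data_sharing(producer: Endpoint, consumer: Endpoint) -> List[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []

    # Provisioned clusters must use an RA3 node type; serverless is supported.
    for endpoint in (producer, consumer):
        node_type = endpoint.node_type
        if node_type is not None and not node_type.lower().startswith("ra3."):
            problems.append(f"{endpoint.name}: node type {node_type} is not RA3")

    # Cross-account and cross-Region sharing require encryption on both sides.
    crosses_boundary = (
        producer.account_id != consumer.account_id
        or producer.region != consumer.region
    )
    if crosses_boundary:
        for endpoint in (producer, consumer):
            if not endpoint.encrypted:
                problems.append(
                    f"{endpoint.name}: must be encrypted for "
                    "cross-account or cross-Region sharing"
                )

    return problems


if __name__ == "__main__":
    producer = Endpoint("producer", "ra3.4xlarge", True, "111111111111", "us-east-1")
    consumer = Endpoint("consumer", "dc2.large", False, "222222222222", "us-east-1")
    for problem in check_data_sharing(producer, consumer):
        print(problem)
```

The check does not compare encryption keys, because the producer and consumer don't need to share the same key.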