Data sharing in Amazon Redshift
With Amazon Redshift, you can securely share data across Amazon Redshift clusters or with other AWS services. Data sharing lets you share live data, without having to create a copy or move it. Database administrators and data engineers can use data sharing to provide secure, read-only access to data for analytics purposes, while maintaining control over the data. Data analysts, business intelligence professionals, and data scientists can leverage shared data to gain insights without duplicating or moving data. Common use cases include sharing data with partners, enabling cross-functional analysis, and facilitating data democratization within an organization. The following sections cover the details of configuring and managing data sharing in Amazon Redshift.
With Amazon Redshift data sharing, you can securely share access to live data across Amazon Redshift clusters, workgroups, AWS accounts, and AWS Regions without manually moving or copying the data. Since the data is live, all users can see the most up-to-date and consistent information in Amazon Redshift as soon as it’s updated.
You can share data across provisioned clusters, serverless workgroups, Availability Zones, AWS accounts, and AWS Regions. You can share between different cluster types as well as between provisioned clusters and serverless workgroups.
You can share database objects for both reads and writes across different Amazon Redshift clusters or Amazon Redshift Serverless workgroups within the same AWS account, or from one AWS account to another. You can write data across Regions as well. You can grant permissions such as SELECT, INSERT, and UPDATE for different tables and USAGE and CREATE for different schemas. The data is live and available to all warehouses as soon as a write transaction is committed.
For more information about configuring capabilities for data sharing in the PREVIEW_2023 track, see Sharing write access to data (Preview).
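The table-level and schema-level permissions described above can be sketched as a small statement builder. This is a minimal illustration, not the documented procedure: the helper name `build_grants` and the datashare, schema, and table names are hypothetical. It accepts only the privileges named in this section, although Amazon Redshift may support others. The `GRANT ... TO DATASHARE` form is assumed from the write-sharing preview and should be checked against the GRANT reference before use.

```python
# Privileges named in this section; Amazon Redshift may accept others.
TABLE_PRIVILEGES = {"SELECT", "INSERT", "UPDATE"}
SCHEMA_PRIVILEGES = {"USAGE", "CREATE"}


def build_grants(datashare, schema, schema_privs, table_privs_by_table):
    """Return GRANT statements that scope privileges to a datashare.

    All identifiers are placeholders supplied by the caller. The statement
    shape (GRANT ... TO DATASHARE ...) is an assumption to verify.
    """
    unknown = set(schema_privs) - SCHEMA_PRIVILEGES
    if unknown:
        raise ValueError(f"unsupported schema privileges: {sorted(unknown)}")

    statements = [
        f"GRANT {', '.join(schema_privs)} ON SCHEMA {schema} "
        f"TO DATASHARE {datashare};"
    ]
    for table, privs in table_privs_by_table.items():
        unknown = set(privs) - TABLE_PRIVILEGES
        if unknown:
            raise ValueError(f"unsupported table privileges: {sorted(unknown)}")
        statements.append(
            f"GRANT {', '.join(privs)} ON TABLE {schema}.{table} "
            f"TO DATASHARE {datashare};"
        )
    return statements


if __name__ == "__main__":
    for stmt in build_grants(
        "sales_share", "sales", ["USAGE", "CREATE"],
        {"orders": ["SELECT", "INSERT"], "returns": ["SELECT"]},
    ):
        print(stmt)
```

Generating the statements from one mapping keeps the schema grant and the per-table grants together, which makes it easier to review exactly what a datashare exposes.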
Note
Multi-warehouse writes through data sharing are not currently available on ra3.xlplus clusters. To use this feature, create ra3.4xl clusters, ra3.16xl clusters, or Amazon Redshift Serverless workgroups.
Considerations when using data sharing in Amazon Redshift
The following are considerations for working with Amazon Redshift data sharing. For information on data sharing limitations, see Limitations for data sharing.
- Cross-Region data sharing includes additional cross-Region data-transfer charges. These data-transfer charges apply only across Regions, not within the same Region. For more information, see Managing cost control for cross-Region data sharing.
- When you read data from a datashare, you remain connected to your local cluster database. For more information about setting up and reading from a database created from a datashare, see Querying datashare objects.
- The consumer is charged for all compute and cross-Region data transfer fees required to query the producer's data. The producer is charged for the underlying storage of data in their provisioned cluster or serverless namespace.
- The performance of the queries on shared data depends on the compute capacity of the consumer clusters.
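To illustrate the point about staying connected to the local database, the sketch below builds the two statements a consumer would typically run: one that creates a local database from a datashare, and one that reads a shared table through a three-part name. The helper names, the datashare and database names, and the namespace and account values are all hypothetical placeholders. Treat the statement shapes as an assumption and confirm them in Querying datashare objects.

```python
def create_database_from_datashare(local_db, datashare, producer_namespace,
                                   producer_account=None):
    """Build a CREATE DATABASE ... FROM DATASHARE statement.

    producer_account is only included for cross-account shares.
    All values are placeholders supplied by the caller.
    """
    account = f"ACCOUNT '{producer_account}' " if producer_account else ""
    return (
        f"CREATE DATABASE {local_db} FROM DATASHARE {datashare} "
        f"OF {account}NAMESPACE '{producer_namespace}';"
    )


def qualified_name(database, schema, obj):
    """Three-part name used to reach a shared object from the local database."""
    return f"{database}.{schema}.{obj}"


if __name__ == "__main__":
    # Placeholder namespace GUID, not a real producer.
    namespace = "00000000-0000-0000-0000-000000000000"
    print(create_database_from_datashare("sales_db", "sales_share", namespace))
    print(f"SELECT COUNT(*) FROM {qualified_name('sales_db', 'sales', 'orders')};")
```

The session stays on the consumer's own database; only the table reference names the database created from the datashare.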
Cluster encryption management for data sharing
To share data across AWS accounts, both the producer and consumer clusters must be encrypted.
In Amazon Redshift, you can turn on database encryption for your clusters to help protect data at rest. When you turn on encryption for a cluster, the data blocks and system metadata are encrypted for the cluster and its snapshots. You can turn on encryption when you launch your cluster, or you can modify an unencrypted cluster to use AWS Key Management Service (AWS KMS) encryption. For more information about Amazon Redshift database encryption, see Amazon Redshift database encryption in the Amazon Redshift Management Guide.
To protect data in transit, all data is encrypted in transit through the encryption schema of the producer cluster. The consumer cluster adopts this encryption schema when data is loaded. The consumer cluster then operates as a normal encrypted cluster. Communications between the producer and consumer are also encrypted using a shared key schema. For more information about encryption in transit, see Encryption in transit.
Limitations for data sharing
The following are limitations when working with datashares in Amazon Redshift:
Data sharing is supported for all provisioned RA3 cluster types and Amazon Redshift Serverless. It isn't supported for other cluster types.
For cross-account and cross-Region data sharing, both the producer and consumer clusters and serverless namespaces must be encrypted. This is for security purposes. However, they don't need to share the same encryption key.
You can only share SQL UDFs through datashares. Python and Lambda UDFs aren't supported.
If the producer database has a specific collation, use the same collation settings for the consumer database.
Amazon Redshift doesn't support adding external schemas, tables, or late-binding views on external tables to datashares.
Amazon Redshift doesn't support nested SQL user-defined functions on producer clusters.
Amazon Redshift doesn't support sharing tables with interleaved sort keys or views that refer to tables with interleaved sort keys.
Consumers can't add datashare objects to another datashare. Additionally, consumers can't add views referencing datashare objects to another datashare.
Amazon Redshift doesn't support accessing a datashare object if a concurrent DDL operation occurred on that object between the Prepare and Execute phases of the access.
Amazon Redshift doesn't support sharing stored procedures through datashares.
Amazon Redshift doesn't support sharing metadata system views and system tables.
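Two of the limitations above, the RA3-or-Serverless requirement and the encryption requirement for cross-account and cross-Region sharing, lend themselves to a pre-flight check. The following is a minimal sketch under stated assumptions: the `Warehouse` record and `sharing_problems` function are hypothetical, the fields would have to be populated from your own inventory, and RA3 node types are assumed to be identifiable by an `ra3.` prefix. It does not cover the other limitations in the list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Warehouse:
    """Hypothetical description of one side of a datashare."""
    name: str
    node_type: Optional[str]  # e.g. "ra3.4xlarge"; None means Serverless
    encrypted: bool
    account_id: str
    region: str


def sharing_problems(producer: Warehouse, consumer: Warehouse) -> List[str]:
    """Return human-readable reasons why sharing would not be supported."""
    problems = []

    # Data sharing is supported only on RA3 clusters and Serverless.
    for wh in (producer, consumer):
        if wh.node_type is not None and not wh.node_type.startswith("ra3."):
            problems.append(f"{wh.name}: node type {wh.node_type} is not RA3")

    # Cross-account and cross-Region sharing require encryption on both
    # sides. The two sides don't need to share the same key.
    crosses_boundary = (
        producer.account_id != consumer.account_id
        or producer.region != consumer.region
    )
    if crosses_boundary:
        for wh in (producer, consumer):
            if not wh.encrypted:
                problems.append(
                    f"{wh.name}: must be encrypted for cross-account "
                    "or cross-Region sharing"
                )
    return problems


if __name__ == "__main__":
    producer = Warehouse("producer", "ra3.4xlarge", True, "111111111111", "us-east-1")
    consumer = Warehouse("consumer", None, False, "222222222222", "us-east-1")
    print(sharing_problems(producer, consumer))
```

An empty list means neither of the two checked limitations applies; it is not a guarantee that the share will work.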
Regions where data sharing is available
The following table lists availability for data-sharing capabilities.
Region | Same-region data sharing | Cross-region data sharing | AWS Lake Formation governed data shares |
---|---|---|---|
US East (N. Virginia) (us-east-1) | Yes | Yes | Yes |
US East (Ohio) (us-east-2) | Yes | Yes | Yes |
US West (N. California) (us-west-1) | Yes | Yes | Yes |
US West (Oregon) (us-west-2) | Yes | Yes | Yes |
Asia Pacific (Hong Kong) (ap-east-1) | Yes | No | No |
Asia Pacific (Mumbai) (ap-south-1) | Yes | Yes | Yes |
Asia Pacific (Hyderabad) (ap-south-2) | Yes | No | No |
Asia Pacific (Tokyo) (ap-northeast-1) | Yes | Yes | Yes |
Asia Pacific (Singapore) (ap-southeast-1) | Yes | Yes | Yes |
Asia Pacific (Sydney) (ap-southeast-2) | Yes | Yes | Yes |
Asia Pacific (Jakarta) (ap-southeast-3) | Yes | No | No |
Asia Pacific (Melbourne) (ap-southeast-4) | Yes | No | No |
Asia Pacific (Seoul) (ap-northeast-2) | Yes | Yes | Yes |
Asia Pacific (Osaka) (ap-northeast-3) | Yes | No | No |
China (Beijing) (cn-north-1) | Yes | No | No |
Africa (Cape Town) (af-south-1) | Yes | Yes | No |
Canada West (Calgary) (ca-west-1) | Yes | No | No |
Canada (Central) (ca-central-1) | Yes | Yes | Yes |
Europe (Frankfurt) (eu-central-1) | Yes | Yes | Yes |
Europe (Zurich) (eu-central-2) | Yes | No | No |
Europe (Ireland) (eu-west-1) | Yes | Yes | Yes |
Europe (London) (eu-west-2) | Yes | Yes | Yes |
Europe (Paris) (eu-west-3) | Yes | Yes | Yes |
Europe (Milan) (eu-south-1) | Yes | No | No |
Europe (Spain) (eu-south-2) | Yes | No | No |
Europe (Stockholm) (eu-north-1) | Yes | Yes | Yes |
Middle East (UAE) (me-central-1) | Yes | No | No |
Middle East (Bahrain) (me-south-1) | Yes | No | No |
Israel (Tel Aviv) (il-central-1) | Yes | No | No |
South America (São Paulo) (sa-east-1) | Yes | Yes | Yes |
AWS GovCloud (US-East) (us-gov-east-1) | Yes | No | Yes |
AWS GovCloud (US-West) (us-gov-west-1) | Yes | No | Yes |
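If you need the availability table in a script, for example to check a Region before planning a cross-Region share, rows in the pipe-delimited form used above can be parsed into a lookup. This is only a sketch: the function and column key names are hypothetical, and the four sample rows are copied from the table, so the full table would need to be supplied and kept current.

```python
import re

# Sample rows copied from the table above; not the complete list.
ROWS = """\
US East (N. Virginia) (us-east-1) | Yes | Yes | Yes |
Asia Pacific (Hong Kong) (ap-east-1) | Yes | No | No |
Africa (Cape Town) (af-south-1) | Yes | Yes | No |
AWS GovCloud (US-East) (us-gov-east-1) | Yes | No | Yes |
"""

# Hypothetical short keys for the three capability columns.
COLUMNS = ("same_region", "cross_region", "lake_formation")


def parse_rows(text):
    """Map each Region code to a dict of capability flags."""
    table = {}
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        cells = [cell for cell in cells if cell]
        if len(cells) != 4:
            continue  # skip headers, separators, and malformed rows
        # The Region code is the last parenthesized token in the first cell.
        match = re.search(r"\(([a-z0-9-]+)\)$", cells[0])
        if not match:
            continue
        table[match.group(1)] = {
            column: cell == "Yes" for column, cell in zip(COLUMNS, cells[1:])
        }
    return table


AVAILABILITY = parse_rows(ROWS)

if __name__ == "__main__":
    print(AVAILABILITY["af-south-1"])
```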
Regional availability for multi-warehouse writes for data sharing
In the PREVIEW_2023 track, data sharing supports write operations and more granular sharing capabilities. For more information about how to configure these, see Sharing write access to data (Preview). For information about Regions where preview capabilities are available, see Regions where data sharing is available (preview).