Exporting data from an Aurora PostgreSQL DB cluster to Amazon S3
You can query data from an Aurora PostgreSQL DB cluster and export it directly into files stored in an Amazon S3 bucket. To do this, you first install the aws_s3 extension. This extension provides you with the functions that you use to export the results of queries to Amazon S3. Following, you can find out how to install the extension and how to export data to Amazon S3.
You can export from a provisioned or an Aurora Serverless v2 DB instance. These steps aren't supported for Aurora Serverless v1.
Note
Cross-account export to Amazon S3 isn't supported.
All currently available versions of Aurora PostgreSQL support exporting data to Amazon Simple Storage Service. For detailed version information, see Amazon Aurora PostgreSQL updates in the Release Notes for Aurora PostgreSQL.
If you don't have a bucket set up for your export, see the following topics in the Amazon Simple Storage Service User Guide.
By default, the data exported from Aurora PostgreSQL to Amazon S3 uses server-side encryption with an AWS managed key. You can alternatively use a customer managed key that you have already created. If you are using bucket encryption, the Amazon S3 bucket must be encrypted with an AWS Key Management Service (AWS KMS) key (SSE-KMS). Currently, buckets encrypted with Amazon S3 managed keys (SSE-S3) are not supported.
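As a sketch of the SSE-KMS requirement described above, a bucket's default encryption configuration would need to look something like the following (the KMS key ARN shown is a placeholder, not a real key). This is the shape of the configuration passed to the Amazon S3 put-bucket-encryption operation.

```json
{
  "Rules": [
    {
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "aws:kms",
        "KMSMasterKeyID": "arn:aws:kms:us-west-2:111122223333:key/EXAMPLE-KEY-ID"
      }
    }
  ]
}
```

A bucket whose default encryption uses "SSEAlgorithm": "AES256" (SSE-S3) instead would not be supported for these exports.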
Note
You can save DB and DB cluster snapshot data to Amazon S3 using the AWS Management Console, AWS CLI, or Amazon RDS API. For more information, see Exporting DB cluster snapshot data to Amazon S3.
Installing the aws_s3 extension
Before you can use Amazon Simple Storage Service with your Aurora PostgreSQL DB cluster, you need to install the aws_s3 extension. This extension provides functions for exporting data from the writer instance of an Aurora PostgreSQL DB cluster to an Amazon S3 bucket. It also provides functions for importing data from Amazon S3. For more information, see Importing data from Amazon S3 into an Aurora PostgreSQL DB cluster.
The aws_s3 extension depends on some of the helper functions in the aws_commons extension, which is installed automatically when needed.
To install the aws_s3 extension

1. Use psql (or pgAdmin) to connect to the writer instance of your Aurora PostgreSQL DB cluster as a user that has rds_superuser privileges. If you kept the default name during the setup process, you connect as postgres.

   psql --host=111122223333.aws-region.rds.amazonaws.com --port=5432 --username=postgres --password

2. To install the extension, run the following command.

   postgres=> CREATE EXTENSION aws_s3 CASCADE;
   NOTICE: installing required extension "aws_commons"
   CREATE EXTENSION
To verify that the extension is installed, you can use the psql \dx metacommand.

postgres=> \dx
             List of installed extensions
    Name     | Version |   Schema   |                 Description
-------------+---------+------------+---------------------------------------------
 aws_commons | 1.2     | public     | Common data types across AWS services
 aws_s3      | 1.1     | public     | AWS S3 extension for importing data from S3
 plpgsql     | 1.0     | pg_catalog | PL/pgSQL procedural language
(3 rows)
The functions for importing data from Amazon S3 and exporting data to Amazon S3 are now available to use.
Verify that your Aurora PostgreSQL version supports exports to Amazon S3
You can verify that your Aurora PostgreSQL version supports export to Amazon S3 by using the describe-db-engine-versions command. The following example checks to see if version 10.14 can export to Amazon S3.

aws rds describe-db-engine-versions --region us-east-1 \
    --engine aurora-postgresql --engine-version 10.14 | grep s3Export
If the output includes the string "s3Export", then the engine supports Amazon S3 exports. Otherwise, the engine doesn't support them.
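If you prefer to check the JSON output programmatically instead of with grep, a minimal sketch follows. It assumes the describe-db-engine-versions output shape, where each engine version lists its supported features under SupportedFeatureNames; the sample document below is illustrative and heavily trimmed.

```python
import json

def supports_s3_export(describe_output: str) -> bool:
    """Return True if any engine version in describe-db-engine-versions
    JSON output lists "s3Export" among its supported feature names."""
    doc = json.loads(describe_output)
    return any(
        "s3Export" in version.get("SupportedFeatureNames", [])
        for version in doc.get("DBEngineVersions", [])
    )

# Illustrative, trimmed output; the real response has many more fields.
sample = '''
{
  "DBEngineVersions": [
    {
      "Engine": "aurora-postgresql",
      "EngineVersion": "10.14",
      "SupportedFeatureNames": ["Lambda", "s3Import", "s3Export"]
    }
  ]
}
'''
print(supports_s3_export(sample))  # True
```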
Overview of exporting data to Amazon S3
To export data stored in an Aurora PostgreSQL database to an Amazon S3 bucket, use the following procedure.
To export Aurora PostgreSQL data to S3
- Identify an Amazon S3 file path to use for exporting data. For details about this process, see Specifying the Amazon S3 file path to export to.
- Provide permission to access the Amazon S3 bucket.

  To export data to an Amazon S3 file, give the Aurora PostgreSQL DB cluster permission to access the Amazon S3 bucket that the export will use for storage. Doing this includes the following steps:

  - Create an IAM policy that provides access to an Amazon S3 bucket that you want to export to.
  - Create an IAM role.
  - Attach the policy you created to the role you created.
  - Add this IAM role to your DB cluster.

  For details about this process, see Setting up access to an Amazon S3 bucket.
- Identify a database query to get the data. Export the query data by calling the aws_s3.query_export_to_s3 function.

  After you complete the preceding preparation tasks, use the aws_s3.query_export_to_s3 function to export query results to Amazon S3. For details about this process, see Exporting query data using the aws_s3.query_export_to_s3 function.
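As a sketch of the IAM policy mentioned in the steps above, the statement grants only the actions an export needs, writing objects and aborting failed multipart uploads. The bucket name amzn-s3-demo-bucket is a placeholder; substitute your own bucket.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "s3export",
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:AbortMultipartUpload"],
      "Resource": ["arn:aws:s3:::amzn-s3-demo-bucket/*"]
    }
  ]
}
```

Scoping the Resource to the bucket's objects (the /* suffix) keeps the cluster from acquiring broader S3 permissions than the export requires.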
Specifying the Amazon S3 file path to export to
Specify the following information to identify the location in Amazon S3 where you want to export data to:
- Bucket name – A bucket is a container for Amazon S3 objects or files. For more information on storing data with Amazon S3, see Create a bucket and View an object in the Amazon Simple Storage Service User Guide.
- File path – The file path identifies where the export is stored in the Amazon S3 bucket. The file path consists of the following:

  - An optional path prefix that identifies a virtual folder path.
  - A file prefix that identifies one or more files to be stored. Larger exports are stored in multiple files, each with a maximum size of approximately 6 GB. The additional file names have the same file prefix but with _partXX appended. The XX represents 2, then 3, and so on.

  For example, a file path with an exports folder and a query-1-export file prefix is /exports/query-1-export.
- AWS Region (optional) – The AWS Region where the Amazon S3 bucket is located. If you don't specify an AWS Region value, then Aurora saves your files into Amazon S3 in the same AWS Region as the exporting DB cluster.
Note
Currently, the AWS Region must be the same as the region of the exporting DB cluster.
For a listing of AWS Region names and associated values, see Regions and Availability Zones.
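The _partXX naming convention described above can be illustrated with a short sketch. This is only an illustration of the naming pattern, not the service's implementation; the function name is hypothetical.

```python
def export_file_names(file_prefix: str, num_files: int) -> list[str]:
    """Illustrate the multi-file export naming convention: the first
    file uses the prefix as-is, and each additional file gets _partXX
    appended, with XX starting at 2."""
    if num_files <= 1:
        return [file_prefix]
    return [file_prefix] + [
        f"{file_prefix}_part{n}" for n in range(2, num_files + 1)
    ]

print(export_file_names("/exports/query-1-export", 3))
# ['/exports/query-1-export', '/exports/query-1-export_part2', '/exports/query-1-export_part3']
```

So an export of roughly 15 GB with this prefix would land in three files of up to about 6 GB each.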
To hold the Amazon S3 file information about where the export is to be stored, you can use the aws_commons.create_s3_uri function to create an aws_commons._s3_uri_1 composite structure as follows.

psql=> SELECT aws_commons.create_s3_uri(
   'amzn-s3-demo-bucket',
   'sample-filepath',
   'us-west-2'
) AS s3_uri_1 \gset
You later provide this s3_uri_1 value as a parameter in the call to the aws_s3.query_export_to_s3 function. For examples, see Exporting query data using the aws_s3.query_export_to_s3 function.
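As a minimal sketch of such a call, run from the same psql session that set the s3_uri_1 variable above, the table name sample_table and the options string are illustrative.

```sql
-- Export the results of a query to the location stored in :'s3_uri_1'.
-- sample_table is a hypothetical table; options selects a text-format export.
SELECT * FROM aws_s3.query_export_to_s3(
    'SELECT * FROM sample_table',
    :'s3_uri_1',
    options := 'format text'
);
```

The function reports the number of rows, files, and bytes uploaded, so you can confirm how much data landed in the bucket.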