Migrating data to Amazon Aurora with PostgreSQL compatibility - Amazon Aurora

Migrating data to Amazon Aurora with PostgreSQL compatibility

You have several options for migrating data from your existing database to an Amazon Aurora PostgreSQL-Compatible Edition DB cluster. Your migration options also depend on the database that you are migrating from and the size of the data that you are migrating. Following are your options:

Migrating an RDS for PostgreSQL DB instance using a snapshot

You can migrate data directly from an RDS for PostgreSQL DB snapshot to an Aurora PostgreSQL DB cluster.

Migrating an RDS for PostgreSQL DB instance using an Aurora read replica

You can also migrate from an RDS for PostgreSQL DB instance by creating an Aurora PostgreSQL read replica of an RDS for PostgreSQL DB instance. When the replica lag between the RDS for PostgreSQL DB instance and the Aurora PostgreSQL read replica is zero, you can stop replication. At this point, you can make the Aurora read replica a standalone Aurora PostgreSQL DB cluster for reading and writing.

Importing S3 data into Aurora PostgreSQL

You can migrate data by importing it from Amazon S3 into a table belonging to an Aurora PostgreSQL DB cluster.

Migrating from a database that is not PostgreSQL-compatible

You can use AWS Database Migration Service (AWS DMS) to migrate data from a database that is not PostgreSQL-compatible. For more information on AWS DMS, see What is AWS Database Migration Service? in the AWS Database Migration Service User Guide.

For a list of AWS Regions where Aurora is available, see Amazon Aurora in the AWS General Reference.

Important

If you plan to migrate an RDS for PostgreSQL DB instance to an Aurora PostgreSQL DB cluster in the near future, we strongly recommend that you disable auto minor version upgrades for the DB instance early in the migration planning phase. Migration to Aurora PostgreSQL might be delayed if the RDS for PostgreSQL version isn't yet supported by Aurora PostgreSQL. For information about Aurora PostgreSQL versions, see Engine versions for Amazon Aurora PostgreSQL.
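The recommendation above can also be applied from the AWS CLI. The following is a minimal sketch that wraps the modify-db-instance command in a function, so nothing runs until you call it. The function name disable_auto_minor_upgrade and the instance name in the usage line are hypothetical.

```shell
# Hypothetical wrapper around "aws rds modify-db-instance".
# Argument: identifier of the source RDS for PostgreSQL DB instance.
disable_auto_minor_upgrade() {
  instance_id=$1
  # --apply-immediately asks RDS not to wait for the next maintenance window.
  aws rds modify-db-instance \
    --db-instance-identifier "$instance_id" \
    --no-auto-minor-version-upgrade \
    --apply-immediately
}
```

For example, disable_auto_minor_upgrade mydbinstance turns off automatic minor version upgrades for a DB instance named mydbinstance.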

Migrating a snapshot of an RDS for PostgreSQL DB instance to an Aurora PostgreSQL DB cluster

To create an Aurora PostgreSQL DB cluster, you can migrate a DB snapshot of an RDS for PostgreSQL DB instance. The new Aurora PostgreSQL DB cluster is populated with the data from the original RDS for PostgreSQL DB instance. For information about creating a DB snapshot, see Creating a DB snapshot.

In some cases, the DB snapshot might not be in the AWS Region where you want to locate your data. If so, use the Amazon RDS console to copy the DB snapshot to that AWS Region. For information about copying a DB snapshot, see Copying a DB snapshot.
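As an alternative to the console, a snapshot can be copied between AWS Regions with the copy-db-snapshot AWS CLI command. The following is a minimal sketch; the function name and its argument order are hypothetical. It doesn't cover encrypted snapshots, which also need a KMS key in the destination Region.

```shell
# Hypothetical wrapper around "aws rds copy-db-snapshot".
# Arguments: source snapshot ARN, name for the copy,
#            source AWS Region, destination AWS Region.
copy_snapshot_to_region() {
  source_snapshot_arn=$1
  target_snapshot_name=$2
  source_region=$3
  target_region=$4
  # The command runs in the destination Region (--region) and
  # reads the snapshot from the source Region (--source-region).
  aws rds copy-db-snapshot \
    --source-db-snapshot-identifier "$source_snapshot_arn" \
    --target-db-snapshot-identifier "$target_snapshot_name" \
    --source-region "$source_region" \
    --region "$target_region"
}
```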

You can migrate RDS for PostgreSQL snapshots that are compatible with the Aurora PostgreSQL versions available in the given AWS Region. For example, a snapshot from an RDS for PostgreSQL 11.1 DB instance can be migrated to Aurora PostgreSQL version 11.4, 11.7, 11.8, or 11.9 in the US West (N. California) Region. An RDS for PostgreSQL 10.11 snapshot can be migrated to Aurora PostgreSQL 10.11, 10.12, 10.13, or 10.14. In other words, the RDS for PostgreSQL snapshot must use the same minor version as the target Aurora PostgreSQL version, or a lower one.
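The version rule above can be checked in a script before you start a migration. The following is a minimal sketch, assuming version strings in MAJOR.MINOR form (for example, 11.7). The function name can_migrate_version is hypothetical. It checks only the rule itself, not which Aurora PostgreSQL versions are available in your AWS Region.

```shell
# Hypothetical helper: succeeds when the RDS for PostgreSQL version and the
# Aurora PostgreSQL version share a major version and the RDS minor version
# is the same as or lower than the Aurora minor version.
# Assumes both versions look like MAJOR.MINOR.
can_migrate_version() {
  rds_version=$1
  aurora_version=$2
  rds_major=${rds_version%%.*}      # text before the first dot
  rds_minor=${rds_version#*.}       # text after the first dot
  aurora_major=${aurora_version%%.*}
  aurora_minor=${aurora_version#*.}
  [ "$rds_major" -eq "$aurora_major" ] && [ "$rds_minor" -le "$aurora_minor" ]
}
```

For example, can_migrate_version 11.1 11.4 succeeds, and can_migrate_version 11.7 11.6 fails.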

When you migrate the DB snapshot by using the console, the console takes the actions necessary to create both the DB cluster and the primary instance.

You can also choose for your new Aurora PostgreSQL DB cluster to be encrypted at rest by using an AWS KMS key. This option is available only for unencrypted DB snapshots.

To migrate a PostgreSQL DB snapshot by using the RDS console

  1. Sign in to the AWS Management Console and open the Amazon RDS console at https://console.aws.amazon.com/rds/.

  2. Choose Snapshots.

  3. On the Snapshots page, choose the RDS for PostgreSQL snapshot that you want to migrate into an Aurora PostgreSQL DB cluster.

  4. Choose Actions, and then choose Migrate snapshot.

  5. Set the following values on the Migrate database page:

    • DB engine version: Choose a DB engine version you want to use for the new migrated instance.

    • DB instance identifier: Enter a name for the DB cluster that is unique for your account in the AWS Region that you chose. This identifier is used in the endpoint addresses for the instances in your DB cluster. You might choose to add some intelligence to the name, such as including the AWS Region and DB engine that you chose, for example aurora-cluster1.

      The DB instance identifier has the following constraints:

      • It must contain 1–63 alphanumeric characters or hyphens.

      • Its first character must be a letter.

      • It can't end with a hyphen or contain two consecutive hyphens.

      • It must be unique for all DB instances per AWS account, per AWS Region.

    • DB instance class: Choose a DB instance class that has the required storage and capacity for your database, for example db.r3.large. Aurora cluster volumes automatically grow as the amount of data in your database increases, up to a maximum of 128 tebibytes (TiB). So you only need to choose a DB instance class that meets your current storage requirements. For more information, see Overview of Aurora storage.

    • Virtual private cloud (VPC): If you have an existing VPC, then you can use that VPC with your Aurora PostgreSQL DB cluster by choosing your VPC identifier, for example vpc-a464d1c1. For information on using an existing VPC, see How to create a VPC for use with Amazon Aurora.

      Otherwise, you can choose to have Amazon RDS create a VPC for you by choosing Create new VPC.

    • Subnet group: If you have an existing subnet group, then you can use that subnet group with your Aurora PostgreSQL DB cluster by choosing your subnet group identifier, for example gs-subnet-group1.

    • Public access: Choose No to specify that instances in your DB cluster can only be accessed by resources inside of your VPC. Choose Yes to specify that instances in your DB cluster can be accessed by resources on the public network.

      Note

      Your production DB cluster might not need to be in a public subnet, because only your application servers require access to your DB cluster. If your DB cluster doesn't need to be in a public subnet, set Public access to No.

    • VPC security group: Choose a VPC security group to allow access to your database.

    • Availability Zone: Choose the Availability Zone to host the primary instance for your Aurora PostgreSQL DB cluster. To have Amazon RDS choose an Availability Zone for you, choose No preference.

    • Database port: Enter the default port to be used when connecting to instances in the Aurora PostgreSQL DB cluster. The default is 5432.

      Note

      You might be behind a corporate firewall that doesn't allow access to default ports such as the PostgreSQL default port, 5432. In this case, provide a port value that your corporate firewall allows. Remember that port value later when you connect to the Aurora PostgreSQL DB cluster.

    • Enable Encryption: Choose Enable Encryption for your new Aurora PostgreSQL DB cluster to be encrypted at rest. Also choose a KMS key as the AWS KMS key value.

    • Auto minor version upgrade: Choose Enable auto minor version upgrade to enable your Aurora PostgreSQL DB cluster to receive minor PostgreSQL DB engine version upgrades automatically when they become available.

      The Auto minor version upgrade option only applies to upgrades to PostgreSQL minor engine versions for your Aurora PostgreSQL DB cluster. It doesn't apply to regular patches applied to maintain system stability.

  6. Choose Migrate to migrate your DB snapshot.

  7. Choose Databases to see the new DB cluster. Choose the new DB cluster to monitor the progress of the migration. On the Connectivity & security tab, you can find the cluster endpoint to use for connecting to the primary writer instance of the DB cluster. For more information on connecting to an Aurora PostgreSQL DB cluster, see Connecting to an Amazon Aurora DB cluster.
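The console procedure above has a command-line counterpart. The following is a minimal sketch, assuming that the restore-db-cluster-from-snapshot AWS CLI command is given the ARN of the RDS for PostgreSQL DB snapshot as --snapshot-identifier. The function names and argument order are hypothetical. The sketch first checks the identifier constraints listed in step 5. Unlike the console, the CLI doesn't create the primary instance for you, so the sketch creates it with create-db-instance.

```shell
# Succeeds when the identifier meets the constraints listed in step 5:
# 1-63 alphanumeric characters or hyphens, first character a letter,
# no trailing hyphen, and no two consecutive hyphens.
is_valid_db_identifier() {
  id=$1
  [ "${#id}" -ge 1 ] && [ "${#id}" -le 63 ] || return 1
  case $id in
    *[!A-Za-z0-9-]*) return 1 ;;  # character other than letter, digit, hyphen
    [!A-Za-z]*)      return 1 ;;  # first character isn't a letter
    *-)              return 1 ;;  # ends with a hyphen
    *--*)            return 1 ;;  # contains two consecutive hyphens
  esac
  return 0
}

# Hypothetical wrapper. Arguments: DB snapshot ARN, new DB cluster name,
# new primary instance name, DB instance class.
migrate_snapshot_to_aurora() {
  snapshot_arn=$1
  cluster_id=$2
  instance_id=$3
  instance_class=$4
  for name in "$cluster_id" "$instance_id"; do
    if ! is_valid_db_identifier "$name"; then
      echo "invalid identifier: $name" >&2
      return 1
    fi
  done
  # Create the Aurora PostgreSQL DB cluster from the DB snapshot.
  aws rds restore-db-cluster-from-snapshot \
    --db-cluster-identifier "$cluster_id" \
    --snapshot-identifier "$snapshot_arn" \
    --engine aurora-postgresql || return 1
  # Create the primary instance in the new DB cluster.
  aws rds create-db-instance \
    --db-cluster-identifier "$cluster_id" \
    --db-instance-identifier "$instance_id" \
    --db-instance-class "$instance_class" \
    --engine aurora-postgresql
}
```

Settings such as the engine version, subnet group, security groups, and encryption are left at their defaults here. Add the corresponding options if your migration needs them.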

Migrating data from an RDS for PostgreSQL DB instance to an Aurora PostgreSQL DB cluster using an Aurora read replica

You can migrate from an RDS for PostgreSQL DB instance to an Aurora PostgreSQL DB cluster by using an Aurora read replica. When you need to migrate from an RDS for PostgreSQL DB instance to an Aurora PostgreSQL DB cluster, we recommend using this approach.

In this case, Amazon RDS uses the PostgreSQL DB engine's streaming replication functionality to create a special type of DB cluster for the source PostgreSQL DB instance. This type of DB cluster is called an Aurora read replica. Updates made to the source RDS for PostgreSQL DB instance are asynchronously replicated to the Aurora read replica.

Overview of migrating data by using an Aurora read replica

To migrate from an RDS for PostgreSQL DB instance to an Aurora PostgreSQL DB cluster, we recommend creating an Aurora read replica of your source RDS for PostgreSQL DB instance. When the replica lag between the RDS for PostgreSQL DB instance and the Aurora PostgreSQL read replica is zero, you can stop replication. At this point, you can promote the Aurora read replica to be a standalone Aurora PostgreSQL DB cluster. This standalone DB cluster can then accept write loads.

Be prepared for migration to take a while, roughly several hours per tebibyte (TiB) of data. While the migration is in progress, your RDS for PostgreSQL instance accumulates write ahead log (WAL) segments. Make sure that your Amazon RDS instance has sufficient storage capacity for these segments.

When you create an Aurora read replica of an RDS for PostgreSQL DB instance, Amazon RDS creates a DB snapshot of your source RDS for PostgreSQL DB instance. This snapshot is private to Amazon RDS and incurs no charges. Amazon RDS then migrates the data from the DB snapshot to the Aurora read replica. After the DB snapshot data is migrated to the new Aurora PostgreSQL DB cluster, RDS starts replication between your RDS for PostgreSQL DB instance and the Aurora PostgreSQL DB cluster.

You can only have one Aurora read replica for an RDS for PostgreSQL DB instance. If you try to create an Aurora read replica for your RDS for PostgreSQL instance and you already have an Aurora read replica or a cross-region read replica, the request is rejected.

Note

Replication issues can arise due to feature differences between Aurora PostgreSQL and the PostgreSQL engine version of your RDS for PostgreSQL DB instance that is the replication source. You can replicate only from an RDS for PostgreSQL instance that is compatible with the Aurora PostgreSQL version. The RDS for PostgreSQL version must be lower than or equal to a supported Aurora PostgreSQL version in the same major version. For example, you can replicate data between an RDS for PostgreSQL version 11.7 DB instance and an Aurora PostgreSQL version 11.7 or higher 11 version DB cluster, but not an Aurora PostgreSQL version 11.6 DB cluster. For information about Aurora PostgreSQL versions, see Aurora PostgreSQL releases and engine versions in the Aurora User Guide. If you encounter an error, you can find help in the Amazon RDS community forum or by contacting AWS Support.

For more information on PostgreSQL read replicas, see Working with read replicas in the Amazon RDS User Guide.

Preparing to migrate data by using an Aurora read replica

Before you migrate data from your RDS for PostgreSQL instance to an Aurora PostgreSQL cluster, make sure that your instance has sufficient storage capacity. This storage capacity is for the write ahead log (WAL) segments that accumulate during the migration. There are several metrics to check for this, described following.

  • FreeStorageSpace

    The available storage space. Units: Bytes

  • OldestReplicationSlotLag

    The size of the lag for WAL data in the replica that is lagging the most. Units: Megabytes

  • RDSToAuroraPostgreSQLReplicaLag

    The amount of time that an Aurora PostgreSQL DB cluster lags behind the source RDS DB instance. Units: Seconds

  • TransactionLogsDiskUsage

    The disk space used by the transaction logs. Units: Megabytes

For more information about monitoring your RDS instance, see Monitoring in the Amazon RDS User Guide.
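The instance-level metrics described above can be read from Amazon CloudWatch with the get-metric-statistics AWS CLI command. The following is a minimal sketch, assuming the metrics are published in the AWS/RDS namespace with the DBInstanceIdentifier dimension. The function name and argument order are hypothetical.

```shell
# Hypothetical helper: print the datapoints of one CloudWatch metric for an
# RDS DB instance, in 5-minute periods.
# Arguments: instance identifier, metric name, statistic,
#            start time, end time (ISO 8601, for example 2024-01-01T00:00:00Z).
get_rds_metric() {
  instance_id=$1
  metric_name=$2
  statistic=$3
  start_time=$4
  end_time=$5
  aws cloudwatch get-metric-statistics \
    --namespace AWS/RDS \
    --metric-name "$metric_name" \
    --dimensions "Name=DBInstanceIdentifier,Value=$instance_id" \
    --start-time "$start_time" \
    --end-time "$end_time" \
    --period 300 \
    --statistics "$statistic"
}
```

For example, requesting the Minimum of FreeStorageSpace and the Maximum of TransactionLogsDiskUsage for the source instance over the same time range shows how much room is left for WAL segments to accumulate.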

Creating an Aurora read replica

You can create an Aurora read replica for an RDS for PostgreSQL DB instance by using the console or the AWS CLI.

To create an Aurora read replica from a source PostgreSQL DB instance

  1. Sign in to the AWS Management Console and open the Amazon RDS console at https://console.aws.amazon.com/rds/.

  2. In the navigation pane, choose Databases.

  3. Choose the RDS for PostgreSQL DB instance that you want to use as the source for your Aurora read replica, and for Actions, choose Create Aurora read replica.

    
  4. Choose the DB cluster specifications that you want to use for the Aurora read replica, as described in the following table.

    Option Description

    DB instance class

    Choose a DB instance class that defines the processing and memory requirements for the primary instance in the DB cluster. For more information about DB instance class options, see Aurora DB instance classes.

    Multi-AZ deployment

    Not available for PostgreSQL

    DB instance identifier

    Enter a name for the primary instance in your Aurora read replica DB cluster. This identifier is used in the endpoint address for the primary instance of the new DB cluster.

    The DB instance identifier has the following constraints:

    • It must contain 1–63 alphanumeric characters or hyphens.

    • Its first character must be a letter.

    • It can't end with a hyphen or contain two consecutive hyphens.

    • It must be unique for all DB instances for each AWS account, for each AWS Region.

    The Aurora read replica DB cluster is created from a snapshot of the source DB instance. Thus, the master user name and master password for the Aurora read replica are the same as the master user name and master password for the source DB instance.

    Virtual Private Cloud (VPC)

    Choose the VPC to host the DB cluster. Choose Create new VPC to have Amazon RDS create a VPC for you. For more information, see DB cluster prerequisites.

    Subnet group

    Choose the DB subnet group to use for the DB cluster. Choose Create new DB Subnet Group to have Amazon RDS create a DB subnet group for you. For more information, see DB cluster prerequisites.

    Public accessibility

    Choose Yes to give the DB cluster a public IP address; otherwise, choose No. The instances in your DB cluster can be a mix of both public and private DB instances. For more information about hiding instances from public access, see Hiding a DB instance in a VPC from the internet.

    Availability Zone

    Determine if you want to specify a particular Availability Zone. For more information about Availability Zones, see Regions and Availability Zones.

    VPC security groups

    Choose one or more VPC security groups to secure network access to the DB cluster. Choose Create new VPC security group to have Amazon RDS create a VPC security group for you. For more information, see DB cluster prerequisites.

    Database port

    Specify the port for applications and utilities to use to access the database. Aurora PostgreSQL DB clusters default to the default PostgreSQL port, 5432. Firewalls at some companies block connections to this port. If your company firewall blocks the default port, choose another port for the new DB cluster.

    DB parameter group

    Choose a DB parameter group for the Aurora PostgreSQL DB cluster. Aurora has a default DB parameter group you can use, or you can create your own DB parameter group. For more information about DB parameter groups, see Working with DB parameter groups and DB cluster parameter groups.

    DB cluster parameter group

    Choose a DB cluster parameter group for the Aurora PostgreSQL DB cluster. Aurora has a default DB cluster parameter group you can use, or you can create your own DB cluster parameter group. For more information about DB cluster parameter groups, see Working with DB parameter groups and DB cluster parameter groups.

    Encryption

    Choose Enable encryption for your new Aurora DB cluster to be encrypted at rest. If you choose Enable encryption, also choose a KMS key as the AWS KMS key value.

    Priority

    Choose a failover priority for the DB cluster. If you don't choose a value, the default is tier-1. This priority determines the order in which Aurora Replicas are promoted when recovering from a primary instance failure. For more information, see Fault tolerance for an Aurora DB cluster.

    Backup retention period

    Choose the length of time, 1–35 days, for Aurora to retain backup copies of the database. Backup copies can be used for point-in-time restores (PITR) of your database down to the second.

    Enhanced monitoring

    Choose Enable enhanced monitoring to enable gathering metrics in real time for the operating system that your DB cluster runs on. For more information, see Monitoring the OS by using Enhanced Monitoring.

    Monitoring Role

    Only available if you chose Enable enhanced monitoring. The AWS Identity and Access Management (IAM) role to use for Enhanced Monitoring. For more information, see Setting up and enabling Enhanced Monitoring.

    Granularity

    Only available if you chose Enable enhanced monitoring. Set the interval, in seconds, between when metrics are collected for your DB cluster.

    Auto minor version upgrade

    Choose Yes to enable your Aurora PostgreSQL DB cluster to receive minor PostgreSQL DB engine version upgrades automatically when they become available.

    The Auto minor version upgrade option only applies to upgrades to PostgreSQL minor engine versions for your Aurora PostgreSQL DB cluster. It doesn't apply to regular patches applied to maintain system stability.

    Maintenance window

    Choose the weekly time range during which system maintenance can occur.

  5. Choose Create read replica.

To create an Aurora read replica from a source RDS for PostgreSQL DB instance, use the create-db-cluster and create-db-instance AWS CLI commands to create a new Aurora PostgreSQL DB cluster. When you call the create-db-cluster command, include the --replication-source-identifier parameter to identify the Amazon Resource Name (ARN) for the source RDS for PostgreSQL DB instance. For more information about Amazon RDS ARNs, see Amazon Relational Database Service (Amazon RDS) in the AWS General Reference.

Don't specify the master user name, master password, or database name. The Aurora read replica uses the same master user name, master password, and database name as the source RDS for PostgreSQL DB instance.

For Linux, macOS, or Unix:

aws rds create-db-cluster --db-cluster-identifier sample-replica-cluster --engine aurora-postgresql \
    --db-subnet-group-name mysubnetgroup --vpc-security-group-ids sg-c7e5b0d2 \
    --replication-source-identifier arn:aws:rds:us-west-2:123456789012:db:master-postgresql-instance

For Windows:

aws rds create-db-cluster --db-cluster-identifier sample-replica-cluster --engine aurora-postgresql ^
    --db-subnet-group-name mysubnetgroup --vpc-security-group-ids sg-c7e5b0d2 ^
    --replication-source-identifier arn:aws:rds:us-west-2:123456789012:db:master-postgresql-instance

If you use the console to create an Aurora read replica, then RDS automatically creates the primary instance for your Aurora read replica DB cluster. If you use the CLI to create an Aurora read replica, you must explicitly create the primary instance for your DB cluster. The primary instance is the first instance that is created in a DB cluster.

You can create a primary instance for your DB cluster by using the create-db-instance CLI command with the following parameters:

  • --db-cluster-identifier

    The name of your DB cluster.

  • --db-instance-class

    The name of the DB instance class to use for your primary instance.

  • --db-instance-identifier

    The name of your primary instance.

  • --engine aurora-postgresql

    The database engine to use.

In the following example, you create a primary instance named myreadreplicainstance for the DB cluster named myreadreplicacluster. You do this using the DB instance class specified in myinstanceclass.

Example

For Linux, macOS, or Unix:

aws rds create-db-instance \
    --db-cluster-identifier myreadreplicacluster \
    --db-instance-class myinstanceclass \
    --db-instance-identifier myreadreplicainstance \
    --engine aurora-postgresql

For Windows:

aws rds create-db-instance ^
    --db-cluster-identifier myreadreplicacluster ^
    --db-instance-class myinstanceclass ^
    --db-instance-identifier myreadreplicainstance ^
    --engine aurora-postgresql

To create an Aurora read replica from a source RDS for PostgreSQL DB instance, use the RDS API operations CreateDBCluster and CreateDBInstance to create a new Aurora DB cluster and primary instance. Don't specify the master user name, master password, or database name. The Aurora read replica uses the same master user name, master password, and database name as the source RDS for PostgreSQL DB instance.

You can create a new Aurora DB cluster for an Aurora read replica from a source RDS for PostgreSQL DB instance. To do so, use the RDS API operation CreateDBCluster with the following parameters:

  • DBClusterIdentifier

    The name of the DB cluster to create.

  • DBSubnetGroupName

    The name of the DB subnet group to associate with this DB cluster.

  • Engine=aurora-postgresql

    The name of the engine to use.

  • ReplicationSourceIdentifier

    The Amazon Resource Name (ARN) for the source PostgreSQL DB instance. For more information about Amazon RDS ARNs, see Amazon Relational Database Service (Amazon RDS) in the Amazon Web Services General Reference.

  • VpcSecurityGroupIds

    The list of Amazon EC2 VPC security groups to associate with this DB cluster.

See an example with the RDS API operation CreateDBCluster.

If you use the console to create an Aurora read replica, then Amazon RDS automatically creates the primary instance for your Aurora read replica DB cluster. If you use the RDS API to create an Aurora read replica, you must explicitly create the primary instance for your DB cluster. The primary instance is the first instance that is created in a DB cluster.

You can create a primary instance for your DB cluster by using the RDS API operation CreateDBInstance with the following parameters:

  • DBClusterIdentifier

    The name of your DB cluster.

  • DBInstanceClass

    The name of the DB instance class to use for your primary instance.

  • DBInstanceIdentifier

    The name of your primary instance.

  • Engine=aurora-postgresql

    The name of the engine to use.

See an example with the RDS API operation CreateDBInstance.

Promoting an Aurora read replica

After migration completes, you can promote the Aurora read replica to a standalone DB cluster. You then direct your client applications to the endpoint for the Aurora read replica. For more information on the Aurora endpoints, see Amazon Aurora connection management. Promotion should complete fairly quickly. You can't delete the primary PostgreSQL DB instance or unlink the DB instance and the Aurora read replica until the promotion is complete.

Before you promote your Aurora read replica, stop any transactions from being written to the source RDS for PostgreSQL DB instance. Then wait for the replica lag on the Aurora read replica to reach zero. For more information, see Monitoring Aurora PostgreSQL replication and Monitoring Amazon Aurora metrics with Amazon CloudWatch.
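The wait for zero replica lag can be scripted as a polling loop. The following is a minimal sketch, assuming you supply your own command that prints the current replica lag in seconds (for example, a command that reads the RDSToAuroraPostgreSQLReplicaLag metric). The function name wait_for_zero_lag and its arguments are hypothetical.

```shell
# Hypothetical helper: run a lag-reporting command repeatedly until it prints
# zero, or give up after a fixed number of attempts.
# Arguments: maximum attempts, delay between attempts in seconds,
#            then the command (and its arguments) that prints the lag.
wait_for_zero_lag() {
  max_attempts=$1
  delay_seconds=$2
  shift 2
  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    lag=$("$@") || return 1
    # Accept 0, 0.0, and similar forms as zero lag.
    if printf '%s\n' "$lag" | grep -Eq '^0+(\.0+)?$'; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay_seconds"
  done
  return 1
}
```

The function returns success only when the command reports zero, so a promotion step can be chained after it and skipped if the lag never reaches zero.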

To promote an Aurora read replica to an Aurora DB cluster

  1. Sign in to the AWS Management Console and open the Amazon RDS console at https://console.aws.amazon.com/rds/.

  2. In the navigation pane, choose Databases.

  3. Choose the DB instance for the Aurora read replica.

  4. For Actions, choose Promote.

  5. Choose Promote read replica.

To promote an Aurora read replica to a stand-alone DB cluster, use the promote-read-replica-db-cluster AWS CLI command.

Example

For Linux, macOS, or Unix:

aws rds promote-read-replica-db-cluster \
    --db-cluster-identifier myreadreplicacluster

For Windows:

aws rds promote-read-replica-db-cluster ^
    --db-cluster-identifier myreadreplicacluster

To promote an Aurora read replica to a stand-alone DB cluster, use the RDS API operation PromoteReadReplicaDBCluster.

After you promote your read replica, confirm that the promotion has completed by using the following procedure.

To confirm that the Aurora read replica was promoted

  1. Sign in to the AWS Management Console and open the Amazon RDS console at https://console.aws.amazon.com/rds/.

  2. In the navigation pane, choose Events.

  3. On the Events page, verify that there is an event, Promoted read replica cluster to a stand-alone database cluster for the cluster that you promoted.
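The same check can be made from the AWS CLI with the describe-events command. The following is a minimal sketch, assuming the event message contains the text quoted in the step above. The function name promotion_event_logged is hypothetical.

```shell
# Hypothetical helper: succeeds when the DB cluster has logged the promotion
# event within the last 60 minutes.
# Argument: identifier of the promoted DB cluster.
promotion_event_logged() {
  cluster_id=$1
  aws rds describe-events \
    --source-identifier "$cluster_id" \
    --source-type db-cluster \
    --duration 60 \
    --query 'Events[].Message' \
    --output text |
    grep -qF 'Promoted read replica cluster to a stand-alone database cluster'
}
```

For example, promotion_event_logged myreadreplicacluster succeeds once the event appears for the cluster named myreadreplicacluster.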

After promotion is complete, the primary RDS for PostgreSQL DB instance and the Aurora read replica are unlinked. At this point, you can safely delete the DB instance if you want to.