Native PostgreSQL client applications - AWS Prescriptive Guidance

Native PostgreSQL client applications

You can use native PostgreSQL client applications for an offline migration, including pg_dump, pg_dumpall, and pg_restore. The pg_dump utility is for backing up PostgreSQL databases. You can use pg_dump to make consistent backups and back up a single database. To back up global objects that are common to all databases in a cluster (such as roles and tablespaces), use pg_dumpall. We recommend that you run the pg_dump utility as a superuser because only a superuser can perform a complete dump. The pg_dump utility also offers options to optimize the backup process, including using the jobs option (for running the dump in parallel) and the directory format when you take the backup. For more information, see the Options section of the pg_dump page in the PostgreSQL documentation.

The pg_restore utility is for restoring a PostgreSQL database from an archive created by pg_dump. You can run the pg_restore utility in parallel mode if you use pg_dump in custom or directory format. To increase the performance of pg_restore, you can tune the following parameters:

  • Set shared_buffers to align with your requirements and increase maintenance_work_mem to speed up index creation.

  • Turn off extensive logging, the autovacuum daemon, and full_page_writes.

  • (Optionally) Back up and restore the schema and use a data-only dump and restore. If you use this method, you can use the disable-triggers option to improve performance.

You can run pg_dump and pg_restore from your target EC2 instance. We recommend that you use AWS Direct Connect for better speed and performance. Finally, make sure that you have sufficient storage for the backup file.

Architecture

The following diagram shows the architecture for migrating an on-premises PostgreSQL database to the AWS Cloud by using native PostgreSQL client applications.

Native PostgreSQL client application architecture

The diagram shows the following workflow:

  1. Take backups by using pg_dump.

  2. Create an EC2 instance and install PostgreSQL on the instance.

  3. Copy the dump file to Amazon EC2.

  4. Restore the backup file by using pg_restore.

Limitations

We recommend that you consider the following limitations of using native PostgreSQL client applications before starting your migration:

  • A dump and restore can be time-consuming for larger databases (that is, databases greater than 300 GB).

  • This option requires complete database downtime.

  • A significant amount of diskspace is required on the server to take backups and then copy the backups to Amazon EC2.