Importing Data into PostgreSQL on Amazon RDS
If you have an existing PostgreSQL deployment that you want to move to Amazon RDS, the complexity of your task depends on the size of your database and the types of database objects that you are transferring. For example, a database that contains data sets on the order of gigabytes, along with stored procedures and triggers, is going to be more complicated than a simple database with only a few megabytes of test data and no triggers or stored procedures.
We recommend that you use native PostgreSQL database migration tools under the following conditions:
You have a homogeneous migration, where you are migrating from a database with the same database engine as the target database.
You are migrating an entire database.
The native tools allow you to migrate your system with minimal downtime.
In most other cases, performing a database migration using AWS Database Migration Service (AWS DMS) is the best approach. AWS DMS can migrate databases without downtime and, for many database engines, continue ongoing replication until you are ready to switch over to the target database. You can migrate to either the same database engine or a different database engine using AWS DMS. If you are migrating to a different database engine than your source database, you can use the AWS Schema Conversion Tool to migrate schema objects that are not migrated by AWS DMS. For more information about AWS DMS, see What is AWS Database Migration Service.
Modify your DB instance settings to the following:
Disable DB instance backups (set backup_retention to 0)
Disable Multi-AZ
Modify your DB parameter group to include the following settings. You should use these settings only when importing data. Test the parameter settings to find the most efficient values for your DB instance size, and revert these parameters to production values after your import completes:
|Parameter|Recommended Value When Importing|Description|
|---|---|---|
|maintenance_work_mem|524288, 1048576, 2097152, or 4194304 (in KB). These settings are comparable to 512 MB, 1 GB, 2 GB, and 4 GB.|The value for this setting depends on the size of your host. This parameter is used during CREATE INDEX statements, and each parallel command can use this much memory. Calculate the best value so that you don't set this value so high that you run out of memory.|
|checkpoint_segments|256|The value for this setting consumes more disk space, but gives you less contention on your WAL logs.|
|checkpoint_timeout|1800 (in seconds)|The value for this setting allows for less frequent WAL rotation.|
|synchronous_commit|Off|Disable this setting to speed up writes. Turning this parameter off can increase the risk of data loss in the event of a server crash (do not turn off FSYNC).|
|wal_buffers|8192|This value is in 8 KB units. This again helps your WAL generation speed.|
|autovacuum|Off|Disable the PostgreSQL autovacuum parameter while you are loading data so that it doesn't use resources.|
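As one way to apply import-time settings like these, you can modify a custom DB parameter group with the AWS CLI. This is a sketch only: the parameter group name `mypg-import-params` and the values shown are illustrative assumptions, so substitute your own values, and run equivalent commands to restore production values after the import completes.

```shell
# Sketch: apply import-only settings to a hypothetical custom DB parameter group.
# Adjust the values for your DB instance size; revert them after the import.
aws rds modify-db-parameter-group \
    --db-parameter-group-name mypg-import-params \
    --parameters "ParameterName=maintenance_work_mem,ParameterValue=1048576,ApplyMethod=immediate" \
                 "ParameterName=synchronous_commit,ParameterValue=off,ApplyMethod=immediate" \
                 "ParameterName=autovacuum,ParameterValue=0,ApplyMethod=immediate"
```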
Use the pg_dump -Fc (compressed) or pg_restore -j (parallel) commands with these settings.
The PostgreSQL command pg_dumpall requires super_user permissions that are not granted when you create a DB instance, so it cannot be used for importing data.
Importing a PostgreSQL Database from an Amazon EC2 Instance
If you have data in a PostgreSQL server on an Amazon EC2 instance and want to move it to a PostgreSQL DB instance, you can use the following process. Each step is discussed in more detail in the following sections.
Create a file using pg_dump that contains the data to be loaded
Create the target DB instance
Use psql to create the database on the DB instance and load the data
Create a DB snapshot of the DB instance
Step 1: Create a file using pg_dump that contains the data to be loaded
The pg_dump utility uses the COPY command to create a schema and data dump of a PostgreSQL database. The dump script generated by pg_dump loads data into a database with the same name and recreates the tables, indexes, and foreign keys. You can use the pg_restore command and the -d parameter to restore the data to a database with a different name.
Before you create the data dump, you should query the tables to be dumped to get a row count so you can confirm the count on the target DB instance.
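One way to capture per-table row counts before the dump is from the psql prompt on the source database. The query below is a sketch: pg_stat_user_tables reports estimates, so run SELECT count(*) against any table where you need an exact figure.

```sql
-- Approximate per-table row counts on the source database (run in psql).
-- n_live_tup is an estimate; use SELECT count(*) for exact verification.
SELECT schemaname, relname, n_live_tup AS approx_rows
  FROM pg_stat_user_tables
 ORDER BY relname;
```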
The following command creates a dump file called mydb2dump.sql for a database called mydb2.
prompt>pg_dump dbname=mydb2 -f mydb2dump.sql
Step 2: Create the target DB instance
Create the target PostgreSQL DB instance using either the Amazon RDS console, AWS CLI, or API. Create the instance with the backup retention setting set to 0 and disable Multi-AZ; doing so allows faster data import. You must create a database on the instance before you can dump the data. The database can have the same name as the database that contains the dumped data, or you can create a database with a different name and use the pg_restore command with the -d parameter to restore the data into the newly named database.
For example, the following commands can be used to dump, restore, and rename a database:
pg_dump -Fc -v -h [endpoint of instance] -U [master username] [database] > [database].dump
createdb [new database name]
pg_restore -v -h [endpoint of instance] -U [master username] -d [new database name] [database].dump
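If you use the AWS CLI to create the target instance for this step, backups and Multi-AZ can be disabled up front. The identifiers, instance class, and storage size below are illustrative assumptions, not recommendations:

```shell
# Sketch: create the target instance with backups and Multi-AZ disabled for faster import.
# Instance identifier, class, and storage are hypothetical; substitute your own.
aws rds create-db-instance \
    --db-instance-identifier mypginstance \
    --engine postgres \
    --db-instance-class db.m5.large \
    --allocated-storage 100 \
    --master-username [master username] \
    --master-user-password [master password] \
    --backup-retention-period 0 \
    --no-multi-az
```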
Step 3: Use psql to create the database on the DB instance and load the data
You can use the same connection you used to execute the pg_dump command to connect to the target DB instance and recreate the database. Using psql, you can use the master user name and master password to create the database on the DB instance.
The following example uses psql and a dump file named mydb2dump.sql to create a database called mydb2 on a PostgreSQL DB instance called mypginstance:
For Linux, OS X, or Unix:
psql \
    -f mydb2dump.sql \
    --host [endpoint of mypginstance] \
    --port [port] \
    --username [master username] \
    --dbname mydb2
For Windows:
psql ^
    -f mydb2dump.sql ^
    --host [endpoint of mypginstance] ^
    --port [port] ^
    --username [master username] ^
    --dbname mydb2
Step 4: Create a DB snapshot of the DB instance
Once you have verified that the data was loaded into your DB instance, we recommend that you create a DB snapshot of the target PostgreSQL DB instance. DB snapshots are complete backups of your DB instance that can be used to restore your DB instance to a known state. A DB snapshot taken immediately after the load protects you from having to load the data again in case of a mishap and can also be used to seed new database instances. For information about creating a DB snapshot, see Creating a DB Snapshot.
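With the AWS CLI, taking the snapshot and waiting for it to become available might look like the following sketch (the instance and snapshot identifiers are illustrative):

```shell
# Sketch: snapshot the instance after the import, then wait until it is available.
# "mypginstance" and the snapshot identifier are hypothetical names.
aws rds create-db-snapshot \
    --db-instance-identifier mypginstance \
    --db-snapshot-identifier mypginstance-after-import
aws rds wait db-snapshot-available \
    --db-snapshot-identifier mypginstance-after-import
```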
Using the \copy Command to Import Data to a Table on a PostgreSQL DB Instance
You can run the \copy command from the psql prompt to import data into a table on a PostgreSQL DB instance. The table must already exist on the DB instance. For more information on the \copy command, see the psql documentation.
The \copy command does not provide confirmation of actions, such as a count of rows inserted. PostgreSQL does provide error messages if the copy command fails due to an error.
Create a .csv file from the data in the source table, log on to the target database on the PostgreSQL instance using psql, and then run the following command. This example uses source-table as the source table name, source-table.csv as the .csv file, and target-db as the target database:
target-db=> \copy source-table from 'source-table.csv' with DELIMITER ',';
You can also run the following command from your client computer command prompt. This example uses source-table as the source table name, source-table.csv as the .csv file, and target-db as the target database:
For Linux, OS X, or Unix:
psql target-db \
    -U <admin user> \
    -p <port> \
    -h <DB instance name> \
    -c "\copy source-table from 'source-table.csv' with DELIMITER ','"
For Windows:
psql target-db ^
    -U <admin user> ^
    -p <port> ^
    -h <DB instance name> ^
    -c "\copy source-table from 'source-table.csv' with DELIMITER ','"
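The .csv file must match the column order and delimiter that your \copy command expects. As a minimal sketch (the file name and sample rows are invented for illustration), you can build and sanity-check such a file from the shell before loading it:

```shell
# Create a small comma-delimited file matching a hypothetical two-column table,
# then print the row count to confirm it before loading with \copy.
printf '1,alpha\n2,beta\n3,gamma\n' > source-table.csv
wc -l < source-table.csv
```

Comparing this count with the number of rows inserted on the target is a simple way to confirm the load, since \copy itself reports no row count.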