Oracle Data Pump and PostgreSQL pg_dump and pg_restore
With AWS DMS, you can migrate data from source databases to target databases using Oracle Data Pump and PostgreSQL pg_dump and pg_restore. Oracle Data Pump is a utility for transferring data between Oracle databases, while PostgreSQL pg_dump and pg_restore create and restore logical backups of a PostgreSQL database.
Feature compatibility | AWS SCT / AWS DMS automation level | AWS SCT action code index | Key differences |
---|---|---|---|
N/A | N/A | N/A | Non-compatible tool. |
Oracle usage
Oracle Data Pump is a utility for exporting and importing data from/to an Oracle database. It can be used to copy an entire database, entire schemas, or specific objects in a schema. Oracle Data Pump is commonly used as part of a backup strategy for restoring individual database objects (specific records, tables, views, stored procedures, and so on) as opposed to snapshots or Oracle RMAN, which provides backup and recovery capabilities at the database level. By default, the dump file generated by Oracle Data Pump is binary and can't be opened using a text editor (unlike the SQL script produced when the SQLFILE parameter is used during import).
Oracle Data Pump supports:
- Export data from an Oracle database: the Data Pump EXPDP command creates a binary dump file containing the exported database objects. Objects can be exported with data or with metadata only. Exports can be performed as of a specific timestamp or Oracle SCN to ensure cross-object consistency.
- Import data to an Oracle database: the Data Pump IMPDP command imports objects and data from a specific dump file created with the EXPDP command. The IMPDP command can filter objects on import (for example, import only certain objects) and remap object and schema names during import.
The term "logical backup" refers to a dump file created by Oracle Data Pump.
Both EXPDP and IMPDP can read and write dump files only from file system paths that were pre-configured in the Oracle database as directory objects. During export or import, users specify the logical directory name where the dump file should be created, not the actual file system path.
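For illustration, a minimal sketch of preparing such a directory object follows; the directory name and file system path are placeholders, and the GRANT is only needed for non-privileged users (creating directories requires the CREATE ANY DIRECTORY privilege).
-- Example only: directory name and path are placeholders.
CREATE OR REPLACE DIRECTORY expdp_dir AS '/u01/app/oracle/dumpfiles';
GRANT READ, WRITE ON DIRECTORY expdp_dir TO hr;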
Examples
Use EXPDP to export the HR schema.
$ expdp system/**** directory=expdp_dir schemas=hr dumpfile=hr.dmp logfile=hr.log
The command contains the credentials to run Data Pump, the logical Oracle directory name for the dump file location (which maps in the database to a physical file system location), the schema name to export, the dump file name, and log file name.
Use IMPDP to import the HR schema and rename it to HR_COPY.
$ impdp system/**** directory=expdp_dir schemas=hr dumpfile=hr.dmp logfile=hr.log REMAP_SCHEMA=hr:hr_copy
The command contains the database credentials to run Data Pump, the logical Oracle directory where the export dump file is located, the dump file name, the schema to import, the log file name, and the REMAP_SCHEMA parameter that renames the schema during import.
For more information, see Oracle Data Pump in the Oracle documentation.
PostgreSQL usage
PostgreSQL provides the native utilities pg_dump and pg_restore, which can be used to perform logical database exports and imports with functionality comparable to the Oracle Data Pump utility, such as moving data between two databases and creating logical database backups.
- pg_dump is the equivalent of Oracle expdp.
- pg_restore is the equivalent of Oracle impdp.
Amazon Aurora PostgreSQL supports data export and import using both pg_dump and pg_restore, but the binaries for both utilities must be placed on your local workstation or on an Amazon EC2 server as part of the PostgreSQL client binaries.
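For example, a client installation might look like the following sketch; the package names are assumptions and vary by distribution and PostgreSQL version.
# Amazon Linux 2023 (assumed package name for the PostgreSQL 15 client):
$ sudo dnf install postgresql15
# Ubuntu/Debian:
$ sudo apt-get install postgresql-client
# Verify that the client tools are available:
$ pg_dump --version
$ pg_restore --version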
PostgreSQL dump files created using pg_dump can be copied, after export, to an Amazon S3 bucket as cloud backup storage or to satisfy the desired backup retention policy. Later, when dump files are needed for a database restore, copy them back to a desktop or server that has a PostgreSQL client (such as your workstation or an Amazon EC2 server) and run the pg_restore command there.
Starting with PostgreSQL 10, the following capabilities were added (illustrated in the sketch after this list):
- A schema can be excluded in pg_dump or pg_restore commands.
- Dumps can be created without blobs (large objects).
- pg_dumpall can be run by non-superusers by using the --no-role-passwords option.
- An additional integrity option ensures that the data is written to disk using the fsync() method.
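A minimal sketch of these options; the host, database, schema, and file names are placeholders.
# Exclude a schema (-N) and skip large objects (--no-blobs).
$ pg_dump -h hostname.rds.amazonaws.com -U username -d db_name -N log_schema --no-blobs -f dump_no_logs.sql
# Dump roles without their passwords so a non-superuser can run pg_dumpall.
$ pg_dumpall -h hostname.rds.amazonaws.com -U username --no-role-passwords > cluster.sql
# pg_dump flushes its output to disk with fsync() by default; --no-sync disables this.
$ pg_dump -h hostname.rds.amazonaws.com -U username -d db_name --no-sync -f dump_fast.sql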
Starting with PostgreSQL 11, pg_dump and pg_restore can export and import relationships between extensions and database objects established with ALTER ... DEPENDS ON EXTENSION, which allows these objects to be dropped when the extension is dropped with the CASCADE option.
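For illustration, a hypothetical function can be marked as dependent on an extension like this; the function and extension names are examples only.
-- Dropping pgcrypto with CASCADE would then also drop mask_value().
CREATE EXTENSION IF NOT EXISTS pgcrypto;
CREATE FUNCTION mask_value(t TEXT) RETURNS TEXT
  LANGUAGE SQL AS $$ SELECT encode(digest(t, 'sha256'), 'hex') $$;
ALTER FUNCTION mask_value(TEXT) DEPENDS ON EXTENSION pgcrypto;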
Note
pg_dump creates consistent backups even if the database is being used concurrently; it doesn't block other users accessing the database (readers or writers). pg_dump only exports a single database. To back up global objects that are common to all databases in a cluster, such as roles and tablespaces, use pg_dumpall. PostgreSQL dump files can be either plain-text or custom-format files.
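For example, the cluster-wide objects might be captured separately with a command like the following sketch; the host and file names are placeholders.
# Dump only roles and tablespaces (global objects), not database contents.
$ pg_dumpall -h hostname.rds.amazonaws.com -U username --globals-only > globals.sql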
Another option for exporting and importing data from a PostgreSQL database is to use the COPY TO and COPY FROM commands. Starting with PostgreSQL 12, the COPY FROM command, which can be used to load data into a database, supports filtering incoming rows with a WHERE condition.
CREATE TABLE tst_copy(v TEXT);
COPY tst_copy FROM '/home/postgres/file.csv' WITH (FORMAT CSV) WHERE v LIKE '%apple%';
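COPY TO works in the other direction; a sketch with an example file path follows. The path must be writable by the database server process; from a client, the psql \copy command achieves the same result.
-- Server-side export of a query result to a CSV file (path is an example).
COPY (SELECT * FROM tst_copy WHERE v LIKE '%apple%') TO '/home/postgres/apples.csv' WITH (FORMAT CSV);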
Examples
Export data using pg_dump: use a workstation or server with the PostgreSQL client installed to connect to the Aurora PostgreSQL instance in AWS, providing the hostname (-h), database user name (-U), and database name (-d) when issuing the pg_dump command.
$ pg_dump -h hostname.rds.amazonaws.com -U username -d db_name -f dump_file_name.sql
The output file, dump_file_name.sql, is stored on the server where the pg_dump command runs. You can later copy the output file to an Amazon S3 bucket, if needed.
Run pg_dump and copy the backup file to an Amazon S3 bucket using a pipe and the AWS CLI.
$ pg_dump -h hostname.rds.amazonaws.com -U username -d db_name | aws s3 cp - s3://pg-backup/pg_bck-$(date "+%Y-%m-%d-%H-%M-%S")
Restore data with pg_restore: use a workstation or server with the PostgreSQL client installed to connect to the Aurora PostgreSQL instance, providing the hostname (-h), database user name (-U), database name (-d), and the dump file to restore from when issuing the pg_restore command.
$ pg_restore -h hostname.rds.amazonaws.com -U username -d dbname_restore dump_file_name.sql
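Note that pg_restore reads only non-plain-text archive formats; a plain SQL script such as the one produced earlier is restored with psql instead. A sketch of the custom-format round trip, with placeholder host, user, database, and file names:
# Create a custom-format archive that pg_restore can read.
$ pg_dump -F c -h hostname.rds.amazonaws.com -U username -d db_name -f dump_file_name.dmp
# Restore the archive into another database.
$ pg_restore -h hostname.rds.amazonaws.com -U username -d dbname_restore dump_file_name.dmp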
Copy the dump file from the local server to an Amazon S3 bucket using the AWS CLI.
$ aws s3 cp /usr/Exports/hr.dmp s3://my-bucket/backup-$(date "+%Y-%m-%d-%H-%M-%S")
The $(date "+%Y-%m-%d-%H-%M-%S") suffix is valid on Linux servers only.
Download the output file from the Amazon S3 bucket.
$ aws s3 cp s3://my-bucket/backup-2017-09-10-01-10-10 /usr/Exports/hr.dmp
You can create a copy of an existing database without having to use pg_dump or pg_restore. Instead, use the TEMPLATE keyword to specify the database used as the source.
CREATE DATABASE mydb_copy TEMPLATE mydb;
Summary
Description | Oracle Data Pump | PostgreSQL Dump |
---|---|---|
Export data to a local file | expdp system/*** schemas=hr dumpfile=hr.dmp logfile=hr.log | pg_dump -F c -h hostname.rds.amazonaws.com -U username -d hr -p 5432 > c:\Export\hr.dmp |
Export data to a remote file | Create an Oracle directory on a remote storage mount or NFS directory called EXP_DIR, then use the export command: expdp system/*** schemas=hr directory=EXP_DIR dumpfile=hr.dmp logfile=hr.log | Export: pg_dump -F c -h hostname.rds.amazonaws.com -U username -d hr -p 5432 > c:\Export\hr.dmp then upload to Amazon S3: aws s3 cp c:\Export\hr.dmp s3://my-bucket/backup-$(date "+%Y-%m-%d-%H-%M-%S") |
Import data to a new database with a new name | impdp system/*** schemas=hr dumpfile=hr.dmp logfile=hr.log REMAP_SCHEMA=hr:hr_copy TRANSFORM=OID:N | pg_restore -h hostname.rds.amazonaws.com -U hr -d hr_restore -p 5432 c:\Export\hr.dmp |
Exclude schemas | expdp system/*** FULL=Y directory=EXP_DIR dumpfile=hr.dmp logfile=hr.log exclude=SCHEMA:"HR" | pg_dump -F c -h hostname.rds.amazonaws.com -U username -d hr -p 5432 -N 'log_schema' > c:\Export\hr_nolog.dmp |
For more information, see SQL Dump in the PostgreSQL documentation.