Importing using Oracle Data Pump

Oracle Data Pump is a long-term replacement for the Oracle Export/Import utilities. Oracle Data Pump is the preferred way to move large amounts of data from an Oracle installation to an Amazon RDS DB instance. You can use Oracle Data Pump for several scenarios:

  • Import data from an Oracle database (either on premises or on an Amazon EC2 instance) to an Amazon RDS for Oracle DB instance.

  • Import data from an RDS for Oracle DB instance to an Oracle database (either on premises or on an Amazon EC2 instance).

  • Import data between RDS for Oracle DB instances (for example, to migrate data from EC2-Classic to VPC).

To download Oracle Data Pump utilities, see Oracle database software downloads on the Oracle Technology Network website.

For compatibility considerations when migrating between versions of Oracle Database, see the Oracle documentation.

When you import data with Oracle Data Pump, you must transfer the dump file that contains the data from the source database to the target database. You can transfer the dump file using an Amazon S3 bucket or by using a database link between the two databases.

When you use Oracle Data Pump to import data into an Oracle DB instance, we recommend the following best practices:

  • Perform imports in schema or table mode to import specific schemas and objects.

  • Limit the schemas you import to those required by your application.

  • Don't import in full mode.

    Because Amazon RDS for Oracle does not allow access to SYS or SYSDBA administrative users, importing in full mode, or importing schemas for Oracle-maintained components, might damage the Oracle data dictionary and affect the stability of your database.

  • When loading large amounts of data, do the following:

    1. Transfer the dump file to the target Amazon RDS for Oracle DB instance.

    2. Take a DB snapshot of your instance.

    3. Test the import to verify that it succeeds.

    If database components are invalidated, you can delete the DB instance and re-create it from the DB snapshot. The restored DB instance includes any dump files staged on the DB instance when you took the DB snapshot.

  • Don't import dump files that were created using the Oracle Data Pump export parameters TRANSPORT_TABLESPACES, TRANSPORTABLE, or TRANSPORT_FULL_CHECK. Amazon RDS for Oracle DB instances don't support importing these dump files.

  • Don't import dump files that contain Oracle Scheduler objects owned by SYS, SYSTEM, RDSADMIN, RDSSEC, or RDS_DATAGUARD and that belong to the following categories:

    • Jobs

    • Programs

    • Schedules

    • Chains

    • Rules

    • Evaluation contexts

    • Rule sets

    Amazon RDS for Oracle DB instances don't support importing these dump files.

Note

To exclude these unsupported Oracle Scheduler objects, use additional directives during the Data Pump export. If you use DBMS_DATAPUMP, add an additional METADATA_FILTER call before DBMS_DATAPUMP.START_JOB:

DBMS_DATAPUMP.METADATA_FILTER(
  v_hdnl,
  'EXCLUDE_NAME_EXPR',
  q'[IN (SELECT NAME FROM SYS.OBJ$
         WHERE TYPE# IN (66,67,74,79,59,62,46)
         AND OWNER# IN (
           SELECT USER# FROM SYS.USER$
           WHERE NAME IN ('RDSADMIN','SYS','SYSTEM','RDS_DATAGUARD','RDSSEC')))]',
  'PROCOBJ'
);

If you use expdp, create a parameter file that contains the exclude directive shown in the following example. Then use PARFILE=parameter_file with your expdp command.

exclude=procobj:"IN (SELECT NAME FROM sys.OBJ$ WHERE TYPE# IN (66,67,74,79,59,62,46) AND OWNER# IN (SELECT USER# FROM SYS.USER$ WHERE NAME IN ('RDSADMIN','SYS','SYSTEM','RDS_DATAGUARD','RDSSEC') ) )"

The examples in this section show one way to import data into an Oracle database. However, Oracle Data Pump permits many ways to import data. To learn more about Oracle Data Pump, see the Oracle Database documentation.

The examples in this section use the DBMS_DATAPUMP package. The same tasks can be accomplished by using the Oracle Data Pump command line utilities impdp and expdp. You can install these utilities on a remote host as part of an Oracle Client installation, including Oracle Instant Client.

Importing data with Oracle Data Pump and an Amazon S3 bucket

The following import process uses Oracle Data Pump and an Amazon S3 bucket. The process exports data on the source database using the Oracle DBMS_DATAPUMP package and puts the dump file in an Amazon S3 bucket. It then downloads the dump file from the Amazon S3 bucket to the DATA_PUMP_DIR directory on the target Amazon RDS for Oracle DB instance. The final step imports the data from the copied dump file into the Amazon RDS for Oracle DB instance using the package DBMS_DATAPUMP.

The process has the following requirements:

  • You must have an Amazon S3 bucket available for file transfers, and the Amazon S3 bucket must be in the same AWS Region as the DB instance. For instructions, see Create a bucket in the Amazon Simple Storage Service Getting Started Guide.

  • The object that you upload into the Amazon S3 bucket must be 5 TB or less. For more information about working with objects in Amazon S3, see Amazon Simple Storage Service User Guide.

    Note

    If your dump file exceeds 5 TB, you can run the Oracle Data Pump export with the parallel option. This operation spreads the data across multiple dump files so that no individual file exceeds the 5 TB limit. A sketch of a parallel export follows this list of requirements.

  • You must prepare the Amazon S3 bucket for Amazon RDS integration by following the instructions in Configuring IAM permissions for RDS for Oracle integration with Amazon S3.

  • You must ensure that you have enough storage space to store the dump file on the source instance and the target DB instance.
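
For example, the following sketch shows one way to run a parallel schema export that splits the dump into multiple files by using the %U substitution variable in the file name. The file names and the degree of parallelism are illustrative assumptions; size them for your schema and instance class.

-- A minimal parallel export sketch: %U expands to a counter, producing
-- sample_01.dmp, sample_02.dmp, and so on, so that no single file
-- exceeds the 5 TB Amazon S3 object limit.
DECLARE
  v_hdnl NUMBER;
BEGIN
  v_hdnl := DBMS_DATAPUMP.OPEN(
    operation => 'EXPORT',
    job_mode  => 'SCHEMA',
    job_name  => null
  );
  DBMS_DATAPUMP.ADD_FILE(
    handle    => v_hdnl,
    filename  => 'sample_%U.dmp',
    directory => 'DATA_PUMP_DIR',
    filetype  => dbms_datapump.ku$_file_type_dump_file
  );
  DBMS_DATAPUMP.METADATA_FILTER(v_hdnl, 'SCHEMA_EXPR', 'IN (''SCHEMA_1'')');
  -- The degree of parallelism (4) is an assumption; adjust it as needed.
  DBMS_DATAPUMP.SET_PARALLEL(v_hdnl, 4);
  DBMS_DATAPUMP.START_JOB(v_hdnl);
END;
/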

Note

This process imports a dump file into the DATA_PUMP_DIR directory, a preconfigured directory on all Oracle DB instances. This directory is located on the same storage volume as your data files. When you import the dump file, the existing Oracle data files use more space, so make sure that your DB instance can accommodate that additional use of space. The imported dump file is not automatically deleted or purged from the DATA_PUMP_DIR directory. To remove the imported dump file, use the UTL_FILE.FREMOVE procedure, which is documented on the Oracle website.

The import process using Oracle Data Pump and an Amazon S3 bucket has the following steps.

Step 1: Grant privileges to the user on the Amazon RDS target instance

To grant privileges to the user on the RDS target instance, take the following steps:

  1. Use SQL*Plus or Oracle SQL Developer to connect to the Amazon RDS target Oracle DB instance into which the data will be imported. Connect as the Amazon RDS master user. For information about connecting to the DB instance, see Connecting to your Oracle DB instance.

  2. Create the required tablespaces before you import the data. For more information, see Creating and sizing tablespaces.

  3. If the user account into which the data is imported doesn't exist, create the user account and grant the necessary permissions and roles. If you plan to import data into multiple user schemas, create each user account and grant the necessary privileges and roles to it.

    For example, the following commands create a new user and grant the necessary permissions and roles to import the data into the user's schema.

    CREATE USER schema_1 IDENTIFIED BY <password>;
    GRANT CREATE SESSION, RESOURCE TO schema_1;
    ALTER USER schema_1 QUOTA 100M ON users;

    This example grants the new user the CREATE SESSION privilege and the RESOURCE role. Additional privileges and roles might be required depending on the database objects that you import; an illustrative example follows the note below.

    Note

    Replace schema_1 with the name of your schema in this step and in the following steps.
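
    For example, if the dump file contains views, synonyms, or materialized views, you might grant privileges such as the following. This is a sketch only; the RESOURCE role already covers common object types such as tables, sequences, and procedures, so grant only what the imported objects actually require.

    -- Illustrative grants for object types not covered by the RESOURCE role.
    GRANT CREATE VIEW, CREATE SYNONYM, CREATE MATERIALIZED VIEW TO schema_1;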

Step 2: Use DBMS_DATAPUMP to create a dump file

Use SQL*Plus or Oracle SQL Developer to connect to the source Oracle instance with an administrative user. If the source database is an Amazon RDS for Oracle DB instance, connect with the Amazon RDS master user. Next, use the Oracle Data Pump utility to create a dump file.

The following script creates a dump file named sample.dmp in the DATA_PUMP_DIR directory that contains the SCHEMA_1 schema. Replace SCHEMA_1 with the name of the schema that you want to export.

DECLARE
  v_hdnl NUMBER;
BEGIN
  v_hdnl := DBMS_DATAPUMP.OPEN(
    operation => 'EXPORT',
    job_mode  => 'SCHEMA',
    job_name  => null
  );
  DBMS_DATAPUMP.ADD_FILE(
    handle    => v_hdnl,
    filename  => 'sample.dmp',
    directory => 'DATA_PUMP_DIR',
    filetype  => dbms_datapump.ku$_file_type_dump_file
  );
  DBMS_DATAPUMP.ADD_FILE(
    handle    => v_hdnl,
    filename  => 'sample_exp.log',
    directory => 'DATA_PUMP_DIR',
    filetype  => dbms_datapump.ku$_file_type_log_file
  );
  DBMS_DATAPUMP.METADATA_FILTER(v_hdnl, 'SCHEMA_EXPR', 'IN (''SCHEMA_1'')');
  DBMS_DATAPUMP.METADATA_FILTER(
    v_hdnl,
    'EXCLUDE_NAME_EXPR',
    q'[IN (SELECT NAME FROM SYS.OBJ$
           WHERE TYPE# IN (66,67,74,79,59,62,46)
           AND OWNER# IN (
             SELECT USER# FROM SYS.USER$
             WHERE NAME IN ('RDSADMIN','SYS','SYSTEM','RDS_DATAGUARD','RDSSEC')))]',
    'PROCOBJ'
  );
  DBMS_DATAPUMP.START_JOB(v_hdnl);
END;
/

Note

Data Pump jobs are started asynchronously. For information about monitoring a Data Pump job, see Monitoring job status in the Oracle documentation. You can view the contents of the export log by using the rdsadmin.rds_file_util.read_text_file procedure. For more information, see Reading files in a DB instance directory.
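
For example, the following queries show one way to check job status and, if the source database is an RDS for Oracle DB instance, read the export log. The log file name sample_exp.log matches the name used in the preceding script.

-- Check the state of your Data Pump jobs.
SELECT owner_name, job_name, operation, job_mode, state
FROM DBA_DATAPUMP_JOBS;

-- Read the export log written to DATA_PUMP_DIR.
SELECT text
FROM TABLE(rdsadmin.rds_file_util.read_text_file('DATA_PUMP_DIR', 'sample_exp.log'));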

Step 3: Upload the dump file to your Amazon S3 bucket

Upload the dump file to the Amazon S3 bucket.

Use the Amazon RDS procedure rdsadmin.rdsadmin_s3_tasks.upload_to_s3 to copy the dump file to the Amazon S3 bucket. The following example uploads all of the files from the DATA_PUMP_DIR directory to an Amazon S3 bucket named mys3bucket.

SELECT rdsadmin.rdsadmin_s3_tasks.upload_to_s3(
      p_bucket_name    => 'mys3bucket',
      p_directory_name => 'DATA_PUMP_DIR')
   AS TASK_ID FROM DUAL;

The SELECT statement returns the ID of the task in a VARCHAR2 data type.

For more information, see Uploading files from your RDS for Oracle DB instance to an Amazon S3 bucket.
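
You can check the progress of the upload task by reading its log file, which Amazon RDS writes to the BDUMP directory with the task ID in the file name. The following query is a sketch; replace <task_id> with the value returned by the SELECT statement.

SELECT text
FROM TABLE(rdsadmin.rds_file_util.read_text_file('BDUMP', 'dbtask-<task_id>.log'));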

Step 4: Copy the exported dump file from the Amazon S3 bucket to the target DB instance

Use SQL*Plus or Oracle SQL Developer to connect to the Amazon RDS target Oracle DB instance. Next, use the Amazon RDS procedure rdsadmin.rdsadmin_s3_tasks.download_from_s3 to copy the dump file from the Amazon S3 bucket to the target DB instance. The following example downloads all of the files from an Amazon S3 bucket named mys3bucket to the DATA_PUMP_DIR directory.

SELECT rdsadmin.rdsadmin_s3_tasks.download_from_s3(
      p_bucket_name    => 'mys3bucket',
      p_directory_name => 'DATA_PUMP_DIR')
   AS TASK_ID FROM DUAL;

The SELECT statement returns the ID of the task in a VARCHAR2 data type.

For more information, see Downloading files from an Amazon S3 bucket to an Oracle DB instance.
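
To confirm that the dump file arrived in the DATA_PUMP_DIR directory, you can list the directory contents.

SELECT * FROM TABLE(rdsadmin.rds_file_util.listdir('DATA_PUMP_DIR')) ORDER BY MTIME;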

Step 5: Use DBMS_DATAPUMP to import the data file on the target DB instance

Use Oracle Data Pump to import the schema into the DB instance. Additional options such as METADATA_REMAP might be required (see the sketch later in this step).

Connect to the DB instance with the Amazon RDS master user account to perform the import.

DECLARE
  v_hdnl NUMBER;
BEGIN
  v_hdnl := DBMS_DATAPUMP.OPEN(
    operation => 'IMPORT',
    job_mode  => 'SCHEMA',
    job_name  => null
  );
  DBMS_DATAPUMP.ADD_FILE(
    handle    => v_hdnl,
    filename  => 'sample_copied.dmp',
    directory => 'DATA_PUMP_DIR',
    filetype  => dbms_datapump.ku$_file_type_dump_file
  );
  DBMS_DATAPUMP.ADD_FILE(
    handle    => v_hdnl,
    filename  => 'sample_imp.log',
    directory => 'DATA_PUMP_DIR',
    filetype  => dbms_datapump.ku$_file_type_log_file
  );
  DBMS_DATAPUMP.METADATA_FILTER(v_hdnl, 'SCHEMA_EXPR', 'IN (''SCHEMA_1'')');
  DBMS_DATAPUMP.START_JOB(v_hdnl);
END;
/

Note

Data Pump jobs are started asynchronously. For information about monitoring a Data Pump job, see Monitoring job status in the Oracle documentation. You can view the contents of the import log by using the rdsadmin.rds_file_util.read_text_file procedure. For more information, see Reading files in a DB instance directory.
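
If the target schema name differs from the source schema name, one approach is to remap the schema during the import. The following fragment is a sketch: SCHEMA_2 is a placeholder target schema, and the call belongs inside the preceding PL/SQL block, before DBMS_DATAPUMP.START_JOB.

-- Remap objects exported from SCHEMA_1 so that they are imported into SCHEMA_2.
DBMS_DATAPUMP.METADATA_REMAP(v_hdnl, 'REMAP_SCHEMA', 'SCHEMA_1', 'SCHEMA_2');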

You can verify the data import by viewing the user's tables on the DB instance. For example, the following query returns the number of tables for SCHEMA_1.

SELECT COUNT(*) FROM DBA_TABLES WHERE OWNER='SCHEMA_1';
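
You can also check for objects that did not compile cleanly after the import. For example, the following query lists invalid objects in SCHEMA_1.

SELECT object_name, object_type
FROM DBA_OBJECTS
WHERE owner = 'SCHEMA_1' AND status = 'INVALID';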

Step 6: Clean up

After the data has been imported, you can delete the files that you don't want to keep. You can list the files in the DATA_PUMP_DIR using the following command.

SELECT * FROM TABLE(rdsadmin.rds_file_util.listdir('DATA_PUMP_DIR')) ORDER BY MTIME;

To delete files in the DATA_PUMP_DIR that you no longer require, use the following command.

EXEC UTL_FILE.FREMOVE('DATA_PUMP_DIR','<file name>');

For example, the following command deletes the file named "sample_copied.dmp".

EXEC UTL_FILE.FREMOVE('DATA_PUMP_DIR','sample_copied.dmp');

Importing data with Oracle Data Pump and a database link

The following import process uses Oracle Data Pump and the Oracle DBMS_FILE_TRANSFER package. The process connects to a source Oracle instance, which can be an on-premises instance, an Amazon EC2 instance, or an Amazon RDS for Oracle DB instance. The process then exports data using the DBMS_DATAPUMP package. Next, it uses the DBMS_FILE_TRANSFER.PUT_FILE method to copy the dump file from the Oracle instance to the DATA_PUMP_DIR directory on the target Amazon RDS for Oracle DB instance, which is connected by using a database link. The final step imports the data from the copied dump file into the Amazon RDS for Oracle DB instance using the DBMS_DATAPUMP package.

The process has the following requirements:

  • You must have execute privileges on the DBMS_FILE_TRANSFER and DBMS_DATAPUMP packages.

  • You must have write privileges to the DATA_PUMP_DIR directory on the source DB instance.

  • You must ensure that you have enough storage space to store the dump file on the source instance and the target DB instance.

Note

This process imports a dump file into the DATA_PUMP_DIR directory, a preconfigured directory on all Oracle DB instances. This directory is located on the same storage volume as your data files. When you import the dump file, the existing Oracle data files use more space, so make sure that your DB instance can accommodate that additional use of space. The imported dump file is not automatically deleted or purged from the DATA_PUMP_DIR directory. To remove the imported dump file, use the UTL_FILE.FREMOVE procedure, which is documented on the Oracle website.

The import process using Oracle Data Pump and the DBMS_FILE_TRANSFER package has the following steps.

Step 1: Grant privileges to the user on the Amazon RDS target instance

To grant privileges to the user on the RDS target instance, take the following steps:

  1. Use SQL*Plus or Oracle SQL Developer to connect to the Amazon RDS target Oracle DB instance into which the data will be imported. Connect as the Amazon RDS master user. For information about connecting to the DB instance, see Connecting to your Oracle DB instance.

  2. Create the required tablespaces before you import the data. For more information, see Creating and sizing tablespaces.

  3. If the user account into which the data is imported doesn't exist, create the user account and grant the necessary permissions and roles. If you plan to import data into multiple user schemas, create each user account and grant the necessary privileges and roles to it.

    For example, the following commands create a new user and grant the necessary permissions and roles to import the data into the user's schema.

    CREATE USER schema_1 IDENTIFIED BY <password>;
    GRANT CREATE SESSION, RESOURCE TO schema_1;
    ALTER USER schema_1 QUOTA 100M ON users;

    This example grants the new user the CREATE SESSION privilege and the RESOURCE role. Additional privileges and roles might be required depending on the database objects that you import.

    Note

    Replace schema_1 with the name of your schema in this step and in the following steps.

Step 2: Grant privileges to the user on the source database

Use SQL*Plus or Oracle SQL Developer to connect to the Oracle instance that contains the data to be imported. If necessary, create a user account and grant the necessary permissions.

Note

If the source database is an Amazon RDS instance, you can skip this step. You use your Amazon RDS master user account to perform the export.

The following commands create a new user and grant the necessary permissions.

CREATE USER export_user IDENTIFIED BY <password>;
GRANT CREATE SESSION, CREATE TABLE, CREATE DATABASE LINK TO export_user;
ALTER USER export_user QUOTA 100M ON users;
GRANT READ, WRITE ON DIRECTORY data_pump_dir TO export_user;
GRANT SELECT_CATALOG_ROLE TO export_user;
GRANT EXECUTE ON DBMS_DATAPUMP TO export_user;
GRANT EXECUTE ON DBMS_FILE_TRANSFER TO export_user;

Step 3: Use DBMS_DATAPUMP to create a dump file

Use SQL*Plus or Oracle SQL Developer to connect to the source Oracle instance with an administrative user or with the user that you created in step 2. If the source database is an Amazon RDS for Oracle DB instance, connect with the Amazon RDS master user. Next, use the Oracle Data Pump utility to create a dump file.

The following script creates a dump file named sample.dmp in the DATA_PUMP_DIR directory that contains the SCHEMA_1 schema. Replace SCHEMA_1 with the name of the schema that you want to export.

DECLARE
  v_hdnl NUMBER;
BEGIN
  v_hdnl := DBMS_DATAPUMP.OPEN(
    operation => 'EXPORT',
    job_mode  => 'SCHEMA',
    job_name  => null
  );
  DBMS_DATAPUMP.ADD_FILE(
    handle    => v_hdnl,
    filename  => 'sample.dmp',
    directory => 'DATA_PUMP_DIR',
    filetype  => dbms_datapump.ku$_file_type_dump_file
  );
  DBMS_DATAPUMP.ADD_FILE(
    handle    => v_hdnl,
    filename  => 'sample_exp.log',
    directory => 'DATA_PUMP_DIR',
    filetype  => dbms_datapump.ku$_file_type_log_file
  );
  DBMS_DATAPUMP.METADATA_FILTER(v_hdnl, 'SCHEMA_EXPR', 'IN (''SCHEMA_1'')');
  DBMS_DATAPUMP.METADATA_FILTER(
    v_hdnl,
    'EXCLUDE_NAME_EXPR',
    q'[IN (SELECT NAME FROM SYS.OBJ$
           WHERE TYPE# IN (66,67,74,79,59,62,46)
           AND OWNER# IN (
             SELECT USER# FROM SYS.USER$
             WHERE NAME IN ('RDSADMIN','SYS','SYSTEM','RDS_DATAGUARD','RDSSEC')))]',
    'PROCOBJ'
  );
  DBMS_DATAPUMP.START_JOB(v_hdnl);
END;
/

Note

Data Pump jobs are started asynchronously. For information about monitoring a Data Pump job, see Monitoring job status in the Oracle documentation. You can view the contents of the export log by using the rdsadmin.rds_file_util.read_text_file procedure. For more information, see Reading files in a DB instance directory.

Step 4: Create a database link to the target DB instance

Create a database link between your source instance and your target DB instance. Your local Oracle instance must have network connectivity to the DB instance in order to create a database link and to transfer your export dump file.

Perform this step while connected as the same user account that you used in the previous step.

If you are creating a database link between two DB instances inside the same VPC or peered VPCs, the two DB instances should have a valid route between them. The security group of each DB instance must allow ingress to and egress from the other DB instance. The security group inbound and outbound rules can refer to security groups from the same VPC or a peered VPC. For more information, see Adjusting database links for use with DB instances in a VPC.

The following command creates a database link named to_rds that connects to the Amazon RDS master user at the target DB instance.

CREATE DATABASE LINK to_rds
  CONNECT TO <master_user_account> IDENTIFIED BY <password>
  USING '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=<dns or ip address of remote db>)(PORT=<listener port>))(CONNECT_DATA=(SID=<remote SID>)))';
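
To verify that the database link works before transferring the dump file, you can run a simple query through it.

SELECT 1 FROM DUAL@to_rds;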

Step 5: Use DBMS_FILE_TRANSFER to copy the exported dump file to the target DB instance

Use DBMS_FILE_TRANSFER to copy the dump file from the source database instance to the target DB instance. The following script copies a dump file named sample.dmp from the source instance to the target DB instance by using the database link named to_rds (created in the previous step).

BEGIN
  DBMS_FILE_TRANSFER.PUT_FILE(
    source_directory_object      => 'DATA_PUMP_DIR',
    source_file_name             => 'sample.dmp',
    destination_directory_object => 'DATA_PUMP_DIR',
    destination_file_name        => 'sample_copied.dmp',
    destination_database         => 'to_rds'
  );
END;
/

Step 6: Use DBMS_DATAPUMP to import the data file to the target DB instance

Use Oracle Data Pump to import the schema into the DB instance. Additional options such as METADATA_REMAP might be required.

Connect to the DB instance with the Amazon RDS master user account to perform the import.

DECLARE
  v_hdnl NUMBER;
BEGIN
  v_hdnl := DBMS_DATAPUMP.OPEN(
    operation => 'IMPORT',
    job_mode  => 'SCHEMA',
    job_name  => null
  );
  DBMS_DATAPUMP.ADD_FILE(
    handle    => v_hdnl,
    filename  => 'sample_copied.dmp',
    directory => 'DATA_PUMP_DIR',
    filetype  => dbms_datapump.ku$_file_type_dump_file
  );
  DBMS_DATAPUMP.ADD_FILE(
    handle    => v_hdnl,
    filename  => 'sample_imp.log',
    directory => 'DATA_PUMP_DIR',
    filetype  => dbms_datapump.ku$_file_type_log_file
  );
  DBMS_DATAPUMP.METADATA_FILTER(v_hdnl, 'SCHEMA_EXPR', 'IN (''SCHEMA_1'')');
  DBMS_DATAPUMP.START_JOB(v_hdnl);
END;
/

Note

Data Pump jobs are started asynchronously. For information about monitoring a Data Pump job, see Monitoring job status in the Oracle documentation. You can view the contents of the import log by using the rdsadmin.rds_file_util.read_text_file procedure. For more information, see Reading files in a DB instance directory.

You can verify the data import by viewing the user's tables on the DB instance. For example, the following query returns the number of tables for SCHEMA_1.

SELECT COUNT(*) FROM DBA_TABLES WHERE OWNER='SCHEMA_1';

Step 7: Clean up

After the data has been imported, you can delete the files that you don't want to keep. You can list the files in DATA_PUMP_DIR using the following command.

SELECT * FROM TABLE(rdsadmin.rds_file_util.listdir('DATA_PUMP_DIR')) ORDER BY MTIME;

To delete files in DATA_PUMP_DIR that you no longer require, use the following command.

EXEC UTL_FILE.FREMOVE('DATA_PUMP_DIR','<file name>');

For example, the following command deletes the file named "sample_copied.dmp".

EXEC UTL_FILE.FREMOVE('DATA_PUMP_DIR','sample_copied.dmp');