Amazon Relational Database Service
User Guide (API Version 2013-09-09)

Importing Data Into Oracle on Amazon RDS

How you import data into an Amazon RDS DB instance depends on the amount of data you have and the number and variety of database objects in your database. For example, you can use Oracle SQL Developer to import a simple, 20 MB database, but you should use Oracle Data Pump to import complex databases, or databases that are several hundred megabytes or several terabytes in size.

Before you use any of these migration techniques, we recommend the best practice of taking a backup of your database. You can back up your Amazon RDS instances by creating snapshots. Later, you can restore the database from the snapshots using the Restore from DB Snapshot or Restore to Point In Time options on the RDS tab of the AWS Management Console. You can also use the Amazon RDS command line methods rds-restore-db-instance-from-db-snapshot or rds-restore-db-instance-to-point-in-time. These and other best practices are addressed in this section.
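For example, you might take a snapshot from the command line immediately before a migration and keep its identifier handy in case you need to restore. The following is a minimal sketch that assumes the Amazon RDS Command Line Toolkit is installed and configured; the instance and snapshot identifiers are placeholders, and the option syntax is an assumption, so confirm it against each command's built-in help before use.

# Take a snapshot of the DB instance before starting the migration
# (mydbinstance and pre-migration-snap are example identifiers).
rds-create-db-snapshot mydbinstance -s pre-migration-snap

# If needed later, restore a new DB instance from that snapshot.
rds-restore-db-instance-from-db-snapshot mydbinstance-restored -s pre-migration-snap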

Oracle SQL Developer

For small databases, you can use Oracle SQL Developer, a graphical Java tool distributed without cost by Oracle. You can install this tool on your desktop computer (Windows, Linux, or Mac) or on one of your servers. Oracle SQL Developer provides options for migrating data between two Oracle databases, or for migrating data from other databases, such as MySQL, to Oracle. Oracle SQL Developer is best suited for migrating small databases. We recommend that you read the Oracle SQL Developer product documentation before you begin migrating your data.

After you install SQL Developer, you can use it to connect to your source and target databases. Use the Database Copy command on the Tools menu to copy your data to your Amazon RDS instance.

To download Oracle SQL Developer, go to http://www.oracle.com/technetwork/developer-tools/sql-developer.

Oracle also has documentation on how to migrate from other databases, including MySQL and SQL Server. To learn more, go to http://www.oracle.com/technetwork/database/migration.

Oracle Data Pump

Oracle Data Pump is a long-term replacement for the Oracle Export/Import utilities and is the preferred way to move large amounts of data from an Oracle installation to an Amazon RDS DB instance. You can use Oracle Data Pump for several scenarios:

  • Import data from an Amazon EC2 instance with an Oracle database to an Oracle DB instance

  • Import data from a database on an Oracle DB instance to another Oracle DB instance

  • Import data from a database on an Oracle DB instance in a VPC to another Oracle DB instance, whether or not that instance is in a VPC

  • Import data from a local Oracle database to an Amazon RDS DB instance

The following process uses Oracle Data Pump and the DBMS_FILE_TRANSFER package. The process connects to a source Oracle instance and exports data using Oracle Data Pump. It then uses the DBMS_FILE_TRANSFER.PUT_FILE method to copy the dump file from the source instance, over a database link, to the DATA_PUMP_DIR directory on the target DB instance. The final step imports the data from the copied dump file into the Amazon RDS DB instance.

The process has the following requirements:

  • You must have execute privileges on the DBMS_FILE_TRANSFER package

  • The target DB instance must be version 11.2.0.2.v6 or later

  • You must have write privileges to the DATA_PUMP_DIR directory on the source DB instance

  • You must ensure that you have enough storage space to store the dump file on the source instance and the target DB instance

Note

This process imports a dump file into the DATA_PUMP_DIR directory, a preconfigured directory on all Oracle DB instances. This directory is located on the same storage volume as your data files. When you import the dump file, the existing Oracle data files will use more space, so you should make sure that your DB instance can accommodate that additional use of space as well. Given that the current storage size limit of an Oracle DB instance is 3 TB, you can't use this process to upload any dump file larger than about 1.3 TB. Note that the imported dump file is not automatically deleted or purged from the DATA_PUMP_DIR directory. Use UTL_FILE.FREMOVE to remove the imported dump file.
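For example, after your import completes, you can delete the copied dump file with a single call. This sketch assumes the file name tab1_copied.dmp used in the steps below:

exec UTL_FILE.FREMOVE('DATA_PUMP_DIR','tab1_copied.dmp');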

The import process using Oracle Data Pump and the DBMS_FILE_TRANSFER package has the following steps:

  • Step 1: Grant privileges to user on source database

  • Step 2: Use DBMS_DATAPUMP to create a dump file

  • Step 3: Create a database link to the target DB instance

  • Step 4: Use DBMS_FILE_TRANSFER to copy the exported dump file to the Amazon RDS instance

  • Step 5: Import the dump file into a database on the Amazon RDS instance

Step 1: Grant privileges to user on source database

Use SQL*Plus or Oracle SQL Developer to connect to the Oracle instance that contains the data to be imported. If necessary, create a user account and grant the necessary permissions.

The following commands create a new user and grant the necessary permissions:

SQL> create user USER1 identified by test123;
SQL> grant create session, create table to USER1;
SQL> alter user USER1 quota 100M on users;
SQL> grant read, write on directory data_pump_dir to USER1;
SQL> grant execute on dbms_datapump to USER1;
                            

You can use your own table, or you can create one to test the process. The following commands create a sample table for importing into a DB instance:

SQL> create table USER1.tab1
tablespace users
as select 'USER1_'||object_name str_col, sysdate dt_col from all_objects;                                
                            

Step 2: Use DBMS_DATAPUMP to create a dump file

Use SQL*Plus or Oracle SQL Developer to connect to the Oracle instance and use the Oracle Data Pump utility to create a dump file. The following script creates a dump file named tab1.dmp in the DATA_PUMP_DIR directory.

DECLARE
hdnl NUMBER;
BEGIN
hdnl := DBMS_DATAPUMP.open( operation => 'EXPORT', job_mode => 'SCHEMA', job_name=>null);
DBMS_DATAPUMP.ADD_FILE( handle => hdnl, filename => 'tab1.dmp', directory => 'DATA_PUMP_DIR', filetype => dbms_datapump.ku$_file_type_dump_file);
DBMS_DATAPUMP.add_file( handle => hdnl, filename => 'exp.log', directory => 'DATA_PUMP_DIR', filetype => dbms_datapump.ku$_file_type_log_file);
DBMS_DATAPUMP.METADATA_FILTER(hdnl,'SCHEMA_EXPR','IN (''USER1'')');
DBMS_DATAPUMP.start_job(hdnl);
END;
/    
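The export runs as an Oracle Data Pump job in the background. If you want to check its progress, one option is to query the Data Pump job views, for example:

SELECT owner_name, job_name, operation, job_mode, state
FROM dba_datapump_jobs;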
                         

Step 3: Create a database link to the target DB instance

Next, create a database link between your source instance and your target DB instance. The following command creates a database link named to_rds that connects as the user USER2 on the target DB instance:

Note

If you are creating a database link between two DB instances inside a VPC, the two DB instances must be either in the same VPC or you must create an EC2 or VPC security group that both DB instances are a member of.


create database link to_rds connect to USER2 identified by user2pwd
using '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=<dns or ip address of remote db>)(PORT=<listener port>))(CONNECT_DATA=(SID=<remoteSID>)))';
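Before copying the dump file, you can optionally confirm that the link resolves and authenticates by running a simple query across it:

select sysdate from dual@to_rds;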

                       

Step 4: Use DBMS_FILE_TRANSFER to copy the exported dump file to an Amazon RDS DB instance

Next, use DBMS_FILE_TRANSFER to copy the dump file from the source database instance to the target DB instance. The following script copies the dump file named tab1.dmp from the source instance, over the database link to_rds created in the previous step, to the target DB instance, where it is saved as tab1_copied.dmp:

BEGIN
  DBMS_FILE_TRANSFER.PUT_FILE(
    source_directory_object       => 'DATA_PUMP_DIR',
    source_file_name              => 'tab1.dmp',
    destination_directory_object  => 'DATA_PUMP_DIR',
    destination_file_name         => 'tab1_copied.dmp', 
    destination_database          => 'to_rds' 
  );
END;
/
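If you want to confirm that the copied file arrived, one option is to connect to the target DB instance and check the file's attributes. This is a sketch that assumes your account on the target DB instance can execute UTL_FILE:

DECLARE
  l_exists      BOOLEAN;
  l_file_length NUMBER;
  l_block_size  BINARY_INTEGER;
BEGIN
  -- Look for the copied dump file in DATA_PUMP_DIR on the target instance.
  UTL_FILE.FGETATTR('DATA_PUMP_DIR', 'tab1_copied.dmp', l_exists, l_file_length, l_block_size);
  IF l_exists THEN
    DBMS_OUTPUT.PUT_LINE('tab1_copied.dmp found: ' || l_file_length || ' bytes');
  ELSE
    DBMS_OUTPUT.PUT_LINE('tab1_copied.dmp not found');
  END IF;
END;
/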
                     

Step 5: Use Data Pump to import the data file on the DB instance

On the DB instance, use Oracle Data Pump to import the schema. The first part of the listing shows the format for the data import statement, and the second part shows importing a data file called tab1_copied.dmp.

impdp <username>@<TNS_ENTRY> DUMPFILE=<dump file name> DIRECTORY=DATA_PUMP_DIR full=y

impdp copy1@copy1 DUMPFILE=tab1_copied.dmp DIRECTORY=DATA_PUMP_DIR full=y  

You can verify the data import by viewing the table on the DB instance.

SQL> select count(*) from user1.tab1; 

Oracle Export/Import Utilities

The Oracle Export/Import utilities are best suited for migrations where the data size is small and data types such as binary float and double are not required. The import process creates the schema objects so you do not need to run a script to create them beforehand, making this process well suited for databases with small tables. The following example demonstrates how these utilities can be used to export and import specific tables.

Export the tables from the source database using the command below. Substitute username/password as appropriate.

exp cust_dba@ORCL FILE=exp_file.dmp TABLES=(tab1,tab2,tab3) LOG=exp_file.log

The export process creates a binary dump file that contains both the schema and data for the specified tables. This schema and data can then be imported into a target database using the following command:

imp cust_dba@targetdb FROMUSER=cust_schema TOUSER=cust_schema \  
TABLES=(tab1,tab2,tab3) FILE=exp_file.dmp LOG=imp_file.log
                

There are other variations of the Export and Import commands that might be better suited to your needs. See Oracle's documentation for full details.

Oracle SQL*Loader

Oracle SQL*Loader is well suited for large databases that have a limited number of objects in them. Because the export from a source database and the load into a target database are specific to the schema, the following example creates the sample schema objects, exports the data from a source database, and then loads the data into a target database.

  1. Create a sample source table using the command below.

                            
    create table customer_0 tablespace users as select rownum id, o.* from 
    all_objects o, all_objects x where rownum <= 1000000;
    
                        
  2. On the target Amazon RDS instance, create a destination table that will be used to load the data.

    create table customer_1 tablespace users as select 0 as id, owner, 
    object_name, created from all_objects where 1=2;
                        
  3. The data will be exported from the source database to a flat file with delimiters. This example uses SQL*Plus for this purpose. For your data, you will likely need to generate a script that does the export for all the objects in the database.

    alter session set nls_date_format = 'YYYY/MM/DD HH24:MI:SS';
    set linesize 800 HEADING OFF FEEDBACK OFF array 5000 pagesize 0
    spool customer_0.out
    SET MARKUP HTML PREFORMAT ON
    SET COLSEP ','
    SELECT id, owner, object_name, created FROM customer_0;
    spool off
                        
  4. You need to create a control file to describe the data. Again, depending on your data, you will need to build a script that does this step.

                            
    cat << EOF > sqlldr_1.ctl 
    load data
    infile customer_0.out
    into table customer_1
    APPEND
    fields terminated by "," optionally enclosed by '"'
    (
    id     	    	  POSITION(01:10)         INTEGER EXTERNAL,
    owner         	  POSITION(12:41)         CHAR,
    object_name   	  POSITION(43:72)         CHAR,
    created	          POSITION(74:92)         date "YYYY/MM/DD HH24:MI:SS"
    )
    EOF
                        

    If needed, copy the files generated by the preceding code to a staging area, such as an Amazon EC2 instance.

  5. Finally, import the data using SQL*Loader with the appropriate username and password for the target database.

    sqlldr cust_dba@targetdb control=sqlldr_1.ctl BINDSIZE=10485760 READSIZE=10485760 ROWS=1000
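    To confirm that the load completed, you can count the rows in the target table created in step 2:

    select count(*) from customer_1;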
                        

Oracle Materialized Views

You can also make use of Oracle materialized view replication to migrate large datasets efficiently. Replication allows you to keep the target tables in sync with the source on an ongoing basis, so the actual cutover to Amazon RDS can be done later, if needed. The replication is set up using a database link from the Amazon RDS instance to the source database.

One requirement for materialized views is to allow access from the target database to the source database. In the following example, access rules were enabled on the source database to allow the Amazon RDS target database to connect to the source over SQL*Net.

  1. Create a user account on both source and Amazon RDS target instances that can authenticate with the same password.

    create user dblink_user identified by password
      default tablespace users
      temporary tablespace temp;
    grant create session to dblink_user;
    grant select any table to dblink_user;
    grant select any dictionary to dblink_user;
                        
  2. Create a database link from the Amazon RDS target instance to the source instance using the newly created dblink_user.

    create database link remote_site
      connect to dblink_user identified by password
      using '(description=(address=(protocol=tcp)(host=<myhost>)(port=<listener port>))(connect_data=(sid=<sourcedb sid>)))';
                        
  3. Test the link:

    select * from v$instance@remote_site;
                        
  4. Create a sample table with primary key and materialized view log on the source instance.

                            
    create table customer_0 tablespace users as
      select rownum id, o.* from all_objects o, all_objects x where rownum <= 1000000;
    alter table customer_0 add constraint pk_customer_0 primary key (id) using index;
    create materialized view log on customer_0;
    
                        
  5. On the target Amazon RDS instance, create a materialized view.

    CREATE MATERIALIZED VIEW customer_0
      BUILD IMMEDIATE
      REFRESH FAST
      AS SELECT * FROM cust_dba.customer_0@remote_site;
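
  6. To pull later changes from the source into the Amazon RDS instance, you can refresh the materialized view and, if you like, verify the row count. The following sketch uses the standard DBMS_MVIEW.REFRESH call; the 'F' argument requests a fast refresh.

    exec DBMS_MVIEW.REFRESH('CUSTOMER_0', 'F');
    select count(*) from customer_0;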