How you import data into an Amazon RDS DB instance depends on the amount of data you have and the number and variety of database objects in your database. For example, you can use Oracle SQL Developer to import a simple, 20 MB database; you want to use Oracle Data Pump to import complex databases or databases that are several hundred megabytes or several terabytes in size.
Before you use any of these migration techniques, we recommend the best
practice of taking a backup of your database. You can back up your Amazon RDS
instances by creating snapshots. Later, you can restore the database from the
snapshots using the Restore from DB Snapshot or Restore to Point In Time options on
the RDS tab of the AWS Management Console. You can also use the Amazon RDS command
rds-restore-db-instance-to-point-in-time. These and other best
practices are addressed in this section.
For small databases, you can use Oracle SQL Developer, a graphical Java tool distributed without cost by Oracle. You can install this tool on your desktop computer (Windows, Linux, or Mac) or on one of your servers. Oracle SQL Developer provides options for migrating data between two Oracle databases, or for migrating data from other databases, such as MySQL, to Oracle. Oracle SQL Developer is best suited for migrating small databases. We recommend that you read the Oracle SQL Developer product documentation before you begin migrating your data.
After you install SQL Developer, you can use it to connect to your source and target databases. Use the Database Copy command on the Tools menu to copy your data to your Amazon RDS instance.
To download Oracle SQL Developer, go to http://www.oracle.com/technetwork/developer-tools/sql-developer.
Oracle also has documentation on how to migrate from other databases, including MySQL and SQL Server. To learn more, go to http://www.oracle.com/technetwork/database/migration.
Oracle Data Pump is a long-term replacement for the Oracle Export/Import utilities and is the preferred way to move large amounts of data from an Oracle installation to an Amazon RDS DB instance. You can use Oracle Data Pump for several scenarios:
Import data from an Amazon EC2 instance with an Oracle database to an Oracle DB instance
Import data from a database on an Oracle DB instance to another Oracle DB instance
Import data from a database on an Oracle DB instance in a VPC to another Oracle DB instance with or without a VPC
Import data from a local Oracle database to an Amazon RDS DB instance
The following process uses Oracle Data Pump and the DBMS_FILE_TRANSFER package. The process connects to an Oracle instance and exports data using Oracle Data Pump. It then uses the DBMS_FILE_TRANSFER.PUT_FILE method to copy the dump file from the Oracle instance to the DATA_PUMP_DIR on the target DB instance that is connected via a database link. The final step imports the data from the copied dump file into the RDS instance.
The process has the following requirements:
You must have execute privileges on the DBMS_FILE_TRANSFER package
The target DB instance must be version 126.96.36.199.v6 or later
You must have write privileges to the DATA_PUMP_DIR directory on the source DB instance
You must ensure that you have enough storage space to store the dump file on the source instance and the target DB instance
This process imports a dump file into the DATA_PUMP_DIR directory, a preconfigured directory on all Oracle DB instances. This directory is located on the same storage volume as your data files. When you import the dump file, the existing Oracle data files will use more space, so you should make sure that your DB instance can accommodate that additional use of space as well. Given that the current storage size limit of an Oracle DB instance is 3 TB, you can't use this process to upload any dump file larger than about 1.3 TB. Note that the imported dump file is not automatically deleted or purged from the DATA_PUMP_DIR directory. Use UTL_FILE.FREMOVE to remove the imported dump file.
The import process using Oracle Data Pump and the DBMS_FILE_TRANSFER package has the following steps:
Step 1: Grant privileges to user on source database
Step 2: Use DBMS_DATAPUMP to create a dump file
Step 3: Create a database link to the target DB instance
Step 4: Use DBMS_FILE_TRANSFER to copy the exported dump file to the Amazon RDS instance
Step 5: Import the dump file into a database on the Amazon RDS instance
Use SQL Plus or Oracle SQL Developer to connect to the Oracle instance that contains the data to be imported. If necessary, create a user account and grant the necessary permissions.
The following commands create a new user and grant the necessary permissions:
SQL> create user USER1 identified by test123; SQL> grant create session, create table to USER1; SQL> alter user USER1 quota 100M on users; SQL> grant read, write on directory data_pump_dir to USER1; SQL> grant execute on dbms_datapump to USER1;
You can use your own table, or you can create one to test the process. The following commands create a sample table for importing into a DB instance:
SQL> create table USER1.tab1 tablespace users as select 'USER1_'||object_name str_col, sysdate dt_col from all_objects;
Use SQL Plus or Oracle SQL Developer to connect to the Oracle instance and use the Oracle Data Pump utility to create a dump file. The following script creates a dump file named tab1.dmp in the DATA_PUMP_DIR directory.
DECLARE hdnl NUMBER; BEGIN hdnl := DBMS_DATAPUMP.open( operation => 'EXPORT', job_mode => 'SCHEMA', job_name=>null); DBMS_DATAPUMP.ADD_FILE( handle => hdnl, filename => 'tab1.dmp', directory => 'DATA_PUMP_DIR', filetype => dbms_datapump.ku$_file_type_dump_file); DBMS_DATAPUMP.add_file( handle => hdnl, filename => 'exp.log', directory => 'DATA_PUMP_DIR', filetype => dbms_datapump.ku$_file_type_log_file); DBMS_DATAPUMP.METADATA_FILTER(hdnl,'SCHEMA_EXPR','IN (''USER1'')'); DBMS_DATAPUMP.start_job(hdnl); END; /
Next, create a database link between your source instance and your target DB instance. Note that your local Oracle instance must have network connectivity to the DB instance in order to create a database link and to transfer your export file.
The following command creates a database link named to_rds to another user at the target DB instance database:
If you are creating a database link between two DB instances inside a VPC, the two DB instances must be either in the same VPC, be in VPCs that have an established VPC peering connection, or you must create an EC2 or VPC security group that both DB instances are a member of.
create database link to_rds connect to USER2 identified by user2pwd using '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST="<dns or ip address of remote db>)(PORT=<listener port>))(CONNECT_DATA=(SID=<remoteSID>)))';
Next, use DBMS_FILE_TRANSFER to copy the dump file from the source database instance to the target DB instance. The following script copies a dump file named tab1.dmp from the source instance to a target database link named to_rds (created in the previous step):
BEGIN DBMS_FILE_TRANSFER.PUT_FILE( source_directory_object => 'DATA_PUMP_DIR', source_file_name => 'tab1.dmp', destination_directory_object => 'DATA_PUMP_DIR', destination_file_name => 'tab1_copied.dmp', destination_database => 'to_rds' ); END; /
On the DB instance, use Oracle Data Pump to import the schema. The first part of the listing shows the format for the data import statement, and the second part shows importing a data file called tab1_copied.dmp.
impdp <username>@<TNS_ENTRY> DUMPFILE=user1copied.dmp DIRECTORY=DATA_PUMP_DIR full=y impdp copy1@copy1 DUMPFILE=tab1_copied.dmp DIRECTORY=DATA_PUMP_DIR full=y
You can verify the data import by viewing the table on the DB instance.
SQL> select count(*) from user1.tab1;
The Oracle Export/Import utilities are best suited for migrations where the data size is small and data types such as binary float and double are not required. The import process creates the schema objects so you do not need to run a script to create them beforehand, making this process well suited for databases with small tables. The following example demonstrates how these utilities can be used to export and import specific tables.
Export the tables from the source database using the command below. Substitute username/password as appropriate.
exp cust_dba@ORCL FILE=exp_file.dmp TABLES=(tab1,tab2,tab3) LOG=exp_file.log
The export process creates a binary dump file that contains both the schema and data for the specified tables. Now this schema and data can be imported into a target database using the command:
imp cust_dba@targetdb FROMUSER=cust_schema TOUSER=cust_schema \ TABLES=(tab1,tab2,tab3) FILE=exp_file.dmp LOG=imp_file.log
There are other variations of the Export and Import commands that might be better suited to your needs. See Oracle's documentation for full details.
Oracle SQL*Loader is well suited for large databases that have a limited number of objects in them. Since the process involved in exporting from a source database and loading to a target database is very specific to the schema, the following example creates the sample schema objects, exports from a source, and then loads it into a target database.
Create a sample source table using the command below.
create table customer_0 tablespace users as select rownum id, o.* from all_objects o, all_objects x where rownum <= 1000000;
On the target Amazon RDS instance, create a destination table that will be used to load the data.
create table customer_1 tablespace users as select 0 as id, owner, object_name, created from all_objects where 1=2;
The data will be exported from the source database to a flat file with delimiters. This example uses SQL*Plus for this purpose. For your data, you will likely need to generate a script that does the export for all the objects in the database.
alter session set nls_date_format = 'YYYY/MM/DD HH24:MI:SS'; set linesize 800 HEADING OFF FEEDBACK OFF array 5000 pagesize 0 spool customer_0.out SET MARKUP HTML PREFORMAT ON SET COLSEP ',' SELECT id, owner, object_name, created FROM customer_0; spool off
You need to create a control file to describe the data. Again, depending on your data, you will need to build a script that does this step.
cat << EOF > sqlldr_1.ctl load data infile customer_0.out into table customer_1 APPEND fields terminated by "," optionally enclosed by '"' ( id POSITION(01:10) INTEGER EXTERNAL, owner POSITION(12:41) CHAR, object_name POSITION(43:72) CHAR, created POSITION(74:92) date "YYYY/MM/DD HH24:MI:SS" )
If needed, copy the files generated by the preceding code to a staging area, such as an Amazon EC2 instance.
Finally, import the data using SQL*Loader with the appropriate username and password for the target database.
sqlldr cust_dba@targetdb control=sqlldr_1.ctl BINDSIZE=10485760 READSIZE=10485760 ROWS=1000
You can also make use of Oracle materialized view replication to migrate large datasets efficiently. Replication allows you to keep the target tables in sync with the source on an ongoing basis, so the actual cutover to Amazon RDS can be done later, if needed. The replication is set up using a database link from the Amazon RDS instance to the source database.
One requirement for materialized views is to allow access from the target database to the source database. In the following example, access rules were enabled on the source database to allow the Amazon RDS target database to connect to the source over SQLNet.
Create a user account on both source and Amazon RDS target instances that can authenticate with the same password.
create user dblink_user identified by password default tablespace users temporary tablespace temp; grant create session to dblink_user; grant select any table to dblink_user; grant select any dictionary to dblink_user;
Create a database link from the Amazon RDS target instance to the source instance using the newly created dblink_user.
create database link remote_site connect to dblink_user identified by password using '(description=(address=(protocol=tcp) (host=<myhost>) (port=<listener port>)) (connect_data=(sid=<sourcedb sid>)))';
Test the link:
select * from v$instance@remote_site;
Create a sample table with primary key and materialized view log on the source instance.
create table customer_0 tablespace users as select rownum id, o.* from all_objects o, all_objects x where rownum <= 1000000; alter table customer_0 add constraint pk_customer_0 primary key (id) using index; create materialized view log on customer_0;
On the target Amazon RDS instance, create a materialized view.
CREATE MATERIALIZED VIEW customer_0 BUILD IMMEDIATE REFRESH FAST AS SELECT * FROM cust_dba.customer_0@remote_site;