Planning your database migration process
The previous section discussed some of the key considerations to take into account while migrating databases to Amazon Aurora. Once you have determined that Aurora is the right fit for your application, the next step is to decide on a preliminary migration approach and create a database migration plan.
Homogeneous migration
If your source database is a MySQL 5.6 or 5.7 compliant database (MySQL, MariaDB, Percona, and so on), then migration to Aurora is quite straightforward.
Homogeneous migration with downtime
If your application can accommodate a predictable length of downtime during off-peak hours, migration with downtime is the simplest option and is a highly recommended approach. Most database migration projects fall into this category, as most applications already have a well-defined maintenance window. You have the following options to migrate your database with downtime.
- RDS snapshot migration — If your source database is running on Amazon RDS MySQL 5.6 or 5.7, you can simply migrate a snapshot of that database to Amazon Aurora. For migrations with downtime, you either have to stop your application or stop writing to the database while the snapshot and migration are in progress. The time to migrate depends primarily on the size of the database and can be determined ahead of the production migration by running a test migration. The snapshot migration option is explained in the RDS snapshot migration section of this document; a minimal AWS CLI sketch also follows this list.
- Migration using native MySQL tools — You can use native MySQL tools to migrate your data and schema to Aurora. This is a great option when you need more control over the database migration process, are more comfortable using native MySQL tools, and other migration methods are not performing well for your use case. You can create a dump of your data using the mysqldump utility, and then import that data into an existing Amazon Aurora MySQL DB cluster (a mysqldump sketch follows this list). For more information, refer to Migrating from MySQL to Amazon Aurora by using mysqldump. Alternatively, you can copy the full and incremental backup files from your database to an Amazon S3 bucket, and then restore an Amazon Aurora MySQL DB cluster from those files. This option can be considerably faster than migrating data using mysqldump. For more information, refer to Migrating data from MySQL by using an Amazon S3 bucket.
- Migration using AWS Database Migration Service (AWS DMS) — A one-time migration using AWS DMS is another way to move your source database to Amazon Aurora. Before you can use AWS DMS to move the data, you need to copy the database schema from source to target using native MySQL tools. For the step-by-step process, refer to the Migrating data section of this document. AWS DMS is a great option when you don't have experience using native MySQL tools.
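As a point of reference, a minimal AWS CLI sketch of the snapshot migration path might look like the following. The instance, cluster, and snapshot identifiers are placeholders, and you should match the engine version and instance class to your environment.

```bash
# Snapshot the source RDS MySQL instance after stopping writes.
aws rds create-db-snapshot \
    --db-instance-identifier source-mysql \
    --db-snapshot-identifier source-mysql-final-snap

# Wait for the snapshot to become available.
aws rds wait db-snapshot-available \
    --db-snapshot-identifier source-mysql-final-snap

# Restore the snapshot into a new Aurora MySQL cluster.
aws rds restore-db-cluster-from-snapshot \
    --db-cluster-identifier aurora-target \
    --snapshot-identifier source-mysql-final-snap \
    --engine aurora-mysql

# Add a DB instance to the cluster so it can accept connections.
aws rds create-db-instance \
    --db-instance-identifier aurora-target-instance-1 \
    --db-cluster-identifier aurora-target \
    --engine aurora-mysql \
    --db-instance-class db.r5.large
```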
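Similarly, a migration with native tools can be as simple as piping a logical dump into the Aurora cluster endpoint. The hostnames and database name below are hypothetical. The --single-transaction flag gives a consistent dump of InnoDB tables without locking them, but for a true migration with downtime you should still stop application writes first.

```bash
# Dump schema and data from the source database.
mysqldump -h source-host -u admin -p \
    --single-transaction --routines --triggers \
    --databases mydb > mydb.sql

# Load the dump into the Aurora MySQL cluster endpoint.
mysql -h aurora-target.cluster-xxxx.us-east-1.rds.amazonaws.com \
    -u admin -p < mydb.sql
```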
Homogeneous migration with near-zero downtime
In some scenarios you might want to migrate your database to Aurora with minimal downtime. Here are two examples:
- When your database is relatively large and the migration time using downtime options is longer than your application maintenance window.
- When you want to run source and target databases in parallel for testing purposes.
In such cases, you can replicate changes from your source MySQL database to Aurora in real time. You have the following options:
- Near-zero downtime migration using MySQL binlog replication — Amazon Aurora supports traditional MySQL binlog replication. If you are running a MySQL database, chances are that you are already familiar with the classic binlog replication setup. If that's the case, and you want more control over the migration process, a one-time database load using native tools coupled with binlog replication gives you a familiar migration path to Aurora (sketched after this list).
- Near-zero downtime migration using AWS Database Migration Service (AWS DMS) — In addition to supporting one-time migrations, AWS DMS supports real-time data replication from source to target using change data capture (CDC). AWS DMS takes care of the complexities related to the initial data copy, setting up replication instances, and monitoring replication. After the initial database migration is complete, the target database remains synchronized with the source for as long as you choose. If you are not familiar with binlog replication, AWS DMS is the next best option for homogeneous, near-zero downtime migrations to Amazon Aurora. Refer to the Introduction and general approach to AWS DMS section of this document.
- Near-zero downtime migration using an Aurora Read Replica — If your source database is running on Amazon RDS MySQL 5.6 or 5.7, you can migrate from a MySQL DB instance to an Aurora MySQL DB cluster by creating an Aurora Read Replica of your source MySQL DB instance. When the replica lag between the MySQL DB instance and the Aurora Read Replica reaches zero, you can direct your client applications to the Aurora Read Replica. This migration option is explained in the Migration using Aurora Read Replica section of this document; a CLI sketch also follows this list.
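As a rough illustration of the binlog approach, the sketch below assumes the initial load came from a dump taken with mysqldump --master-data=2 (which records the source binlog coordinates in the dump file) and uses the RDS-provided stored procedures available on Aurora MySQL to start replication. Hostnames, credentials, and binlog coordinates are placeholders.

```bash
# Find the binlog coordinates recorded in the dump file.
grep -m 1 "CHANGE MASTER" mydb.sql

# On the Aurora target, point replication at the source and start it.
mysql -h aurora-target.cluster-xxxx.us-east-1.rds.amazonaws.com -u admin -p -e "
  CALL mysql.rds_set_external_master(
    'source-host', 3306,
    'repl_user', 'repl_password',
    'mysql-bin.000042', 154, 0);
  CALL mysql.rds_start_replication;"

# Monitor replication lag; cut the application over when it reaches zero.
mysql -h aurora-target.cluster-xxxx.us-east-1.rds.amazonaws.com -u admin -p \
    -e "SHOW SLAVE STATUS\G" | grep Seconds_Behind_Master
```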
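The Aurora Read Replica path can likewise be driven from the AWS CLI. In this sketch the identifiers and the source instance ARN are placeholders:

```bash
# Create an Aurora MySQL cluster as a read replica of the RDS MySQL instance.
aws rds create-db-cluster \
    --db-cluster-identifier aurora-replica \
    --engine aurora-mysql \
    --replication-source-identifier \
        arn:aws:rds:us-east-1:123456789012:db:source-mysql

# Add a reader instance to the new cluster.
aws rds create-db-instance \
    --db-instance-identifier aurora-replica-instance-1 \
    --db-cluster-identifier aurora-replica \
    --engine aurora-mysql \
    --db-instance-class db.r5.large

# Once replica lag reaches zero, promote the cluster to a standalone
# Aurora cluster and repoint the application.
aws rds promote-read-replica-db-cluster \
    --db-cluster-identifier aurora-replica
```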
Heterogeneous migration
If you are looking to migrate a non-MySQL-compliant database (Oracle, SQL Server, PostgreSQL, and so on) to Amazon Aurora, several options can help you accomplish this migration quickly and easily.
Schema migration
Schema migration from a non-MySQL-compliant database to Amazon Aurora can be achieved using the AWS Schema Conversion Tool. This tool is a desktop application that helps you convert your database schema from an Oracle, Microsoft SQL Server, or PostgreSQL database to an Amazon RDS MySQL DB instance or an Amazon Aurora DB cluster. In cases where the schema from your source database cannot be automatically and completely converted, the AWS Schema Conversion Tool provides guidance on how you can create the equivalent schema in your target Amazon RDS database. For details, refer to the Migrating the database schema section of this document.
Data migration
In addition to supporting homogeneous migrations with near-zero downtime, AWS Database Migration Service (AWS DMS) supports continuous replication across heterogeneous databases and is a preferred option for moving your source database to your target database, both for migrations with downtime and for migrations with near-zero downtime. Once the migration has started, AWS DMS manages the complexities of the migration process, such as data type transformation, compression, and parallel transfer (for faster data transfer), while ensuring that data changes made to the source database during the migration are automatically replicated to the target.
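As a condensed sketch, a full-load-plus-CDC task might be created as follows, assuming the replication instance and the source and target endpoints already exist (the ARNs below are placeholders):

```bash
# Create a task that performs a full load, then applies ongoing changes (CDC).
aws dms create-replication-task \
    --replication-task-identifier aurora-migration-task \
    --source-endpoint-arn arn:aws:dms:us-east-1:123456789012:endpoint:SOURCE \
    --target-endpoint-arn arn:aws:dms:us-east-1:123456789012:endpoint:TARGET \
    --replication-instance-arn arn:aws:dms:us-east-1:123456789012:rep:INSTANCE \
    --migration-type full-load-and-cdc \
    --table-mappings '{"rules":[{"rule-type":"selection","rule-id":"1",
      "rule-name":"include-all",
      "object-locator":{"schema-name":"%","table-name":"%"},
      "rule-action":"include"}]}'

# Start the task; the full load runs first, then CDC keeps the target in sync.
aws dms start-replication-task \
    --replication-task-arn <task-arn-from-previous-output> \
    --start-replication-task-type start-replication
```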
Besides AWS DMS, you can use various third-party tools, such as Qlik Replicate, Tungsten Replicator, and Oracle GoldenGate, to migrate your data to Amazon Aurora. Whichever tool you choose, take performance and licensing costs into consideration before finalizing your toolset for the migration.
Migrating large databases to Amazon Aurora
Migration of large datasets presents unique challenges in every database migration project. Many successful large database migration projects use a combination of the following strategies:
- Migration with continuous replication — Moving a large database from source to target can require extended downtime. To reduce that downtime, you can first load the baseline data from source to target and then enable replication (using MySQL native tools, AWS DMS, or third-party tools) so that the target catches up on changes.
- Copy static tables first — If your database relies on large static tables with reference data, you can migrate these large tables to the target database before migrating your active dataset. You can use AWS DMS to copy tables selectively (see the table-mapping sketch after this list) or export and import these tables manually.
- Multiphase migration — The migration of a large database with thousands of tables can be broken down into multiple phases. For example, you might move a set of tables with no cross-join queries each weekend until the source database is fully migrated to the target database. Note that to achieve this, you need to make changes in your application so that it connects to both databases simultaneously while your dataset is split across the two nodes. Although this is not a common migration pattern, it is an option nonetheless.
- Database cleanup — Many large databases contain data and tables that are no longer used. In many cases, developers and DBAs keep backup copies of tables in the same database, or simply forget to drop unused tables. Whatever the reason, a database migration project provides an opportunity to clean up the existing database before the migration. If some tables are not being used, you might drop them or archive them to another database. You might also delete old data from large tables or archive that data to flat files.
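To illustrate the copy-static-tables-first strategy, the following sketch uses a hypothetical AWS DMS table mapping that restricts a one-time load to two invented reference tables; a later task would migrate the remaining, actively changing tables.

```bash
# Selection rules that limit a DMS task to two static reference tables.
cat > static-tables.json <<'EOF'
{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "countries",
      "object-locator": { "schema-name": "mydb", "table-name": "countries" },
      "rule-action": "include"
    },
    {
      "rule-type": "selection",
      "rule-id": "2",
      "rule-name": "product-catalog",
      "object-locator": { "schema-name": "mydb", "table-name": "product_catalog" },
      "rule-action": "include"
    }
  ]
}
EOF

# Run a one-time full load for just these tables (ARNs are placeholders).
aws dms create-replication-task \
    --replication-task-identifier static-tables-first \
    --migration-type full-load \
    --table-mappings file://static-tables.json \
    --source-endpoint-arn <source-endpoint-arn> \
    --target-endpoint-arn <target-endpoint-arn> \
    --replication-instance-arn <replication-instance-arn>
```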
Partition and shard consolidation on Amazon Aurora
If you are running multiple shards or functional partitions of your database to achieve high performance, you have an opportunity to consolidate these partitions or shards on a single Aurora database. A single Amazon Aurora instance can scale up to 128 TB, supports thousands of tables, and supports a significantly higher number of reads and writes than a standard MySQL database. Consolidating these partitions on a single Aurora instance not only reduces the total cost of ownership and simplifies database management, but also significantly improves the performance of cross-partition queries.
- Functional partitions — Functional partitioning means dedicating different nodes to different tasks. For example, in an ecommerce application, you might have one database node serving product catalog data and another database node capturing and processing orders. As a result, these partitions usually have distinct, nonoverlapping schemas.
- Consolidation strategy — Migrate each functional partition as a distinct schema to your target Aurora instance. If your source database is MySQL compliant, use native MySQL tools to migrate the schema, and then use AWS DMS to migrate the data, either one time or continuously using replication. If your source database is non-MySQL compliant, use the AWS Schema Conversion Tool to migrate the schemas to Aurora, and use AWS DMS for a one-time load or continuous replication (a sketch follows this list).
- Data shards — If you have the same schema with distinct sets of data across multiple nodes, you are leveraging database sharding. For example, a high-traffic blogging service may shard user activity and data across multiple database shards while keeping the same table schema.
- Consolidation strategy — Since all shards share the same database schema, you only need to create the target schema once. If you are using a MySQL-compliant database, use native tools to migrate the database schema to Aurora. If you are using a non-MySQL database, use the AWS Schema Conversion Tool to migrate the database schema to Aurora. Once the database schema has been migrated, it is best to stop writes to the database shards and use native tools or an AWS DMS one-time data load to migrate each shard to Aurora. If writes cannot be stopped for an extended period, you might still use AWS DMS with replication, but only after proper planning and testing.
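For a MySQL-compliant source, a minimal sketch of consolidating two hypothetical functional partitions into one Aurora cluster might look like the following. It uses native tools for both schema and data for brevity; in practice you could load only the schema this way and move the data with AWS DMS, as described above. Hostnames, schema names, and credentials are placeholders.

```bash
# Copy each functional partition (one schema per source instance) into a
# single Aurora cluster. Source hosts and schema names are examples.
declare -A partitions=(
  [catalog]="catalog-db.example.com"
  [orders]="orders-db.example.com"
)

for schema in "${!partitions[@]}"; do
  mysqldump -h "${partitions[$schema]}" -u admin -p"$SOURCE_PASSWORD" \
      --single-transaction --routines --triggers \
      --databases "$schema" |
  mysql -h aurora-target.cluster-xxxx.us-east-1.rds.amazonaws.com \
      -u admin -p"$TARGET_PASSWORD"
done
```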
Migration options at a glance
Table 1 — Migration options

| Source database type | Migration with downtime | Near-zero downtime migration |
|---|---|---|
| Amazon RDS MySQL | Option 1: RDS snapshot migration. Option 2: Manual migration using native tools.* Option 3: Schema migration using native tools and data load using AWS DMS. | Option 1: Migration using native tools + binlog replication. Option 2: Migration using an Aurora Read Replica. Option 3: Schema migration using native tools + AWS DMS for data movement. |
| MySQL on Amazon EC2 or on-premises | Option 1: Migration using native tools. Option 2: Schema migration using native tools + AWS DMS for data load. | Option 1: Migration using native tools + binlog replication. Option 2: Schema migration using native tools + AWS DMS to move data. |
| Oracle/SQL Server | Option 1: AWS Schema Conversion Tool + AWS DMS (recommended). Option 2: Manual or third-party tool for schema conversion + manual or third-party data load in target. | Option 1: AWS Schema Conversion Tool + AWS DMS (recommended). Option 2: Manual or third-party tool for schema conversion + manual or third-party data load in target + third-party tool for replication. |
| Other non-MySQL databases | Option: Manual or third-party tool for schema conversion + manual or third-party data load in target. | Option: Manual or third-party tool for schema conversion + manual or third-party data load in target + third-party tool for replication (GoldenGate, and so on). |

*MySQL native tools: mysqldump, SELECT INTO OUTFILE, and third-party tools like mydumper/myloader.