AWS Schema Conversion Tool
User Guide (Version 1.0)

Understanding Workload Categories

AWS Workload Qualification Framework (AWS WQF) evaluates your migration workload and classifies it into a workload category that is characterized by the way your database and app are architected. Based on this categorization, AWS WQF analyzes the components your system uses and extrapolates the type of work needed for the migration. From this analysis, AWS WQF estimates how easy or difficult you can expect the migration to be, the type of work involved, and the level of effort required.

Category 1: Workloads That Use ODBC and JDBC

Workloads in this category typically have fewer than 50 custom stored procedures, or have simple stored procedures that are used for access controls. Applications using this data connect to the database using Open Database Connectivity (ODBC) or Java Database Connectivity (JDBC) instead of using proprietary drivers that have nonstandard extensions. Application logic resides in code outside the database (Java, Python, Ruby, and so on). For these databases, there is either no requirement for supporting a read replica or Multi-AZ deployment, or these features are offered through replication-based technologies.

In this category, data warehouses use a star or snowflake schema with a reporting layer, such as Amazon QuickSight or Tableau, that uses engine-specific SQL or ANSI SQL. Porting to Amazon Redshift is relatively straightforward because the data model is retained and then enhanced by defining sort keys, distribution keys, and compression, and by properly configuring workload management (WLM).
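As an illustration of retaining the data model while adding Amazon Redshift physical-design choices, the following minimal sketch renders a CREATE TABLE statement with a distribution key, a sort key, and per-column compression encodings. The helper name redshift_create_table, the sales_fact table, and its columns and encodings are hypothetical examples, not output from AWS SCT or AWS WQF. Appropriate keys and encodings depend on your own query patterns.

```python
def redshift_create_table(table, columns, dist_key, sort_keys):
    """Render a CREATE TABLE statement with Redshift physical-design clauses.

    columns is a list of (name, sql_type, encoding) tuples.
    All names used here are hypothetical and for illustration only.
    """
    names = {name for name, _, _ in columns}
    missing = [key for key in [dist_key, *sort_keys] if key not in names]
    if missing:
        raise ValueError(f"unknown key column(s): {missing}")

    # One line per column, each with an explicit compression encoding.
    column_lines = ",\n".join(
        f"    {name} {sql_type} ENCODE {encoding}"
        for name, sql_type, encoding in columns
    )
    return (
        f"CREATE TABLE {table} (\n{column_lines}\n)\n"
        f"DISTSTYLE KEY\n"
        f"DISTKEY ({dist_key})\n"
        f"SORTKEY ({', '.join(sort_keys)});"
    )


# Hypothetical fact table from a star schema.
ddl = redshift_create_table(
    "sales_fact",
    [
        ("sale_id", "BIGINT", "az64"),
        ("customer_id", "BIGINT", "az64"),
        ("sale_date", "DATE", "raw"),
        ("amount", "DECIMAL(12,2)", "az64"),
    ],
    dist_key="customer_id",
    sort_keys=["sale_date"],
)
print(ddl)
```

The logical columns stay the same as in the source schema. Only the clauses after the column list and the ENCODE settings are specific to Amazon Redshift.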

These workloads are easy to port to Amazon Aurora and Amazon RDS. A migration in this category usually requires few person-hours.

Category 2: Workloads with Light Use of Proprietary Features

Workloads in this category use a combination of app code (Java, Python, Ruby, and so on) and stored procedure code. Stored procedures are used when the logic is cumbersome to implement in app code. Generally, this type of workload has fewer than 200 stored procedures and doesn't use advanced SQL language features. Schema migration is simple because only basic data structures, such as tables and views, are used.

In this category, data warehouse workloads can stage data in tables and transform it using SQL wrapped in simple stored procedures. Data warehouse writes might have some microbatching or a large number of updates, deletes, and transactions. The data warehouse can also use proprietary online analytical processing (OLAP) extensions such as CUBE, ROLLUP, or PIVOT.

Migration involves moving the stored procedure logic outside the database and reworking SQL reports to deal with the lack of native functions. These workloads are relatively easy to migrate. You can expect the migration of this type of workload to consume a moderate number of person-hours.
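To show what reworking a report for a missing native function can look like, the following sketch emulates a single-column ROLLUP (per-group subtotals plus a grand total) with GROUP BY and UNION ALL. It assumes a target engine without ROLLUP support. SQLite is used only so that the example runs locally, and the sales table, its columns, and the function name sales_with_grand_total are hypothetical.

```python
import sqlite3

# Equivalent of GROUP BY ROLLUP(region): one row per region, then a
# grand-total row with region set to NULL. The is_grand_total column exists
# only to keep the total row last in the ordering.
ROLLUP_EMULATION = """
    SELECT region, total
    FROM (
        SELECT region, SUM(amount) AS total, 0 AS is_grand_total
        FROM sales
        GROUP BY region
        UNION ALL
        SELECT NULL, SUM(amount), 1
        FROM sales
    )
    ORDER BY is_grand_total, region
"""


def sales_with_grand_total(conn):
    """Return (region, total) rows followed by a (None, grand_total) row."""
    return conn.execute(ROLLUP_EMULATION).fetchall()


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100), ("east", 50), ("west", 30)],
)
rows = sales_with_grand_total(conn)
print(rows)
```

The same pattern extends to more grouping columns by adding one UNION ALL branch per subtotal level, which is part of why reports that rely heavily on such extensions take more effort to rework.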

Category 3: Workloads with Heavy Use of Proprietary Features

Workloads in this category are completely driven by advanced stored procedure logic or proprietary features. In the field, many of the workloads in this category have as many as 100,000 lines of database-resident code and features. These workloads also use advanced features such as virtual private databases, column obfuscation, tuning options, and user-defined types. They consume a large amount of time for translation into alternate execution environments. Some of these workloads rely on native hardware features, such as Exadata, Supercluster, and PDW. High-performance workloads often fall into this category. The tuning options present in local code have to be translated and tested with options available on the target database.

Data warehouses in this category contain large numbers of stored procedures and user-defined functions that either orchestrate extract, transform, load (ETL) operations or create business views. Their ETL processes can't be easily expressed in Amazon Redshift, although much of the business logic might be rendered as views. These data warehouses can also have many thousands of tables with a large number of transactions to manage the ETL workflow. When migrating with Amazon Redshift as a target, such workloads require rearchitecting the app to separate the transactional workload from reporting. Rearchitecting also requires pulling logic out of the data warehouse and into another compute layer.

These workloads are difficult to migrate and might constitute a material risk to the customer. You can expect the migration of this workload to consume a significant number of person-hours.

Category 4: Engine-Specific Workloads

Workloads in this category use frameworks that can work only with a specific commercial database engine. Examples include database-specific app frameworks such as Oracle Forms, Oracle Reports, Oracle ADF, and Oracle APEX (Application Express), and apps that use .NET ActiveRecord extensively. Migrating these workloads to an open-source or NoSQL database can require a complete reimplementation of the app to separate presentation logic from the database.

A data warehouse in this category might rely heavily on proprietary features such as Geospatial at petabyte scale. These features might contain proprietary logic in OLAP data structures. Workloads might have availability, replication, or user concurrency requirements that can't be met by a single Availability Zone architecture. They might have latency requirements that preclude the use of Amazon Athena.

These workloads are very difficult to migrate. You can expect the migration of this workload to take a very large number of person-hours. Such a migration can also constitute a significant risk to undertake. The migration of this workload might not be supported from the perspective of certification or third-party support.

Category 5: Nonportable, Unacceptable Risk, or "Lift and Shift" Workloads

Workloads in this category might be implemented on database engines that have no cloud-based equivalent. Their underlying operating system might not be supported by AWS. For example, a workload might run on mainframe, Power, or RISC architectures. In some cases, the database might use native code extensions such as Oracle Call Interface to run the business logic. This business logic is often considered "legacy" by the customer, even if it is still business-critical. In some cases, customers don't have the source code for these programs. Data warehouse and OLTP workloads share the same attributes for this category.

You can migrate these apps to Amazon EC2. They might have emulation requirements or require other third-party solutions. In some cases, the risk of moving these workloads from the existing environment might be too high to justify. In that case, it's appropriate to maintain high-performance connectivity to the on-premises implementation with a network topology that supports the app requirements.
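The category descriptions in this section can be summarized as a rough decision sketch. The following example is not the algorithm that AWS WQF uses. It is a simplified illustration that applies the stored procedure counts mentioned above (50 and 200) and treats 100,000 lines of database-resident code as a Category 3 signal, which is an assumption. The WorkloadProfile fields, the estimate_category function, and the order of the checks are also assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class WorkloadProfile:
    """Hypothetical summary of a workload; not an AWS WQF data structure."""
    stored_procedures: int
    db_code_lines: int
    uses_standard_drivers: bool      # connects through ODBC or JDBC
    uses_advanced_features: bool     # for example, user-defined types
    engine_locked_framework: bool    # for example, Oracle Forms or APEX
    has_cloud_equivalent: bool       # engine and platform available on AWS


def estimate_category(profile):
    """Return a workload category from 1 to 5 using simplified rules."""
    if not profile.has_cloud_equivalent:
        return 5  # nonportable or "lift and shift"
    if profile.engine_locked_framework:
        return 4  # engine-specific workload
    # Assumption: 100,000 lines of database code marks heavy proprietary use.
    if profile.uses_advanced_features or profile.db_code_lines >= 100_000:
        return 3
    if profile.stored_procedures < 50 and profile.uses_standard_drivers:
        return 1
    if profile.stored_procedures < 200:
        return 2
    return 3  # many stored procedures imply logic driven by the database


# Hypothetical workload: 120 simple stored procedures, JDBC access.
example = WorkloadProfile(
    stored_procedures=120,
    db_code_lines=8_000,
    uses_standard_drivers=True,
    uses_advanced_features=False,
    engine_locked_framework=False,
    has_cloud_equivalent=True,
)
print(estimate_category(example))
```

A real assessment weighs many more factors, such as availability requirements and data warehouse design, so treat the result only as a starting point for discussion.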