Migrate Oracle external tables to Amazon Aurora PostgreSQL-Compatible Edition

Created by Anuradha Chintha (AWS) and Rakesh Raghav (AWS)

Source: Databases: Relational

Target: Databases: Relational

R Type: Re-architect

Environment: PoC or pilot

Technologies: Migration; Databases

Workload: Open-source

AWS services: Amazon S3

Summary

External tables give Oracle the ability to query data that is stored outside the database in flat files. You can use the ORACLE_LOADER driver to access data stored in any format that the SQL*Loader utility can load. You can't use Data Manipulation Language (DML) on external tables, but you can use them for query, join, and sort operations.
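For illustration, a typical external table over a pipe-delimited flat file might look like the following. The table, directory, and file names are examples only, not objects from this pattern's attachments.

CREATE TABLE emp_ext (
  emp_id   NUMBER,
  emp_name VARCHAR2(100),
  dept     VARCHAR2(50)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY ext_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY '|'
  )
  LOCATION ('emp.dat')
);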

Amazon Aurora PostgreSQL-Compatible Edition doesn't provide functionality that is similar to Oracle external tables. Instead, you must modernize the design and develop a scalable solution that meets the functional requirements and is cost-effective.

This pattern provides steps for migrating different types of Oracle external tables to Aurora PostgreSQL-Compatible Edition on the Amazon Web Services (AWS) Cloud by using the aws_s3 extension.
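The core building block is the extension's import function, which copies a file from Amazon S3 directly into a table. The following is a minimal example, assuming an illustrative staging table stg_emp, a pipe-delimited file, and placeholder bucket, key, and AWS Region values:

SELECT aws_s3.table_import_from_s3(
  'stg_emp',
  '',
  '(format csv, delimiter ''|'')',
  aws_commons.create_s3_uri('amzn-s3-demo-bucket', 'inputext/emp.dat', 'us-east-1')
);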

Prerequisites and limitations

Prerequisites 

  • An active AWS account

  • AWS Command Line Interface (AWS CLI)

  • An available Aurora PostgreSQL-Compatible database instance

  • An on-premises Oracle database with an external table

  • pg.Client API

  • Data files 

Limitations 

  • This pattern doesn't provide a drop-in replacement for Oracle external tables. However, the steps and sample code can be enhanced further to achieve your database modernization goals.

  • Files must not contain the character that is passed as the delimiter to the aws_s3 export and import functions.

Architecture

Source technology stack  

  • Oracle

Source architecture 

Target technology stack 

  • Amazon Aurora PostgreSQL-Compatible

  • Amazon CloudWatch

  • AWS Lambda

  • AWS Secrets Manager

  • Amazon Simple Notification Service (Amazon SNS)

  • Amazon Simple Storage Service (Amazon S3)

Target architecture 

The following diagram shows a high-level representation of the solution.

Automation and scale

Any additions or changes to the external tables can be handled by maintaining the metadata table, as shown in the following example.
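For illustration, assuming a metadata table such as the ext_table_meta table sketched later in this pattern, the load function can resolve each incoming file to its target table and delimiter with a lookup like this (the file key is a placeholder):

SELECT target_table, delimiter
FROM ext_table_meta
WHERE 'inputext/emp.dat' LIKE file_pattern;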

Tools

  • Amazon Aurora PostgreSQL-Compatible – Amazon Aurora PostgreSQL-Compatible Edition is a fully managed, PostgreSQL-compatible, and ACID-compliant relational database engine that combines the speed and reliability of high-end commercial databases with the cost-effectiveness of open-source databases.

  • AWS CLI – AWS Command Line Interface (AWS CLI) is a unified tool to manage your AWS services. With only one tool to download and configure, you can control multiple AWS services from the command line and automate them through scripts.

  • Amazon CloudWatch – Amazon CloudWatch monitors resources and utilization. In this pattern, you use CloudWatch to monitor the Lambda function.

  • AWS Lambda – AWS Lambda is a serverless compute service that supports running code without provisioning or managing servers, creating workload-aware cluster scaling logic, maintaining event integrations, or managing runtimes. In this pattern, Lambda runs the database function whenever a file is uploaded to Amazon S3.

  • AWS Secrets Manager – AWS Secrets Manager is a service for credential storage and retrieval. Using Secrets Manager, you can replace hardcoded credentials in your code, including passwords, with an API call to Secrets Manager to retrieve the secret programmatically.

  • Amazon S3 – Amazon Simple Storage Service (Amazon S3) provides a storage layer to receive and store files for consumption and transmission to and from the Aurora PostgreSQL-Compatible cluster.

  • aws_s3 – The aws_s3 extension integrates Amazon S3 with Aurora PostgreSQL-Compatible by providing functions that import data from, and export data to, Amazon S3.

  • Amazon SNS – Amazon Simple Notification Service (Amazon SNS) coordinates and manages the delivery or sending of messages between publishers and clients. In this pattern, Amazon SNS is used to send notifications.

Code 

Create a DB function that loads each incoming file. Whenever a file is placed in the S3 bucket, the processing application or the Lambda function calls this function. For details, see the code (attached).
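The following is a simplified sketch of such a function, not the attached implementation; the table names match the illustrative examples in this pattern, and error handling is reduced to a single catch-all block.

CREATE OR REPLACE FUNCTION load_external_table(p_bucket text, p_key text, p_region text)
RETURNS void
LANGUAGE plpgsql
AS $$
BEGIN
  -- Stage the raw file from Amazon S3 (assumes a pipe delimiter)
  PERFORM aws_s3.table_import_from_s3(
    'stg_emp', '', '(format csv, delimiter ''|'')',
    aws_commons.create_s3_uri(p_bucket, p_key, p_region));

  -- Promote the staged rows to the main table, and record the load
  INSERT INTO emp SELECT * FROM stg_emp;
  INSERT INTO load_log (file_name, rows_loaded)
    VALUES (p_key, (SELECT count(*) FROM stg_emp));
  TRUNCATE stg_emp;
EXCEPTION
  WHEN OTHERS THEN
    -- Record the failure so that bad files can be reviewed and replayed
    INSERT INTO load_error (file_name, error_msg) VALUES (p_key, SQLERRM);
END;
$$;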

Epics

Task: Add an external file to the source database.
Description: Create an external file, and move it to the Oracle directory (see the example after this task).
Skills required: DBA
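For reference, the directory object that an external table reads from can be created as follows; the path and grantee are illustrative assumptions:

CREATE OR REPLACE DIRECTORY ext_dir AS '/u01/app/oracle/ext_files';
GRANT READ, WRITE ON DIRECTORY ext_dir TO app_user;

Place the flat file (for example, emp.dat) in that operating system path so that the external table can read it.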
Task: Create an Aurora PostgreSQL database.
Description: Create a DB instance in your Amazon Aurora PostgreSQL-Compatible cluster.
Skills required: DBA

Task: Create a schema and tables.
Description: The tables include the actual tables, staging tables, error and log tables, and a metadata table. A sample layout follows this task list.
Skills required: DBA, Developer

Task: Create the DB function.
Description: To create the DB function, use the load_external_table_latest.sql file (attached).
Skills required: DBA, Developer
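The following is a minimal sketch of these supporting tables, using the illustrative emp example from the Summary section (the attached scripts define the actual objects):

-- Main table that replaces the Oracle external table
CREATE TABLE emp (emp_id integer, emp_name text, dept text);

-- Staging table that receives the raw file contents
CREATE TABLE stg_emp (emp_id integer, emp_name text, dept text);

-- Log and error tables that track each load
CREATE TABLE load_log (file_name text, rows_loaded integer, loaded_at timestamptz DEFAULT now());
CREATE TABLE load_error (file_name text, error_msg text, logged_at timestamptz DEFAULT now());

-- Metadata table that drives the automation (one row per external table)
CREATE TABLE ext_table_meta (file_pattern text, target_table text, delimiter char(1));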
Task: Create a role.
Description: Create a role with permissions to access Amazon S3 and Amazon Relational Database Service (Amazon RDS). This role is assigned to the Lambda function that runs the pattern.
Skills required: DBA

Task: Create the Lambda function.
Description: To create the Lambda function, use the attached code and the role that you created.
Skills required: DBA

Task: Configure an S3 bucket event trigger.
Description: Configure a mechanism to call the Lambda function for all object creation events in the S3 bucket.
Skills required: DBA

Task: Create a secret.
Description: Using Secrets Manager, create a secret for the database credentials. Pass the secret name to the Lambda function.
Skills required: DBA

Task: Upload the Lambda supporting files.
Description: Upload a .zip file that contains the Lambda support packages and the attached Python script for connecting to Aurora PostgreSQL-Compatible. The Python code calls the function that you created in the database.
Skills required: DBA
Task: Create an S3 bucket.
Description: On the Amazon S3 console, create an S3 bucket with a unique name that does not contain leading slashes. An S3 bucket name is globally unique, and the namespace is shared by all AWS accounts.
Skills required: DBA

Task: Create IAM policies.
Description: To create the AWS Identity and Access Management (IAM) policies, use s3bucketpolicy_for_import.json (attached).
Skills required: DBA

Task: Create roles.
Description: Create two roles for Aurora PostgreSQL-Compatible: one role for Import and one role for Export. Assign the corresponding policies to the roles.
Skills required: DBA

Task: Attach the roles to the Aurora PostgreSQL-Compatible cluster.
Description: Under Manage roles, attach the Import and Export roles to the Aurora PostgreSQL-Compatible cluster.
Skills required: DBA

Task: Create supporting objects for Aurora PostgreSQL-Compatible.
Description: Use ext_tbl_scripts.sql (attached) for the table scripts. Use load_external_table_latest.sql (attached) for the custom function. The aws_s3 extension must also be installed (see the note after this task list).
Skills required: DBA
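The aws_s3 import and export functions are available only after the extension is installed in the target database. A typical installation, run as a user with rds_superuser privileges:

CREATE EXTENSION IF NOT EXISTS aws_s3 CASCADE;

The CASCADE option also installs the aws_commons extension that aws_s3 depends on.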
Task: Upload a file into the S3 bucket.
Description: To upload a test file into the S3 bucket, use the console or the following AWS CLI command.

aws s3 cp /Users/Desktop/ukpost/exttbl/"testing files"/aps s3://s3importtest/inputext/aps

As soon as the file is uploaded, a bucket event initiates the Lambda function, which runs the Aurora PostgreSQL-Compatible function.
Skills required: DBA

Task: Check the data and the log and error files.
Description: The Aurora PostgreSQL-Compatible function loads the files into the main table, and it creates .log and .bad files in the S3 bucket. You can verify the load with queries such as those after this task list.
Skills required: DBA

Task: Monitor the solution.
Description: In the Amazon CloudWatch console, monitor the Lambda function.
Skills required: DBA
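The following are illustrative verification queries, using the sample table names sketched earlier in this pattern (your object names will differ):

-- Confirm that the rows arrived in the main table
SELECT count(*) FROM emp;

-- Review the most recent load and error entries
SELECT * FROM load_log ORDER BY loaded_at DESC LIMIT 10;
SELECT * FROM load_error ORDER BY logged_at DESC LIMIT 10;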

Attachments

To access additional content that is associated with this document, unzip the following file: attachment.zip