AWS Database Migration Service
User Guide (API Version 2016-01-01)

Using a PostgreSQL Database as a Source for AWS DMS

You can migrate data from one or many PostgreSQL databases using AWS DMS. With a PostgreSQL database as a source, you can migrate data to either another PostgreSQL database or one of the other supported databases. AWS DMS supports PostgreSQL versions 9.4 and later (for versions 9.x), 10.x, and 11.x databases as a source for these types of databases:

  • On-premises databases

  • Databases on an EC2 instance

  • Databases on an Amazon RDS DB instance

  • Databases on an Amazon Aurora DB instance with PostgreSQL compatibility

Note

PostgreSQL versions 11.x are supported as a source only in AWS DMS versions 3.3.0 and later. You can use PostgreSQL version 9.4 and later (for versions 9.x) and 10.x as a source in any DMS version.

Note

PostgreSQL versions 10.x contain numerous changes in function names and folder names from previous versions.

In some cases, you might use a PostgreSQL version 10.x database as a source and an AWS DMS version earlier than 3.3.0. In these cases, see Using PostgreSQL Version 10.x as a Source for AWS DMS for information on preparing your database as a source for AWS DMS.

If you use a PostgreSQL 10.x database as a source and DMS 3.3.0 or later, don't perform the preparations for source 10.x databases with earlier DMS versions.

For a summary of the DMS version requirements to use the supported PostgreSQL source versions, see the following table.

PostgreSQL Source Version | AWS DMS Version to Use
9.x | Use any available AWS DMS version.
10.x | If you use a DMS version earlier than 3.3.0, prepare the PostgreSQL source using the wrapper functions described in Using PostgreSQL Version 10.x as a Source for AWS DMS. If you use a DMS version 3.3.0 or later, don't create these wrapper functions; you can use the PostgreSQL source without any additional preparation.
11.x | Use DMS version 3.3.0 or later.

You can use SSL to encrypt connections between your PostgreSQL endpoint and the replication instance. For more information on using SSL with a PostgreSQL endpoint, see Using SSL With AWS Database Migration Service.

For a homogeneous migration from a PostgreSQL database to a PostgreSQL database on AWS, the following is true:

  • JSONB columns on the source are migrated to JSONB columns on the target.

  • JSON columns are migrated as JSON columns on the target.

  • HSTORE columns are migrated as HSTORE columns on the target.

For a heterogeneous migration with PostgreSQL as the source and a different database engine as the target, the situation is different. In this case, JSONB, JSON, and HSTORE columns are converted to the AWS DMS intermediate type of NCLOB and then translated to the corresponding NCLOB column type on the target. In this case, AWS DMS treats JSONB data as if it were a LOB column. During the full load phase of a migration, the target column must be nullable.

AWS DMS supports change data capture (CDC) for PostgreSQL tables with primary keys. If a table doesn't have a primary key, the write-ahead logs (WAL) don't include a before image of the database row and AWS DMS can't update the table.
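
To find tables that CDC can't fully replicate for this reason, you can run a catalog query on the source such as the following sketch, which lists ordinary tables outside the system schemas that have no primary key constraint:

SELECT n.nspname AS table_schema,
       c.relname AS table_name
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'                                   -- ordinary tables only
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  AND NOT EXISTS (SELECT 1
                  FROM pg_constraint con
                  WHERE con.conrelid = c.oid
                    AND con.contype = 'p');             -- no primary key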

AWS DMS supports CDC on Amazon RDS PostgreSQL databases when the DB instance is configured to use logical replication. Amazon RDS supports logical replication for a PostgreSQL DB instance version 9.4.9 and higher and 9.5.4 and higher. Amazon RDS also supports logical replication for an Amazon Aurora DB instance using versions 2.2.0 and 2.2.1, with PostgreSQL 10.6 compatibility.

For additional details on working with PostgreSQL databases and AWS DMS, see the following sections.

Migrating from PostgreSQL to PostgreSQL Using AWS DMS

For a heterogeneous migration, where you are migrating from a database engine other than PostgreSQL to a PostgreSQL database, AWS DMS is almost always the best migration tool to use. But for a homogeneous migration, where you are migrating from a PostgreSQL database to a PostgreSQL database, native tools can be more effective.

We recommend that you use native PostgreSQL database migration tools such as pg_dump under the following conditions:

  • You have a homogeneous migration, where you are migrating from a source PostgreSQL database to a target PostgreSQL database.

  • You are migrating an entire database.

  • The native tools allow you to migrate your data with minimal downtime.

The pg_dump utility uses the COPY command to create a schema and data dump of a PostgreSQL database. The dump script generated by pg_dump loads data into a database with the same name and recreates the tables, indexes, and foreign keys. You can use the pg_restore command and the -d parameter to restore the data to a database with a different name.
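
For example, the following commands sketch that flow; the host, user, database, and file names are placeholders. The -Fc option writes a custom-format archive that pg_restore can read, and -d restores into a target database that you have already created:

pg_dump -h source-host -U postgres -Fc -f mydb.dump mydb
pg_restore -h target-host -U postgres -d mydb_copy mydb.dump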

For more information about importing a PostgreSQL database into Amazon RDS for PostgreSQL or Amazon Aurora (PostgreSQL), see https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide//PostgreSQL.Procedural.Importing.html.

Using DMS to Migrate Data from PostgreSQL to PostgreSQL

AWS DMS can migrate data from, for example, a source PostgreSQL database that is on premises to a target Amazon RDS for PostgreSQL or Amazon Aurora (PostgreSQL) instance. Core or basic PostgreSQL data types most often migrate successfully.

Data types that are supported on the source database but aren't supported on the target might not migrate successfully. AWS DMS streams some data types as strings if the data type is unknown. Some data types, such as XML and JSON, can successfully migrate as small files but can fail if they are large documents.

The following table shows source PostgreSQL data types and whether they can be migrated successfully:

Data type | Migrates successfully | Will partially migrate | Will not migrate | Comments
INTEGER | X | | |
SMALLINT | X | | |
BIGINT | X | | |
NUMERIC/DECIMAL(p,s) | X | | | With 0<p<39 and 0<s
NUMERIC/DECIMAL | X | | | With p>38 or p=s=0
REAL | X | | |
DOUBLE | X | | |
SMALLSERIAL | X | | |
SERIAL | X | | |
BIGSERIAL | X | | |
MONEY | X | | |
CHAR | X | | | Without specified precision
CHAR(n) | X | | |
VARCHAR | X | | | Without specified precision
VARCHAR(n) | X | | |
TEXT | X | | |
BYTEA | X | | |
TIMESTAMP | X | | |
TIMESTAMP(Z) | X | | |
DATE | X | | |
TIME | X | | |
TIME(z) | X | | |
INTERVAL | X | | |
BOOLEAN | X | | |
ENUM | X | | |
CIDR | X | | |
INET | X | | |
MACADDR | X | | |
TSVECTOR | X | | |
TSQUERY | X | | |
XML | X | | |
POINT | X | | |
LINE | X | | |
LSEG | X | | |
BOX | X | | |
PATH | X | | |
POLYGON | X | | |
CIRCLE | X | | |
JSON | X | | |
ARRAY | X | | |
COMPOSITE | X | | |
RANGE | X | | |

Prerequisites for Using a PostgreSQL Database as a Source for AWS DMS

For a PostgreSQL database to be a source for AWS DMS, do the following:

  • Use a PostgreSQL database that is version 9.4.x or later.

  • Grant superuser permissions for the user account specified for the PostgreSQL source database.

  • Add the IP address of the AWS DMS replication server to the pg_hba.conf configuration file.

  • Set the following parameters and values in the postgresql.conf configuration file (a sample snippet follows this list):

    • Set wal_level = logical

    • Set max_replication_slots to a value greater than 1.

      The max_replication_slots value should be set according to the number of tasks that you want to run. For example, to run five tasks you need to set a minimum of five slots. Slots open automatically as soon as a task starts and remain open even when the task is no longer running. You need to manually delete open slots.

    • Set max_wal_senders to a value greater than 1.

      The max_wal_senders parameter sets the number of concurrent tasks that can run.

    • Set wal_sender_timeout = 0.

      The wal_sender_timeout parameter terminates replication connections that are inactive longer than the specified number of milliseconds. Although the default is 60 seconds, we recommend that you set this parameter to zero, which disables the timeout mechanism.

  • The idle_in_transaction_session_timeout parameter in PostgreSQL versions 9.6 and later lets you cause idle transactions to time out and fail. Some AWS DMS transactions are idle for some time before the AWS DMS engine uses them again, so don't end idle transactions when you use AWS DMS.
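
The following is a minimal sketch of how these settings might look in postgresql.conf, sized for five replication tasks as an example:

wal_level = logical
max_replication_slots = 5       # at least one slot for each task you plan to run
max_wal_senders = 5             # at least one sender for each concurrent task
wal_sender_timeout = 0          # disable the replication connection timeout

Because slots remain open after a task stops, you can remove a slot that is no longer needed with a command such as the following, where dms_slot_name is a placeholder:

SELECT pg_drop_replication_slot('dms_slot_name');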

Security Requirements When Using a PostgreSQL Database as a Source for AWS DMS

The only security requirement when using PostgreSQL as a source is that the user account specified must be a registered user in the PostgreSQL database.

Limitations on Using a PostgreSQL Database as a Source for AWS DMS

The following limitations apply when using PostgreSQL as a source for AWS DMS:

  • A captured table must have a primary key. If a table doesn't have a primary key, AWS DMS ignores DELETE and UPDATE record operations for that table.

  • Timestamp with a time zone type column isn't supported.

  • AWS DMS ignores an attempt to update a primary key segment. In these cases, the target identifies the update as one that didn't update any rows. However, because the results of updating a primary key in PostgreSQL are unpredictable, no records are written to the exceptions table.

  • AWS DMS doesn't support the Start Process Changes from Timestamp run option.

  • AWS DMS supports full load and change processing on Amazon RDS for PostgreSQL. For information on how to prepare a PostgreSQL DB instance and to set it up for using CDC, see Setting Up an Amazon RDS PostgreSQL DB Instance as a Source.

  • Replication of multiple tables with the same name where each name has a different case (for example, table1, TABLE1, and Table1) can cause unpredictable behavior. Because of this issue, AWS DMS doesn't support this type of replication.

  • In most cases, AWS DMS supports change processing of CREATE, ALTER, and DROP DDL statements for tables. AWS DMS doesn't support this change processing if the tables are held in an inner function or procedure body block or in other nested constructs.

    For example, the following change isn't captured:

    CREATE OR REPLACE FUNCTION attu.create_distributors1() RETURNS void
    LANGUAGE plpgsql
    AS $$
    BEGIN
      create table attu.distributors1(did serial PRIMARY KEY, name varchar(40) NOT NULL);
    END;
    $$;
  • AWS DMS doesn't support change processing of TRUNCATE operations.

  • The OID LOB data type isn't migrated to the target.

  • If your source is a PostgreSQL database that is on premises or on an Amazon EC2 instance, ensure that the test_decoding output plugin is installed on your source endpoint. You can find this plugin in the PostgreSQL contrib package. For more information about the test_decoding plugin, see the PostgreSQL documentation.

  • AWS DMS doesn’t support change processing to set column default values (using the ALTER COLUMN SET DEFAULT clause on ALTER TABLE statements).

  • AWS DMS doesn’t support change processing to set column nullability (using the ALTER COLUMN [SET|DROP] NOT NULL clause on ALTER TABLE statements).

  • AWS DMS doesn't support replication of partitioned tables. When a partitioned table is detected, the following occurs:

    • The endpoint reports a list of parent and child tables.

    • AWS DMS creates the table on the target as a regular table with the same properties as the selected tables.

    • If the parent table in the source database has the same primary key value as its child tables, a "duplicate key" error is generated.

Note

To replicate partitioned tables from a PostgreSQL source to a PostgreSQL target, you first need to manually create the parent and child tables on the target. Then you define a separate task to replicate to those tables. In such a case, you set the task configuration to Truncate before loading.
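
For example, on a target that uses declarative partitioning (PostgreSQL 10 and later), pre-creating the parent and child tables might look like the following sketch, with hypothetical table and column names, before you run the separate task with Truncate before loading:

CREATE TABLE sales (
    sale_id   integer NOT NULL,
    sale_date date    NOT NULL
) PARTITION BY RANGE (sale_date);

CREATE TABLE sales_2019 PARTITION OF sales
    FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');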

Note

The PostgreSQL NUMERIC data type isn't fixed in size. When transferring data that is a NUMERIC data type but without precision and scale, DMS uses NUMERIC(28,6) (a precision of 28 and scale of 6) by default. As an example, the value 0.611111104488373 from the source is converted to 0.611111 on the PostgreSQL target.

Setting Up an Amazon RDS PostgreSQL DB Instance as a Source

You can use an Amazon RDS for PostgreSQL DB instance or Read Replica as a source for AWS DMS. A DB instance can be used for both full-load and CDC (ongoing replication); a Read Replica can only be used for full-load tasks and cannot be used for CDC.

You use the AWS master user account for the PostgreSQL DB instance as the user account for the PostgreSQL source endpoint for AWS DMS. The master user account has the required roles that allow it to set up change data capture (CDC). If you use an account other than the master user account, the account must have the rds_superuser role and the rds_replication role. The rds_replication role grants permissions to manage logical slots and to stream data using logical slots.
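
For example, granting these roles from the master user account might look like the following sketch, where dms_user is a placeholder for the account you use:

GRANT rds_superuser TO dms_user;
GRANT rds_replication TO dms_user;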

If you don't use the master user account for the DB instance, you must create several objects from the master user account for the account that you use. For information about creating the needed objects, see Migrating an Amazon RDS for PostgreSQL Database Without Using the Master User Account.

Using CDC with an RDS for PostgreSQL DB Instance

You can use PostgreSQL's native logical replication feature to enable CDC during a database migration of an Amazon RDS PostgreSQL DB instance. This approach reduces downtime and ensures that the target database is in sync with the source PostgreSQL database. Amazon RDS supports logical replication for a PostgreSQL DB instance version 9.4.9 and higher and 9.5.4 and higher.

Note

Amazon RDS for PostgreSQL Read Replicas cannot be used for CDC (ongoing replication).

To enable logical replication for an RDS PostgreSQL DB instance, do the following:

  • In general, use the AWS master user account for the PostgreSQL DB instance as the user account for the PostgreSQL source endpoint. The master user account has the required roles that allow it to set up CDC. If you use an account other than the master user account, you must create several objects from the master account for the account that you use. For more information, see Migrating an Amazon RDS for PostgreSQL Database Without Using the Master User Account.

  • Set the rds.logical_replication parameter in your DB parameter group to 1. This is a static parameter that requires a reboot of the DB instance for the parameter to take effect. As part of applying this parameter, AWS DMS sets the wal_level, max_wal_senders, max_replication_slots, and max_connections parameters. These parameter changes can increase WAL generation, so you should only set the rds.logical_replication parameter when you are using logical slots.

  • A best practice is to set the wal_sender_timeout parameter to 0. Setting this parameter to 0 prevents PostgreSQL from terminating replication connections that are inactive longer than the specified timeout. When AWS DMS is migrating data, replication connections need to be able to last longer than the specified timeout.
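
As a sketch, you can apply these parameter settings with the AWS CLI modify-db-parameter-group command; the parameter group name is a placeholder. Because rds.logical_replication is static, it takes effect pending a reboot, while wal_sender_timeout is dynamic and can apply immediately:

aws rds modify-db-parameter-group \
    --db-parameter-group-name my-postgres-parameters \
    --parameters "ParameterName=rds.logical_replication,ParameterValue=1,ApplyMethod=pending-reboot" \
                 "ParameterName=wal_sender_timeout,ParameterValue=0,ApplyMethod=immediate"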

Migrating an Amazon RDS for PostgreSQL Database Without Using the Master User Account

In some cases, you might not use the master user account for the Amazon RDS PostgreSQL DB instance that you are using as a source. In these cases, you need to create several objects to capture data definition language (DDL) events. You create these objects in the account other than the master account and then create a trigger in the master user account.

Note

If you set the captureDDLs extra connection attribute to N on the source endpoint, you don't have to create the following table and trigger on the source database.

Use the following procedure to create these objects. The user account other than the master account is referred to as the NoPriv account in this procedure.

To create objects

  1. Choose the schema where the objects are to be created. The default schema is public. Ensure that the schema exists and is accessible by the NoPriv account.

  2. Log in to the PostgreSQL DB instance using the NoPriv account.

  3. Create the table awsdms_ddl_audit by running the following command, replacing <objects_schema> in the code following with the name of the schema to use.

    create table <objects_schema>.awsdms_ddl_audit
    (
      c_key    bigserial primary key,
      c_time   timestamp,    -- Informational
      c_user   varchar(64),  -- Informational: current_user
      c_txn    varchar(16),  -- Informational: current transaction
      c_tag    varchar(24),  -- Either 'CREATE TABLE' or 'ALTER TABLE' or 'DROP TABLE'
      c_oid    integer,      -- For future use - TG_OBJECTID
      c_name   varchar(64),  -- For future use - TG_OBJECTNAME
      c_schema varchar(64),  -- For future use - TG_SCHEMANAME. For now - holds current_schema
      c_ddlqry text          -- The DDL query associated with the current DDL event
    );
  4. Create the function awsdms_intercept_ddl by running the following command, replacing <objects_schema> in the code following with the name of the schema to use.

    CREATE OR REPLACE FUNCTION <objects_schema>.awsdms_intercept_ddl()
      RETURNS event_trigger
      LANGUAGE plpgsql
      SECURITY DEFINER
    AS $$
      declare _qry text;
    BEGIN
      if (tg_tag = 'CREATE TABLE' or tg_tag = 'ALTER TABLE' or tg_tag = 'DROP TABLE') then
        SELECT current_query() into _qry;
        insert into <objects_schema>.awsdms_ddl_audit
        values
        (
          default, current_timestamp, current_user,
          cast(TXID_CURRENT() as varchar(16)), tg_tag, 0, '', current_schema, _qry
        );
        delete from <objects_schema>.awsdms_ddl_audit;
      end if;
    END;
    $$;
  5. Log out of the NoPriv account and log in with an account that has the rds_superuser role assigned to it.

  6. Create the event trigger awsdms_intercept_ddl by running the following command.

    CREATE EVENT TRIGGER awsdms_intercept_ddl ON ddl_command_end EXECUTE PROCEDURE <objects_schema>.awsdms_intercept_ddl();

When you have completed the procedure preceding, you can create the AWS DMS source endpoint using the NoPriv account.

Removing AWS DMS Artifacts from a PostgreSQL Source Database

To capture DDL events, AWS DMS creates various artifacts in the PostgreSQL database when a migration task starts. When the task completes, you might want to remove these artifacts. To remove the artifacts, issue the following statements (in the order they appear), where {AmazonRDSMigration} is the schema in which the artifacts were created:

drop event trigger awsdms_intercept_ddl;

The event trigger doesn't belong to a specific schema.

drop function {AmazonRDSMigration}.awsdms_intercept_ddl();

drop table {AmazonRDSMigration}.awsdms_ddl_audit;

drop schema {AmazonRDSMigration};

Note

Dropping a schema should be done with extreme caution, if at all. Never drop an operational schema, especially not a public one.

Additional Configuration Settings When Using a PostgreSQL Database as a Source for AWS DMS

You can add additional configuration settings when migrating data from a PostgreSQL database in two ways:

  • You can add values to the extra connection attribute to capture DDL events and to specify the schema in which the operational DDL database artifacts are created. For more information, see Extra Connection Attributes When Using PostgreSQL as a Source for AWS DMS.

  • You can override connection string parameters. Select this option if you need to do either of the following:

    • Specify internal AWS DMS parameters. Such parameters are rarely required and are therefore not exposed in the user interface.

    • Specify pass-through (passthru) values for the specific database client. AWS DMS includes pass-through parameters in the connection string passed to the database client.

Using PostgreSQL Version 10.x as a Source for AWS DMS

PostgreSQL version 10.x databases have numerous changes in function names and folder names from previous PostgreSQL versions. These changes make certain migration actions not backward compatible when using AWS DMS versions earlier than 3.3.0.

Note

If you use a PostgreSQL 10.x database as a source for DMS 3.3.0 or later, don't perform the preparations described following. You can use the PostgreSQL source without any additional preparation.

Because most of the name changes are superficial, AWS DMS has created wrapper functions that let AWS DMS work with PostgreSQL versions 10.x. The wrapper functions are prioritized higher than functions in pg_catalog. In addition, we ensure that schema visibility of existing schemas isn't changed so that we don't override any other system catalog functions such as user-defined functions.

To use these wrapper functions before you perform any migration tasks, run the following SQL code on the source PostgreSQL database. Use the same AWS DMS user account that you used to create the source endpoint.

BEGIN;
CREATE SCHEMA IF NOT EXISTS fnRenames;
CREATE OR REPLACE FUNCTION fnRenames.pg_switch_xlog() RETURNS pg_lsn AS $$
  SELECT pg_switch_wal(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_xlog_replay_pause() RETURNS VOID AS $$
  SELECT pg_wal_replay_pause(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_xlog_replay_resume() RETURNS VOID AS $$
  SELECT pg_wal_replay_resume(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_current_xlog_location() RETURNS pg_lsn AS $$
  SELECT pg_current_wal_lsn(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_is_xlog_replay_paused() RETURNS boolean AS $$
  SELECT pg_is_wal_replay_paused(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_xlogfile_name(lsn pg_lsn) RETURNS TEXT AS $$
  SELECT pg_walfile_name(lsn); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_last_xlog_replay_location() RETURNS pg_lsn AS $$
  SELECT pg_last_wal_replay_lsn(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_last_xlog_receive_location() RETURNS pg_lsn AS $$
  SELECT pg_last_wal_receive_lsn(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_current_xlog_flush_location() RETURNS pg_lsn AS $$
  SELECT pg_current_wal_flush_lsn(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_current_xlog_insert_location() RETURNS pg_lsn AS $$
  SELECT pg_current_wal_insert_lsn(); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_xlog_location_diff(lsn1 pg_lsn, lsn2 pg_lsn) RETURNS NUMERIC AS $$
  SELECT pg_wal_lsn_diff(lsn1, lsn2); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_xlogfile_name_offset(lsn pg_lsn, OUT TEXT, OUT INTEGER) AS $$
  SELECT pg_walfile_name_offset(lsn); $$ LANGUAGE SQL;
CREATE OR REPLACE FUNCTION fnRenames.pg_create_logical_replication_slot(slot_name name, plugin name,
    temporary BOOLEAN DEFAULT FALSE, OUT slot_name name, OUT xlog_position pg_lsn) RETURNS RECORD AS $$
  SELECT slot_name::NAME, lsn::pg_lsn
  FROM pg_catalog.pg_create_logical_replication_slot(slot_name, plugin, temporary); $$ LANGUAGE SQL;
ALTER USER <user name> SET search_path = fnRenames, pg_catalog, "$user", public;

-- DROP SCHEMA fnRenames CASCADE;
-- ALTER USER PG_User SET search_path TO DEFAULT;
COMMIT;

Note

If you don't run this preparatory code on a source PostgreSQL 10.x database for DMS versions earlier than 3.3.0, you see an error like the following.

2018-10-29T02:57:50 [SOURCE_CAPTURE ]E: RetCode: SQL_ERROR SqlState: 42703 NativeError: 1 Message: ERROR: column "xlog_position" does not exist;, No query has been executed with that handle [1022502] (ar_odbc_stmt.c:3647)

After you have upgraded your AWS DMS version to 3.3.0 or later, follow these steps:

  1. Remove the fnRenames reference from the ALTER USER statement you use to set your source PostgreSQL 10.x configuration search path.

  2. Delete the fnRenames schema from your PostgreSQL database.
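
As a sketch, the cleanup can look like the following, where <user name> is the same account used in the original ALTER USER statement:

ALTER USER <user name> SET search_path TO DEFAULT;
DROP SCHEMA fnRenames CASCADE;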

If you don't follow these steps after upgrading, you see the following error in the log when the fnRenames schema is accessed:

RetCode: SQL_ERROR SqlState: 42703 NativeError: 1 Message: ERROR: column "lsn" does not exist;

Extra Connection Attributes When Using PostgreSQL as a Source for AWS DMS

You can use extra connection attributes to configure your PostgreSQL source. You specify these settings when you create the source endpoint. If you have multiple connection attribute settings, separate them from each other by semicolons with no additional white space.
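
For example, a hypothetical attribute string that combines several of the settings described in the following table looks like this:

captureDDLs=N;ddlArtifactsSchema=xyzddlschema;failTasksOnLobTruncation=true;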

The following table shows the extra connection attributes that you can use when using PostgreSQL as a source for AWS DMS:

Name Description

captureDDLs

To capture DDL events, AWS DMS creates various artifacts in the PostgreSQL database when the task starts. You can later remove these artifacts as described in Removing AWS DMS Artifacts from a PostgreSQL Source Database.

If this value is set to N, you don't have to create tables or triggers on the source database. Streamed DDL events are captured. For more information, see Migrating an Amazon RDS for PostgreSQL Database Without Using the Master User Account.

Default value: Y

Valid values: Y/N

Example: captureDDLs=Y;

ddlArtifactsSchema

The schema in which the operational DDL database artifacts are created.

Default value: public

Valid values: String

Example: ddlArtifactsSchema=xyzddlschema;

failTasksOnLobTruncation

When set to true, this value causes a task to fail if the actual size of a LOB column is greater than the specified LobMaxSize.

If a task is set to Limited LOB mode and this option is set to true, the task fails instead of truncating the LOB data.

Default value: false

Valid values: Boolean

Example: failTasksOnLobTruncation=true;

executeTimeout

Sets the client statement timeout for the PostgreSQL instance, in seconds. The default value is 60 seconds.

Example: executeTimeout=100;

Source Data Types for PostgreSQL

The following table shows the PostgreSQL source data types that are supported when using AWS DMS and the default mapping to AWS DMS data types.

For information on how to view the data type that is mapped in the target, see the section for the target endpoint you are using.

For additional information about AWS DMS data types, see Data Types for AWS Database Migration Service.

PostgreSQL Data Types | AWS DMS Data Types
INTEGER | INT4
SMALLINT | INT2
BIGINT | INT8
NUMERIC(p,s) | If precision is from 0 through 38, then use NUMERIC. If precision is 39 or greater, then use STRING.
DECIMAL(p,s) | If precision is from 0 through 38, then use NUMERIC. If precision is 39 or greater, then use STRING.
REAL | REAL4
DOUBLE | REAL8
SMALLSERIAL | INT2
SERIAL | INT4
BIGSERIAL | INT8
MONEY | NUMERIC(38,4). The MONEY data type is mapped to FLOAT in SQL Server.
CHAR | WSTRING(1)
CHAR(n) | WSTRING(n)
VARCHAR(n) | WSTRING(n)
TEXT | NCLOB
BYTEA | BLOB
TIMESTAMP | TIMESTAMP
TIMESTAMP(z) | TIMESTAMP
TIMESTAMP with time zone | Not supported
DATE | DATE
TIME | TIME
TIME(z) | TIME
INTERVAL | STRING(128): 1 YEAR, 2 MONTHS, 3 DAYS, 4 HOURS, 5 MINUTES, 6 SECONDS
BOOLEAN | CHAR(5): false or true
ENUM | STRING(64)
CIDR | STRING(50)
INET | STRING(50)
MACADDR | STRING(18)
BIT(n) | STRING(n)
BIT VARYING(n) | STRING(n)
UUID | STRING
TSVECTOR | CLOB
TSQUERY | CLOB
XML | CLOB
POINT | STRING(255): "(x,y)"
LINE | STRING(255): "(x,y,z)"
LSEG | STRING(255): "((x1,y1),(x2,y2))"
BOX | STRING(255): "((x1,y1),(x2,y2))"
PATH | CLOB: "((x1,y1),(xn,yn))"
POLYGON | CLOB: "((x1,y1),(xn,yn))"
CIRCLE | STRING(255): "(x,y),r"
JSON | NCLOB
JSONB | NCLOB
ARRAY | NCLOB
COMPOSITE | NCLOB
HSTORE | NCLOB
INT4RANGE | STRING(255)
INT8RANGE | STRING(255)
NUMRANGE | STRING(255)
STRRANGE | STRING(255)

PostgreSQL column sizes affect the conversion of PostgreSQL LOB data types to AWS DMS data types. To work with this, use the following guidelines for these AWS DMS data types (a worked example follows this list):

  • BLOB – Set Limit LOB size to the Maximum LOB size (KB) value at task creation.

  • CLOB – Replication handles each character as a UTF8 character. Therefore, find the length of the longest character text in the column, shown here as max_num_chars_text, and use it to specify the value for Limit LOB size to. If the data includes 4-byte characters, multiply by 2 to specify the Limit LOB size to value, which is in bytes. In this case, Limit LOB size to is equal to max_num_chars_text multiplied by 2.

  • NCLOB – Replication handles each character as a double-byte character. Therefore, find the length of the longest character text in the column (max_num_chars_text) and multiply by 2 to specify the value for Limit LOB size to. In this case, Limit LOB size to is equal to max_num_chars_text multiplied by 2. If the data includes 4-byte characters, multiply by 2 again. In this case, Limit LOB size to is equal to max_num_chars_text multiplied by 4.
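
For example, suppose that the longest text in an NCLOB-mapped column is 8,000 characters. Treating each character as double-byte gives 8,000 * 2 = 16,000 bytes, and if the data includes 4-byte characters the value doubles again to 8,000 * 4 = 32,000 bytes, so you would set Limit LOB size to to at least 32 KB. A query like the following sketch finds the longest length; the schema, table, and column names are hypothetical:

-- Find max_num_chars_text for sizing the Limit LOB size to task setting.
SELECT MAX(CHAR_LENGTH(notes)) AS max_num_chars_text
FROM myschema.orders;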