Adding a column using an expression Flagging target records using an expression Replicating source table headers using expressions Using SQLite functions to build expressions Adding metadata to a target table using expressions

Using transformation rule expressions to define column content

To define content for new and existing columns, you can use an expression within a transformation rule. For example, using expressions you can add a column or replicate source table headers to a target. You can also use expressions to flag records on target tables as inserted, updated, or deleted at the source.

Topics

Adding a column using an expression
Flagging target records using an expression
Replicating source table headers using expressions
Using SQLite functions to build expressions
Adding metadata to a target table using expressions

Adding a column using an expression

To add columns to tables using an expression in a transformation rule, use an add-column rule action and a column rule target.

The following example adds a new column to the ITEM table. It sets the new column name to FULL_NAME, with a data type of string, 50 characters long. The expression concatenates the values of two existing columns, FIRST_NAME and LAST_NAME, to evaluate to FULL_NAME. The schema-name, table-name, and expression parameters refer to objects in the source database table. Value and the data-type block refer to objects in the target database table.


{
    "rules": [
        {
            "rule-type": "selection", 
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "%"
            },
            "rule-action": "include"
        },
        {
            "rule-type": "transformation",
            "rule-id": "2",
            "rule-name": "2",
            "rule-action": "add-column",
            "rule-target": "column",
            "object-locator": {
                "schema-name": "Test",
                "table-name": "ITEM"
            },
            "value": "FULL_NAME",
            "expression": "$FIRST_NAME||'_'||$LAST_NAME",
            "data-type": {
                 "type": "string",
                 "length": 50
            }
        }
    ]
}

Flagging target records using an expression

To flag records in target tables as inserted, updated, or deleted in the source table, use an expression in a transformation rule. The expression uses an operation_indicator function to flag records. Records deleted from the source aren't deleted from the target. Instead, the target record is flagged with a user-provided value to indicate that it was deleted from the source.

Note

The operation_indicator function works only on tables that have a primary key on both source and target database.

For example, the following transformation rule first adds a new Operation column to a target table. It then updates the column with the value D whenever a record is deleted from a source table.


{
      "rule-type": "transformation",
      "rule-id": "2",
      "rule-name": "2",
      "rule-target": "column",
      "object-locator": {
        "schema-name": "%",
        "table-name": "%"
      },
      "rule-action": "add-column",
      "value": "Operation",
      "expression": "operation_indicator('D', 'U', 'I')",
      "data-type": {
        "type": "string",
        "length": 50
      }
}

Replicating source table headers using expressions

By default, headers for source tables aren't replicated to the target. To indicate which headers to replicate, use a transformation rule with an expression that includes the table column header.

You can use the following column headers in expressions.

Header	Value in ongoing replication	Value in full load	Data type
AR_H_STREAM_POSITION	The stream position value from the source. This value might be the system change number (SCN) or the log sequence number (LSN), depending on the source endpoint.	An empty string.	STRING
AR_H_TIMESTAMP	A timestamp indicating the time of the change.	A timestamp indicating the current time data arrives at the target.	DATETIME (scale=7)
AR_H_COMMIT_TIMESTAMP	A timestamp indicating the time of the commit.	A timestamp indicating the current time.	DATETIME (scale=7)
AR_H_OPERATION	INSERT, UPDATE, or DELETE	INSERT	STRING
AR_H_USER	The user name, ID, or any other information that the source provides about the user that made the change. This header is supported on the SQL Server and Oracle (version 11.2.0.3 and higher) source endpoints only.	An empty string.	STRING
AR_H_CHANGE_SEQ	A unique incrementing number from the source database that consists of a timestamp and an auto incrementing number. The value depends on the source database system.	An empty string.	STRING

The following example adds a new column to the target by using the stream position value from the source. For SQL Server, the stream position value is the LSN for the source endpoint. For Oracle, the stream position value is the SCN for the source endpoint.


{
      "rule-type": "transformation",
     "rule-id": "2",
      "rule-name": "2",
      "rule-target": "column",
      "object-locator": {
        "schema-name": "%",
        "table-name": "%"
      },
      "rule-action": "add-column",
      "value": "transact_id",
      "expression": "$AR_H_STREAM_POSITION",
      "data-type": {
        "type": "string",
        "length": 50
      }
    }

The following example adds a new column to the target that has a unique incrementing number from the source. This value represents a 35 digit unique number at task level. The first 16 digits are part of a timestamp, and the last 19 digits are the record_id number incremented by the DBMS.


{
"rule-type": "transformation",
"rule-id": "2",
"rule-name": "2",
"rule-target": "column",
"object-locator": {
"schema-name": "%",
"table-name": "%"
},
"rule-action": "add-column",
"value": "transact_id",
"expression": "$AR_H_CHANGE_SEQ",
"data-type": {
"type": "string",
"length": 50
}
}

Using SQLite functions to build expressions

You use table settings to specify any settings that you want to apply to the selected table or view for a specified operation. Table-settings rules are optional.

Note

Instead of the concept of tables and views, MongoDB and DocumentDB databases store data records as documents that are gathered together in collections. So then, when migrating from a MongoDB or DocumentDB source, consider the range segmentation type of parallel load settings for selected collections rather than tables and views.

Following, you can find string functions that you can use to build transformation rule expressions.

String functions	Description
`lower(x)`	The `lower(x)` function returns a copy of string `x` with all characters converted to lowercase. The default, built-in `lower` function works for ASCII characters only.
`upper(x)`	The `upper(x)` function returns a copy of string `x` with all characters converted to uppercase. The default, built-in `upper` function works for ASCII characters only.
`ltrim(x,y)`	The `ltrim(x,y)` function returns a string formed by removing all characters that appear in y from the left side of x. If there is no value for y, `ltrim(x)` removes spaces from the left side of x.
`replace(x,y,z)`	The `replace(x,y,z)` function returns a string formed by substituting string z for every occurrence of string y in string x.
`rtrim(x,y)`	The `rtrim(x,y)` function returns a string formed by removing all characters that appear in y from the right side of x. If there is no value for y, `rtrim(x)` removes spaces from the right side of x.
`substr(x,y,z)`	The `substr(x,y,z)` function returns a substring of the input string `x` that begins with the `y`th character, and which is `z` characters long. If `z` is omitted, `substr(x,y)` returns all characters through the end of string `x` beginning with the `y`th character. The leftmost character of `x` is number 1. If `y` is negative, the first character of the substring is found by counting from the right rather than the left. If `z` is negative, then the `abs(z)` characters preceding the `y`th character are returned. If `x` is a string, then the characters' indices refer to actual UTF-8 characters. If `x` is a BLOB, then the indices refer to bytes.
`trim(x,y)`	The `trim(x,y)` function returns a string formed by removing all characters that appear in `y` from both sides of `x`. If there is no value for `y`, `trim(x)` removes spaces from both sides of `x`.

Following, you can find LOB functions that you can use to build transformation rule expressions.

LOB functions	Description
`hex(x)`	The `hex` function receives a BLOB as an argument and returns an uppercase hexadecimal string version of the BLOB content.
`randomblob (N)`	The `randomblob(N)` function returns an `N`-byte BLOB that contains pseudorandom bytes. If `N` is less than 1, a 1-byte random BLOB is returned.
`zeroblob(N)`	The `zeroblob(N)` function returns a BLOB that consists of `N` bytes of 0x00.

Following, you can find numeric functions that you can use to build transformation rule expressions.

Numeric functions	Description
`abs(x)`	The `abs(x)` function returns the absolute value of the numeric argument `x`. The `abs(x)` function returns NULL if `x` is NULL. The `abs(x)` function returns 0.0 if `x` is a string or BLOB that can't be converted to a numeric value.
`random()`	The `random` function returns a pseudorandom integer between -9,223,372,036,854,775,808 and +9,223,372,036,854,775,807.
`round (x,y)`	The `round (x,y)` function returns a floating-point value `x` rounded to `y` digits to the right of the decimal point. If there is no value for `y`, it's assumed to be 0.
`max (x,y...)`	The multiargument `max` function returns the argument with the maximum value, or returns NULL if any argument is NULL. The `max` function searches its arguments from left to right for an argument that defines a collating function. If one is found, it uses that collating function for all string comparisons. If none of the arguments to `max` define a collating function, the `BINARY` collating function is used. The `max` function is a simple function when it has two or more arguments, but it operates as an aggregate function if it has a single argument.
`min (x,y...)`	The multiargument `min` function returns the argument with the minimum value. The `min` function searches its arguments from left to right for an argument that defines a collating function. If one is found, it uses that collating function for all string comparisons. If none of the arguments to `min` define a collating function, the `BINARY` collating function is used. The `min` function is a simple function when it has two or more arguments, but it operates as an aggregate function if it has a single argument.

Following, you can find NULL check functions that you can use to build transformation rule expressions.

NULL check functions Description

NULL check functions	Description
`coalesce (x,y...)`	The `coalesce` function returns a copy of its first non-NULL argument, but it returns NULL if all arguments are NULL. The coalesce function has at least two arguments.
`ifnull(x,y)`	The `ifnull` function returns a copy of its first non-NULL argument, but it returns NULL if both arguments are NULL. The `ifnull` function has exactly two arguments. The `ifnull` function is the same as `coalesce` with two arguments.
`nullif(x,y)`	The `nullif(x,y)` function returns a copy of its first argument if the arguments are different, but it returns NULL if the arguments are the same. The `nullif(x,y)` function searches its arguments from left to right for an argument that defines a collating function. If one is found, it uses that collating function for all string comparisons. If neither argument to nullif defines a collating function, then the `BINARY` collating function is used.

coalesce (x,y...)

The coalesce function returns a copy of its first non-NULL argument, but it returns NULL if all arguments are NULL. The coalesce function has at least two arguments.

ifnull(x,y)

The ifnull function returns a copy of its first non-NULL argument, but it returns NULL if both arguments are NULL. The ifnull function has exactly two arguments. The ifnull function is the same as coalesce with two arguments.

nullif(x,y)

The nullif(x,y) function returns a copy of its first argument if the arguments are different, but it returns NULL if the arguments are the same.

The nullif(x,y) function searches its arguments from left to right for an argument that defines a collating function. If one is found, it uses that collating function for all string comparisons. If neither argument to nullif defines a collating function, then the BINARY collating function is used.

Following, you can find date and time functions that you can use to build transformation rule expressions.

Date and time functions	Description
`date(timestring, modifier, modifier...)`	The `date` function returns the date in the format YYYY-MM-DD.
`time(timestring, modifier, modifier...)`	The `time` function returns the time in the format HH:MM:SS.
`datetime(timestring, modifier, modifier...)`	The `datetime` function returns the date and time in the format YYYY-MM-DD HH:MM:SS.
`julianday(timestring, modifier, modifier...)`	The `julianday` function returns the number of days since noon in Greenwich on November 24, 4714 B.C.
`strftime(format, timestring, modifier, modifier...)`	The `strftime` function returns the date according to the format string specified as the first argument, using one of the following variables: `%d`: day of month `%H`: hour 00–24 `%f`: fractional seconds SS.SSS `%j`: day of year 001–366 `%J`: Julian day number `%m`: month 01–12 `%M`: minute 00–59 `%s`: seconds since 1970-01-01 `%S`: seconds 00–59 `%w`: day of week 0–6 sunday==0 `%W`: week of year 00–53 `%Y`: year 0000–9999 `%%`: %

Following, you can find a hash function that you can use to build transformation rule expressions.

Hash function Description

Hash function	Description
`hash_sha256(x)`	The `hash` function generates a hash value for an input column (using the SHA-256 algorithm) and returns the hexadecimal value of the generated hash value. To use the `hash` function in an expression, add `hash_sha256(x)` to the expression and replace `x` with the source column name.

hash_sha256(x)

The hash function generates a hash value for an input column (using the SHA-256 algorithm) and returns the hexadecimal value of the generated hash value.

To use the hash function in an expression, add hash_sha256(x) to the expression and replace x with the source column name.

Using a CASE expression

The SQLite CASE expression evaluates a list of conditions and returns an expression based on the result. Syntax is shown following.


    CASE case_expression
     WHEN when_expression_1 THEN result_1
     WHEN when_expression_2 THEN result_2
     ...
     [ ELSE result_else ] 
    END

# Or 

     CASE
     WHEN case_expression THEN result_1
     WHEN case_expression THEN result_2
     ...
     [ ELSE result_else ] 
    END

Examples

Example of adding a new string column to the target table using a case condition

The following example transformation rule adds a new string column, emp_seniority, to the target table, employee. It uses the SQLite round function on the salary column, with a case condition to check if the salary equals or exceeds 20,000. If it does, the column gets the value SENIOR, and anything else has the value JUNIOR.


  {
      "rule-type": "transformation",
      "rule-id": "2",
      "rule-name": "2",
      "rule-action": "add-column",
      "rule-target": "column",
      "object-locator": {
        "schema-name": "public",
        "table-name": "employee"
      },
      "value": "emp_seniority",
      "expression": " CASE WHEN round($emp_salary)>=20000 THEN ‘SENIOR’ ELSE ‘JUNIOR’ END",
      "data-type": {
        "type": "string",
        "length": 50
      }

  }

Example of adding a new string column to the target table using a SUBSTR function

The following example transformation rule adds a new string column using SQLite operators or functions to define the data in a column. This approach involves using SQLite functions to transform the GUID data loaded from Oracle to UUID format before inserting it into the Postgresql target table.

Following rule uses the SQLite substring (SUBSTR), hexadecimal function (HEX), and lowercase (LOWER) functions to break the GUID data into several groups separated by hyphens, specifically a group of 8 digits followed by three groups of 4 digits followed by a group of 12 digits, for a total of 32 digits representing the 128 bits.

Here is the sample source data and output on target post processing through transformation rule:

Source Table (Oracle GUID format)

T_COL2


06F6949D234911EE80670242AC120002
1A2B3C4D5E6F11EE80670242AC120003
F5E4D3C2B1A011EE80670242AC120004

Target Table (PostgreSQL UUID format)

T_COL2_TMP


06f6949d-2349-11ee-8067-0242ac120002
1a2b3c4d-5e6f-11ee-8067-0242ac120003
f5e4d3c2-b1a0-11ee-8067-0242ac120004


{
  "rule-type": "transformation",
  "rule-id": "2",
  "rule-name": "2",
  "rule-action": "add-column",
  "rule-target": "column",
  "object-locator": {
    "schema-name": "SPORTS",
    "table-name": "TEST_TBL_2"
  },
  "value": "t_col2_tmp",
  "expression": "CASE LOWER(SUBSTR(HEX($T_COL2), 1, 8) || '-' || SUBSTR(HEX($T_COL2), 9, 4) || '-' || SUBSTR(HEX($T_COL2), 13, 4) || '-' || SUBSTR(HEX($T_COL2), 17, 4) || '-' || SUBSTR(HEX($T_COL2), 21, 12)) WHEN '----' THEN NULL ELSE LOWER(SUBSTR(HEX($T_COL2), 1, 8) || '-' || SUBSTR(HEX($T_COL2), 9, 4) || '-' || SUBSTR(HEX($T_COL2), 13, 4) || '-' || SUBSTR(HEX($T_COL2), 17, 4) || '-' || SUBSTR(HEX($T_COL2), 21, 12)) END",
  "data-type": {
    "type": "string",
    "length": 60
  }
}

Example of adding a new date column to the target table

The following example adds a new date column, createdate, to the target table, employee. When you use the SQLite date function datetime, the date is added to the newly created table for each row inserted.


  {
      "rule-type": "transformation",
      "rule-id": "2",
      "rule-name": "2",
      "rule-action": "add-column",
      "rule-target": "column",
      "object-locator": {
        "schema-name": "public",
        "table-name": "employee"
      },
      "value": "createdate",
      "expression": "datetime ()",
      "data-type": {
        "type": "datetime",
        "precision": 6
      }
  }

Example of adding a new numeric column to the target table

The following example adds a new numeric column, rounded_emp_salary, to the target table, employee. It uses the SQLite round function to add the rounded salary.


  {
      "rule-type": "transformation",
      "rule-id": "2",
      "rule-name": "2",
      "rule-action": "add-column",
      "rule-target": "column",
      "object-locator": {
        "schema-name": "public",
        "table-name": "employee"
      },
      "value": "rounded_emp_salary",
      "expression": "round($emp_salary)",
      "data-type": {
        "type": "int8"
      }
  }

Example of adding a new string column to the target table using the hash function

The following example adds a new string column, hashed_emp_number, to the target table, employee. The SQLite hash_sha256(x) function creates hashed values on the target for the source column, emp_number.


  {
      "rule-type": "transformation",
      "rule-id": "2",
      "rule-name": "2",
      "rule-action": "add-column",
      "rule-target": "column",
      "object-locator": {
        "schema-name": "public",
        "table-name": "employee"
      },
      "value": "hashed_emp_number",
      "expression": "hash_sha256($emp_number)",
      "data-type": {
        "type": "string",
        "length": 64
      }
  }

Adding metadata to a target table using expressions

You can add the metadata information to the target table by using the expressions following:

$AR_M_SOURCE_SCHEMA – The name of the source schema.
$AR_M_SOURCE_TABLE_NAME – The name of the source table.
$AR_M_SOURCE_COLUMN_NAME – The name of a column in the source table.
$AR_M_SOURCE_COLUMN_DATATYPE – The data type of a column in the source table.

Example of adding a column for a schema name using the schema name from the source

The example following adds a new column named schema_name to the target by using the schema name from the source.


  {
      "rule-type": "transformation",
      "rule-id": "2",
      "rule-name": "2",
      "rule-action": "add-column",
      "rule-target": "column",
      "object-locator": {
        "schema-name": "%",
        "table-name": "%"
      },
      "rule-action": "add-column",
      "value":"schema_name",
      "expression": "$AR_M_SOURCE_SCHEMA", 
      "data-type": { 
         "type": "string",
         "length": 50
      }
  }

Warning Javascript is disabled or is unavailable in your browser.

To use the Amazon Web Services Documentation, Javascript must be enabled. Please refer to your browser's Help pages for instructions.

Document Conventions

Transformation rules and actions

Table and collection settings rules and operations