Data structure recipe steps - AWS Glue DataBrew

Data structure recipe steps

Use these recipe steps to tabulate and summarize data from different perspectives, or to perform advanced functions.

SCALE

Scales or normalizes the range of data in a numeric column.

Parameters
  • sourceColumn — The name of an existing column.

  • strategy — The operation to be applied to the column values:

    • MIN_MAX — Rescales the values into a range of [0,1].

    • SCALE_BETWEEN — Rescales the values into a range of two specified values.

    • MEAN_NORMALIZATION — Rescales the data to have a mean (μ) of 0 and standard deviation (σ) of 1 within a range of [-1, 1].

    • Z_SCORE — Linearly scales data values to have a mean (μ) of 0 and standard deviation (σ) of 1. Best for handling outliers.

  • targetColumn — The name of a column to contain the results.

Example

{ "Action": { "Operation": "NORMALIZATION", "Parameters": { "sourceColumn": "all_votes", "strategy": "MIN_MAX", "targetColumn": "all_votes_normalized" } } }