AWS::DataBrew::Ruleset Rule - AWS CloudFormation

AWS::DataBrew::Ruleset Rule

Represents a single data quality requirement that should be validated in the scope of this dataset.

Syntax

To declare this entity in your AWS CloudFormation template, use the following syntax:

JSON

{ "CheckExpression" : String, "ColumnSelectors" : [ ColumnSelector, ... ], "Disabled" : Boolean, "Name" : String, "SubstitutionMap" : [ SubstitutionValue, ... ], "Threshold" : Threshold }

Properties

CheckExpression

The expression which includes column references, condition names followed by variable references, possibly grouped and combined with other conditions. For example, (:col1 starts_with :prefix1 or :col1 starts_with :prefix2) and (:col1 ends_with :suffix1 or :col1 ends_with :suffix2). Column and value references are substitution variables that should start with the ':' symbol. Depending on the context, substitution variables' values can be either an actual value or a column name. These values are defined in the SubstitutionMap. If a CheckExpression starts with a column reference, then ColumnSelectors in the rule should be null. If ColumnSelectors has been defined, then there should be no columnn reference in the left side of a condition, for example, is_between :val1 and :val2.

Required: Yes

Type: String

Pattern: ^[><0-9A-Za-z_.,:)(!= ]+$

Minimum: 4

Maximum: 1024

Update requires: No interruption

ColumnSelectors

List of column selectors. Selectors can be used to select columns using a name or regular expression from the dataset. Rule will be applied to selected columns.

Required: No

Type: Array of ColumnSelector

Minimum: 1

Update requires: No interruption

Disabled

A value that specifies whether the rule is disabled. Once a rule is disabled, a profile job will not validate it during a job run. Default value is false.

Required: No

Type: Boolean

Update requires: No interruption

Name

The name of the rule.

Required: Yes

Type: String

Minimum: 1

Maximum: 128

Update requires: No interruption

SubstitutionMap

The map of substitution variable names to their values used in a check expression. Variable names should start with a ':' (colon). Variable values can either be actual values or column names. To differentiate between the two, column names should be enclosed in backticks, for example, ":col1": "`Column A`".

Required: No

Type: Array of SubstitutionValue

Update requires: No interruption

Threshold

The threshold used with a non-aggregate check expression. Non-aggregate check expressions will be applied to each row in a specific column, and the threshold will be used to determine whether the validation succeeds.

Required: No

Type: Threshold

Update requires: No interruption