FLAG_OUTLIERS
Returns a new column containing a customizable value in each row that indicates if the source column value is an outlier.
Parameters
- sourceColumn – Specifies the name of an existing numeric column that might contain outliers.
- targetColumn – Specifies the name of a new column where the results of the outlier evaluation strategy are to be inserted.
- outlierStrategy – Specifies the approach to use in detecting outliers. Valid values include the following:
  - Z_SCORE – Identifies a value as an outlier when it deviates from the mean by more than the standard deviation threshold.
  - MODIFIED_Z_SCORE – Identifies a value as an outlier when it deviates from the median by more than the median absolute deviation threshold.
  - IQR – Identifies a value as an outlier when it falls beyond the first and last quartile of column data. The interquartile range (IQR) measures where the middle 50% of the data points are.
- threshold – Specifies the threshold value to use when detecting outliers. The sourceColumn value is identified as an outlier if the score that's calculated with the outlierStrategy exceeds this number. The default is 3.
- trueString – Specifies the string value to use if an outlier is detected. The default is "True".
- falseString – Specifies the string value to use if no outlier is detected. The default is "False".
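To make the three strategies concrete, the following Python sketch shows one way such scores could be computed and compared against the threshold. It is an illustration under stated assumptions, not the service's implementation: the function name `flag_outliers` is hypothetical, the 0.6745 scaling constant in the modified z-score is the conventional one, and the IQR score (distance beyond the quartiles, measured in IQR units) is only one plausible reading of that strategy.

```python
import statistics


def flag_outliers(values, strategy="Z_SCORE", threshold=3.0,
                  true_string="True", false_string="False"):
    """Return one flag string per value. Hypothetical helper, for illustration only."""
    if strategy == "Z_SCORE":
        # Distance from the mean, in units of the (population) standard deviation.
        mean = statistics.fmean(values)
        stdev = statistics.pstdev(values)
        scores = [abs(v - mean) / stdev if stdev else 0.0 for v in values]
    elif strategy == "MODIFIED_Z_SCORE":
        # Distance from the median, in units of the median absolute deviation.
        # The 0.6745 factor is the conventional scaling; it is an assumption here.
        median = statistics.median(values)
        mad = statistics.median([abs(v - median) for v in values])
        scores = [0.6745 * abs(v - median) / mad if mad else 0.0 for v in values]
    elif strategy == "IQR":
        # Distance beyond the first or third quartile, in units of the IQR.
        # Assumed interpretation; values inside the quartiles score 0.
        q1, _, q3 = statistics.quantiles(values, n=4)
        iqr = q3 - q1
        scores = [max(q1 - v, v - q3, 0.0) / iqr if iqr else 0.0 for v in values]
    else:
        raise ValueError(f"Unknown outlierStrategy: {strategy}")

    # A value is flagged when its score exceeds the threshold.
    return [true_string if s > threshold else false_string for s in scores]


if __name__ == "__main__":
    sample = [10, 12, 11, 13, 12, 11, 10, 12, 11, 13, 12, 100]
    for name in ("Z_SCORE", "MODIFIED_Z_SCORE", "IQR"):
        print(name, flag_outliers(sample, strategy=name))
```

The mean and standard deviation are themselves pulled toward extreme values, so a plain z-score can miss outliers in small samples. The median-based and quartile-based scores are less sensitive to this, which is a common reason to prefer them for skewed data.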
The following examples display syntax for a single RecipeAction operation. A recipe contains at least one RecipeStep operation, and a recipe step contains at least one recipe action. A recipe action runs the data transform that you specify. A group of recipe actions runs in sequential order to create the final dataset.
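As a rough sketch of how the parameters above might be assembled into a single recipe action, the following Python snippet builds the action as a dictionary and prints it as JSON. The helper name `flag_outliers_action` and the column names are hypothetical. The `Operation` and `Parameters` layout, and passing every parameter value as a string, are assumptions about the recipe action shape rather than something this section confirms.

```python
import json


def flag_outliers_action(source_column, target_column,
                         outlier_strategy="Z_SCORE", threshold=3,
                         true_string="True", false_string="False"):
    """Build a FLAG_OUTLIERS recipe action as a dict. Hypothetical helper."""
    return {
        "RecipeAction": {
            "Operation": "FLAG_OUTLIERS",
            "Parameters": {
                # Assumption: parameter values are passed as strings.
                "sourceColumn": source_column,
                "targetColumn": target_column,
                "outlierStrategy": outlier_strategy,
                "threshold": str(threshold),
                "trueString": true_string,
                "falseString": false_string,
            },
        }
    }


if __name__ == "__main__":
    # "price" and "price_outlier_flag" are made-up column names.
    action = flag_outliers_action("price", "price_outlier_flag")
    print(json.dumps(action, indent=4))
```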