Statistical Variance and Deviation Functions - Amazon Kinesis Data Analytics SQL Reference

Statistical Variance and Deviation Functions

Each of these functions takes a set of numbers, ignores nulls, and can be used as either an aggregate function or an analytical function. For more information, see Aggregate Functions and Analytic Functions.

The relationships among these functions are described in the following table.

Function purpose Function name Formula Comments

Hotspots

HOTSPOTS (expr)

Detects hotspots of frequently occurring data in the data stream.

Random Cut Forest

RANDOM_CUT_FOREST (expr)

Detects anomalies in the data stream.

Random Cut Forest with Explanation

RANDOM_CUT_FOREST_WITH_EXPLANATION (expr)

Detects anomalies in the data stream, and returns an attribution score based on how anomalous the data in each column is.

Population variance

VAR_POP(expr)

( SUM(expr*expr) - SUM(expr)*SUM(expr) / COUNT(expr)) / COUNT(expr)

Applied to an empty set, it returns null.

Population standard deviation

STDDEV_POP

Square root of the population variance (VAR_POP).

When VAR_POP returns null, STDDEV_POP returns null.

Sample variance

VAR_SAMP

(SUM(expr*expr) - SUM(expr)*SUM(expr) / COUNT(expr)) / (COUNT(expr)−1)

Applied to an empty set, it returns null.

Applied to an input set of one element, VAR_SAMP returns null.

Sample standard deviation

STDDEV_SAMP (expr)

Square root of the sample variance (VAR_SAMP).

Applied to only 1 row of input data, STDDEV_SAMP returns null.