Example: Detecting Data Anomalies and Getting an Explanation (RANDOM_CUT_FOREST_WITH_EXPLANATION Function) - Amazon Kinesis Data Analytics for SQL Applications Developer Guide

After careful consideration, we have decided to discontinue Amazon Kinesis Data Analytics for SQL applications in two steps:

1. From October 15, 2025, you will not be able to create new Kinesis Data Analytics for SQL applications.

2. We will delete your applications starting January 27, 2026. You will not be able to start or operate your Amazon Kinesis Data Analytics for SQL applications. Support will no longer be available for Amazon Kinesis Data Analytics for SQL from that time. For more information, see Amazon Kinesis Data Analytics for SQL Applications discontinuation.

Example: Detecting Data Anomalies and Getting an Explanation (RANDOM_CUT_FOREST_WITH_EXPLANATION Function)

Amazon Kinesis Data Analytics provides the RANDOM_CUT_FOREST_WITH_EXPLANATION function, which assigns an anomaly score to each record based on values in the numeric columns. The function also provides an explanation of the anomaly. For more information, see RANDOM_CUT_FOREST_WITH_EXPLANATION in the Amazon Managed Service for Apache Flink SQL Reference.

In this exercise, you write application code to obtain anomaly scores for records in your application's streaming source. You also obtain an explanation for each anomaly.

First Step

Step 1: Prepare the Data