Example: Detecting Hotspots on a Stream (HOTSPOTS Function)

Amazon Kinesis Data Analytics provides the HOTSPOTS function, which can locate and return information about relatively dense regions in your data. For more information, see HOTSPOTS in the Amazon Managed Service for Apache Flink SQL Reference.

In this exercise, you write application code to locate hotspots on your application's streaming source. To set up the application, you do the following steps:

Set up a streaming source – You set up a Kinesis stream and write sample coordinate data as shown following:
```
{"x": 7.921782426109737, "y": 8.746265312709893, "is_hot": "N"}
{"x": 0.722248626528026, "y": 4.648868803193405, "is_hot": "Y"}
```
The example provides a Python script for you to populate the stream. The x and y values are randomly generated, with some records being clustered around certain locations.

The is_hot field is provided as an indicator if the script intentionally generated the value as part of a hotspot. This can help you evaluate whether the hotspot detection function is working properly.
Create the application – Using the AWS Management Console, you then create a Kinesis Data Analytics application. Configure the application input by mapping the streaming source to an in-application stream (SOURCE_SQL_STREAM_001). When the application starts, Kinesis Data Analytics continuously reads the streaming source and inserts records into the in-application stream.

In this exercise, you use the following code for the application:
```
CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" (
    "x" DOUBLE, 
    "y" DOUBLE, 
    "is_hot" VARCHAR(4),
    HOTSPOTS_RESULT VARCHAR(10000)
); 
CREATE OR REPLACE PUMP "STREAM_PUMP" AS 
    INSERT INTO "DESTINATION_SQL_STREAM" 
    SELECT "x", "y", "is_hot", "HOTSPOTS_RESULT" 
    FROM TABLE (
        HOTSPOTS(   
            CURSOR(SELECT STREAM "x", "y", "is_hot" FROM "SOURCE_SQL_STREAM_001"), 
            1000, 
            0.2, 
            17)
    );
```
The code reads rows in the SOURCE_SQL_STREAM_001, analyzes it for significant hotspots, and writes the resulting data to another in-application stream (DESTINATION_SQL_STREAM). You use pumps to insert rows in in-application streams. For more information, see In-Application Streams and Pumps.
Configure the output – You configure the application output to send data from the application to an external destination, which is another Kinesis data stream. Review the hotspot scores and determine what scores indicate that a hotspot occurred (and that you need to be alerted). You can use an AWS Lambda function to further process hotspot information and configure alerts.
Verify the output – The example includes a JavaScript application that reads data from the output stream and displays it graphically, so you can view the hotspots that the application generates in real time.

The exercise uses the US West (Oregon) (us-west-2) to create these streams and your application. If you use any other Region, update the code accordingly.

Topics

Warning Javascript is disabled or is unavailable in your browser.

To use the Amazon Web Services Documentation, Javascript must be enabled. Please refer to your browser's Help pages for instructions.

Document Conventions

Step 3: Examine the Results

Step 1: Create Streams