Application Code - Amazon Kinesis Data Analytics for SQL Applications Developer Guide

For new projects, we recommend that you use the new Managed Service for Apache Flink Studio over Kinesis Data Analytics for SQL Applications. Managed Service for Apache Flink Studio combines ease of use with advanced analytical capabilities, enabling you to build sophisticated stream processing applications in minutes.

Application Code

Application code is a series of SQL statements that process input and produce output. These SQL statements operate on in-application streams and reference tables. For more information, see Amazon Kinesis Data Analytics for SQL Applications: How It Works.

For information about the SQL language elements that are supported by Kinesis Data Analytics, see Amazon Kinesis Data Analytics SQL Reference.

In relational databases, you work with tables, using INSERT statements to add records and the SELECT statement to query the data. In Amazon Kinesis Data Analytics, you work with streams. You can write a SQL statement to query these streams. The results of querying one in-application stream are always sent to another in-application stream. When performing complex analytics, you might create several in-application streams to hold the results of intermediate analytics. And then finally, you configure application output to persist results of the final analytics (from one or more in-application streams) to external destinations. In summary, the following is a typical pattern for writing application code:

  • The SELECT statement is always used in the context of an INSERT statement. That is, when you select rows, you insert results into another in-application stream.

  • The INSERT statement is always used in the context of a pump. That is, you use pumps to write to an in-application stream.

The following example application code reads records from one in-application (SOURCE_SQL_STREAM_001) stream and write it to another in-application stream (DESTINATION_SQL_STREAM). You can insert records to in-application streams using pumps, as shown following:

CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" (ticker_symbol VARCHAR(4), change DOUBLE, price DOUBLE); -- Create a pump and insert into output stream. CREATE OR REPLACE PUMP "STREAM_PUMP" AS INSERT INTO "DESTINATION_SQL_STREAM" SELECT STREAM ticker_symbol, change,price FROM "SOURCE_SQL_STREAM_001";
Note

The identifiers that you specify for stream names and column names follow standard SQL conventions. For example, if you put quotation marks around an identifier, it makes the identifier case sensitive. If you don't, the identifier defaults to uppercase. For more information about identifiers, see Identifiers in the Amazon Managed Service for Apache Flink SQL Reference.

Your application code can consist of many SQL statements. For example:

  • You can write SQL queries in a sequential manner where the result of one SQL statement feeds into the next SQL statement.

  • You can also write SQL queries that run independent of each other. For example, you can write two SQL statements that query the same in-application stream, but send output into different in-applications streams. You can then query the newly created in-application streams independently.

You can create in-application streams to save intermediate results. You insert data in in-application streams using pumps. For more information, see In-Application Streams and Pumps.

If you add an in-application reference table, you can write SQL to join data in in-application streams and reference tables. For more information, see Example: Adding Reference Data to a Kinesis Data Analytics Application.

According to the application's output configuration, Amazon Kinesis Data Analytics writes data from specific in-application streams to the external destination according to the application's output configuration. Make sure that your application code writes to the in-application streams specified in the output configuration.

For more information, see the following topics: