Appendix B: YAML File Configuration
The Streaming Analytics Pipeline includes a YAML file that contains configuration information for the Amazon Kinesis Data Analytics application that the solution creates. Review the parameters in the YAML file and modify them as necessary for your implementation. Then, upload the file to an Amazon S3 bucket.
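For example, if you use the AWS SDK for Python (Boto3), the upload step might look like the following sketch. The bucket name is a placeholder for illustration, not a value created by the solution.

```python
# Sketch of uploading the edited configuration file with the AWS SDK for
# Python (Boto3). Assumes AWS credentials are already configured.
import boto3

s3 = boto3.client("s3")

# "my-config-bucket" is a hypothetical bucket name; use the bucket whose
# location you provide when you launch the solution.
s3.upload_file(
    Filename="streaming-analytics-pipeline-config.yaml",
    Bucket="my-config-bucket",
    Key="streaming-analytics-pipeline-config.yaml",
)
```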
streaming-analytics-pipeline-config.yaml: Use this file to specify your Amazon Kinesis Data Analytics application configuration.
Parameter | Default | Description |
---|---|---|
Input Format Type | CSV | The format of the records of the source stream. Choose CSV or JSON. |
Record Column Delimiter | "," | The column delimiter of CSV-formatted data from the source stream, for example, a comma (,). Note: Leave this parameter blank if you chose JSON as the input format type. |
Record Row Delimiter | "\n" | The row delimiter of CSV-formatted data from the source stream, for example, a newline (\n). Note: Leave this parameter blank if you chose JSON as the input format type. |
Record Row Path | "$" | The path to the top-level parent that contains the records. Note: Leave this parameter blank if you chose CSV as the input format type. |
Output Format Type | CSV | The format of the analyzed data that is put in the output stream. Choose CSV or JSON. |
Columns | &lt;Requires Input&gt; | A list of dictionary values specifying the name, SQL type, and (for JSON input) record row path mapping of each column. For example, CSV: {Name: pressure, SqlType: DOUBLE} or JSON: {Name: pressure, SqlType: DOUBLE, Mapping: $.pressure}. |
SQL Code | &lt;Requires Input&gt; | The Amazon Kinesis Data Analytics application code. The code will be copied to the application. |
The YAML configuration file is not required to run this solution. If you do not specify a file location, the solution launches an Amazon Kinesis Data Analytics application with the following configuration, which uses a "catch-all" schema.
Note
If you provide a YAML configuration file location, you must complete the Format section of the file.
```yaml
# Update this file according to your Input Schema and application code
# Note: pay attention to indentation - it matters
format:
  InputFormatType: CSV
  RecordColumnDelimiter: ","
  RecordRowDelimiter: "\n"
  RecordRowPath: "$"
  OutputFormatType: CSV
columns:
  - {Name: temp, SqlType: TINYINT}
  - {Name: segmentId, SqlType: CHAR(4)}
  - {Name: sensorIp, SqlType: VARCHAR(15)}
  - {Name: pressure, SqlType: DOUBLE}
  - {Name: incline, SqlType: DOUBLE}
  - {Name: flow, SqlType: BIGINT}
  - {Name: captureTs, SqlType: TIMESTAMP}
  - {Name: sensorId, SqlType: CHAR(4)}
sql_code: |
  -- Paste your SQL code here
  CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" (
      temp TINYINT,
      sensorIp VARCHAR(15),
      sensorId CHAR(4),
      captureTs TIMESTAMP,
      pressure DOUBLE);
  CREATE OR REPLACE PUMP "STREAM_PUMP" AS
      INSERT INTO "DESTINATION_SQL_STREAM"
      SELECT STREAM "temp", "sensorIp", "sensorId", "captureTs", "pressure"
      FROM "SOURCE_SQL_STREAM_001";
```
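The sample file above describes CSV-formatted input. As a rough sketch of what a JSON-formatted variant might look like, the following example uses hypothetical column names and mappings; per the parameter table above, the delimiter parameters are left blank and each column carries a Mapping entry.

```yaml
# Illustrative JSON-input variant (hypothetical column names and mappings).
# The delimiter parameters are left blank because the input format is JSON.
format:
  InputFormatType: JSON
  RecordColumnDelimiter: ""
  RecordRowDelimiter: ""
  RecordRowPath: "$"
  OutputFormatType: JSON
columns:
  - {Name: pressure, SqlType: DOUBLE, Mapping: $.pressure}
  - {Name: sensorId, SqlType: CHAR(4), Mapping: $.sensorId}
sql_code: |
  -- Paste your SQL code here
  CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" (
      pressure DOUBLE,
      sensorId CHAR(4));
  CREATE OR REPLACE PUMP "STREAM_PUMP" AS
      INSERT INTO "DESTINATION_SQL_STREAM"
      SELECT STREAM "pressure", "sensorId"
      FROM "SOURCE_SQL_STREAM_001";
```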