Table properties - Managed Service for Apache Flink

Amazon Managed Service for Apache Flink was previously known as Amazon Kinesis Data Analytics for Apache Flink.

Table properties

In addition to data fields, your AWS Glue tables provide other information to your Studio notebook using table properties. Managed Service for Apache Flink uses AWS Glue table properties for two purposes: Apache Flink time values, and Flink connector and format properties.

To add a property to an AWS Glue table, do the following:

  1. Sign in to the AWS Management Console and open the AWS Glue console at https://console.aws.amazon.com/glue/.

  2. From the list of tables, choose the table that your application uses to store its data connection information. Choose Action, Edit table details.

  3. Under Table Properties, enter the property's key and value. For example, enter managed-flink.proctime for Key and user_action_time for Value.
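
The console steps above can also be scripted. The following is a minimal sketch, assuming you use the AWS SDK for Python (boto3) and its Glue `get_table` and `update_table` calls. The helper names, and the database and table names in the usage note, are hypothetical. `UpdateTable` replaces the table definition, so the sketch copies the existing definition and changes only the one property.

```python
# Fields that Glue accepts in TableInput; read-only fields returned by
# GetTable (for example CreateTime or DatabaseName) must be dropped.
TABLE_INPUT_KEYS = {
    "Name", "Description", "Owner", "LastAccessTime", "LastAnalyzedTime",
    "Retention", "StorageDescriptor", "PartitionKeys", "ViewOriginalText",
    "ViewExpandedText", "TableType", "Parameters", "TargetTable",
}


def with_table_property(table, key, value):
    """Build a TableInput dict from a GetTable 'Table' dict with one property set.

    Hypothetical helper: it does not call AWS and does not modify `table`.
    """
    table_input = {k: v for k, v in table.items() if k in TABLE_INPUT_KEYS}
    parameters = dict(table.get("Parameters", {}))  # copy existing properties
    parameters[key] = value                         # Glue properties are strings
    table_input["Parameters"] = parameters
    return table_input


def add_table_property(database, table_name, key, value):
    """Read the table, set one property, and write the definition back."""
    import boto3  # imported here so the helper above works without AWS access

    glue = boto3.client("glue")
    table = glue.get_table(DatabaseName=database, Name=table_name)["Table"]
    glue.update_table(
        DatabaseName=database,
        TableInput=with_table_property(table, key, value),
    )
```

For example, `add_table_property("my_database", "my_table", "managed-flink.proctime", "user_action_time")` would set the same property as step 3, given credentials that allow `glue:GetTable` and `glue:UpdateTable`.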

Using Apache Flink time values

Apache Flink provides time values that describe when stream processing events occurred, such as Processing Time and Event Time. To include these values in your application output, you define properties on your AWS Glue table that tell the Managed Service for Apache Flink runtime to emit these values into the specified fields.

The keys and values you use in your table properties are as follows:

| Timestamp Type | Key | Value |
|---|---|---|
| Processing Time | managed-flink.proctime | The column name that AWS Glue will use to expose the value. This column name does not correspond to an existing table column. |
| Event Time | managed-flink.rowtime | The column name that AWS Glue will use to expose the value. This column name corresponds to an existing table column. |
| | managed-flink.watermark.column_name.milliseconds | The watermark interval in milliseconds |
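
As an illustration of how these keys fit together, here is a small sketch that builds the property map for a table. It assumes that `column_name` in the watermark key is a placeholder for the event-time column, and that all values are stored as strings, as AWS Glue table properties are. The function name and the column names in the usage note are hypothetical.

```python
def time_properties(proctime_column=None, rowtime_column=None, watermark_ms=None):
    """Return the Glue table properties for Flink time values as a dict of strings."""
    if watermark_ms is not None and rowtime_column is None:
        # The watermark key embeds the event-time column name, so it needs one.
        raise ValueError("watermark_ms requires rowtime_column")

    properties = {}
    if proctime_column is not None:
        # Processing Time: a new column, not one that already exists in the table.
        properties["managed-flink.proctime"] = proctime_column
    if rowtime_column is not None:
        # Event Time: must name an existing table column.
        properties["managed-flink.rowtime"] = rowtime_column
        if watermark_ms is not None:
            key = f"managed-flink.watermark.{rowtime_column}.milliseconds"
            properties[key] = str(watermark_ms)
    return properties
```

Calling `time_properties(proctime_column="user_action_time", rowtime_column="event_time", watermark_ms=5000)` yields three entries: the two column mappings and a watermark key that contains `event_time`.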

Using Flink connector and format properties

You provide information about your data sources to your application's Flink connectors using AWS Glue table properties. Some examples of the properties that Managed Service for Apache Flink uses for connectors are as follows:

| Connector Type | Key | Value |
|---|---|---|
| Kafka | format | The format used to deserialize and serialize Kafka messages, e.g. json or csv. |
| | scan.startup.mode | The startup mode for the Kafka consumer, e.g. earliest-offset or timestamp. |
| Kinesis | format | The format used to deserialize and serialize Kinesis data stream records, e.g. json or csv. |
| | aws.region | The AWS region where the stream is defined. |
| S3 (Filesystem) | format | The format used to deserialize and serialize files, e.g. json or csv. |
| | path | The Amazon S3 path, e.g. s3://mybucket/. |
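
The connector properties listed here can be assembled the same way as any other table property. The sketch below is illustrative only: the function names are hypothetical, and the set of Kafka startup modes is taken from the Apache Flink Kafka SQL connector rather than from this page. A real table typically needs further connector options that are not listed here.

```python
# Startup modes defined by the Apache Flink Kafka SQL connector (assumption:
# the runtime passes this property through to the connector unchanged).
KAFKA_STARTUP_MODES = {
    "earliest-offset", "latest-offset", "group-offsets",
    "timestamp", "specific-offsets",
}


def kafka_properties(fmt, startup_mode):
    """Properties for a Kafka-backed table: message format and consumer start position."""
    if startup_mode not in KAFKA_STARTUP_MODES:
        raise ValueError(f"unknown scan.startup.mode: {startup_mode}")
    return {"format": fmt, "scan.startup.mode": startup_mode}


def kinesis_properties(fmt, region):
    """Properties for a Kinesis-backed table: record format and stream region."""
    return {"format": fmt, "aws.region": region}


def s3_properties(fmt, path):
    """Properties for an S3 (filesystem) table: file format and bucket path."""
    if not path.startswith("s3://"):
        raise ValueError("path must be an Amazon S3 URI such as s3://mybucket/")
    return {"format": fmt, "path": path}
```

For instance, `kafka_properties("json", "earliest-offset")` returns a dict with the `format` and `scan.startup.mode` keys, which you can then add to the table's properties.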

For information about connectors other than Kinesis and Apache Kafka, see your connector's documentation.