Create the table for ALB connection logs in Athena using partition projection - Amazon Athena

Create the table for ALB connection logs in Athena using partition projection

Because ALB connection logs have a known structure whose partition scheme you can specify in advance, you can reduce query runtime and automate partition management by using the Athena partition projection feature. Partition projection automatically adds new partitions as new data is added. This removes the need for you to manually add partitions by using ALTER TABLE ADD PARTITION.

The following example CREATE TABLE statement automatically uses partition projection on ALB connection logs from a specified date until the present for a single AWS region. The statement is based on the example in the previous section but adds PARTITIONED BY and TBLPROPERTIES clauses to enable partition projection. In the LOCATION and storage.location.template clauses, replace the placeholders with values that identify the Amazon S3 bucket location of your ALB connection logs. For more information about connection log file location, see Connection log files in the User Guide for Application Load Balancers. For projection.day.range, replace 2023/01/01 with the starting date that you want to use. After you run the query successfully, you can query the table. You do not have to run ALTER TABLE ADD PARTITION to load the partitions. For information about each log file field, see Connection log entries.

CREATE EXTERNAL TABLE IF NOT EXISTS alb_connection_logs ( time string, client_ip string, client_port int, listener_port int, tls_protocol string, tls_cipher string, tls_handshake_latency double, leaf_client_cert_subject string, leaf_client_cert_validity string, leaf_client_cert_serial_number string, tls_verify_status string, conn_trace_id string ) PARTITIONED BY ( day STRING ) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe' WITH SERDEPROPERTIES ( 'serialization.format' = '1', 'input.regex' = '([^ ]*) ([^ ]*) ([0-9]*) ([0-9]*) ([A-Za-z0-9.-]*) ([^ ]*) ([-.0-9]*) \"([^\"]*)\" ([^ ]*) ([^ ]*) ([^ ]*) ?([^ ]*)?( .*)?' ) LOCATION 's3://amzn-s3-demo-bucket/AWSLogs/<ACCOUNT-NUMBER>/elasticloadbalancing/<REGION>/' TBLPROPERTIES ( "projection.enabled" = "true", "projection.day.type" = "date", "projection.day.range" = "2023/01/01,NOW", "projection.day.format" = "yyyy/MM/dd", "projection.day.interval" = "1", "projection.day.interval.unit" = "DAYS", "storage.location.template" = "s3://amzn-s3-demo-bucket/AWSLogs/<ACCOUNT-NUMBER>/elasticloadbalancing/<REGION>/${day}" )

For more information about partition projection, see Use partition projection with Amazon Athena.