CloudTrail Lake concepts and terminology - AWS CloudTrail

This section describes the key concepts and terms to help you use AWS CloudTrail Lake.

Event data stores

Events are aggregated into event data stores, which are immutable collections of events based on criteria that you select by applying advanced event selectors.

You can create an event data store to log CloudTrail management events and data events, CloudTrail Insights events, AWS Audit Manager evidence, AWS Config configuration items, or events outside of AWS.

Advanced event selectors

Advanced event selectors determine which events to include in an event data store. Advanced event selectors help you control costs by logging only those events that are important to you.

For management events and data events, you can use advanced event selectors to filter events. For example, if you're creating an event data store to collect management events, you can filter out AWS Key Management Service (AWS KMS) or Amazon Relational Database Service (Amazon RDS) Data API events. AWS KMS actions such as Encrypt, Decrypt, and GenerateDataKey typically generate a large volume (more than 99 percent) of events.
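The filtering example above can be sketched as an advanced event selector definition. The following is a minimal illustration, not a definitive configuration: the selector name is made up, and the matches helper is a hypothetical local approximation that models only the Equals and NotEquals operators so that the effect of the selector can be checked without calling AWS.

```python
# Sketch of an advanced event selector that keeps management events but
# excludes AWS KMS and Amazon RDS Data API events. The selector name is
# illustrative only.
ADVANCED_EVENT_SELECTORS = [
    {
        "Name": "Management events without KMS or RDS Data API",
        "FieldSelectors": [
            {"Field": "eventCategory", "Equals": ["Management"]},
            {
                "Field": "eventSource",
                "NotEquals": ["kms.amazonaws.com", "rdsdata.amazonaws.com"],
            },
        ],
    }
]


def matches(event, selector):
    """Hypothetical local check: an event matches when every field selector passes.

    Only the Equals and NotEquals operators are modeled here.
    """
    for field_selector in selector["FieldSelectors"]:
        value = event.get(field_selector["Field"])
        if "Equals" in field_selector and value not in field_selector["Equals"]:
            return False
        if "NotEquals" in field_selector and value in field_selector["NotEquals"]:
            return False
    return True
```

With this selector, a management event from ec2.amazonaws.com would be included, while a management event from kms.amazonaws.com would be filtered out.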

For AWS Config configuration items, Audit Manager evidence, or events outside of AWS, advanced event selectors are used only to include events of that type in the event data store.

Federation

Federation lets you see the metadata associated with an event data store in the AWS Glue Data Catalog and run SQL queries on the event data using Amazon Athena. The table metadata stored in the AWS Glue Data Catalog lets the Athena query engine know how to find, read, and process the data that you want to query.

When you enable Lake query federation, CloudTrail creates the federated resources on your behalf and registers those resources with AWS Lake Formation. After Lake federation is enabled, you can directly query your event data in Athena without needing to perform any additional steps. For more information, see Federate an event data store.

Pricing option

When you create an event data store, you choose the pricing option that you want to use for the event data store. The pricing option determines the cost for ingesting and storing events, and the default and maximum retention periods for the event data store. For information about pricing, see AWS CloudTrail Pricing and Managing CloudTrail Lake costs.

Retention period

An event data store’s retention period determines how long event data is kept in the event data store. CloudTrail Lake determines whether to retain an event by checking if the eventTime of the event is within the specified retention period. For example, if you specify a retention period of 90 days, CloudTrail will remove events when their eventTime is older than 90 days.
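The retention check described above can be sketched as a comparison between an event's eventTime and the current time. This is a simplified model, not CloudTrail's implementation: the function name is hypothetical, and treating an event that is exactly at the boundary as retained is an assumption.

```python
from datetime import datetime, timedelta, timezone


def is_within_retention(event_time, retention_days, now=None):
    """Return True if event_time falls inside the retention period.

    Assumption: an event exactly retention_days old is still retained.
    """
    now = now or datetime.now(timezone.utc)
    return now - event_time <= timedelta(days=retention_days)


# Example with a fixed "now" so the result is reproducible.
now = datetime(2024, 4, 1, tzinfo=timezone.utc)
recent = datetime(2024, 3, 1, tzinfo=timezone.utc)   # 31 days old
old = datetime(2023, 12, 1, tzinfo=timezone.utc)     # 122 days old
print(is_within_retention(recent, 90, now))
print(is_within_retention(old, 90, now))
```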

Default retention period

An event data store’s default retention period is the default number of days that event data is kept in the event data store. During an event data store’s default retention period, storage is included with ingestion pricing at no additional charge. After the default retention period, pricing for storage is pay-as-you-go.

Maximum retention period

An event data store’s maximum retention period represents the maximum number of days that you can keep data in an event data store.

Termination protection

By default, termination protection is enabled for an event data store, which protects the event data store from being accidentally deleted. To delete an event data store that has termination protection enabled, first turn off termination protection by choosing Change termination protection from the Actions menu on the event data store’s details page. Then you can proceed with deleting the event data store. For more information, see Change termination protection with the console.

Integrations

You can use CloudTrail Lake integrations to log and store user activity data from the following sources:

  • Outside of AWS

  • Any source in your hybrid environments, such as in-house or software as a service (SaaS) applications hosted on premises or in the cloud, virtual machines, or containers

An integration requires a channel to deliver the events and an event data store to receive the events. After you set up your integration, call the PutAuditEvents API operation to ingest your application activity into CloudTrail. Then, you can use CloudTrail Lake to search, query, and analyze the data that is logged from your applications. For more information, see Create an integration with an event source outside of AWS.
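As a rough illustration of the ingestion step, the following sketch builds a request for the PutAuditEvents operation without sending it. The channel ARN, the user identity type, and the event values are placeholders, and the eventData field names reflect my understanding of the integrations event schema, so verify them against the schema reference before use. With boto3, a request of this shape would be passed to the put_audit_events method of the cloudtrail-data client.

```python
import json
import uuid
from datetime import datetime, timezone

# Placeholder ARN; use the ARN of the channel created for your integration.
CHANNEL_ARN = "arn:aws:cloudtrail:us-east-1:111122223333:channel/EXAMPLE-CHANNEL-ID"


def build_put_audit_events_request(event_name, principal_id, event_source):
    """Build a PutAuditEvents request body for a single application event."""
    uid = str(uuid.uuid4())
    event_data = {
        "version": "1.0",  # assumption: schema version expected by your integration
        "userIdentity": {"type": "ExampleUser", "principalId": principal_id},
        "eventSource": event_source,
        "eventName": event_name,
        "eventTime": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "UID": uid,
    }
    return {
        "channelArn": CHANNEL_ARN,
        # eventData is sent as a JSON string, not as a nested object.
        "auditEvents": [{"id": uid, "eventData": json.dumps(event_data)}],
    }


request = build_put_audit_events_request(
    "CreateTicket", "example-user-1", "helpdesk.example.com"
)
# To send it (requires boto3 and AWS credentials):
#   boto3.client("cloudtrail-data").put_audit_events(**request)
```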

Integration type

There are two types of integrations: direct and solution. With direct integrations, the partner calls the PutAuditEvents API operation to deliver events to the event data store for your AWS account. With solution integrations, the application runs in your AWS account and the application calls the PutAuditEvents API operation to deliver events to the event data store for your AWS account.

Channels

CloudTrail Lake uses channels to bring activity events in from sources outside of AWS, whether those sources are external partners that work with CloudTrail or your own sources. When you create a channel, you choose one or more event data stores to store events that arrive from the channel source. You can change the destination event data stores for a channel as needed, as long as the destination event data stores are set to log eventCategory="ActivityAuditLog" events. When you create a channel for events from an external partner, you provide a channel Amazon Resource Name (ARN) to the partner or source application.
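The destination requirement above can be checked locally before assigning an event data store to a channel. The helper below is a hypothetical sketch that inspects an advanced event selectors list and only looks for an Equals match on eventCategory.

```python
def logs_activity_audit_events(advanced_event_selectors):
    """Return True if any selector includes eventCategory = ActivityAuditLog."""
    for selector in advanced_event_selectors:
        for field_selector in selector.get("FieldSelectors", []):
            if (
                field_selector.get("Field") == "eventCategory"
                and "ActivityAuditLog" in field_selector.get("Equals", [])
            ):
                return True
    return False


# Selectors for an event data store intended to receive integration events.
integration_selectors = [
    {"FieldSelectors": [{"Field": "eventCategory", "Equals": ["ActivityAuditLog"]}]}
]
# Selectors for an event data store that collects only management events.
management_selectors = [
    {"FieldSelectors": [{"Field": "eventCategory", "Equals": ["Management"]}]}
]
```

In this sketch, only integration_selectors would qualify as a channel destination.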

Resource-based policies

Resource-based policies are JSON policy documents that you attach to a resource. The resource-based policy attached to the channel allows the source to transmit events through the channel. If a channel doesn't have a resource policy, only the channel owner can call the PutAuditEvents API operation on the channel. For more information, see AWS CloudTrail resource-based policy examples.
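For illustration, a channel policy of this kind might be assembled as follows. This is a sketch under stated assumptions: the account IDs, channel ARN, and external ID are placeholders, the function name is hypothetical, and the cloudtrail:ExternalId condition key should be confirmed against the resource-based policy examples before you rely on it.

```python
import json


def build_channel_policy(channel_arn, source_principal_arns, external_id=None):
    """Build a resource-based policy that lets the listed principals call PutAuditEvents."""
    statement = {
        "Sid": "ChannelPolicy",
        "Effect": "Allow",
        "Principal": {"AWS": list(source_principal_arns)},
        "Action": "cloudtrail-data:PutAuditEvents",
        "Resource": channel_arn,
    }
    if external_id is not None:
        # Assumption: restricts calls to requests that carry the expected external ID.
        statement["Condition"] = {
            "StringEquals": {"cloudtrail:ExternalId": external_id}
        }
    return {"Version": "2012-10-17", "Statement": [statement]}


policy = build_channel_policy(
    "arn:aws:cloudtrail:us-east-1:111122223333:channel/EXAMPLE-CHANNEL-ID",
    ["arn:aws:iam::444455556666:root"],
    external_id="ExampleExternalId",
)
print(json.dumps(policy, indent=2))
```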

Queries

Note

CloudTrail Lake offers a preview feature that uses generative artificial intelligence (generative AI) capabilities to produce a SQL query from an English language prompt. For more information, see Create CloudTrail Lake queries from English language prompts.

Queries in CloudTrail Lake are authored in SQL. You can build a query on the CloudTrail Lake Editor tab by writing the query in SQL from scratch, or by opening a saved or sample query and editing it. You can't overwrite an included sample query with your changes, but you can save it as a new query. For more information, see Create or edit a query with the CloudTrail console.

CloudTrail Lake supports all valid Presto SELECT statements and functions. For more information about the supported SQL functions and operators, see Functions and Operators on the Presto documentation website.
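As an example of the kind of SELECT statement involved, the following sketch composes a query that counts events by event source. The function name and the event data store ID are placeholders. In CloudTrail Lake queries, the FROM clause refers to the event data store ID. The string interpolation is for illustration only, so validate any values that come from user input.

```python
def top_event_sources_query(event_data_store_id, since, limit=10):
    """Compose a SQL query listing the event sources with the most events."""
    return (
        "SELECT eventSource, COUNT(*) AS eventCount\n"
        f"FROM {event_data_store_id}\n"
        f"WHERE eventTime > '{since}'\n"
        "GROUP BY eventSource\n"
        "ORDER BY eventCount DESC\n"
        f"LIMIT {int(limit)}"
    )


# Placeholder event data store ID and start time.
query = top_event_sources_query(
    "EXAMPLE-f852-4e8f-8bd1-bcf6cEXAMPLE", "2024-01-01 00:00:00"
)
print(query)
```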

Dashboard

By using the CloudTrail Lake dashboard, you can visualize the events in an event data store and see event trends, such as top AWS services, users, and errors. For more information, see View CloudTrail Lake dashboards with the CloudTrail console.

Dashboard type

The dashboard types available for an event data store depend on the advanced event selectors configuration of the event data store. For example, if a dashboard type displays information about CloudTrail management events, you can select that dashboard only if the currently selected event data store collects CloudTrail management events.

The following are the available dashboard types:

  • Overview dashboard – Shows the most active users, AWS Regions, and AWS services by event count. You can also view information about read and write management event activity, most throttled events, and the top errors. This dashboard is available for event data stores that collect management events.

  • Management Events dashboard – Shows console sign-in events, access denied events, destructive actions, and top errors by user. You can also view information about TLS versions and outdated TLS calls by user. This dashboard is available for event data stores that collect management events.

  • S3 Data Events dashboard – Shows Amazon S3 account activity, most accessed S3 objects, top S3 users, and top S3 actions. This dashboard is available for event data stores that collect Amazon S3 data events.

  • Insights Events dashboard – Shows the overall proportion of Insights events by Insights type, the proportion of Insights events by Insights type for the top users and services, and the number of Insights events per day. The dashboard also includes a widget that lists up to 30 days of Insights events. This dashboard is only available for event data stores that collect Insights events.

    Note
    • After you enable CloudTrail Insights for the first time on the source event data store, it can take up to 7 days for CloudTrail to deliver the first Insights event, if unusual activity is detected. For more information, see Understanding Insights events delivery.

    • The Insights Events dashboard only displays information about the Insights events collected by the selected event data store, which is determined by the configuration of the source event data store. For example, if you configure the source event data store to enable Insights events on ApiCallRateInsight but not ApiErrorRateInsight, you won't see information about Insights events on ApiErrorRateInsight.
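The availability rules in the list above can be summarized as a small lookup. This is only a sketch: the labels "management", "s3_data", and "insights" are hypothetical shorthand for what an event data store collects, not values used by CloudTrail.

```python
# Maps each dashboard type to the kind of events the event data store must collect.
# The requirement labels are hypothetical shorthand, not CloudTrail identifiers.
DASHBOARD_REQUIREMENTS = {
    "Overview": "management",
    "Management Events": "management",
    "S3 Data Events": "s3_data",
    "Insights Events": "insights",
}


def available_dashboards(collected):
    """Return the dashboard types available for the given set of collected event kinds."""
    return [
        dashboard
        for dashboard, requirement in DASHBOARD_REQUIREMENTS.items()
        if requirement in collected
    ]


print(available_dashboards({"management"}))
print(available_dashboards({"s3_data", "insights"}))
```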

Widgets

Widgets are the components that make up a dashboard and provide a visualization, such as a line chart or bar graph. Each widget represents an underlying query. When you choose Run queries, CloudTrail runs a system-generated query to populate the data for each widget.