What is Amazon Security Lake? - Amazon Security Lake

Amazon Security Lake is in preview release. Your use of the Amazon Security Lake preview is subject to Section 2 of the AWS Service Terms ("Betas and Previews").

What is Amazon Security Lake?

Amazon Security Lake is a fully managed security data lake service. You can use Security Lake to automatically centralize security data from AWS and third-party sources into a data lake that's stored in your AWS account. Security Lake helps you analyze security data, so you can get a more complete understanding of your security posture across the entire organization. With Security Lake, you can also improve the protection of your workloads, applications, and data.

The data lake is backed by Amazon Simple Storage Service (Amazon S3) buckets, and you retain ownership over your data.

Security Lake automates the collection of security-related log and event data from integrated AWS services and third-party services. It also helps you manage the lifecycle of data with customizable retention and replication settings. Security Lake converts ingested data into Apache Parquet format and a standard open-source schema called the Open Cybersecurity Schema Framework (OCSF).

Other AWS services and third-party services can subscribe to the data that's stored in Security Lake for incident response and security data analytics.

Overview of Security Lake

Overview diagram of the Amazon Security Lake data lake

Features of Security Lake

Here are some key ways that Security Lake helps you centralize, manage, and subscribe to security-related log and event data.

Variety of supported log and event sources

Security Lake collects security logs and events from multiple sources, including on-premises, AWS services, and third-party services. After ingesting logs, regardless of the source, you can access them centrally, and manage their lifecycle.

Data transformation and normalization

Security Lake automatically partitions incoming data from natively supported AWS services and converts it to a storage- and query-efficient Parquet format. It also transforms data from natively supported AWS services to the Open Cybersecurity Schema Framework (OCSF) open-source schema. This makes the data compatible with other AWS services and third-party providers without the need for post-processing. Since Security Lake normalizes data, many security solutions can consume this data in parallel.

Multiple levels of access for subscribers

Subscribers consume data stored in Security Lake. You can choose a subscriber's level of access to your data. Subscribers may consume data only from the sources, and in the AWS Regions, that you specify. Subscribers may be automatically notified about new objects as they're written to the data lake. Or, subscribers can query data from the data lake. Security Lake automatically creates and exchanges the credentials needed between Security Lake and the subscriber.

Multi-account and multi-Region data management

You can centrally enable Security Lake across all Regions where it's available, and across multiple AWS accounts. In Security Lake, you can also specify a rollup Region to consolidate security log and event data from multiple Regions. This can help you comply with data residency compliance requirements.

Configurable and customizable

Security Lake is a configurable and customizable service. You can specify which sources, accounts, and Regions you want to configure log collection for. You can also specify a subscriber's level of access to the data lake.

Accessing Security Lake

For a list of Regions where Security Lake is currently available, see Amazon Security Lake Regions and endpoints. To learn more about Regions, see AWS service endpoints in the AWS General Reference.

In each Region, you can access Security Lake in any of the following ways:

AWS Management Console

The AWS Management Console is a browser-based interface that you can use to create and manage AWS resources. The Security Lake console provides access to your Security Lake account and resources. You can perform most Security Lake tasks by using the Security Lake console.

Security Lake API

To access Security Lake programmatically, use the Security Lake API, and issue HTTPS requests directly to the service. For more information, see the Security Lake API Reference.

AWS Command Line Interface (AWS CLI)

With the AWS CLI, you can issue commands at your system's command line to perform Security Lake tasks and AWS tasks. Using the command line can be faster and more convenient than using the console. The command line tools are also useful if you want to build scripts that perform tasks. For information about installing and using the AWS CLI, see the AWS Command Line Interface.

AWS SDKs

AWS provides SDKs that consist of libraries and sample code for various programming languages and platforms, such as Java, Go, Python, C++, and .NET. The SDKs provide convenient, programmatic access to Security Lake and other AWS services. They also handle tasks such as cryptographically signing requests, managing errors, and retrying requests automatically. For information about installing and using the AWS SDKs, see Tools to Build on AWS.

The following are other AWS services that Security Lake uses:

  • AWS Glue – Security Lake uses AWS Glue crawlers to create the AWS Glue Data Catalog tables and send newly written data to the Data Catalog. Security Lake also stores partition metadata for AWS Lake Formation tables in the Data Catalog.

  • AWS Lake Formation – Security Lake creates a separate Lake Formation table for each source that contributes data to Security Lake. Lake Formation tables contain information about data from each source, including schema, partition, and data location information. Subscribers have the option to consume data by querying the Lake Formation tables.

  • AWS Lambda – Security Lake uses Lambda functions to support extract, transform, and load (ETL) jobs on raw data and to register partitions for source data in AWS Glue.

  • Amazon S3 – Security Lake stores your data as Amazon S3 objects. Storage classes and retention settings are based on Amazon S3 offerings. Security Lake doesn't support Amazon S3 Select.

Security Lake collects data from custom sources in addition to the following AWS services:

  • AWS CloudTrail management and data events (S3, Lambda)

  • Amazon Route 53 resolver query logs

  • AWS Security Hub findings

  • Amazon Virtual Private Cloud (Amazon VPC) flow logs

For more information about these sources, see Collecting data from AWS services. You can consume the Amazon S3 objects in your security data lake by creating a subscriber that can read data in the OCSF schema. You can also query data by using Amazon Athena, Amazon Redshift, and third-party subscription services that integrate with AWS Glue.