Creating a data lake from an AWS CloudTrail source

This tutorial walks you through the steps to take in the Lake Formation console to create and load your first data lake from an AWS CloudTrail source.

High-level steps for creating a data lake

  1. Register an Amazon Simple Storage Service (Amazon S3) path as a data lake.

  2. Grant Lake Formation permissions to write to the Data Catalog and to Amazon S3 locations in the data lake.

  3. Create a database to organize the metadata tables in the Data Catalog.

  4. Use a blueprint to create a workflow. Run the workflow to ingest data from a data source.

  5. Set up your Lake Formation permissions to allow others to manage data in the Data Catalog and the data lake.

  6. Set up Amazon Athena to query the data that you imported into your Amazon S3 data lake.

  7. For some data store types, set up Amazon Redshift Spectrum to query the data that you imported into your Amazon S3 data lake.
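
The tutorial itself uses the console, but the order of the steps above may be easier to see as a list of API requests. The following is a minimal sketch, assuming you wanted to script the same flow with boto3. Every ARN, bucket name, and database name is a hypothetical placeholder, and the `SELECT` grant and `SHOW TABLES` query are illustrative choices rather than values from this tutorial. Step 4 (the blueprint workflow) and step 7 (Amazon Redshift Spectrum) are left out and assumed to be done separately.

```python
"""Sketch: request parameters for the scriptable steps in the list above."""

# Hypothetical placeholders, not values taken from the tutorial.
BUCKET_ARN = "arn:aws:s3:::example-datalake-bucket"
WORKFLOW_ROLE_ARN = "arn:aws:iam::111122223333:role/ExampleWorkflowRole"
ANALYST_ARN = "arn:aws:iam::111122223333:user/example-analyst"
DATABASE = "example_cloudtrail_db"
ATHENA_OUTPUT = "s3://example-athena-results/"


def grant_params(principal_arn, resource, permissions):
    """Build the keyword arguments shared by every grant_permissions call."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": resource,
        "Permissions": list(permissions),
    }


def build_plan():
    """Return (service, operation, kwargs) tuples in execution order."""
    return [
        # Step 1: register the S3 path with Lake Formation.
        ("lakeformation", "register_resource",
         {"ResourceArn": BUCKET_ARN, "UseServiceLinkedRole": True}),
        # Step 2 (S3 part): let the workflow role write to the location.
        ("lakeformation", "grant_permissions",
         grant_params(WORKFLOW_ROLE_ARN,
                      {"DataLocation": {"ResourceArn": BUCKET_ARN}},
                      ["DATA_LOCATION_ACCESS"])),
        # Step 3: create the database that holds the metadata tables.
        ("glue", "create_database", {"DatabaseInput": {"Name": DATABASE}}),
        # Step 2 (Data Catalog part): must follow database creation.
        ("lakeformation", "grant_permissions",
         grant_params(WORKFLOW_ROLE_ARN,
                      {"Database": {"Name": DATABASE}},
                      ["CREATE_TABLE"])),
        # Step 4 (blueprint workflow) is assumed to run here, outside the plan.
        # Step 5: give another principal read access to all tables.
        ("lakeformation", "grant_permissions",
         grant_params(ANALYST_ARN,
                      {"Table": {"DatabaseName": DATABASE,
                                 "TableWildcard": {}}},
                      ["SELECT"])),
        # Step 6: list the imported tables through Athena.
        ("athena", "start_query_execution",
         {"QueryString": "SHOW TABLES",
          "QueryExecutionContext": {"Database": DATABASE},
          "ResultConfiguration": {"OutputLocation": ATHENA_OUTPUT}}),
    ]


def apply_plan(plan):
    """Send each request to AWS; requires credentials and boto3."""
    import boto3  # imported lazily so the plan can be inspected offline

    for service, operation, kwargs in plan:
        getattr(boto3.client(service), operation)(**kwargs)


if __name__ == "__main__":
    # Print the plan only; call apply_plan(build_plan()) to execute it.
    for service, operation, kwargs in build_plan():
        print(f"{service}.{operation}({kwargs})")
```

Keeping the plan as plain data separates the ordering of the steps from the AWS calls, so it can be reviewed before anything is changed in an account. Running the script only prints the requests; `apply_plan` would need credentials for a principal with data lake administrator permissions.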