Step 8: Use a blueprint to create a workflow - AWS Lake Formation

To read the CloudTrail logs, understand their structure, and create the appropriate tables in the Data Catalog, we need to set up a workflow that consists of AWS Glue crawlers, jobs, and triggers. Lake Formation blueprints simplify this process.

The workflow generates the jobs, crawlers, and triggers that discover and ingest data into your data lake. You create a workflow based on one of the predefined Lake Formation blueprints.

  1. In the Lake Formation console, in the navigation pane, choose Blueprints, and then choose Use blueprint.

  2. On the Use a blueprint page, under Blueprint type, choose AWS CloudTrail.

  3. Under Import source, choose a CloudTrail source and start date.

  4. Under Import target, specify these parameters:

    Target database: lakeformation_cloudtrail
    Target storage location: s3://<yourName>-datalake-cloudtrail
    Data format: Parquet

  5. For import frequency, choose Run on demand.

  6. Under Import options, specify these parameters:

    Workflow name: lakeformationcloudtrailtest
    IAM role: LakeFormationWorkflowRole
    Table prefix: cloudtrailtest

    Note

    The table prefix must be lower case.

  7. Choose Create, and wait for the console to report that the workflow was successfully created.

    Tip

    Did you get the following error message?

    User: arn:aws:iam::<account-id>:user/<datalake_administrator_user> is not authorized to perform: iam:PassRole on resource:arn:aws:iam::<account-id>:role/LakeFormationWorkflowRole...

    If so, check that you replaced <account-id> in the inline policy for the data lake administrator user with a valid AWS account number.
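
The parameter rules in the steps above (a lower-case table prefix, an S3 target location, and a role ARN that contains a real account number rather than the <account-id> placeholder) can be checked before you open the console. The following is a minimal Python sketch and not part of Lake Formation. The function names workflow_role_arn and check_blueprint_params and the dictionary keys are hypothetical, chosen only for illustration. It assumes that an AWS account ID is a 12-digit number.

```python
import re

# Values copied from the steps above. The bucket name still contains the
# <yourName> placeholder on purpose, to show how the check reports it.
BLUEPRINT_PARAMS = {
    "target_database": "lakeformation_cloudtrail",
    "target_storage_location": "s3://<yourName>-datalake-cloudtrail",
    "workflow_name": "lakeformationcloudtrailtest",
    "iam_role": "LakeFormationWorkflowRole",
    "table_prefix": "cloudtrailtest",
}


def workflow_role_arn(account_id: str, role_name: str) -> str:
    """Build the role ARN that the iam:PassRole permission must cover.

    Raises ValueError if account_id is not 12 digits, which catches a
    leftover "<account-id>" placeholder (hypothetical helper).
    """
    if not re.fullmatch(r"[0-9]{12}", account_id):
        raise ValueError(f"not a 12-digit AWS account ID: {account_id!r}")
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def check_blueprint_params(params: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    # The table prefix must be lower case.
    if params["table_prefix"] != params["table_prefix"].lower():
        problems.append("table prefix must be lower case")

    # The target storage location is an S3 URI with no leftover placeholder.
    location = params["target_storage_location"]
    if not location.startswith("s3://"):
        problems.append("target storage location must start with s3://")
    if "<" in location or ">" in location:
        problems.append("target storage location still contains a placeholder")

    return problems
```

With the values as written, check_blueprint_params(BLUEPRINT_PARAMS) reports only the unreplaced <yourName> placeholder. Calling workflow_role_arn("<account-id>", "LakeFormationWorkflowRole") raises ValueError, which corresponds to the mistake described in the tip above.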