Automation and access control - AWS Prescriptive Guidance

Automation and access control

Automation

Pipeline automation is a crucial part of modern data-centric architecture design. To run your production system successfully, we recommend a data pipeline that has a start trigger, connected steps, and a mechanism for separating failed and passed stages. It's also important to log failures without blocking the rest of the ETL process.

You can use AWS Glue workflows to create a pipeline. A workflow can orchestrate AWS Glue jobs, crawlers, and triggers, including Amazon EventBridge event triggers. You can create workflows from scratch or from AWS Glue blueprints. A blueprint provides a parameterized, reusable framework for a common use case, such as a workflow that imports data from Amazon S3 into an Amazon DynamoDB table. Because blueprints accept parameters, you can reuse the same blueprint across datasets.
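As a sketch of how such a workflow can be wired together, the following builds parameter dictionaries for the AWS Glue `CreateTrigger` API. The workflow and job names (`etl-workflow`, `extract-job`, `load-job`) are hypothetical placeholders; in practice you would pass each dictionary to boto3's `glue_client.create_trigger(**params)` after creating the workflow with `glue_client.create_workflow(Name="etl-workflow")`.

```python
def scheduled_start_trigger(workflow: str, first_job: str, schedule: str) -> dict:
    """Parameters for a scheduled trigger that starts the workflow's first job."""
    return {
        "Name": f"{workflow}-start",
        "WorkflowName": workflow,
        "Type": "SCHEDULED",
        "Schedule": schedule,            # e.g. "cron(0 2 * * ? *)"
        "StartOnCreation": True,
        "Actions": [{"JobName": first_job}],
    }


def on_success_trigger(workflow: str, watched_job: str, next_job: str) -> dict:
    """Parameters for a conditional trigger: run next_job only when
    watched_job reached the SUCCEEDED state. This is one way to keep
    failed and passed stages separated inside the workflow."""
    return {
        "Name": f"{workflow}-{watched_job}-succeeded",
        "WorkflowName": workflow,
        "Type": "CONDITIONAL",
        "Predicate": {
            "Logical": "AND",
            "Conditions": [{
                "LogicalOperator": "EQUALS",
                "JobName": watched_job,
                "State": "SUCCEEDED",
            }],
        },
        "StartOnCreation": True,
        "Actions": [{"JobName": next_job}],
    }
```

A failed `extract-job` simply never satisfies the conditional trigger's predicate, so `load-job` doesn't run and the failure can be handled separately.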

If the data pipeline involves services beyond AWS Glue, we recommend that you use AWS Step Functions as the orchestrator. Step Functions can create automated workflows, including manual approval steps for use cases such as security incident response. You can also use Step Functions for large-scale parallel or sequential processing.
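As a minimal sketch of a Step Functions workflow, the following builds an Amazon States Language (ASL) definition with a failure branch. The Lambda function ARNs are hypothetical placeholders for your own resources; the `Catch` block routes any error to a separate notification step, so a failed stage is logged rather than silently dropped.

```python
import json


def build_state_machine(transform_arn: str, notify_arn: str) -> str:
    """Return an ASL definition (JSON string) for a single ETL task
    whose failures are caught and routed to a notification task."""
    definition = {
        "Comment": "ETL step with an explicit failure branch",
        "StartAt": "Transform",
        "States": {
            "Transform": {
                "Type": "Task",
                "Resource": transform_arn,
                # Catch all errors and divert to the failure handler
                "Catch": [{
                    "ErrorEquals": ["States.ALL"],
                    "Next": "NotifyFailure",
                }],
                "End": True,
            },
            "NotifyFailure": {
                "Type": "Task",
                "Resource": notify_arn,
                "Next": "Failed",
            },
            "Failed": {"Type": "Fail"},
        },
    }
    return json.dumps(definition)
```

You would pass this string as the `definition` argument to boto3's `stepfunctions_client.create_state_machine` call, along with a name and an execution role ARN.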

Finally, we recommend that you use EventBridge to start pipelines on a schedule, in response to events, or on demand. You can also use EventBridge event patterns to filter which events start a pipeline.
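To illustrate both trigger styles, the following sketch builds parameter dictionaries for the EventBridge `PutRule` API, which you would pass to boto3's `events_client.put_rule(**params)`. The rule names and bucket name are hypothetical; one rule fires on a fixed schedule, while the other uses an event pattern so that only object-created events in one S3 bucket start the pipeline.

```python
import json


def schedule_rule(name: str, expression: str) -> dict:
    """A rule that fires on a schedule, e.g. rate(1 hour) or a cron expression."""
    return {"Name": name, "ScheduleExpression": expression, "State": "ENABLED"}


def s3_filter_rule(name: str, bucket: str) -> dict:
    """A rule whose event pattern matches only Object Created events from
    one bucket -- EventBridge filtering applied before the pipeline starts."""
    pattern = {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {"bucket": {"name": [bucket]}},
    }
    return {"Name": name, "EventPattern": json.dumps(pattern), "State": "ENABLED"}
```

After creating a rule, you attach the pipeline (for example, a Step Functions state machine ARN) with `events_client.put_targets`.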

Access control

We recommend that you use AWS Identity and Access Management (IAM) for access control. IAM lets you specify who or what can access services and resources in AWS and centrally manage fine-grained permissions. Every phase of the lifecycle, from storage to automation to processing tools, requires the right access permissions. For data-centric use cases, you can use AWS Lake Formation to simplify making data available for wide-ranging analytics, including across accounts.
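As a sketch of fine-grained permissions for the pipeline above, the following builds a least-privilege IAM policy document that allows a role to start and inspect runs of one specific AWS Glue workflow. The Region, account ID, and workflow name are placeholders; scope the `Resource` ARN as narrowly as your pipeline allows.

```python
import json


def workflow_run_policy(region: str, account: str, workflow: str) -> str:
    """Return an IAM policy document (JSON string) permitting runs of a
    single named Glue workflow, rather than a wildcard over all resources."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["glue:StartWorkflowRun", "glue:GetWorkflowRun"],
            "Resource": f"arn:aws:glue:{region}:{account}:workflow/{workflow}",
        }],
    }
    return json.dumps(policy, indent=2)
```

You would attach this document to the pipeline's execution role, for example with boto3's `iam_client.put_role_policy`.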