Data lifecycle - AWS Prescriptive Guidance

Data lifecycle

To build a data pipeline, you must first ingest data into AWS from an external or internal data source, such as a file server, database, storage bucket, or API call. The ingested data can then optionally be transformed, for example through anonymization, column dropping, or data cleaning.
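As a rough illustration of the optional transformation step, the following Python sketch applies column dropping, basic cleaning, and a simple form of anonymization to a few in-memory records. Everything in it is an assumption made for illustration: the sample records, the column names, and the `transform` and `pseudonymize` helpers are hypothetical and are not part of any AWS service or API. Hashing a value is closer to pseudonymization than to full anonymization, so treat it as a placeholder for whatever technique your data requirements call for.

```python
import hashlib

# Hypothetical sample records standing in for data ingested from a source.
RAW_ROWS = [
    {"customer_id": "17", "email": " Ana@Example.com ", "notes": "call back", "amount": "12.50"},
    {"customer_id": "18", "email": "bo@example.com", "notes": "", "amount": ""},
]

# Hypothetical column choices for this sketch.
DROP_COLUMNS = {"notes"}        # columns removed entirely
ANONYMIZE_COLUMNS = {"email"}   # columns replaced by a hash
REQUIRED_COLUMNS = {"amount"}   # rows with an empty value here are discarded


def pseudonymize(value: str) -> str:
    """Replace a value with its SHA-256 hex digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def transform(rows):
    """Drop columns, clean values, and hash sensitive columns."""
    result = []
    for row in rows:
        # Column dropping and cleaning: keep wanted columns, trim whitespace.
        kept = {k: v.strip() for k, v in row.items() if k not in DROP_COLUMNS}
        # Cleaning: skip rows that are missing a required value.
        if any(not kept.get(col) for col in REQUIRED_COLUMNS):
            continue
        # Anonymization (pseudonymization): hash the normalized value.
        for col in ANONYMIZE_COLUMNS:
            kept[col] = pseudonymize(kept[col].lower())
        result.append(kept)
    return result


transformed = transform(RAW_ROWS)
```

In this sketch, the second record is discarded because its `amount` value is empty, and the first record keeps its `customer_id` and `amount` values while its `email` value is replaced by a hash. In an actual pipeline, the same steps would typically run in a managed transformation service rather than over an in-memory list.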

This section provides an overview of the stages in the data lifecycle process, as shown in the following diagram.

Data lifecycle overview diagram

These stages include the following:

  • Data collection

  • Data preparation and cleaning

  • Data quality checks

  • Data visualization and analysis

  • Monitoring and debugging

  • Infrastructure as code (IaC) deployment

  • Automation and access control