To build a data pipeline, you must first ingest data into AWS from an external or internal data source, such as a file server, a database, a storage bucket, or an API. The ingested data might then be transformed, for example through anonymization, column dropping, or data cleaning, or it might be passed along unchanged.
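As a rough illustration of the optional transformation step, the following Python sketch applies the three example transformations to a batch of ingested records. The function name `transform`, the column names (`id`, `email`, `internal_notes`), and the choice of an unsalted SHA-256 hash are assumptions made for this example only. Hashing is strictly a form of pseudonymization, so a real pipeline would likely need a stronger anonymization method that fits its compliance requirements.

```python
import hashlib


def transform(records, drop_columns=("internal_notes",), pii_columns=("email",)):
    """Clean, trim, and pseudonymize a list of dict records (hypothetical schema)."""
    transformed = []
    for record in records:
        # Data cleaning: strip surrounding whitespace from string values.
        row = {
            key: value.strip() if isinstance(value, str) else value
            for key, value in record.items()
        }
        # Data cleaning: skip rows that have no usable identifier.
        if not row.get("id"):
            continue
        # Column dropping: remove fields that downstream stages do not need.
        for column in drop_columns:
            row.pop(column, None)
        # Anonymization (simplified): replace sensitive values with a SHA-256 digest.
        for column in pii_columns:
            if row.get(column):
                normalized = row[column].lower().encode("utf-8")
                row[column] = hashlib.sha256(normalized).hexdigest()
        transformed.append(row)
    return transformed


# Example input: two ingested rows, one of which is missing its identifier.
sample_records = [
    {"id": "1", "email": " A@Example.com ", "internal_notes": "call back"},
    {"id": "", "email": "b@example.com", "internal_notes": ""},
]
result = transform(sample_records)
```

In this sketch the second row is discarded during cleaning, and the first row keeps its `id`, loses `internal_notes`, and has its `email` value replaced by a hash.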
This section provides an overview of the stages in the data lifecycle process, as shown in the following diagram.

These stages include the following:
Data collection
Data preparation and cleaning
Data quality checks
Data visualization and analysis
Monitoring and debugging
Infrastructure as code (IaC) deployment
Automation and access control