Recommended AWS architecture for new product demand forecasting
As you scale your AI/ML pipeline to multiple products and regions, we recommend that you follow machine learning operations (MLOps) best practices for reproducibility, reliability, and scalability. For more information, see Implement MLOps in the Amazon SageMaker AI documentation. The following diagram shows an example AWS architecture for implementing an ML model that forecasts demand for new product introductions.

The example AWS architecture consists of three layers: Data engineering, DevOps, and Data science.
The Data engineering layer focuses on ingesting data from corporate data sources by using AWS Glue and then storing the data cost-effectively in Amazon Simple Storage Service (Amazon S3). AWS Glue is a fully managed, serverless extract, transform, and load (ETL) service that helps you categorize, clean, transform, and reliably transfer data between different data stores. Amazon S3 is an object storage service that offers scalability, data availability, security, and performance. The Data engineering layer also shows offline batch inference deployment by using batch transform in Amazon SageMaker AI. A batch transform job retrieves the input data from Amazon S3 and sends it in one or more HTTP requests through Amazon API Gateway to the inference pipeline model. Amazon API Gateway is a fully managed service that helps you create, publish, maintain, monitor, and secure APIs at any scale. Finally, the Data engineering layer shows the use of Amazon CloudWatch, a service that gives you visibility into system-wide performance and helps you set alarms, automatically react to changes, and gain a unified view of operational health. CloudWatch stores the log files in an Amazon S3 bucket that you specify.
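For illustration, the following boto3 sketch starts a batch transform job that reads CSV input from Amazon S3 and writes predictions back to S3. The job name, model name, bucket paths, and instance type are placeholder assumptions, not values from this architecture.

```python
import boto3

sagemaker_client = boto3.client("sagemaker")

# Start an offline batch inference job. All names and S3 paths below are
# placeholder assumptions; substitute your own model and bucket.
sagemaker_client.create_transform_job(
    TransformJobName="new-product-demand-batch-001",   # assumed job name
    ModelName="new-product-demand-model",              # assumed model name
    TransformInput={
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://example-bucket/forecast/input/",  # assumed input prefix
            }
        },
        "ContentType": "text/csv",
        "SplitType": "Line",  # split the input so each HTTP request carries one or more lines
    },
    TransformOutput={
        "S3OutputPath": "s3://example-bucket/forecast/output/",  # assumed output prefix
        "AssembleWith": "Line",
    },
    TransformResources={
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
    },
)
```

When the job finishes, SageMaker AI assembles the per-record predictions into output files under the specified S3 output path.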
The DevOps layer uses API Gateway, CloudWatch, and Amazon SageMaker AI Model Monitor for real-time inference deployment. Model Monitor helps you set up an automated alerting system for deviations in model quality, such as data drift and anomalies. Amazon CloudWatch Logs collects log files from Model Monitor and notifies you when the quality of your model reaches the thresholds that you set. The DevOps layer also shows the use of AWS CodePipeline for automating code delivery pipelines.
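As a minimal sketch of how such monitoring might be configured with the SageMaker Python SDK, the following code baselines the training data and schedules hourly data-quality checks against a live endpoint. The role ARN, endpoint name, and S3 paths are assumptions.

```python
from sagemaker.model_monitor import CronExpressionGenerator, DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

# Assumed execution role; replace with a role that can access your buckets.
monitor = DefaultModelMonitor(
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

# Compute baseline statistics and constraints from the training data (assumed path).
monitor.suggest_baseline(
    baseline_dataset="s3://example-bucket/train/train.csv",
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri="s3://example-bucket/monitor/baseline/",
)

# Schedule hourly data-quality checks against the live endpoint (assumed name).
monitor.create_monitoring_schedule(
    monitor_schedule_name="demand-forecast-data-quality",
    endpoint_input="demand-forecast-endpoint",
    output_s3_uri="s3://example-bucket/monitor/reports/",
    statistics=monitor.baseline_statistics(),
    constraints=monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
)
```

Model Monitor writes violation reports to the output path and emits metrics to CloudWatch, where you can set alarms on the thresholds that matter to you.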
The Data science layer shows the use of Amazon SageMaker AI Pipelines and Amazon SageMaker AI Feature Store to manage the machine learning lifecycle. SageMaker AI Pipelines is a purpose-built workflow orchestration service that helps you automate all ML phases, from data preprocessing to model monitoring. With an intuitive UI and Python SDK, you can manage repeatable end-to-end ML pipelines at scale. The native integration with multiple AWS services helps you customize the ML lifecycle based on your MLOps requirements. Feature Store is a fully managed, purpose-built repository to store, share, and manage features for ML models. Features are inputs to ML models, and they are used during training and inference.
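To make the Feature Store idea concrete, here is a minimal sketch that registers a feature group and ingests a few engineered product features by using the SageMaker Python SDK. The feature names, S3 location, and role ARN are assumptions chosen for illustration.

```python
import time

import pandas as pd
import sagemaker
from sagemaker.feature_store.feature_group import FeatureGroup

session = sagemaker.Session()

# Assumed engineered features: one row per product, plus the required event time.
df = pd.DataFrame(
    {
        "product_id": pd.Series(["P100", "P200"], dtype="string"),
        "category_avg_demand": [120.5, 87.0],
        "launch_price": [19.99, 34.99],
        "event_time": [1700000000.0, 1700000000.0],  # Unix timestamp (Fractional)
    }
)

feature_group = FeatureGroup(name="new-product-features", sagemaker_session=session)
feature_group.load_feature_definitions(data_frame=df)  # infer feature types from dtypes

feature_group.create(
    s3_uri="s3://example-bucket/feature-store/",  # assumed offline store location
    record_identifier_name="product_id",
    event_time_feature_name="event_time",
    role_arn="arn:aws:iam::111122223333:role/SageMakerExecutionRole",  # assumed role
    enable_online_store=True,  # serve the same features at low latency for inference
)

# Creation is asynchronous; wait until the feature group is ready before ingesting.
while feature_group.describe()["FeatureGroupStatus"] == "Creating":
    time.sleep(5)

feature_group.ingest(data_frame=df, max_workers=1, wait=True)
```

During training, you can query the offline store; during inference, the online store returns the same features at low latency, which helps keep training and serving consistent.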