Well-Architected machine learning design principles - Machine Learning Lens


Well-Architected ML design principles are a set of considerations used as the basis for a well-architected ML workload.

Following the Well-Architected Framework guidelines, use these general design principles to facilitate good design in the cloud for ML workloads:

  • Assign ownership - Apply the right skills and the right number of resources along with accountability and empowerment to increase productivity.

  • Provide protection - Apply security controls to systems and services hosting model data, algorithms, computation, and endpoints. This ensures secure and uninterrupted operations.

  • Enable resiliency - Ensure fault tolerance and the recoverability of ML models through version control, traceability, and explainability.

  • Enable reusability - Use independent modular components that can be shared and reused. This helps enable reliability, improve productivity, and optimize cost.

  • Enable reproducibility - Use version control across components, such as infrastructure, data, models, and code. Track changes back to a point-in-time release. This approach enables model governance and audit standards.

  • Optimize resources - Perform trade-off analysis across available resources and configurations to achieve an optimal outcome.

  • Reduce cost - Identify opportunities to reduce cost through automation or optimization by analyzing processes, resources, and operations.

  • Enable automation - Use technologies such as pipelining, scripting, continuous integration (CI), continuous delivery (CD), and continuous training (CT) to increase agility, improve performance, sustain resiliency, and reduce cost.

  • Enable continuous improvement - Evolve and improve the workload through continuous monitoring, analysis, and learning.

  • Minimize environmental impact - Establish sustainability goals and understand the impact of ML models. Use managed services, adopt efficient hardware and software, and maximize their utilization.
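
As one illustration of the reproducibility principle above, the following is a minimal sketch of a release manifest that fingerprints each versioned component (data, code, infrastructure, model) so that a release can be traced back to a point in time. It is not part of the Machine Learning Lens or of any AWS API: `fingerprint`, `ReleaseManifest`, and `build_manifest` are hypothetical names, and the byte strings stand in for real artifacts.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest that identifies this exact content."""
    return hashlib.sha256(content).hexdigest()


@dataclass
class ReleaseManifest:
    """Hypothetical point-in-time record of the components behind a release."""
    release: str
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    components: dict = field(default_factory=dict)

    def add(self, name: str, content: bytes) -> None:
        # Store only the digest, not the artifact itself.
        self.components[name] = fingerprint(content)

    def changed_since(self, other: "ReleaseManifest") -> list:
        """Names of components that were added, removed, or modified."""
        names = set(self.components) | set(other.components)
        return sorted(
            n for n in names
            if self.components.get(n) != other.components.get(n)
        )

    def to_json(self) -> str:
        # Serialized form that could be stored alongside the release for audits.
        return json.dumps(
            {
                "release": self.release,
                "created_at": self.created_at,
                "components": self.components,
            },
            indent=2,
            sort_keys=True,
        )


def build_manifest(release: str, parts: dict) -> ReleaseManifest:
    """Build a manifest from a mapping of component name to raw bytes."""
    manifest = ReleaseManifest(release)
    for name, content in parts.items():
        manifest.add(name, content)
    return manifest


if __name__ == "__main__":
    # Placeholder contents; real inputs would be file or artifact bytes.
    first = build_manifest(
        "release-1",
        {"data": b"rows-v1", "code": b"train-v1", "infrastructure": b"instances=2"},
    )
    second = build_manifest(
        "release-2",
        {"data": b"rows-v2", "code": b"train-v1", "infrastructure": b"instances=2"},
    )
    # Lists the components that differ between the two releases.
    print(second.changed_since(first))
```

Comparing two manifests shows which components changed between releases, which is the kind of traceability that model governance and audit reviews typically rely on. Storing digests rather than the artifacts keeps the manifest small, and the artifacts themselves would still live in a version control or artifact store.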

© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.