Well-Architected machine learning design principles
Well-Architected ML design principles are a set of considerations used as the basis for a well-architected ML workload.
Following the Well-Architected Framework guidelines, use these general design principles to facilitate good design in the cloud for ML workloads:
-
Assign ownership- Apply the right skills and the right number of resources along with accountability and empowerment to increase productivity.
-
Provide protection - Apply security controls to systems and services hosting model data, algorithms, computation, and endpoints. This ensures secure and uninterrupted operations.
-
Enable resiliency - Ensure fault tolerance and the recoverability of ML models through version control, traceability, and explainability.
-
Enable reusability - Use independent modular components that can be shared and reused. This helps enable reliability, improve productivity, and optimize cost.
-
Enable reproducibility - Use version control across components, such as infrastructure, data, models, and code. Track changes back to a point-in-time release. This approach enables model governance and audit standards.
-
Optimize resources - Perform trade-off analysis across available resources and configurations to achieve optimal outcome.
-
Reduce cost - Identify the potentials for reducing cost through automation or optimization, analyzing processes, resources, and operations.
-
Enable automation - Use technologies, such as pipelining, scripting, and continuous integration (CI), continuous delivery (CD), and continuous training (CT), to increase agility, improve performance, sustain resiliency, and reduce cost.
-
Enable continuous improvement - Evolve and improve the workload through continuous monitoring, analysis, and learning.
-
Minimize environmental impact - Establish sustainability goals and understand the impact of ML models. Use managed services and adopt efficient hardware and software and maximize their utilization.