MLREL-13: Ensure a recoverable endpoint with a managed version control strategy - Machine Learning Lens

Ensure that the endpoint hosting model predictions, and every component used to create that endpoint, is fully recoverable. These components include model artifacts, container images, and endpoint configurations. Ensure that all required components are version-controlled and traceable in a lineage tracking system.

Implementation plan

  • Implement MLOps best practices with Amazon SageMaker AI Pipelines and Projects - Amazon SageMaker AI Pipelines is a service for building machine learning pipelines. It automates developing, training, and deploying models in a versioned, predictable manner. Amazon SageMaker AI Projects enables teams of data scientists and developers to collaborate on machine learning business problems. A SageMaker AI project is an AWS Service Catalog provisioned product that you can use to create an end-to-end ML solution. The entities in a SageMaker AI project include pipeline executions, registered models, endpoints, datasets, and code repositories.

  • Use infrastructure as code (IaC) tools - Use AWS CloudFormation to define and build your infrastructure, including your model endpoints. Store your AWS CloudFormation templates in Git repositories so that your infrastructure code is version-controlled and an endpoint can be rebuilt from a known revision.

  • Use Amazon Elastic Container Registry (Amazon ECR) - Store your container images in Amazon ECR, a managed registry for container images. Amazon ECR identifies each pushed image by a content digest (a SHA-256 hash), so you can reference an exact image version and roll back to a previous one.
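
As a rough illustration of how the IaC and Amazon ECR points above can combine, the following Python sketch builds an AWS CloudFormation template for a model, endpoint configuration, and endpoint, and refuses any container image that is not pinned by digest. It is a minimal sketch under stated assumptions, not a reference implementation: the function name `build_endpoint_template`, the `DIGEST_PINNED` pattern, the variant name, and the default instance type are all hypothetical choices made for this example, and the pattern only covers standard-partition ECR URIs.

```python
import json
import re

# Hypothetical check: an ECR image URI pinned by digest, for example
# 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-repo@sha256:<64 hex chars>.
# Simplification: only the standard "amazonaws.com" domain is accepted.
DIGEST_PINNED = re.compile(
    r"^\d{12}\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com/"
    r"[a-z0-9._/-]+@sha256:[0-9a-f]{64}$"
)


def build_endpoint_template(image_uri, model_data_url, role_arn,
                            instance_type="ml.m5.large"):
    """Return a CloudFormation template (as a dict) for a SageMaker AI endpoint."""
    if not DIGEST_PINNED.match(image_uri):
        # A mutable tag such as ":latest" can point at a different image later,
        # so an endpoint rebuilt from the template might not match the original.
        raise ValueError("image_uri must be pinned by digest (repo@sha256:...)")
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "Model": {
                "Type": "AWS::SageMaker::Model",
                "Properties": {
                    "ExecutionRoleArn": role_arn,
                    "PrimaryContainer": {
                        "Image": image_uri,              # exact image version
                        "ModelDataUrl": model_data_url,  # exact model artifact
                    },
                },
            },
            "EndpointConfig": {
                "Type": "AWS::SageMaker::EndpointConfig",
                "Properties": {
                    "ProductionVariants": [{
                        "VariantName": "AllTraffic",
                        "ModelName": {"Fn::GetAtt": ["Model", "ModelName"]},
                        "InitialInstanceCount": 1,
                        "InitialVariantWeight": 1.0,
                        "InstanceType": instance_type,
                    }],
                },
            },
            "Endpoint": {
                "Type": "AWS::SageMaker::Endpoint",
                "Properties": {
                    "EndpointConfigName": {
                        "Fn::GetAtt": ["EndpointConfig", "EndpointConfigName"]
                    },
                },
            },
        },
    }
```

Serializing the result with `json.dumps(template, indent=2, sort_keys=True)` and committing it to a Git repository gives a reviewable record of which image digest and model artifact each endpoint revision used. Pointing `model_data_url` at a versioned, immutable Amazon S3 location is assumed here so that the artifact reference stays as stable as the image digest.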
