Sustainability pillar - Best practices - Machine Learning Lens

The sustainability pillar focuses on the environmental impact of a workload, especially its energy consumption and efficiency, since these are important levers architects can use to drive direct reductions in resource usage.

Related best practices

  • Allow automatic scaling of the model endpoint (MLREL-11) - Configure automatic scaling for Amazon SageMaker endpoints, or use SageMaker Serverless Inference, and make efficient use of GPUs with Amazon Elastic Inference. Elastic Inference lets you attach just the right amount of GPU-powered inference acceleration to any Amazon EC2 instance, SageMaker instance type, or Amazon ECS task. While training jobs process hundreds of data samples in parallel, inference jobs usually process a single input in real time and therefore consume only a small amount of GPU compute. By using GPU resources more efficiently, Elastic Inference reduces both the cost and the environmental impact of your inference workloads.

  • Evaluate machine learning deployment option (cloud versus edge) (MLPER-11) - When working on IoT use cases, evaluate whether running ML inference at the edge can reduce the environmental impact of your workload. Consider factors such as the compute capacity of your devices, their energy consumption, and the emissions associated with transferring data to the cloud. When deploying ML models to edge devices, consider Amazon SageMaker Edge Manager, which integrates with SageMaker Neo and AWS IoT Greengrass.

  • Select optimal computing instance size (MLCOST-09) - Amazon SageMaker Inference Recommender automates load testing and model tuning across SageMaker ML instances. It helps you select the best instance type and configuration (such as instance count, container parameters, and model optimizations) to maximize the efficiency of the resources provisioned for inference.
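The endpoint auto scaling described in MLREL-11 can be sketched with the Application Auto Scaling API. The endpoint name, capacity limits, and target value below are illustrative assumptions, not prescribed values; a target-tracking policy on invocations per instance keeps idle capacity (and wasted energy) low.

```python
# Sketch: target-tracking auto scaling for a SageMaker endpoint variant.
# The endpoint name and capacity numbers are hypothetical placeholders.

ENDPOINT_NAME = "my-endpoint"  # placeholder for your endpoint
RESOURCE_ID = f"endpoint/{ENDPOINT_NAME}/variant/AllTraffic"
SCALABLE_DIMENSION = "sagemaker:variant:DesiredInstanceCount"

# Target ~70 invocations per instance per minute before scaling out.
scaling_policy_config = {
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {
        "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
    },
    "ScaleInCooldown": 300,  # seconds to wait before removing capacity
    "ScaleOutCooldown": 60,  # seconds to wait before adding capacity
}


def configure_autoscaling() -> None:
    """Register the endpoint variant as a scalable target and attach the policy."""
    import boto3  # imported here so the sketch itself needs no AWS session

    client = boto3.client("application-autoscaling")
    client.register_scalable_target(
        ServiceNamespace="sagemaker",
        ResourceId=RESOURCE_ID,
        ScalableDimension=SCALABLE_DIMENSION,
        MinCapacity=1,   # assumed floor; Serverless Inference can scale to zero instead
        MaxCapacity=4,   # assumed ceiling for this example
    )
    client.put_scaling_policy(
        PolicyName="InvocationsPerInstanceTargetTracking",
        ServiceNamespace="sagemaker",
        ResourceId=RESOURCE_ID,
        ScalableDimension=SCALABLE_DIMENSION,
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration=scaling_policy_config,
    )
```

Calling `configure_autoscaling()` with valid AWS credentials would apply the policy; the target value and cooldowns should be tuned to your traffic pattern.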
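The cloud-versus-edge evaluation in MLPER-11 can be framed as a simple per-inference energy comparison. Every coefficient below is a hypothetical placeholder used only to show the shape of the calculation; substitute measured figures for your own devices, network, and cloud instances.

```python
# Back-of-the-envelope per-inference energy: cloud (transfer + shared
# instance) versus edge (local compute). All numbers are assumptions.

NETWORK_WH_PER_GB = 60.0   # assumed energy cost of moving data to the cloud
CLOUD_INFERENCE_WH = 0.02  # assumed energy per inference on a shared cloud instance
EDGE_INFERENCE_WH = 0.15   # assumed energy per inference on the device


def cloud_energy_wh(payload_mb: float) -> float:
    """Energy to ship one input payload to the cloud and run inference there."""
    return (payload_mb / 1024.0) * NETWORK_WH_PER_GB + CLOUD_INFERENCE_WH


def edge_energy_wh() -> float:
    """Energy to run the same inference locally on the edge device."""
    return EDGE_INFERENCE_WH


# Under these assumptions, a 5 MB payload favors edge inference:
payload_mb = 5.0
print(f"cloud: {cloud_energy_wh(payload_mb):.3f} Wh, edge: {edge_energy_wh():.3f} Wh")
```

The break-even point shifts with payload size: small payloads favor the cloud's more efficient shared hardware, while large payloads make data transfer the dominant cost.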
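A minimal sketch of starting an Inference Recommender job (MLCOST-09) with boto3 follows; the job name, role ARN, and model package ARN are placeholders you would replace with your own resources.

```python
# Sketch: launch a Default Inference Recommender job for a registered
# model package. All identifiers below are hypothetical placeholders.

JOB_NAME = "my-recommendation-job"  # placeholder
ROLE_ARN = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"  # placeholder
MODEL_PACKAGE_ARN = (
    "arn:aws:sagemaker:us-east-1:111122223333:model-package/my-model/1"
)  # placeholder

job_request = {
    "JobName": JOB_NAME,
    # "Default" returns quick instance recommendations;
    # "Advanced" runs custom load tests against your traffic profile.
    "JobType": "Default",
    "RoleArn": ROLE_ARN,
    "InputConfig": {"ModelPackageVersionArn": MODEL_PACKAGE_ARN},
}


def start_recommendation_job() -> None:
    """Submit the recommendation job to SageMaker."""
    import boto3  # imported here so the sketch itself needs no AWS session

    sagemaker = boto3.client("sagemaker")
    sagemaker.create_inference_recommendations_job(**job_request)
```

Once the job completes, its recommendations report candidate instance types with their cost and latency trade-offs, letting you right-size the fleet instead of overprovisioning.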