MLSUS-08: Select energy-efficient algorithms
To minimize resource usage, replace algorithms with more efficient versions that produce the same result.
Implementation plan
- Begin with a simple algorithm to establish a baseline - Then test algorithms of increasing complexity to observe whether performance improves. If so, compare the performance gain against the additional resources required.
- Try to find simplified versions of algorithms - This approach helps you use fewer resources to achieve a similar outcome. For example, DistilBERT, a distilled version of BERT, has 40% fewer parameters, runs 60% faster, and preserves 97% of BERT's performance.
- Compress model size without significant loss of accuracy - Use pruning to remove weights that don't contribute much to the model. Use quantization to represent numbers with low-bit integers without incurring significant loss in accuracy. These techniques speed up inference and save energy with limited impact on accuracy.
- Employ Amazon SageMaker Neo - Optimize ML models for inference on SageMaker in the cloud and on supported devices at the edge.
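The baseline-and-compare step above can be sketched as a small benchmark harness. The timing helper and the two candidate functions below are hypothetical stand-ins for your own algorithms; the only requirement is that both produce the same result.

```python
import time

def benchmark(fn, data, repeats=5):
    """Return the best wall-clock time (seconds) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(data)
        best = min(best, time.perf_counter() - start)
    return best

# Hypothetical candidates that produce the same result:
# a simple O(n^2) baseline vs. a more optimized version.
def baseline_sort(values):
    out = list(values)
    for i in range(len(out)):  # selection sort as the simple baseline
        j = min(range(i, len(out)), key=out.__getitem__)
        out[i], out[j] = out[j], out[i]
    return out

def candidate_sort(values):
    return sorted(values)  # the more efficient candidate

data = list(range(800, 0, -1))
assert baseline_sort(data) == candidate_sort(data)  # same result
t_base = benchmark(baseline_sort, data)
t_cand = benchmark(candidate_sort, data)
print(f"baseline: {t_base:.4f}s, candidate: {t_cand:.4f}s")
```

Only adopt the more complex algorithm if its measured gain outweighs the extra resources (compute, memory, development effort) it demands.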
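Magnitude pruning, mentioned above, can be illustrated in a few lines of NumPy. This is a framework-agnostic sketch of the idea, not the SageMaker Debugger-based workflow described in the linked blog post.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of weights."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)  # stand-in weight matrix
pruned = magnitude_prune(w, sparsity=0.5)
print("sparsity:", float(np.mean(pruned == 0.0)))
```

Sparse weights compress well and, with sparsity-aware runtimes, reduce inference compute; in practice you would fine-tune the model after pruning to recover accuracy.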
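Similarly, the low-bit quantization idea can be sketched as simple affine int8 quantization; real toolchains (SageMaker Neo or framework quantizers) handle this for you, but the sketch shows why accuracy loss stays small: the round-trip error is bounded by the quantization step size.

```python
import numpy as np

def quantize_int8(x):
    """Affine-quantize a float array to int8 with one scale and zero point."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0 or 1.0          # guard against a constant array
    zero_point = np.round(-lo / scale) - 128  # map lo -> -128, hi -> 127
    q = np.clip(np.round(x / scale + zero_point), -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(1)
w = rng.normal(size=1000).astype(np.float32)  # stand-in weights
q, scale, zp = quantize_int8(w)
w_hat = dequantize(q, scale, zp)
max_err = float(np.max(np.abs(w - w_hat)))
print("max abs error:", max_err)  # bounded by roughly one quantization step
```

Storing int8 instead of float32 cuts model size about 4x, and integer arithmetic is cheaper on most hardware, which is where the inference speed and energy savings come from.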
Documents
Blogs
- Optimize AI/ML workloads for sustainability: Part 2, model development
- Pruning machine learning models with Amazon SageMaker Debugger and Amazon SageMaker Experiments
- Reduce ML inference costs on Amazon SageMaker with hardware and software acceleration
- Unlock near 3x performance gains with XGBoost and Amazon SageMaker Neo
Metrics
- Track the metrics related to the resources provisioned for your training and inference jobs (InstanceCount, InstanceType, and VolumeSizeInGB) and the efficient use of these resources (CPUUtilization, GPUUtilization, GPUMemoryUtilization, MemoryUtilization, and DiskUtilization) in the SageMaker Console, the CloudWatch Console, or your SageMaker Debugger Profiling Report.
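As one way to pull these numbers programmatically, the dictionary below shows the shape of a CloudWatch `GetMetricStatistics` request for a training job's GPUUtilization. The namespace, dimension, and job name are assumptions to verify against your own account; SageMaker training-job instance metrics are typically published under the `/aws/sagemaker/TrainingJobs` namespace with a `Host` dimension.

```python
from datetime import datetime, timedelta, timezone

# Shape of a CloudWatch GetMetricStatistics request for training-job GPU use.
# Namespace, dimension, and job name are assumptions -- verify in your account.
end = datetime.now(timezone.utc)
request = {
    "Namespace": "/aws/sagemaker/TrainingJobs",  # assumed namespace
    "MetricName": "GPUUtilization",
    "Dimensions": [
        {"Name": "Host", "Value": "my-training-job/algo-1"},  # hypothetical job
    ],
    "StartTime": end - timedelta(hours=1),
    "EndTime": end,
    "Period": 60,  # one-minute granularity
    "Statistics": ["Average", "Maximum"],
}

# With boto3 you would pass this request to the CloudWatch client:
# import boto3
# cloudwatch = boto3.client("cloudwatch")
# response = cloudwatch.get_metric_statistics(**request)
print(sorted(request))
```

Consistently low utilization values suggest the job is over-provisioned and a smaller instance type would cut both cost and energy use.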