SUS02-BP01 Scale infrastructure with user load - AWS Well-Architected Framework (2022-03-31)

SUS02-BP01 Scale infrastructure with user load

Identify periods of low or no utilization and scale down resources to eliminate excess capacity and improve efficiency.

Common anti-patterns:

  • You do not scale your infrastructure with user load.

  • You always scale your infrastructure manually.

  • You leave increased capacity after a scaling event instead of scaling back down.

Benefits of establishing this best practice: Configuring and testing workload elasticity helps reduce your workload's environmental impact, save money, and maintain performance benchmarks. Take advantage of elasticity in the cloud to automatically scale capacity during and after user load spikes so that you use only the resources required to meet your customers' needs.

Level of risk exposed if this best practice is not established: Medium

Implementation guidance

  • Elasticity matches the supply of resources to the demand for those resources. Instances, containers, and functions provide mechanisms for elasticity, either in combination with automatic scaling or as a feature of the service. Use elasticity in your architecture so that your workload can scale down quickly and easily during periods of low user load:

    Auto scaling mechanism: Where to use

    Amazon EC2 Auto Scaling: Use to verify that you have the correct number of Amazon EC2 instances available to handle the user load for your application.

    Application Auto Scaling: Use to automatically scale resources for individual AWS services beyond Amazon EC2, such as AWS Lambda functions or Amazon Elastic Container Service (Amazon ECS) services.

    Kubernetes Cluster Autoscaler: Use to automatically scale Kubernetes clusters on AWS.

  • Verify that the metrics for scaling up or down are validated against the type of workload being deployed. If you are deploying a video transcoding application, 100% CPU utilization is expected and should not be your primary metric. You can use a customized metric (such as memory utilization) for your scaling policy if required. To choose the right metrics, consider the following guidance for Amazon EC2:

    • The metric should be a valid utilization metric and describe how busy an instance is.

    • The metric value must increase or decrease in inverse proportion to the number of instances in the Auto Scaling group.

  • Use dynamic scaling instead of manual scaling for your Auto Scaling group. We also recommend that you use target tracking scaling policies in your dynamic scaling.

  • Verify that workload deployments can handle both scale-up and scale-down events. Create test scenarios for scale-down events to confirm that the workload behaves as expected. You can use the activity history to verify scaling activities for an Auto Scaling group.

  • Evaluate your workload for predictable patterns and proactively scale as you anticipate predicted and planned changes in demand. Use predictive scaling with Amazon EC2 Auto Scaling to eliminate the need to overprovision capacity.
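As a concrete illustration of the target tracking recommendation above, the following is a minimal sketch of the request parameters for a CPU-based target tracking policy. The group name and target value are hypothetical; in practice you would pass this dict to the boto3 Auto Scaling client's put_scaling_policy call.

```python
# Sketch: request parameters for a target tracking scaling policy on an
# EC2 Auto Scaling group. Group name and target value are hypothetical.

def target_tracking_policy(group_name: str, target_cpu: float) -> dict:
    """Build put_scaling_policy parameters for a CPU target tracking policy."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-cpu-target-tracking",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            # ASGAverageCPUUtilization is a predefined metric. For workloads
            # where CPU is a poor signal (for example, video transcoding),
            # substitute a CustomizedMetricSpecification such as memory
            # utilization instead.
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": target_cpu,
        },
    }

params = target_tracking_policy("web-tier-asg", 50.0)
print(params["PolicyType"])  # TargetTrackingScaling
```

With a target tracking policy, Amazon EC2 Auto Scaling adds or removes capacity as needed to keep the chosen metric at the target value, which covers both scale-out and scale-in without manual thresholds.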
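When testing scale-down scenarios, one way to check results is to filter the activity history for successful scale-in events. The helper below is a hypothetical sketch; the sample records mimic the shape of entries returned by the boto3 Auto Scaling client's describe_scaling_activities call.

```python
# Sketch: filtering Auto Scaling activity history for successful scale-in
# (terminate) events. The sample records are hypothetical stand-ins for the
# response of describe_scaling_activities.

def successful_scale_ins(activities: list[dict]) -> list[str]:
    """Return descriptions of completed scale-in activities."""
    return [
        a["Description"]
        for a in activities
        if a.get("StatusCode") == "Successful"
        and a.get("Description", "").startswith("Terminating")
    ]

sample = [
    {"Description": "Launching a new EC2 instance: i-0abc", "StatusCode": "Successful"},
    {"Description": "Terminating EC2 instance: i-0def", "StatusCode": "Successful"},
    {"Description": "Terminating EC2 instance: i-0123", "StatusCode": "Failed"},
]

print(successful_scale_ins(sample))  # ['Terminating EC2 instance: i-0def']
```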
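For workloads with predictable patterns, a predictive scaling policy can be sketched in the same style. The group name and target value are hypothetical; starting in forecast-only mode lets you evaluate the forecasts before allowing them to drive scaling.

```python
# Sketch: request parameters for a predictive scaling policy, started in
# forecast-only mode. Group name is hypothetical; pass the dict to the
# boto3 Auto Scaling client's put_scaling_policy call.

def predictive_scaling_policy(group_name: str) -> dict:
    """Build put_scaling_policy parameters for a predictive scaling policy."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-predictive",
        "PolicyType": "PredictiveScaling",
        "PredictiveScalingConfiguration": {
            "MetricSpecifications": [
                {
                    "TargetValue": 50.0,
                    "PredefinedMetricPairSpecification": {
                        "PredefinedMetricType": "ASGCPUUtilization"
                    },
                }
            ],
            # ForecastOnly generates forecasts without acting on them;
            # switch to ForecastAndScale once the forecasts look accurate.
            "Mode": "ForecastOnly",
        },
    }

params = predictive_scaling_policy("web-tier-asg")
```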

