REL01-BP01 Aware of service quotas and constraints - AWS Well-Architected Framework (2023-04-10)

This best practice was updated with new guidance on July 13th, 2023.

Be aware of your default quotas and manage your quota increase requests for your workload architecture. Know which cloud resource constraints, such as disk or network, are potentially impactful.

Desired outcome: Customers can prevent service degradation or disruption in their AWS accounts by implementing proper guidelines for monitoring key metrics, conducting infrastructure reviews, and automating remediation steps, so that service quotas and constraints are not reached.

Common anti-patterns:

  • Deploying a workload without understanding the hard or soft quotas and their limits for the services used.

  • Deploying a replacement workload without analyzing and reconfiguring the necessary quotas or contacting Support in advance.

  • Assuming that cloud services have no limits and that the services can be used without regard to rates, limits, counts, or quantities.

  • Assuming that quotas will automatically be increased.

  • Not knowing the process and timeline of quota requests.

  • Assuming that the default quota for each cloud service is identical across Regions.

  • Assuming that service constraints can be breached and that the systems will automatically scale or raise the limit beyond the resource’s constraints.

  • Not testing the application at peak traffic in order to stress the utilization of its resources.

  • Provisioning the resource without analysis of the required resource size.

  • Overprovisioning capacity by choosing resource types that go well beyond actual need or expected peaks.

  • Not assessing capacity requirements for new levels of traffic in advance of a new customer event or the deployment of a new technology.

Benefits of establishing this best practice: Monitoring and automated management of service quotas and resource constraints can proactively reduce failures. Changes in traffic patterns for a customer’s service can cause a disruption or degradation if best practices are not followed. By monitoring and managing these values across all regions and all accounts, applications can have improved resiliency under adverse or unplanned events.

Level of risk exposed if this best practice is not established: High

Implementation guidance

Service Quotas is an AWS service that helps you manage your quotas for over 250 AWS services from one location. Along with looking up the quota values, you can also request and track quota increases from the Service Quotas console or using the AWS SDK. AWS Trusted Advisor offers a service quotas check that displays your usage and quotas for some aspects of some services. The default service quotas per service are also in the AWS documentation per respective service (for example, see Amazon VPC Quotas).

Some service limits, like rate limits on throttled APIs, are set within Amazon API Gateway itself by configuring a usage plan. Some limits that are set as configuration on their respective services include Provisioned IOPS, Amazon RDS storage allocated, and Amazon EBS volume allocations. Amazon Elastic Compute Cloud has its own service limits dashboard that can help you manage your instance, Amazon Elastic Block Store, and Elastic IP address limits. If you have a use case where service quotas impact your application’s performance and they are not adjustable to your needs, then contact AWS Support to see if there are mitigations.

Service quotas can be Region specific or global in nature. An AWS service that reaches its quota will not behave as expected in normal usage, which may cause service disruption or degradation. For example, a service quota limits the number of DL Amazon EC2 instances that can be used in a Region, and that limit may be reached during a traffic scaling event using Auto Scaling groups (ASG).
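To make the scaling example concrete, the following minimal sketch checks whether a planned scale-out would still fit under a quota. The function name and all numbers are hypothetical; in practice, the quota value and current usage would come from Service Quotas and your monitoring data.

```python
def scale_out_headroom(quota: float, in_use: float, planned: float) -> float:
    """Return the quota left after a planned scale-out (negative means a shortfall)."""
    return quota - (in_use + planned)


# Hypothetical numbers: a Regional quota of 96 units with 64 in use,
# and an Auto Scaling group that may add up to 48 more during a peak.
remaining = scale_out_headroom(quota=96, in_use=64, planned=48)
if remaining < 0:
    print(f"Scale-out would exceed the quota by {-remaining:g}; request an increase in advance.")
else:
    print(f"Scale-out fits with {remaining:g} to spare.")
```

Running a check like this before a known traffic event leaves time to file a quota increase, rather than discovering the shortfall while scaling.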

Service quotas for each account should be assessed for usage on a regular basis to determine what the appropriate service limits might be for that account. These service quotas exist as operational guardrails, to prevent accidentally provisioning more resources than you need. They also serve to limit request rates on API operations to protect services from abuse.

Service constraints are different from service quotas. Service constraints represent a particular resource’s limits as defined by that resource type. These might be storage capacity (for example, gp2 has a size limit of 1 GB - 16 TB) or disk throughput (10,000 IOPS). It is essential to engineer for a resource type’s constraints and to continually assess usage that might reach them. If a constraint is reached unexpectedly, the account’s applications or services may be degraded or disrupted.

If there is a use case where service quotas impact an application’s performance and they cannot be adjusted to required needs, contact AWS Support to see if there are mitigations. For more detail on adjusting fixed quotas, see REL01-BP03 Accommodate fixed service quotas and constraints through architecture.

There are a number of AWS services and tools to help monitor and manage service quotas. These services and tools should be used to provide automated or manual checks of quota levels.

  • AWS Trusted Advisor offers a service quota check that displays your usage and quotas for some aspects of some services. It can aid in identifying services that are near quota.

  • The AWS Management Console provides ways to display service quota values, request new quotas, monitor the status of quota requests, and display quota history.

  • The AWS CLI and SDKs offer programmatic methods to automatically manage and monitor service quota levels and usage.
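As a minimal sketch of the programmatic route, assuming Python with the boto3 SDK, the helpers below list the adjustable quotas for one service. The helper names are hypothetical. `list_service_quotas` and the `Quotas`, `QuotaName`, and `Adjustable` response fields belong to the Service Quotas API; the `vpc` service code and the Region are example values.

```python
from typing import Any, Dict, Iterator, List


def iter_quotas(client: Any, service_code: str) -> Iterator[Dict[str, Any]]:
    """Yield every quota for one service, following pagination."""
    paginator = client.get_paginator("list_service_quotas")
    for page in paginator.paginate(ServiceCode=service_code):
        yield from page.get("Quotas", [])


def adjustable_quota_names(client: Any, service_code: str) -> List[str]:
    """Return the sorted names of quotas that can be raised through a request."""
    return sorted(
        quota["QuotaName"]
        for quota in iter_quotas(client, service_code)
        if quota.get("Adjustable")
    )


def print_vpc_quotas(region_name: str = "us-east-1") -> None:
    """Example entry point; needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # third-party; imported lazily so the helpers above work without it

    client = boto3.client("service-quotas", region_name=region_name)
    for name in adjustable_quota_names(client, "vpc"):
        print(name)
```

Passing the client in as a parameter keeps the helpers testable with a stub and lets the same code run against several Regions or accounts.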

Implementation steps

For Service Quotas:

  • Review AWS Service Quotas.

  • To be aware of your existing service quotas, determine the services (like IAM Access Analyzer) that are used. There are approximately 250 AWS services controlled by service quotas. Then, determine the specific service quota names that might be used within each account and Region. There are approximately 3,000 service quota names per Region.

  • Augment this quota analysis with AWS Config to find all AWS resources used in your AWS accounts.

  • Use AWS CloudFormation data to determine the AWS resources you use. Look at the resources that were created either in the AWS Management Console or with the list-stack-resources AWS CLI command. You can also see resources configured to be deployed in the template itself.

  • Determine all the services your workload requires by looking at the deployment code.

  • Determine the service quotas that apply. Use the programmatically accessible information from Trusted Advisor and Service Quotas.

  • Establish an automated monitoring method (see REL01-BP02 Manage service quotas across accounts and regions and REL01-BP04 Monitor and manage quotas) to alert and inform if service quotas are near or have reached their limit.

  • Establish an automated and programmatic method to check if a service quota has been changed in one region but not in other regions in the same account (see REL01-BP02 Manage service quotas across accounts and regions and REL01-BP04 Monitor and manage quotas).

  • Automate scanning application logs and metrics to determine if there are any quota or service constraint errors. If these errors are present, send alerts to the monitoring system.

  • Establish engineering procedures to calculate the required change in quota (see REL01-BP05 Automate quota management) once it has been identified that larger quotas are required for specific services.

  • Create a provisioning and approval workflow to request changes in service quota. This should include an exception workflow in case a request is denied or only partially approved.

  • Create an engineering method to review service quotas before provisioning and using new AWS services, and before rolling out to production or loaded environments (for example, a load testing account).
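Several of the steps above reduce to small checks over quota data. The following sketch, with hypothetical function names and input shapes, flags quotas that are near their limit, detects quota values that differ between Regions, and finds log lines that look like quota or throttling errors. The error-code pattern is illustrative rather than exhaustive, and the dictionaries are assumed to have been filled in advance from Service Quotas and your monitoring data.

```python
import re
from typing import Dict, Iterable, List, Optional, Tuple

# Error codes that commonly signal a quota or throttling problem.
# Illustrative only; extend this for the services your workload uses.
QUOTA_ERROR_PATTERN = re.compile(
    r"LimitExceeded|ThrottlingException|TooManyRequestsException|ServiceQuotaExceededException"
)


def quotas_near_limit(
    usage: Dict[str, float], quotas: Dict[str, float], threshold: float = 0.8
) -> List[Tuple[str, float]]:
    """Return (quota_code, utilization) pairs at or above the threshold."""
    near = []
    for code, limit in quotas.items():
        if limit > 0:
            utilization = usage.get(code, 0.0) / limit
            if utilization >= threshold:
                near.append((code, utilization))
    return sorted(near)


def regional_drift(
    quotas_by_region: Dict[str, Dict[str, float]]
) -> Dict[str, Dict[str, Optional[float]]]:
    """Return quota codes whose value differs between Regions, with each Region's value."""
    codes = set().union(*(q.keys() for q in quotas_by_region.values()))
    drift = {}
    for code in sorted(codes):
        values = {region: q.get(code) for region, q in quotas_by_region.items()}
        if len(set(values.values())) > 1:
            drift[code] = values
    return drift


def quota_error_lines(log_lines: Iterable[str]) -> List[str]:
    """Return the log lines that mention a quota or throttling error code."""
    return [line for line in log_lines if QUOTA_ERROR_PATTERN.search(line)]
```

Any non-empty result from these checks is a candidate for an alert to the monitoring system or an input to the quota request workflow.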

For service constraints:

  • Establish monitoring and metrics methods to alert on resources that are approaching their resource constraints. Use CloudWatch as appropriate for metrics or log monitoring.

  • Establish alert thresholds for each resource that has a constraint that is meaningful to the application or system.

  • Create workflow and infrastructure management procedures to change the resource type if utilization is nearing the constraint. This workflow should include load testing as a best practice to verify that the new type is the correct resource type with the new constraints.

  • Migrate the identified resource to the recommended new resource type, using existing procedures and processes.
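The constraint steps above can be sketched in the same way. This example assumes metric samples have already been collected (for example, from CloudWatch) and that you maintain your own catalog of resource types and their limits. The function names, default thresholds, and catalog are hypothetical.

```python
from typing import Dict, Optional, Sequence


def sustained_breach(
    samples: Sequence[float], limit: float, threshold: float = 0.8, periods: int = 3
) -> bool:
    """True if the last `periods` samples are all at or above threshold * limit."""
    if len(samples) < periods:
        return False
    return all(sample >= threshold * limit for sample in samples[-periods:])


def smallest_fitting_type(
    catalog: Dict[str, float], peak: float, headroom: float = 0.25
) -> Optional[str]:
    """Pick the type with the lowest limit that still covers the peak plus headroom."""
    needed = peak * (1 + headroom)
    fitting = [(limit, name) for name, limit in catalog.items() if limit >= needed]
    return min(fitting)[1] if fitting else None


# Hypothetical catalog: resource type name -> constraint value (any consistent unit).
catalog = {"small": 1000, "medium": 3000, "large": 6000}
samples = [500, 820, 900, 950]

if sustained_breach(samples, limit=catalog["small"]):
    print("Candidate type:", smallest_fitting_type(catalog, peak=max(samples)))
```

Requiring several consecutive samples above the threshold avoids alerting on a single spike. The candidate type is only a starting point for the load test that the workflow calls for.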
