API Request Throttling - Amazon Elastic Compute Cloud

API Request Throttling

Amazon EC2 throttles EC2 API requests for each AWS account on a per-Region basis. We do this to help the performance of the service, and to ensure fair usage for all Amazon EC2 customers. Throttling ensures that calls to the Amazon EC2 API do not exceed the maximum allowed API request limits. API calls are subject to the request limits whether they originate from:

  • A third-party application

  • A command line tool

  • The Amazon EC2 console

If you exceed an API throttling limit, you get the RequestLimitExceeded error code. For more information, see Query API Request Rate.
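One common way for a client to react to this error is to retry with an increasing delay. The following is a minimal sketch only, not a method prescribed by this page: the `RequestLimitExceeded` exception class, the `call_with_backoff` helper, and all delay values are hypothetical names and numbers chosen for illustration, and a real SDK may surface the error code differently.

```python
import random
import time


class RequestLimitExceeded(Exception):
    """Hypothetical stand-in for a response carrying the RequestLimitExceeded error code."""


def call_with_backoff(call, max_attempts=5, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Run `call`, retrying with exponential backoff and jitter when it is throttled."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RequestLimitExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttling error
            # Double the delay ceiling on every retry, capped at max_delay,
            # and wait a random fraction of it so that clients do not retry in lockstep.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))


# Demo with a fake API call that is throttled twice before it succeeds.
attempts = {"count": 0}


def flaky_describe():
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise RequestLimitExceeded()
    return "ok"


waits = []  # record the delays instead of actually sleeping
result = call_with_backoff(flaky_describe, sleep=waits.append)
print(result, attempts["count"], len(waits))
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; in normal use the default `time.sleep` applies.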

How Throttling Is Applied

Amazon EC2 uses the token bucket algorithm to implement API throttling. With this algorithm, your account has a bucket that holds a specific number of tokens. The number of tokens in the bucket represents your throttling limit at any given second.

Amazon EC2 implements two types of API throttling:

Request Rate Limiting

With request rate limiting, you are throttled on the number of API requests you make. Each request that you make removes one token from the bucket. For example, if the bucket size for non-mutating (Describe*) API actions is 100 tokens, you can make up to 100 Describe* requests in one second. If you exceed 100 requests in a second, you are throttled and the remaining requests within that second fail.

Buckets automatically refill at a set rate. If the bucket is below its maximum capacity, a set number of tokens is added back to it every second until it reaches its maximum capacity. If the bucket is full when refill tokens arrive, they are discarded. The bucket cannot hold more than its maximum number of tokens. For example, say the bucket size for non-mutating (Describe*) API actions is 100 tokens, and the refill rate is 10 tokens per second. If you make 100 Describe* API requests in a second, the bucket is immediately reduced to zero (0) tokens. The bucket is then refilled by 10 tokens every second, until it reaches its maximum capacity of 100 tokens. This means that the previously empty bucket reaches its maximum capacity after 10 seconds.
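The refill behavior described above can be sketched as a small model. This is an illustrative toy, not Amazon EC2's implementation: the `TokenBucket` class and its method names are hypothetical, the bucket is assumed to start full, and refill is assumed to happen in one step per second, using the example values of 100 tokens and 10 tokens per second.

```python
class TokenBucket:
    """Toy model of a token bucket that refills once per second."""

    def __init__(self, capacity, refill_rate):
        self.capacity = capacity        # maximum number of tokens
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity          # assumption: the bucket starts full

    def try_request(self):
        """Spend one token if one is available; return False when throttled."""
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def tick(self):
        """Advance one second: add refill tokens and discard any overflow."""
        self.tokens = min(self.capacity, self.tokens + self.refill_rate)


bucket = TokenBucket(capacity=100, refill_rate=10)

# 101 requests within the same second: the first 100 succeed, the last is throttled.
results = [bucket.try_request() for _ in range(101)]
print(results.count(True), results[-1])  # 100 False

# With no further requests, count the seconds until the bucket is full again.
seconds = 0
while bucket.tokens < bucket.capacity:
    bucket.tick()
    seconds += 1
print(seconds)  # 10
```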

You do not need to wait for the bucket to be completely full before you can make API requests. You can use tokens as soon as they are added to the bucket. For example, say the bucket size for non-mutating (Describe*) API actions is 100 tokens, and the refill rate is 10 tokens per second. If you deplete the bucket by making 100 API requests in a second, you can continue to make 10 API requests per second after that for as long as needed. The bucket starts to refill toward its maximum capacity only when you make fewer API requests per second than the refill rate.
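To make the sustained-rate behavior concrete, the following sketch steps through a sequence of seconds with a given demand in each. The `simulate` function is a hypothetical helper, and the model assumes that tokens are added once at the end of every second, which simplifies whatever the service actually does.

```python
def simulate(capacity, refill_rate, demand_per_second):
    """Return (accepted requests per second, tokens left at the end).

    Assumes a full bucket at the start and one refill step per second.
    """
    tokens = capacity
    accepted = []
    for wanted in demand_per_second:
        granted = min(wanted, tokens)  # requests beyond the available tokens are throttled
        tokens -= granted
        accepted.append(granted)
        tokens = min(capacity, tokens + refill_rate)  # refill, never above capacity
    return accepted, tokens


# Burst of 100, then 10 per second (equal to the refill rate), then 4 per second.
demand = [100] + [10] * 5 + [4] * 3
accepted, remaining = simulate(capacity=100, refill_rate=10, demand_per_second=demand)
print(accepted)   # [100, 10, 10, 10, 10, 10, 4, 4, 4]
print(remaining)  # 28

# Asking for 15 per second right after the burst gets only 10 through.
print(simulate(100, 10, [100, 15])[0])  # [100, 10]
```

While demand equals the refill rate, the bucket stays empty after each second's requests; once demand drops to 4 per second, it gains 6 tokens per second in this model.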

For more information about the request token bucket sizes and refill rates, see Request Token Bucket Sizes and Refill Rates.

Resource Rate Limiting

Some API actions, such as RunInstances and TerminateInstances, use resource rate limiting in addition to request rate limiting. These API actions have a separate resource token bucket that depletes based on the number of resources that are impacted by the request. Like request token buckets, resource token buckets have a bucket maximum that allows you to burst, and a refill rate that allows you to sustain a steady rate of requests for as long as needed.

For example, say the resource token bucket size for RunInstances is 1000 tokens, and the refill rate is two tokens per second. The bucket maximum indicates that you can immediately launch 1000 instances, using any number of API requests, as long as the total number of instances does not exceed the number of tokens in the resource token bucket. The requests themselves still draw on the request token bucket. So, if the request token bucket for mutating API requests has 250 tokens, you can launch 1000 instances using one request for 1000 instances, or using 250 requests for four instances each. The refill rate indicates that you can launch up to two instances every second after the resource token bucket has been depleted, using either one request for two instances, or two requests for one instance each.
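The interaction between the two buckets in this example can be sketched as follows. The `launch` function is a hypothetical helper that ignores refill and assumes that a request is rejected as a whole when either bucket lacks the tokens it needs; the source does not state how a partially covered request is handled.

```python
def launch(request_tokens, resource_tokens, instance_counts):
    """Apply a burst of RunInstances-style requests to both buckets.

    Each request costs one request token plus one resource token per instance.
    Returns (instances launched, request tokens left, resource tokens left).
    """
    launched = 0
    for count in instance_counts:
        if request_tokens < 1 or resource_tokens < count:
            break  # throttled: one of the two buckets cannot cover this request
        request_tokens -= 1
        resource_tokens -= count
        launched += count
    return launched, request_tokens, resource_tokens


# One request for 1000 instances.
print(launch(250, 1000, [1000]))      # (1000, 249, 0)

# 250 requests for four instances each.
print(launch(250, 1000, [4] * 250))   # (1000, 0, 0)

# 1000 requests for one instance each: the request bucket runs out first.
print(launch(250, 1000, [1] * 1000))  # (250, 0, 750)
```

The last case shows why the request size matters: with single-instance requests, the request token bucket is exhausted long before the resource token bucket.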

For more information, see Resource Token Bucket Sizes and Refill Rates.

Throttling Limits

The following sections describe the request token bucket and resource token bucket sizes and refill rates.

Request Token Bucket Sizes and Refill Rates

For request rate limiting purposes, API actions are grouped into the following categories:

  • Non-mutating actions — API actions that retrieve data about resources. This category generally includes all Describe* actions, such as DescribeRouteTables, DescribeImages, and DescribeHosts. These API actions typically have the highest API throttling limits.

  • Mutating actions — API actions that create, modify, or delete resources. This category generally includes all API actions that are not categorized as non-mutating actions, such as CreateVolume, ModifyHosts, and DeleteSnapshot. These actions have a lower throttling limit than non-mutating API calls.

  • Resource-intensive actions — Mutating API actions that take the most time and consume the most resources to complete. These actions have an even lower throttling limit than mutating actions. They are throttled separately from other mutating actions.

  • Console non-mutating actions — Non-mutating API actions that are called from the Amazon EC2 console. These API actions are throttled separately from other non-mutating API actions.

  • Uncategorized actions — These API actions receive their own token bucket sizes and refill rates, even though by definition they fit in one of the other categories.

The following table shows the request token bucket sizes and refill rates for all AWS Regions.

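API Action Category | Actions | Bucket Maximum Capacity (tokens) | Bucket Refill Rate (tokens per second)
Non-mutating actions | Describe*, Get* | 100 | 20
Mutating actions | API actions that are not categorized as non-mutating actions | 200 | 5
Resource-intensive actions | AuthorizeSecurityGroupIngress, CancelSpotInstanceRequests, CreateKeyPair, RequestSpotInstances, RevokeSecurityGroupIngress, CreateVpcPeeringConnection, AcceptVpcPeeringConnection, RejectVpcPeeringConnection, DeleteVpcPeeringConnection | 50 | 5
Console non-mutating actions | Describe*, Get* | 100 | 10
Uncategorized actions | RunInstances | 5 | 2
Uncategorized actions | StartInstances | 5 | 2
Uncategorized actions | CreateVpcEndpoint | 4 | 0.3
Uncategorized actions | ModifyVpcEndpoint | 4 | 0.3
Uncategorized actions | DeleteVpcEndpoints | 4 | 0.3
Uncategorized actions | AcceptVpcEndpointConnections | 10 | 1
Uncategorized actions | RejectVpcEndpointConnections | 10 | 1
Uncategorized actions | CreateVpcEndpointServiceConfiguration | 10 | 1
Uncategorized actions | ModifyVpcEndpointServiceConfiguration | 10 | 1
Uncategorized actions | DeleteVpcEndpointServiceConfigurations | 10 | 1
Uncategorized actions | CreateDefaultVpc | 1 | 1
Uncategorized actions | CreateDefaultSubnet | 1 | 1
Uncategorized actions | MoveAddressToVpc | 1 | 1
Uncategorized actions | RestoreAddressToClassic | 1 | 1
Uncategorized actions | DescribeMovingAddresses | 1 | 1
Uncategorized actions | AdvertiseByoipCidr | 1 | 0.1
Uncategorized actions | ProvisionByoipCidr | 1 | 0.1
Uncategorized actions | DescribeByoipCidrs | 1 | 0.5
Uncategorized actions | DeprovisionByoipCidr | 1 | 0.1
Uncategorized actions | WithdrawByoipCidr | 1 | 0.1
Uncategorized actions | DescribeReservedInstancesOfferings | 10 | 10
Uncategorized actions | PurchaseReservedInstancesOffering | 5 | 5
Uncategorized actions | DescribeSpotFleetRequests | 50 | 3
Uncategorized actions | DescribeSpotFleetInstances | 100 | 5
Uncategorized actions | DescribeSpotFleetRequestHistory | 100 | 5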
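Given a bucket size and refill rate from the table, you can estimate how long a batch of calls takes at best. The sketch below is a rough lower bound under stated assumptions: `min_seconds` is a hypothetical helper, the bucket starts full, nothing else in the account and Region draws on the same bucket, and tokens arrive in one step per second. The examples use whole-number refill rates; fractional rates such as 0.3 are subject to floating-point rounding in this simple form.

```python
import math


def min_seconds(calls, capacity, refill_rate):
    """Lower bound on the seconds of refill needed to issue `calls` requests.

    The first `capacity` calls are covered by the initial burst; every
    further call has to wait for a refilled token.
    """
    if calls <= capacity:
        return 0
    return math.ceil((calls - capacity) / refill_rate)


# 500 non-mutating calls (bucket 100, refill 20 per second)
print(min_seconds(500, 100, 20))  # 20

# 500 mutating calls (bucket 200, refill 5 per second)
print(min_seconds(500, 200, 5))   # 60

# 25 RunInstances requests (request bucket 5, refill 2 per second)
print(min_seconds(25, 5, 2))      # 10

# 50 non-mutating calls fit within the initial burst
print(min_seconds(50, 100, 20))   # 0
```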

Resource Token Bucket Sizes and Refill Rates

The following table lists the resource token bucket sizes and refill rates for API actions that apply resource rate limiting.

API Action | Bucket Maximum Capacity (tokens) | Bucket Refill Rate (tokens per second)
RunInstances | 1000 | 2
TerminateInstances | 1000 | 20
StartInstances | 1000 | 2
StopInstances | 1000 | 20

Monitoring API Throttling

You can use Amazon CloudWatch to monitor your Amazon EC2 API calls and to collect and track metrics around API throttling. You can also create an alarm to warn you when you are close to reaching the API throttling limits. For more information, see Monitoring API Requests with Amazon CloudWatch.

Adjusting API Throttling Limits

You can request an increase for API throttling limits for your AWS account. To request a limit adjustment, contact the AWS Support Center.