Configuring reserved concurrency

In Lambda, concurrency is the number of in-flight requests your function is handling at the same time. There are two types of concurrency controls available:

  • Reserved concurrency – Reserved concurrency is the maximum number of concurrent instances you want to allocate to your function. When a function has reserved concurrency, no other function can use that concurrency. There is no charge for configuring reserved concurrency for a function.

  • Provisioned concurrency – Provisioned concurrency is the number of pre-initialized execution environments you want to allocate to your function. These execution environments are prepared to respond immediately to incoming function requests. Configuring provisioned concurrency incurs charges to your AWS account.

This topic details how to manage and configure reserved concurrency. For a conceptual overview of these two types of concurrency controls, see Reserved concurrency and provisioned concurrency. For information on configuring provisioned concurrency, see Configuring provisioned concurrency.

Configuring reserved concurrency

You can configure reserved concurrency settings for a function using the Lambda console or the Lambda API.

To reserve concurrency for a function (console)
  1. Open the Functions page of the Lambda console.

  2. Choose the function you want to reserve concurrency for.

  3. Choose Configuration and then choose Concurrency.

  4. Under Concurrency, choose Edit.

  5. Choose Reserve concurrency. Enter the amount of concurrency to reserve for the function.

  6. Choose Save.

You can reserve up to the Unreserved account concurrency value minus 100. The remaining 100 units of concurrency are for functions that aren't using reserved concurrency. For example, if your account has a concurrency limit of 1,000 and no other reservations, you can reserve at most 900 units for a single function, not all 1,000. An error occurs if you try to reserve more than this.

Reserving concurrency for a function reduces the concurrency pool that's available to other functions. For example, in an account with a concurrency limit of 1,000, if you reserve 100 units of concurrency for function-a, other functions in your account must share the remaining 900 units of concurrency, even if function-a doesn't use all 100 reserved concurrency units.
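The reservation arithmetic described above can be sketched as a pair of shell helpers. The function names unreserved_pool and max_reservable are hypothetical, and the floor of 100 unreserved units is taken from the text above; this is an illustration of the calculation, not a Lambda tool.

```shell
# Hypothetical helpers illustrating the reservation arithmetic described above.

# unreserved_pool ACCOUNT_LIMIT [RESERVATION...]
# Prints the concurrency left for functions that have no reservation.
unreserved_pool() {
    pool="$1"
    shift
    for reserved in "$@"; do
        pool=$((pool - reserved))
    done
    echo "$pool"
}

# max_reservable UNRESERVED
# Prints the largest new reservation that still leaves 100 units unreserved.
max_reservable() {
    if [ "$1" -gt 100 ]; then
        echo $(( $1 - 100 ))
    else
        echo 0
    fi
}

unreserved_pool 1000 100   # function-a reserves 100: prints 900
max_reservable 1000        # prints 900
```

With AWS credentials configured, the current unreserved value could be read with `aws lambda get-account-settings --query 'AccountLimit.UnreservedConcurrentExecutions' --output text` and passed to max_reservable.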

To intentionally throttle a function, set its reserved concurrency to 0. This stops your function from processing any events until you remove the limit.
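As a sketch of the same operation from the AWS CLI: setting the reservation to 0 throttles the function, and deleting the reservation removes the limit again. The wrapper names throttle_function and unthrottle_function are hypothetical, my-function is a placeholder, and running the commands assumes configured AWS credentials.

```shell
# Hypothetical wrappers around two AWS CLI commands.

# Stop the named function from processing events by reserving 0 concurrency.
throttle_function() {
    aws lambda put-function-concurrency \
        --function-name "$1" \
        --reserved-concurrent-executions 0
}

# Remove the reservation so the function uses the unreserved pool again.
unthrottle_function() {
    aws lambda delete-function-concurrency --function-name "$1"
}

# Example (requires AWS credentials):
#   throttle_function my-function
#   unthrottle_function my-function
```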

To configure reserved concurrency with the Lambda API, use the following API operations:

  • PutFunctionConcurrency

  • GetFunctionConcurrency

  • DeleteFunctionConcurrency

For example, to configure reserved concurrency with the AWS Command Line Interface (CLI), use the put-function-concurrency command. The following command reserves 100 concurrency units for a function named my-function:

aws lambda put-function-concurrency --function-name my-function \
    --reserved-concurrent-executions 100

You should see output that looks like the following:

{
    "ReservedConcurrentExecutions": 100
}
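To confirm a reservation after setting it, the value can be read back with the get-function-concurrency command. A minimal sketch, assuming configured AWS credentials; the wrapper name current_reservation is hypothetical, and --query and --output are the standard AWS CLI output options.

```shell
# Hypothetical wrapper: print only the reserved concurrency value for a function.
current_reservation() {
    aws lambda get-function-concurrency \
        --function-name "$1" \
        --query 'ReservedConcurrentExecutions' \
        --output text
}

# Example (requires AWS credentials; my-function is a placeholder):
#   current_reservation my-function
```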

Estimating required reserved concurrency for a function

If your function is currently serving traffic, you can view its concurrency using CloudWatch metrics. Specifically, the ConcurrentExecutions metric shows the number of concurrent invocations for each function in your account.

[Figure: graph showing concurrency for a function over time]

The previous graph suggests that this function serves an average of 5 to 10 concurrent requests at any given time, and peaks at 20 requests on a typical day. Suppose that there are many other functions in your account. If this function is critical to your application and you don't want to drop any requests, use a number greater than or equal to 20 as your reserved concurrency setting.
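The peak value can also be retrieved without the console. The following is a sketch only, assuming configured AWS credentials: the wrapper name peak_concurrency is hypothetical, the start and end times are supplied by the caller as ISO 8601 timestamps, and the one-hour period is an arbitrary choice.

```shell
# Hypothetical wrapper: print the highest ConcurrentExecutions value that
# CloudWatch recorded for one function between two timestamps.
# Usage: peak_concurrency FUNCTION_NAME START_TIME END_TIME
peak_concurrency() {
    aws cloudwatch get-metric-statistics \
        --namespace AWS/Lambda \
        --metric-name ConcurrentExecutions \
        --dimensions Name=FunctionName,Value="$1" \
        --start-time "$2" \
        --end-time "$3" \
        --period 3600 \
        --statistics Maximum \
        --query 'max(Datapoints[].Maximum)' \
        --output text
}

# Example (requires AWS credentials; name and dates are placeholders):
#   peak_concurrency my-function 2024-01-01T00:00:00Z 2024-01-02T00:00:00Z
```

A reserved concurrency setting at or above the printed peak corresponds to the "greater than or equal to 20" guidance above.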

Alternatively, recall that you can also calculate concurrency using the following formula:

Concurrency = (average requests per second) * (average request duration in seconds)

Multiplying the average requests per second by the average request duration in seconds gives you a rough estimate of how much concurrency you need to reserve. You can estimate average requests per second using the Invocations metric, and the average request duration using the Duration metric, which Lambda reports in milliseconds (divide by 1,000 to get seconds). See Working with Lambda function metrics for more details.
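The formula can be sketched as a small shell function. The name estimate_concurrency is hypothetical and the input values are made-up examples; the function takes the duration in milliseconds, as the Duration metric reports it, and rounds the result up to a whole concurrency unit.

```shell
# Hypothetical helper implementing:
#   concurrency = (average requests per second) * (average duration in seconds)
# Usage: estimate_concurrency REQUESTS_PER_SECOND AVG_DURATION_MS
estimate_concurrency() {
    awk -v rps="$1" -v ms="$2" 'BEGIN {
        c = rps * ms / 1000        # convert milliseconds to seconds
        n = int(c)
        if (n < c) n++             # round up to a whole unit
        print n
    }'
}

estimate_concurrency 100 500   # 100 req/s at 500 ms each: prints 50
estimate_concurrency 10 250    # 10 req/s at 250 ms each: prints 3
```

Because this is an average-based estimate, it is worth comparing the result against the observed peak in the ConcurrentExecutions metric before settling on a value.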