AWS Lambda
Developer Guide

Lambda Function Concurrent Executions

Concurrent executions refers to the number of executions of your function code that are happening at any given time. You can estimate this count, but the estimate depends on whether your Lambda function processes events from a stream-based event source.

  • Stream-based event sources – If you create a Lambda function that processes events from stream-based services (Amazon Kinesis Streams or DynamoDB streams), the number of shards per stream is the unit of concurrency. If your stream has 100 active shards, there will be 100 invocations of your Lambda function running concurrently. Each invocation processes the events on its shard in the order that they arrive.

  • Event sources that aren't stream-based – If you create a Lambda function to process events from event sources that aren't stream-based (for example, Amazon S3 or API Gateway), each published event is a unit of work. Therefore, the number of events (or requests) these event sources publish influences the concurrency.

    You can use the following formula to estimate your concurrent Lambda function invocations:

    events (or requests) per second * function duration

    For example, consider a Lambda function that processes Amazon S3 events. Suppose that the Lambda function takes on average three seconds and Amazon S3 publishes 10 events per second. Then, you will have 30 concurrent executions of your Lambda function.
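The estimate above can be sketched in a few lines of Python. The function name `estimate_concurrency` and its parameter names are illustrative assumptions, not part of any AWS API; the arithmetic is the formula given in the text.

```python
def estimate_concurrency(events_per_second: float, avg_duration_seconds: float) -> float:
    """Estimate concurrent executions for an event source that isn't stream-based.

    Implements: events (or requests) per second * function duration.
    """
    if events_per_second < 0 or avg_duration_seconds < 0:
        raise ValueError("rate and duration must be non-negative")
    return events_per_second * avg_duration_seconds


# The Amazon S3 example from the text: 10 events per second, 3 seconds each.
print(estimate_concurrency(10, 3))  # 30
```

Because the result scales linearly with both inputs, halving the average function duration halves the estimated concurrency at the same event rate.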

Request Rate

Request rate refers to the rate at which your Lambda function is invoked. For all services except the stream-based services, the request rate is the rate at which the event sources generate the events. For stream-based services, AWS Lambda calculates the request rate as follows:

request rate = number of concurrent executions / function duration

For example, if there are five active shards on a stream (that is, you have five Lambda functions running in parallel) and your Lambda function takes about two seconds, the request rate is 2.5 requests/second.
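As a minimal sketch, the stream-based request rate can be computed the same way. The name `stream_request_rate` is a hypothetical helper introduced here for illustration; the concurrency value is the number of active shards, as described earlier.

```python
def stream_request_rate(concurrent_executions: int, avg_duration_seconds: float) -> float:
    """Request rate for a stream-based source.

    Implements: request rate = number of concurrent executions / function duration.
    """
    if avg_duration_seconds <= 0:
        raise ValueError("function duration must be positive")
    return concurrent_executions / avg_duration_seconds


# The example from the text: five active shards, about two seconds per invocation.
print(stream_request_rate(5, 2))  # 2.5
```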

Safety Limit

By default, AWS Lambda limits the total concurrent executions across all functions within a given region to 100. The default limit is a safety limit that protects you from costs due to potential runaway or recursive functions during initial development and testing. To increase this limit above the default, follow the steps in To request a limit increase for concurrent executions.

Any invocation that causes your function's concurrent execution to exceed the safety limit is throttled, and does not execute your function. Each throttled invocation increases the CloudWatch Throttles metric for the function.
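The throttling behavior described above can be pictured with a toy model. This is only an illustration under simplifying assumptions, not how AWS Lambda is implemented: the class `ConcurrencyLimiter` and its methods are hypothetical, and the `throttles` counter merely stands in for the CloudWatch Throttles metric.

```python
class ConcurrencyLimiter:
    """Toy model of a concurrent-execution safety limit (illustrative only)."""

    def __init__(self, limit: int = 100):
        self.limit = limit      # maximum concurrent executions allowed
        self.running = 0        # executions currently in progress
        self.throttles = 0      # stand-in for the Throttles metric

    def try_start(self) -> bool:
        """Admit an invocation, or throttle it if the limit is already reached."""
        if self.running >= self.limit:
            self.throttles += 1  # a throttled invocation does not execute
            return False
        self.running += 1
        return True

    def finish(self) -> None:
        """Record that one running execution has completed."""
        if self.running == 0:
            raise RuntimeError("no execution is running")
        self.running -= 1


limiter = ConcurrencyLimiter(limit=2)
results = [limiter.try_start() for _ in range(3)]
print(results, limiter.throttles)  # [True, True, False] 1
```

In this model, the third invocation is rejected because two executions are already running; once one of them calls `finish()`, a new invocation is admitted again.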

The throttled invocation is handled differently based on how the function is invoked:

  • Event sources that aren't stream-based – Some of these event sources invoke a Lambda function synchronously and others invoke it asynchronously.


    • Synchronous invocation – If the function is invoked synchronously and is throttled, the invoking application receives a 429 error and the invoking application is responsible for retries. These event sources may have additional retries built into the integration. For example, CloudWatch Logs retries the failed batch up to five times with delays between retries. For a list of supported event sources and the invocation types that they use, see Supported Event Sources.

      If you invoke Lambda through API Gateway, you need to make sure you map Lambda response errors to API Gateway error codes. If you invoke the function directly, such as through the AWS SDKs using the RequestResponse invocation mode or through API Gateway, your client receives the 429 error and you can choose to retry the invocation.


    • Asynchronous invocation – If your Lambda function is invoked asynchronously and is throttled, AWS Lambda automatically retries the throttled event for up to six hours, with delays between retries. Asynchronous events are queued before they are used to invoke the Lambda function.


  • Stream-based event sources – For stream-based event sources (Amazon Kinesis Streams and DynamoDB streams), AWS Lambda polls your stream and invokes your Lambda function. Therefore, when your Lambda function is throttled, AWS Lambda attempts to process the throttled batch of records until the data expires, which can take up to seven days for Amazon Kinesis Streams. The throttled request is treated as blocking per shard: Lambda does not read any new records from the shard until the throttled batch of records either expires or succeeds. If there is more than one shard in the stream, Lambda continues to invoke the function on the non-throttled shards while the throttled batch is retried.
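For the synchronous case above, where the invoking application is responsible for retries after a 429 error, a client-side retry loop might look like the following sketch. Everything here is an assumption made for illustration: `ThrottledError` is a hypothetical stand-in for whatever throttling error your client surfaces, `invoke` is any callable that performs the invocation, and exponential backoff with jitter is one common retry strategy rather than something this guide prescribes.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttled (429) response."""


def invoke_with_retries(invoke, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call invoke(), retrying with exponential backoff when it is throttled.

    invoke:       zero-argument callable that raises ThrottledError on a 429.
    max_attempts: total number of attempts before giving up.
    base_delay:   upper bound, in seconds, for the first retry delay.
    sleep:        injectable delay function, which makes the helper easy to test.
    """
    for attempt in range(max_attempts):
        try:
            return invoke()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttling error
            # Full jitter: wait a random time up to base_delay * 2^attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
```

To use it, wrap the real invocation in a function that translates the client's throttling error into `ThrottledError` (or catch the client's own exception type in the helper instead), and pass that function as `invoke`.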

To request a limit increase for concurrent executions

  1. Open the AWS Support Center page, sign in if necessary, and then click Create case.

  2. Under Regarding, select Service Limit Increase.

  3. Under Limit Type, select Lambda, fill in the necessary fields in the form, and then click the button at the bottom of the page for your preferred method of contact.


AWS may automatically raise the concurrent execution limit on your behalf to enable your function to match the incoming event rate, as in the case of triggering the function from an Amazon S3 bucket.

Suggested Reading

If you are new to AWS Lambda, we suggest you read through all of the topics in the How It Works section to familiarize yourself with Lambda. The next topic is Retries on Errors.

After you read all of the topics in the How It Works section, we recommend that you review Building Lambda Functions, try the Getting Started exercise, and then explore the Use Cases. Each use case provides step-by-step instructions for setting up the end-to-end experience.