Response streaming for Lambda functions

You can configure your Lambda function URLs to stream response payloads back to clients. Response streaming can benefit latency-sensitive applications by improving time to first byte (TTFB), because your function can send partial responses to the client as they become available. You can also use response streaming to build functions that return larger payloads: streamed response payloads have a soft limit of 20 MB, compared with the 6 MB limit for buffered responses. Streaming also means that your function doesn’t need to fit the entire response in memory. For very large responses, this can reduce the amount of memory you need to configure for your function.

The speed at which Lambda streams your responses depends on the response size. The streaming rate for the first 6 MB of your function’s response is uncapped. For responses larger than 6 MB, the remainder of the response is subject to a bandwidth cap. For more information on streaming bandwidth, see Bandwidth limits for response streaming.

Streaming responses incurs a cost. For more information, see AWS Lambda Pricing.

Lambda supports response streaming on Node.js managed runtimes. For other languages, you can either use a custom runtime with a custom Runtime API integration or use the Lambda Web Adapter. You can stream responses through Lambda function URLs, the AWS SDK, or the Lambda InvokeWithResponseStream API.
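As an illustration of the AWS SDK route, the following is a minimal sketch of a client that reads a streamed response with the AWS SDK for Python (boto3) through its invoke_with_response_stream call. The helper names iter_payload_chunks and stream_function are hypothetical and chosen only for this example, and the sketch assumes that boto3 is installed and that credentials with permission to invoke the function are already configured.

```python
from typing import Any, Dict, Iterable, Iterator


def iter_payload_chunks(events: Iterable[Dict[str, Any]]) -> Iterator[bytes]:
    """Yield payload bytes from an InvokeWithResponseStream event stream.

    Hypothetical helper: it only interprets the PayloadChunk and
    InvokeComplete events and ignores anything else.
    """
    for event in events:
        if "PayloadChunk" in event:
            # Each chunk carries a slice of the response as raw bytes.
            yield event["PayloadChunk"]["Payload"]
        elif "InvokeComplete" in event:
            # The final event reports an error code if the stream failed.
            error_code = event["InvokeComplete"].get("ErrorCode")
            if error_code:
                details = event["InvokeComplete"].get("ErrorDetails", "")
                raise RuntimeError(f"stream failed: {error_code}: {details}")


def stream_function(function_name: str, payload: bytes = b"{}") -> Iterator[bytes]:
    """Invoke a function and yield its response chunks as they arrive."""
    # Imported lazily so that iter_payload_chunks works without the SDK.
    import boto3

    client = boto3.client("lambda")
    response = client.invoke_with_response_stream(
        FunctionName=function_name,
        Payload=payload,
    )
    yield from iter_payload_chunks(response["EventStream"])
```

A caller could loop over stream_function("my-streaming-function"), where the function name is a placeholder, and write each chunk to its output as soon as it arrives instead of waiting for the whole payload.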

Note

When testing your function through the Lambda console, you'll always see responses as buffered.

Bandwidth limits for response streaming

The first 6 MB of your function’s response payload has uncapped bandwidth. After this initial burst, Lambda streams your response at a maximum rate of 2 MBps. If your function responses never exceed 6 MB, then this bandwidth limit never applies.

Note

Bandwidth limits only apply to your function’s response payload, and not to network access by your function.

The uncapped rate depends on several factors, including your function’s processing speed. You can normally expect a rate higher than 2 MBps for the first 6 MB of your function’s response. If your function streams a response to a destination outside of AWS, the streaming rate also depends on the speed of the external internet connection.
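The limits in this section can be turned into a rough lower bound on streaming time. The sketch below assumes that MBps means megabytes per second and that only the portion beyond the first 6 MB is capped at 2 MBps. Because the uncapped rate varies, the time for the first 6 MB is deliberately left out. The function name min_capped_seconds is hypothetical.

```python
UNCAPPED_MB = 6.0       # the first 6 MB of a response is not rate limited
CAPPED_RATE_MBPS = 2.0  # maximum rate for everything after the first 6 MB


def min_capped_seconds(response_mb: float) -> float:
    """Return the minimum time spent streaming the capped part of a response.

    This is a lower bound only: it excludes the uncapped first 6 MB and
    assumes the capped part flows at the full 2 MBps.
    """
    if response_mb < 0:
        raise ValueError("response size must be non-negative")
    capped_mb = max(0.0, response_mb - UNCAPPED_MB)
    return capped_mb / CAPPED_RATE_MBPS


if __name__ == "__main__":
    for size_mb in (4, 6, 20):
        seconds = min_capped_seconds(size_mb)
        print(f"{size_mb} MB response: at least {seconds} s under the cap")
```

For example, a 20 MB response has 14 MB subject to the cap, so that part takes at least 7 seconds, whereas responses of 6 MB or less never reach the cap.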