Lambda execution environment - AWS Lambda

Lambda execution environment

Lambda invokes your function in an execution environment, which provides a secure and isolated runtime environment. The execution environment manages the resources required to run your function. The execution environment also provides lifecycle support for the function's runtime and any external extensions associated with your function.

The function's runtime communicates with Lambda using the Runtime API. Extensions communicate with Lambda using the Extensions API. Extensions can also receive log messages and other telemetry from the function by using the Telemetry API.


            Architecture diagram of the execution environment.

When you create your Lambda function, you specify configuration information, such as the amount of memory available and the maximum execution time allowed for your function. Lambda uses this information to set up the execution environment.

The function's runtime and each external extension are processes that run within the execution environment. Permissions, resources, credentials, and environment variables are shared between the function and the extensions.

Lambda execution environment lifecycle


            The Init  phase is followed by one or more function invocations. When there are no
              invocation requests, Lambda  initiates the Shutdown phase.

Each phase starts with an event that Lambda sends to the runtime and to all registered extensions. The runtime and each extension indicate completion by sending a Next API request. Lambda freezes the execution environment when the runtime and each extension have completed and there are no pending events.

Init phase

In the Init phase, Lambda performs three tasks:

  • Start all extensions (Extension init)

  • Bootstrap the runtime (Runtime init)

  • Run the function's static code (Function init)

  • Run any beforeCheckpoint runtime hooks (Lambda SnapStart only)

The Init phase ends when the runtime and all extensions signal that they are ready by sending a Next API request. The Init phase is limited to 10 seconds. If all three tasks do not complete within 10 seconds, Lambda retries the Init phase at the time of the first function invocation with the configured function timeout.

When Lambda SnapStart is activated, the Init phase happens when you publish a function version. Lambda saves a snapshot of the memory and disk state of the initialized execution environment, persists the encrypted snapshot, and caches it for low-latency access. If you have a beforeCheckpoint runtime hook, then the code runs at the end of Init phase.

Note

The 10-second timeout doesn't apply to SnapStart functions. When Lambda creates a snapshot, your initialization code can run for up to 15 minutes. The time limit is 130 seconds or the configured function timeout (maximum 900 seconds), whichever is higher.

When you use provisioned concurrency, Lambda initializes the execution environment when you configure the PC settings for a function. Lambda also ensures that initialized execution environments are always available in advance of invocations. You may see gaps between your function's invocation and initialization phases. Depending on your function's runtime and memory configuration, you may also see variable latency on the first invocation on an initialized execution environment.

For functions using on-demand concurrency, Lambda may occasionally initialize execution environments ahead of invocation requests. When this happens, you may also observe a time gap between your function's initialization and invocation phases. We recommend you to not take a dependency on this behavior.

Restore phase (Lambda SnapStart only)

When you first invoke a SnapStart function and as the function scales up, Lambda resumes new execution environments from the persisted snapshot instead of initializing the function from scratch. If you have an afterRestore() runtime hook, the code runs at the end of the Restore phase. You are charged for the duration of afterRestore() runtime hooks. The runtime (JVM) must load and afterRestore() runtime hooks must complete within the timeout limit (10 seconds). Otherwise, you'll get a SnapStartTimeoutException. When the Restore phase completes, Lambda invokes the function handler (the Invoke phase).

Invoke phase

When a Lambda function is invoked in response to a Next API request, Lambda sends an Invoke event to the runtime and to each extension.

The function's timeout setting limits the duration of the entire Invoke phase. For example, if you set the function timeout as 360 seconds, the function and all extensions need to complete within 360 seconds. Note that there is no independent post-invoke phase. The duration is the sum of all invocation time (runtime + extensions) and is not calculated until the function and all extensions have finished executing.

The invoke phase ends after the runtime and all extensions signal that they are done by sending a Next API request.

Failures during the invoke phase

If the Lambda function crashes or times out during the Invoke phase, Lambda resets the execution environment. The following diagram illustrates Lambda execution environment behavior when there's an invoke failure:

In the previous diagram:

  • The first phase is the INIT phase, which runs without errors.

  • The second phase is the INVOKE phase, which runs without errors.

  • At some point, suppose your function runs into an invoke failure (such as a function timeout or runtime error). The third phase, labeled INVOKE WITH ERROR , illustrates this scenario. When this happens, the Lambda service performs a reset. The reset behaves like a Shutdown event. First, Lambda shuts down the runtime, then sends a Shutdown event to each registered external extension. The event includes the reason for the shutdown. If this environment is used for a new invocation, Lambda re-initializes the extension and runtime together with the next invocation.

    Note

    The Lambda reset does not clear the /tmp directory content prior to the next init phase. This behavior is consistent with the regular shutdown phase.

  • The fourth phase represents the INVOKE phase immediately following an invoke failure. Here, Lambda initializes the environment again by re-running the INIT phase. This is called a suppressed init. When suppressed inits occur, Lambda doesn't explicitly report an additional INIT phase in CloudWatch Logs. Instead, you may notice that the duration in the REPORT line includes an additional INIT duration + the INVOKE duration. For example, suppose you see the following logs in CloudWatch:

    2022-12-20T01:00:00.000-08:00 START RequestId: XXX Version: $LATEST 2022-12-20T01:00:02.500-08:00 END RequestId: XXX 2022-12-20T01:00:02.500-08:00 REPORT RequestId: XXX Duration: 3022.91 ms Billed Duration: 3000 ms Memory Size: 512 MB Max Memory Used: 157 MB

    In this example, the difference between the REPORT and START timestamps is 2.5 seconds. This doesn't match the reported duration of 3022.91 millseconds, because it doesn't take into account the extra INIT (suppressed init) that Lambda performed. In this example, you can infer that the actual INVOKE phase took 2.5 seconds.

    For more insight into this behavior, you can use the Lambda Telemetry API. The Telemetry API emits INIT_START, INIT_RUNTIME_DONE, and INIT_REPORT events with phase=invoke whenever suppressed inits occur in between invoke phases.

  • The fifth phase represents the SHUTDOWN phase, which runs without errors.

Shutdown phase

When Lambda is about to shut down the runtime, it sends a Shutdown event to each registered external extension. Extensions can use this time for final cleanup tasks. The Shutdown event is a response to a Next API request.

Duration: The entire Shutdown phase is capped at 2 seconds. If the runtime or any extension does not respond, Lambda terminates it via a signal (SIGKILL).

After the function and all extensions have completed, Lambda maintains the execution environment for some time in anticipation of another function invocation. In effect, Lambda freezes the execution environment. When the function is invoked again, Lambda thaws the environment for reuse. Reusing the execution environment has the following implications:

  • Objects declared outside of the function's handler method remain initialized, providing additional optimization when the function is invoked again. For example, if your Lambda function establishes a database connection, instead of reestablishing the connection, the original connection is used in subsequent invocations. We recommend adding logic in your code to check if a connection exists before creating a new one.

  • Each execution environment provides between 512 MB and 10,240 MB, in 1-MB increments, of disk space in the /tmp directory. The directory content remains when the execution environment is frozen, providing a transient cache that can be used for multiple invocations. You can add extra code to check if the cache has the data that you stored. For more information on deployment size limits, see Lambda quotas.

  • Background processes or callbacks that were initiated by your Lambda function and did not complete when the function ended resume if Lambda reuses the execution environment. Make sure that any background processes or callbacks in your code are complete before the code exits.

When you write your function code, do not assume that Lambda automatically reuses the execution environment for subsequent function invocations. Other factors may dictate a need for Lambda to create a new execution environment, which can lead to unexpected results, such as database connection failures.