Logging and metrics for AWS Lambda - AWS Prescriptive Guidance

Logging and metrics for AWS Lambda

Lambda removes the need to manage and monitor servers for your workloads and automatically works with CloudWatch Metrics and CloudWatch Logs without further configuration or instrumentation of your application's code. This section helps you understand the performance characteristics of the systems used by Lambda and how your configuration choices influence performance. It also helps you log and monitor your Lambda functions for performance optimization and diagnosing application-level issues.

Lambda function logging

Lambda automatically streams standard output and standard error messages from a Lambda function to CloudWatch Logs, without requiring logging drivers. Lambda also automatically provisions containers that run your Lambda function and configures them to output log messages in separate log streams.

Subsequent invocations of your Lambda function can reuse the same container and output to the same log stream. Lambda can also provision a new container and output the invocation to a new log stream.

Lambda automatically creates a log group when your Lambda function is first invoked. Lambda functions can have multiple versions and you can choose the version that you want to run. All logs for the Lambda function's invocations are stored in the same log group. The name cannot be changed and is in the /aws/lambda/<YourLambdaFunctionName> format. A separate log stream is created in the log group for each Lambda function instance. Lambda has a standard naming convention for log streams that uses a YYYY/MM/DD/[<FunctionVersion>]<InstanceId> format. The InstanceId is generated by AWS to identify the Lambda function instance.
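The naming conventions above can be sketched in a few lines of Python. This is a minimal illustration, not an AWS API: the helper names `log_group_name` and `parse_log_stream_name`, the regular expression, and the sample stream name are all assumptions made for the example.

```python
import re


def log_group_name(function_name: str) -> str:
    """Build the log group name from the /aws/lambda/<YourLambdaFunctionName> format."""
    return f"/aws/lambda/{function_name}"


# Assumed pattern for the YYYY/MM/DD/[<FunctionVersion>]<InstanceId> convention.
_STREAM_PATTERN = re.compile(
    r"^(?P<date>\d{4}/\d{2}/\d{2})/\[(?P<version>[^\]]+)\](?P<instance_id>.+)$"
)


def parse_log_stream_name(stream_name: str) -> dict:
    """Split a log stream name into its date, function version, and instance ID."""
    match = _STREAM_PATTERN.match(stream_name)
    if match is None:
        raise ValueError(f"Unexpected log stream name: {stream_name!r}")
    return match.groupdict()


if __name__ == "__main__":
    print(log_group_name("my-function"))
    # The instance ID below is a made-up value used only for illustration.
    print(parse_log_stream_name("2024/01/15/[$LATEST]0123456789abcdef"))
```

A helper like this can be useful when you group or filter log streams by function version during troubleshooting.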

We recommend that you use a logging library to help format and classify log messages. For example, if your Lambda function is written in Python, you can use the Python logging module to log messages and control the output format. We also recommend that you log messages in JSON format because you can query them more easily with CloudWatch Logs Insights. They can also be more easily filtered and exported.
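As a rough sketch of JSON logging with the Python logging module, the following formatter serializes each log record as one JSON object per line. The class name `JsonFormatter`, the helper `use_json_logging`, and the field names (`level`, `logger`, `message`, `exception`) are assumptions chosen for this example; adapt them to your own conventions.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the traceback text when the record carries an exception.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def use_json_logging() -> None:
    """Apply the JSON formatter to the root logger's handlers."""
    root = logging.getLogger()
    if not root.handlers:
        # No handler is attached yet (for example, when running locally).
        root.addHandler(logging.StreamHandler())
    for handler in root.handlers:
        handler.setFormatter(JsonFormatter())


if __name__ == "__main__":
    use_json_logging()
    logging.getLogger().setLevel(logging.INFO)
    logging.getLogger("app").info("order %s processed", "42")
```

Because each message is a JSON object, fields such as `level` can be queried directly in CloudWatch Logs Insights instead of being parsed out of free text.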

Another best practice is to set the log output level by using a variable and adjust it based on the environment and your requirements. Your Lambda function's code, in addition to the libraries used, could output a large amount of log data depending on the log output level. This can impact your logging costs and affect performance.

Lambda allows you to set environment variables for your Lambda function runtime environment without updating your code. For example, you can create a LAMBDA_LOG_LEVEL environment variable that defines the log output level that you can retrieve from your code. The following example attempts to retrieve a LAMBDA_LOG_LEVEL environment variable and use the value to define the logging output. If the environment variable is not set, it defaults to the INFO level.

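import logging
from os import getenv

logger = logging.getLogger()
# Read the level name from the environment; default to INFO if it is not set.
log_level = getenv("LAMBDA_LOG_LEVEL", "INFO")
# Map the level name (for example, "DEBUG") to its numeric value.
level = logging.getLevelName(log_level)
logger.setLevel(level)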

Sending logs to other destinations from CloudWatch

You can send logs to other destinations (for example, Amazon OpenSearch Service, formerly Amazon Elasticsearch Service, or a Lambda function) by using subscription filters. If you don’t use Amazon OpenSearch Service, you can use a Lambda function to process the logs and send them to an AWS service of your choice by using the AWS SDKs.
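A Lambda function that receives log events from a subscription filter gets them as a base64-encoded, gzip-compressed JSON payload under `awslogs.data`. The following is a minimal sketch of decoding that payload; the `forward` function is a hypothetical placeholder for whatever SDK call sends the message to your chosen destination, and the return value shape is an assumption for this example.

```python
import base64
import gzip
import json


def decode_subscription_event(event: dict) -> dict:
    """Decode the base64-encoded, gzip-compressed payload from CloudWatch Logs."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def forward(message: str) -> None:
    """Hypothetical placeholder: replace with a call to your destination."""
    print(message)


def handler(event, context):
    payload = decode_subscription_event(event)
    # Only data messages carry log events; skip any other message type.
    if payload.get("messageType") != "DATA_MESSAGE":
        return {"forwarded": 0}
    log_events = payload.get("logEvents", [])
    for log_event in log_events:
        forward(log_event["message"])
    return {"forwarded": len(log_events)}
```

Keeping the decoding step separate from the forwarding step makes it easier to test the function locally with a hand-built event.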

You can also use SDKs for log destinations outside the AWS Cloud in your Lambda function to directly send log statements to a destination of your choice. If you choose this option, we recommend that you consider the impact of the latency, additional processing time, error and retry handling, and coupling of operational logic to your Lambda function.

Lambda function metrics

Lambda lets you run your code without managing or scaling servers, which removes most of the burden of system-level auditing and diagnostics. However, it's still important to understand performance and invocation metrics at the system level for your Lambda functions. This helps you optimize the resource configuration and improve code performance. Effectively monitoring and measuring performance can improve user experience and lower your costs by appropriately sizing your Lambda functions. Typically, workloads running as Lambda functions also have application-level metrics that need to be captured and analyzed. Lambda directly supports the embedded metric format to make capturing application-level CloudWatch metrics easier.

System-level metrics

Lambda automatically integrates with CloudWatch Metrics and provides a set of standard metrics for your Lambda functions. Lambda also provides a separate monitoring dashboard for each Lambda function with these metrics. Two important metrics that you need to monitor are errors and invocation errors. Understanding the differences between invocation errors and other error types helps you diagnose and support Lambda deployments.

Invocation errors prevent your Lambda function from running. These errors occur before your code is run so you can’t implement error handling within your code to identify them. Instead, you should configure alarms for your Lambda functions that detect these errors and notify the operations and workload owners. These errors are often related to a configuration or permission error and can occur because of a change in your configuration or permissions. Invocation errors might initiate a retry, which causes multiple invocations of your function.

A successfully invoked Lambda function returns an HTTP 200 response even if an exception is thrown by the function. Your Lambda functions should implement error handling and raise exceptions so that the Errors metric captures and identifies failed runs of your Lambda function. You should return a formatted response from your Lambda function invocations that includes information to determine whether the run failed completely, partially, or was successful.
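One way to apply this guidance is sketched below: the handler records per-item failures, raises an exception when every item fails so that the run counts as a function error, and otherwise returns a response that distinguishes full success from partial failure. The event shape (`items`), the `process_item` function, and the response fields (`status`, `processed`, `failures`) are assumptions made for this example, not a format that Lambda requires.

```python
import logging

logger = logging.getLogger()


def process_item(item: dict) -> None:
    """Hypothetical unit of work that rejects items without an 'id' key."""
    if "id" not in item:
        raise ValueError("item is missing 'id'")


def handler(event, context):
    items = event.get("items", [])
    failures = []
    for index, item in enumerate(items):
        try:
            process_item(item)
        except ValueError as error:
            # Log the traceback and keep going so other items are still processed.
            logger.exception("Item %d failed", index)
            failures.append({"index": index, "reason": str(error)})

    if items and len(failures) == len(items):
        # Raise so that the run is reported as failed instead of returning normally.
        raise RuntimeError("All items failed")

    return {
        "status": "partial" if failures else "success",
        "processed": len(items) - len(failures),
        "failures": failures,
    }
```

A caller can then inspect `status` and `failures` to decide whether to retry only the failed items.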

CloudWatch provides CloudWatch Lambda Insights, which you can enable for individual Lambda functions. Lambda Insights collects, aggregates, and summarizes system-level metrics (for example, CPU time, memory, disk, and network usage). Lambda Insights also collects, aggregates, and summarizes diagnostic information (for example, cold starts and Lambda worker shutdowns) to help you isolate and quickly resolve issues.

Lambda Insights uses the embedded metric format to automatically emit performance information to the /aws/lambda-insights/ log group with a log stream name prefix based on your Lambda function's name. These performance log events create CloudWatch metrics that are the basis for automatic CloudWatch dashboards. We recommend that you enable Lambda Insights for performance testing and production environments. Additional metrics created by Lambda Insights include memory_utilization, which helps you correctly size Lambda functions so that you avoid paying for capacity that you don't need.

Application metrics

You can also create and capture your own application metrics in CloudWatch by using the embedded metric format. AWS provides libraries that create embedded metric format statements and emit them to CloudWatch. The integrated Lambda CloudWatch logging facility is configured to process and extract metrics from appropriately formatted statements.
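To show what an embedded metric format statement looks like, the following sketch builds one by hand and prints it to standard output, without using the AWS-provided libraries. The function names `build_emf_record` and `emit_metric`, and the sample namespace, metric, and dimension values, are assumptions for this example; in practice the AWS-provided libraries handle these details for you.

```python
import json
import time


def build_emf_record(namespace: str, metric_name: str, value: float,
                     unit: str, dimensions: dict) -> dict:
    """Build a log record that follows the embedded metric format structure."""
    record = {
        "_aws": {
            # Timestamp in milliseconds since the Unix epoch.
            "Timestamp": int(time.time() * 1000),
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    # One dimension set that lists every dimension key.
                    "Dimensions": [list(dimensions.keys())],
                    "Metrics": [{"Name": metric_name, "Unit": unit}],
                }
            ],
        },
        # The metric value is a top-level member named after the metric.
        metric_name: value,
    }
    # Dimension values are top-level members and should be strings.
    record.update(dimensions)
    return record


def emit_metric(namespace: str, metric_name: str, value: float,
                unit: str, dimensions: dict) -> None:
    """Write the record to standard output as one JSON line."""
    print(json.dumps(build_emf_record(namespace, metric_name, value, unit, dimensions)))


if __name__ == "__main__":
    emit_metric("MyApplication", "OrdersProcessed", 3, "Count", {"Service": "checkout"})
```

Because the statement is written to standard output, it travels through the same log stream as the function's other log messages.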