Monitor a Serverless Endpoint

To monitor your serverless endpoint, you can use Amazon CloudWatch alarms. CloudWatch is a service that collects metrics in real time from your AWS applications and resources. An alarm watches a metric as it is collected and lets you specify, in advance, a threshold and the actions to take if that threshold is breached. For example, your CloudWatch alarm can send you a notification if your endpoint breaches an error threshold. By setting up CloudWatch alarms, you gain visibility into the performance and functionality of your endpoint.
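The following is a minimal sketch of the error-threshold alarm described above, written with the AWS SDK for Python (Boto3). The helper names, the alarm name, the one-minute period, the threshold, and the SNS topic ARN are illustrative assumptions, not values from this guide. The sketch also assumes that the Invocations5XXErrors metric is published with the EndpointName and VariantName dimensions. Adjust all of these for your own endpoint.

```python
def build_error_alarm(endpoint_name, variant_name, sns_topic_arn, threshold=1.0):
    """Return keyword arguments for the CloudWatch put_metric_alarm call.

    All argument values are supplied by the caller; nothing here is
    specific to a real endpoint.
    """
    return {
        "AlarmName": f"{endpoint_name}-5xx-errors",  # hypothetical naming scheme
        "Namespace": "AWS/SageMaker",
        "MetricName": "Invocations5XXErrors",
        # Assumed dimensions for endpoint invocation metrics.
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "Statistic": "Sum",
        "Period": 60,             # seconds; an illustrative choice
        "EvaluationPeriods": 1,
        "Threshold": threshold,   # alarm when errors per period reach this value
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # no traffic is not treated as an error
        "AlarmActions": [sns_topic_arn],     # SNS topic that receives the notification
    }


def create_error_alarm(endpoint_name, variant_name, sns_topic_arn):
    """Create or update the alarm. Requires boto3 and AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(
        **build_error_alarm(endpoint_name, variant_name, sns_topic_arn)
    )
```

Keeping the parameters in a separate function lets you inspect or unit test the alarm definition without calling AWS. To create the alarm, call create_error_alarm with your endpoint name, your variant name, and the ARN of an existing SNS topic.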

To learn more about the CloudWatch metrics that you can use to monitor your endpoints in SageMaker, see SageMaker Endpoint Invocation Metrics. The ModelSetupTime metric tracks the cold start time for your serverless endpoint, that is, the time it takes to launch new compute resources for the endpoint. This metric depends on your model size and your container's start-up time. Serverless endpoints can also use the Invocations4XXErrors, Invocations5XXErrors, and Invocations metrics in the AWS/SageMaker namespace, and the MemoryUtilization metric in the /aws/sagemaker/Endpoints namespace. For more information about CloudWatch alarms, see Using Amazon CloudWatch alarms in the Amazon CloudWatch User Guide.
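To see how cold starts behave over time, you can retrieve the ModelSetupTime metric with the AWS SDK for Python (Boto3). The following is a minimal sketch, not a definitive implementation. The helper names, the 24-hour window, the one-hour period, and the chosen statistics are illustrative assumptions. The sketch also assumes that ModelSetupTime is published in the AWS/SageMaker namespace with the EndpointName and VariantName dimensions.

```python
from datetime import datetime, timedelta, timezone


def build_model_setup_time_query(endpoint_name, variant_name, hours=24, now=None):
    """Return keyword arguments for the CloudWatch get_metric_statistics call."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/SageMaker",
        "MetricName": "ModelSetupTime",
        # Assumed dimensions for endpoint invocation metrics.
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 3600,  # one datapoint per hour; an illustrative choice
        "Statistics": ["Average", "Maximum"],
    }


def fetch_model_setup_time(endpoint_name, variant_name, hours=24):
    """Return datapoints sorted by time. Requires boto3 and AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **build_model_setup_time_query(endpoint_name, variant_name, hours)
    )
    # Datapoints are not guaranteed to arrive in time order, so sort them.
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
```

Each returned datapoint includes a Unit field, so you can confirm the unit of the values before you compare them against a latency target.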

If you want to monitor the logs from your endpoint for debugging or progress analysis, you can use Amazon CloudWatch Logs. The SageMaker-provided log group that you can use for serverless endpoints is /aws/sagemaker/Endpoints/[EndpointName]. For more information about using CloudWatch Logs in SageMaker, see Log Amazon SageMaker Events with Amazon CloudWatch. To learn more about CloudWatch Logs, see What is Amazon CloudWatch Logs? in the Amazon CloudWatch Logs User Guide.
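As an illustration, the following sketch builds the log group name for a serverless endpoint and searches it with the AWS SDK for Python (Boto3). The helper names, the "ERROR" filter pattern, and the result limit are assumptions made for this example. Your container might log errors in a different format.

```python
def serverless_log_group(endpoint_name):
    """Return the SageMaker-provided log group name for an endpoint."""
    return f"/aws/sagemaker/Endpoints/{endpoint_name}"


def recent_error_messages(endpoint_name, pattern="ERROR", limit=50):
    """Return log messages that match a filter pattern.

    Requires boto3 and AWS credentials. The default pattern is only an
    example; use the text that your container actually writes on failure.
    """
    import boto3  # imported here so the helper above works without boto3

    logs = boto3.client("logs")
    response = logs.filter_log_events(
        logGroupName=serverless_log_group(endpoint_name),
        filterPattern=pattern,
        limit=limit,
    )
    return [event["message"] for event in response["events"]]
```

This sketch reads only the first page of results. For larger searches, continue with the nextToken value that filter_log_events returns.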