
Getting started with detailed observability

Detailed observability is controlled by the EnableDetailedObservability flag in your endpoint configuration. The behavior of this flag depends on when the endpoint configuration was created. The following sections describe the different cases and how to enable the feature for each.

New endpoint configurations

For endpoint configurations created after June 17, 2026, EnableDetailedObservability defaults to true, so no action is required.

Endpoint detail page showing Observability: Enabled.

Verify via API


aws sagemaker describe-endpoint-config \
    --endpoint-config-name your-config-name \
    --query 'MetricsConfig.EnableDetailedObservability'

Expected output: true

Disable via API

To disable detailed observability, create a new endpoint configuration with the flag set to false, then update your endpoint to use it:


# Create endpoint config with detailed observability disabled
aws sagemaker create-endpoint-config \
    --endpoint-config-name name-no-observability \
    --execution-role-arn role-arn \
    --production-variants '[{"VariantName":"primary","ModelName":"model","InitialInstanceCount":1,"InstanceType":"ml.g5.xlarge"}]' \
    --metrics-config '{"EnableDetailedObservability": false}'

# Update endpoint to use the new config
aws sagemaker update-endpoint \
    --endpoint-name your-endpoint \
    --endpoint-config-name name-no-observability

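Existing endpoints (opt-in)

For existing endpoint configurations, EnableDetailedObservability defaults to false. To opt in, create a new endpoint configuration with the flag set to true, then update your endpoint to use it.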

Via API


# Create a new endpoint config with detailed observability enabled
aws sagemaker create-endpoint-config \
    --endpoint-config-name name-v2 \
    --execution-role-arn role-arn \
    --production-variants '[{"VariantName":"primary","ModelName":"model","InitialInstanceCount":2,"InstanceType":"ml.g5.12xlarge"}]' \
    --metrics-config '{"EnableDetailedObservability": true}'

# Update endpoint to use new config
aws sagemaker update-endpoint \
    --endpoint-name your-endpoint \
    --endpoint-config-name name-v2
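
An endpoint update is asynchronous, so the new configuration is not active the moment update-endpoint returns. As a sketch, the following function waits for the update to finish and then checks which configuration the endpoint is using. The function name confirm_endpoint_config is hypothetical; wait endpoint-in-service and describe-endpoint are existing AWS CLI subcommands, and the endpoint and configuration names are the placeholders used above.

```shell
# Hypothetical helper: succeeds when the endpoint is InService and is using
# the expected endpoint configuration.
#   $1 = endpoint name, $2 = expected endpoint configuration name
confirm_endpoint_config() {
    endpoint_name="$1"
    expected_config="$2"

    # Block until the endpoint reaches InService (the waiter polls DescribeEndpoint).
    aws sagemaker wait endpoint-in-service --endpoint-name "$endpoint_name" || return 1

    # Read the name of the configuration the endpoint currently uses.
    actual_config=$(aws sagemaker describe-endpoint \
        --endpoint-name "$endpoint_name" \
        --query 'EndpointConfigName' \
        --output text)

    [ "$actual_config" = "$expected_config" ]
}
```

For example, `confirm_endpoint_config your-endpoint name-v2 && echo "name-v2 is active"` prints the message only once the update has completed with the new configuration.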

Via console (3-step wizard)

Navigate to SageMaker AI Console → Deployments and inference → Endpoints.
Click Enable detailed observability in the banner.
Step 1: Review the metrics that detailed observability provides. This includes inference framework metrics (TTFT, KV cache, queue depth), GPU health, node health, and lifecycle events. For the complete list, see OpenTelemetry metrics reference. Enabling this feature also activates the SageMaker AI Insights dashboard, an auto-generated dashboard in Amazon CloudWatch that displays these metrics along with a health overview across all your endpoints.
Step 2: Enable OTel enrichment in your Amazon CloudWatch account settings. This step is required so that your metrics are queryable via PromQL in CloudWatch Query Studio and Amazon Managed Grafana. The wizard provides instructions and a direct link to the CloudWatch Settings page.
Step 3: Select the endpoints you want to enable detailed observability on and confirm. The console creates new endpoint configurations with EnableDetailedObservability set to true and applies them to your selected endpoints.

MetricsConfig API parameters

Set the following MetricsConfig parameters on the endpoint configuration when you call CreateEndpointConfig:

MetricsConfig parameters
Parameter	Type	Required	Default	Description
`EnableDetailedObservability`	Boolean	No	`false` (existing), `true` (new)	Enables OTel-based metric collection
`EnableEnhancedMetrics`	Boolean	No	`false`	Enables instance-level dimensions for legacy CloudWatch metrics
`MetricPublishFrequencyInSeconds`	Integer	No	`60`	Scrape interval in seconds. Valid values: 10, 30, 60, 120, 180, 240, 300
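
Because MetricPublishFrequencyInSeconds accepts only a fixed set of values, it can help to check the interval before calling create-endpoint-config. The following is a minimal sketch. The function name build_metrics_config is hypothetical and not part of the AWS CLI; it relies only on the parameter names and valid values listed in the table above.

```shell
# Hypothetical helper (not part of the AWS CLI): prints a MetricsConfig JSON
# string for --metrics-config, or fails if the interval is not a listed value.
#   $1 = publish frequency in seconds
build_metrics_config() {
    frequency="$1"
    case "$frequency" in
        10|30|60|120|180|240|300) ;;
        *)
            echo "Invalid MetricPublishFrequencyInSeconds: $frequency" >&2
            return 1
            ;;
    esac
    printf '{"EnableDetailedObservability": true, "MetricPublishFrequencyInSeconds": %s}\n' "$frequency"
}
```

The output can be passed straight to the CLI, for example `--metrics-config "$(build_metrics_config 30)"` for a 30-second scrape interval.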

Relationship to enhanced metrics

EnableDetailedObservability and EnableEnhancedMetrics are separate features that can coexist on the same endpoint:

Enhanced metrics vs. detailed observability
Feature	`EnableEnhancedMetrics`	`EnableDetailedObservability`
Purpose	Instance-level and container-level dimensions for legacy CloudWatch metrics	Full OTel-based metric collection with PromQL support
Metrics store	CloudWatch classic metrics (namespace/dimension model)	OpenTelemetry metrics (label-based, PromQL-queryable)
Query language	CloudWatch Metrics API	PromQL
GPU metrics	`GPUUtilization` (with InstanceId, ContainerId, AcceleratorId dimensions)	`DCGM_FI_DEV_GPU_UTIL` (GPU utilization %), `DCGM_FI_DEV_MEM_COPY_UTIL` (memory copy utilization %), `DCGM_FI_DEV_GPU_TEMP` (GPU temperature), `DCGM_FI_DEV_MEMORY_TEMP` (memory temperature), `DCGM_FI_DEV_FB_FREE` (framebuffer memory free), `DCGM_FI_DEV_FB_USED` (framebuffer memory used), `DCGM_FI_DEV_SM_ACTIVE` (streaming multiprocessor active %) — all per-GPU
Token metrics	Not available	TTFT (time to first token), ITL (inter-token latency), KV cache, queue depth, TPS (tokens per second)

Both flags can be enabled simultaneously. They publish to different metric stores and do not conflict.

For more information about GPU metrics available through the DCGM exporter, refer to the Data Center GPU Manager exporter documentation.

Configure for custom containers (BYOC)

If you are using a custom container (bring your own container), the platform cannot automatically detect where your container exposes Prometheus metrics. You must specify the metrics endpoint path using ContainerMetricsConfig so that the OTel Collector knows where to scrape.

Note

Your container must expose metrics in Prometheus format on port 8080. The default metrics path is /metrics. If your container uses a different path, configure ContainerMetricsConfig with the custom path.

You still need to set EnableDetailedObservability and MetricPublishFrequencyInSeconds in the endpoint configuration. Then, set ContainerMetricsConfig on the inference component or production variant with your custom metrics path:


{
    "ContainerMetricsConfig": {
        "MetricsEndpoints": [
            {
                "MetricsEndpointPath": "/metrics"
            }
        ]
    }
}
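
Before deploying a custom container, it can be useful to confirm locally that it serves Prometheus text on the expected port and path. The following sketch assumes the container is running on your machine with port 8080 published. The function name has_prometheus_samples is hypothetical, and the pattern is a loose check for at least one sample line, not a full validator of the Prometheus exposition format.

```shell
# Hypothetical helper: reads metrics text on stdin and succeeds if at least one
# line looks like a Prometheus sample: metric name, optional {labels}, then a value.
# Comment lines (# HELP, # TYPE) are ignored because they start with '#'.
has_prometheus_samples() {
    grep -Eq '^[a-zA-Z_:][a-zA-Z0-9_:]*(\{[^}]*\})?[[:space:]]+[-+]?([0-9]|NaN|Inf|\.)'
}

# Example usage against a locally running container (assumed to publish port 8080):
#   curl -s http://localhost:8080/metrics | has_prometheus_samples && echo "metrics OK"
```

If your container serves metrics on a different path, substitute that path in the curl command and set the same value in MetricsEndpointPath.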

Enabling OTel enrichment in Amazon CloudWatch

To query metrics via PromQL (required for the SageMaker AI Insights dashboard and Amazon Managed Grafana), enable OTel enrichment at the account level.

Important

OTel metric enrichment converts CloudWatch metrics into OpenTelemetry format and enriches each data point with AWS resource tags and account metadata. Enriched metrics are ingested at $0.50 per GB. Actual bytes per data point depend on the number and size of resource tags applied to your AWS resources. For details, see Amazon CloudWatch Pricing.
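
Because enriched metrics are billed by ingested volume, a rough estimate can be worked out from the number of metric series, the publish interval, and an assumed size per data point. The sketch below is illustrative only: the function name, the 30-day month, the decimal gigabyte (10^9 bytes), and the bytes-per-data-point input are all assumptions. Only the $0.50 per GB rate comes from the text above.

```shell
# Hypothetical estimator for monthly enrichment ingestion cost in USD.
#   $1 = number of metric series (assumed)
#   $2 = publish interval in seconds
#   $3 = assumed average bytes per enriched data point
estimate_monthly_enrichment_cost() {
    awk -v series="$1" -v interval="$2" -v bytes="$3" 'BEGIN {
        points_per_month = series * (30 * 24 * 3600 / interval)  # 30-day month assumed
        gigabytes = points_per_month * bytes / 1e9               # 1 GB = 10^9 bytes assumed
        printf "%.2f\n", gigabytes * 0.50                        # USD 0.50 per GB ingested
    }'
}

estimate_monthly_enrichment_cost 1000 60 500
```

With these assumed inputs (1,000 series, a 60-second interval, and 500 bytes per data point), the call prints 10.80, or about $10.80 per month. Actual data point sizes vary with your resource tags, so measure real payloads before relying on the figure.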

Via CloudWatch console

Open the Amazon CloudWatch console.
Choose Settings in the left navigation pane.
Enable OTel metric enrichment.
Enable Resource tags for telemetry.

CloudWatch Settings page with OTel metric enrichment and Resource tags for telemetry enabled.

Via AWS CLI


# Enable OTel enrichment
aws cloudwatch start-otel-enrichment

# Enable resource tags for telemetry
aws observabilityadmin start-telemetry-enrichment

# Verify
aws cloudwatch get-otel-enrichment-status

What enrichment adds

Every metric is automatically tagged with AWS resource context:

Enrichment attributes
Attribute	Description	Example
`@aws.account`	AWS account ID	`123456789012`
`@aws.region`	AWS Region	`us-west-2`
`cloud.resource_id`	Full resource ARN	`arn:aws:sagemaker:us-west-2:123456789012:endpoint/my-ep`
Resource tags	Tags from AWS Resource Explorer	`env=production, team=ml`
