Open the Amazon SageMaker Debugger Insights Dashboard

Open the Debugger insights dashboard in Studio to see profiling progress, resource utilization results, and system bottlenecks for your training job running on Amazon EC2 instances.


The Studio Debugger insights dashboard runs a Studio app on an ml.m5.4xlarge instance to process and render the visualizations. Each Debugger insights tab runs one Studio kernel session, and the kernel sessions for all open Debugger insights tabs run on that single instance. When you close a Debugger insights tab, the corresponding kernel session is also closed. However, the Studio app remains active and continues to accrue charges for the ml.m5.4xlarge instance usage. For information about pricing, see the Amazon SageMaker Pricing page.


When you are done using the Debugger insights dashboard, you must shut down the ml.m5.4xlarge instance to avoid accruing charges. For instructions on how to shut down the instance, see Shut Down the Amazon SageMaker Debugger Insights Instance.
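
If you want to check outside the Studio UI whether an app is still running on the instance, the following is a minimal sketch that uses the boto3 SageMaker client calls list_apps and describe_app. It assumes that the Debugger insights app is listed as a regular Studio app under your domain and user profile. The helper name find_apps_on_instance is hypothetical and is not part of any AWS SDK.

```python
def find_apps_on_instance(sm_client, domain_id, user_profile,
                          instance_type="ml.m5.4xlarge"):
    """Return (AppType, AppName) pairs for apps running on instance_type.

    sm_client is expected to behave like boto3.client("sagemaker").
    Assumption: the Debugger insights app shows up in ListApps for the
    given Studio domain and user profile.
    """
    matches = []
    kwargs = {"DomainIdEquals": domain_id, "UserProfileNameEquals": user_profile}
    while True:
        page = sm_client.list_apps(**kwargs)
        for app in page.get("Apps", []):
            # Skip apps that are already gone or on their way out.
            if app.get("Status") in ("Deleted", "Deleting"):
                continue
            detail = sm_client.describe_app(
                DomainId=domain_id,
                UserProfileName=user_profile,
                AppType=app["AppType"],
                AppName=app["AppName"],
            )
            if detail.get("ResourceSpec", {}).get("InstanceType") == instance_type:
                matches.append((app["AppType"], app["AppName"]))
        # Follow pagination until no NextToken is returned.
        token = page.get("NextToken")
        if not token:
            return matches
        kwargs["NextToken"] = token
```

With boto3 installed and AWS credentials configured, you can pass boto3.client("sagemaker") together with your own Studio domain ID and user profile name. The sketch only lists matching apps and does not shut anything down. Follow the linked shutdown instructions to stop the instance.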

To open the Debugger insights dashboard

  1. Choose the SageMaker Components and registries icon.

  2. Open the dropdown list, and choose Experiments and trials.

  3. Look up your training job name. If you have not assigned a SageMaker Experiments trial component to the training job, the job is collected under the Unassigned trial components list.

  4. Right-click the training job trial component (or use an equivalent UI interaction) to open its context menu. Two menu items give access to the Debugger features in Studio: Open Debugger for insights and Open in trial details.

  5. Choose Open Debugger for insights. This opens a Debug [your-training-job-name] tab. On this tab, Debugger provides an overview of your model training performance on Amazon EC2 instances and identifies system bottleneck problems. While monitoring the system resource utilization, you can also enable profiling to capture framework metrics that consist of data from neural network operations executed during the forward and backward pass and data loading. For more information about how to enable profiling using the Debugger insights dashboard controller, see Enable and Configure Debugger Profiling for Detailed Insights.
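
The profiling state that the dashboard shows for a training job can also be inspected programmatically. The following is a minimal sketch that reads the profiling-related fields of a DescribeTrainingJob response (ProfilingStatus, ProfilerConfig, and ProfilerRuleEvaluationStatuses). The helper name profiling_summary and the shape of the returned dictionary are illustrative assumptions, not part of the SageMaker API.

```python
def profiling_summary(sm_client, training_job_name):
    """Summarize the profiling state of a training job.

    sm_client is expected to behave like boto3.client("sagemaker").
    Fields that are missing from the response are reported as None
    (or as an empty dict for the rules).
    """
    job = sm_client.describe_training_job(TrainingJobName=training_job_name)
    return {
        "job_status": job.get("TrainingJobStatus"),
        "profiling_status": job.get("ProfilingStatus"),
        # S3 location where the profiler writes its output, if configured.
        "profiler_output": job.get("ProfilerConfig", {}).get("S3OutputPath"),
        # Map each profiler rule to its current evaluation status.
        "rules": {
            rule["RuleConfigurationName"]: rule.get("RuleEvaluationStatus")
            for rule in job.get("ProfilerRuleEvaluationStatuses", [])
        },
    }
```

With boto3 installed and credentials configured, calling profiling_summary(boto3.client("sagemaker"), "your-training-job-name") returns a small dictionary that you can print or log. The job name here is a placeholder.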

Debugger correlates the system resource utilization metrics with the framework metrics and helps identify resource-intensive operators that might be the root cause of the system bottlenecks. You can also download an aggregated Debugger profiling report. For more information, see Amazon SageMaker Debugger Insights Dashboard Controller.
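
If you prefer to fetch the report without the dashboard, the following is a minimal sketch that searches an S3 prefix for the report file using the boto3 S3 client call list_objects_v2. It assumes that the report is written to your training job's rule output location in S3 under the file name profiler-report.html. The helper name find_profiler_reports, the bucket, and the prefix are placeholders.

```python
def find_profiler_reports(s3_client, bucket, prefix,
                          filename="profiler-report.html"):
    """Return the S3 keys under prefix whose file name equals filename.

    s3_client is expected to behave like boto3.client("s3").
    Assumption: the profiling report is stored under the given prefix
    with the file name profiler-report.html.
    """
    keys = []
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3_client.list_objects_v2(**kwargs)
        for obj in page.get("Contents", []):
            if obj["Key"].endswith("/" + filename):
                keys.append(obj["Key"])
        # Continue while S3 reports that more results are available.
        if not page.get("IsTruncated"):
            return keys
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
```

Each returned key can then be downloaded with s3_client.download_file(bucket, key, "profiler-report.html") when s3_client is a real boto3 S3 client.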