Track the lineage of a pipeline
In this tutorial, you use Amazon SageMaker Studio to track the lineage of an Amazon SageMaker AI ML Pipeline.
The pipeline was created by the Orchestrating Jobs with Amazon SageMaker Model Building Pipelines notebook.
Lineage tracking in Studio is centered around a directed acyclic graph (DAG). The DAG represents the steps in a pipeline. From the DAG you can track the lineage from any step to any other step. The following diagram displays the steps in the pipeline. These steps appear as a DAG in Studio.
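To make the idea of tracing lineage through a DAG concrete, the following is a minimal sketch in Python. It is not how Studio implements lineage tracking. The step names come from the Studio Classic procedure later in this tutorial, but the edges between them and the helper `lineage_path` are assumptions made for illustration; the real pipeline may contain more steps.

```python
from collections import deque

# Hypothetical edges (assumption): each step maps to the steps that consume its output.
PIPELINE_DAG = {
    "AbaloneProcess": ["AbaloneTrain"],
    "AbaloneTrain": ["AbaloneRegisterModel"],
    "AbaloneRegisterModel": [],
}


def lineage_path(dag, source, target):
    """Return one shortest downstream path from source to target, or None."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for next_step in dag.get(path[-1], []):
            if next_step not in visited:
                visited.add(next_step)
                queue.append(path + [next_step])
    return None


if __name__ == "__main__":
    print(lineage_path(PIPELINE_DAG, "AbaloneProcess", "AbaloneRegisterModel"))
```

Because the graph is directed, a path exists only in the downstream direction: asking for a path from `AbaloneRegisterModel` back to `AbaloneProcess` returns `None` in this sketch.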
To track the lineage of a pipeline in the Amazon SageMaker Studio console, complete the following steps based on whether you use Studio or Studio Classic.
- Studio
  To track the lineage of a pipeline
  - Open the SageMaker Studio console by following the instructions in Launch Amazon SageMaker Studio.
  - In the left navigation pane, select Pipelines.
  - (Optional) To filter the list of pipelines by name, enter a full or partial pipeline name in the search field.
  - In the Name column, select a pipeline name to view details about the pipeline.
  - Choose the Executions tab.
  - In the Name column of the Executions table, select the name of a pipeline execution to view.
  - At the top right of the Executions page, choose the vertical ellipsis and choose Download pipeline definition (JSON). You can view the file to see how the pipeline graph was defined.
  - Choose Edit to open the Pipeline Designer.
  - Use the resizing and zoom controls at the top right corner of the canvas to zoom in and out of the graph, fit the graph to screen, or expand the graph to full screen.
  - To view your training, validation, and test datasets, complete the following steps:
    - Choose the Processing step in your pipeline graph.
    - In the right sidebar, choose the Overview tab.
    - In the Files section, find the Amazon S3 paths to the training, validation, and test datasets.
  - To view your model artifacts, complete the following steps:
    - Choose the Training step in your pipeline graph.
    - In the right sidebar, choose the Overview tab.
    - In the Files section, find the Amazon S3 paths to the model artifact.
  - To find the model package ARN, complete the following steps:
    - Choose the Register model step.
    - In the right sidebar, choose the Overview tab.
    - In the Files section, find the ARN of the model package.
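The downloaded pipeline definition from the procedure above can also be inspected outside the console. The following Python sketch derives which steps feed which by looking for `{"Get": "Steps.<name>..."}` property references and `DependsOn` entries. It assumes the file uses a top-level `Steps` list with `Name` and `Arguments` keys; check this against your own file. `SAMPLE_DEFINITION` is a hypothetical, heavily trimmed stand-in for a real definition, and `step_dependencies` is an illustrative helper, not part of any SageMaker API.

```python
import json
import re

# Hypothetical, trimmed stand-in for a downloaded definition file (assumption).
SAMPLE_DEFINITION = json.loads("""
{
  "Version": "2020-12-01",
  "Steps": [
    {"Name": "AbaloneProcess", "Type": "Processing", "Arguments": {}},
    {"Name": "AbaloneTrain", "Type": "Training",
     "Arguments": {"InputDataConfig": [{"DataSource": {"S3DataSource": {
       "S3Uri": {"Get": "Steps.AbaloneProcess.ProcessingOutputConfig.Outputs['train'].S3Output.S3Uri"}}}}]}},
    {"Name": "AbaloneRegisterModel", "Type": "RegisterModel",
     "Arguments": {"ModelDataUrl": {"Get": "Steps.AbaloneTrain.ModelArtifacts.S3ModelArtifacts"}}}
  ]
}
""")

# Matches the step name at the start of a property reference such as "Steps.AbaloneTrain.…".
STEP_REF = re.compile(r"^Steps\.([^.]+)\.")


def referenced_steps(node):
    """Yield step names referenced by {"Get": "Steps.<name>..."} anywhere inside node."""
    if isinstance(node, dict):
        ref = node.get("Get")
        if isinstance(ref, str):
            match = STEP_REF.match(ref)
            if match:
                yield match.group(1)
        for value in node.values():
            yield from referenced_steps(value)
    elif isinstance(node, list):
        for item in node:
            yield from referenced_steps(item)


def step_dependencies(definition):
    """Map each top-level step name to the set of steps it depends on."""
    deps = {}
    for step in definition.get("Steps", []):
        upstream = set(step.get("DependsOn", []))
        upstream.update(referenced_steps(step.get("Arguments", {})))
        deps[step["Name"]] = upstream
    return deps


if __name__ == "__main__":
    for name, upstream in step_dependencies(SAMPLE_DEFINITION).items():
        print(name, "<-", sorted(upstream))
```

To run this against a real file, replace `SAMPLE_DEFINITION` with the result of `json.load` on the downloaded definition. The sketch only looks at top-level steps, so references inside nested steps (for example, under a condition step) are attributed to the enclosing step.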
- Studio Classic
  To track the lineage of a pipeline
  - Sign in to Amazon SageMaker Studio Classic. For more information, see Launch Amazon SageMaker Studio Classic.
  - In the left sidebar of Studio, choose the Home icon.
  - In the menu, select Pipelines.
  - Use the Search box to filter the pipelines list.
  - Choose the AbalonePipeline pipeline to view the execution list and other details about the pipeline. Choose the Property Inspector icon in the right sidebar to open the TABLE PROPERTIES pane, where you can choose which properties to view.
  - Choose the Settings tab and then choose Download pipeline definition file. You can view the file to see how the pipeline graph was defined.
  - On the Execution tab, select the first row in the execution list to view its execution graph and other details about the execution. Note that the graph matches the diagram displayed at the beginning of the tutorial.
    Use the resizing icons on the lower-right side of the graph to zoom in and out of the graph, fit the graph to screen, or expand the graph to full screen. To focus on a specific part of the graph, you can select a blank area of the graph and drag the graph to center on that area. The inset on the lower-right side of the graph displays your location in the graph.
  - On the Graph tab, choose the AbaloneProcess step to view details about the step.
  - Find the Amazon S3 paths to the training, validation, and test datasets in the Output tab, under Files.
    Note: To get the full paths, right-click the path and then choose Copy cell contents.
    s3://sagemaker-eu-west-1-acct-id/sklearn-abalone-process-2020-12-05-17-28-28-509/output/train
    s3://sagemaker-eu-west-1-acct-id/sklearn-abalone-process-2020-12-05-17-28-28-509/output/validation
    s3://sagemaker-eu-west-1-acct-id/sklearn-abalone-process-2020-12-05-17-28-28-509/output/test
  - Choose the AbaloneTrain step.
  - Find the Amazon S3 path to the model artifact in the Output tab, under Files:
    s3://sagemaker-eu-west-1-acct-id/AbaloneTrain/pipelines-6locnsqz4bfu-AbaloneTrain-NtfEpI0Ahu/output/model.tar.gz
  - Choose the AbaloneRegisterModel step.
  - Find the ARN of the model package in the Output tab, under Files:
    arn:aws:sagemaker:eu-west-1:acct-id:model-package/abalonemodelpackagegroupname/2
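Once you have copied the Amazon S3 paths and the model package ARN from the steps above, you may want to split them into their parts, for example to pass a bucket and key to other tooling. The following is a minimal sketch that uses only the Python standard library. The helper names `split_s3_uri` and `split_model_package_arn` are hypothetical, `acct-id` is the placeholder from the example output rather than a real account ID, and the ARN helper assumes the versioned `model-package/<group>/<version>` form shown above.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into a (bucket, key) tuple."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def split_model_package_arn(arn):
    """Split a versioned model package ARN into its components."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    _, partition, service, region, account, resource = parts
    # Assumption: the resource has the form model-package/<group>/<version>.
    resource_type, group, version = resource.split("/")
    if resource_type != "model-package":
        raise ValueError(f"not a model package ARN: {arn!r}")
    return {
        "partition": partition,
        "service": service,
        "region": region,
        "account": account,
        "group": group,
        "version": int(version),
    }


if __name__ == "__main__":
    # Example values copied from the Output tab in the steps above.
    print(split_s3_uri(
        "s3://sagemaker-eu-west-1-acct-id/AbaloneTrain/"
        "pipelines-6locnsqz4bfu-AbaloneTrain-NtfEpI0Ahu/output/model.tar.gz"
    ))
    print(split_model_package_arn(
        "arn:aws:sagemaker:eu-west-1:acct-id:model-package/abalonemodelpackagegroupname/2"
    ))
```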