SageMaker AI Components for Kubeflow Pipelines
With SageMaker AI components for Kubeflow Pipelines, you can create and monitor native SageMaker AI training, tuning, endpoint deployment, and batch transform jobs from your Kubeflow Pipelines. By running Kubeflow Pipeline jobs on SageMaker AI, you move data processing and training jobs from the Kubernetes cluster to SageMaker AI's machine learning-optimized managed service. This document assumes prior knowledge of Kubernetes and Kubeflow.
Contents
- What are Kubeflow Pipelines?
- What are Kubeflow Pipeline components?
- Why use SageMaker AI Components for Kubeflow Pipelines?
- SageMaker AI Components for Kubeflow Pipelines versions
- List of SageMaker AI Components for Kubeflow Pipelines
- IAM permissions
- Converting pipelines to use SageMaker AI
- Install Kubeflow Pipelines
- Use SageMaker AI components
What are Kubeflow Pipelines?
Kubeflow Pipelines (KFP) is a platform for building and deploying portable, scalable machine learning (ML) workflows based on Docker containers. The Kubeflow Pipelines platform consists of the following:
- A user interface (UI) for managing and tracking experiments, jobs, and runs.
- An engine (Argo) for scheduling multi-step ML workflows.
- An SDK for defining and manipulating pipelines and components.
- Notebooks for interacting with the system using the SDK.
A pipeline is a description of an ML workflow expressed as a directed acyclic graph (DAG). For more information on Kubeflow Pipelines, see the Kubeflow Pipelines documentation.
What are Kubeflow Pipeline components?
A Kubeflow Pipeline component is a set of code used to execute one step of a Kubeflow pipeline. Components are represented by a Python module built into a Docker image. When the pipeline runs, the component's container is instantiated on one of the worker nodes on the Kubernetes cluster running Kubeflow, and your logic is executed. Pipeline components can read outputs from the previous components and create outputs that the next component in the pipeline can consume. These components make it fast and easy to write pipelines for experimentation and production environments without having to interact with the underlying Kubernetes infrastructure.
You can use SageMaker AI Components in your Kubeflow pipeline. Rather than encapsulating your logic in a custom container, you simply load the components and describe your pipeline using the Kubeflow Pipelines SDK. When the pipeline runs, your instructions are translated into a SageMaker AI job or deployment. The workload then runs on the fully managed infrastructure of SageMaker AI.
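The following is a minimal sketch of this workflow, assuming the KFP v1 SDK. The component URL and parameter names below are illustrative placeholders, not a component's actual signature; point the loader at the component.yaml of the component you want to use.

```python
import kfp
from kfp import components, dsl

# Load a SageMaker AI component definition. The URL below is illustrative;
# use the component.yaml of the component and version you need from the
# Kubeflow Pipelines GitHub repository.
sagemaker_train_op = components.load_component_from_url(
    "https://raw.githubusercontent.com/kubeflow/pipelines/master/"
    "components/aws/sagemaker/TrainingJob/component.yaml"
)

@dsl.pipeline(name="sagemaker-training-example")
def training_pipeline(region: str = "us-east-1"):
    # This step runs as a SageMaker AI job, not as cluster compute.
    # Parameter names depend on the component definition you loaded;
    # "region" here is a placeholder, not the full signature.
    sagemaker_train_op(region=region)

if __name__ == "__main__":
    # Compile to a package you can upload through the Kubeflow Pipelines UI.
    kfp.compiler.Compiler().compile(training_pipeline, "training_pipeline.yaml")
```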
Why use SageMaker AI Components for Kubeflow Pipelines?
SageMaker AI Components for Kubeflow Pipelines offer an alternative way to launch your compute-intensive jobs on SageMaker AI. The components integrate SageMaker AI with the portability and orchestration of Kubeflow Pipelines. Using the SageMaker AI Components for Kubeflow Pipelines, you can create and monitor your SageMaker AI resources as part of a Kubeflow Pipelines workflow. Each of the jobs in your pipelines runs on SageMaker AI instead of the local Kubernetes cluster, allowing you to take advantage of key SageMaker AI features such as data labeling, large-scale hyperparameter tuning and distributed training jobs, or one-click secure and scalable model deployment. The job parameters, status, logs, and outputs from SageMaker AI are still accessible from the Kubeflow Pipelines UI.
The SageMaker AI components integrate key SageMaker AI features into your ML workflows, from preparing data to building, training, and deploying ML models. You can create a Kubeflow Pipeline built entirely from these components, or integrate individual components into your workflow as needed. Each component is available in one or both of two versions, and each version leverages a different backend. For more information on those versions, see SageMaker AI Components for Kubeflow Pipelines versions.
There is no additional charge for using SageMaker AI Components for Kubeflow Pipelines. You incur charges for any SageMaker AI resources you use through these components.
SageMaker AI Components for Kubeflow Pipelines versions
SageMaker AI Components for Kubeflow Pipelines come in two versions. Each version leverages a different backend to create and manage resources on SageMaker AI.
- Version 1 (v1.x and below) of SageMaker AI Components for Kubeflow Pipelines uses Boto3 (AWS SDK for Python) as its backend.
- Version 2 (v2.0.0-alpha2 and above) of SageMaker AI Components for Kubeflow Pipelines uses the SageMaker AI Operator for Kubernetes (ACK). AWS introduced ACK to facilitate a Kubernetes-native way of managing AWS Cloud resources. ACK includes a set of AWS service-specific controllers, one of which is the SageMaker AI controller. The SageMaker AI controller makes it easier for machine learning developers and data scientists using Kubernetes as their control plane to train, tune, and deploy machine learning (ML) models in SageMaker AI. For more information, see SageMaker AI Operators for Kubernetes.
Both versions of the SageMaker AI Components for Kubeflow Pipelines are supported. However, version 2 provides some additional advantages. In particular, it offers:

- A consistent experience for managing your SageMaker AI resources from any application, whether you are using Kubeflow Pipelines, the Kubernetes CLI (kubectl), or other Kubeflow applications such as Notebooks.
- The flexibility to manage and monitor your SageMaker AI resources outside of the Kubeflow pipeline workflow.
- Zero setup time to use the SageMaker AI components if you deployed the full Kubeflow on AWS release, since the SageMaker AI Operator is part of its deployment.
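Because version 2 creates SageMaker AI resources as Kubernetes custom resources through ACK, you can inspect them with standard Kubernetes tooling outside of any pipeline run. The following is a hedged sketch using the official Kubernetes Python client; the API group, version, plural, and status field are assumptions based on the ACK SageMaker AI controller at the time of writing, so verify them against the CRDs installed in your cluster.

```python
from kubernetes import client, config

# Use the same credentials and context that kubectl uses.
config.load_kube_config()

api = client.CustomObjectsApi()

# List training jobs created through the ACK SageMaker AI controller.
# Group/version/plural are assumptions; confirm with `kubectl get crds`.
jobs = api.list_namespaced_custom_object(
    group="sagemaker.services.k8s.aws",
    version="v1alpha1",
    namespace="kubeflow",
    plural="trainingjobs",
)

for job in jobs.get("items", []):
    name = job["metadata"]["name"]
    status = job.get("status", {}).get("trainingJobStatus", "Unknown")
    print(f"{name}: {status}")
```

This is the Python equivalent of running `kubectl get trainingjobs -n kubeflow`.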
List of SageMaker AI Components for Kubeflow Pipelines
The following is a list of all SageMaker AI Components for Kubeflow Pipelines and their available versions. Alternatively, you can find all SageMaker AI Components for Kubeflow Pipelines on GitHub.
Note
We encourage you to use version 2 of a SageMaker AI component wherever it is available.
- Ground Truth

  The Ground Truth component enables you to submit SageMaker AI Ground Truth labeling jobs directly from a Kubeflow Pipelines workflow.

  Version 1 of the component: SageMaker AI Ground Truth Kubeflow Pipelines component version 1.
  Version 2 of the component: Not available.
- Workteam

  The Workteam component enables you to create SageMaker AI private workteam jobs directly from a Kubeflow Pipelines workflow.

  Version 1 of the component: SageMaker AI create private workteam Kubeflow Pipelines component version 1.
  Version 2 of the component: Not available.
- Processing

  The Processing component enables you to submit processing jobs to SageMaker AI directly from a Kubeflow Pipelines workflow.

  Version 1 of the component: SageMaker AI Processing Kubeflow Pipelines component version 1.
  Version 2 of the component: Not available.
- Training

  The Training component allows you to submit SageMaker AI training jobs directly from a Kubeflow Pipelines workflow.

  Version 1 of the component: SageMaker AI Training Kubeflow Pipelines component version 1.
  Version 2 of the component: SageMaker AI Training Kubeflow Pipelines component version 2.

- Hyperparameter Optimization

  The Hyperparameter Optimization component enables you to submit hyperparameter tuning jobs to SageMaker AI directly from a Kubeflow Pipelines workflow.

  Version 1 of the component: SageMaker AI hyperparameter optimization Kubeflow Pipeline component version 1.
  Version 2 of the component: Not available.
- Hosting Deploy

  The Hosting components allow you to deploy a model using SageMaker AI hosting services from a Kubeflow Pipelines workflow.

  Version 1 of the component: SageMaker AI Hosting Services - Create Endpoint Kubeflow Pipeline component version 1.
  Version 2 of the component: consists of the three sub-components needed to create a hosting deployment on SageMaker AI (see the sketch after this list):

  - A SageMaker AI Model Kubeflow Pipelines component version 2, responsible for the model artifacts and the model image registry path that contains the inference code.
  - A SageMaker AI Endpoint Configuration Kubeflow Pipelines component version 2, responsible for defining the configuration of the endpoint such as the instance type, models, number of instances, and serverless inference option.
  - A SageMaker AI Endpoint Kubeflow Pipelines component version 2, responsible for creating or updating the endpoint on SageMaker AI as specified in the endpoint configuration.
- Batch Transform

  The Batch Transform component allows you to run inference jobs for an entire dataset in SageMaker AI from a Kubeflow Pipelines workflow.

  Version 1 of the component: SageMaker AI Batch Transform Kubeflow Pipeline component version 1.
  Version 2 of the component: Not available.
- Model Monitor

  The Model Monitor components allow you to monitor the quality of SageMaker AI machine learning models in production from a Kubeflow Pipelines workflow.

  Version 1 of the component: Not available.
  Version 2 of the component: consists of four sub-components for monitoring drift in a model:

  - A SageMaker AI Data Quality Job Definition Kubeflow Pipelines component version 2, responsible for monitoring drift in data quality.
  - A SageMaker AI Model Quality Job Definition Kubeflow Pipelines component version 2, responsible for monitoring drift in model quality metrics.
  - A SageMaker AI Model Bias Job Definition Kubeflow Pipelines component version 2, responsible for monitoring bias in a model's predictions.
  - A SageMaker AI Model Explainability Job Definition Kubeflow Pipelines component version 2, responsible for monitoring drift in feature attribution.

  Additionally, for on-schedule monitoring at a specified frequency, a fifth component, the SageMaker AI Monitoring Schedule Kubeflow Pipelines component version 2, is responsible for monitoring the data collected from a real-time endpoint on a schedule (see the second sketch after this list). For more information on Amazon SageMaker Model Monitor, see Data and model quality monitoring with Amazon SageMaker Model Monitor.
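To illustrate how the three version 2 Hosting sub-components fit together, here is a hedged sketch of a pipeline that chains them with the KFP SDK. The component file paths, parameter names, and output names are illustrative placeholders; check each component's definition for its actual signature.

```python
from kfp import components, dsl

# Illustrative local paths; use the component.yaml files for the Model,
# EndpointConfig, and Endpoint components from the Kubeflow Pipelines repo.
model_op = components.load_component_from_file("Model/component.yaml")
endpoint_config_op = components.load_component_from_file(
    "EndpointConfig/component.yaml")
endpoint_op = components.load_component_from_file("Endpoint/component.yaml")

@dsl.pipeline(name="sagemaker-hosting-example")
def hosting_pipeline(region: str = "us-east-1"):
    # 1. Register the model artifacts and inference image.
    model = model_op(region=region)
    # 2. Define the endpoint configuration (instance type, models, and so on).
    #    The parameter and output names here are placeholders.
    config = endpoint_config_op(
        region=region,
        model_name=model.outputs["sagemaker_resource_name"],
    )
    # 3. Create or update the endpoint from that configuration.
    endpoint_op(
        region=region,
        endpoint_config_name=config.outputs["sagemaker_resource_name"],
    )
```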
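For orientation, the Monitoring Schedule component ultimately corresponds to the SageMaker AI CreateMonitoringSchedule API. The following is a minimal Boto3 sketch of that call, assuming a data quality job definition already exists; both resource names below are illustrative placeholders.

```python
import boto3

sm = boto3.client("sagemaker", region_name="us-east-1")

# Run the (pre-existing) data quality job definition hourly against data
# captured from a real-time endpoint. Both names are placeholders.
sm.create_monitoring_schedule(
    MonitoringScheduleName="example-data-quality-schedule",
    MonitoringScheduleConfig={
        "ScheduleConfig": {"ScheduleExpression": "cron(0 * ? * * *)"},
        "MonitoringJobDefinitionName": "example-data-quality-job-definition",
        "MonitoringType": "DataQuality",
    },
)
```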
IAM permissions
Deploying Kubeflow Pipelines with SageMaker AI components requires the following three layers of authentication:

- An IAM role granting your gateway node (which can be your local machine or a remote instance) access to the Amazon Elastic Kubernetes Service (Amazon EKS) cluster. The user accessing the gateway node assumes this role to:

  - Create an Amazon EKS cluster and install KFP
  - Create IAM roles
  - Create Amazon S3 buckets for your sample input data

  The role requires the following permissions:

  - CloudWatchLogsFullAccess
  - IAMFullAccess
  - AmazonS3FullAccess
  - AmazonEC2FullAccess
  - AmazonEKSAdminPolicy (create this policy using the schema from Amazon EKS Identity-Based Policy Examples)

- A Kubernetes IAM execution role assumed by Kubernetes pipeline pods (kfp-example-pod-role) or the SageMaker AI Operator for Kubernetes controller pod to access SageMaker AI. This role is used to create and monitor SageMaker AI jobs from Kubernetes.

  The role requires the following permission:

  - AmazonSageMakerFullAccess

  You can limit permissions to the KFP and controller pods by creating and attaching your own custom policy.

- A SageMaker AI IAM execution role assumed by SageMaker AI jobs to access AWS resources such as Amazon S3 or Amazon ECR (kfp-example-sagemaker-execution-role). SageMaker AI jobs use this role to:

  - Access SageMaker AI resources
  - Read input data from Amazon S3
  - Store your output model to Amazon S3

  The role requires the following permissions (see the sketch after this list for creating the role programmatically):

  - AmazonSageMakerFullAccess
  - AmazonS3FullAccess
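As a reference point, the SageMaker AI execution role can be created with Boto3. The following is a minimal sketch, assuming the role name used in this guide (kfp-example-sagemaker-execution-role):

```python
import json

import boto3

iam = boto3.client("iam")
ROLE_NAME = "kfp-example-sagemaker-execution-role"

# Trust policy that lets the SageMaker AI service assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "sagemaker.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

role = iam.create_role(
    RoleName=ROLE_NAME,
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)

# Attach the two managed policies this guide calls for.
for arn in ("arn:aws:iam::aws:policy/AmazonSageMakerFullAccess",
            "arn:aws:iam::aws:policy/AmazonS3FullAccess"):
    iam.attach_role_policy(RoleName=ROLE_NAME, PolicyArn=arn)

# Pass this ARN to the SageMaker AI components in your pipeline.
print(role["Role"]["Arn"])
```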
Converting pipelines to use SageMaker AI
You can convert an existing pipeline to use SageMaker AI by porting your generic Python processing containers and training containers. If you are using SageMaker AI for inference, you also need to attach IAM permissions to your cluster and convert an artifact to a model.
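Converting a trained artifact into a model maps to the SageMaker AI CreateModel API. The following is a hedged Boto3 sketch; the image URI, artifact path, and role ARN are placeholders for your own values.

```python
import boto3

sm = boto3.client("sagemaker", region_name="us-east-1")

# Register a trained artifact produced by your pipeline as a SageMaker AI
# model. All three values below are placeholders for your own resources.
sm.create_model(
    ModelName="example-converted-model",
    PrimaryContainer={
        "Image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-inference:latest",
        "ModelDataUrl": "s3://my-bucket/training-output/model.tar.gz",
    },
    ExecutionRoleArn="arn:aws:iam::123456789012:role/"
                     "kfp-example-sagemaker-execution-role",
)
```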