SageMaker Components for Kubeflow Pipelines

This document outlines how to use SageMaker Components for Kubeflow Pipelines (KFP). With these pipeline components, you can create and monitor native SageMaker training, tuning, endpoint deployment, and batch transform jobs from your Kubeflow Pipelines. By running Kubeflow Pipeline jobs on SageMaker, you move data processing and training jobs from the Kubernetes cluster to SageMaker’s machine learning-optimized managed service. This document assumes prior knowledge of Kubernetes and Kubeflow.

What is Kubeflow Pipelines?

Kubeflow Pipelines (KFP) is a platform for building and deploying portable, scalable machine learning (ML) workflows based on Docker containers. The Kubeflow Pipelines platform consists of the following:

  • A user interface (UI) for managing and tracking experiments, jobs, and runs.

  • An engine (Argo) for scheduling multi-step ML workflows.

  • A Python SDK for defining and manipulating pipelines and components.

  • Notebooks for interacting with the system using the SDK.

A pipeline is a description of an ML workflow expressed as a directed acyclic graph (DAG). Every step in the workflow is expressed as a Kubeflow Pipeline component, which is a Python module.

If your data has been preprocessed, the standard pipeline can take a subset of the data and run hyperparameter optimization on your model. The pipeline then trains a model with the full dataset using the optimal hyperparameters. This model is used for both batch inference and endpoint creation.
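
To make that shape concrete, here is a minimal sketch in the KFP Python SDK (v1-style syntax). The steps are placeholder lightweight components, not the actual SageMaker components, and the names and S3 paths are illustrative:

    from kfp import dsl
    from kfp.components import create_component_from_func

    # Placeholder lightweight components; in a real pipeline each of these
    # would be a SageMaker component instead.
    def tune(data_subset: str) -> str:
        return "best-hyperparameters"

    def train(data: str, hyperparameters: str) -> str:
        return "s3://amzn-s3-demo-bucket/model.tar.gz"

    def batch_transform(model: str):
        pass

    def deploy(model: str):
        pass

    tune_op = create_component_from_func(tune)
    train_op = create_component_from_func(train)
    batch_op = create_component_from_func(batch_transform)
    deploy_op = create_component_from_func(deploy)

    @dsl.pipeline(name="standard-sagemaker-shape")
    def pipeline(
        subset: str = "s3://amzn-s3-demo-bucket/subset",
        full: str = "s3://amzn-s3-demo-bucket/full",
    ):
        hpo = tune_op(subset)               # tune on a subset of the data
        model = train_op(full, hpo.output)  # train on the full dataset
        batch_op(model.output)              # batch inference ...
        deploy_op(model.output)             # ... and endpoint creation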

For more information on Kubeflow Pipelines, see the Kubeflow Pipelines documentation.

Kubeflow Pipeline components

A Kubeflow Pipeline component is a set of code used to execute one step of a Kubeflow pipeline. Components are represented by a Python module built into a Docker image. These components make it fast and easy to write pipelines for experimentation and production environments without having to interact with the underlying Kubernetes infrastructure.
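
As a toy illustration of that structure, a component is described by a small specification naming a container image and the command to run inside it. The spec below is hand-written for illustration (the image and inline program are placeholders, not a SageMaker component) and can be loaded with the KFP SDK:

    from kfp import components

    # A toy component specification: a container image plus the command to
    # run in it. The image and inline program are placeholders.
    echo_op = components.load_component_from_text("""
    name: Echo
    description: Prints its input.
    inputs:
    - {name: message, type: String}
    implementation:
      container:
        image: python:3.9
        command: [python, -c, 'import sys; print(sys.argv[1])', {inputValue: message}]
    """)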

SageMaker Components for Kubeflow Pipelines versions

SageMaker Components for Kubeflow Pipelines come in two versions. Each version leverages a different backend to create and manage resources on SageMaker.

  • Version 1 of SageMaker Components for Kubeflow Pipelines (v1.x or below) uses Boto3 (AWS SDK for Python) as its backend.

  • Version 2 (v2.0.0-alpha2 and above) of SageMaker Components for Kubeflow Pipelines uses the SageMaker Operator for Kubernetes (ACK).

    AWS introduced ACK to facilitate a Kubernetes-native way of managing AWS Cloud resources. ACK includes a set of AWS service-specific controllers, one of which is the SageMaker controller. The SageMaker controller makes it easier for machine learning developers and data scientists using Kubernetes as their control plane to train, tune, and deploy machine learning (ML) models in SageMaker. For more information, see SageMaker Operators for Kubernetes.

Both versions of the SageMaker Components for Kubeflow Pipelines are supported. However, version 2 provides some additional advantages. In particular, it offers:

  1. A consistent experience for managing your SageMaker resources from any application, whether you are using Kubeflow Pipelines, the Kubernetes CLI (kubectl), or other Kubeflow applications such as Notebooks.

  2. The flexibility to manage and monitor your SageMaker resources outside of the Kubeflow pipeline workflow (see the sketch after this list).

  3. Zero setup time to use the components if you deployed the full Kubeflow on AWS release, because the SageMaker Operator is part of that deployment.
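
To illustrate the second point: with version 2, each SageMaker job your pipeline creates also exists as a Kubernetes custom resource, so you can inspect it outside of KFP with kubectl or, as sketched below, the Kubernetes Python client. The group, version, and plural names are assumptions based on the ACK SageMaker controller's CRDs; confirm them against your cluster with kubectl get crds.

    from kubernetes import client, config

    # Use the same kubeconfig credentials as kubectl.
    config.load_kube_config()
    api = client.CustomObjectsApi()

    # List TrainingJob custom resources created by the v2 components.
    # Group/version/plural are assumptions -- confirm with `kubectl get crds`.
    jobs = api.list_namespaced_custom_object(
        group="sagemaker.services.k8s.aws",
        version="v1alpha1",
        namespace="kubeflow",
        plural="trainingjobs",
    )
    for job in jobs["items"]:
        status = job.get("status", {}).get("trainingJobStatus", "Unknown")
        print(job["metadata"]["name"], status)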

What do SageMaker Components for Kubeflow Pipelines provide?

SageMaker Components for Kubeflow Pipelines offer an alternative way to launch your compute-intensive jobs on SageMaker. The components integrate SageMaker with the portability and orchestration of Kubeflow Pipelines. Using the SageMaker Components for Kubeflow Pipelines (KFP), you can create and monitor your SageMaker resources as part of a Kubeflow Pipelines workflow. Each of the jobs in your pipelines runs on SageMaker instead of the local Kubernetes cluster. The job parameters, status, logs, and outputs from SageMaker are still accessible from the Kubeflow Pipelines UI.

The following SageMaker components have been created to integrate key SageMaker features into your ML workflows. You can create a Kubeflow Pipeline built entirely using these components, or integrate individual components into your workflow as needed. You can find all SageMaker Components for Kubeflow Pipelines in GitHub.

There is no additional charge for using SageMaker Components for Kubeflow Pipelines. You incur charges for any SageMaker resources you use through these components.

Training components

Processing

The Processing component enables you to submit processing jobs to SageMaker directly from a Kubeflow Pipelines workflow. For more information, see SageMaker Processing Kubeflow Pipeline component version 1.

Training

The Training component allows you to submit SageMaker Training jobs directly from a Kubeflow Pipelines workflow. For more information, see SageMaker Training Kubeflow Pipelines component version 2. For information about version 1 of the Training component, see SageMaker Training Kubeflow Pipelines component version 1.
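
As a sketch of how the Training component is used, the following loads the v1 component definition from the kubeflow/pipelines GitHub repository and wires it into a pipeline. The URL, training image, S3 paths, role ARN, and parameter names are illustrative; verify them against the component YAML you actually load.

    import json
    from kfp import components, compiler, dsl

    # Load the v1 training component definition. Pin a release tag in
    # production; this URL is illustrative.
    sagemaker_train_op = components.load_component_from_url(
        "https://raw.githubusercontent.com/kubeflow/pipelines/master/"
        "components/aws/sagemaker/train/component.yaml"
    )

    @dsl.pipeline(name="sagemaker-training-example")
    def training_pipeline(
        role_arn: str = "arn:aws:iam::111122223333:role/kfp-example-sagemaker-execution-role",
    ):
        # This step runs as a SageMaker training job, not on the cluster.
        # Parameter names below are based on the v1 component's inputs;
        # verify them against the loaded YAML.
        sagemaker_train_op(
            region="us-east-1",
            image="<training-image-uri>",
            instance_type="ml.m5.xlarge",
            instance_count=1,
            channels=json.dumps([{
                "ChannelName": "train",
                "DataSource": {"S3DataSource": {
                    "S3Uri": "s3://amzn-s3-demo-bucket/train",
                    "S3DataType": "S3Prefix",
                    "S3DataDistributionType": "FullyReplicated",
                }},
            }]),
            model_artifact_path="s3://amzn-s3-demo-bucket/output",
            role=role_arn,
        )

    if __name__ == "__main__":
        compiler.Compiler().compile(training_pipeline, "training_pipeline.yaml")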

Hyperparameter Optimization

The Hyperparameter Optimization component enables you to submit hyperparameter tuning jobs to SageMaker directly from a Kubeflow Pipelines workflow. For more information, see SageMaker hyperparameter optimization Kubeflow Pipeline component version 1.

Inference components

Hosting Deploy

The Deploy component enables you to deploy a model in SageMaker Hosting from a Kubeflow Pipelines workflow. For more information, see SageMaker Hosting Services - Create Endpoint Kubeflow Pipeline component version 1.
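
A sketch of the Deploy component follows the same pattern: it takes the name of a model already created in SageMaker and stands up an endpoint. The URL and parameter names here are assumptions based on the v1 component and should be checked against its YAML.

    from kfp import components, dsl

    # Illustrative URL; see the kubeflow/pipelines repository for the
    # current path to the deploy component.
    sagemaker_deploy_op = components.load_component_from_url(
        "https://raw.githubusercontent.com/kubeflow/pipelines/master/"
        "components/aws/sagemaker/deploy/component.yaml"
    )

    @dsl.pipeline(name="sagemaker-deploy-example")
    def deploy_pipeline(model_name: str = "my-trained-model"):
        # Parameter names are assumptions -- verify against the component YAML.
        sagemaker_deploy_op(
            region="us-east-1",
            model_name_1=model_name,
            instance_type_1="ml.m5.large",
            initial_instance_count_1=1,
        )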

Batch Transform component

The Batch Transform component enables you to run inference jobs for an entire dataset in SageMaker from a Kubeflow Pipelines workflow. For more information, see SageMaker Batch Transform Kubeflow Pipeline component version 1.

Ground Truth components

Ground Truth

The Ground Truth component enables you to submit SageMaker Ground Truth labeling jobs directly from a Kubeflow Pipelines workflow. For more information, see SageMaker Ground Truth Kubeflow Pipelines component version 1.

Workteam

The Workteam component enables you to create SageMaker private workteam jobs directly from a Kubeflow Pipelines workflow. For more information, see SageMaker create private workteam Kubeflow Pipelines component version 1.

IAM permissions

Deploying Kubeflow Pipelines with SageMaker components requires the following three levels of IAM permissions:

  • An IAM user/role to access your AWS account (your_credentials). Note: You don’t need this if you already have access to the KFP web UI and have your input data in Amazon S3, or if you already have an Amazon Elastic Kubernetes Service (Amazon EKS) cluster with KFP.

    You use this user/role from your gateway node, which can be your local machine or a remote instance, to:

    • Create an Amazon EKS cluster and install KFP

    • Create IAM roles/users

    • Create Amazon S3 buckets for your sample input data

    The IAM user/role needs the following permissions:

  • An IAM role used by pipeline pods (kfp-example-pod-role) or the SageMaker Operator for Kubernetes controller pod to access SageMaker. This role is used to create and monitor SageMaker jobs. Note: You can limit permissions to the KFP and controller pods by creating and attaching your own custom policy.

    The role needs the following permission:

    • AmazonSageMakerFullAccess

  • An IAM role used by SageMaker jobs to access resources such as Amazon S3 and Amazon ECR (kfp-example-sagemaker-execution-role).

    Your SageMaker jobs use this role to:

    • Access SageMaker resources

    • Read input data from Amazon S3

    • Store your output model in Amazon S3

    The role needs the following permissions:

    • AmazonSageMakerFullAccess

    • AmazonS3FullAccess

These are all the IAM users/roles you need to run KFP components for SageMaker.
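
As a sketch, the third role (kfp-example-sagemaker-execution-role) can be created with Boto3 as follows. The trust policy lets the SageMaker service assume the role, and the two managed policies listed above are attached:

    import json
    import boto3

    iam = boto3.client("iam")

    # Allow the SageMaker service to assume the role.
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }

    role = iam.create_role(
        RoleName="kfp-example-sagemaker-execution-role",
        AssumeRolePolicyDocument=json.dumps(trust_policy),
    )

    # Attach the managed policies listed above.
    for policy_arn in (
        "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess",
        "arn:aws:iam::aws:policy/AmazonS3FullAccess",
    ):
        iam.attach_role_policy(
            RoleName="kfp-example-sagemaker-execution-role", PolicyArn=policy_arn
        )

    print(role["Role"]["Arn"])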

After you have run the components and created a SageMaker endpoint, you also need a role with the sagemaker:InvokeEndpoint permission to query inference endpoints.
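
For example, once the endpoint exists, a caller with that permission can query it with Boto3. The endpoint name and payload here are placeholders; the expected content type depends on the model behind the endpoint.

    import boto3

    runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")

    # Endpoint name and payload are placeholders; the expected content type
    # depends on the model behind the endpoint.
    response = runtime.invoke_endpoint(
        EndpointName="my-kfp-endpoint",
        ContentType="text/csv",
        Body="1.0,2.0,3.0",
    )
    print(response["Body"].read().decode("utf-8"))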

Converting Pipelines to use SageMaker

You can convert an existing pipeline to use SageMaker by porting your generic Python processing and training containers. If you are using SageMaker for inference, you also need to attach IAM permissions to your cluster and convert an artifact to a model.