Prerequisites for using Amazon SageMaker Inference Recommender

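Before you can use Amazon SageMaker Inference Recommender, you must complete the prerequisite steps. As an example, we show how to use a PyTorch (v1.7.1) ResNet-18 pretrained model for both types of Amazon SageMaker Inference Recommender recommendation jobs. The examples shown use the AWS SDK for Python (Boto3).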

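Note

The following code examples use Python. If you run any of them in a terminal or with the AWS CLI, remove the ! prefix character.
You can run the following examples in an Amazon SageMaker Studio notebook with the Python 3 (TensorFlow 2.6 Python 3.8 CPU Optimized) kernel. For more information about Studio, see Amazon SageMaker Studio.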

Create an IAM role for Amazon SageMaker AI.

Create an IAM role for Amazon SageMaker AI that has the AmazonSageMakerFullAccess IAM managed policy attached.
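
The role can also be created programmatically. The following is a minimal sketch with Boto3, assuming your credentials are allowed to create IAM roles. The role name `InferenceRecommenderRole` is a hypothetical placeholder, and `boto3` is imported inside the function so that the trust policy can be inspected without AWS access.

```python
import json

# ARN of the AWS managed policy named in this step.
FULL_ACCESS_POLICY_ARN = "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess"


def sagemaker_trust_policy():
    """Trust policy that lets the SageMaker service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sagemaker.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_sagemaker_role(role_name="InferenceRecommenderRole"):
    """Create the role, attach the managed policy, and return the role ARN.

    role_name is a hypothetical example value. Calling this needs AWS credentials.
    """
    import boto3  # imported lazily so the module loads without boto3 installed

    iam = boto3.client("iam")
    response = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(sagemaker_trust_policy()),
    )
    iam.attach_role_policy(RoleName=role_name, PolicyArn=FULL_ACCESS_POLICY_ARN)
    return response["Role"]["Arn"]
```

If you run the examples inside SageMaker Studio, the notebook's execution role is typically already available, and `get_execution_role()` in the next step returns it.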

Set up your environment.

Import dependencies and create variables for your AWS Region, your SageMaker AI IAM role (from Step 1), and the SageMaker AI client.


!pip install --upgrade pip awscli botocore boto3  --quiet
from sagemaker import get_execution_role, Session, image_uris
import boto3

region = boto3.Session().region_name
role = get_execution_role()
sagemaker_client = boto3.client("sagemaker", region_name=region)
sagemaker_session = Session()

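(Optional) Review existing models benchmarked by Inference Recommender.

Inference Recommender benchmarks models from popular model zoos. It also supports your model even if that model has not been benchmarked.

Use ListModelMetadata to get a response object that lists the domain, framework, task, and model name of machine learning models found in common model zoos.

In later steps, you use the domain, framework, framework version, task, and model name to select an inference Docker image and to register your model with SageMaker Model Registry. The following example demonstrates how to list model metadata with the SDK for Python (Boto3):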


list_model_metadata_response=sagemaker_client.list_model_metadata()

The output includes model summaries (ModelMetadataSummaries) and response metadata (ResponseMetadata), similar to the following example:


{
    'ModelMetadataSummaries': [{
            'Domain': 'NATURAL_LANGUAGE_PROCESSING',
            'Framework': 'PYTORCH:1.6.0',
             'Model': 'bert-base-cased',
             'Task': 'FILL_MASK'
             },
            {
             'Domain': 'NATURAL_LANGUAGE_PROCESSING',
             'Framework': 'PYTORCH:1.6.0',
             'Model': 'bert-base-uncased',
             'Task': 'FILL_MASK'
             },
            {
            'Domain': 'COMPUTER_VISION',
             'Framework': 'MXNET:1.8.0',
             'Model': 'resnet18v2-gluon',
             'Task': 'IMAGE_CLASSIFICATION'
             },
             {
             'Domain': 'COMPUTER_VISION',
             'Framework': 'PYTORCH:1.6.0',
             'Model': 'resnet152',
             'Task': 'IMAGE_CLASSIFICATION'
             }],
    'ResponseMetadata': {
                            'HTTPHeaders': {
                            'content-length': '2345',
                            'content-type': 'application/x-amz-json-1.1',
                            'date': 'Tue, 19 Oct 2021 20:52:03 GMT',
                            'x-amzn-requestid': 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
                          },
    'HTTPStatusCode': 200,
    'RequestId': 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx',
    'RetryAttempts': 0
    }
}
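
To find entries that resemble your own model, you can filter the summaries locally. The following is a minimal sketch that works on a response dictionary of the shape shown above; the helper names `split_framework` and `find_models` are illustrative and not part of any SDK, and a single response may be only one page of the full result set.

```python
def split_framework(framework_field):
    """Split a value such as 'PYTORCH:1.6.0' into ('PYTORCH', '1.6.0')."""
    name, _, version = framework_field.partition(":")
    return name, version


def find_models(response, domain=None, task=None, framework=None):
    """Return the summaries in a ListModelMetadata response that match the given fields."""
    matches = []
    for summary in response.get("ModelMetadataSummaries", []):
        name, _version = split_framework(summary["Framework"])
        if domain and summary["Domain"] != domain:
            continue
        if task and summary["Task"] != task:
            continue
        if framework and name != framework.upper():
            continue
        matches.append(summary)
    return matches


# Abbreviated copy of the sample output shown above.
sample_response = {
    "ModelMetadataSummaries": [
        {"Domain": "NATURAL_LANGUAGE_PROCESSING", "Framework": "PYTORCH:1.6.0",
         "Model": "bert-base-cased", "Task": "FILL_MASK"},
        {"Domain": "COMPUTER_VISION", "Framework": "MXNET:1.8.0",
         "Model": "resnet18v2-gluon", "Task": "IMAGE_CLASSIFICATION"},
        {"Domain": "COMPUTER_VISION", "Framework": "PYTORCH:1.6.0",
         "Model": "resnet152", "Task": "IMAGE_CLASSIFICATION"},
    ]
}

vision_pytorch = find_models(sample_response, domain="COMPUTER_VISION", framework="pytorch")
print([m["Model"] for m in vision_pytorch])  # ['resnet152']
```

With a live response, pass `list_model_metadata_response` in place of `sample_response`.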

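For this demo, we use a PyTorch (v1.7.1) ResNet-18 model to perform image classification. The following Python code sample stores the framework, framework version, domain, and task in variables for later use: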


# ML framework details
framework = 'pytorch'
framework_version = '1.7.1'

# ML model details
ml_domain = 'COMPUTER_VISION'
ml_task = 'IMAGE_CLASSIFICATION'

Upload your machine learning model to Amazon S3.

If you do not have a pretrained machine learning model, use this PyTorch (v1.7.1) ResNet-18 model:


# Optional: Download a sample PyTorch model
import torch
from torchvision import models, transforms, datasets

# Create an example input for tracing
image = torch.zeros([1, 3, 256, 256], dtype=torch.float32)

# Load a pretrained ResNet-18 model from torchvision
model = models.resnet18(pretrained=True)

# Tell the model we are using it for evaluation (not training). Note this is required for Inferentia compilation.
model.eval()
model_trace = torch.jit.trace(model, image)

# Save your traced model
model_trace.save('model.pth')

Download the sample inference script, inference.py. Create a code directory and move the inference script into it.


# Download the inference script
!wget https://aws-ml-blog-artifacts.s3.us-east-2.amazonaws.com/inference.py

# move it into a code/ directory
!mkdir code
!mv inference.py code/

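Amazon SageMaker AI requires pretrained machine learning models to be packaged as a compressed TAR file (*.tar.gz). Compress your model and inference script to satisfy this requirement: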


!tar -czf test.tar.gz model.pth code/inference.py

When your endpoint is provisioned, the files in the archive are extracted to /opt/ml/model/ on the endpoint.
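
Because the archive layout determines what ends up under /opt/ml/model/, it can help to build and inspect the archive from Python. The following is a minimal sketch using the standard `tarfile` module as an alternative to the `tar` command; the helper names are illustrative, and the demonstration uses empty stand-in files rather than a real model.

```python
import os
import tarfile
import tempfile


def build_model_archive(archive_path, model_path, script_path):
    """Write a .tar.gz with the model at the root and the script under code/."""
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(model_path, arcname="model.pth")
        archive.add(script_path, arcname="code/inference.py")
    return archive_path


def list_archive(archive_path):
    """Return the member names stored in the archive, sorted."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return sorted(archive.getnames())


# Demonstration with empty stand-in files in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    model_file = os.path.join(workdir, "model.pth")
    script_file = os.path.join(workdir, "inference.py")
    for path in (model_file, script_file):
        open(path, "wb").close()
    archive_file = build_model_archive(
        os.path.join(workdir, "test.tar.gz"), model_file, script_file
    )
    members = list_archive(archive_file)

print(members)  # ['code/inference.py', 'model.pth']
```

Listing the members before uploading is a quick way to confirm that model.pth sits at the archive root and the script sits under code/.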

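After you compress your model and model artifacts into a .tar.gz file, upload the file to your Amazon S3 bucket. The following example demonstrates how to upload your model to Amazon S3 using the AWS CLI: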


!aws s3 cp test.tar.gz s3://{your-bucket}/models/

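Select a prebuilt Docker inference image or create your own inference Docker image.

SageMaker AI provides containers for its built-in algorithms and prebuilt Docker images for some of the most common machine learning frameworks, such as Apache MXNet, TensorFlow, PyTorch, and Chainer. For a full list of the available SageMaker AI images, see Available Deep Learning Containers Images.

If none of the existing SageMaker AI containers meet your needs and you don't have an existing container of your own, create a new Docker image. For information about how to create a Docker image, see Containers with custom inference code.

The following example demonstrates how to retrieve a PyTorch version 1.7.1 inference image using the SageMaker Python SDK: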
```
from sagemaker import image_uris

## Uncomment and replace with your own values if you did not define
## these variables in a previous step.
#framework = 'pytorch'
#framework_version = '1.7.1'

# Note: you can use any CPU-based instance here, 
# this is just to set the arch as CPU for the Docker image
instance_type = 'ml.m5.2xlarge' 

image_uri = image_uris.retrieve(framework, 
                                region, 
                                version=framework_version, 
                                py_version='py3', 
                                instance_type=instance_type, 
                                image_scope='inference')
```
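For a list of available SageMaker AI instances, see Amazon SageMaker AI Pricing.

Create a sample payload archive.

Create an archive that contains individual files that the load testing tool can send to your SageMaker AI endpoints. Your inference code must be able to read the file formats in the sample payload.

The following command downloads a .jpg image that this example uses in a later step for the ResNet-18 model.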
```
!wget https://cdn.pixabay.com/photo/2020/12/18/05/56/flowers-5841251_1280.jpg
```
Compress the sample payload as a tarball:
```
!tar -cvzf payload.tar.gz flowers-5841251_1280.jpg
```
Upload the sample payload to Amazon S3 (replace {bucket} with the name of your bucket) and note the Amazon S3 URI:
```
!aws s3 cp payload.tar.gz s3://{bucket}/models/
```
You need the Amazon S3 URI in a later step, so store it in a variable:
```
bucket_prefix='models'
bucket = '<your-bucket-name>' # Provide the name of your S3 bucket
payload_s3_key = f"{bucket_prefix}/payload.tar.gz"
sample_payload_url= f"s3://{bucket}/{payload_s3_key}"
```
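
Because the sample payload location must be an Amazon S3 path to a single .tar.gz archive, a quick local check can catch a malformed URI before it reaches the API. The following is a minimal sketch; `parse_payload_url` is an illustrative helper name, and `amzn-s3-demo-bucket` is a placeholder bucket.

```python
from urllib.parse import urlparse


def parse_payload_url(url):
    """Split an s3:// URI into (bucket, key) and check the .tar.gz suffix."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {url!r}")
    key = parsed.path.lstrip("/")
    if not key.endswith(".tar.gz"):
        raise ValueError(f"sample payload must be a .tar.gz archive: {url!r}")
    return parsed.netloc, key


# 'amzn-s3-demo-bucket' is a placeholder bucket name.
print(parse_payload_url("s3://amzn-s3-demo-bucket/models/payload.tar.gz"))
```

In the notebook, you would call `parse_payload_url(sample_payload_url)` after setting `bucket` to a real bucket name.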
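Prepare your model input for the recommendations job

For the last prerequisite, you have two options for preparing your model input. You can register your model with SageMaker Model Registry, which you can use to catalog models for production, or you can create a SageMaker AI model and specify it in the ContainerConfig field when you create a recommendations job. The first option is best if you want the features that Model Registry provides, such as managing model versions and automating model deployment. The second option is ideal if you want to get started quickly. For the first option, go to step 7 (Option 1). For the second option, skip step 7 and go to step 8 (Option 2).

Option 1: Register your model in the model registry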

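With SageMaker Model Registry, you can catalog models for production, manage model versions, associate metadata (such as training metrics) with a model, manage the approval status of a model, deploy models to production, and automate model deployment with CI/CD.

When you use SageMaker Model Registry to track and manage your models, they are represented as versioned model packages within model package groups. Unversioned model packages are not part of a model group. A model package group holds multiple versions or iterations of a model. You are not required to create a group for every model in the registry, but groups help organize models that share the same purpose and they provide automatic versioning.

To use Amazon SageMaker Inference Recommender, you must have a versioned model package. You can create one programmatically with the AWS SDK for Python (Boto3) or interactively with Amazon SageMaker Studio Classic. To create a versioned model package programmatically, first create a model package group with the CreateModelPackageGroup API. Next, create a model package with the CreateModelPackage API. Calling this method creates a versioned model package.

For detailed instructions on creating a model package group and a versioned model package, programmatically with the AWS SDK for Python (Boto3) or interactively with Amazon SageMaker Studio Classic, see Create a Model Group and Register a Model Version.

The following code examples demonstrate how to create a versioned model package using the AWS SDK for Python (Boto3).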

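Note
You do not need to approve the model package to create an Inference Recommender job.
1. Create a model package group
  
  Create a model package group with the CreateModelPackageGroup API. Provide a name for the model package group in ModelPackageGroupName, and optionally provide a description of the model package in the ModelPackageGroupDescription field.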
```
model_package_group_name = '<INSERT>'
model_package_group_description = '<INSERT>' 

model_package_group_input_dict = {
 "ModelPackageGroupName" : model_package_group_name,
 "ModelPackageGroupDescription" : model_package_group_description,
}

model_package_group_response = sagemaker_client.create_model_package_group(**model_package_group_input_dict)
```
  See the Amazon SageMaker API Reference Guide for a full list of optional and required arguments you can pass to CreateModelPackageGroup.
  
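  Create a model package by specifying the Docker image that runs your inference code and the Amazon S3 location of your model artifacts, and by providing values for InferenceSpecification. InferenceSpecification should contain information about the inference jobs that can be run with this model package, including the following:
  - The Amazon ECR path of the image that runs your inference code.
  - (Optional) The instance types that the model package supports for transform jobs and for real-time endpoints used for inference.
  - The input and output content formats that the model package supports for inference.
  In addition, you must specify the following parameters when you create a model package:
  - Domain: The machine learning domain of your model package and its components. Common machine learning domains include computer vision and natural language processing.
  - Task: The machine learning task that your model package accomplishes. Common machine learning tasks include object detection and image classification. If none of the tasks listed in the API Reference Guide fit your use case, specify "OTHER". For a list of supported machine learning tasks, see the Task API field descriptions.
  - SamplePayloadUrl: The Amazon Simple Storage Service (Amazon S3) path where the sample payload is stored. This path must point to a single GZIP-compressed TAR archive (.tar.gz suffix).
  - Framework: The machine learning framework of the model package container image.
  - FrameworkVersion: The framework version of the model package container image.
  If you provide an allow list of instance types for real-time inference in SupportedRealtimeInferenceInstanceTypes, Inference Recommender limits the search space for instance types during a Default job. Use this parameter if you have budget constraints or know that a specific set of instance types can support your model and container image.
  
  In a previous step, we downloaded a pretrained ResNet18 model and stored it in an Amazon S3 bucket in a directory called models. We retrieved a PyTorch (v1.7.1) Deep Learning Container inference image and stored the URI in a variable called image_uri. Use those variables in the following code sample to define a dictionary used as input to the CreateModelPackage API.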
```
# Provide the Amazon S3 URI of your compressed tarfile
# so that Model Registry knows where to find your model artifacts
bucket_prefix='models'
bucket = '<your-bucket-name>' # Provide the name of your S3 bucket
model_s3_key = f"{bucket_prefix}/test.tar.gz"
model_url= f"s3://{bucket}/{model_s3_key}"

# Similar open source model to the packaged model
# The name of the ML model as standardized by common model zoos
nearest_model_name = 'resnet18'

# The supported MIME types for input and output data. In this example, 
# we are using images as input.
input_content_type='image/jpeg'


# Optional - provide a description of your model.
model_package_description = '<INSERT>'

## Uncomment if you did not store the domain and task in an earlier
## step 
#ml_domain = 'COMPUTER_VISION'
#ml_task = 'IMAGE_CLASSIFICATION'

## Uncomment if you did not store the framework and framework version
## in a previous step.
#framework = 'PYTORCH'
#framework_version = '1.7.1'

# Optional: Used for optimizing your model using SageMaker Neo
# PyTorch uses NCHW format for images
data_input_configuration = "[[1,3,256,256]]"

# Create a dictionary to use as input for creating a model package
model_package_input_dict = {
        "ModelPackageGroupName" : model_package_group_name,
        "ModelPackageDescription" : model_package_description,
        "Domain": ml_domain,
        "Task": ml_task,
        "SamplePayloadUrl": sample_payload_url,
        "InferenceSpecification": {
                "Containers": [
                    {
                        "Image": image_uri,
                        "ModelDataUrl": model_url,
                        "Framework": framework.upper(), 
                        "FrameworkVersion": framework_version,
                        "NearestModelName": nearest_model_name,
                        "ModelInput": {"DataInputConfig": data_input_configuration}
                    }
                    ],
                "SupportedContentTypes": [input_content_type]
        }
    }
```
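
The input dictionary above does not set an instance allow list. The following is a minimal sketch of how SupportedRealtimeInferenceInstanceTypes could be added to InferenceSpecification to narrow the search space of a Default job. The helper name `with_instance_allow_list`, the trimmed input dictionary, and the two instance types are illustrative assumptions.

```python
import copy


def with_instance_allow_list(package_input, instance_types):
    """Return a copy of the CreateModelPackage input with an instance allow list added."""
    updated = copy.deepcopy(package_input)
    spec = updated.setdefault("InferenceSpecification", {})
    spec["SupportedRealtimeInferenceInstanceTypes"] = list(instance_types)
    return updated


# Trimmed stand-in for model_package_input_dict; the values are placeholders.
example_input = {
    "ModelPackageGroupName": "resnet18-demo-group",
    "InferenceSpecification": {"SupportedContentTypes": ["image/jpeg"]},
}

restricted = with_instance_allow_list(example_input, ["ml.m5.xlarge", "ml.c5.2xlarge"])
print(restricted["InferenceSpecification"]["SupportedRealtimeInferenceInstanceTypes"])
```

The helper returns a copy, so the original dictionary stays unchanged and either version can be passed to the CreateModelPackage call.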
2. Create a model package
  
  Create a model package with the CreateModelPackage API. Pass the input dictionary defined in the previous step:
```
model_package_response = sagemaker_client.create_model_package(**model_package_input_dict)
```
  You need the model package ARN to use Amazon SageMaker Inference Recommender. Note the ARN of the model package or store it in a variable:
```
model_package_arn = model_package_response["ModelPackageArn"]

print('ModelPackage Version ARN : {}'.format(model_package_arn))
```
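
As an illustration of where the ARN is used, the following hedged sketch checks that an ARN looks like a versioned model package ARN and builds the keyword arguments for a Default recommendations job. Both helper names are illustrative, the ARN layout is an assumption based on the usual `model-package/<group>/<version>` form, and the example ARN uses made-up account and group values.

```python
def parse_model_package_arn(arn):
    """Split a versioned model package ARN into region, account, group, and version."""
    prefix, _, resource = arn.partition(":model-package/")
    parts = prefix.split(":")
    group, _, version = resource.partition("/")
    if len(parts) != 5 or parts[2] != "sagemaker" or not group or not version.isdigit():
        raise ValueError(f"not a versioned model package ARN: {arn!r}")
    return {"region": parts[3], "account": parts[4], "group": group, "version": int(version)}


def default_job_request(job_name, role_arn, model_package_arn):
    """Keyword arguments for create_inference_recommendations_job with JobType 'Default'."""
    parse_model_package_arn(model_package_arn)  # raises ValueError on a malformed ARN
    return {
        "JobName": job_name,
        "JobType": "Default",
        "RoleArn": role_arn,
        "InputConfig": {"ModelPackageVersionArn": model_package_arn},
    }


# Placeholder ARN; the account ID and group name are made up.
example_arn = "arn:aws:sagemaker:us-east-1:111122223333:model-package/resnet18-demo-group/1"
print(parse_model_package_arn(example_arn))
```

In the notebook, the request could then be sent with `sagemaker_client.create_inference_recommendations_job(**default_job_request(job_name, role, model_package_arn))`, where `job_name` is a name you choose.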

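Option 2: Create a model and configure the ContainerConfig field

Use this option if you want to start an inference recommendations job without registering your model in Model Registry. In the following steps, you create a model in SageMaker AI and configure the ContainerConfig field as input for the recommendations job.

Create a model

Create a model with the CreateModel API. For an example that calls this method when deploying a model to SageMaker AI Hosting, see Create a Model (AWS SDK for Python (Boto3)).

In a previous step, we downloaded a pretrained ResNet18 model and stored it in an Amazon S3 bucket in a directory called models. We retrieved a PyTorch (v1.7.1) Deep Learning Container inference image and stored the URI in a variable called image_uri. The following code example uses those variables to define the arguments passed to the CreateModel API.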



model_name = '<name_of_the_model>'
# Role to give SageMaker permission to access AWS services.
# IAM role ARNs have no Region component: arn:aws:iam::<account>:role/<role-name>
sagemaker_role= "arn:aws:iam::<account>:role/<role-name>"

# Provide the Amazon S3 URI of your compressed tarfile
# so that SageMaker AI knows where to find your model artifacts
bucket_prefix='models'
bucket = '<your-bucket-name>' # Provide the name of your S3 bucket
model_s3_key = f"{bucket_prefix}/test.tar.gz"
model_url= f"s3://{bucket}/{model_s3_key}"

#Create model
create_model_response = sagemaker_client.create_model(
    ModelName = model_name,
    ExecutionRoleArn = sagemaker_role, 
    PrimaryContainer = {
        'Image': image_uri,
        'ModelDataUrl': model_url,
    })

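Configure the ContainerConfig field

Next, configure the ContainerConfig field for the model you just created and specify the following parameters in it:

- Domain: The machine learning domain of the model and its components, such as computer vision or natural language processing.
- Task: The machine learning task that the model accomplishes, such as image classification or object detection.
- PayloadConfig: The configuration for the payload of a recommendations job. For more information about the subfields, see RecommendationJobPayloadConfig.
- Framework: The machine learning framework of the container image, such as PyTorch.
- FrameworkVersion: The framework version of the container image.
- (Optional) SupportedInstanceTypes: A list of the instance types used to generate inferences in real time.

If you use the SupportedInstanceTypes parameter, Inference Recommender limits the search space for instance types during a Default job. Use this parameter if you have budget constraints or know that a specific set of instance types can support your model and container image.

In the following code example, we use the previously defined parameters, along with NearestModelName, to define a dictionary used as input to the CreateInferenceRecommendationsJob API.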


## Uncomment if you did not store the domain and task in a previous step
#ml_domain = 'COMPUTER_VISION'
#ml_task = 'IMAGE_CLASSIFICATION'

## Uncomment if you did not store the framework and framework version in a previous step
#framework = 'PYTORCH'
#framework_version = '1.7.1'

# The name of the ML model as standardized by common model zoos
nearest_model_name = 'resnet18'

# The supported MIME types for input and output data. In this example, 
# we are using images as input
input_content_type='image/jpeg'

# Optional: Used for optimizing your model using SageMaker Neo
# PyTorch uses NCHW format for images
data_input_configuration = "[[1,3,256,256]]"

# Create a dictionary to use as input for creating an inference recommendation job
container_config = {
        "Domain": ml_domain,
        "Framework": framework.upper(), 
        "FrameworkVersion": framework_version,
        "NearestModelName": nearest_model_name,
        "PayloadConfig": { 
            "SamplePayloadUrl": sample_payload_url,
            "SupportedContentTypes": [ input_content_type ]
         },
        "DataInputConfig": data_input_configuration,
        "Task": ml_task,
        }
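
The ContainerConfig dictionary is passed together with the model name when the recommendations job is created. The following is a minimal sketch that checks for the parameters listed above and builds the keyword arguments for a Default job. The helper name, the job name, the role ARN, and all values in the example configuration are illustrative placeholders.

```python
# Parameters that the text above says must be specified in ContainerConfig.
EXPECTED_CONTAINER_FIELDS = ("Domain", "Task", "Framework", "FrameworkVersion", "PayloadConfig")


def container_job_request(job_name, role_arn, model_name, container_config, instance_types=None):
    """Keyword arguments for create_inference_recommendations_job using ModelName and ContainerConfig."""
    missing = [field for field in EXPECTED_CONTAINER_FIELDS if field not in container_config]
    if missing:
        raise ValueError(f"container_config is missing: {', '.join(missing)}")
    config = dict(container_config)
    if instance_types:
        # Optional allow list that narrows the search space of a Default job.
        config["SupportedInstanceTypes"] = list(instance_types)
    return {
        "JobName": job_name,
        "JobType": "Default",
        "RoleArn": role_arn,
        "InputConfig": {"ModelName": model_name, "ContainerConfig": config},
    }


# Stand-in for the container_config defined above; the values are placeholders.
example_config = {
    "Domain": "COMPUTER_VISION",
    "Task": "IMAGE_CLASSIFICATION",
    "Framework": "PYTORCH",
    "FrameworkVersion": "1.7.1",
    "NearestModelName": "resnet18",
    "PayloadConfig": {
        "SamplePayloadUrl": "s3://amzn-s3-demo-bucket/models/payload.tar.gz",
        "SupportedContentTypes": ["image/jpeg"],
    },
}

request = container_job_request(
    "demo-recommendation-job",
    "arn:aws:iam::111122223333:role/demo-role",
    "demo-model",
    example_config,
    instance_types=["ml.m5.xlarge"],
)
print(sorted(request["InputConfig"]["ContainerConfig"]))
```

In the notebook, the request could be sent with `sagemaker_client.create_inference_recommendations_job(**request)` after replacing the placeholders with `role`, `model_name`, and `container_config`.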