Deploy models from JumpStart using kubectl

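The following steps show how to use kubectl to deploy a JumpStart model to a HyperPod cluster.

The following instructions contain code cells and commands that you run in a terminal. Before you run these commands, make sure that you have configured your environment with AWS credentials.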

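Prerequisites

Before you begin, verify that you have:

Set up inference capabilities on your Amazon SageMaker HyperPod cluster. For more information, see Setting up your HyperPod clusters for model deployment.
Installed the kubectl utility and configured jq in your terminal.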

Setup and configuration

Select your Region.
```
export REGION=<region>
```
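View all SageMaker public hub models and HyperPod clusters.

Select a JumpStart model from the JumpStart public hub. The public hub has a large number of models available, so you can use NextToken to iteratively list all available models in the public hub.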


```
aws sagemaker list-hub-contents --hub-name SageMakerPublicHub --hub-content-type Model --query '{Models: HubContentSummaries[].{ModelId:HubContentName,Version:HubContentVersion}, NextToken: NextToken}' --output json
```
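The NextToken iteration mentioned above can be sketched as a small shell loop. This is a minimal sketch, not an official procedure: it assumes that `list-hub-contents` accepts a `--next-token` option and that jq is installed (as listed in the prerequisites). The function name `list_all_hub_models` is hypothetical.

```shell
# Hypothetical helper: print "ModelId<TAB>Version" for every model in a hub,
# requesting further pages for as long as the response carries a NextToken.
# Assumption: the CLI exposes the API's NextToken field as --next-token.
list_all_hub_models() {
  local hub_name="${1:-SageMakerPublicHub}"
  local next_token="" page
  while :; do
    if [ -n "$next_token" ]; then
      page=$(aws sagemaker list-hub-contents --hub-name "$hub_name" \
        --hub-content-type Model --next-token "$next_token" --output json) || return 1
    else
      page=$(aws sagemaker list-hub-contents --hub-name "$hub_name" \
        --hub-content-type Model --output json) || return 1
    fi
    # Print one line per model on this page.
    printf '%s\n' "$page" |
      jq -r '.HubContentSummaries[] | "\(.HubContentName)\t\(.HubContentVersion)"'
    # An absent NextToken means this was the last page.
    next_token=$(printf '%s\n' "$page" | jq -r '.NextToken // empty')
    [ -n "$next_token" ] || break
  done
}
```

With credentials configured, running `list_all_hub_models | grep deepseek` would be one way to narrow the full list down to a model family.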


```
export MODEL_ID="deepseek-llm-r1-distill-qwen-1-5b"
export MODEL_VERSION="2.0.4"
```

Set the model ID and cluster name that you selected in the following variables.

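Note

Check with your cluster administrator to make sure that permissions have been granted to this role or user. You can run aws sts get-caller-identity --query "Arn" to check which role or user you are using in your terminal.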


```
aws sagemaker list-clusters --output table

# Select the cluster name where you want to deploy the model.
export HYPERPOD_CLUSTER_NAME="<insert cluster name here>"

# Select the instance type that is relevant for your model deployment and exists within the selected cluster.
# List available instances in your HyperPod cluster
aws sagemaker describe-cluster --cluster-name=$HYPERPOD_CLUSTER_NAME --query "InstanceGroups[].{InstanceType:InstanceType,Count:CurrentCount}" --output table

# List supported instance types for the selected model
aws sagemaker describe-hub-content --hub-name SageMakerPublicHub --hub-content-type Model --hub-content-name "$MODEL_ID" --output json | jq -r '.HubContentDocument | fromjson | {Default: .DefaultInferenceInstanceType, Supported: .SupportedInferenceInstanceTypes}'

# Select an instance type from the cluster that is compatible with the model.
# Make sure that the selected instance is either the default or a supported instance type for the JumpStart model.
export INSTANCE_TYPE="<Instance_type_In_cluster>"
```
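The compatibility requirement in the comments above (the chosen instance type must exist in the cluster and be supported by the model) can be checked mechanically. The following is a minimal sketch; the helper name `is_supported_instance_type` is hypothetical, and the commented usage assumes the hub content document exposes `SupportedInferenceInstanceTypes` as shown in the preceding jq filter.

```shell
# Hypothetical helper: succeed only if the chosen instance type ($1) appears
# in the list of candidate types passed as the remaining arguments.
is_supported_instance_type() {
  local chosen="$1" candidate
  shift
  for candidate in "$@"; do
    [ "$candidate" = "$chosen" ] && return 0
  done
  return 1
}

# Example usage (requires AWS credentials, MODEL_ID, and INSTANCE_TYPE):
# supported=$(aws sagemaker describe-hub-content --hub-name SageMakerPublicHub \
#   --hub-content-type Model --hub-content-name "$MODEL_ID" --output json |
#   jq -r '.HubContentDocument | fromjson | .SupportedInferenceInstanceTypes[]')
# is_supported_instance_type "$INSTANCE_TYPE" $supported ||
#   echo "$INSTANCE_TYPE is not supported by $MODEL_ID" >&2
```

The same helper could be reused against the instance types reported by `describe-cluster`, so that both conditions are checked before the manifest is written.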

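Confirm with your cluster administrator which namespace you are permitted to use. The administrator should have created a hyperpod-inference service account in your namespace.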
```
export CLUSTER_NAMESPACE="default"
```

Set the names of the endpoint and the custom object to be created.


```
export SAGEMAKER_ENDPOINT_NAME="deepsek-qwen-1-5b-test"
```

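The following is an example of a JumpStart deployment for the deepseek-llm-r1-distill-qwen-1-5b model. Create a similar deployment YAML file based on the model that you selected in the preceding steps.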


```
cat << EOF > jumpstart_model.yaml
---
apiVersion: inference.sagemaker.aws.amazon.com/v1alpha1
kind: JumpStartModel
metadata:
  name: $SAGEMAKER_ENDPOINT_NAME
  namespace: $CLUSTER_NAMESPACE
spec:
  sageMakerEndpoint:
    name: $SAGEMAKER_ENDPOINT_NAME
  model:
    modelHubName: SageMakerPublicHub
    modelId: $MODEL_ID
    modelVersion: $MODEL_VERSION
  server:
    instanceType: $INSTANCE_TYPE
  metrics:
    enabled: true
  environmentVariables:
    - name: SAMPLE_ENV_VAR
      value: "sample_value"
  maxDeployTimeInSeconds: 1800
  autoScalingSpec:
    cloudWatchTrigger:
      name: "SageMaker-Invocations"
      namespace: "AWS/SageMaker"
      useCachedMetrics: false
      metricName: "Invocations"
      targetValue: 10
      minValue: 0.0
      metricCollectionPeriod: 30
      metricStat: "Sum"
      metricType: "Average"
      dimensions:
        - name: "EndpointName"
          value: "$SAGEMAKER_ENDPOINT_NAME"
        - name: "VariantName"
          value: "AllTraffic"
EOF
```
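Because the here-document delimiter (EOF) is unquoted, the shell substitutes each $VARIABLE when it writes jumpstart_model.yaml, and an unset variable silently becomes an empty field. A small guard run before the `cat` command can catch that early. This is a minimal sketch; the helper name `require_vars` is hypothetical.

```shell
# Hypothetical helper: fail if any named environment variable is unset or empty.
require_vars() {
  local name value missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage before generating the manifest:
# require_vars SAGEMAKER_ENDPOINT_NAME CLUSTER_NAMESPACE MODEL_ID \
#   MODEL_VERSION INSTANCE_TYPE || echo "set the variables above first" >&2
```

The helper reports every missing variable rather than stopping at the first one, so a single run shows everything that still needs to be exported.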

Deploy your model

Update your Kubernetes configuration and deploy your model

Configure kubectl to connect to the HyperPod cluster orchestrated by Amazon EKS.


```
export EKS_CLUSTER_NAME=$(aws --region $REGION sagemaker describe-cluster --cluster-name $HYPERPOD_CLUSTER_NAME \
  --query 'Orchestrator.Eks.ClusterArn' --output text | \
  cut -d'/' -f2)
aws eks update-kubeconfig --name $EKS_CLUSTER_NAME --region $REGION
```
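The `cut -d'/' -f2` step above works because an EKS cluster ARN ends in `cluster/<name>`. The sketch below performs the same extraction but first checks that the input looks like an EKS cluster ARN, which helps when the query returns something else (with `--output text`, the AWS CLI typically prints `None` for a missing field). The helper name `eks_name_from_arn` and the sample ARN are illustrative assumptions.

```shell
# Hypothetical helper: print the cluster name from an EKS cluster ARN of the
# form arn:<partition>:eks:<region>:<account>:cluster/<name>.
# Returns non-zero if the input does not match that shape.
eks_name_from_arn() {
  case "$1" in
    arn:*:eks:*:*:cluster/*)
      # Strip everything up to and including the last "/".
      printf '%s\n' "${1##*/}"
      ;;
    *)
      echo "not an EKS cluster ARN: $1" >&2
      return 1
      ;;
  esac
}

# Example with a made-up ARN:
# eks_name_from_arn "arn:aws:eks:us-west-2:111122223333:cluster/my-eks-cluster"
```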

Deploy your JumpStart model.
```
kubectl apply -f jumpstart_model.yaml
```

Monitor the status of your model deployment

Verify that the model deployed successfully.


```
kubectl describe JumpStartModel $SAGEMAKER_ENDPOINT_NAME -n $CLUSTER_NAMESPACE
```

Verify that the endpoint was created successfully.


```
aws sagemaker describe-endpoint --endpoint-name=$SAGEMAKER_ENDPOINT_NAME --output table
```
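Endpoint creation can take several minutes, so a single describe-endpoint call often shows an intermediate status. The following minimal sketch polls the EndpointStatus field until the endpoint is InService or Failed. The helper name `wait_for_endpoint` and its defaults are assumptions; 60 attempts at 30-second intervals were chosen only to match the maxDeployTimeInSeconds value of 1800 in the example manifest.

```shell
# Hypothetical helper: poll the endpoint status until it is InService,
# it fails, or the attempt budget runs out.
# Usage: wait_for_endpoint <endpoint-name> [max-attempts] [sleep-seconds]
# Returns 0 on InService, 1 on Failed, 2 on timeout.
wait_for_endpoint() {
  local name="$1" attempts="${2:-60}" delay="${3:-30}"
  local i=0 status
  while [ "$i" -lt "$attempts" ]; do
    # A failed call (for example, the endpoint does not exist yet) is
    # reported as "Pending" and retried.
    status=$(aws sagemaker describe-endpoint --endpoint-name "$name" \
      --query 'EndpointStatus' --output text 2>/dev/null) || status="Pending"
    echo "attempt $((i + 1)): $status"
    case "$status" in
      InService) return 0 ;;
      Failed) return 1 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  return 2
}

# Example usage:
# wait_for_endpoint "$SAGEMAKER_ENDPOINT_NAME" && echo "endpoint is ready"
```

The AWS CLI also provides a built-in waiter, `aws sagemaker wait endpoint-in-service`, which may be sufficient when per-attempt output and custom timeouts are not needed.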

Invoke your model endpoint. You can programmatically retrieve example payloads from the JumpStartModel object.


```
aws sagemaker-runtime invoke-endpoint \
  --endpoint-name $SAGEMAKER_ENDPOINT_NAME \
  --content-type "application/json" \
  --body '{"inputs": "What is AWS SageMaker?"}' \
  --region $REGION \
  --cli-binary-format raw-in-base64-out \
  /dev/stdout
```

Manage your deployment

When you no longer need your JumpStart model deployment, delete it.


```
kubectl delete JumpStartModel $SAGEMAKER_ENDPOINT_NAME -n $CLUSTER_NAMESPACE
```

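Troubleshooting

If your deployment isn't working as expected, use these debugging commands.

Check the status of the Kubernetes deployment. This command inspects the underlying Kubernetes deployment object that manages the pods running your model. Use it to troubleshoot pod scheduling, resource allocation, and container startup issues.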
```
kubectl describe deployment $SAGEMAKER_ENDPOINT_NAME -n $CLUSTER_NAMESPACE
```
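Check the status of your JumpStart model resource. This command examines the custom JumpStartModel resource that manages the high-level model configuration and deployment lifecycle. Use it to troubleshoot model-specific issues such as configuration errors or SageMaker AI endpoint creation problems.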
```
kubectl describe JumpStartModel $SAGEMAKER_ENDPOINT_NAME -n $CLUSTER_NAMESPACE
```
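Check the status of all Kubernetes objects. This command provides a comprehensive overview of all related Kubernetes resources in your namespace. Use it for a quick health check to see the overall state of the pods, services, deployments, and custom resources associated with your model deployment.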
```
kubectl get pods,svc,deployment,JumpStartModel,sagemakerendpointregistration -n $CLUSTER_NAMESPACE
```
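When you need to share diagnostics with a cluster administrator, it can help to capture the output of all three commands in one file. The following is a minimal sketch that only wraps the commands shown above; the helper name `collect_debug_info` and the default output file name are hypothetical.

```shell
# Hypothetical helper: run the three troubleshooting commands and save their
# combined output (stdout and stderr) to a file, then print the file name.
# Usage: collect_debug_info <resource-name> <namespace> [output-file]
collect_debug_info() {
  local name="$1" namespace="$2" out_file="${3:-debug-$1.txt}"
  {
    echo "== deployment =="
    kubectl describe deployment "$name" -n "$namespace"
    echo "== JumpStartModel =="
    kubectl describe JumpStartModel "$name" -n "$namespace"
    echo "== namespace overview =="
    kubectl get pods,svc,deployment,JumpStartModel,sagemakerendpointregistration -n "$namespace"
  } > "$out_file" 2>&1
  echo "$out_file"
}

# Example usage:
# collect_debug_info "$SAGEMAKER_ENDPOINT_NAME" "$CLUSTER_NAMESPACE"
```

Errors from individual commands are written to the same file instead of stopping the helper, so a partially failed deployment still produces a complete report.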
