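Deploy models from JumpStart using kubectl

The following steps show you how to deploy a JumpStart model to a HyperPod cluster using kubectl.

The following instructions contain code cells and commands designed to run in a terminal. Before you run these commands, make sure that you have configured your environment with AWS credentials.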

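Prerequisites

Before you begin, verify that you have:

Set up inference capabilities on your Amazon SageMaker HyperPod cluster. For more information, see Setting up your HyperPod clusters for model deployment.
Installed the kubectl utility and configured jq in your terminal.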

Setup and configuration

Select your Region.
```
export REGION=<region>
```
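View all SageMaker public hub models and HyperPod clusters.

Select a JumpStartModel from the JumpStart public hub (SageMakerPublicHub). The hub contains a large number of models, so use the NextToken value returned by each call to iterate through all available models in the public hub.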


```
aws sagemaker list-hub-contents --hub-name SageMakerPublicHub --hub-content-type Model --query '{Models: HubContentSummaries[].{ModelId:HubContentName,Version:HubContentVersion}, NextToken: NextToken}' --output json
```

```
export MODEL_ID="deepseek-llm-r1-distill-qwen-1-5b"
export MODEL_VERSION="2.0.4"
```
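The NextToken iteration described above can be sketched as a small loop. This is a minimal sketch, not an official procedure: `list_all_hub_models` is a hypothetical helper name, and it assumes the CLI accepts `--next-token` for `list-hub-contents` and that jq is installed, as listed in the prerequisites.

```shell
# Sketch: page through the public hub until no NextToken is returned.
# list_all_hub_models is a hypothetical helper name, not part of the AWS CLI.
list_all_hub_models() {
  local token="" page
  while :; do
    if [ -n "$token" ]; then
      # Assumption: --next-token is accepted by list-hub-contents.
      page=$(aws sagemaker list-hub-contents \
        --hub-name SageMakerPublicHub --hub-content-type Model \
        --next-token "$token" --output json) || return 1
    else
      page=$(aws sagemaker list-hub-contents \
        --hub-name SageMakerPublicHub --hub-content-type Model \
        --output json) || return 1
    fi
    # Print one "model-id<TAB>version" line per model on this page.
    printf '%s\n' "$page" |
      jq -r '.HubContentSummaries[] | [.HubContentName, .HubContentVersion] | @tsv'
    # A missing or null NextToken means this was the last page.
    token=$(printf '%s\n' "$page" | jq -r '.NextToken // empty')
    [ -n "$token" ] || break
  done
}
```

Calling `list_all_hub_models | grep deepseek` would then narrow the full listing to candidate model IDs.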

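Set the model ID and cluster name that you selected in the variables below.

Note

Check with your cluster admin to make sure that permissions are granted for this role or user. You can run aws sts get-caller-identity --query "Arn" to check which role or user you are using in your terminal.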


```
aws sagemaker list-clusters --output table

# Select the cluster name where you want to deploy the model.
export HYPERPOD_CLUSTER_NAME="<insert cluster name here>"

# Select the instance that is relevant for your model deployment and exists within the selected cluster.
# List available instances in your HyperPod cluster
aws sagemaker describe-cluster --cluster-name=$HYPERPOD_CLUSTER_NAME --query "InstanceGroups[].{InstanceType:InstanceType,Count:CurrentCount}" --output table

# List supported instance types for the selected model
aws sagemaker describe-hub-content --hub-name SageMakerPublicHub --hub-content-type Model --hub-content-name "$MODEL_ID" --output json | jq -r '.HubContentDocument | fromjson | {Default: .DefaultInferenceInstanceType, Supported: .SupportedInferenceInstanceTypes}'


# Select an instance type from the cluster that is compatible with the model.
# Make sure that the selected instance is either the default or a supported instance type for the JumpStart model.
export INSTANCE_TYPE="<Instance_type_In_cluster>"
```
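The compatibility rule in the comments above (the instance type must exist in the cluster and be supported by the model) can be checked mechanically. The following is a sketch under assumptions: `is_in_list` and `check_instance_type` are hypothetical helpers, and the two lists are passed in as whitespace-separated strings that you collect yourself.

```shell
# is_in_list VALUE ITEM... : succeeds if VALUE equals one of the items.
is_in_list() {
  local needle="$1" item
  shift
  for item in "$@"; do
    [ "$item" = "$needle" ] && return 0
  done
  return 1
}

# check_instance_type TYPE "CLUSTER_TYPES" "MODEL_TYPES"
# Each list is a whitespace-separated string of instance types.
check_instance_type() {
  local itype="$1"
  # Word splitting on the unquoted lists is intentional here.
  if ! is_in_list "$itype" $2; then
    echo "$itype is not present in the cluster" >&2
    return 1
  fi
  if ! is_in_list "$itype" $3; then
    echo "$itype is not supported by the model" >&2
    return 1
  fi
  echo "$itype is available in the cluster and supported by the model"
}

# One possible way to fill the lists (assumed, adapt the queries as needed):
#   CLUSTER_TYPES=$(aws sagemaker describe-cluster --cluster-name "$HYPERPOD_CLUSTER_NAME" \
#     --query "InstanceGroups[].InstanceType" --output text)
#   MODEL_TYPES=$(aws sagemaker describe-hub-content --hub-name SageMakerPublicHub \
#     --hub-content-type Model --hub-content-name "$MODEL_ID" --output json |
#     jq -r '.HubContentDocument | fromjson | .SupportedInferenceInstanceTypes | join(" ")')
#   check_instance_type "$INSTANCE_TYPE" "$CLUSTER_TYPES" "$MODEL_TYPES"
```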

Confirm with your cluster admin which namespace you are permitted to use. The admin should have created a hyperpod-inference service account in your namespace.
```
export CLUSTER_NAMESPACE="default"
```

Set a name for the endpoint and the custom object to be created.


```
export SAGEMAKER_ENDPOINT_NAME="deepseek-qwen-1-5b-test"
```
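The deployment manifest in the next step is written with an unquoted heredoc, so the shell substitutes these variables at the moment the file is created, and an unset variable silently becomes an empty field. A minimal sketch of a pre-flight check follows; `require_vars` is a hypothetical helper, not part of any AWS tooling.

```shell
# require_vars NAME... : reports every listed variable that is unset or empty
# and returns non-zero if at least one is missing.
require_vars() {
  local name value missing=0
  for name in "$@"; do
    # Portable indirect lookup: copy the variable named by $name into value.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_vars REGION MODEL_ID MODEL_VERSION HYPERPOD_CLUSTER_NAME INSTANCE_TYPE CLUSTER_NAMESPACE SAGEMAKER_ENDPOINT_NAME` should print nothing and succeed once every step above has been completed.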

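The following is an example deployment of the deepseek-llm-r1-distill-qwen-1-5b model from JumpStart. Create a similar deployment YAML file based on the model that you selected in the steps above.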


```
cat << EOF > jumpstart_model.yaml
---
apiVersion: inference.sagemaker.aws.amazon.com/v1alpha1
kind: JumpStartModel
metadata:
  name: $SAGEMAKER_ENDPOINT_NAME
  namespace: $CLUSTER_NAMESPACE
spec:
  sageMakerEndpoint:
    name: $SAGEMAKER_ENDPOINT_NAME
  model:
    modelHubName: SageMakerPublicHub
    modelId: $MODEL_ID
    modelVersion: $MODEL_VERSION
  server:
    instanceType: $INSTANCE_TYPE
  metrics:
    enabled: true
  environmentVariables:
    - name: SAMPLE_ENV_VAR
      value: "sample_value"
  maxDeployTimeInSeconds: 1800
  autoScalingSpec:
    cloudWatchTrigger:
      name: "SageMaker-Invocations"
      namespace: "AWS/SageMaker"
      useCachedMetrics: false
      metricName: "Invocations"
      targetValue: 10
      minValue: 0.0
      metricCollectionPeriod: 30
      metricStat: "Sum"
      metricType: "Average"
      dimensions:
        - name: "EndpointName"
          value: "$SAGEMAKER_ENDPOINT_NAME"
        - name: "VariantName"
          value: "AllTraffic"
EOF
```

Deploy your model

Update your Kubernetes configuration and deploy your model.

Configure kubectl to connect to the HyperPod cluster orchestrated by Amazon EKS.


```
export EKS_CLUSTER_NAME=$(aws --region "$REGION" sagemaker describe-cluster --cluster-name "$HYPERPOD_CLUSTER_NAME" \
  --query 'Orchestrator.Eks.ClusterArn' --output text | \
  cut -d'/' -f2)
aws eks update-kubeconfig --name "$EKS_CLUSTER_NAME" --region "$REGION"
```
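To see what the `cut` step above does, the following sketch applies it to a made-up ARN. The account ID and cluster name are placeholders chosen for illustration; the real value comes from `describe-cluster`.

```shell
# Hypothetical ARN used only for illustration.
EXAMPLE_ARN="arn:aws:eks:us-west-2:111122223333:cluster/my-eks-cluster"

# Same approach as the command above: split on "/" and keep the second field.
NAME_FROM_CUT=$(printf '%s\n' "$EXAMPLE_ARN" | cut -d'/' -f2)

# Alternative using shell parameter expansion: drop everything up to the last "/".
NAME_FROM_EXPANSION="${EXAMPLE_ARN##*/}"

echo "$NAME_FROM_CUT"        # prints my-eks-cluster
echo "$NAME_FROM_EXPANSION"  # prints my-eks-cluster
```

If `EKS_CLUSTER_NAME` ends up empty or as the literal text None, the query most likely found no EKS orchestrator ARN for that cluster, so it is worth echoing the variable before running `update-kubeconfig`.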

Deploy your JumpStart model.
```
kubectl apply -f jumpstart_model.yaml
```

Monitor the status of your model deployment

Verify that the model is successfully deployed.


```
kubectl describe JumpStartModel $SAGEMAKER_ENDPOINT_NAME -n $CLUSTER_NAMESPACE
```

Verify that the endpoint is successfully created.


```
aws sagemaker describe-endpoint --endpoint-name=$SAGEMAKER_ENDPOINT_NAME --output table
```
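Endpoint creation can take several minutes, so a single `describe-endpoint` call often shows a transitional status. One way to wait is a polling loop like the sketch below. `wait_for_endpoint` is a hypothetical helper; its defaults (30 seconds between polls, 60 attempts) are assumptions chosen to match the 1800-second `maxDeployTimeInSeconds` in the example manifest.

```shell
# wait_for_endpoint NAME [INTERVAL_SECONDS] [MAX_ATTEMPTS]
# Polls describe-endpoint until the endpoint is InService (returns 0),
# reports Failed (returns 1), or the attempt budget runs out (returns 2).
wait_for_endpoint() {
  local name="$1" interval="${2:-30}" attempts="${3:-60}" status i=0
  while [ "$i" -lt "$attempts" ]; do
    # Treat a failed call (for example, endpoint not created yet) as NotFound.
    status=$(aws sagemaker describe-endpoint --endpoint-name "$name" \
      --query 'EndpointStatus' --output text 2>/dev/null) || status="NotFound"
    echo "attempt $((i + 1)): $status"
    case "$status" in
      InService) return 0 ;;
      Failed)    return 1 ;;
    esac
    sleep "$interval"
    i=$((i + 1))
  done
  echo "timed out waiting for $name" >&2
  return 2
}
```

A typical call would be `wait_for_endpoint "$SAGEMAKER_ENDPOINT_NAME"`. The AWS CLI also ships a built-in waiter, `aws sagemaker wait endpoint-in-service --endpoint-name <name>`, which may be sufficient if you do not need per-attempt output.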

Invoke your model endpoint. You can programmatically retrieve example payloads from the JumpStartModel object.


```
aws sagemaker-runtime invoke-endpoint \
  --endpoint-name $SAGEMAKER_ENDPOINT_NAME \
  --content-type "application/json" \
  --body '{"inputs": "What is AWS SageMaker?"}' \
  --region $REGION \
  --cli-binary-format raw-in-base64-out \
  /dev/stdout
```

Manage your deployment

Delete your JumpStart model deployment once you no longer need it.


```
kubectl delete JumpStartModel $SAGEMAKER_ENDPOINT_NAME -n $CLUSTER_NAMESPACE
```

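Troubleshooting

Use these debugging commands if your deployment isn't working as expected.

Check the status of the Kubernetes deployment. This command inspects the underlying Kubernetes Deployment object that manages the pods running your model. Use it to troubleshoot pod scheduling, resource allocation, and container startup issues.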
```
kubectl describe deployment $SAGEMAKER_ENDPOINT_NAME -n $CLUSTER_NAMESPACE
```
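Check the status of your JumpStart model resource. This command inspects the custom JumpStartModel resource that manages the high-level model configuration and deployment lifecycle. Use it to troubleshoot model-specific issues such as configuration errors or SageMaker AI endpoint creation problems.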
```
kubectl describe JumpStartModel $SAGEMAKER_ENDPOINT_NAME -n $CLUSTER_NAMESPACE
```
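Check the status of all Kubernetes objects. This command gives a comprehensive overview of all related Kubernetes resources in your namespace. Use it for a quick health check of the pods, services, deployments, and custom resources associated with your model deployment.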
```
kubectl get pods,svc,deployment,JumpStartModel,sagemakerendpointregistration -n $CLUSTER_NAMESPACE
```
