This content is a machine translation of the English original. If there is any ambiguity or inconsistency between the two, the English version prevails.
Using YuniKorn as a custom scheduler for Apache Spark on Amazon EMR on EKS
With Amazon EMR on EKS, you can use the Spark operator or spark-submit to run Spark jobs with Kubernetes custom schedulers. This tutorial covers how to run Spark jobs with the YuniKorn scheduler on a custom queue, using gang scheduling.
Overview
Apache YuniKorn
Create your cluster and set up YuniKorn
Use the following steps to deploy an Amazon EKS cluster. You can change the AWS Region (`region`) and Availability Zones (`availabilityZones`).
- Define an Amazon EKS cluster:

  ```sh
  cat <<EOF >eks-cluster.yaml
  ---
  apiVersion: eksctl.io/v1alpha5
  kind: ClusterConfig
  metadata:
    name: emr-eks-cluster
    region: eu-west-1
  vpc:
    clusterEndpoints:
      publicAccess: true
      privateAccess: true
  iam:
    withOIDC: true
  nodeGroups:
    - name: spark-jobs
      labels: { app: spark }
      instanceType: m5.xlarge
      desiredCapacity: 2
      minSize: 2
      maxSize: 3
      availabilityZones: ["eu-west-1a"]
  EOF
  ```

- Create the cluster:

  ```sh
  eksctl create cluster -f eks-cluster.yaml
  ```
- Create the namespace `spark-job` in which you will run your Spark jobs:

  ```sh
  kubectl create namespace spark-job
  ```
- Next, create a Kubernetes role and role binding. These are required by the service account that the Spark job run uses.
- Define the service account, role, and role binding for the Spark jobs:

  ```sh
  cat <<EOF >emr-job-execution-rbac.yaml
  ---
  apiVersion: v1
  kind: ServiceAccount
  metadata:
    name: spark-sa
    namespace: spark-job
  automountServiceAccountToken: false
  ---
  apiVersion: rbac.authorization.k8s.io/v1
  kind: Role
  metadata:
    name: spark-role
    namespace: spark-job
  rules:
    - apiGroups: ["", "batch", "extensions"]
      resources: ["configmaps", "serviceaccounts", "events", "pods", "pods/exec", "pods/log", "pods/portforward", "secrets", "services", "persistentvolumeclaims"]
      verbs: ["create", "delete", "get", "list", "patch", "update", "watch"]
  ---
  apiVersion: rbac.authorization.k8s.io/v1
  kind: RoleBinding
  metadata:
    name: spark-sa-rb
    namespace: spark-job
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: Role
    name: spark-role
  subjects:
    - kind: ServiceAccount
      name: spark-sa
      namespace: spark-job
  EOF
  ```
- Apply the Kubernetes role and role binding definitions with the following command:

  ```sh
  kubectl apply -f emr-job-execution-rbac.yaml
  ```
Install and configure YuniKorn
- Create the namespace `yunikorn` in which to deploy the YuniKorn scheduler, using the following kubectl command:

  ```sh
  kubectl create namespace yunikorn
  ```
- To install the scheduler, run the following Helm commands:

  ```sh
  helm repo add yunikorn https://apache.github.io/yunikorn-release
  helm repo update
  helm install yunikorn yunikorn/yunikorn --namespace yunikorn
  ```
Run a Spark application with the YuniKorn scheduler and the Spark operator
- If you haven't already done so, complete the steps in the following section to get set up:
- When you run the `helm install spark-operator-demo` command, include the following arguments:

  ```sh
  --set batchScheduler.enable=true --set webhook.enable=true
  ```
- Create a `SparkApplication` definition file `spark-pi.yaml`. To use YuniKorn as a scheduler for your jobs, you must add certain annotations and labels to your application definition. The annotations and labels specify the queue for your job and the scheduling strategy that you want to use.

  In the following example, the annotation `schedulingPolicyParameters` sets up gang scheduling for the application. The example then creates task groups, or "gangs" of tasks, to specify the minimum capacity that must be available before scheduling the pods to start the job run. Finally, it specifies in the task group definition to use node groups with the `"app": "spark"` label, as defined in the Create your cluster and set up YuniKorn section.

  ```yaml
  apiVersion: "sparkoperator.k8s.io/v1beta2"
  kind: SparkApplication
  metadata:
    name: spark-pi
    namespace: spark-job
  spec:
    type: Scala
    mode: cluster
    image: "895885662937.dkr.ecr.us-west-2.amazonaws.com/spark/emr-6.10.0:latest"
    imagePullPolicy: Always
    mainClass: org.apache.spark.examples.SparkPi
    mainApplicationFile: "local:///usr/lib/spark/examples/jars/spark-examples.jar"
    sparkVersion: "3.3.1"
    restartPolicy:
      type: Never
    volumes:
      - name: "test-volume"
        hostPath:
          path: "/tmp"
          type: Directory
    driver:
      cores: 1
      coreLimit: "1200m"
      memory: "512m"
      labels:
        version: 3.3.1
      annotations:
        yunikorn.apache.org/schedulingPolicyParameters: "placeholderTimeoutSeconds=30 gangSchedulingStyle=Hard"
        yunikorn.apache.org/task-group-name: "spark-driver"
        yunikorn.apache.org/task-groups: |-
          [{
            "name": "spark-driver",
            "minMember": 1,
            "minResource": {
              "cpu": "1200m",
              "memory": "1Gi"
            },
            "nodeSelector": {
              "app": "spark"
            }
          }, {
            "name": "spark-executor",
            "minMember": 1,
            "minResource": {
              "cpu": "1200m",
              "memory": "1Gi"
            },
            "nodeSelector": {
              "app": "spark"
            }
          }]
      serviceAccount: spark-sa
      volumeMounts:
        - name: "test-volume"
          mountPath: "/tmp"
    executor:
      cores: 1
      instances: 1
      memory: "512m"
      labels:
        version: 3.3.1
      annotations:
        yunikorn.apache.org/task-group-name: "spark-executor"
      volumeMounts:
        - name: "test-volume"
          mountPath: "/tmp"
  ```

- Submit the Spark application with the following command. This also creates a `SparkApplication` object named `spark-pi`:

  ```sh
  kubectl apply -f spark-pi.yaml
  ```
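Because the example uses `gangSchedulingStyle=Hard`, YuniKorn first creates placeholder pods for the full set of task groups and only starts the real driver and executor pods once the gang's minimum capacity fits. As a rough sketch (not part of the tutorial), the capacity those task-group definitions reserve can be totalled from the annotation's JSON; the helper functions `millicores` and `gib` below are assumptions of this sketch, not YuniKorn APIs:

```python
import json

# The task-groups annotation value from spark-pi.yaml above, as JSON.
task_groups = json.loads('''[
  {"name": "spark-driver", "minMember": 1,
   "minResource": {"cpu": "1200m", "memory": "1Gi"},
   "nodeSelector": {"app": "spark"}},
  {"name": "spark-executor", "minMember": 1,
   "minResource": {"cpu": "1200m", "memory": "1Gi"},
   "nodeSelector": {"app": "spark"}}
]''')

def millicores(cpu: str) -> int:
    # Kubernetes CPU quantity: "1200m" -> 1200 millicores, "1" -> 1000.
    return int(cpu[:-1]) if cpu.endswith("m") else int(cpu) * 1000

def gib(mem: str) -> int:
    # Only the "Gi" suffix used in this tutorial is handled here.
    assert mem.endswith("Gi")
    return int(mem[:-2])

total_cpu = sum(g["minMember"] * millicores(g["minResource"]["cpu"]) for g in task_groups)
total_mem = sum(g["minMember"] * gib(g["minResource"]["memory"]) for g in task_groups)
print(total_cpu, total_mem)  # 2400 2
```

With one driver and one executor placeholder, the gang cannot be placed unless 2400 millicores and 2 GiB are free on nodes matching the `"app": "spark"` selector.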
- Check the events for the `SparkApplication` object with the following command:

  ```sh
  kubectl describe sparkapplication spark-pi --namespace spark-job
  ```
The first pod events show that YuniKorn has scheduled the pods:

```
Type     Reason             Age    From      Message
----     ------             ----   ----      -------
Normal   Scheduling         3m12s  yunikorn  spark-operator/org-apache-spark-examples-sparkpi-2a777a88b98b8a95-driver is queued and waiting for allocation
Normal   GangScheduling     3m12s  yunikorn  Pod belongs to the taskGroup spark-driver, it will be scheduled as a gang member
Normal   Scheduled          3m10s  yunikorn  Successfully assigned spark
Normal   PodBindSuccessful  3m10s  yunikorn  Pod spark-operator/
Normal   TaskCompleted      2m3s   yunikorn  Task spark-operator/
Normal   Pulling            3m10s  kubelet   Pulling
```
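If you want to script this verification, one option is to scan the describe output for YuniKorn's event reasons. A minimal sketch, assuming the abbreviated event format shown above (the sample text here is hypothetical):

```python
# Minimal sketch: confirm from `kubectl describe` event lines that YuniKorn
# (not the default scheduler) placed the pod, and that gang scheduling ran.
events = """\
Normal  Scheduling      3m12s  yunikorn  driver is queued and waiting for allocation
Normal  GangScheduling  3m12s  yunikorn  Pod belongs to the taskGroup spark-driver
Normal  Pulling         3m10s  kubelet   Pulling
"""

lines = [line.split() for line in events.splitlines() if line.strip()]
sources = {fields[3] for fields in lines}          # "From" column
gang_scheduled = any(fields[1] == "GangScheduling" for fields in lines)
print("yunikorn" in sources, gang_scheduled)  # True True
```

The column positions are an assumption of this sketch; real `kubectl describe` output may wrap long messages, so a production check should parse events via the Kubernetes API instead.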
Run a Spark application with the YuniKorn scheduler and spark-submit
- First, complete the steps in the Setting up spark-submit for Amazon EMR on EKS section.
- Set the values for the following environment variables:

  ```sh
  export SPARK_HOME=spark-home
  export MASTER_URL=k8s://Amazon-EKS-cluster-endpoint
  ```
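Note that `MASTER_URL` is the `k8s://` scheme prefixed to your cluster's HTTPS API server endpoint. A minimal sketch of that composition, using a purely hypothetical endpoint value:

```python
# Hypothetical endpoint for illustration only; substitute your cluster's
# actual API server endpoint (for example, from `aws eks describe-cluster`).
endpoint = "https://EXAMPLE1234567890.gr7.eu-west-1.eks.amazonaws.com"

# Spark's Kubernetes master URL keeps the https:// part after the k8s:// scheme.
master_url = "k8s://" + endpoint
print(master_url)  # k8s://https://EXAMPLE1234567890.gr7.eu-west-1.eks.amazonaws.com
```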
- Submit the Spark application with the following command.

  In the following example, the annotation `schedulingPolicyParameters` sets up gang scheduling for the application. The example then creates task groups, or "gangs" of tasks, to specify the minimum capacity that must be available before scheduling the pods to start the job run. Finally, it specifies in the task group definition to use node groups with the `"app": "spark"` label, as defined in the Create your cluster and set up YuniKorn section.

  ```sh
  $SPARK_HOME/bin/spark-submit \
   --class org.apache.spark.examples.SparkPi \
   --master $MASTER_URL \
   --conf spark.kubernetes.container.image=895885662937.dkr.ecr.us-west-2.amazonaws.com/spark/emr-6.10.0:latest \
   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-sa \
   --deploy-mode cluster \
   --conf spark.kubernetes.namespace=spark-job \
   --conf spark.kubernetes.scheduler.name=yunikorn \
   --conf spark.kubernetes.driver.annotation.yunikorn.apache.org/schedulingPolicyParameters="placeholderTimeoutSeconds=30 gangSchedulingStyle=Hard" \
   --conf spark.kubernetes.driver.annotation.yunikorn.apache.org/task-group-name="spark-driver" \
   --conf spark.kubernetes.executor.annotation.yunikorn.apache.org/task-group-name="spark-executor" \
   --conf spark.kubernetes.driver.annotation.yunikorn.apache.org/task-groups='[{
       "name": "spark-driver",
       "minMember": 1,
       "minResource": { "cpu": "1200m", "memory": "1Gi" },
       "nodeSelector": { "app": "spark" }
     }, {
       "name": "spark-executor",
       "minMember": 1,
       "minResource": { "cpu": "1200m", "memory": "1Gi" },
       "nodeSelector": { "app": "spark" }
     }]' \
   local:///usr/lib/spark/examples/jars/spark-examples.jar 20
  ```

- Check the events for the Spark driver pod with the following command:

  ```sh
  kubectl describe pod spark-driver-pod --namespace spark-job
  ```

  The first pod events show that YuniKorn has scheduled the pod:
```
Type     Reason             Age    From      Message
----     ------             ----   ----      -------
Normal   Scheduling         3m12s  yunikorn  spark-operator/org-apache-spark-examples-sparkpi-2a777a88b98b8a95-driver is queued and waiting for allocation
Normal   GangScheduling     3m12s  yunikorn  Pod belongs to the taskGroup spark-driver, it will be scheduled as a gang member
Normal   Scheduled          3m10s  yunikorn  Successfully assigned spark
Normal   PodBindSuccessful  3m10s  yunikorn  Pod spark-operator/
Normal   TaskCompleted      2m3s   yunikorn  Task spark-operator/
Normal   Pulling            3m10s  kubelet   Pulling
```
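Both walkthroughs pass the same `schedulingPolicyParameters` value, whether as a pod annotation in the `SparkApplication` or via `spark.kubernetes.driver.annotation.*`. A minimal sketch of how that annotation decomposes, since it is a space-separated list of key=value pairs:

```python
# The schedulingPolicyParameters annotation value used in both examples above.
params = "placeholderTimeoutSeconds=30 gangSchedulingStyle=Hard"

# Space-separated key=value pairs -> dict.
parsed = dict(pair.split("=", 1) for pair in params.split())
print(parsed)  # {'placeholderTimeoutSeconds': '30', 'gangSchedulingStyle': 'Hard'}
```

Here `placeholderTimeoutSeconds=30` bounds how long placeholder pods wait for the gang's capacity, and `gangSchedulingStyle=Hard` means the job does not fall back to scheduling pods individually if the gang cannot be placed.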