使用 Amazon EC2 自動擴展群組建立叢集基礎設施 - AWS 截止日期雲

本文為英文版的機器翻譯版本,如內容有任何歧義或不一致之處,概以英文版為準。

使用 Amazon EC2 自動擴展群組建立叢集基礎設施

本節說明如何建立 Amazon EC2 Auto Scaling 叢集。

使用以下 AWS CloudFormation YAML範本建立 Amazon EC2 Auto Scaling (Auto Scaling) 群組、具有兩個子網路的 Amazon 虛擬私有雲端 (AmazonVPC)、一個執行個體設定檔和一個執行個體存取角色。若要在子網路中使用「自動調整」(Auto Scaling) 啟動執行個體,

您應該檢閱並更新執行個體類型清單,以符合您的彩現需求。

若要建立 Amazon EC2 Auto Scaling 叢集

  1. 請在以下位置開啟 AWS CloudFormation 主控台。 https://console.aws.amazon.com/cloudformation

  2. 建立具有參數Farm IDFleet ID、和的 CloudFormation 範本AMI ID

    AWSTemplateFormatVersion: 2010-09-09 Description: Amazon Deadline Cloud customer-managed fleet Parameters: FarmId: Type: String Description: Farm ID FleetId: Type: String Description: Fleet ID AMIId: Type: String Description: AMI ID for launching Workers Resources: deadlineVPC: Type: 'AWS::EC2::VPC' Properties: CidrBlock: 100.100.0.0/16 deadlineWorkerSecurityGroup: Type: 'AWS::EC2::SecurityGroup' Properties: GroupDescription: !Join - ' ' - - Security Group created for deadline workers in fleet - !Ref FleetId GroupName: !Join - '' - - deadlineWorkerSecurityGroup- - !Ref FleetId SecurityGroupEgress: - CidrIp: 0.0.0.0/0 IpProtocol: '-1' SecurityGroupIngress: [] VpcId: !Ref deadlineVPC deadlineIGW: Type: 'AWS::EC2::InternetGateway' Properties: {} deadlineVPCGatewayAttachment: Type: 'AWS::EC2::VPCGatewayAttachment' Properties: VpcId: !Ref deadlineVPC InternetGatewayId: !Ref deadlineIGW deadlinePublicRouteTable: Type: 'AWS::EC2::RouteTable' Properties: VpcId: !Ref deadlineVPC deadlinePublicRoute: Type: 'AWS::EC2::Route' Properties: RouteTableId: !Ref deadlinePublicRouteTable DestinationCidrBlock: 0.0.0.0/0 GatewayId: !Ref deadlineIGW DependsOn: - deadlineIGW - deadlineVPCGatewayAttachment deadlinePublicSubnet0: Type: 'AWS::EC2::Subnet' Properties: VpcId: !Ref deadlineVPC CidrBlock: 100.100.16.0/22 AvailabilityZone: !Join - '' - - !Ref 'AWS::Region' - a deadlineSubnetRouteTableAssociation0: Type: 'AWS::EC2::SubnetRouteTableAssociation' Properties: RouteTableId: !Ref deadlinePublicRouteTable SubnetId: !Ref deadlinePublicSubnet0 deadlinePublicSubnet1: Type: 'AWS::EC2::Subnet' Properties: VpcId: !Ref deadlineVPC CidrBlock: 100.100.20.0/22 AvailabilityZone: !Join - '' - - !Ref 'AWS::Region' - c deadlineSubnetRouteTableAssociation1: Type: 'AWS::EC2::SubnetRouteTableAssociation' Properties: RouteTableId: !Ref deadlinePublicRouteTable SubnetId: !Ref deadlinePublicSubnet1 deadlineInstanceAccessAccessRole: Type: 'AWS::IAM::Role' Properties: RoleName: !Join - '-' - - deadline - InstanceAccess - !Ref FleetId AssumeRolePolicyDocument: Statement: - Effect: Allow Principal: Service: ec2.amazonaws.com Action: - 'sts:AssumeRole' Path: / ManagedPolicyArns: - 'arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy' - 'arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore' - 'arn:aws:iam::aws:policy/AWSDeadlineCloud-WorkerHost' deadlineInstanceProfile: Type: 'AWS::IAM::InstanceProfile' Properties: Path: / Roles: - !Ref deadlineInstanceAccessAccessRole deadlineLaunchTemplate: Type: 'AWS::EC2::LaunchTemplate' Properties: LaunchTemplateName: !Join - '' - - deadline-LT- - !Ref FleetId LaunchTemplateData: NetworkInterfaces: - DeviceIndex: 0 AssociatePublicIpAddress: true Groups: - !Ref deadlineWorkerSecurityGroup DeleteOnTermination: true ImageId: !Ref AMIId InstanceInitiatedShutdownBehavior: terminate IamInstanceProfile: Arn: !GetAtt - deadlineInstanceProfile - Arn MetadataOptions: HttpTokens: required HttpEndpoint: enabled deadlineAutoScalingGroup: Type: 'AWS::AutoScaling::AutoScalingGroup' Properties: AutoScalingGroupName: !Join - '' - - deadline-ASG-autoscalable- - !Ref FleetId MinSize: 0 MaxSize: 10 VPCZoneIdentifier: - !Ref deadlinePublicSubnet0 - !Ref deadlinePublicSubnet1 NewInstancesProtectedFromScaleIn: true MixedInstancesPolicy: InstancesDistribution: OnDemandBaseCapacity: 0 OnDemandPercentageAboveBaseCapacity: 0 SpotAllocationStrategy: capacity-optimized OnDemandAllocationStrategy: lowest-price LaunchTemplate: LaunchTemplateSpecification: LaunchTemplateId: !Ref deadlineLaunchTemplate Version: !GetAtt - deadlineLaunchTemplate - LatestVersionNumber Overrides: - InstanceType: m5.large - InstanceType: m5d.large - InstanceType: m5a.large - InstanceType: m5ad.large - InstanceType: m5n.large - InstanceType: m5dn.large - InstanceType: m4.large - InstanceType: m3.large - InstanceType: r5.large - InstanceType: r5d.large - InstanceType: r5a.large - InstanceType: r5ad.large - InstanceType: r5n.large - InstanceType: r5dn.large - InstanceType: r4.large MetricsCollection: - Granularity: 1Minute Metrics: - GroupMinSize - GroupMaxSize - GroupDesiredCapacity - GroupInServiceInstances - GroupTotalInstances - GroupInServiceCapacity - GroupTotalCapacity
  3. 建立IAM角色之後,您需要確認下列事項:

    • 附加至工作者 Amazon EC2 執行個體之IAM角色的登入資料可供該工作者上執行的所有程序使用,其中包括任務。Worker 應具有最少的操作權限:deadline:CreateWorkerdeadline:AssumeFleetRoleForWorker.

    • Worker 代理程式會取得佇列角色的認證,並設定它們以供執行工作使用。Amazon EC2 執行個體設定檔角色不應包含任務所需的許可。

使用截止日期雲端擴展建議功能自動擴展 Amazon EC2 叢集

截止日期雲端利用 Amazon EC2 Auto Scaling (Auto Scaling) 群組自動擴展 Amazon EC2 客戶管理的叢集 (CMF)。您必須設定叢集模式,並在帳戶中部署所需的基礎結構,才能讓叢集 auto 擴充。您部署的基礎架構將適用於所有艦隊,因此您只需設置一次即可。

基本工作流程是:您將叢集模式設定為 auto 調整規模,然後在建議的叢集大小變更時 (其中一個 EventBridge 事件包含叢集 ID、建議的叢集大小和其他中繼資料),Parate Cloud 就會傳送該叢集的事件。您將有一個 EventBridge 規則來篩選相關事件,並讓 Lambda 使用它們。Lambda 將與 Amazon 自 EC2 Auto Scaling 集成AutoScalingGroup以自動擴展 Amazon EC2 機隊。

將車隊模式設定為 EVENT_BASED_AUTO_SCALING

將您的叢集模式設定為EVENT_BASED_AUTO_SCALING。您可以使用主控台來執行這項操作,或使用直接呼叫CreateFleetUpdateFleetAPI。 AWS CLI 模式設定完成後,只要建議的叢集大小變更,Deepdate Cloud 就會開始傳送 EventBridge 事件。

  • 範例UpdateFleet命令:

    aws deadline update-fleet \ --farm-id FARM_ID \ --fleet-id FLEET_ID \ --configuration file://configuration.json
  • 範例CreateFleet命令:

    aws deadline create-fleet \ --farm-id FARM_ID \ --display-name "Fleet name" \ --max-worker-count 10 \ --configuration file://configuration.json

以下是上述CLI指令中configuration.json使用的範例 (--configuration file://configuration.json)。

  • 若要在叢集上啟用 Auto Scaling,您應該將模式設定為EVENT_BASED_AUTO_SCALING

  • workerCapabilities是您建立CMF時指派給的預設值。如果您需要增加可用的資源,您可以變更這些值CMF。

設定叢集模式之後,Depution Cloud 會開始發出該叢集的叢集大小建議事件。

{ "customerManaged": { "mode": "EVENT_BASED_AUTO_SCALING", "workerCapabilities": { "vCpuCount": { "min": 1, "max": 4 }, "memoryMiB": { "min": 1024, "max": 4096 }, "osFamily": "linux", "cpuArchitectureType": "x86_64", } } }

使用範本部署 Auto Scaling AWS CloudFormation 模堆疊

您可以設定 EventBridge 規則來篩選事件、使用事件和控制 Auto Scaling 的 Lambda,以及用來儲存未處理事件的SQS佇列。使用下列 AWS CloudFormation 範本來部署堆疊中的所有項目。成功部署資源後,您可以提交工作,叢集會自動擴充。

Resources: AutoScalingLambda: Type: 'AWS::Lambda::Function' Properties: Code: ZipFile: |- """ This lambda is configured to handle "Fleet Size Recommendation Change" messages. It will handle all such events, and requires that the ASG is named based on the fleet id. It will scale up/down the fleet based on the recommended fleet size in the message. Example EventBridge message: { "version": "0", "id": "6a7e8feb-b491-4cf7-a9f1-bf3703467718", "detail-type": "Fleet Size Recommendation Change", "source": "aws.deadline", "account": "111122223333", "time": "2017-12-22T18:43:48Z", "region": "us-west-1", "resources": [], "detail": { "farmId": "farm-12345678900000000000000000000000", "fleetId": "fleet-12345678900000000000000000000000", "oldFleetSize": 1, "newFleetSize": 5, } } """ import json import boto3 import logging logger = logging.getLogger() logger.setLevel(logging.INFO) auto_scaling_client = boto3.client("autoscaling") def lambda_handler(event, context): logger.info(event) event_detail = event["detail"] fleet_id = event_detail["fleetId"] desired_capacity = event_detail["newFleetSize"] asg_name = f"deadline-ASG-autoscalable-{fleet_id}" auto_scaling_client.set_desired_capacity( AutoScalingGroupName=asg_name, DesiredCapacity=desired_capacity, HonorCooldown=False, ) return { 'statusCode': 200, 'body': json.dumps(f'Successfully set desired_capacity for {asg_name} to {desired_capacity}') } Handler: index.lambda_handler Role: !GetAtt - AutoScalingLambdaServiceRole - Arn Runtime: python3.11 DependsOn: - AutoScalingLambdaServiceRoleDefaultPolicy - AutoScalingLambdaServiceRole AutoScalingEventRule: Type: 'AWS::Events::Rule' Properties: EventPattern: source: - aws.deadline detail-type: - Fleet Size Recommendation Change State: ENABLED Targets: - Arn: !GetAtt - AutoScalingLambda - Arn DeadLetterConfig: Arn: !GetAtt - UnprocessedAutoScalingEventQueue - Arn Id: Target0 RetryPolicy: MaximumRetryAttempts: 15 AutoScalingEventRuleTargetPermission: Type: 'AWS::Lambda::Permission' Properties: Action: 'lambda:InvokeFunction' FunctionName: !GetAtt - AutoScalingLambda - Arn Principal: events.amazonaws.com SourceArn: !GetAtt - AutoScalingEventRule - Arn AutoScalingLambdaServiceRole: Type: 'AWS::IAM::Role' Properties: AssumeRolePolicyDocument: Statement: - Action: 'sts:AssumeRole' Effect: Allow Principal: Service: lambda.amazonaws.com Version: 2012-10-17 ManagedPolicyArns: - !Join - '' - - 'arn:' - !Ref 'AWS::Partition' - ':iam::aws:policy/service-role/AWSLambdaBasicExecutionRole' AutoScalingLambdaServiceRoleDefaultPolicy: Type: 'AWS::IAM::Policy' Properties: PolicyDocument: Statement: - Action: 'autoscaling:SetDesiredCapacity' Effect: Allow Resource: '*' Version: 2012-10-17 PolicyName: AutoScalingLambdaServiceRoleDefaultPolicy Roles: - !Ref AutoScalingLambdaServiceRole UnprocessedAutoScalingEventQueue: Type: 'AWS::SQS::Queue' Properties: QueueName: deadline-unprocessed-autoscaling-events UpdateReplacePolicy: Delete DeletionPolicy: Delete UnprocessedAutoScalingEventQueuePolicy: Type: 'AWS::SQS::QueuePolicy' Properties: PolicyDocument: Statement: - Action: 'sqs:SendMessage' Condition: ArnEquals: 'aws:SourceArn': !GetAtt - AutoScalingEventRule - Arn Effect: Allow Principal: Service: events.amazonaws.com Resource: !GetAtt - UnprocessedAutoScalingEventQueue - Arn Version: 2012-10-17 Queues: - !Ref UnprocessedAutoScalingEventQueue