使用 Step Functions 创建和管理 Amazon EMR Serverless 应用程序 - AWS Step Functions

本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。

使用 Step Functions 创建和管理 Amazon EMR Serverless 应用程序

了解如何使用 Step Functions 在 EMR Serverless 上创建、启动、停止和删除应用程序。本页列出了支持的Task状态 APIs ,并提供了执行常见用例的示例状态。

要了解如何在 Step Functions 中与 AWS 服务集成,请参阅集成 服务在 Step Functions 中将参数传递给服务 API

经优化的 EMR Serverless 集成的主要功能
  • 优化的EMR Serverless服务集成有一组自定义的集合 APIs,用于封装底层EMR Serverless APIs。由于这种自定义,优化的EMR Serverless集成与 AWS SDK 服务集成有很大不同。

  • 此外,优化的 EMR Serverless 集成支持运行作业 (.sync) 集成模式。

  • 支持 等待具有任务令牌的回调集成模式。

EMR Serverless服务集成 APIs

要AWS Step Functions与集成EMR Serverless,您可以使用以下六个EMR Serverless服务集成 APIs。这些服务集成 APIs 与相应的服务集成类似 EMR Serverless APIs,但传递的字段和返回的响应有所不同。

下表介绍了每个 EMR Serverless 服务集成 API 及其相应 EMR Serverless API 之间的差异。

EMR Serverless 服务集成 API 相应的 EMR Serverless API 差异

createApplication

创建应用程序。

EMR Serverless 与一种独特类型的 IAM 角色(称为服务相关角色)关联。要使 createApplicationcreateApplication.sync 起作用,您必须配置必要的权限以创建与服务关联的角色 AWS ServiceRoleForAmazonEMRServerless。有关此问题的更多信息,包括您可以添加到 IAM 权限策略的语句,请参阅使用 EMR Serverless 的服务关联角色

CreateApplication

createApplication.sync

创建应用程序。

CreateApplication

EMR Serverless API 和 EMR Serverless 服务集成 API 的请求和响应之间没有区别。但是,createApplication.sync 会等待应用程序进入 CREATED 状态。

startApplication

启动指定的应用程序并初始化该应用程序的初始容量(如果已配置)。

StartApplication

EMR Serverless API 响应不包含任何数据,但 EMR Serverless 服务集成 API 响应包含以下数据。

{ "ApplicationId": "string" }

startApplication.sync

启动指定的应用程序并初始化初始容量(如果已配置)。

StartApplication

EMR Serverless API 响应不包含任何数据,但 EMR Serverless 服务集成 API 响应包含以下数据。

{ "ApplicationId": "string" }

此外,startApplication.sync 会等待应用程序进入 STARTED 状态。

stopApplication

停止指定的应用程序并释放初始容量(如果已配置)。在停止应用程序之前,必须完成或取消所有已计划和正在运行的作业。

StopApplication

EMR Serverless API 响应不包含任何数据,但 EMR Serverless 服务集成 API 响应包含以下数据。

{ "ApplicationId": "string" }

stopApplication.sync

停止指定的应用程序并释放初始容量(如果已配置)。在停止应用程序之前,必须完成或取消所有已计划和正在运行的作业。

StopApplication

EMR Serverless API 响应不包含任何数据,但 EMR Serverless 服务集成 API 响应包含以下数据。

{ "ApplicationId": "string" }

此外,stopApplication.sync 会等待应用程序进入 STOPPED 状态。

deleteApplication

删除应用程序 应用程序必须处于 STOPPEDCREATED 状态才能被删除。

DeleteApplication

EMR Serverless API 响应不包含任何数据,但 EMR Serverless 服务集成 API 响应包含以下数据。

{ "ApplicationId": "string" }

deleteApplication.sync

删除应用程序 应用程序必须处于 STOPPEDCREATED 状态才能被删除。

DeleteApplication

EMR Serverless API 响应不包含任何数据,但 EMR Serverless 服务集成 API 响应包含以下数据。

{ "ApplicationId": "string" }

此外,stopApplication.sync 会等待应用程序进入 TERMINATED 状态。

startJobRun

启动作业运行。

StartJobRun

startJobRun.sync

启动作业运行。

StartJobRun

EMR Serverless API 和 EMR Serverless 服务集成 API 的请求和响应之间没有区别。但是,startJobRun.sync 会等待应用程序进入状态。SUCCESS

cancelJobRun

取消作业运行。

CancelJobRun

cancelJobRun.sync

取消作业运行。

CancelJobRun

EMR Serverless API 和 EMR Serverless 服务集成 API 的请求和响应之间没有区别。但是,cancelJobRun.sync 会等待应用程序进入状态。CANCELLED

EMR 无服务器集成用例

对于优化的 EMR Serverless 服务集成,我们建议您创建单个应用程序,然后使用该应用程序运行多个作业。例如,在单个状态机中,您可以包含多个startJobRun请求,所有这些请求都使用同一个应用程序。以下Task 工作流程状态状态示例显示了要与之集成的EMR Serverless APIs 用例Step Functions。有关 EMR Serverless 其他用例的信息,请参阅什么是 Amazon EMR Serverless

提示

要部署与EMR Serverless集成以运行多个作业的状态机示例;,请参阅运行 EMR Serverless 作业

要了解有关在与其他 AWS 服务Step Functions一起使用时配置IAM权限的信息,请参阅Step Functions 如何为集成服务生成 IAM 策略

在以下用例所示的示例中,用您的资源特定信息替换italicized文本。例如,yourApplicationId替换为EMR Serverless应用程序的 ID,例如00yv7iv71inak893

创建 应用程序

以下 Task 状态示例使用 createApplication.sync 服务集成 API 创建了一个应用程序。

"Create_Application": { "Type": "Task", "Resource": "arn:aws:states:::emr-serverless:createApplication.sync", "Arguments": { "Name": "MyApplication", "ReleaseLabel": "emr-6.9.0", "Type": "SPARK" }, "End": true }

启动应用程序

以下 Task 状态示例使用 startApplication.sync 服务集成 API 启动应用程序。

"Start_Application": { "Type": "Task", "Resource": "arn:aws:states:::emr-serverless:startApplication.sync", "Arguments": { "ApplicationId": "yourApplicationId" }, "End": true }

停止应用程序

以下 Task 状态示例使用 stopApplication.sync 服务集成 API 停止应用程序。

"Stop_Application": { "Type": "Task", "Resource": "arn:aws:states:::emr-serverless:stopApplication.sync", "Arguments": { "ApplicationId": "yourApplicationId" }, "End": true }

删除 应用程序

以下 Task 状态示例使用 deleteApplication.sync 服务集成 API 删除应用程序。

"Delete_Application": { "Type": "Task", "Resource": "arn:aws:states:::emr-serverless:deleteApplication.sync", "Arguments": { "ApplicationId": "yourApplicationId" }, "End": true }

启动应用程序中的作业

以下任务状态示例使用 startJobRun.sync 服务集成 API 在应用程序中启动作业。

"Start_Job": { "Type": "Task", "Resource": "arn:aws:states:::emr-serverless:startJobRun.sync", "Arguments": { "ApplicationId": "yourApplicationId", "ExecutionRoleArn": "arn:aws:iam::account-id:role/myEMRServerless-execution-role", "JobDriver": { "SparkSubmit": { "EntryPoint": "s3://<amzn-s3-demo-bucket>/sample.py", "EntryPointArguments": ["1"], "SparkSubmitParameters": "--conf spark.executor.cores=4 --conf spark.executor.memory=4g --conf spark.driver.cores=2 --conf spark.driver.memory=4g --conf spark.executor.instances=1" } } }, "End": true }

取消应用程序中的作业

以下任务状态示例使用 cancelJobRun.sync 服务集成 API 取消应用程序中的作业。

"Cancel_Job": { "Type": "Task", "Resource": "arn:aws:states:::emr-serverless:cancelJobRun.sync", "Arguments": { "ApplicationId": "{% $states.input.ApplicationId %}", "JobRunId": "{% $states.input.JobRunId %}" }, "End": true }

用于调用 Amazon EMR Serverless 的 IAM 策略

使用控制台创建状态机时,Step Functions 会自动为状态机创建一个具有所需最低权限的执行角色。这些自动生成的IAM角色对 AWS 区域 您在其中创建状态机的角色有效。

以下示例模板展示了如何根据状态机定义中的资源 AWS Step Functions 生成 IAM 策略。有关更多信息,请参阅Step Functions 如何为集成服务生成 IAM 策略探索 Step Functions 中的服务集成模式

我们建议您在创建 IAM 策略时,不要在策略中包含通配符。作为安全最佳实操,应尽可能缩小策略范围。只有在运行时不知道某些输入参数时,才应使用动态策略。

此外,管理员用户在向非管理员用户授予运行状态机的执行角色时应谨慎行事。如果要自行创建策略,我们建议在执行角色中加入 passRole 策略。我们还建议在执行角色中添加 aws:SourceARNaws:SourceAccount 上下文密钥。

EMR Serverless 与 Step Functions 集成的 IAM 策略示例

的 IAM 策略示例 CreateApplication

以下是带有状态的状态机的 IAM 策略示例。 CreateApplication Task 工作流程状态

注意

在您的账户中创建有史以来第一个应用程序时,您需要在 IAM 策略中指定 CreateServiceLinkedRole 权限。此后,便无需再添加此权限。有关信息 CreateServiceLinkedRole,请参阅 CreateServiceLinkedRole https://docs.aws.amazon.com/IAM/latest//APIReference。

以下策略的静态资源和动态资源相同。

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:CreateApplication" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/*" ] }, { "Effect": "Allow", "Action": [ "emr-serverless:GetApplication", "emr-serverless:DeleteApplication" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/*" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:us-east-1:123456789012:rule/StepFunctionsGetEventsForEMRServerlessApplicationRule" ] }, { "Effect": "Allow", "Action": "iam:CreateServiceLinkedRole", "Resource": "arn:aws:iam::123456789012:role/aws-service-role/ops.emr-serverless.amazonaws.com/AWS ServiceRoleForAmazonEMRServerless*", "Condition": { "StringLike": { "iam:AWSServiceName": "ops.emr-serverless.amazonaws.com" } } } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:CreateApplication" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/*" ] }, { "Effect": "Allow", "Action": "iam:CreateServiceLinkedRole", "Resource": "arn:aws:iam::123456789012:role/aws-service-role/ops.emr-serverless.amazonaws.com/AWS ServiceRoleForAmazonEMRServerless*", "Condition": { "StringLike": { "iam:AWSServiceName": "ops.emr-serverless.amazonaws.com" } } } ] }

的 IAM 策略示例 StartApplication

静态资源

以下是当您使用带有状态的状态机时静态资源的 IAM 策略示例。 StartApplication Task 工作流程状态

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StartApplication", "emr-serverless:GetApplication", "emr-serverless:StopApplication" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/applicationId" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:us-east-1:123456789012:rule/StepFunctionsGetEventsForEMRServerlessApplicationRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StartApplication" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/applicationId" ] } ] }
动态资源

以下是当您使用带有状态的状态机时动态资源的 IAM 策略示例。 StartApplication Task 工作流程状态

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StartApplication", "emr-serverless:GetApplication", "emr-serverless:StopApplication" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/*" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:us-east-1:123456789012:rule/StepFunctionsGetEventsForEMRServerlessApplicationRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StartApplication" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/*" ] } ] }

的 IAM 策略示例 StopApplication

静态资源

以下是当您使用带有状态的状态机时静态资源的 IAM 策略示例。 StopApplication Task 工作流程状态

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StopApplication", "emr-serverless:GetApplication" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/applicationId" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:us-east-1:123456789012:rule/StepFunctionsGetEventsForEMRServerlessApplicationRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StopApplication" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/applicationId" ] } ] }
动态资源

以下是当您使用带有状态的状态机时动态资源的 IAM 策略示例。 StopApplication Task 工作流程状态

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StopApplication", "emr-serverless:GetApplication" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/*" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:us-east-1:123456789012:rule/StepFunctionsGetEventsForEMRServerlessApplicationRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StopApplication" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/*" ] } ] }

的 IAM 策略示例 DeleteApplication

静态资源

以下是当您使用带有状态的状态机时静态资源的 IAM 策略示例。 DeleteApplication Task 工作流程状态

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:DeleteApplication", "emr-serverless:GetApplication" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/applicationId" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:us-east-1:123456789012:rule/StepFunctionsGetEventsForEMRServerlessApplicationRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:DeleteApplication" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/applicationId" ] } ] }
动态资源

以下是当您使用带有状态的状态机时动态资源的 IAM 策略示例。 DeleteApplication Task 工作流程状态

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:DeleteApplication", "emr-serverless:GetApplication" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/*" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:us-east-1:123456789012:rule/StepFunctionsGetEventsForEMRServerlessApplicationRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:DeleteApplication" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/*" ] } ] }

的 IAM 策略示例 StartJobRun

静态资源

以下是当您使用带有状态的状态机时静态资源的 IAM 策略示例。 StartJobRun Task 工作流程状态

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StartJobRun" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/applicationId" ] }, { "Effect": "Allow", "Action": "iam:PassRole", "Resource": [ "arn:aws:iam::123456789012:role/jobExecutionRoleArn" ], "Condition": { "StringEquals": { "iam:PassedToService": "emr-serverless.amazonaws.com" } } }, { "Effect": "Allow", "Action": [ "emr-serverless:GetJobRun", "emr-serverless:CancelJobRun" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/applicationId/jobruns/*" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:us-east-1:123456789012:rule/StepFunctionsGetEventsForEMRServerlessJobRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StartJobRun" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/applicationId" ] }, { "Effect": "Allow", "Action": "iam:PassRole", "Resource": [ "arn:aws:iam::123456789012:role/jobExecutionRoleArn" ], "Condition": { "StringEquals": { "iam:PassedToService": "emr-serverless.amazonaws.com" } } } ] }
动态资源

以下是当您使用带有状态的状态机时动态资源的 IAM 策略示例。 StartJobRun Task 工作流程状态

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StartJobRun", "emr-serverless:GetJobRun", "emr-serverless:CancelJobRun" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/*" ] }, { "Effect": "Allow", "Action": "iam:PassRole", "Resource": [ "arn:aws:iam::123456789012:role/jobExecutionRoleArn" ], "Condition": { "StringEquals": { "iam:PassedToService": "emr-serverless.amazonaws.com" } } }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:us-east-1:123456789012:rule/StepFunctionsGetEventsForEMRServerlessJobRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:StartJobRun" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/*" ] }, { "Effect": "Allow", "Action": "iam:PassRole", "Resource": [ "arn:aws:iam::123456789012:role/jobExecutionRoleArn" ], "Condition": { "StringEquals": { "iam:PassedToService": "emr-serverless.amazonaws.com" } } } ] }

的 IAM 策略示例 CancelJobRun

静态资源

以下是当您使用带有状态的状态机时静态资源的 IAM 策略示例。 CancelJobRun Task 工作流程状态

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:CancelJobRun", "emr-serverless:GetJobRun" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/applicationId/jobruns/jobRunId" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:us-east-1:123456789012:rule/StepFunctionsGetEventsForEMRServerlessJobRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:CancelJobRun" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/applicationId/jobruns/jobRunId" ] } ] }
动态资源

以下是当您使用带有状态的状态机时动态资源的 IAM 策略示例。 CancelJobRun Task 工作流程状态

Run a Job (.sync)
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:CancelJobRun", "emr-serverless:GetJobRun" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/*" ] }, { "Effect": "Allow", "Action": [ "events:PutTargets", "events:PutRule", "events:DescribeRule" ], "Resource": [ "arn:aws:events:us-east-1:123456789012:rule/StepFunctionsGetEventsForEMRServerlessJobRule" ] } ] }
Request Response
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "emr-serverless:CancelJobRun" ], "Resource": [ "arn:aws:emr-serverless:us-east-1:123456789012:/applications/*" ] } ] }