Create and manage Amazon EMR Serverless applications with Step Functions
Learn how to create, start, stop, and delete applications on EMR Serverless using Step Functions. This page lists the supported APIs and provides example Task states for common use cases.
To learn about integrating with AWS services in Step Functions, see Integrating services and Passing parameters to a service API in Step Functions.
Key features of Optimized EMR Serverless integration
- The Optimized EMR Serverless service integration has a customized set of APIs that wrap the underlying EMR Serverless APIs. Because of this customization, the optimized EMR Serverless integration differs significantly from the AWS SDK service integration.
- In addition, the optimized EMR Serverless integration supports the Run a Job (.sync) integration pattern.
- The Wait for a Callback with Task Token integration pattern is not supported.
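To illustrate the difference between the two supported styles, the following sketch uses both in one state machine: the job run uses the `.sync` suffix and waits for completion, while the stop request omits the suffix and returns as soon as the request is accepted. The state names and the `ApplicationId` and `ExecutionRoleArn` input fields are illustrative assumptions, not values defined on this page.

```json
{
  "Comment": "Sketch: Run a Job (.sync) followed by a plain request-response call",
  "StartAt": "Run_Job_And_Wait",
  "States": {
    "Run_Job_And_Wait": {
      "Type": "Task",
      "Resource": "arn:aws:states:::emr-serverless:startJobRun.sync",
      "Parameters": {
        "ApplicationId.$": "$.ApplicationId",
        "ExecutionRoleArn.$": "$.ExecutionRoleArn",
        "JobDriver": {
          "SparkSubmit": {
            "EntryPoint": "s3://<amzn-s3-demo-bucket>/sample.py"
          }
        }
      },
      "ResultPath": "$.JobRun",
      "Next": "Stop_Without_Waiting"
    },
    "Stop_Without_Waiting": {
      "Type": "Task",
      "Resource": "arn:aws:states:::emr-serverless:stopApplication",
      "Parameters": {
        "ApplicationId.$": "$.ApplicationId"
      },
      "End": true
    }
  }
}
```

Here `ResultPath` keeps the original input intact so that the second state can still read `$.ApplicationId`.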
EMR Serverless service integration APIs
To integrate AWS Step Functions with EMR Serverless, you can use the following six EMR Serverless service integration APIs. These service integration APIs are similar to the corresponding EMR Serverless APIs, with some differences in the fields that are passed and in the responses that are returned.
The following table describes the differences between each EMR Serverless service integration API and its corresponding EMR Serverless API.
EMR Serverless service integration API | Corresponding EMR Serverless API | Differences |
---|---|---|
createApplication Creates an application. EMR Serverless is linked to a unique type of IAM role known as a service-linked role. For | CreateApplication | None |
createApplication.sync Creates an application. | CreateApplication | No differences between the requests and responses of the EMR Serverless API and EMR Serverless service integration API. However, createApplication.sync waits for the application to reach the |
startApplication Starts a specified application and initializes the application's initial capacity if configured. | StartApplication | The EMR Serverless API response doesn't contain any data, but the EMR Serverless service integration API response includes the following data. |
startApplication.sync Starts a specified application and initializes the initial capacity if configured. | StartApplication | The EMR Serverless API response doesn't contain any data, but the EMR Serverless service integration API response includes the following data. Also, startApplication.sync waits for the application to reach the |
stopApplication Stops a specified application and releases initial capacity if configured. All scheduled and running jobs must be completed or cancelled before stopping an application. | StopApplication | The EMR Serverless API response doesn't contain any data, but the EMR Serverless service integration API response includes the following data. |
stopApplication.sync Stops a specified application and releases initial capacity if configured. All scheduled and running jobs must be completed or cancelled before stopping an application. | StopApplication | The EMR Serverless API response doesn't contain any data, but the EMR Serverless service integration API response includes the following data. Also, stopApplication.sync waits for the application to reach the |
deleteApplication Deletes an application. An application must be in the | DeleteApplication | The EMR Serverless API response doesn't contain any data, but the EMR Serverless service integration API response includes the following data. |
deleteApplication.sync Deletes an application. An application must be in the | DeleteApplication | The EMR Serverless API response doesn't contain any data, but the EMR Serverless service integration API response includes the following data. Also, deleteApplication.sync waits for the application to reach the |
startJobRun Starts a job run. | StartJobRun | None |
startJobRun.sync Starts a job run. | StartJobRun | No differences between the requests and responses of the EMR Serverless API and EMR Serverless service integration API. However, startJobRun.sync waits for the application to reach the |
cancelJobRun Cancels a job run. | CancelJobRun | None |
cancelJobRun.sync Cancels a job run. | CancelJobRun | No differences between the requests and responses of the EMR Serverless API and EMR Serverless service integration API. However, cancelJobRun.sync waits for the application to reach the |
EMR Serverless integration use cases
For the Optimized EMR Serverless service integration, we recommend that you create a single application, and then use that application to run multiple jobs. For example, in a single state machine, you can include multiple startJobRun requests, all of which use the same application. The following Task state examples show how to integrate EMR Serverless APIs with Step Functions. For information about other use cases of EMR Serverless, see What is Amazon EMR Serverless.
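As a rough sketch of this recommendation, the following state machine definition runs two jobs in sequence against one application. The state names, the script names `first_job.py` and `second_job.py`, and the `ApplicationId` and `ExecutionRoleArn` input fields are hypothetical and only serve the illustration.

```json
{
  "Comment": "Sketch: two job runs that share one EMR Serverless application",
  "StartAt": "Run_First_Job",
  "States": {
    "Run_First_Job": {
      "Type": "Task",
      "Resource": "arn:aws:states:::emr-serverless:startJobRun.sync",
      "Parameters": {
        "ApplicationId.$": "$.ApplicationId",
        "ExecutionRoleArn.$": "$.ExecutionRoleArn",
        "JobDriver": {
          "SparkSubmit": {
            "EntryPoint": "s3://<amzn-s3-demo-bucket>/first_job.py"
          }
        }
      },
      "ResultPath": "$.FirstJob",
      "Next": "Run_Second_Job"
    },
    "Run_Second_Job": {
      "Type": "Task",
      "Resource": "arn:aws:states:::emr-serverless:startJobRun.sync",
      "Parameters": {
        "ApplicationId.$": "$.ApplicationId",
        "ExecutionRoleArn.$": "$.ExecutionRoleArn",
        "JobDriver": {
          "SparkSubmit": {
            "EntryPoint": "s3://<amzn-s3-demo-bucket>/second_job.py"
          }
        }
      },
      "ResultPath": "$.SecondJob",
      "End": true
    }
  }
}
```

Storing each result under its own `ResultPath` key leaves the original input untouched, so both states read the same `$.ApplicationId`.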
Tip
To deploy an example of a state machine that integrates with EMR Serverless for running multiple jobs to your AWS account, see Run an EMR Serverless job.
To learn about configuring IAM permissions when using Step Functions with other AWS services, see How Step Functions generates IAM policies for integrated services.
In the examples shown in the following use cases, replace the placeholder text with your resource-specific information. For example, replace yourApplicationId with the ID of your EMR Serverless application, such as 00yv7iv71inak893.
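The individual operations covered in the following sections can also be chained into one workflow. The sketch below creates an application, starts it, runs one job, and then stops and deletes the application even if the job fails. It assumes that the output of the create step exposes the application ID in an `ApplicationId` field and that the execution input supplies a hypothetical `ExecutionRoleArn` field; the `$.App`, `$.JobRun`, and `$.Error` paths are arbitrary names chosen for this example.

```json
{
  "Comment": "Sketch: create an application, run one job, then clean up",
  "StartAt": "Create_Application",
  "States": {
    "Create_Application": {
      "Type": "Task",
      "Resource": "arn:aws:states:::emr-serverless:createApplication.sync",
      "Parameters": {
        "Name": "MyApplication",
        "ReleaseLabel": "emr-6.9.0",
        "Type": "SPARK"
      },
      "ResultPath": "$.App",
      "Next": "Start_Application"
    },
    "Start_Application": {
      "Type": "Task",
      "Resource": "arn:aws:states:::emr-serverless:startApplication.sync",
      "Parameters": { "ApplicationId.$": "$.App.ApplicationId" },
      "ResultPath": null,
      "Next": "Run_Job"
    },
    "Run_Job": {
      "Type": "Task",
      "Resource": "arn:aws:states:::emr-serverless:startJobRun.sync",
      "Parameters": {
        "ApplicationId.$": "$.App.ApplicationId",
        "ExecutionRoleArn.$": "$.ExecutionRoleArn",
        "JobDriver": {
          "SparkSubmit": {
            "EntryPoint": "s3://<amzn-s3-demo-bucket>/sample.py"
          }
        }
      },
      "ResultPath": "$.JobRun",
      "Catch": [
        {
          "ErrorEquals": ["States.ALL"],
          "ResultPath": "$.Error",
          "Next": "Stop_Application"
        }
      ],
      "Next": "Stop_Application"
    },
    "Stop_Application": {
      "Type": "Task",
      "Resource": "arn:aws:states:::emr-serverless:stopApplication.sync",
      "Parameters": { "ApplicationId.$": "$.App.ApplicationId" },
      "ResultPath": null,
      "Next": "Delete_Application"
    },
    "Delete_Application": {
      "Type": "Task",
      "Resource": "arn:aws:states:::emr-serverless:deleteApplication.sync",
      "Parameters": { "ApplicationId.$": "$.App.ApplicationId" },
      "ResultPath": null,
      "End": true
    }
  }
}
```

Because the `Catch` block routes failures into the cleanup path, the execution in this sketch ends successfully even when the job fails. If the failure should be surfaced, a Choice state that checks for `$.Error` followed by a Fail state could be added after the delete step.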
Create an application
The following Task state example creates an application using the createApplication.sync service integration API.
"Create_Application": {
  "Type": "Task",
  "Resource": "arn:aws:states:::emr-serverless:createApplication.sync",
  "Parameters": {
    "Name": "MyApplication",
    "ReleaseLabel": "emr-6.9.0",
    "Type": "SPARK"
  },
  "End": true
}
Start an application
The following Task state example starts an application using the startApplication.sync service integration API.
"Start_Application": {
  "Type": "Task",
  "Resource": "arn:aws:states:::emr-serverless:startApplication.sync",
  "Parameters": {
    "ApplicationId": "yourApplicationId"
  },
  "End": true
}
Stop an application
The following Task state example stops an application using the stopApplication.sync service integration API.
"Stop_Application": {
  "Type": "Task",
  "Resource": "arn:aws:states:::emr-serverless:stopApplication.sync",
  "Parameters": {
    "ApplicationId": "yourApplicationId"
  },
  "End": true
}
Delete an application
The following Task state example deletes an application using the deleteApplication.sync service integration API.
"Delete_Application": {
  "Type": "Task",
  "Resource": "arn:aws:states:::emr-serverless:deleteApplication.sync",
  "Parameters": {
    "ApplicationId": "yourApplicationId"
  },
  "End": true
}
Start a job in an application
The following Task state example starts a job in an application using the startJobRun.sync service integration API.
"Start_Job": {
  "Type": "Task",
  "Resource": "arn:aws:states:::emr-serverless:startJobRun.sync",
  "Parameters": {
    "ApplicationId": "yourApplicationId",
    "ExecutionRoleArn": "arn:aws:iam::123456789012:role/myEMRServerless-execution-role",
    "JobDriver": {
      "SparkSubmit": {
        "EntryPoint": "s3://<amzn-s3-demo-bucket>/sample.py",
        "EntryPointArguments": ["1"],
        "SparkSubmitParameters": "--conf spark.executor.cores=4 --conf spark.executor.memory=4g --conf spark.driver.cores=2 --conf spark.driver.memory=4g --conf spark.executor.instances=1"
      }
    }
  },
  "End": true
}
Cancel a job in an application
The following Task state example cancels a job in an application using the cancelJobRun.sync service integration API.
"Cancel_Job": {
  "Type": "Task",
  "Resource": "arn:aws:states:::emr-serverless:cancelJobRun.sync",
  "Parameters": {
    "ApplicationId.$": "$.ApplicationId",
    "JobRunId.$": "$.JobRunId"
  },
  "End": true
}
IAM policies for calling Amazon EMR Serverless
When you create a state machine using the console, Step Functions automatically creates an execution role for your state machine with the least privileges required. These automatically generated IAM roles are valid for the AWS Region in which you create the state machine.
The following example templates show how AWS Step Functions generates IAM policies based on the resources in your state machine definition. For more information, see How Step Functions generates IAM policies for integrated services and Discover service integration patterns in Step Functions.
When you create IAM policies, we recommend that you don't include wildcards in them. As a security best practice, scope your policies down as much as possible. Use dynamic policies only when certain input parameters aren't known until runtime.
Further, administrators should be careful when granting non-administrator users execution roles for running state machines. If you create policies on your own, we recommend that you include PassRole policies in the execution roles. We also recommend that you add the aws:SourceArn and aws:SourceAccount context keys in the execution roles.
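One way to apply the two context keys is in the trust policy of the state machine's execution role. The following is a minimal sketch, not a policy generated by Step Functions; the Region `us-east-1` and the account ID `123456789012` are placeholder assumptions, and the state machine ARN pattern can be narrowed to a single state machine name.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "states.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "ArnLike": {
          "aws:SourceArn": "arn:aws:states:us-east-1:123456789012:stateMachine:*"
        },
        "StringEquals": {
          "aws:SourceAccount": "123456789012"
        }
      }
    }
  ]
}
```

With these conditions, Step Functions can assume the role only on behalf of state machines in the named account and Region.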
IAM policy examples for EMR Serverless integration with Step Functions
IAM policy example for CreateApplication
The following is an IAM policy example for a state machine with a CreateApplication Task state.
Note
You need to specify the CreateServiceLinkedRole permissions in your IAM policies when you create the first application in your account. After that, you don't need to add this permission. For information about CreateServiceLinkedRole, see CreateServiceLinkedRole in the IAM API Reference at https://docs.aws.amazon.com/IAM/latest/APIReference/.
Static and dynamic resources for the following policies are the same.
IAM policy example for StartApplication
Static resources
The following are IAM policy examples for static resources when you use a state machine with a StartApplication Task state.
Dynamic resources
The following are IAM policy examples for dynamic resources when you use a state machine with a StartApplication Task state.
IAM policy example for StopApplication
Static resources
The following are IAM policy examples for static resources when you use a state machine with a StopApplication Task state.
Dynamic resources
The following are IAM policy examples for dynamic resources when you use a state machine with a StopApplication Task state.
IAM policy example for DeleteApplication
Static resources
The following are IAM policy examples for static resources when you use a state machine with a DeleteApplication Task state.
Dynamic resources
The following are IAM policy examples for dynamic resources when you use a state machine with a DeleteApplication Task state.
IAM policy example for StartJobRun
Static resources
The following are IAM policy examples for static resources when you use a state machine with a StartJobRun Task state.
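As a hedged illustration of the static case, a scoped-down permissions policy for a StartJobRun Task state might look like the following. This is a sketch written for this example, not the policy that Step Functions generates: the Region `us-east-1`, the account ID, the application ID `00yv7iv71inak893`, and the role name are placeholders, and the EventBridge permissions that the `.sync` variant relies on are deliberately left out.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["emr-serverless:StartJobRun"],
      "Resource": [
        "arn:aws:emr-serverless:us-east-1:123456789012:/applications/00yv7iv71inak893"
      ]
    },
    {
      "Effect": "Allow",
      "Action": ["iam:PassRole"],
      "Resource": [
        "arn:aws:iam::123456789012:role/myEMRServerless-execution-role"
      ],
      "Condition": {
        "StringEquals": {
          "iam:PassedToService": "emr-serverless.amazonaws.com"
        }
      }
    },
    {
      "Effect": "Allow",
      "Action": [
        "emr-serverless:GetJobRun",
        "emr-serverless:CancelJobRun"
      ],
      "Resource": [
        "arn:aws:emr-serverless:us-east-1:123456789012:/applications/00yv7iv71inak893/jobruns/*"
      ]
    }
  ]
}
```

The `iam:PassRole` statement is limited to the one job execution role and to the EMR Serverless service. The job run resource keeps a trailing wildcard only because job run IDs aren't known before the job starts.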
Dynamic resources
The following are IAM policy examples for dynamic resources when you use a state machine with a StartJobRun Task state.
IAM policy example for CancelJobRun
Static resources
The following are IAM policy examples for static resources when you use a state machine with a CancelJobRun Task state.
Dynamic resources
The following are IAM policy examples for dynamic resources when you use a state machine with a CancelJobRun Task state.