Amazon EMR 2009-03-31
- Client: Aws\Emr\EmrClient
- Service ID: elasticmapreduce
- Version: 2009-03-31
This page describes the parameters and results for the operations of Amazon EMR (2009-03-31), and shows how to use the Aws\Emr\EmrClient object to call the described operations. This documentation is specific to the 2009-03-31 API version of the service.
Operation Summary
Each of the following operations can be created from a client using $client->getCommand('CommandName'), where "CommandName" is the name of one of the following operations. Note: a command is a value that encapsulates an operation and the parameters used to create an HTTP request.
You can also create and send a command immediately using the magic methods available on a client object: $client->commandName(/* parameters */).
You can send the command asynchronously (returning a promise) by appending the word "Async" to the operation name: $client->commandNameAsync(/* parameters */).
- AddInstanceFleet ( array $params = [] )
- Adds an instance fleet to a running cluster.
- AddInstanceGroups ( array $params = [] )
- Adds one or more instance groups to a running cluster.
- AddJobFlowSteps ( array $params = [] )
- AddJobFlowSteps adds new steps to a running cluster.
- AddTags ( array $params = [] )
- Adds tags to an Amazon EMR resource, such as a cluster or an Amazon EMR Studio.
- CancelSteps ( array $params = [] )
- Cancels a pending step or steps in a running cluster.
- CreateSecurityConfiguration ( array $params = [] )
- Creates a security configuration, which is stored in the service and can be specified when a cluster is created.
- CreateStudio ( array $params = [] )
- Creates a new Amazon EMR Studio.
- CreateStudioSessionMapping ( array $params = [] )
- Maps a user or group to the Amazon EMR Studio specified by StudioId, and applies a session policy to refine Studio permissions for that user or group.
- DeleteSecurityConfiguration ( array $params = [] )
- Deletes a security configuration.
- DeleteStudio ( array $params = [] )
- Removes an Amazon EMR Studio from the Studio metadata store.
- DeleteStudioSessionMapping ( array $params = [] )
- Removes a user or group from an Amazon EMR Studio.
- DescribeCluster ( array $params = [] )
- Provides cluster-level details including status, hardware and software configuration, VPC settings, and so on.
- DescribeJobFlows ( array $params = [] )
- This API is no longer supported and will eventually be removed.
- DescribeNotebookExecution ( array $params = [] )
- Provides details of a notebook execution.
- DescribeReleaseLabel ( array $params = [] )
- Provides Amazon EMR release label details, such as the releases available in the Region where the API request is run, and the available applications for a specific Amazon EMR release label.
- DescribeSecurityConfiguration ( array $params = [] )
- Provides the details of a security configuration by returning the configuration JSON.
- DescribeStep ( array $params = [] )
- Provides more detail about the cluster step.
- DescribeStudio ( array $params = [] )
- Returns details for the specified Amazon EMR Studio including ID, Name, VPC, Studio access URL, and so on.
- GetAutoTerminationPolicy ( array $params = [] )
- Returns the auto-termination policy for an Amazon EMR cluster.
- GetBlockPublicAccessConfiguration ( array $params = [] )
- Returns the Amazon EMR block public access configuration for your Amazon Web Services account in the current Region.
- GetClusterSessionCredentials ( array $params = [] )
- Provides temporary, HTTP basic credentials that are associated with a given runtime IAM role and used by a cluster with fine-grained access control activated.
- GetManagedScalingPolicy ( array $params = [] )
- Fetches the attached managed scaling policy for an Amazon EMR cluster.
- GetStudioSessionMapping ( array $params = [] )
- Fetches mapping details for the specified Amazon EMR Studio and identity (user or group).
- ListBootstrapActions ( array $params = [] )
- Provides information about the bootstrap actions associated with a cluster.
- ListClusters ( array $params = [] )
- Provides the status of all clusters visible to this Amazon Web Services account.
- ListInstanceFleets ( array $params = [] )
- Lists all available details about the instance fleets in a cluster.
- ListInstanceGroups ( array $params = [] )
- Provides all available details about the instance groups in a cluster.
- ListInstances ( array $params = [] )
- Provides information for all active Amazon EC2 instances and Amazon EC2 instances terminated in the last 30 days, up to a maximum of 2,000.
- ListNotebookExecutions ( array $params = [] )
- Provides summaries of all notebook executions.
- ListReleaseLabels ( array $params = [] )
- Retrieves release labels of Amazon EMR services in the Region where the API is called.
- ListSecurityConfigurations ( array $params = [] )
- Lists all the security configurations visible to this account, providing their creation dates and times, and their names.
- ListSteps ( array $params = [] )
- Provides a list of steps for the cluster in reverse order unless you specify stepIds with the request or filter by StepStates.
- ListStudioSessionMappings ( array $params = [] )
- Returns a list of all user or group session mappings for the Amazon EMR Studio specified by StudioId.
- ListStudios ( array $params = [] )
- Returns a list of all Amazon EMR Studios associated with the Amazon Web Services account.
- ListSupportedInstanceTypes ( array $params = [] )
- A list of the instance types that Amazon EMR supports.
- ModifyCluster ( array $params = [] )
- Modifies the number of steps that can be executed concurrently for the cluster specified using ClusterID.
- ModifyInstanceFleet ( array $params = [] )
- Modifies the target On-Demand and target Spot capacities for the instance fleet with the specified InstanceFleetID within the cluster specified using ClusterID.
- ModifyInstanceGroups ( array $params = [] )
- ModifyInstanceGroups modifies the number of nodes and configuration settings of an instance group.
- PutAutoScalingPolicy ( array $params = [] )
- Creates or updates an automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster.
- PutAutoTerminationPolicy ( array $params = [] )
- Auto-termination is supported in Amazon EMR releases 5.
- PutBlockPublicAccessConfiguration ( array $params = [] )
- Creates or updates an Amazon EMR block public access configuration for your Amazon Web Services account in the current Region.
- PutManagedScalingPolicy ( array $params = [] )
- Creates or updates a managed scaling policy for an Amazon EMR cluster.
- RemoveAutoScalingPolicy ( array $params = [] )
- Removes an automatic scaling policy from a specified instance group within an Amazon EMR cluster.
- RemoveAutoTerminationPolicy ( array $params = [] )
- Removes an auto-termination policy from an Amazon EMR cluster.
- RemoveManagedScalingPolicy ( array $params = [] )
- Removes a managed scaling policy from a specified Amazon EMR cluster.
- RemoveTags ( array $params = [] )
- Removes tags from an Amazon EMR resource, such as a cluster or Amazon EMR Studio.
- RunJobFlow ( array $params = [] )
- RunJobFlow creates and starts running a new cluster (job flow).
- SetKeepJobFlowAliveWhenNoSteps ( array $params = [] )
- You can use the SetKeepJobFlowAliveWhenNoSteps to configure a cluster (job flow) to terminate after the step execution, i.
- SetTerminationProtection ( array $params = [] )
- SetTerminationProtection locks a cluster (job flow) so the Amazon EC2 instances in the cluster cannot be terminated by user intervention, an API call, or in the event of a job-flow error.
- SetUnhealthyNodeReplacement ( array $params = [] )
- Specify whether to enable unhealthy node replacement, which lets Amazon EMR gracefully replace core nodes on a cluster if any nodes become unhealthy.
- SetVisibleToAllUsers ( array $params = [] )
- The SetVisibleToAllUsers parameter is no longer supported.
- StartNotebookExecution ( array $params = [] )
- Starts a notebook execution.
- StopNotebookExecution ( array $params = [] )
- Stops a notebook execution.
- TerminateJobFlows ( array $params = [] )
- TerminateJobFlows shuts a list of clusters (job flows) down.
- UpdateStudio ( array $params = [] )
- Updates an Amazon EMR Studio configuration, including attributes such as name, description, and subnets.
- UpdateStudioSessionMapping ( array $params = [] )
- Updates the session policy attached to the user or group for the specified Amazon EMR Studio.
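The three calling styles described at the top of this summary (command object, magic method, and "Async" suffix) can be sketched as follows. This is a minimal illustration, not part of the generated reference: the helper name listWaitingClustersThreeWays is hypothetical, $client is assumed to be a configured Aws\Emr\EmrClient, and the ClusterStates filter is only an example parameter.

```php
<?php
// Hypothetical helper: issues the same ListClusters request three ways.
// $client is assumed to be a configured Aws\Emr\EmrClient.
function listWaitingClustersThreeWays($client): array
{
    $params = ['ClusterStates' => ['WAITING']];

    // 1. Build a command object first, then execute it.
    $command = $client->getCommand('ListClusters', $params);
    $viaCommand = $client->execute($command);

    // 2. Magic method: creates and sends the command immediately.
    $viaMagic = $client->listClusters($params);

    // 3. "Async" suffix: returns a promise; wait() blocks until it settles.
    $promise = $client->listClustersAsync($params);
    $viaPromise = $promise->wait();

    return [$viaCommand, $viaMagic, $viaPromise];
}

// Usage (requires the AWS SDK for PHP and valid credentials):
// $client = new Aws\Emr\EmrClient(['region' => 'us-east-1', 'version' => '2009-03-31']);
// [$a, $b, $c] = listWaitingClustersThreeWays($client);
```

All three results carry the same data; the command form is mainly useful when the request needs to be inspected or modified before it is sent.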
Paginators
Paginators handle automatically iterating over paginated API results. Paginators are associated with specific API operations, and they accept the parameters that the corresponding API operation accepts. You can get a paginator from a client class using getPaginator($paginatorName, $operationParameters). This client supports the following paginators:
- DescribeJobFlows
- ListBootstrapActions
- ListClusters
- ListInstanceFleets
- ListInstanceGroups
- ListInstances
- ListNotebookExecutions
- ListReleaseLabels
- ListSecurityConfigurations
- ListSteps
- ListStudioSessionMappings
- ListStudios
- ListSupportedInstanceTypes
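As a rough sketch of how a paginator from the list above might be consumed, the helper below walks every page of ListClusters and collects the cluster IDs. The function name collectClusterIds is hypothetical, and $client is assumed to be a configured Aws\Emr\EmrClient.

```php
<?php
// Hypothetical helper: gather the IDs of all clusters across every result page.
// $client is assumed to be a configured Aws\Emr\EmrClient.
function collectClusterIds($client, array $params = []): array
{
    $ids = [];
    // The paginator yields one result per page and requests the next page as needed.
    foreach ($client->getPaginator('ListClusters', $params) as $page) {
        // A page without clusters is treated as an empty list.
        foreach ($page['Clusters'] ?? [] as $cluster) {
            $ids[] = $cluster['Id'];
        }
    }
    return $ids;
}

// Usage (requires the AWS SDK for PHP and valid credentials):
// $ids = collectClusterIds($client, ['ClusterStates' => ['RUNNING', 'WAITING']]);
```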
Waiters
Waiters allow you to poll a resource until it enters a desired state. A waiter has a name used to describe what it does, and is associated with an API operation. When creating a waiter, you can provide the API operation parameters associated with the corresponding operation. Waiters can be accessed using the getWaiter($waiterName, $operationParameters) method of a client object. This client supports the following waiters:
Waiter name | API Operation | Delay (seconds) | Max Attempts |
---|---|---|---|
ClusterRunning | DescribeCluster | 30 | 60 |
StepComplete | DescribeStep | 30 | 60 |
ClusterTerminated | DescribeCluster | 30 | 60 |
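The following sketch shows one way the ClusterRunning waiter from the table might be used. The helper name waitForClusterRunning is hypothetical, $client is assumed to be a configured Aws\Emr\EmrClient, and the '@waiter' values simply restate the defaults listed in the table.

```php
<?php
// Hypothetical helper: block until DescribeCluster reports the cluster as running.
// $client is assumed to be a configured Aws\Emr\EmrClient.
function waitForClusterRunning($client, string $clusterId): void
{
    $waiter = $client->getWaiter('ClusterRunning', [
        'ClusterId' => $clusterId,
        // Optional overrides; these values mirror the defaults in the table above.
        '@waiter' => ['delay' => 30, 'maxAttempts' => 60],
    ]);

    // promise()->wait() polls until the waiter succeeds or gives up with an exception.
    $waiter->promise()->wait();
}

// Usage (requires the AWS SDK for PHP and valid credentials; 'j-EXAMPLE' is a placeholder):
// waitForClusterRunning($client, 'j-EXAMPLE');
```

With the default settings, the waiter polls for up to 30 × 60 seconds (30 minutes) before failing.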
Operations
AddInstanceFleet
$result = $client->addInstanceFleet([/* ... */]);
$promise = $client->addInstanceFleetAsync([/* ... */]);
Adds an instance fleet to a running cluster.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x.
Parameter Syntax
$result = $client->addInstanceFleet([
    'ClusterId' => '<string>', // REQUIRED
    'InstanceFleet' => [ // REQUIRED
        'Context' => '<string>',
        'InstanceFleetType' => 'MASTER|CORE|TASK', // REQUIRED
        'InstanceTypeConfigs' => [
            [
                'BidPrice' => '<string>',
                'BidPriceAsPercentageOfOnDemandPrice' => <float>,
                'Configurations' => [
                    [
                        'Classification' => '<string>',
                        'Configurations' => [...], // RECURSIVE
                        'Properties' => ['<string>', ...],
                    ],
                    // ...
                ],
                'CustomAmiId' => '<string>',
                'EbsConfiguration' => [
                    'EbsBlockDeviceConfigs' => [
                        [
                            'VolumeSpecification' => [ // REQUIRED
                                'Iops' => <integer>,
                                'SizeInGB' => <integer>, // REQUIRED
                                'Throughput' => <integer>,
                                'VolumeType' => '<string>', // REQUIRED
                            ],
                            'VolumesPerInstance' => <integer>,
                        ],
                        // ...
                    ],
                    'EbsOptimized' => true || false,
                ],
                'InstanceType' => '<string>', // REQUIRED
                'Priority' => <float>,
                'WeightedCapacity' => <integer>,
            ],
            // ...
        ],
        'LaunchSpecifications' => [
            'OnDemandSpecification' => [
                'AllocationStrategy' => 'lowest-price|prioritized', // REQUIRED
                'CapacityReservationOptions' => [
                    'CapacityReservationPreference' => 'open|none',
                    'CapacityReservationResourceGroupArn' => '<string>',
                    'UsageStrategy' => 'use-capacity-reservations-first',
                ],
            ],
            'SpotSpecification' => [
                'AllocationStrategy' => 'capacity-optimized|price-capacity-optimized|lowest-price|diversified|capacity-optimized-prioritized',
                'BlockDurationMinutes' => <integer>,
                'TimeoutAction' => 'SWITCH_TO_ON_DEMAND|TERMINATE_CLUSTER', // REQUIRED
                'TimeoutDurationMinutes' => <integer>, // REQUIRED
            ],
        ],
        'Name' => '<string>',
        'ResizeSpecifications' => [
            'OnDemandResizeSpecification' => [
                'AllocationStrategy' => 'lowest-price|prioritized',
                'CapacityReservationOptions' => [
                    'CapacityReservationPreference' => 'open|none',
                    'CapacityReservationResourceGroupArn' => '<string>',
                    'UsageStrategy' => 'use-capacity-reservations-first',
                ],
                'TimeoutDurationMinutes' => <integer>,
            ],
            'SpotResizeSpecification' => [
                'AllocationStrategy' => 'capacity-optimized|price-capacity-optimized|lowest-price|diversified|capacity-optimized-prioritized',
                'TimeoutDurationMinutes' => <integer>,
            ],
        ],
        'TargetOnDemandCapacity' => <integer>,
        'TargetSpotCapacity' => <integer>,
    ],
]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
The unique identifier of the cluster.
- InstanceFleet
-
- Required: Yes
- Type: InstanceFleetConfig structure
Specifies the configuration of the instance fleet.
Result Syntax
[
    'ClusterArn' => '<string>',
    'ClusterId' => '<string>',
    'InstanceFleetId' => '<string>',
]
Result Details
Members
- ClusterArn
-
- Type: string
The Amazon Resource Name of the cluster.
- ClusterId
-
- Type: string
The unique identifier of the cluster.
- InstanceFleetId
-
- Type: string
The unique identifier of the instance fleet.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
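A minimal sketch of an AddInstanceFleet request for a Spot task fleet is shown below. The helper name buildTaskFleetParams, the fleet name, the weighted capacity of 1, and the 20-minute timeout are illustrative assumptions, not values prescribed by the service.

```php
<?php
// Hypothetical helper: build AddInstanceFleet parameters for a Spot-only task fleet.
function buildTaskFleetParams(string $clusterId, array $instanceTypes, int $spotCapacity): array
{
    $typeConfigs = [];
    foreach ($instanceTypes as $type) {
        // InstanceType is the only required key of each instance type config.
        $typeConfigs[] = ['InstanceType' => $type, 'WeightedCapacity' => 1];
    }

    return [
        'ClusterId' => $clusterId,
        'InstanceFleet' => [
            'Name' => 'task-fleet', // illustrative name
            'InstanceFleetType' => 'TASK',
            'TargetSpotCapacity' => $spotCapacity,
            'InstanceTypeConfigs' => $typeConfigs,
            'LaunchSpecifications' => [
                'SpotSpecification' => [
                    // Both keys are required whenever SpotSpecification is present.
                    'TimeoutAction' => 'SWITCH_TO_ON_DEMAND',
                    'TimeoutDurationMinutes' => 20, // illustrative value
                ],
            ],
        ],
    ];
}

// Usage (requires a configured Aws\Emr\EmrClient; 'j-EXAMPLE' is a placeholder):
// $result = $client->addInstanceFleet(buildTaskFleetParams('j-EXAMPLE', ['m5.xlarge'], 4));
// $fleetId = $result['InstanceFleetId'];
```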
AddInstanceGroups
$result = $client->addInstanceGroups([/* ... */]);
$promise = $client->addInstanceGroupsAsync([/* ... */]);
Adds one or more instance groups to a running cluster.
Parameter Syntax
$result = $client->addInstanceGroups([
    'InstanceGroups' => [ // REQUIRED
        [
            'AutoScalingPolicy' => [
                'Constraints' => [ // REQUIRED
                    'MaxCapacity' => <integer>, // REQUIRED
                    'MinCapacity' => <integer>, // REQUIRED
                ],
                'Rules' => [ // REQUIRED
                    [
                        'Action' => [ // REQUIRED
                            'Market' => 'ON_DEMAND|SPOT',
                            'SimpleScalingPolicyConfiguration' => [ // REQUIRED
                                'AdjustmentType' => 'CHANGE_IN_CAPACITY|PERCENT_CHANGE_IN_CAPACITY|EXACT_CAPACITY',
                                'CoolDown' => <integer>,
                                'ScalingAdjustment' => <integer>, // REQUIRED
                            ],
                        ],
                        'Description' => '<string>',
                        'Name' => '<string>', // REQUIRED
                        'Trigger' => [ // REQUIRED
                            'CloudWatchAlarmDefinition' => [ // REQUIRED
                                'ComparisonOperator' => 'GREATER_THAN_OR_EQUAL|GREATER_THAN|LESS_THAN|LESS_THAN_OR_EQUAL', // REQUIRED
                                'Dimensions' => [
                                    [
                                        'Key' => '<string>',
                                        'Value' => '<string>',
                                    ],
                                    // ...
                                ],
                                'EvaluationPeriods' => <integer>,
                                'MetricName' => '<string>', // REQUIRED
                                'Namespace' => '<string>',
                                'Period' => <integer>, // REQUIRED
                                'Statistic' => 'SAMPLE_COUNT|AVERAGE|SUM|MINIMUM|MAXIMUM',
                                'Threshold' => <float>, // REQUIRED
                                'Unit' => 'NONE|SECONDS|MICRO_SECONDS|MILLI_SECONDS|BYTES|KILO_BYTES|MEGA_BYTES|GIGA_BYTES|TERA_BYTES|BITS|KILO_BITS|MEGA_BITS|GIGA_BITS|TERA_BITS|PERCENT|COUNT|BYTES_PER_SECOND|KILO_BYTES_PER_SECOND|MEGA_BYTES_PER_SECOND|GIGA_BYTES_PER_SECOND|TERA_BYTES_PER_SECOND|BITS_PER_SECOND|KILO_BITS_PER_SECOND|MEGA_BITS_PER_SECOND|GIGA_BITS_PER_SECOND|TERA_BITS_PER_SECOND|COUNT_PER_SECOND',
                            ],
                        ],
                    ],
                    // ...
                ],
            ],
            'BidPrice' => '<string>',
            'Configurations' => [
                [
                    'Classification' => '<string>',
                    'Configurations' => [...], // RECURSIVE
                    'Properties' => ['<string>', ...],
                ],
                // ...
            ],
            'CustomAmiId' => '<string>',
            'EbsConfiguration' => [
                'EbsBlockDeviceConfigs' => [
                    [
                        'VolumeSpecification' => [ // REQUIRED
                            'Iops' => <integer>,
                            'SizeInGB' => <integer>, // REQUIRED
                            'Throughput' => <integer>,
                            'VolumeType' => '<string>', // REQUIRED
                        ],
                        'VolumesPerInstance' => <integer>,
                    ],
                    // ...
                ],
                'EbsOptimized' => true || false,
            ],
            'InstanceCount' => <integer>, // REQUIRED
            'InstanceRole' => 'MASTER|CORE|TASK', // REQUIRED
            'InstanceType' => '<string>', // REQUIRED
            'Market' => 'ON_DEMAND|SPOT',
            'Name' => '<string>',
        ],
        // ...
    ],
    'JobFlowId' => '<string>', // REQUIRED
]);
Parameter Details
Members
- InstanceGroups
-
- Required: Yes
- Type: Array of InstanceGroupConfig structures
Instance groups to add.
- JobFlowId
-
- Required: Yes
- Type: string
Job flow in which to add the instance groups.
Result Syntax
[
    'ClusterArn' => '<string>',
    'InstanceGroupIds' => ['<string>', ...],
    'JobFlowId' => '<string>',
]
Result Details
Members
- ClusterArn
-
- Type: string
The Amazon Resource Name of the cluster.
- InstanceGroupIds
-
- Type: Array of strings
Instance group IDs of the newly created instance groups.
- JobFlowId
-
- Type: string
The job flow ID in which the instance groups are added.
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
AddJobFlowSteps
$result = $client->addJobFlowSteps([/* ... */]);
$promise = $client->addJobFlowStepsAsync([/* ... */]);
AddJobFlowSteps adds new steps to a running cluster. A maximum of 256 steps are allowed in each job flow.
If your cluster is long-running (such as a Hive data warehouse) or complex, you may require more than 256 steps to process your data. You can bypass the 256-step limitation in various ways, including using SSH to connect to the master node and submitting queries directly to the software running on the master node, such as Hive and Hadoop.
A step specifies the location of a JAR file stored either on the master node of the cluster or in Amazon S3. Each step is performed by the main function of the main class of the JAR file. The main class can be specified either in the manifest of the JAR or by using the MainClass parameter of the step.
Amazon EMR executes each step in the order listed. For a step to be considered complete, the main function must exit with a zero exit code and all Hadoop jobs started while the step was running must have completed and run successfully.
You can only add steps to a cluster that is in one of the following states: STARTING, BOOTSTRAPPING, RUNNING, or WAITING.
The string values passed into the HadoopJarStep object cannot exceed a total of 10240 characters.
Parameter Syntax
$result = $client->addJobFlowSteps([
    'ExecutionRoleArn' => '<string>',
    'JobFlowId' => '<string>', // REQUIRED
    'Steps' => [ // REQUIRED
        [
            'ActionOnFailure' => 'TERMINATE_JOB_FLOW|TERMINATE_CLUSTER|CANCEL_AND_WAIT|CONTINUE',
            'HadoopJarStep' => [ // REQUIRED
                'Args' => ['<string>', ...],
                'Jar' => '<string>', // REQUIRED
                'MainClass' => '<string>',
                'Properties' => [
                    [
                        'Key' => '<string>',
                        'Value' => '<string>',
                    ],
                    // ...
                ],
            ],
            'Name' => '<string>', // REQUIRED
        ],
        // ...
    ],
]);
Parameter Details
Members
- ExecutionRoleArn
-
- Type: string
The Amazon Resource Name (ARN) of the runtime role for a step on the cluster. The runtime role can be a cross-account IAM role. The runtime role ARN is a combination of account ID, role name, and role type using the following format: arn:partition:service:region:account:resource. For example, arn:aws:IAM::1234567890:role/ReadOnly is a correctly formatted runtime role ARN.
- JobFlowId
-
- Required: Yes
- Type: string
A string that uniquely identifies the job flow. This identifier is returned by RunJobFlow and can also be obtained from ListClusters.
- Steps
-
- Required: Yes
- Type: Array of StepConfig structures
A list of StepConfig to be executed by the job flow.
Result Syntax
[ 'StepIds' => ['<string>', ...], ]
Result Details
Members
- StepIds
-
- Type: Array of strings
The identifiers of the list of steps added to the job flow.
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
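The sketch below builds a single StepConfig entry for AddJobFlowSteps and rejects it locally when the Jar and Args strings exceed the 10240-character limit mentioned in the description. The helper name buildStepConfig is hypothetical, and counting only Jar and Args, measured in bytes with strlen, is a simplifying assumption about how the service applies the limit.

```php
<?php
// Limit on the combined HadoopJarStep string values, taken from the description above.
const MAX_HADOOP_JAR_STEP_CHARS = 10240;

// Hypothetical helper: build one StepConfig entry for AddJobFlowSteps.
function buildStepConfig(string $name, string $jar, array $args = [], string $actionOnFailure = 'CONTINUE'): array
{
    // Assumption: only Jar and Args are counted, in bytes rather than characters.
    $total = strlen($jar) + array_sum(array_map('strlen', $args));
    if ($total > MAX_HADOOP_JAR_STEP_CHARS) {
        throw new InvalidArgumentException(
            "HadoopJarStep strings total $total characters; the limit is " . MAX_HADOOP_JAR_STEP_CHARS . '.'
        );
    }

    return [
        'Name' => $name,
        'ActionOnFailure' => $actionOnFailure,
        'HadoopJarStep' => ['Jar' => $jar, 'Args' => $args],
    ];
}

// Usage (requires a configured Aws\Emr\EmrClient; the IDs and S3 path are placeholders):
// $result = $client->addJobFlowSteps([
//     'JobFlowId' => 'j-EXAMPLE',
//     'Steps' => [buildStepConfig('Example step', 's3://example-bucket/example.jar', ['arg1'])],
// ]);
// $stepIds = $result['StepIds'];
```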
AddTags
$result = $client->addTags([/* ... */]);
$promise = $client->addTagsAsync([/* ... */]);
Adds tags to an Amazon EMR resource, such as a cluster or an Amazon EMR Studio. Tags make it easier to associate resources in various ways, such as grouping clusters to track your Amazon EMR resource allocation costs. For more information, see Tag Clusters.
Parameter Syntax
$result = $client->addTags([
    'ResourceId' => '<string>', // REQUIRED
    'Tags' => [ // REQUIRED
        [
            'Key' => '<string>',
            'Value' => '<string>',
        ],
        // ...
    ],
]);
Parameter Details
Members
- ResourceId
-
- Required: Yes
- Type: string
The Amazon EMR resource identifier to which tags will be added. For example, a cluster identifier or an Amazon EMR Studio ID.
- Tags
-
- Required: Yes
- Type: Array of Tag structures
A list of tags to associate with a resource. Tags are user-defined key-value pairs that consist of a required key string with a maximum of 128 characters, and an optional value string with a maximum of 256 characters.
Result Syntax
[]
Result Details
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
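As an illustration of the Tags structure, the sketch below converts a plain key-value array into the list of Key/Value pairs that AddTags expects and checks the length limits stated above. The helper name buildTagList is hypothetical, and lengths are measured in bytes with strlen, which is stricter than the documented character limits for multibyte text.

```php
<?php
// Hypothetical helper: turn ['key' => 'value'] pairs into the Tags list for AddTags.
function buildTagList(array $tags): array
{
    $list = [];
    foreach ($tags as $key => $value) {
        $key = (string) $key; // PHP turns numeric string keys into integers
        if ($key === '' || strlen($key) > 128) {
            throw new InvalidArgumentException('Tag keys must be 1 to 128 characters long.');
        }
        $entry = ['Key' => $key];
        if ($value !== null) {
            $value = (string) $value;
            if (strlen($value) > 256) {
                throw new InvalidArgumentException("Value of tag '$key' exceeds 256 characters.");
            }
            $entry['Value'] = $value;
        }
        $list[] = $entry;
    }
    return $list;
}

// Usage (requires a configured Aws\Emr\EmrClient; 'j-EXAMPLE' is a placeholder):
// $client->addTags([
//     'ResourceId' => 'j-EXAMPLE',
//     'Tags' => buildTagList(['team' => 'data', 'temporary' => null]),
// ]);
```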
CancelSteps
$result = $client->cancelSteps([/* ... */]);
$promise = $client->cancelStepsAsync([/* ... */]);
Cancels a pending step or steps in a running cluster. Available only in Amazon EMR versions 4.8.0 and later, excluding version 5.0.0. A maximum of 256 steps are allowed in each CancelSteps request. CancelSteps is idempotent but asynchronous; it does not guarantee that a step will be canceled, even if the request is successfully submitted. When you use Amazon EMR releases 5.28.0 and later, you can cancel steps that are in a PENDING or RUNNING state. In earlier versions of Amazon EMR, you can only cancel steps that are in a PENDING state.
Parameter Syntax
$result = $client->cancelSteps([
    'ClusterId' => '<string>', // REQUIRED
    'StepCancellationOption' => 'SEND_INTERRUPT|TERMINATE_PROCESS',
    'StepIds' => ['<string>', ...], // REQUIRED
]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
The ClusterID for the specified steps that will be canceled. Use RunJobFlow and ListClusters to get ClusterIDs.
- StepCancellationOption
-
- Type: string
The option to use when canceling RUNNING steps. By default, the value is SEND_INTERRUPT.
- StepIds
-
- Required: Yes
- Type: Array of strings
The list of StepIDs to cancel. Use ListSteps to get steps and their states for the specified cluster.
Result Syntax
[
    'CancelStepsInfoList' => [
        [
            'Reason' => '<string>',
            'Status' => 'SUBMITTED|FAILED',
            'StepId' => '<string>',
        ],
        // ...
    ],
]
Result Details
Members
- CancelStepsInfoList
-
- Type: Array of CancelStepsInfo structures
A list of CancelStepsInfo, which shows the status of specified cancel requests for each StepID specified.
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
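Because each CancelSteps request accepts at most 256 steps and a submitted request does not guarantee cancellation, a caller may want to batch the step IDs and inspect CancelStepsInfoList. The sketch below does both; the helper name cancelStepsInBatches is hypothetical, and $client is assumed to be a configured Aws\Emr\EmrClient.

```php
<?php
// Hypothetical helper: cancel steps in batches of 256 and report the failures.
// Returns a map of step ID => reason for every entry whose Status is FAILED.
function cancelStepsInBatches($client, string $clusterId, array $stepIds): array
{
    $failed = [];
    foreach (array_chunk($stepIds, 256) as $batch) {
        $result = $client->cancelSteps([
            'ClusterId' => $clusterId,
            'StepIds' => $batch,
            'StepCancellationOption' => 'SEND_INTERRUPT', // the documented default
        ]);
        foreach ($result['CancelStepsInfoList'] ?? [] as $info) {
            if ($info['Status'] === 'FAILED') {
                $failed[$info['StepId']] = $info['Reason'] ?? '';
            }
        }
    }
    return $failed;
}

// Usage (requires valid credentials; the IDs are placeholders):
// $failed = cancelStepsInBatches($client, 'j-EXAMPLE', ['s-EXAMPLE1', 's-EXAMPLE2']);
```

A SUBMITTED status only means the cancellation request was accepted; ListSteps or DescribeStep shows whether the step actually stopped.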
CreateSecurityConfiguration
$result = $client->createSecurityConfiguration([/* ... */]);
$promise = $client->createSecurityConfigurationAsync([/* ... */]);
Creates a security configuration, which is stored in the service and can be specified when a cluster is created.
Parameter Syntax
$result = $client->createSecurityConfiguration([
    'Name' => '<string>', // REQUIRED
    'SecurityConfiguration' => '<string>', // REQUIRED
]);
Parameter Details
Members
- Name
-
- Required: Yes
- Type: string
The name of the security configuration.
- SecurityConfiguration
-
- Required: Yes
- Type: string
The security configuration details in JSON format. For JSON parameters and examples, see Use Security Configurations to Set Up Cluster Security in the Amazon EMR Management Guide.
Result Syntax
[
    'CreationDateTime' => <DateTime>,
    'Name' => '<string>',
]
Result Details
Members
- CreationDateTime
-
- Required: Yes
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time the security configuration was created.
- Name
-
- Required: Yes
- Type: string
The name of the security configuration.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
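Since SecurityConfiguration is a JSON string rather than a nested array, one approach is to build the configuration as a PHP array and encode it just before the call. The helper name buildSecurityConfigurationParams is hypothetical, and the keys in the usage comment are an illustration that should be checked against the Amazon EMR Management Guide.

```php
<?php
// Hypothetical helper: encode a configuration array into CreateSecurityConfiguration parameters.
function buildSecurityConfigurationParams(string $name, array $config): array
{
    return [
        'Name' => $name,
        // The API expects the configuration as a JSON document in a string.
        'SecurityConfiguration' => json_encode($config, JSON_THROW_ON_ERROR),
    ];
}

// Usage (requires a configured Aws\Emr\EmrClient; the name and keys are illustrative):
// $result = $client->createSecurityConfiguration(buildSecurityConfigurationParams(
//     'example-config',
//     ['EncryptionConfiguration' => [
//         'EnableInTransitEncryption' => false,
//         'EnableAtRestEncryption' => true,
//         'AtRestEncryptionConfiguration' => [
//             'S3EncryptionConfiguration' => ['EncryptionMode' => 'SSE-S3'],
//         ],
//     ]]
// ));
// $createdAt = $result['CreationDateTime'];
```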
CreateStudio
$result = $client->createStudio([/* ... */]);
$promise = $client->createStudioAsync([/* ... */]);
Creates a new Amazon EMR Studio.
Parameter Syntax
$result = $client->createStudio([
    'AuthMode' => 'SSO|IAM', // REQUIRED
    'DefaultS3Location' => '<string>', // REQUIRED
    'Description' => '<string>',
    'EncryptionKeyArn' => '<string>',
    'EngineSecurityGroupId' => '<string>', // REQUIRED
    'IdcInstanceArn' => '<string>',
    'IdcUserAssignment' => 'REQUIRED|OPTIONAL',
    'IdpAuthUrl' => '<string>',
    'IdpRelayStateParameterName' => '<string>',
    'Name' => '<string>', // REQUIRED
    'ServiceRole' => '<string>', // REQUIRED
    'SubnetIds' => ['<string>', ...], // REQUIRED
    'Tags' => [
        [
            'Key' => '<string>',
            'Value' => '<string>',
        ],
        // ...
    ],
    'TrustedIdentityPropagationEnabled' => true || false,
    'UserRole' => '<string>',
    'VpcId' => '<string>', // REQUIRED
    'WorkspaceSecurityGroupId' => '<string>', // REQUIRED
]);
Parameter Details
Members
- AuthMode
-
- Required: Yes
- Type: string
Specifies whether the Studio authenticates users using IAM or IAM Identity Center.
- DefaultS3Location
-
- Required: Yes
- Type: string
The Amazon S3 location to back up Amazon EMR Studio Workspaces and notebook files.
- Description
-
- Type: string
A detailed description of the Amazon EMR Studio.
- EncryptionKeyArn
-
- Type: string
The KMS key identifier (ARN) used to encrypt Amazon EMR Studio workspace and notebook files when backed up to Amazon S3.
- EngineSecurityGroupId
-
- Required: Yes
- Type: string
The ID of the Amazon EMR Studio Engine security group. The Engine security group allows inbound network traffic from the Workspace security group, and it must be in the same VPC specified by VpcId.
- IdcInstanceArn
-
- Type: string
The ARN of the IAM Identity Center instance to create the Studio application.
- IdcUserAssignment
-
- Type: string
Specifies whether IAM Identity Center user assignment is REQUIRED or OPTIONAL. If the value is set to REQUIRED, users must be explicitly assigned to the Studio application to access the Studio.
- IdpAuthUrl
-
- Type: string
The authentication endpoint of your identity provider (IdP). Specify this value when you use IAM authentication and want to let federated users log in to a Studio with the Studio URL and credentials from your IdP. Amazon EMR Studio redirects users to this endpoint to enter credentials.
- IdpRelayStateParameterName
-
- Type: string
The name that your identity provider (IdP) uses for its RelayState parameter. For example, RelayState or TargetSource. Specify this value when you use IAM authentication and want to let federated users log in to a Studio using the Studio URL. The RelayState parameter differs by IdP.
- Name
-
- Required: Yes
- Type: string
A descriptive name for the Amazon EMR Studio.
- ServiceRole
-
- Required: Yes
- Type: string
The IAM role that the Amazon EMR Studio assumes. The service role provides a way for Amazon EMR Studio to interoperate with other Amazon Web Services services.
- SubnetIds
-
- Required: Yes
- Type: Array of strings
A list of subnet IDs to associate with the Amazon EMR Studio. A Studio can have a maximum of 5 subnets. The subnets must belong to the VPC specified by VpcId. Studio users can create a Workspace in any of the specified subnets.
- Tags
-
- Type: Array of Tag structures
A list of tags to associate with the Amazon EMR Studio. Tags are user-defined key-value pairs that consist of a required key string with a maximum of 128 characters, and an optional value string with a maximum of 256 characters.
- TrustedIdentityPropagationEnabled
-
- Type: boolean
A Boolean indicating whether to enable Trusted identity propagation for the Studio. The default value is false.
- UserRole
-
- Type: string
The IAM user role that users and groups assume when logged in to an Amazon EMR Studio. Only specify a UserRole when you use IAM Identity Center authentication. The permissions attached to the UserRole can be scoped down for each user or group using session policies.
- VpcId
-
- Required: Yes
- Type: string
The ID of the Amazon Virtual Private Cloud (Amazon VPC) to associate with the Studio.
- WorkspaceSecurityGroupId
-
- Required: Yes
- Type: string
The ID of the Amazon EMR Studio Workspace security group. The Workspace security group allows outbound network traffic to resources in the Engine security group, and it must be in the same VPC specified by VpcId.
Result Syntax
[
    'StudioId' => '<string>',
    'Url' => '<string>',
]
Result Details
Members
- StudioId
-
- Type: string
The ID of the Amazon EMR Studio.
- Url
-
- Type: string
The unique Studio access URL.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
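CreateStudio has eight required parameters and a limit of 5 subnets, so a local pre-flight check can catch simple mistakes before the request is sent. The following is a sketch only: the helper name validateCreateStudioParams and its message texts are hypothetical, and the service remains the authority on what is valid.

```php
<?php
// Required CreateStudio parameters, as marked in the Parameter Syntax above.
const CREATE_STUDIO_REQUIRED = [
    'AuthMode', 'DefaultS3Location', 'EngineSecurityGroupId', 'Name',
    'ServiceRole', 'SubnetIds', 'VpcId', 'WorkspaceSecurityGroupId',
];

// Hypothetical helper: return a list of problems found in CreateStudio parameters.
// An empty list means no problems were detected by these local checks.
function validateCreateStudioParams(array $params): array
{
    $problems = [];
    foreach (CREATE_STUDIO_REQUIRED as $key) {
        if (!isset($params[$key])) {
            $problems[] = "Missing required parameter: $key";
        }
    }
    if (isset($params['SubnetIds']) && count($params['SubnetIds']) > 5) {
        $problems[] = 'SubnetIds has ' . count($params['SubnetIds']) . ' entries; a Studio can have at most 5 subnets';
    }
    if (isset($params['AuthMode']) && !in_array($params['AuthMode'], ['SSO', 'IAM'], true)) {
        $problems[] = 'AuthMode must be SSO or IAM';
    }
    return $problems;
}

// Usage (requires a configured Aws\Emr\EmrClient):
// if (validateCreateStudioParams($params) === []) {
//     $result = $client->createStudio($params);
//     $url = $result['Url'];
// }
```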
CreateStudioSessionMapping
$result = $client->createStudioSessionMapping([/* ... */]);
$promise = $client->createStudioSessionMappingAsync([/* ... */]);
Maps a user or group to the Amazon EMR Studio specified by StudioId, and applies a session policy to refine Studio permissions for that user or group. Use CreateStudioSessionMapping to assign users to a Studio when you use IAM Identity Center authentication. For instructions on how to assign users to a Studio when you use IAM authentication, see Assign a user or group to your EMR Studio.
Parameter Syntax
$result = $client->createStudioSessionMapping([
    'IdentityId' => '<string>',
    'IdentityName' => '<string>',
    'IdentityType' => 'USER|GROUP', // REQUIRED
    'SessionPolicyArn' => '<string>', // REQUIRED
    'StudioId' => '<string>', // REQUIRED
]);
Parameter Details
Members
- IdentityId
-
- Type: string
- IdentityName
-
- Type: string
The name of the user or group. For more information, see UserName and DisplayName in the IAM Identity Center Identity Store API Reference. Either IdentityName or IdentityId must be specified, but not both.
- IdentityType
-
- Required: Yes
- Type: string
Specifies whether the identity to map to the Amazon EMR Studio is a user or a group.
- SessionPolicyArn
-
- Required: Yes
- Type: string
The Amazon Resource Name (ARN) for the session policy that will be applied to the user or group. You should specify the ARN for the session policy that you want to apply, not the ARN of your user role. For more information, see Create an Amazon EMR Studio User Role with Session Policies.
- StudioId
-
- Required: Yes
- Type: string
The ID of the Amazon EMR Studio to which the user or group will be mapped.
Result Syntax
[]
Result Details
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
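The rule that exactly one of IdentityId and IdentityName is supplied can be enforced before the request is sent. The sketch below shows one way to do that; the helper name buildSessionMappingParams is hypothetical and performs only these local checks.

```php
<?php
// Hypothetical helper: build CreateStudioSessionMapping parameters, requiring
// exactly one of $identityId and $identityName.
function buildSessionMappingParams(
    string $studioId,
    string $identityType,
    string $sessionPolicyArn,
    ?string $identityId = null,
    ?string $identityName = null
): array {
    if (!in_array($identityType, ['USER', 'GROUP'], true)) {
        throw new InvalidArgumentException('IdentityType must be USER or GROUP.');
    }
    // True when both are null or both are set, which the operation does not allow.
    if (($identityId === null) === ($identityName === null)) {
        throw new InvalidArgumentException('Specify exactly one of IdentityId and IdentityName.');
    }

    $params = [
        'StudioId' => $studioId,
        'IdentityType' => $identityType,
        'SessionPolicyArn' => $sessionPolicyArn,
    ];
    if ($identityId !== null) {
        $params['IdentityId'] = $identityId;
    } else {
        $params['IdentityName'] = $identityName;
    }
    return $params;
}

// Usage (requires a configured Aws\Emr\EmrClient; all identifiers are placeholders):
// $client->createStudioSessionMapping(buildSessionMappingParams(
//     'es-EXAMPLE', 'USER', 'arn:aws:iam::123456789012:policy/ExamplePolicy', null, 'example-user'
// ));
```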
DeleteSecurityConfiguration
$result = $client->deleteSecurityConfiguration([/* ... */]);
$promise = $client->deleteSecurityConfigurationAsync([/* ... */]);
Deletes a security configuration.
Parameter Syntax
$result = $client->deleteSecurityConfiguration([ 'Name' => '<string>', // REQUIRED ]);
Parameter Details
Members
- Name
-
- Required: Yes
- Type: string
The name of the security configuration.
Result Syntax
[]
Result Details
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
DeleteStudio
$result = $client->deleteStudio([/* ... */]);
$promise = $client->deleteStudioAsync([/* ... */]);
Removes an Amazon EMR Studio from the Studio metadata store.
Parameter Syntax
$result = $client->deleteStudio([ 'StudioId' => '<string>', // REQUIRED ]);
Parameter Details
Members
- StudioId
-
- Required: Yes
- Type: string
The ID of the Amazon EMR Studio.
Result Syntax
[]
Result Details
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
DeleteStudioSessionMapping
$result = $client->deleteStudioSessionMapping([/* ... */]);
$promise = $client->deleteStudioSessionMappingAsync([/* ... */]);
Removes a user or group from an Amazon EMR Studio.
Parameter Syntax
$result = $client->deleteStudioSessionMapping([
    'IdentityId' => '<string>',
    'IdentityName' => '<string>',
    'IdentityType' => 'USER|GROUP', // REQUIRED
    'StudioId' => '<string>', // REQUIRED
]);
Parameter Details
Members
- IdentityId
-
- Type: string
- IdentityName
-
- Type: string
The name of the user or group to remove from the Amazon EMR Studio. For more information, see UserName and DisplayName in the IAM Identity Center Identity Store API Reference. Either IdentityName or IdentityId must be specified.
- IdentityType
-
- Required: Yes
- Type: string
Specifies whether the identity to delete from the Amazon EMR Studio is a user or a group.
- StudioId
-
- Required: Yes
- Type: string
The ID of the Amazon EMR Studio.
Result Syntax
[]
Result Details
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
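The "either IdentityName or IdentityId" rule above can be checked on the client side before the request is sent. The following is a minimal sketch, not part of the SDK: buildSessionMappingParams is a hypothetical helper name, and the Studio ID and user name in the usage note are placeholder values.

```php
<?php
// Hypothetical helper (not part of the SDK): builds the parameter array for
// deleteStudioSessionMapping and rejects input the service would refuse.
function buildSessionMappingParams(
    string $studioId,
    string $identityType,
    ?string $identityId = null,
    ?string $identityName = null
): array {
    // IdentityType is required and limited to the two documented values.
    if (!in_array($identityType, ['USER', 'GROUP'], true)) {
        throw new InvalidArgumentException('IdentityType must be USER or GROUP');
    }
    // Either IdentityName or IdentityId must be specified.
    if ($identityId === null && $identityName === null) {
        throw new InvalidArgumentException('Either IdentityName or IdentityId must be specified');
    }

    $params = ['StudioId' => $studioId, 'IdentityType' => $identityType];
    if ($identityId !== null) {
        $params['IdentityId'] = $identityId;
    }
    if ($identityName !== null) {
        $params['IdentityName'] = $identityName;
    }
    return $params;
}
```

With a configured client, the array could then be passed along as $client->deleteStudioSessionMapping(buildSessionMappingParams('es-EXAMPLE', 'USER', null, 'alice')).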
DescribeCluster
$result = $client->describeCluster([/* ... */]);
$promise = $client->describeClusterAsync([/* ... */]);
Provides cluster-level details including status, hardware and software configuration, VPC settings, and so on.
Parameter Syntax
$result = $client->describeCluster([
    'ClusterId' => '<string>', // REQUIRED
]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
The identifier of the cluster to describe.
Result Syntax
[ 'Cluster' => [ 'Applications' => [ [ 'AdditionalInfo' => ['<string>', ...], 'Args' => ['<string>', ...], 'Name' => '<string>', 'Version' => '<string>', ], // ... ], 'AutoScalingRole' => '<string>', 'AutoTerminate' => true || false, 'ClusterArn' => '<string>', 'Configurations' => [ [ 'Classification' => '<string>', 'Configurations' => [...], // RECURSIVE 'Properties' => ['<string>', ...], ], // ... ], 'CustomAmiId' => '<string>', 'EbsRootVolumeIops' => <integer>, 'EbsRootVolumeSize' => <integer>, 'EbsRootVolumeThroughput' => <integer>, 'Ec2InstanceAttributes' => [ 'AdditionalMasterSecurityGroups' => ['<string>', ...], 'AdditionalSlaveSecurityGroups' => ['<string>', ...], 'Ec2AvailabilityZone' => '<string>', 'Ec2KeyName' => '<string>', 'Ec2SubnetId' => '<string>', 'EmrManagedMasterSecurityGroup' => '<string>', 'EmrManagedSlaveSecurityGroup' => '<string>', 'IamInstanceProfile' => '<string>', 'RequestedEc2AvailabilityZones' => ['<string>', ...], 'RequestedEc2SubnetIds' => ['<string>', ...], 'ServiceAccessSecurityGroup' => '<string>', ], 'Id' => '<string>', 'InstanceCollectionType' => 'INSTANCE_FLEET|INSTANCE_GROUP', 'KerberosAttributes' => [ 'ADDomainJoinPassword' => '<string>', 'ADDomainJoinUser' => '<string>', 'CrossRealmTrustPrincipalPassword' => '<string>', 'KdcAdminPassword' => '<string>', 'Realm' => '<string>', ], 'LogEncryptionKmsKeyId' => '<string>', 'LogUri' => '<string>', 'MasterPublicDnsName' => '<string>', 'Name' => '<string>', 'NormalizedInstanceHours' => <integer>, 'OSReleaseLabel' => '<string>', 'OutpostArn' => '<string>', 'PlacementGroups' => [ [ 'InstanceRole' => 'MASTER|CORE|TASK', 'PlacementStrategy' => 'SPREAD|PARTITION|CLUSTER|NONE', ], // ... 
], 'ReleaseLabel' => '<string>', 'RepoUpgradeOnBoot' => 'SECURITY|NONE', 'RequestedAmiVersion' => '<string>', 'RunningAmiVersion' => '<string>', 'ScaleDownBehavior' => 'TERMINATE_AT_INSTANCE_HOUR|TERMINATE_AT_TASK_COMPLETION', 'SecurityConfiguration' => '<string>', 'ServiceRole' => '<string>', 'Status' => [ 'ErrorDetails' => [ [ 'ErrorCode' => '<string>', 'ErrorData' => [ ['<string>', ...], // ... ], 'ErrorMessage' => '<string>', ], // ... ], 'State' => 'STARTING|BOOTSTRAPPING|RUNNING|WAITING|TERMINATING|TERMINATED|TERMINATED_WITH_ERRORS', 'StateChangeReason' => [ 'Code' => 'INTERNAL_ERROR|VALIDATION_ERROR|INSTANCE_FAILURE|INSTANCE_FLEET_TIMEOUT|BOOTSTRAP_FAILURE|USER_REQUEST|STEP_FAILURE|ALL_STEPS_COMPLETED', 'Message' => '<string>', ], 'Timeline' => [ 'CreationDateTime' => <DateTime>, 'EndDateTime' => <DateTime>, 'ReadyDateTime' => <DateTime>, ], ], 'StepConcurrencyLevel' => <integer>, 'Tags' => [ [ 'Key' => '<string>', 'Value' => '<string>', ], // ... ], 'TerminationProtected' => true || false, 'UnhealthyNodeReplacement' => true || false, 'VisibleToAllUsers' => true || false, ], ]
Result Details
Members
- Cluster
-
- Type: Cluster structure
This output contains the details for the requested cluster.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
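As an illustration of reading the result structure above, here is a small sketch that pulls the cluster state out of a DescribeCluster result. The function names are hypothetical, and treating TERMINATED and TERMINATED_WITH_ERRORS as the "ended" states is an assumption based only on the State values listed in the result syntax.

```php
<?php
// Hypothetical helpers (not part of the SDK) that work on the result as a
// plain array, for example the output of $result->toArray().
function clusterState(array $result): ?string
{
    // Cluster > Status > State, or null if any level is missing.
    return $result['Cluster']['Status']['State'] ?? null;
}

function clusterHasEnded(array $result): bool
{
    // Assumption: these two values are the terminal states.
    return in_array(clusterState($result), ['TERMINATED', 'TERMINATED_WITH_ERRORS'], true);
}
```

A caller might use these as clusterHasEnded($client->describeCluster(['ClusterId' => $clusterId])->toArray()), where $clusterId is whatever identifier the application already holds.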
DescribeJobFlows
$result = $client->describeJobFlows([/* ... */]);
$promise = $client->describeJobFlowsAsync([/* ... */]);
This API is no longer supported and will eventually be removed. We recommend you use ListClusters, DescribeCluster, ListSteps, ListInstanceGroups and ListBootstrapActions instead.
DescribeJobFlows returns a list of job flows that match all of the supplied parameters. The parameters can include a list of job flow IDs, job flow states, and restrictions on job flow creation date and time.
Regardless of supplied parameters, only job flows created within the last two months are returned.
If no parameters are supplied, then job flows matching either of the following criteria are returned:
- Job flows created and completed in the last two weeks
- Job flows created within the last two months that are in one of the following states: RUNNING, WAITING, SHUTTING_DOWN, STARTING
Amazon EMR can return a maximum of 512 job flow descriptions.
Parameter Syntax
$result = $client->describeJobFlows([
    'CreatedAfter' => <integer || string || DateTime>,
    'CreatedBefore' => <integer || string || DateTime>,
    'JobFlowIds' => ['<string>', ...],
    'JobFlowStates' => ['<string>', ...],
]);
Parameter Details
Members
- CreatedAfter
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
Return only job flows created after this date and time.
- CreatedBefore
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
Return only job flows created before this date and time.
- JobFlowIds
-
- Type: Array of strings
Return only job flows whose job flow ID is contained in this list.
- JobFlowStates
-
- Type: Array of strings
Return only job flows whose state is contained in this list.
Result Syntax
[ 'JobFlows' => [ [ 'AmiVersion' => '<string>', 'AutoScalingRole' => '<string>', 'BootstrapActions' => [ [ 'BootstrapActionConfig' => [ 'Name' => '<string>', 'ScriptBootstrapAction' => [ 'Args' => ['<string>', ...], 'Path' => '<string>', ], ], ], // ... ], 'ExecutionStatusDetail' => [ 'CreationDateTime' => <DateTime>, 'EndDateTime' => <DateTime>, 'LastStateChangeReason' => '<string>', 'ReadyDateTime' => <DateTime>, 'StartDateTime' => <DateTime>, 'State' => 'STARTING|BOOTSTRAPPING|RUNNING|WAITING|SHUTTING_DOWN|TERMINATED|COMPLETED|FAILED', ], 'Instances' => [ 'Ec2KeyName' => '<string>', 'Ec2SubnetId' => '<string>', 'HadoopVersion' => '<string>', 'InstanceCount' => <integer>, 'InstanceGroups' => [ [ 'BidPrice' => '<string>', 'CreationDateTime' => <DateTime>, 'CustomAmiId' => '<string>', 'EndDateTime' => <DateTime>, 'InstanceGroupId' => '<string>', 'InstanceRequestCount' => <integer>, 'InstanceRole' => 'MASTER|CORE|TASK', 'InstanceRunningCount' => <integer>, 'InstanceType' => '<string>', 'LastStateChangeReason' => '<string>', 'Market' => 'ON_DEMAND|SPOT', 'Name' => '<string>', 'ReadyDateTime' => <DateTime>, 'StartDateTime' => <DateTime>, 'State' => 'PROVISIONING|BOOTSTRAPPING|RUNNING|RECONFIGURING|RESIZING|SUSPENDED|TERMINATING|TERMINATED|ARRESTED|SHUTTING_DOWN|ENDED', ], // ... 
], 'KeepJobFlowAliveWhenNoSteps' => true || false, 'MasterInstanceId' => '<string>', 'MasterInstanceType' => '<string>', 'MasterPublicDnsName' => '<string>', 'NormalizedInstanceHours' => <integer>, 'Placement' => [ 'AvailabilityZone' => '<string>', 'AvailabilityZones' => ['<string>', ...], ], 'SlaveInstanceType' => '<string>', 'TerminationProtected' => true || false, 'UnhealthyNodeReplacement' => true || false, ], 'JobFlowId' => '<string>', 'JobFlowRole' => '<string>', 'LogEncryptionKmsKeyId' => '<string>', 'LogUri' => '<string>', 'Name' => '<string>', 'ScaleDownBehavior' => 'TERMINATE_AT_INSTANCE_HOUR|TERMINATE_AT_TASK_COMPLETION', 'ServiceRole' => '<string>', 'Steps' => [ [ 'ExecutionStatusDetail' => [ 'CreationDateTime' => <DateTime>, 'EndDateTime' => <DateTime>, 'LastStateChangeReason' => '<string>', 'StartDateTime' => <DateTime>, 'State' => 'PENDING|RUNNING|CONTINUE|COMPLETED|CANCELLED|FAILED|INTERRUPTED', ], 'StepConfig' => [ 'ActionOnFailure' => 'TERMINATE_JOB_FLOW|TERMINATE_CLUSTER|CANCEL_AND_WAIT|CONTINUE', 'HadoopJarStep' => [ 'Args' => ['<string>', ...], 'Jar' => '<string>', 'MainClass' => '<string>', 'Properties' => [ [ 'Key' => '<string>', 'Value' => '<string>', ], // ... ], ], 'Name' => '<string>', ], ], // ... ], 'SupportedProducts' => ['<string>', ...], 'VisibleToAllUsers' => true || false, ], // ... ], ]
Result Details
Members
- JobFlows
-
- Type: Array of JobFlowDetail structures
A list of job flows matching the parameters supplied.
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
DescribeNotebookExecution
$result = $client->describeNotebookExecution([/* ... */]);
$promise = $client->describeNotebookExecutionAsync([/* ... */]);
Provides details of a notebook execution.
Parameter Syntax
$result = $client->describeNotebookExecution([
    'NotebookExecutionId' => '<string>', // REQUIRED
]);
Parameter Details
Members
- NotebookExecutionId
-
- Required: Yes
- Type: string
The unique identifier of the notebook execution.
Result Syntax
[ 'NotebookExecution' => [ 'Arn' => '<string>', 'EditorId' => '<string>', 'EndTime' => <DateTime>, 'EnvironmentVariables' => ['<string>', ...], 'ExecutionEngine' => [ 'ExecutionRoleArn' => '<string>', 'Id' => '<string>', 'MasterInstanceSecurityGroupId' => '<string>', 'Type' => 'EMR', ], 'LastStateChangeReason' => '<string>', 'NotebookExecutionId' => '<string>', 'NotebookExecutionName' => '<string>', 'NotebookInstanceSecurityGroupId' => '<string>', 'NotebookParams' => '<string>', 'NotebookS3Location' => [ 'Bucket' => '<string>', 'Key' => '<string>', ], 'OutputNotebookFormat' => 'HTML', 'OutputNotebookS3Location' => [ 'Bucket' => '<string>', 'Key' => '<string>', ], 'OutputNotebookURI' => '<string>', 'StartTime' => <DateTime>, 'Status' => 'START_PENDING|STARTING|RUNNING|FINISHING|FINISHED|FAILING|FAILED|STOP_PENDING|STOPPING|STOPPED', 'Tags' => [ [ 'Key' => '<string>', 'Value' => '<string>', ], // ... ], ], ]
Result Details
Members
- NotebookExecution
-
- Type: NotebookExecution structure
Properties of the notebook execution.
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
DescribeReleaseLabel
$result = $client->describeReleaseLabel([/* ... */]);
$promise = $client->describeReleaseLabelAsync([/* ... */]);
Provides Amazon EMR release label details, such as the releases available in the Region where the API request is run, and the available applications for a specific Amazon EMR release label. Can also list Amazon EMR releases that support a specified version of Spark.
Parameter Syntax
$result = $client->describeReleaseLabel([
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
    'ReleaseLabel' => '<string>',
]);
Parameter Details
Members
- MaxResults
-
- Type: int
Reserved for future use. Currently set to null.
- NextToken
-
- Type: string
The pagination token. Reserved for future use. Currently set to null.
- ReleaseLabel
-
- Type: string
The target release label to be described.
Result Syntax
[ 'Applications' => [ [ 'Name' => '<string>', 'Version' => '<string>', ], // ... ], 'AvailableOSReleases' => [ [ 'Label' => '<string>', ], // ... ], 'NextToken' => '<string>', 'ReleaseLabel' => '<string>', ]
Result Details
Members
- Applications
-
- Type: Array of SimplifiedApplication structures
The list of applications available for the target release label. Name is the name of the application. Version is the concise version of the application.
- AvailableOSReleases
-
- Type: Array of OSRelease structures
The list of available Amazon Linux release versions for an Amazon EMR release. Contains a Label field that is formatted as shown in Amazon Linux 2 Release Notes. For example, 2.0.20220218.1.
- NextToken
-
- Type: string
The pagination token. Reserved for future use. Currently set to null.
- ReleaseLabel
-
- Type: string
The target release label described in the response.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
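The Applications member is a list of Name/Version pairs, so it is often convenient to turn it into a lookup table. The sketch below assumes the result has been converted to a plain array; applicationVersions is a hypothetical helper name.

```php
<?php
// Hypothetical helper (not part of the SDK): maps each application Name to
// its Version from a DescribeReleaseLabel result array.
function applicationVersions(array $result): array
{
    // array_column() with an index key builds ['Name' => 'Version', ...].
    return array_column($result['Applications'] ?? [], 'Version', 'Name');
}
```

For a result whose Applications list contains Spark and Hive entries, the return value is an array keyed by 'Spark' and 'Hive'.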
DescribeSecurityConfiguration
$result = $client->describeSecurityConfiguration([/* ... */]);
$promise = $client->describeSecurityConfigurationAsync([/* ... */]);
Provides the details of a security configuration by returning the configuration JSON.
Parameter Syntax
$result = $client->describeSecurityConfiguration([
    'Name' => '<string>', // REQUIRED
]);
Parameter Details
Members
- Name
-
- Required: Yes
- Type: string
The name of the security configuration.
Result Syntax
[ 'CreationDateTime' => <DateTime>, 'Name' => '<string>', 'SecurityConfiguration' => '<string>', ]
Result Details
Members
- CreationDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time the security configuration was created.
- Name
-
- Type: string
The name of the security configuration.
- SecurityConfiguration
-
- Type: string
The security configuration details in JSON format.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
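Because SecurityConfiguration is returned as a JSON string, callers usually decode it before inspecting it. A minimal sketch follows; decodeSecurityConfiguration is a hypothetical helper name, and it assumes the JSON document is an object at the top level.

```php
<?php
// Hypothetical helper (not part of the SDK): decodes the SecurityConfiguration
// JSON string of a DescribeSecurityConfiguration result into a PHP array.
function decodeSecurityConfiguration(array $result): array
{
    // JSON_THROW_ON_ERROR raises JsonException on malformed JSON instead of
    // silently returning null.
    return json_decode($result['SecurityConfiguration'], true, 512, JSON_THROW_ON_ERROR);
}
```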
DescribeStep
$result = $client->describeStep([/* ... */]);
$promise = $client->describeStepAsync([/* ... */]);
Provides more detail about the cluster step.
Parameter Syntax
$result = $client->describeStep([
    'ClusterId' => '<string>', // REQUIRED
    'StepId' => '<string>', // REQUIRED
]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
The identifier of the cluster with steps to describe.
- StepId
-
- Required: Yes
- Type: string
The identifier of the step to describe.
Result Syntax
[ 'Step' => [ 'ActionOnFailure' => 'TERMINATE_JOB_FLOW|TERMINATE_CLUSTER|CANCEL_AND_WAIT|CONTINUE', 'Config' => [ 'Args' => ['<string>', ...], 'Jar' => '<string>', 'MainClass' => '<string>', 'Properties' => ['<string>', ...], ], 'ExecutionRoleArn' => '<string>', 'Id' => '<string>', 'Name' => '<string>', 'Status' => [ 'FailureDetails' => [ 'LogFile' => '<string>', 'Message' => '<string>', 'Reason' => '<string>', ], 'State' => 'PENDING|CANCEL_PENDING|RUNNING|COMPLETED|CANCELLED|FAILED|INTERRUPTED', 'StateChangeReason' => [ 'Code' => 'NONE', 'Message' => '<string>', ], 'Timeline' => [ 'CreationDateTime' => <DateTime>, 'EndDateTime' => <DateTime>, 'StartDateTime' => <DateTime>, ], ], ], ]
Result Details
Members
- Step
-
- Type: Step structure
The step details for the requested step identifier.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
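The Status member of the Step structure carries both the state and, for failed steps, the failure details. The following sketch shows one way to summarize them; describeStepOutcome is a hypothetical helper name and the message format is arbitrary.

```php
<?php
// Hypothetical helper (not part of the SDK): turns a DescribeStep result array
// into a one-line summary, adding FailureDetails when the step has failed.
function describeStepOutcome(array $result): string
{
    $status = $result['Step']['Status'] ?? [];
    $state = $status['State'] ?? 'UNKNOWN';
    if ($state !== 'FAILED') {
        return $state;
    }

    $details = $status['FailureDetails'] ?? [];
    $reason = $details['Reason'] ?? 'no reason given';
    $logFile = $details['LogFile'] ?? 'no log file';
    return "FAILED: {$reason} (log: {$logFile})";
}
```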
DescribeStudio
$result = $client->describeStudio([/* ... */]);
$promise = $client->describeStudioAsync([/* ... */]);
Returns details for the specified Amazon EMR Studio including ID, Name, VPC, Studio access URL, and so on.
Parameter Syntax
$result = $client->describeStudio([
    'StudioId' => '<string>', // REQUIRED
]);
Parameter Details
Members
- StudioId
-
- Required: Yes
- Type: string
The Amazon EMR Studio ID.
Result Syntax
[ 'Studio' => [ 'AuthMode' => 'SSO|IAM', 'CreationTime' => <DateTime>, 'DefaultS3Location' => '<string>', 'Description' => '<string>', 'EncryptionKeyArn' => '<string>', 'EngineSecurityGroupId' => '<string>', 'IdcInstanceArn' => '<string>', 'IdcUserAssignment' => 'REQUIRED|OPTIONAL', 'IdpAuthUrl' => '<string>', 'IdpRelayStateParameterName' => '<string>', 'Name' => '<string>', 'ServiceRole' => '<string>', 'StudioArn' => '<string>', 'StudioId' => '<string>', 'SubnetIds' => ['<string>', ...], 'Tags' => [ [ 'Key' => '<string>', 'Value' => '<string>', ], // ... ], 'TrustedIdentityPropagationEnabled' => true || false, 'Url' => '<string>', 'UserRole' => '<string>', 'VpcId' => '<string>', 'WorkspaceSecurityGroupId' => '<string>', ], ]
Result Details
Members
- Studio
-
- Type: Studio structure
The Amazon EMR Studio details.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
GetAutoTerminationPolicy
$result = $client->getAutoTerminationPolicy([/* ... */]);
$promise = $client->getAutoTerminationPolicyAsync([/* ... */]);
Returns the auto-termination policy for an Amazon EMR cluster.
Parameter Syntax
$result = $client->getAutoTerminationPolicy([
    'ClusterId' => '<string>', // REQUIRED
]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
Specifies the ID of the Amazon EMR cluster for which the auto-termination policy will be fetched.
Result Syntax
[ 'AutoTerminationPolicy' => [ 'IdleTimeout' => <integer>, ], ]
Result Details
Members
- AutoTerminationPolicy
-
- Type: AutoTerminationPolicy structure
Specifies the auto-termination policy that is attached to an Amazon EMR cluster.
Errors
There are no errors described for this operation.
GetBlockPublicAccessConfiguration
$result = $client->getBlockPublicAccessConfiguration([/* ... */]);
$promise = $client->getBlockPublicAccessConfigurationAsync([/* ... */]);
Returns the Amazon EMR block public access configuration for your Amazon Web Services account in the current Region. For more information, see Configure Block Public Access for Amazon EMR in the Amazon EMR Management Guide.
Parameter Syntax
$result = $client->getBlockPublicAccessConfiguration([]);
Parameter Details
Members
Result Syntax
[ 'BlockPublicAccessConfiguration' => [ 'BlockPublicSecurityGroupRules' => true || false, 'PermittedPublicSecurityGroupRuleRanges' => [ [ 'MaxRange' => <integer>, 'MinRange' => <integer>, ], // ... ], ], 'BlockPublicAccessConfigurationMetadata' => [ 'CreatedByArn' => '<string>', 'CreationDateTime' => <DateTime>, ], ]
Result Details
Members
- BlockPublicAccessConfiguration
-
- Required: Yes
- Type: BlockPublicAccessConfiguration structure
A configuration for Amazon EMR block public access. The configuration applies to all clusters created in your account for the current Region. The configuration specifies whether block public access is enabled. If block public access is enabled, security groups associated with the cluster cannot have rules that allow inbound traffic from 0.0.0.0/0 or ::/0 on a port, unless the port is specified as an exception using PermittedPublicSecurityGroupRuleRanges in the BlockPublicAccessConfiguration. By default, Port 22 (SSH) is an exception, and public access is allowed on this port. You can change this by updating the block public access configuration to remove the exception.
For accounts that created clusters in a Region before November 25, 2019, block public access is disabled by default in that Region. To use this feature, you must manually enable and configure it. For accounts that did not create an Amazon EMR cluster in a Region before this date, block public access is enabled by default in that Region.
- BlockPublicAccessConfigurationMetadata
-
- Required: Yes
- Type: BlockPublicAccessConfigurationMetadata structure
Properties that describe the Amazon Web Services principal that created the BlockPublicAccessConfiguration using the PutBlockPublicAccessConfiguration action, as well as the date and time that the configuration was created. Each time a configuration for block public access is updated, Amazon EMR updates this metadata.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
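The exception rule described for BlockPublicAccessConfiguration can be expressed as a small predicate over the result. This is a sketch only: isPublicAccessPermittedOnPort is a hypothetical helper name, and it assumes that MinRange and MaxRange are inclusive bounds, which the text above does not state explicitly.

```php
<?php
// Hypothetical helper (not part of the SDK): reports whether the configuration
// in a GetBlockPublicAccessConfiguration result array leaves a port open to
// public security group rules.
function isPublicAccessPermittedOnPort(array $result, int $port): bool
{
    $config = $result['BlockPublicAccessConfiguration'];

    // When block public access is disabled, no port is restricted by it.
    if (!$config['BlockPublicSecurityGroupRules']) {
        return true;
    }

    // Otherwise a port is permitted only if it falls inside an exception
    // range (assumed inclusive on both ends).
    foreach ($config['PermittedPublicSecurityGroupRuleRanges'] ?? [] as $range) {
        if ($port >= $range['MinRange'] && $port <= $range['MaxRange']) {
            return true;
        }
    }
    return false;
}
```

With the default exception for port 22 in place, the helper returns true for port 22 and false for any other port.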
GetClusterSessionCredentials
$result = $client->getClusterSessionCredentials([/* ... */]);
$promise = $client->getClusterSessionCredentialsAsync([/* ... */]);
Provides temporary, HTTP basic credentials that are associated with a given runtime IAM role and used by a cluster with fine-grained access control activated. You can use these credentials to connect to cluster endpoints that support username and password authentication.
Parameter Syntax
$result = $client->getClusterSessionCredentials([
    'ClusterId' => '<string>', // REQUIRED
    'ExecutionRoleArn' => '<string>',
]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
The unique identifier of the cluster.
- ExecutionRoleArn
-
- Type: string
The Amazon Resource Name (ARN) of the runtime role for interactive workload submission on the cluster. The runtime role can be a cross-account IAM role. The runtime role ARN is a combination of account ID, role name, and role type using the following format: arn:partition:service:region:account:resource.
Result Syntax
[ 'Credentials' => [ 'UsernamePassword' => [ 'Password' => '<string>', 'Username' => '<string>', ], ], 'ExpiresAt' => <DateTime>, ]
Result Details
Members
- Credentials
-
- Type: Credentials structure
The credentials that you can use to connect to cluster endpoints that support username and password authentication.
- ExpiresAt
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The time when the credentials that are returned by the GetClusterSessionCredentials API expire.
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
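Since the operation returns a username, a password, and an expiry time, a caller would typically check the expiry and then build a standard HTTP Basic Authorization header. The sketch below shows that step under stated assumptions: basicAuthHeader is a hypothetical helper name, and whether a particular cluster endpoint accepts the header in exactly this form is not confirmed by this page.

```php
<?php
// Hypothetical helper (not part of the SDK): builds an HTTP Basic
// Authorization header from a GetClusterSessionCredentials result array and
// refuses credentials that have already expired.
function basicAuthHeader(array $result, ?DateTimeInterface $now = null): string
{
    $now = $now ?? new DateTimeImmutable();

    // ExpiresAt is documented as a DateTime value in the result.
    $expiresAt = $result['ExpiresAt'] ?? null;
    if ($expiresAt instanceof DateTimeInterface && $expiresAt <= $now) {
        throw new RuntimeException('Cluster session credentials have expired');
    }

    $credentials = $result['Credentials']['UsernamePassword'];
    $token = base64_encode($credentials['Username'] . ':' . $credentials['Password']);
    return 'Authorization: Basic ' . $token;
}
```

Passing $now explicitly is only there to make the expiry check easy to test; in normal use the second argument can be omitted.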
GetManagedScalingPolicy
$result = $client->getManagedScalingPolicy([/* ... */]);
$promise = $client->getManagedScalingPolicyAsync([/* ... */]);
Fetches the attached managed scaling policy for an Amazon EMR cluster.
Parameter Syntax
$result = $client->getManagedScalingPolicy([
    'ClusterId' => '<string>', // REQUIRED
]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
Specifies the ID of the cluster for which the managed scaling policy will be fetched.
Result Syntax
[ 'ManagedScalingPolicy' => [ 'ComputeLimits' => [ 'MaximumCapacityUnits' => <integer>, 'MaximumCoreCapacityUnits' => <integer>, 'MaximumOnDemandCapacityUnits' => <integer>, 'MinimumCapacityUnits' => <integer>, 'UnitType' => 'InstanceFleetUnits|Instances|VCPU', ], ], ]
Result Details
Members
- ManagedScalingPolicy
-
- Type: ManagedScalingPolicy structure
Specifies the managed scaling policy that is attached to an Amazon EMR cluster.
Errors
There are no errors described for this operation.
GetStudioSessionMapping
$result = $client->getStudioSessionMapping([/* ... */]);
$promise = $client->getStudioSessionMappingAsync([/* ... */]);
Fetches mapping details for the specified Amazon EMR Studio and identity (user or group).
Parameter Syntax
$result = $client->getStudioSessionMapping([
    'IdentityId' => '<string>',
    'IdentityName' => '<string>',
    'IdentityType' => 'USER|GROUP', // REQUIRED
    'StudioId' => '<string>', // REQUIRED
]);
Parameter Details
Members
- IdentityId
-
- Type: string
- IdentityName
-
- Type: string
The name of the user or group to fetch. For more information, see UserName and DisplayName in the IAM Identity Center Identity Store API Reference. Either IdentityName or IdentityId must be specified.
- IdentityType
-
- Required: Yes
- Type: string
Specifies whether the identity to fetch is a user or a group.
- StudioId
-
- Required: Yes
- Type: string
The ID of the Amazon EMR Studio.
Result Syntax
[ 'SessionMapping' => [ 'CreationTime' => <DateTime>, 'IdentityId' => '<string>', 'IdentityName' => '<string>', 'IdentityType' => 'USER|GROUP', 'LastModifiedTime' => <DateTime>, 'SessionPolicyArn' => '<string>', 'StudioId' => '<string>', ], ]
Result Details
Members
- SessionMapping
-
- Type: SessionMappingDetail structure
The session mapping details for the specified Amazon EMR Studio and identity, including session policy ARN and creation time.
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
ListBootstrapActions
$result = $client->listBootstrapActions([/* ... */]);
$promise = $client->listBootstrapActionsAsync([/* ... */]);
Provides information about the bootstrap actions associated with a cluster.
Parameter Syntax
$result = $client->listBootstrapActions([
    'ClusterId' => '<string>', // REQUIRED
    'Marker' => '<string>',
]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
The cluster identifier for the bootstrap actions to list.
- Marker
-
- Type: string
The pagination token that indicates the next set of results to retrieve.
Result Syntax
[ 'BootstrapActions' => [ [ 'Args' => ['<string>', ...], 'Name' => '<string>', 'ScriptPath' => '<string>', ], // ... ], 'Marker' => '<string>', ]
Result Details
Members
- BootstrapActions
-
- Type: Array of Command structures
The bootstrap actions associated with the cluster.
- Marker
-
- Type: string
The pagination token that indicates the next set of results to retrieve.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
ListClusters
$result = $client->listClusters([/* ... */]);
$promise = $client->listClustersAsync([/* ... */]);
Provides the status of all clusters visible to this Amazon Web Services account. Allows you to filter the list of clusters based on certain criteria; for example, filtering by cluster creation date and time or by status. This call returns a maximum of 50 clusters in unsorted order per call, but returns a marker to track the paging of the cluster list across multiple ListClusters calls.
Parameter Syntax
$result = $client->listClusters([
    'ClusterStates' => ['<string>', ...],
    'CreatedAfter' => <integer || string || DateTime>,
    'CreatedBefore' => <integer || string || DateTime>,
    'Marker' => '<string>',
]);
Parameter Details
Members
- ClusterStates
-
- Type: Array of strings
The cluster state filters to apply when listing clusters. Clusters that change state while this action runs may not be returned as expected in the list of clusters.
- CreatedAfter
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The creation date and time beginning value filter for listing clusters.
- CreatedBefore
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The creation date and time end value filter for listing clusters.
- Marker
-
- Type: string
The pagination token that indicates the next set of results to retrieve.
Result Syntax
[ 'Clusters' => [ [ 'ClusterArn' => '<string>', 'Id' => '<string>', 'Name' => '<string>', 'NormalizedInstanceHours' => <integer>, 'OutpostArn' => '<string>', 'Status' => [ 'ErrorDetails' => [ [ 'ErrorCode' => '<string>', 'ErrorData' => [ ['<string>', ...], // ... ], 'ErrorMessage' => '<string>', ], // ... ], 'State' => 'STARTING|BOOTSTRAPPING|RUNNING|WAITING|TERMINATING|TERMINATED|TERMINATED_WITH_ERRORS', 'StateChangeReason' => [ 'Code' => 'INTERNAL_ERROR|VALIDATION_ERROR|INSTANCE_FAILURE|INSTANCE_FLEET_TIMEOUT|BOOTSTRAP_FAILURE|USER_REQUEST|STEP_FAILURE|ALL_STEPS_COMPLETED', 'Message' => '<string>', ], 'Timeline' => [ 'CreationDateTime' => <DateTime>, 'EndDateTime' => <DateTime>, 'ReadyDateTime' => <DateTime>, ], ], ], // ... ], 'Marker' => '<string>', ]
Result Details
Members
- Clusters
-
- Type: Array of ClusterSummary structures
The list of clusters for the account based on the given filters.
- Marker
-
- Type: string
The pagination token that indicates the next set of results to retrieve.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
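Because each ListClusters call returns at most 50 clusters plus a Marker, reading the full list means repeating the call with the returned Marker until none comes back. The sketch below shows that loop as a generator. iterateClusters is a hypothetical helper name, and the page fetcher is passed in as a callable so the loop does not depend on a live client.

```php
<?php
// Hypothetical helper (not part of the SDK): yields every cluster summary by
// following the Marker pagination token across ListClusters pages.
// $listClusters receives the parameter array and returns one page of results.
function iterateClusters(callable $listClusters, array $params = []): Generator
{
    do {
        $page = $listClusters($params);

        foreach ($page['Clusters'] ?? [] as $cluster) {
            yield $cluster;
        }

        // A missing or empty Marker means there are no further pages.
        $marker = $page['Marker'] ?? null;
        $params['Marker'] = $marker;
    } while ($marker !== null && $marker !== '');
}
```

With a configured client, the callable could be fn (array $p) => $client->listClusters($p), and filters such as ClusterStates can be supplied through the second argument. The same loop shape applies to the other operations on this page that accept and return a Marker.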
ListInstanceFleets
$result = $client->listInstanceFleets([/* ... */]);
$promise = $client->listInstanceFleetsAsync([/* ... */]);
Lists all available details about the instance fleets in a cluster.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.
Parameter Syntax
$result = $client->listInstanceFleets([
    'ClusterId' => '<string>', // REQUIRED
    'Marker' => '<string>',
]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
The unique identifier of the cluster.
- Marker
-
- Type: string
The pagination token that indicates the next set of results to retrieve.
Result Syntax
[ 'InstanceFleets' => [ [ 'Context' => '<string>', 'Id' => '<string>', 'InstanceFleetType' => 'MASTER|CORE|TASK', 'InstanceTypeSpecifications' => [ [ 'BidPrice' => '<string>', 'BidPriceAsPercentageOfOnDemandPrice' => <float>, 'Configurations' => [ [ 'Classification' => '<string>', 'Configurations' => [...], // RECURSIVE 'Properties' => ['<string>', ...], ], // ... ], 'CustomAmiId' => '<string>', 'EbsBlockDevices' => [ [ 'Device' => '<string>', 'VolumeSpecification' => [ 'Iops' => <integer>, 'SizeInGB' => <integer>, 'Throughput' => <integer>, 'VolumeType' => '<string>', ], ], // ... ], 'EbsOptimized' => true || false, 'InstanceType' => '<string>', 'Priority' => <float>, 'WeightedCapacity' => <integer>, ], // ... ], 'LaunchSpecifications' => [ 'OnDemandSpecification' => [ 'AllocationStrategy' => 'lowest-price|prioritized', 'CapacityReservationOptions' => [ 'CapacityReservationPreference' => 'open|none', 'CapacityReservationResourceGroupArn' => '<string>', 'UsageStrategy' => 'use-capacity-reservations-first', ], ], 'SpotSpecification' => [ 'AllocationStrategy' => 'capacity-optimized|price-capacity-optimized|lowest-price|diversified|capacity-optimized-prioritized', 'BlockDurationMinutes' => <integer>, 'TimeoutAction' => 'SWITCH_TO_ON_DEMAND|TERMINATE_CLUSTER', 'TimeoutDurationMinutes' => <integer>, ], ], 'Name' => '<string>', 'ProvisionedOnDemandCapacity' => <integer>, 'ProvisionedSpotCapacity' => <integer>, 'ResizeSpecifications' => [ 'OnDemandResizeSpecification' => [ 'AllocationStrategy' => 'lowest-price|prioritized', 'CapacityReservationOptions' => [ 'CapacityReservationPreference' => 'open|none', 'CapacityReservationResourceGroupArn' => '<string>', 'UsageStrategy' => 'use-capacity-reservations-first', ], 'TimeoutDurationMinutes' => <integer>, ], 'SpotResizeSpecification' => [ 'AllocationStrategy' => 'capacity-optimized|price-capacity-optimized|lowest-price|diversified|capacity-optimized-prioritized', 'TimeoutDurationMinutes' => <integer>, ], ], 'Status' => [ 
'State' => 'PROVISIONING|BOOTSTRAPPING|RUNNING|RESIZING|SUSPENDED|TERMINATING|TERMINATED', 'StateChangeReason' => [ 'Code' => 'INTERNAL_ERROR|VALIDATION_ERROR|INSTANCE_FAILURE|CLUSTER_TERMINATED', 'Message' => '<string>', ], 'Timeline' => [ 'CreationDateTime' => <DateTime>, 'EndDateTime' => <DateTime>, 'ReadyDateTime' => <DateTime>, ], ], 'TargetOnDemandCapacity' => <integer>, 'TargetSpotCapacity' => <integer>, ], // ... ], 'Marker' => '<string>', ]
Result Details
Members
- InstanceFleets
-
- Type: Array of InstanceFleet structures
The list of instance fleets for the cluster and given filters.
- Marker
-
- Type: string
The pagination token that indicates the next set of results to retrieve.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
ListInstanceGroups
$result = $client->listInstanceGroups([/* ... */]);
$promise = $client->listInstanceGroupsAsync([/* ... */]);
Provides all available details about the instance groups in a cluster.
Parameter Syntax
$result = $client->listInstanceGroups([
    'ClusterId' => '<string>', // REQUIRED
    'Marker' => '<string>',
]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
The identifier of the cluster for which to list the instance groups.
- Marker
-
- Type: string
The pagination token that indicates the next set of results to retrieve.
Result Syntax
[ 'InstanceGroups' => [ [ 'AutoScalingPolicy' => [ 'Constraints' => [ 'MaxCapacity' => <integer>, 'MinCapacity' => <integer>, ], 'Rules' => [ [ 'Action' => [ 'Market' => 'ON_DEMAND|SPOT', 'SimpleScalingPolicyConfiguration' => [ 'AdjustmentType' => 'CHANGE_IN_CAPACITY|PERCENT_CHANGE_IN_CAPACITY|EXACT_CAPACITY', 'CoolDown' => <integer>, 'ScalingAdjustment' => <integer>, ], ], 'Description' => '<string>', 'Name' => '<string>', 'Trigger' => [ 'CloudWatchAlarmDefinition' => [ 'ComparisonOperator' => 'GREATER_THAN_OR_EQUAL|GREATER_THAN|LESS_THAN|LESS_THAN_OR_EQUAL', 'Dimensions' => [ [ 'Key' => '<string>', 'Value' => '<string>', ], // ... ], 'EvaluationPeriods' => <integer>, 'MetricName' => '<string>', 'Namespace' => '<string>', 'Period' => <integer>, 'Statistic' => 'SAMPLE_COUNT|AVERAGE|SUM|MINIMUM|MAXIMUM', 'Threshold' => <float>, 'Unit' => 'NONE|SECONDS|MICRO_SECONDS|MILLI_SECONDS|BYTES|KILO_BYTES|MEGA_BYTES|GIGA_BYTES|TERA_BYTES|BITS|KILO_BITS|MEGA_BITS|GIGA_BITS|TERA_BITS|PERCENT|COUNT|BYTES_PER_SECOND|KILO_BYTES_PER_SECOND|MEGA_BYTES_PER_SECOND|GIGA_BYTES_PER_SECOND|TERA_BYTES_PER_SECOND|BITS_PER_SECOND|KILO_BITS_PER_SECOND|MEGA_BITS_PER_SECOND|GIGA_BITS_PER_SECOND|TERA_BITS_PER_SECOND|COUNT_PER_SECOND', ], ], ], // ... ], 'Status' => [ 'State' => 'PENDING|ATTACHING|ATTACHED|DETACHING|DETACHED|FAILED', 'StateChangeReason' => [ 'Code' => 'USER_REQUEST|PROVISION_FAILURE|CLEANUP_FAILURE', 'Message' => '<string>', ], ], ], 'BidPrice' => '<string>', 'Configurations' => [ [ 'Classification' => '<string>', 'Configurations' => [...], // RECURSIVE 'Properties' => ['<string>', ...], ], // ... ], 'ConfigurationsVersion' => <integer>, 'CustomAmiId' => '<string>', 'EbsBlockDevices' => [ [ 'Device' => '<string>', 'VolumeSpecification' => [ 'Iops' => <integer>, 'SizeInGB' => <integer>, 'Throughput' => <integer>, 'VolumeType' => '<string>', ], ], // ... 
], 'EbsOptimized' => true || false, 'Id' => '<string>', 'InstanceGroupType' => 'MASTER|CORE|TASK', 'InstanceType' => '<string>', 'LastSuccessfullyAppliedConfigurations' => [ [ 'Classification' => '<string>', 'Configurations' => [...], // RECURSIVE 'Properties' => ['<string>', ...], ], // ... ], 'LastSuccessfullyAppliedConfigurationsVersion' => <integer>, 'Market' => 'ON_DEMAND|SPOT', 'Name' => '<string>', 'RequestedInstanceCount' => <integer>, 'RunningInstanceCount' => <integer>, 'ShrinkPolicy' => [ 'DecommissionTimeout' => <integer>, 'InstanceResizePolicy' => [ 'InstanceTerminationTimeout' => <integer>, 'InstancesToProtect' => ['<string>', ...], 'InstancesToTerminate' => ['<string>', ...], ], ], 'Status' => [ 'State' => 'PROVISIONING|BOOTSTRAPPING|RUNNING|RECONFIGURING|RESIZING|SUSPENDED|TERMINATING|TERMINATED|ARRESTED|SHUTTING_DOWN|ENDED', 'StateChangeReason' => [ 'Code' => 'INTERNAL_ERROR|VALIDATION_ERROR|INSTANCE_FAILURE|CLUSTER_TERMINATED', 'Message' => '<string>', ], 'Timeline' => [ 'CreationDateTime' => <DateTime>, 'EndDateTime' => <DateTime>, 'ReadyDateTime' => <DateTime>, ], ], ], // ... ], 'Marker' => '<string>', ]
Result Details
Members
- InstanceGroups
-
- Type: Array of InstanceGroup structures
The list of instance groups for the cluster and given filters.
- Marker
-
- Type: string
The pagination token that indicates the next set of results to retrieve.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
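The Marker value in the result can be passed back as the Marker parameter to walk through every page. The following is a minimal sketch, not the SDK's own pagination mechanism: collectAllPages and its $fetchPage callable are hypothetical names, and the loop assumes that a missing or empty Marker means the last page has been reached.

```php
<?php
/**
 * Collects every item from a Marker-paginated list operation.
 *
 * $fetchPage is a hypothetical callable standing in for a client call such as
 * fn (array $p) => $client->listInstanceGroups($p). It receives the request
 * parameters and returns one page of results (array or array-like).
 */
function collectAllPages(callable $fetchPage, array $params, string $itemsKey): array
{
    $items = [];
    $marker = null;

    do {
        $request = $params;
        if ($marker !== null) {
            $request['Marker'] = $marker; // continue after the previous page
        }

        $page = $fetchPage($request);

        foreach ($page[$itemsKey] ?? [] as $item) {
            $items[] = $item;
        }

        // Assumption: no Marker (or an empty one) means there are no more pages.
        $marker = $page['Marker'] ?? null;
    } while ($marker !== null && $marker !== '');

    return $items;
}
```

With a real client this might be called as collectAllPages(fn (array $p) => $client->listInstanceGroups($p), ['ClusterId' => $clusterId], 'InstanceGroups'), where $clusterId is a placeholder for your cluster identifier.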
ListInstances
$result = $client->listInstances([/* ... */]);
$promise = $client->listInstancesAsync([/* ... */]);
Provides information for all active Amazon EC2 instances and Amazon EC2 instances terminated in the last 30 days, up to a maximum of 2,000. Amazon EC2 instances in any of the following states are considered active: AWAITING_FULFILLMENT, PROVISIONING, BOOTSTRAPPING, RUNNING.
Parameter Syntax
$result = $client->listInstances([ 'ClusterId' => '<string>', // REQUIRED 'InstanceFleetId' => '<string>', 'InstanceFleetType' => 'MASTER|CORE|TASK', 'InstanceGroupId' => '<string>', 'InstanceGroupTypes' => ['<string>', ...], 'InstanceStates' => ['<string>', ...], 'Marker' => '<string>', ]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
The identifier of the cluster for which to list the instances.
- InstanceFleetId
-
- Type: string
The unique identifier of the instance fleet.
- InstanceFleetType
-
- Type: string
The node type of the instance fleet: MASTER, CORE, or TASK.
- InstanceGroupId
-
- Type: string
The identifier of the instance group for which to list the instances.
- InstanceGroupTypes
-
- Type: Array of strings
The type of instance group for which to list the instances.
- InstanceStates
-
- Type: Array of strings
A list of instance states that will filter the instances returned with this request.
- Marker
-
- Type: string
The pagination token that indicates the next set of results to retrieve.
Result Syntax
[ 'Instances' => [ [ 'EbsVolumes' => [ [ 'Device' => '<string>', 'VolumeId' => '<string>', ], // ... ], 'Ec2InstanceId' => '<string>', 'Id' => '<string>', 'InstanceFleetId' => '<string>', 'InstanceGroupId' => '<string>', 'InstanceType' => '<string>', 'Market' => 'ON_DEMAND|SPOT', 'PrivateDnsName' => '<string>', 'PrivateIpAddress' => '<string>', 'PublicDnsName' => '<string>', 'PublicIpAddress' => '<string>', 'Status' => [ 'State' => 'AWAITING_FULFILLMENT|PROVISIONING|BOOTSTRAPPING|RUNNING|TERMINATED', 'StateChangeReason' => [ 'Code' => 'INTERNAL_ERROR|VALIDATION_ERROR|INSTANCE_FAILURE|BOOTSTRAP_FAILURE|CLUSTER_TERMINATED', 'Message' => '<string>', ], 'Timeline' => [ 'CreationDateTime' => <DateTime>, 'EndDateTime' => <DateTime>, 'ReadyDateTime' => <DateTime>, ], ], ], // ... ], 'Marker' => '<string>', ]
Result Details
Members
- Instances
-
- Type: Array of Instance structures
The list of instances for the cluster and given filters.
- Marker
-
- Type: string
The pagination token that indicates the next set of results to retrieve.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
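The InstanceStates and InstanceGroupTypes filters can be combined to request only active instances of certain node types. The sketch below builds such a parameter array. The function and constant names are hypothetical, and the allowed group types (MASTER, CORE, TASK) are assumed from the InstanceGroupType values shown elsewhere on this page.

```php
<?php
// States that the operation description above lists as "active".
const ACTIVE_INSTANCE_STATES = ['AWAITING_FULFILLMENT', 'PROVISIONING', 'BOOTSTRAPPING', 'RUNNING'];

/**
 * Builds listInstances parameters that select only active instances,
 * optionally restricted to the given instance group types.
 */
function buildActiveInstancesRequest(string $clusterId, array $groupTypes = []): array
{
    // Assumption: these are the only valid group types.
    $allowed = ['MASTER', 'CORE', 'TASK'];
    $unknown = array_diff($groupTypes, $allowed);
    if ($unknown !== []) {
        throw new InvalidArgumentException('Unknown instance group type: ' . implode(', ', $unknown));
    }

    $params = [
        'ClusterId' => $clusterId,
        'InstanceStates' => ACTIVE_INSTANCE_STATES,
    ];
    if ($groupTypes !== []) {
        $params['InstanceGroupTypes'] = array_values($groupTypes);
    }

    return $params;
}
```

The returned array can be passed directly to $client->listInstances().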
ListNotebookExecutions
$result = $client->listNotebookExecutions([/* ... */]);
$promise = $client->listNotebookExecutionsAsync([/* ... */]);
Provides summaries of all notebook executions. You can filter the list based on multiple criteria such as status, time range, and editor ID. Returns a maximum of 50 notebook executions and a marker to track the paging of a longer notebook execution list across multiple ListNotebookExecutions calls.
Parameter Syntax
$result = $client->listNotebookExecutions([ 'EditorId' => '<string>', 'ExecutionEngineId' => '<string>', 'From' => <integer || string || DateTime>, 'Marker' => '<string>', 'Status' => 'START_PENDING|STARTING|RUNNING|FINISHING|FINISHED|FAILING|FAILED|STOP_PENDING|STOPPING|STOPPED', 'To' => <integer || string || DateTime>, ]);
Parameter Details
Members
- EditorId
-
- Type: string
The unique ID of the editor associated with the notebook execution.
- ExecutionEngineId
-
- Type: string
The unique ID of the execution engine.
- From
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The beginning of the time range filter for listing notebook executions. The default is the timestamp of 30 days ago.
- Marker
-
- Type: string
The pagination token, returned by a previous ListNotebookExecutions call, that indicates the start of the list for this ListNotebookExecutions call.
- Status
-
- Type: string
The status filter for listing notebook executions.
  - START_PENDING indicates that the cluster has received the execution request but execution has not begun.
  - STARTING indicates that the execution is starting on the cluster.
  - RUNNING indicates that the execution is being processed by the cluster.
  - FINISHING indicates that execution processing is in the final stages.
  - FINISHED indicates that the execution has completed without error.
  - FAILING indicates that the execution is failing and will not finish successfully.
  - FAILED indicates that the execution failed.
  - STOP_PENDING indicates that the cluster has received a StopNotebookExecution request and the stop is pending.
  - STOPPING indicates that the cluster is in the process of stopping the execution as a result of a StopNotebookExecution request.
  - STOPPED indicates that the execution stopped because of a StopNotebookExecution request.
- To
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The end of the time range filter for listing notebook executions. The default is the current timestamp.
Result Syntax
[ 'Marker' => '<string>', 'NotebookExecutions' => [ [ 'EditorId' => '<string>', 'EndTime' => <DateTime>, 'ExecutionEngineId' => '<string>', 'NotebookExecutionId' => '<string>', 'NotebookExecutionName' => '<string>', 'NotebookS3Location' => [ 'Bucket' => '<string>', 'Key' => '<string>', ], 'StartTime' => <DateTime>, 'Status' => 'START_PENDING|STARTING|RUNNING|FINISHING|FINISHED|FAILING|FAILED|STOP_PENDING|STOPPING|STOPPED', ], // ... ], ]
Result Details
Members
- Marker
-
- Type: string
A pagination token that a subsequent ListNotebookExecutions call can use to determine the next set of results to retrieve.
- NotebookExecutions
-
- Type: Array of NotebookExecutionSummary structures
A list of notebook executions.
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
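The From, To, and Status parameters together describe a filtered time window. As a rough sketch, the hypothetical helper below builds parameters for "executions with a given status in the last N days" using DateTime values, which the parameter syntax above lists as an accepted type. The helper and constant names are illustrative only.

```php
<?php
// Status values taken from the parameter syntax above.
const NOTEBOOK_EXECUTION_STATUSES = [
    'START_PENDING', 'STARTING', 'RUNNING', 'FINISHING', 'FINISHED',
    'FAILING', 'FAILED', 'STOP_PENDING', 'STOPPING', 'STOPPED',
];

/**
 * Builds listNotebookExecutions parameters covering the $days days before $now.
 */
function buildNotebookExecutionsRequest(DateTime $now, int $days, ?string $status = null): array
{
    if ($days < 1) {
        throw new InvalidArgumentException('days must be at least 1');
    }

    // Clone so the caller's DateTime is not modified.
    $from = (clone $now)->modify("-{$days} days");
    $params = ['From' => $from, 'To' => clone $now];

    if ($status !== null) {
        if (!in_array($status, NOTEBOOK_EXECUTION_STATUSES, true)) {
            throw new InvalidArgumentException("Unknown status: {$status}");
        }
        $params['Status'] = $status;
    }

    return $params;
}
```

For example, buildNotebookExecutionsRequest(new DateTime(), 7, 'FAILED') describes the failed executions of the past week.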
ListReleaseLabels
$result = $client->listReleaseLabels([/* ... */]);
$promise = $client->listReleaseLabelsAsync([/* ... */]);
Retrieves release labels of Amazon EMR services in the Region where the API is called.
Parameter Syntax
$result = $client->listReleaseLabels([ 'Filters' => [ 'Application' => '<string>', 'Prefix' => '<string>', ], 'MaxResults' => <integer>, 'NextToken' => '<string>', ]);
Parameter Details
Members
- Filters
-
- Type: ReleaseLabelFilter structure
Filters the results of the request. Prefix specifies the prefix of release labels to return. Application specifies the application (with or without version) of release labels to return.
- MaxResults
-
- Type: int
Defines the maximum number of release labels to return in a single response. The default is 100.
- NextToken
-
- Type: string
Specifies the next page of results. If NextToken is not specified, which is usually the case for the first request of ListReleaseLabels, the first page of results is determined by other filtering parameters or by the latest version. The ListReleaseLabels request fails if the identity (Amazon Web Services account ID) and all filtering parameters are different from the original request, or if the NextToken is expired or tampered with.
Result Syntax
[ 'NextToken' => '<string>', 'ReleaseLabels' => ['<string>', ...], ]
Result Details
Members
- NextToken
-
- Type: string
The token used to retrieve the next page of results when specified in the next ListReleaseLabels request.
- ReleaseLabels
-
- Type: Array of strings
The returned release labels.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
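Because a NextToken is tied to the filtering parameters of the original request, a paging loop should resend the same Filters on every page. The generator below is a sketch of that idea. iterateReleaseLabels and $fetchPage are hypothetical names, with $fetchPage standing in for fn (array $p) => $client->listReleaseLabels($p), and an absent NextToken is assumed to mark the last page.

```php
<?php
/**
 * Lazily yields release labels across all pages, keeping the filters
 * identical on every request.
 */
function iterateReleaseLabels(callable $fetchPage, array $filters = [], int $maxResults = 100): Generator
{
    $base = ['MaxResults' => $maxResults];
    if ($filters !== []) {
        $base['Filters'] = $filters;
    }

    $token = null;
    do {
        $request = $base; // same filters and page size on every page
        if ($token !== null) {
            $request['NextToken'] = $token;
        }

        $page = $fetchPage($request);

        foreach ($page['ReleaseLabels'] ?? [] as $label) {
            yield $label;
        }

        // Assumption: a missing or empty NextToken means no further pages.
        $token = $page['NextToken'] ?? null;
    } while ($token !== null && $token !== '');
}
```

A generator keeps memory use flat and lets the caller stop early, for example after finding the first label with a given prefix.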
ListSecurityConfigurations
$result = $client->listSecurityConfigurations([/* ... */]);
$promise = $client->listSecurityConfigurationsAsync([/* ... */]);
Lists all the security configurations visible to this account, providing their creation dates and times, and their names. Each call returns a maximum of 50 security configurations and a marker to track the paging of the list across multiple ListSecurityConfigurations calls.
Parameter Syntax
$result = $client->listSecurityConfigurations([ 'Marker' => '<string>', ]);
Parameter Details
Members
- Marker
-
- Type: string
The pagination token that indicates the set of results to retrieve.
Result Syntax
[ 'Marker' => '<string>', 'SecurityConfigurations' => [ [ 'CreationDateTime' => <DateTime>, 'Name' => '<string>', ], // ... ], ]
Result Details
Members
- Marker
-
- Type: string
A pagination token that indicates the next set of results to retrieve. Include the marker in the next ListSecurityConfigurations call to retrieve the next page of results, if required.
- SecurityConfigurations
-
- Type: Array of SecurityConfigurationSummary structures
The creation date and time, and name, of each security configuration.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
ListSteps
$result = $client->listSteps([/* ... */]);
$promise = $client->listStepsAsync([/* ... */]);
Provides a list of steps for the cluster in reverse order unless you specify StepIds with the request or filter by StepStates. You can specify a maximum of 10 StepIds. The CLI automatically paginates results to return a list greater than 50 steps. To return more than 50 steps using the CLI, specify a Marker, which is a pagination token that indicates the next set of steps to retrieve.
Parameter Syntax
$result = $client->listSteps([ 'ClusterId' => '<string>', // REQUIRED 'Marker' => '<string>', 'StepIds' => ['<string>', ...], 'StepStates' => ['<string>', ...], ]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
The identifier of the cluster for which to list the steps.
- Marker
-
- Type: string
The maximum number of steps that a single ListSteps action returns is 50. To return a longer list of steps, use multiple ListSteps actions along with the Marker parameter, which is a pagination token that indicates the next set of results to retrieve.
- StepIds
-
- Type: Array of strings
The filter to limit the step list based on the identifier of the steps. You can specify a maximum of ten Step IDs. The character constraint applies to the overall length of the array.
- StepStates
-
- Type: Array of strings
The filter to limit the step list based on certain states.
Result Syntax
[ 'Marker' => '<string>', 'Steps' => [ [ 'ActionOnFailure' => 'TERMINATE_JOB_FLOW|TERMINATE_CLUSTER|CANCEL_AND_WAIT|CONTINUE', 'Config' => [ 'Args' => ['<string>', ...], 'Jar' => '<string>', 'MainClass' => '<string>', 'Properties' => ['<string>', ...], ], 'Id' => '<string>', 'Name' => '<string>', 'Status' => [ 'FailureDetails' => [ 'LogFile' => '<string>', 'Message' => '<string>', 'Reason' => '<string>', ], 'State' => 'PENDING|CANCEL_PENDING|RUNNING|COMPLETED|CANCELLED|FAILED|INTERRUPTED', 'StateChangeReason' => [ 'Code' => 'NONE', 'Message' => '<string>', ], 'Timeline' => [ 'CreationDateTime' => <DateTime>, 'EndDateTime' => <DateTime>, 'StartDateTime' => <DateTime>, ], ], ], // ... ], ]
Result Details
Members
- Marker
-
- Type: string
The maximum number of steps that a single ListSteps action returns is 50. To return a longer list of steps, use multiple ListSteps actions along with the Marker parameter, which is a pagination token that indicates the next set of results to retrieve.
- Steps
-
- Type: Array of StepSummary structures
The filtered list of steps for the cluster.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
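Since StepIds accepts at most ten identifiers per request, looking up a longer list of steps means splitting it into batches. The sketch below shows one way to do that. buildListStepsRequests is a hypothetical helper, and de-duplicating the IDs first is a design choice of this example, not a documented requirement.

```php
<?php
/**
 * Splits a list of step IDs into listSteps parameter arrays of at most
 * ten StepIds each (the documented per-request maximum).
 */
function buildListStepsRequests(string $clusterId, array $stepIds): array
{
    $unique = array_values(array_unique($stepIds));

    $requests = [];
    foreach (array_chunk($unique, 10) as $chunk) {
        $requests[] = [
            'ClusterId' => $clusterId,
            'StepIds' => $chunk,
        ];
    }

    return $requests;
}
```

Each returned array can then be passed to $client->listSteps() in turn and the Steps members merged.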
ListStudioSessionMappings
$result = $client->listStudioSessionMappings([/* ... */]);
$promise = $client->listStudioSessionMappingsAsync([/* ... */]);
Returns a list of all user or group session mappings for the Amazon EMR Studio specified by StudioId.
Parameter Syntax
$result = $client->listStudioSessionMappings([ 'IdentityType' => 'USER|GROUP', 'Marker' => '<string>', 'StudioId' => '<string>', ]);
Parameter Details
Members
- IdentityType
-
- Type: string
Specifies whether to return session mappings for users or groups. If not specified, the results include session mapping details for both users and groups.
- Marker
-
- Type: string
The pagination token that indicates the set of results to retrieve.
- StudioId
-
- Type: string
The ID of the Amazon EMR Studio.
Result Syntax
[ 'Marker' => '<string>', 'SessionMappings' => [ [ 'CreationTime' => <DateTime>, 'IdentityId' => '<string>', 'IdentityName' => '<string>', 'IdentityType' => 'USER|GROUP', 'SessionPolicyArn' => '<string>', 'StudioId' => '<string>', ], // ... ], ]
Result Details
Members
- Marker
-
- Type: string
The pagination token that indicates the next set of results to retrieve.
- SessionMappings
-
- Type: Array of SessionMappingSummary structures
A list of session mapping summary objects. Each object includes session mapping details such as creation time, identity type (user or group), and Amazon EMR Studio ID.
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
ListStudios
$result = $client->listStudios([/* ... */]);
$promise = $client->listStudiosAsync([/* ... */]);
Returns a list of all Amazon EMR Studios associated with the Amazon Web Services account. The list includes details such as ID, Studio Access URL, and creation time for each Studio.
Parameter Syntax
$result = $client->listStudios([ 'Marker' => '<string>', ]);
Parameter Details
Members
- Marker
-
- Type: string
The pagination token that indicates the set of results to retrieve.
Result Syntax
[ 'Marker' => '<string>', 'Studios' => [ [ 'AuthMode' => 'SSO|IAM', 'CreationTime' => <DateTime>, 'Description' => '<string>', 'Name' => '<string>', 'StudioId' => '<string>', 'Url' => '<string>', 'VpcId' => '<string>', ], // ... ], ]
Result Details
Members
- Marker
-
- Type: string
The pagination token that indicates the next set of results to retrieve.
- Studios
-
- Type: Array of StudioSummary structures
The list of Studio summary objects.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
ListSupportedInstanceTypes
$result = $client->listSupportedInstanceTypes([/* ... */]);
$promise = $client->listSupportedInstanceTypesAsync([/* ... */]);
Returns a list of the instance types that Amazon EMR supports. You can filter the list by Amazon Web Services Region and Amazon EMR release.
Parameter Syntax
$result = $client->listSupportedInstanceTypes([ 'Marker' => '<string>', 'ReleaseLabel' => '<string>', // REQUIRED ]);
Parameter Details
Members
- Marker
-
- Type: string
The pagination token that marks the next set of results to retrieve.
- ReleaseLabel
-
- Required: Yes
- Type: string
The Amazon EMR release label determines the versions of open-source application packages that Amazon EMR has installed on the cluster. Release labels are in the format emr-x.x.x, where x.x.x is an Amazon EMR release number such as emr-6.10.0. For more information about Amazon EMR releases and their included application versions and features, see the Amazon EMR Release Guide.
Result Syntax
[ 'Marker' => '<string>', 'SupportedInstanceTypes' => [ [ 'Architecture' => '<string>', 'EbsOptimizedAvailable' => true || false, 'EbsOptimizedByDefault' => true || false, 'EbsStorageOnly' => true || false, 'InstanceFamilyId' => '<string>', 'Is64BitsOnly' => true || false, 'MemoryGB' => <float>, 'NumberOfDisks' => <integer>, 'StorageGB' => <integer>, 'Type' => '<string>', 'VCPU' => <integer>, ], // ... ], ]
Result Details
Members
- Marker
-
- Type: string
The pagination token that marks the next set of results to retrieve.
- SupportedInstanceTypes
-
- Type: Array of SupportedInstanceType structures
The list of instance types that the release specified in ListSupportedInstanceTypesInput$ReleaseLabel supports, filtered by Amazon Web Services Region.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
ModifyCluster
$result = $client->modifyCluster([/* ... */]);
$promise = $client->modifyClusterAsync([/* ... */]);
Modifies the number of steps that can be executed concurrently for the cluster specified using ClusterId.
Parameter Syntax
$result = $client->modifyCluster([ 'ClusterId' => '<string>', // REQUIRED 'StepConcurrencyLevel' => <integer>, ]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
The unique identifier of the cluster.
- StepConcurrencyLevel
-
- Type: int
The number of steps that can be executed concurrently. You can specify a minimum of 1 step and a maximum of 256 steps. We recommend that you do not change this parameter while steps are running or the ActionOnFailure setting may not behave as expected. For more information, see Step$ActionOnFailure.
Result Syntax
[ 'StepConcurrencyLevel' => <integer>, ]
Result Details
Members
- StepConcurrencyLevel
-
- Type: int
The number of steps that can be executed concurrently.
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
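The documented range for StepConcurrencyLevel (1 to 256) can be checked on the client side before the request is sent. The following sketch uses a hypothetical helper name. It only builds the parameter array and does not call the service.

```php
<?php
/**
 * Builds modifyCluster parameters, rejecting concurrency levels outside
 * the documented range of 1 to 256.
 */
function buildModifyClusterRequest(string $clusterId, int $stepConcurrencyLevel): array
{
    if ($stepConcurrencyLevel < 1 || $stepConcurrencyLevel > 256) {
        throw new InvalidArgumentException(
            "StepConcurrencyLevel must be between 1 and 256, got {$stepConcurrencyLevel}"
        );
    }

    return [
        'ClusterId' => $clusterId,
        'StepConcurrencyLevel' => $stepConcurrencyLevel,
    ];
}
```

Validating early gives a clearer error than waiting for the service to reject the request.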
ModifyInstanceFleet
$result = $client->modifyInstanceFleet([/* ... */]);
$promise = $client->modifyInstanceFleetAsync([/* ... */]);
Modifies the target On-Demand and target Spot capacities for the instance fleet with the specified InstanceFleetId within the cluster specified using ClusterId. The call either succeeds or fails atomically.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.
Parameter Syntax
$result = $client->modifyInstanceFleet([ 'ClusterId' => '<string>', // REQUIRED 'InstanceFleet' => [ // REQUIRED 'Context' => '<string>', 'InstanceFleetId' => '<string>', // REQUIRED 'InstanceTypeConfigs' => [ [ 'BidPrice' => '<string>', 'BidPriceAsPercentageOfOnDemandPrice' => <float>, 'Configurations' => [ [ 'Classification' => '<string>', 'Configurations' => [...], // RECURSIVE 'Properties' => ['<string>', ...], ], // ... ], 'CustomAmiId' => '<string>', 'EbsConfiguration' => [ 'EbsBlockDeviceConfigs' => [ [ 'VolumeSpecification' => [ // REQUIRED 'Iops' => <integer>, 'SizeInGB' => <integer>, // REQUIRED 'Throughput' => <integer>, 'VolumeType' => '<string>', // REQUIRED ], 'VolumesPerInstance' => <integer>, ], // ... ], 'EbsOptimized' => true || false, ], 'InstanceType' => '<string>', // REQUIRED 'Priority' => <float>, 'WeightedCapacity' => <integer>, ], // ... ], 'ResizeSpecifications' => [ 'OnDemandResizeSpecification' => [ 'AllocationStrategy' => 'lowest-price|prioritized', 'CapacityReservationOptions' => [ 'CapacityReservationPreference' => 'open|none', 'CapacityReservationResourceGroupArn' => '<string>', 'UsageStrategy' => 'use-capacity-reservations-first', ], 'TimeoutDurationMinutes' => <integer>, ], 'SpotResizeSpecification' => [ 'AllocationStrategy' => 'capacity-optimized|price-capacity-optimized|lowest-price|diversified|capacity-optimized-prioritized', 'TimeoutDurationMinutes' => <integer>, ], ], 'TargetOnDemandCapacity' => <integer>, 'TargetSpotCapacity' => <integer>, ], ]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
The unique identifier of the cluster.
- InstanceFleet
-
- Required: Yes
- Type: InstanceFleetModifyConfig structure
The configuration parameters of the instance fleet.
Result Syntax
[]
Result Details
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
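For the common case of changing only the target capacities, the nested InstanceFleet structure reduces to a few keys. The sketch below covers just that case. The helper name is hypothetical, and requiring at least one capacity is a restriction of this example, not of the API, which also accepts members such as ResizeSpecifications.

```php
<?php
/**
 * Builds modifyInstanceFleet parameters that change only the target
 * On-Demand and/or Spot capacity of one fleet.
 */
function buildFleetCapacityRequest(
    string $clusterId,
    string $instanceFleetId,
    ?int $targetOnDemand = null,
    ?int $targetSpot = null
): array {
    if ($targetOnDemand === null && $targetSpot === null) {
        throw new InvalidArgumentException('Specify at least one target capacity');
    }

    $fleet = ['InstanceFleetId' => $instanceFleetId];
    $targets = [
        'TargetOnDemandCapacity' => $targetOnDemand,
        'TargetSpotCapacity' => $targetSpot,
    ];
    foreach ($targets as $key => $value) {
        if ($value === null) {
            continue; // leave unspecified capacities out of the request
        }
        if ($value < 0) {
            throw new InvalidArgumentException("{$key} cannot be negative");
        }
        $fleet[$key] = $value;
    }

    return ['ClusterId' => $clusterId, 'InstanceFleet' => $fleet];
}
```

The result can be passed to $client->modifyInstanceFleet(). The operation returns an empty result on success.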
ModifyInstanceGroups
$result = $client->modifyInstanceGroups([/* ... */]);
$promise = $client->modifyInstanceGroupsAsync([/* ... */]);
ModifyInstanceGroups modifies the number of nodes and configuration settings of an instance group. The input parameters include the new target instance count for the group and the instance group ID. The call will either succeed or fail atomically.
Parameter Syntax
$result = $client->modifyInstanceGroups([ 'ClusterId' => '<string>', 'InstanceGroups' => [ [ 'Configurations' => [ [ 'Classification' => '<string>', 'Configurations' => [...], // RECURSIVE 'Properties' => ['<string>', ...], ], // ... ], 'EC2InstanceIdsToTerminate' => ['<string>', ...], 'InstanceCount' => <integer>, 'InstanceGroupId' => '<string>', // REQUIRED 'ReconfigurationType' => 'OVERWRITE|MERGE', 'ShrinkPolicy' => [ 'DecommissionTimeout' => <integer>, 'InstanceResizePolicy' => [ 'InstanceTerminationTimeout' => <integer>, 'InstancesToProtect' => ['<string>', ...], 'InstancesToTerminate' => ['<string>', ...], ], ], ], // ... ], ]);
Parameter Details
Members
- ClusterId
-
- Type: string
The ID of the cluster to which the instance group belongs.
- InstanceGroups
-
- Type: Array of InstanceGroupModifyConfig structures
Instance groups to change.
Result Syntax
[]
Result Details
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
PutAutoScalingPolicy
$result = $client->putAutoScalingPolicy([/* ... */]);
$promise = $client->putAutoScalingPolicyAsync([/* ... */]);
Creates or updates an automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster. The automatic scaling policy defines how an instance group dynamically adds and terminates Amazon EC2 instances in response to the value of a CloudWatch metric.
Parameter Syntax
$result = $client->putAutoScalingPolicy([ 'AutoScalingPolicy' => [ // REQUIRED 'Constraints' => [ // REQUIRED 'MaxCapacity' => <integer>, // REQUIRED 'MinCapacity' => <integer>, // REQUIRED ], 'Rules' => [ // REQUIRED [ 'Action' => [ // REQUIRED 'Market' => 'ON_DEMAND|SPOT', 'SimpleScalingPolicyConfiguration' => [ // REQUIRED 'AdjustmentType' => 'CHANGE_IN_CAPACITY|PERCENT_CHANGE_IN_CAPACITY|EXACT_CAPACITY', 'CoolDown' => <integer>, 'ScalingAdjustment' => <integer>, // REQUIRED ], ], 'Description' => '<string>', 'Name' => '<string>', // REQUIRED 'Trigger' => [ // REQUIRED 'CloudWatchAlarmDefinition' => [ // REQUIRED 'ComparisonOperator' => 'GREATER_THAN_OR_EQUAL|GREATER_THAN|LESS_THAN|LESS_THAN_OR_EQUAL', // REQUIRED 'Dimensions' => [ [ 'Key' => '<string>', 'Value' => '<string>', ], // ... ], 'EvaluationPeriods' => <integer>, 'MetricName' => '<string>', // REQUIRED 'Namespace' => '<string>', 'Period' => <integer>, // REQUIRED 'Statistic' => 'SAMPLE_COUNT|AVERAGE|SUM|MINIMUM|MAXIMUM', 'Threshold' => <float>, // REQUIRED 'Unit' => 'NONE|SECONDS|MICRO_SECONDS|MILLI_SECONDS|BYTES|KILO_BYTES|MEGA_BYTES|GIGA_BYTES|TERA_BYTES|BITS|KILO_BITS|MEGA_BITS|GIGA_BITS|TERA_BITS|PERCENT|COUNT|BYTES_PER_SECOND|KILO_BYTES_PER_SECOND|MEGA_BYTES_PER_SECOND|GIGA_BYTES_PER_SECOND|TERA_BYTES_PER_SECOND|BITS_PER_SECOND|KILO_BITS_PER_SECOND|MEGA_BITS_PER_SECOND|GIGA_BITS_PER_SECOND|TERA_BITS_PER_SECOND|COUNT_PER_SECOND', ], ], ], // ... ], ], 'ClusterId' => '<string>', // REQUIRED 'InstanceGroupId' => '<string>', // REQUIRED ]);
Parameter Details
Members
- AutoScalingPolicy
-
- Required: Yes
- Type: AutoScalingPolicy structure
Specifies the definition of the automatic scaling policy.
- ClusterId
-
- Required: Yes
- Type: string
Specifies the ID of a cluster. The instance group to which the automatic scaling policy is applied is within this cluster.
- InstanceGroupId
-
- Required: Yes
- Type: string
Specifies the ID of the instance group to which the automatic scaling policy is applied.
Result Syntax
[ 'AutoScalingPolicy' => [ 'Constraints' => [ 'MaxCapacity' => <integer>, 'MinCapacity' => <integer>, ], 'Rules' => [ [ 'Action' => [ 'Market' => 'ON_DEMAND|SPOT', 'SimpleScalingPolicyConfiguration' => [ 'AdjustmentType' => 'CHANGE_IN_CAPACITY|PERCENT_CHANGE_IN_CAPACITY|EXACT_CAPACITY', 'CoolDown' => <integer>, 'ScalingAdjustment' => <integer>, ], ], 'Description' => '<string>', 'Name' => '<string>', 'Trigger' => [ 'CloudWatchAlarmDefinition' => [ 'ComparisonOperator' => 'GREATER_THAN_OR_EQUAL|GREATER_THAN|LESS_THAN|LESS_THAN_OR_EQUAL', 'Dimensions' => [ [ 'Key' => '<string>', 'Value' => '<string>', ], // ... ], 'EvaluationPeriods' => <integer>, 'MetricName' => '<string>', 'Namespace' => '<string>', 'Period' => <integer>, 'Statistic' => 'SAMPLE_COUNT|AVERAGE|SUM|MINIMUM|MAXIMUM', 'Threshold' => <float>, 'Unit' => 'NONE|SECONDS|MICRO_SECONDS|MILLI_SECONDS|BYTES|KILO_BYTES|MEGA_BYTES|GIGA_BYTES|TERA_BYTES|BITS|KILO_BITS|MEGA_BITS|GIGA_BITS|TERA_BITS|PERCENT|COUNT|BYTES_PER_SECOND|KILO_BYTES_PER_SECOND|MEGA_BYTES_PER_SECOND|GIGA_BYTES_PER_SECOND|TERA_BYTES_PER_SECOND|BITS_PER_SECOND|KILO_BITS_PER_SECOND|MEGA_BITS_PER_SECOND|GIGA_BITS_PER_SECOND|TERA_BITS_PER_SECOND|COUNT_PER_SECOND', ], ], ], // ... ], 'Status' => [ 'State' => 'PENDING|ATTACHING|ATTACHED|DETACHING|DETACHED|FAILED', 'StateChangeReason' => [ 'Code' => 'USER_REQUEST|PROVISION_FAILURE|CLEANUP_FAILURE', 'Message' => '<string>', ], ], ], 'ClusterArn' => '<string>', 'ClusterId' => '<string>', 'InstanceGroupId' => '<string>', ]
Result Details
Members
- AutoScalingPolicy
-
- Type: AutoScalingPolicyDescription structure
The automatic scaling policy definition.
- ClusterArn
-
- Type: string
The Amazon Resource Name (ARN) of the cluster.
- ClusterId
-
- Type: string
Specifies the ID of a cluster. The instance group to which the automatic scaling policy is applied is within this cluster.
- InstanceGroupId
-
- Type: string
Specifies the ID of the instance group to which the scaling policy is applied.
Errors
There are no errors described for this operation.
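The AutoScalingPolicy structure is deeply nested, so it can help to assemble it from a few scalar inputs. The sketch below builds a policy with a single rule that contains only the members marked REQUIRED in the parameter syntax, plus AdjustmentType. The helper name, the generated rule name, and the choice of CHANGE_IN_CAPACITY are assumptions made for illustration.

```php
<?php
/**
 * Builds an AutoScalingPolicy array with one simple scaling rule.
 */
function buildSingleRulePolicy(
    int $minCapacity,
    int $maxCapacity,
    string $metricName,
    string $comparisonOperator,
    float $threshold,
    int $periodSeconds,
    int $scalingAdjustment
): array {
    if ($minCapacity > $maxCapacity) {
        throw new InvalidArgumentException('MinCapacity cannot exceed MaxCapacity');
    }

    return [
        'Constraints' => [
            'MinCapacity' => $minCapacity,
            'MaxCapacity' => $maxCapacity,
        ],
        'Rules' => [
            [
                'Name' => 'rule-' . $metricName, // illustrative naming scheme
                'Action' => [
                    'SimpleScalingPolicyConfiguration' => [
                        'AdjustmentType' => 'CHANGE_IN_CAPACITY',
                        'ScalingAdjustment' => $scalingAdjustment,
                    ],
                ],
                'Trigger' => [
                    'CloudWatchAlarmDefinition' => [
                        'ComparisonOperator' => $comparisonOperator,
                        'MetricName' => $metricName,
                        'Period' => $periodSeconds,
                        'Threshold' => $threshold,
                    ],
                ],
            ],
        ],
    ];
}
```

The policy would then be sent as the AutoScalingPolicy member, alongside ClusterId and InstanceGroupId, in a putAutoScalingPolicy call. The metric name and threshold are whatever CloudWatch metric you choose to scale on.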
PutAutoTerminationPolicy
$result = $client->putAutoTerminationPolicy([/* ... */]);
$promise = $client->putAutoTerminationPolicyAsync([/* ... */]);
Auto-termination is supported in Amazon EMR releases 5.30.0 and 6.1.0 and later. For more information, see Using an auto-termination policy.
Creates or updates an auto-termination policy for an Amazon EMR cluster. An auto-termination policy defines the amount of idle time in seconds after which a cluster automatically terminates. For alternative cluster termination options, see Control cluster termination.
Parameter Syntax
$result = $client->putAutoTerminationPolicy([ 'AutoTerminationPolicy' => [ 'IdleTimeout' => <integer>, ], 'ClusterId' => '<string>', // REQUIRED ]);
Parameter Details
Members
- AutoTerminationPolicy
-
- Type: AutoTerminationPolicy structure
Specifies the auto-termination policy to attach to the cluster.
- ClusterId
-
- Required: Yes
- Type: string
Specifies the ID of the Amazon EMR cluster to which the auto-termination policy will be attached.
Result Syntax
[]
Result Details
Errors
There are no errors described for this operation.
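IdleTimeout is expressed in seconds, which is easy to get wrong when thinking in minutes. The sketch below converts minutes to seconds while building the parameters. The helper name is hypothetical, and only a basic positivity check is applied because this page does not state the allowed range.

```php
<?php
/**
 * Builds putAutoTerminationPolicy parameters from an idle time in minutes.
 */
function buildAutoTerminationRequest(string $clusterId, int $idleMinutes): array
{
    if ($idleMinutes < 1) {
        throw new InvalidArgumentException('idleMinutes must be at least 1');
    }

    return [
        'ClusterId' => $clusterId,
        'AutoTerminationPolicy' => [
            'IdleTimeout' => $idleMinutes * 60, // the API expects seconds
        ],
    ];
}
```

For example, buildAutoTerminationRequest($clusterId, 30) describes a policy that terminates the cluster after 1800 seconds of idle time.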
PutBlockPublicAccessConfiguration
$result = $client->putBlockPublicAccessConfiguration([/* ... */]);
$promise = $client->putBlockPublicAccessConfigurationAsync([/* ... */]);
Creates or updates an Amazon EMR block public access configuration for your Amazon Web Services account in the current Region. For more information, see Configure Block Public Access for Amazon EMR in the Amazon EMR Management Guide.
Parameter Syntax
$result = $client->putBlockPublicAccessConfiguration([ 'BlockPublicAccessConfiguration' => [ // REQUIRED 'BlockPublicSecurityGroupRules' => true || false, // REQUIRED 'PermittedPublicSecurityGroupRuleRanges' => [ [ 'MaxRange' => <integer>, 'MinRange' => <integer>, // REQUIRED ], // ... ], ], ]);
Parameter Details
Members
- BlockPublicAccessConfiguration
-
- Required: Yes
- Type: BlockPublicAccessConfiguration structure
A configuration for Amazon EMR block public access. The configuration applies to all clusters created in your account for the current Region. The configuration specifies whether block public access is enabled. If block public access is enabled, security groups associated with the cluster cannot have rules that allow inbound traffic from 0.0.0.0/0 or ::/0 on a port, unless the port is specified as an exception using PermittedPublicSecurityGroupRuleRanges in the BlockPublicAccessConfiguration. By default, Port 22 (SSH) is an exception, and public access is allowed on this port. You can change this by updating BlockPublicSecurityGroupRules to remove the exception.
For accounts that created clusters in a Region before November 25, 2019, block public access is disabled by default in that Region. To use this feature, you must manually enable and configure it. For accounts that did not create an Amazon EMR cluster in a Region before this date, block public access is enabled by default in that Region.
Result Syntax
[]
Result Details
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
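The configuration described above can be sketched as follows: block public security group rules while keeping port 22 as the only permitted exception. The helper `blockPublicAccessParams` is a hypothetical name introduced for illustration, and the call on the client is left as a comment because it requires a configured `Aws\Emr\EmrClient`.

```php
<?php
// Hypothetical helper (not part of the SDK): builds the parameter array for
// putBlockPublicAccessConfiguration. $permittedRanges is a list of
// [min, max] port pairs that stay exempt from blocking.
function blockPublicAccessParams(bool $block, array $permittedRanges): array
{
    $ranges = [];
    foreach ($permittedRanges as [$min, $max]) {
        $ranges[] = ['MinRange' => $min, 'MaxRange' => $max];
    }

    return [
        'BlockPublicAccessConfiguration' => [
            'BlockPublicSecurityGroupRules' => $block,
            'PermittedPublicSecurityGroupRuleRanges' => $ranges,
        ],
    ];
}

// Enable block public access and permit only SSH (port 22).
$params = blockPublicAccessParams(true, [[22, 22]]);

// With a configured Aws\Emr\EmrClient in $client, the call would be:
// $result = $client->putBlockPublicAccessConfiguration($params);
```

Passing an empty list of ranges in this sketch would produce a configuration with no port exceptions at all.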
PutManagedScalingPolicy
$result = $client->putManagedScalingPolicy([/* ... */]);
$promise = $client->putManagedScalingPolicyAsync([/* ... */]);
Creates or updates a managed scaling policy for an Amazon EMR cluster. The managed scaling policy defines the limits for resources, such as Amazon EC2 instances, that can be added to or terminated from a cluster. The policy applies only to the core and task nodes. The master node cannot be scaled after initial configuration.
Parameter Syntax
$result = $client->putManagedScalingPolicy([
    'ClusterId' => '<string>', // REQUIRED
    'ManagedScalingPolicy' => [ // REQUIRED
        'ComputeLimits' => [
            'MaximumCapacityUnits' => <integer>, // REQUIRED
            'MaximumCoreCapacityUnits' => <integer>,
            'MaximumOnDemandCapacityUnits' => <integer>,
            'MinimumCapacityUnits' => <integer>, // REQUIRED
            'UnitType' => 'InstanceFleetUnits|Instances|VCPU', // REQUIRED
        ],
    ],
]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
Specifies the ID of an Amazon EMR cluster where the managed scaling policy is attached.
- ManagedScalingPolicy
-
- Required: Yes
- Type: ManagedScalingPolicy structure
Specifies the constraints for the managed scaling policy.
Result Syntax
[]
Result Details
Errors
There are no errors described for this operation.
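A minimal sketch of assembling the compute limits described above. The helper `managedScalingParams` and the cluster ID are hypothetical, and the check that the minimum does not exceed the maximum is an assumption of this sketch rather than a rule stated in this reference.

```php
<?php
// Hypothetical helper (not part of the SDK): builds the parameter array
// for putManagedScalingPolicy and rejects obviously inconsistent input.
function managedScalingParams(
    string $clusterId,
    int $minUnits,
    int $maxUnits,
    string $unitType = 'Instances'
): array {
    // UnitType values as listed in the parameter syntax.
    $allowed = ['InstanceFleetUnits', 'Instances', 'VCPU'];
    if (!in_array($unitType, $allowed, true)) {
        throw new InvalidArgumentException("Unsupported UnitType: {$unitType}");
    }
    // Local sanity check: an assumption of this sketch, not a documented rule.
    if ($minUnits > $maxUnits) {
        throw new InvalidArgumentException('Minimum exceeds maximum capacity.');
    }

    return [
        'ClusterId' => $clusterId,
        'ManagedScalingPolicy' => [
            'ComputeLimits' => [
                'MinimumCapacityUnits' => $minUnits,
                'MaximumCapacityUnits' => $maxUnits,
                'UnitType' => $unitType,
            ],
        ],
    ];
}

// Placeholder cluster ID; scale between 2 and 10 instances.
$params = managedScalingParams('j-EXAMPLE1234567', 2, 10);

// With a configured Aws\Emr\EmrClient in $client, the call would be:
// $result = $client->putManagedScalingPolicy($params);
```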
RemoveAutoScalingPolicy
$result = $client->removeAutoScalingPolicy([/* ... */]);
$promise = $client->removeAutoScalingPolicyAsync([/* ... */]);
Removes an automatic scaling policy from a specified instance group within an Amazon EMR cluster.
Parameter Syntax
$result = $client->removeAutoScalingPolicy([
    'ClusterId' => '<string>', // REQUIRED
    'InstanceGroupId' => '<string>', // REQUIRED
]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
Specifies the ID of a cluster. The instance group to which the automatic scaling policy is applied is within this cluster.
- InstanceGroupId
-
- Required: Yes
- Type: string
Specifies the ID of the instance group to which the scaling policy is applied.
Result Syntax
[]
Result Details
Errors
There are no errors described for this operation.
RemoveAutoTerminationPolicy
$result = $client->removeAutoTerminationPolicy([/* ... */]);
$promise = $client->removeAutoTerminationPolicyAsync([/* ... */]);
Removes an auto-termination policy from an Amazon EMR cluster.
Parameter Syntax
$result = $client->removeAutoTerminationPolicy([
    'ClusterId' => '<string>', // REQUIRED
]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
Specifies the ID of the Amazon EMR cluster from which the auto-termination policy will be removed.
Result Syntax
[]
Result Details
Errors
There are no errors described for this operation.
RemoveManagedScalingPolicy
$result = $client->removeManagedScalingPolicy([/* ... */]);
$promise = $client->removeManagedScalingPolicyAsync([/* ... */]);
Removes a managed scaling policy from a specified Amazon EMR cluster.
Parameter Syntax
$result = $client->removeManagedScalingPolicy([
    'ClusterId' => '<string>', // REQUIRED
]);
Parameter Details
Members
- ClusterId
-
- Required: Yes
- Type: string
Specifies the ID of the cluster from which the managed scaling policy will be removed.
Result Syntax
[]
Result Details
Errors
There are no errors described for this operation.
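RemoveAutoTerminationPolicy and RemoveManagedScalingPolicy both take only a ClusterId, so one parameter array can serve both calls. The sketch below assumes a hypothetical helper named `removeClusterPolicies`; `$client` is expected to be a configured `Aws\Emr\EmrClient`, though any object that exposes the same methods would work.

```php
<?php
// Hypothetical helper (not part of the SDK): removes the auto-termination
// policy and the managed scaling policy from a single cluster.
function removeClusterPolicies(object $client, string $clusterId): void
{
    $params = ['ClusterId' => $clusterId]; // REQUIRED by both operations

    $client->removeAutoTerminationPolicy($params);
    $client->removeManagedScalingPolicy($params);
}

// Usage, given a configured client and a placeholder cluster ID:
// removeClusterPolicies($client, 'j-EXAMPLE1234567');
```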
RemoveTags
$result = $client->removeTags([/* ... */]);
$promise = $client->removeTagsAsync([/* ... */]);
Removes tags from an Amazon EMR resource, such as a cluster or Amazon EMR Studio. Tags make it easier to associate resources in various ways, such as grouping clusters to track your Amazon EMR resource allocation costs. For more information, see Tag Clusters.
The following example removes the stack tag with value Prod from a cluster:
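A minimal sketch of such a call with the PHP client. The cluster ID is a placeholder, and the call is shown as a comment because it needs a configured `Aws\Emr\EmrClient`. Note that TagKeys takes only the key (`stack`); the tag value (`Prod`) is not part of the request.

```php
<?php
// Placeholder cluster ID; replace it with a real cluster or Studio ID.
$params = [
    'ResourceId' => 'j-EXAMPLE1234567', // REQUIRED
    'TagKeys' => ['stack'],             // REQUIRED: keys only, not values
];

// With a configured Aws\Emr\EmrClient in $client, the call would be:
// $result = $client->removeTags($params);
```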
Parameter Syntax
$result = $client->removeTags([
    'ResourceId' => '<string>', // REQUIRED
    'TagKeys' => ['<string>', ...], // REQUIRED
]);
Parameter Details
Members
- ResourceId
-
- Required: Yes
- Type: string
The Amazon EMR resource identifier from which tags will be removed. For example, a cluster identifier or an Amazon EMR Studio ID.
- TagKeys
-
- Required: Yes
- Type: Array of strings
A list of tag keys to remove from the resource.
Result Syntax
[]
Result Details
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
RunJobFlow
$result = $client->runJobFlow([/* ... */]);
$promise = $client->runJobFlowAsync([/* ... */]);
RunJobFlow creates and starts running a new cluster (job flow). The cluster runs the steps specified. After the steps complete, the cluster stops and the HDFS partition is lost. To prevent loss of data, configure the last step of the job flow to store results in Amazon S3. If the JobFlowInstancesConfig KeepJobFlowAliveWhenNoSteps parameter is set to TRUE, the cluster transitions to the WAITING state rather than shutting down after the steps have completed.
For additional protection, you can set the JobFlowInstancesConfig TerminationProtected parameter to TRUE to lock the cluster and prevent it from being terminated by API call, user intervention, or in the event of a job flow error.
A maximum of 256 steps are allowed in each job flow.
If your cluster is long-running (such as a Hive data warehouse) or complex, you may require more than 256 steps to process your data. You can bypass the 256-step limitation in various ways, including using the SSH shell to connect to the master node and submitting queries directly to the software running on the master node, such as Hive and Hadoop.
For long-running clusters, we recommend that you periodically store your results.
The instance fleets configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions. The RunJobFlow request can contain InstanceFleets parameters or InstanceGroups parameters, but not both.
Parameter Syntax
$result = $client->runJobFlow([
    'AdditionalInfo' => '<string>',
    'AmiVersion' => '<string>',
    'Applications' => [
        [
            'AdditionalInfo' => ['<string>', ...],
            'Args' => ['<string>', ...],
            'Name' => '<string>',
            'Version' => '<string>',
        ],
        // ...
    ],
    'AutoScalingRole' => '<string>',
    'AutoTerminationPolicy' => [
        'IdleTimeout' => <integer>,
    ],
    'BootstrapActions' => [
        [
            'Name' => '<string>', // REQUIRED
            'ScriptBootstrapAction' => [ // REQUIRED
                'Args' => ['<string>', ...],
                'Path' => '<string>', // REQUIRED
            ],
        ],
        // ...
    ],
    'Configurations' => [
        [
            'Classification' => '<string>',
            'Configurations' => [...], // RECURSIVE
            'Properties' => ['<string>', ...],
        ],
        // ...
    ],
    'CustomAmiId' => '<string>',
    'EbsRootVolumeIops' => <integer>,
    'EbsRootVolumeSize' => <integer>,
    'EbsRootVolumeThroughput' => <integer>,
    'Instances' => [ // REQUIRED
        'AdditionalMasterSecurityGroups' => ['<string>', ...],
        'AdditionalSlaveSecurityGroups' => ['<string>', ...],
        'Ec2KeyName' => '<string>',
        'Ec2SubnetId' => '<string>',
        'Ec2SubnetIds' => ['<string>', ...],
        'EmrManagedMasterSecurityGroup' => '<string>',
        'EmrManagedSlaveSecurityGroup' => '<string>',
        'HadoopVersion' => '<string>',
        'InstanceCount' => <integer>,
        'InstanceFleets' => [
            [
                'Context' => '<string>',
                'InstanceFleetType' => 'MASTER|CORE|TASK', // REQUIRED
                'InstanceTypeConfigs' => [
                    [
                        'BidPrice' => '<string>',
                        'BidPriceAsPercentageOfOnDemandPrice' => <float>,
                        'Configurations' => [
                            [
                                'Classification' => '<string>',
                                'Configurations' => [...], // RECURSIVE
                                'Properties' => ['<string>', ...],
                            ],
                            // ...
                        ],
                        'CustomAmiId' => '<string>',
                        'EbsConfiguration' => [
                            'EbsBlockDeviceConfigs' => [
                                [
                                    'VolumeSpecification' => [ // REQUIRED
                                        'Iops' => <integer>,
                                        'SizeInGB' => <integer>, // REQUIRED
                                        'Throughput' => <integer>,
                                        'VolumeType' => '<string>', // REQUIRED
                                    ],
                                    'VolumesPerInstance' => <integer>,
                                ],
                                // ...
                            ],
                            'EbsOptimized' => true || false,
                        ],
                        'InstanceType' => '<string>', // REQUIRED
                        'Priority' => <float>,
                        'WeightedCapacity' => <integer>,
                    ],
                    // ...
                ],
                'LaunchSpecifications' => [
                    'OnDemandSpecification' => [
                        'AllocationStrategy' => 'lowest-price|prioritized', // REQUIRED
                        'CapacityReservationOptions' => [
                            'CapacityReservationPreference' => 'open|none',
                            'CapacityReservationResourceGroupArn' => '<string>',
                            'UsageStrategy' => 'use-capacity-reservations-first',
                        ],
                    ],
                    'SpotSpecification' => [
                        'AllocationStrategy' => 'capacity-optimized|price-capacity-optimized|lowest-price|diversified|capacity-optimized-prioritized',
                        'BlockDurationMinutes' => <integer>,
                        'TimeoutAction' => 'SWITCH_TO_ON_DEMAND|TERMINATE_CLUSTER', // REQUIRED
                        'TimeoutDurationMinutes' => <integer>, // REQUIRED
                    ],
                ],
                'Name' => '<string>',
                'ResizeSpecifications' => [
                    'OnDemandResizeSpecification' => [
                        'AllocationStrategy' => 'lowest-price|prioritized',
                        'CapacityReservationOptions' => [
                            'CapacityReservationPreference' => 'open|none',
                            'CapacityReservationResourceGroupArn' => '<string>',
                            'UsageStrategy' => 'use-capacity-reservations-first',
                        ],
                        'TimeoutDurationMinutes' => <integer>,
                    ],
                    'SpotResizeSpecification' => [
                        'AllocationStrategy' => 'capacity-optimized|price-capacity-optimized|lowest-price|diversified|capacity-optimized-prioritized',
                        'TimeoutDurationMinutes' => <integer>,
                    ],
                ],
                'TargetOnDemandCapacity' => <integer>,
                'TargetSpotCapacity' => <integer>,
            ],
            // ...
        ],
        'InstanceGroups' => [
            [
                'AutoScalingPolicy' => [
                    'Constraints' => [ // REQUIRED
                        'MaxCapacity' => <integer>, // REQUIRED
                        'MinCapacity' => <integer>, // REQUIRED
                    ],
                    'Rules' => [ // REQUIRED
                        [
                            'Action' => [ // REQUIRED
                                'Market' => 'ON_DEMAND|SPOT',
                                'SimpleScalingPolicyConfiguration' => [ // REQUIRED
                                    'AdjustmentType' => 'CHANGE_IN_CAPACITY|PERCENT_CHANGE_IN_CAPACITY|EXACT_CAPACITY',
                                    'CoolDown' => <integer>,
                                    'ScalingAdjustment' => <integer>, // REQUIRED
                                ],
                            ],
                            'Description' => '<string>',
                            'Name' => '<string>', // REQUIRED
                            'Trigger' => [ // REQUIRED
                                'CloudWatchAlarmDefinition' => [ // REQUIRED
                                    'ComparisonOperator' => 'GREATER_THAN_OR_EQUAL|GREATER_THAN|LESS_THAN|LESS_THAN_OR_EQUAL', // REQUIRED
                                    'Dimensions' => [
                                        [
                                            'Key' => '<string>',
                                            'Value' => '<string>',
                                        ],
                                        // ...
                                    ],
                                    'EvaluationPeriods' => <integer>,
                                    'MetricName' => '<string>', // REQUIRED
                                    'Namespace' => '<string>',
                                    'Period' => <integer>, // REQUIRED
                                    'Statistic' => 'SAMPLE_COUNT|AVERAGE|SUM|MINIMUM|MAXIMUM',
                                    'Threshold' => <float>, // REQUIRED
                                    'Unit' => 'NONE|SECONDS|MICRO_SECONDS|MILLI_SECONDS|BYTES|KILO_BYTES|MEGA_BYTES|GIGA_BYTES|TERA_BYTES|BITS|KILO_BITS|MEGA_BITS|GIGA_BITS|TERA_BITS|PERCENT|COUNT|BYTES_PER_SECOND|KILO_BYTES_PER_SECOND|MEGA_BYTES_PER_SECOND|GIGA_BYTES_PER_SECOND|TERA_BYTES_PER_SECOND|BITS_PER_SECOND|KILO_BITS_PER_SECOND|MEGA_BITS_PER_SECOND|GIGA_BITS_PER_SECOND|TERA_BITS_PER_SECOND|COUNT_PER_SECOND',
                                ],
                            ],
                        ],
                        // ...
                    ],
                ],
                'BidPrice' => '<string>',
                'Configurations' => [
                    [
                        'Classification' => '<string>',
                        'Configurations' => [...], // RECURSIVE
                        'Properties' => ['<string>', ...],
                    ],
                    // ...
                ],
                'CustomAmiId' => '<string>',
                'EbsConfiguration' => [
                    'EbsBlockDeviceConfigs' => [
                        [
                            'VolumeSpecification' => [ // REQUIRED
                                'Iops' => <integer>,
                                'SizeInGB' => <integer>, // REQUIRED
                                'Throughput' => <integer>,
                                'VolumeType' => '<string>', // REQUIRED
                            ],
                            'VolumesPerInstance' => <integer>,
                        ],
                        // ...
                    ],
                    'EbsOptimized' => true || false,
                ],
                'InstanceCount' => <integer>, // REQUIRED
                'InstanceRole' => 'MASTER|CORE|TASK', // REQUIRED
                'InstanceType' => '<string>', // REQUIRED
                'Market' => 'ON_DEMAND|SPOT',
                'Name' => '<string>',
            ],
            // ...
        ],
        'KeepJobFlowAliveWhenNoSteps' => true || false,
        'MasterInstanceType' => '<string>',
        'Placement' => [
            'AvailabilityZone' => '<string>',
            'AvailabilityZones' => ['<string>', ...],
        ],
        'ServiceAccessSecurityGroup' => '<string>',
        'SlaveInstanceType' => '<string>',
        'TerminationProtected' => true || false,
        'UnhealthyNodeReplacement' => true || false,
    ],
    'JobFlowRole' => '<string>',
    'KerberosAttributes' => [
        'ADDomainJoinPassword' => '<string>',
        'ADDomainJoinUser' => '<string>',
        'CrossRealmTrustPrincipalPassword' => '<string>',
        'KdcAdminPassword' => '<string>', // REQUIRED
        'Realm' => '<string>', // REQUIRED
    ],
    'LogEncryptionKmsKeyId' => '<string>',
    'LogUri' => '<string>',
    'ManagedScalingPolicy' => [
        'ComputeLimits' => [
            'MaximumCapacityUnits' => <integer>, // REQUIRED
            'MaximumCoreCapacityUnits' => <integer>,
            'MaximumOnDemandCapacityUnits' => <integer>,
            'MinimumCapacityUnits' => <integer>, // REQUIRED
            'UnitType' => 'InstanceFleetUnits|Instances|VCPU', // REQUIRED
        ],
    ],
    'Name' => '<string>', // REQUIRED
    'NewSupportedProducts' => [
        [
            'Args' => ['<string>', ...],
            'Name' => '<string>',
        ],
        // ...
    ],
    'OSReleaseLabel' => '<string>',
    'PlacementGroupConfigs' => [
        [
            'InstanceRole' => 'MASTER|CORE|TASK', // REQUIRED
            'PlacementStrategy' => 'SPREAD|PARTITION|CLUSTER|NONE',
        ],
        // ...
    ],
    'ReleaseLabel' => '<string>',
    'RepoUpgradeOnBoot' => 'SECURITY|NONE',
    'ScaleDownBehavior' => 'TERMINATE_AT_INSTANCE_HOUR|TERMINATE_AT_TASK_COMPLETION',
    'SecurityConfiguration' => '<string>',
    'ServiceRole' => '<string>',
    'StepConcurrencyLevel' => <integer>,
    'Steps' => [
        [
            'ActionOnFailure' => 'TERMINATE_JOB_FLOW|TERMINATE_CLUSTER|CANCEL_AND_WAIT|CONTINUE',
            'HadoopJarStep' => [ // REQUIRED
                'Args' => ['<string>', ...],
                'Jar' => '<string>', // REQUIRED
                'MainClass' => '<string>',
                'Properties' => [
                    [
                        'Key' => '<string>',
                        'Value' => '<string>',
                    ],
                    // ...
                ],
            ],
            'Name' => '<string>', // REQUIRED
        ],
        // ...
    ],
    'SupportedProducts' => ['<string>', ...],
    'Tags' => [
        [
            'Key' => '<string>',
            'Value' => '<string>',
        ],
        // ...
    ],
    'VisibleToAllUsers' => true || false,
]);
Parameter Details
Members
- AdditionalInfo
-
- Type: string
A JSON string for selecting additional features.
- AmiVersion
-
- Type: string
Applies only to Amazon EMR AMI versions 3.x and 2.x. For Amazon EMR releases 4.0 and later, ReleaseLabel is used. To specify a custom AMI, use CustomAmiId.
- Applications
-
- Type: Array of Application structures
Applies to Amazon EMR releases 4.0 and later. A case-insensitive list of applications for Amazon EMR to install and configure when launching the cluster. For a list of applications available for each Amazon EMR release version, see the Amazon EMR Release Guide.
- AutoScalingRole
-
- Type: string
An IAM role for automatic scaling policies. The default role is EMR_AutoScaling_DefaultRole. The IAM role provides permissions that the automatic scaling feature requires to launch and terminate Amazon EC2 instances in an instance group.
- AutoTerminationPolicy
-
- Type: AutoTerminationPolicy structure
An auto-termination policy for an Amazon EMR cluster. An auto-termination policy defines the amount of idle time in seconds after which a cluster automatically terminates. For alternative cluster termination options, see Control cluster termination.
- BootstrapActions
-
- Type: Array of BootstrapActionConfig structures
A list of bootstrap actions to run before Hadoop starts on the cluster nodes.
- Configurations
-
- Type: Array of Configuration structures
For Amazon EMR releases 4.0 and later. The list of configurations supplied for the Amazon EMR cluster that you are creating.
- CustomAmiId
-
- Type: string
Available only in Amazon EMR releases 5.7.0 and later. The ID of a custom Amazon EBS-backed Linux AMI. If specified, Amazon EMR uses this AMI when it launches cluster Amazon EC2 instances. For more information about custom AMIs in Amazon EMR, see Using a Custom AMI in the Amazon EMR Management Guide. If omitted, the cluster uses the base Linux AMI for the ReleaseLabel specified. For Amazon EMR releases 2.x and 3.x, use AmiVersion instead.
For information about creating a custom AMI, see Creating an Amazon EBS-Backed Linux AMI in the Amazon Elastic Compute Cloud User Guide for Linux Instances. For information about finding an AMI ID, see Finding a Linux AMI.
- EbsRootVolumeIops
-
- Type: int
The IOPS of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 6.15.0 and later.
- EbsRootVolumeSize
-
- Type: int
The size, in GiB, of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 4.x and later.
- EbsRootVolumeThroughput
-
- Type: int
The throughput, in MiB/s, of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 6.15.0 and later.
- Instances
-
- Required: Yes
- Type: JobFlowInstancesConfig structure
A specification of the number and type of Amazon EC2 instances.
- JobFlowRole
-
- Type: string
Also called instance profile and Amazon EC2 role. An IAM role for an Amazon EMR cluster. The Amazon EC2 instances of the cluster assume this role. The default role is EMR_EC2_DefaultRole. In order to use the default role, you must have already created it using the CLI or console.
- KerberosAttributes
-
- Type: KerberosAttributes structure
Attributes for Kerberos configuration when Kerberos authentication is enabled using a security configuration. For more information see Use Kerberos Authentication in the Amazon EMR Management Guide.
- LogEncryptionKmsKeyId
-
- Type: string
The KMS key used for encrypting log files. If a value is not provided, the logs remain encrypted by AES-256. This attribute is only available with Amazon EMR releases 5.30.0 and later, excluding Amazon EMR 6.0.0.
- LogUri
-
- Type: string
The location in Amazon S3 to write the log files of the job flow. If a value is not provided, logs are not created.
- ManagedScalingPolicy
-
- Type: ManagedScalingPolicy structure
The specified managed scaling policy for an Amazon EMR cluster.
- Name
-
- Required: Yes
- Type: string
The name of the job flow.
- NewSupportedProducts
-
- Type: Array of SupportedProductConfig structures
For Amazon EMR releases 3.x and 2.x. For Amazon EMR releases 4.x and later, use Applications.
A list of strings that indicates third-party software to use with the job flow that accepts a user argument list. Amazon EMR accepts and forwards the argument list to the corresponding installation script as bootstrap action arguments. For more information, see "Launch a Job Flow on the MapR Distribution for Hadoop" in the Amazon EMR Developer Guide. Supported values are:
-
"mapr-m3" - launch the cluster using MapR M3 Edition.
-
"mapr-m5" - launch the cluster using MapR M5 Edition.
-
"mapr" with the user arguments specifying "--edition,m3" or "--edition,m5" - launch the job flow using MapR M3 or M5 Edition respectively.
-
"mapr-m7" - launch the cluster using MapR M7 Edition.
-
"hunk" - launch the cluster with the Hunk Big Data Analytics Platform.
-
"hue" - launch the cluster with Hue installed.
-
"spark" - launch the cluster with Apache Spark installed.
-
"ganglia" - launch the cluster with the Ganglia Monitoring System installed.
- OSReleaseLabel
-
- Type: string
Specifies a particular Amazon Linux release for all nodes in a cluster launch RunJobFlow request. If a release is not specified, Amazon EMR uses the latest validated Amazon Linux release for cluster launch.
- PlacementGroupConfigs
-
- Type: Array of PlacementGroupConfig structures
The specified placement group configuration for an Amazon EMR cluster.
- ReleaseLabel
-
- Type: string
The Amazon EMR release label, which determines the version of open-source application packages installed on the cluster. Release labels are in the form emr-x.x.x, where x.x.x is an Amazon EMR release version such as emr-5.14.0. For more information about Amazon EMR release versions and included application versions and features, see https://docs.aws.amazon.com/emr/latest/ReleaseGuide/. The release label applies only to Amazon EMR releases version 4.0 and later. Earlier versions use AmiVersion.
- RepoUpgradeOnBoot
-
- Type: string
Applies only when CustomAmiId is used. Specifies which updates from the Amazon Linux AMI package repositories to apply automatically when the instance boots using the AMI. If omitted, the default is SECURITY, which indicates that only security updates are applied. If NONE is specified, no updates are applied, and all updates must be applied manually.
- ScaleDownBehavior
-
- Type: string
Specifies the way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized. TERMINATE_AT_INSTANCE_HOUR indicates that Amazon EMR terminates nodes at the instance-hour boundary, regardless of when the request to terminate the instance was submitted. This option is only available with Amazon EMR 5.1.0 and later and is the default for clusters created using that version. TERMINATE_AT_TASK_COMPLETION indicates that Amazon EMR adds nodes to a deny list and drains tasks from nodes before terminating the Amazon EC2 instances, regardless of the instance-hour boundary. With either behavior, Amazon EMR removes the least active nodes first and blocks instance termination if it could lead to HDFS corruption. TERMINATE_AT_TASK_COMPLETION is available only in Amazon EMR releases 4.1.0 and later, and is the default for releases of Amazon EMR earlier than 5.1.0.
- SecurityConfiguration
-
- Type: string
The name of a security configuration to apply to the cluster.
- ServiceRole
-
- Type: string
The IAM role that Amazon EMR assumes in order to access Amazon Web Services resources on your behalf. If you've created a custom service role path, you must specify it for the service role when you launch your cluster.
- StepConcurrencyLevel
-
- Type: int
Specifies the number of steps that can be executed concurrently. The default value is 1. The maximum value is 256.
- Steps
-
- Type: Array of StepConfig structures
A list of steps to run.
- SupportedProducts
-
- Type: Array of strings
For Amazon EMR releases 3.x and 2.x. For Amazon EMR releases 4.x and later, use Applications.
A list of strings that indicates third-party software to use. For more information, see the Amazon EMR Developer Guide. Currently supported values are:
-
"mapr-m3" - launch the job flow using MapR M3 Edition.
-
"mapr-m5" - launch the job flow using MapR M5 Edition.
- Tags
-
- Type: Array of Tag structures
A list of tags to associate with a cluster and propagate to Amazon EC2 instances.
- VisibleToAllUsers
-
- Type: boolean
The VisibleToAllUsers parameter is no longer supported. By default, the value is set to true. Setting it to false now has no effect.
Set this value to true so that IAM principals in the Amazon Web Services account associated with the cluster can perform Amazon EMR actions on the cluster that their IAM policies allow. This value defaults to true for clusters created using the Amazon EMR API or the CLI create-cluster command.
When set to false, only the IAM principal that created the cluster and the Amazon Web Services account root user can perform Amazon EMR actions for the cluster, regardless of the IAM permissions policies attached to other IAM principals. For more information, see Understanding the Amazon EMR cluster VisibleToAllUsers setting in the Amazon EMR Management Guide.
Result Syntax
[ 'ClusterArn' => '<string>', 'JobFlowId' => '<string>', ]
Result Details
Members
- ClusterArn
-
- Type: string
The Amazon Resource Name (ARN) of the cluster.
- JobFlowId
-
- Type: string
A unique identifier for the job flow.
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
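The following is a minimal sketch of a RunJobFlow request for a transient cluster whose single step writes its output to Amazon S3, as the description above recommends. All names, bucket paths, instance types, and the `EMR_DefaultRole` service role are illustrative assumptions, and `assertSingleInstanceLayout` is a hypothetical guard (not part of the SDK) reflecting the rule that a request may contain InstanceFleets or InstanceGroups, but not both.

```php
<?php
// Hypothetical guard (not part of the SDK): a RunJobFlow request may contain
// InstanceFleets or InstanceGroups, but not both.
function assertSingleInstanceLayout(array $instances): void
{
    if (isset($instances['InstanceFleets'], $instances['InstanceGroups'])) {
        throw new InvalidArgumentException(
            'Specify InstanceFleets or InstanceGroups, not both.'
        );
    }
}

// All values below are placeholders chosen for illustration.
$params = [
    'Name' => 'example-transient-cluster', // REQUIRED
    'ReleaseLabel' => 'emr-5.14.0',
    'LogUri' => 's3://example-bucket/emr-logs/',
    'JobFlowRole' => 'EMR_EC2_DefaultRole',
    'ServiceRole' => 'EMR_DefaultRole', // assumed role name
    'Instances' => [ // REQUIRED
        'KeepJobFlowAliveWhenNoSteps' => false, // shut down after the steps
        'TerminationProtected' => false,
        'InstanceGroups' => [
            ['InstanceRole' => 'MASTER', 'InstanceType' => 'm5.xlarge', 'InstanceCount' => 1],
            ['InstanceRole' => 'CORE', 'InstanceType' => 'm5.xlarge', 'InstanceCount' => 2],
        ],
    ],
    'Steps' => [
        [
            'Name' => 'example-step',
            'ActionOnFailure' => 'TERMINATE_CLUSTER',
            'HadoopJarStep' => [
                'Jar' => 's3://example-bucket/jars/example-job.jar',
                // The last argument points the job's output at Amazon S3.
                'Args' => ['s3://example-bucket/input/', 's3://example-bucket/output/'],
            ],
        ],
    ],
];

assertSingleInstanceLayout($params['Instances']);

// With a configured Aws\Emr\EmrClient in $client, the call would be:
// $result = $client->runJobFlow($params);
// $clusterId = $result['JobFlowId'];
```

The returned JobFlowId is the identifier that the Set* operations in this reference accept in their JobFlowIds parameter.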
SetKeepJobFlowAliveWhenNoSteps
$result = $client->setKeepJobFlowAliveWhenNoSteps([/* ... */]);
$promise = $client->setKeepJobFlowAliveWhenNoStepsAsync([/* ... */]);
You can use SetKeepJobFlowAliveWhenNoSteps to configure whether a cluster (job flow) terminates after all of its steps have been executed. If you want a transient cluster that shuts down after the last of the currently executing steps completes, set KeepJobFlowAliveWhenNoSteps to false. If you want a long-running cluster, set KeepJobFlowAliveWhenNoSteps to true.
For more information, see Managing Cluster Termination in the Amazon EMR Management Guide.
Parameter Syntax
$result = $client->setKeepJobFlowAliveWhenNoSteps([
    'JobFlowIds' => ['<string>', ...], // REQUIRED
    'KeepJobFlowAliveWhenNoSteps' => true || false, // REQUIRED
]);
Parameter Details
Members
- JobFlowIds
-
- Required: Yes
- Type: Array of strings
A list of strings that uniquely identify the clusters to protect. This identifier is returned by RunJobFlow and can also be obtained from DescribeJobFlows.
- KeepJobFlowAliveWhenNoSteps
-
- Required: Yes
- Type: boolean
A Boolean that indicates whether the cluster stays alive (true) or terminates (false) after all steps are executed.
Result Syntax
[]
Result Details
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
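A minimal sketch of the two configurations described above, transient and long-running. The helper `keepAliveParams` and the cluster ID are hypothetical; the client call is shown as a comment because it needs a configured `Aws\Emr\EmrClient`.

```php
<?php
// Hypothetical helper (not part of the SDK): builds parameters for
// setKeepJobFlowAliveWhenNoSteps. Pass true for a long-running cluster,
// false for a transient cluster that shuts down after its steps.
function keepAliveParams(array $clusterIds, bool $longRunning): array
{
    return [
        'JobFlowIds' => array_values($clusterIds),     // REQUIRED
        'KeepJobFlowAliveWhenNoSteps' => $longRunning, // REQUIRED
    ];
}

// Placeholder cluster ID.
$transient = keepAliveParams(['j-EXAMPLE1234567'], false);
$longRunning = keepAliveParams(['j-EXAMPLE1234567'], true);

// With a configured Aws\Emr\EmrClient in $client, the call would be:
// $result = $client->setKeepJobFlowAliveWhenNoSteps($transient);
```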
SetTerminationProtection
$result = $client->setTerminationProtection([/* ... */]);
$promise = $client->setTerminationProtectionAsync([/* ... */]);
SetTerminationProtection locks a cluster (job flow) so the Amazon EC2 instances in the cluster cannot be terminated by user intervention, an API call, or in the event of a job-flow error. The cluster still terminates upon successful completion of the job flow. Calling SetTerminationProtection on a cluster is similar to calling the Amazon EC2 DisableAPITermination API on all Amazon EC2 instances in a cluster.
SetTerminationProtection is used to prevent accidental termination of a cluster and to ensure that in the event of an error, the instances persist so that you can recover any data stored in their ephemeral instance storage.
To terminate a cluster that has been locked by setting SetTerminationProtection to true, you must first unlock the job flow by a subsequent call to SetTerminationProtection in which you set the value to false.
For more information, see Managing Cluster Termination in the Amazon EMR Management Guide.
Parameter Syntax
$result = $client->setTerminationProtection([
    'JobFlowIds' => ['<string>', ...], // REQUIRED
    'TerminationProtected' => true || false, // REQUIRED
]);
Parameter Details
Members
- JobFlowIds
-
- Required: Yes
- Type: Array of strings
A list of strings that uniquely identify the clusters to protect. This identifier is returned by RunJobFlow and can also be obtained from DescribeJobFlows.
- TerminationProtected
-
- Required: Yes
- Type: boolean
A Boolean that indicates whether to protect the cluster and prevent the Amazon EC2 instances in the cluster from shutting down due to API calls, user intervention, or job-flow error.
Result Syntax
[]
Result Details
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
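The unlock-then-terminate sequence described above can be sketched as follows. The helper `unlockAndTerminate` is hypothetical, and the sketch assumes the client also exposes the EMR TerminateJobFlows operation as `terminateJobFlows`, which is not described in this part of the reference. `$client` is expected to be a configured `Aws\Emr\EmrClient`.

```php
<?php
// Hypothetical helper (not part of the SDK): turns termination protection
// off for the given clusters, then requests their termination.
function unlockAndTerminate(object $client, array $clusterIds): void
{
    // Step 1: unlock the clusters.
    $client->setTerminationProtection([
        'JobFlowIds' => $clusterIds,
        'TerminationProtected' => false,
    ]);

    // Step 2: terminate them (assumes the TerminateJobFlows operation).
    $client->terminateJobFlows([
        'JobFlowIds' => $clusterIds,
    ]);
}

// Usage, given a configured client and a placeholder cluster ID:
// unlockAndTerminate($client, ['j-EXAMPLE1234567']);
```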
SetUnhealthyNodeReplacement
$result = $client->setUnhealthyNodeReplacement([/* ... */]);
$promise = $client->setUnhealthyNodeReplacementAsync([/* ... */]);
Specify whether to enable unhealthy node replacement, which lets Amazon EMR gracefully replace core nodes on a cluster if any nodes become unhealthy. For example, a node becomes unhealthy if disk usage is above 90%. If unhealthy node replacement is on and TerminationProtected is off, Amazon EMR immediately terminates the unhealthy core nodes. To use unhealthy node replacement and retain unhealthy core nodes, use SetTerminationProtection to turn on termination protection. In such cases, Amazon EMR adds the unhealthy nodes to a denylist, reducing job interruptions and failures.
If unhealthy node replacement is on, Amazon EMR notifies YARN and other applications on the cluster to stop scheduling tasks with these nodes, moves the data, and then terminates the nodes.
For more information, see graceful node replacement in the Amazon EMR Management Guide.
Parameter Syntax
$result = $client->setUnhealthyNodeReplacement([
    'JobFlowIds' => ['<string>', ...], // REQUIRED
    'UnhealthyNodeReplacement' => true || false, // REQUIRED
]);
Parameter Details
Members
- JobFlowIds
-
- Required: Yes
- Type: Array of strings
The list of strings that uniquely identify the clusters for which to turn on unhealthy node replacement. You can get these identifiers by running the RunJobFlow or the DescribeJobFlows operations.
- UnhealthyNodeReplacement
-
- Required: Yes
- Type: boolean
Indicates whether to turn on or turn off graceful unhealthy node replacement.
Result Syntax
[]
Result Details
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
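As a sketch of the "retain unhealthy core nodes" setup described above, the following turns on termination protection and then unhealthy node replacement for the same clusters. The helper `retainUnhealthyNodes` is hypothetical, and the ordering (protection first) is a design choice of this sketch, not a documented requirement. `$client` is expected to be a configured `Aws\Emr\EmrClient`.

```php
<?php
// Hypothetical helper (not part of the SDK): enables termination protection
// and unhealthy node replacement together.
function retainUnhealthyNodes(object $client, array $clusterIds): void
{
    // Protection first, so there is no window in which replacement is on
    // while termination protection is still off.
    $client->setTerminationProtection([
        'JobFlowIds' => $clusterIds,
        'TerminationProtected' => true,
    ]);

    $client->setUnhealthyNodeReplacement([
        'JobFlowIds' => $clusterIds,
        'UnhealthyNodeReplacement' => true,
    ]);
}

// Usage, given a configured client and a placeholder cluster ID:
// retainUnhealthyNodes($client, ['j-EXAMPLE1234567']);
```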
SetVisibleToAllUsers
$result = $client->setVisibleToAllUsers([/* ... */]);
$promise = $client->setVisibleToAllUsersAsync([/* ... */]);
The SetVisibleToAllUsers parameter is no longer supported. Your cluster may be visible to all users in your account. To restrict cluster access using an IAM policy, see Identity and Access Management for Amazon EMR.
Sets the Cluster$VisibleToAllUsers value for an Amazon EMR cluster. When true, IAM principals in the Amazon Web Services account can perform Amazon EMR cluster actions that their IAM policies allow. When false, only the IAM principal that created the cluster and the Amazon Web Services account root user can perform Amazon EMR actions on the cluster, regardless of IAM permissions policies attached to other IAM principals.
This action works on running clusters. When you create a cluster, use the RunJobFlowInput$VisibleToAllUsers parameter.
For more information, see Understanding the Amazon EMR Cluster VisibleToAllUsers Setting in the Amazon EMR Management Guide.
Parameter Syntax
$result = $client->setVisibleToAllUsers([
    'JobFlowIds' => ['<string>', ...], // REQUIRED
    'VisibleToAllUsers' => true || false, // REQUIRED
]);
Parameter Details
Members
- JobFlowIds
-
- Required: Yes
- Type: Array of strings
The unique identifier of the job flow (cluster).
- VisibleToAllUsers
-
- Required: Yes
- Type: boolean
A value of true indicates that an IAM principal in the Amazon Web Services account can perform Amazon EMR actions on the cluster that the IAM policies attached to the principal allow. A value of false indicates that only the IAM principal that created the cluster and the Amazon Web Services root user can perform Amazon EMR actions on the cluster.
Result Syntax
[]
Result Details
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
StartNotebookExecution
$result = $client->startNotebookExecution([/* ... */]);
$promise = $client->startNotebookExecutionAsync([/* ... */]);
Starts a notebook execution.
Parameter Syntax
$result = $client->startNotebookExecution([
    'EditorId' => '<string>',
    'EnvironmentVariables' => ['<string>', ...],
    'ExecutionEngine' => [ // REQUIRED
        'ExecutionRoleArn' => '<string>',
        'Id' => '<string>', // REQUIRED
        'MasterInstanceSecurityGroupId' => '<string>',
        'Type' => 'EMR',
    ],
    'NotebookExecutionName' => '<string>',
    'NotebookInstanceSecurityGroupId' => '<string>',
    'NotebookParams' => '<string>',
    'NotebookS3Location' => [
        'Bucket' => '<string>',
        'Key' => '<string>',
    ],
    'OutputNotebookFormat' => 'HTML',
    'OutputNotebookS3Location' => [
        'Bucket' => '<string>',
        'Key' => '<string>',
    ],
    'RelativePath' => '<string>',
    'ServiceRole' => '<string>', // REQUIRED
    'Tags' => [
        [
            'Key' => '<string>',
            'Value' => '<string>',
        ],
        // ...
    ],
]);
Parameter Details
Members
- EditorId
-
- Type: string
The unique identifier of the Amazon EMR Notebook to use for notebook execution.
- EnvironmentVariables
-
- Type: Associative array of custom strings keys (XmlStringMaxLen256) to strings
The environment variables associated with the notebook execution.
- ExecutionEngine
-
- Required: Yes
- Type: ExecutionEngineConfig structure
Specifies the execution engine (cluster) that runs the notebook execution.
- NotebookExecutionName
-
- Type: string
An optional name for the notebook execution.
- NotebookInstanceSecurityGroupId
-
- Type: string
The unique identifier of the Amazon EC2 security group to associate with the Amazon EMR Notebook for this notebook execution.
- NotebookParams
-
- Type: string
Input parameters in JSON format passed to the Amazon EMR Notebook at runtime for execution.
- NotebookS3Location
-
- Type: NotebookS3LocationFromInput structure
The Amazon S3 location for the notebook execution input.
- OutputNotebookFormat
-
- Type: string
The output format for the notebook execution.
- OutputNotebookS3Location
-
- Type: OutputNotebookS3LocationFromInput structure
The Amazon S3 location for the notebook execution output.
- RelativePath
-
- Type: string
The path and file name of the notebook file for this execution, relative to the path specified for the Amazon EMR Notebook. For example, if you specify a path of s3://MyBucket/MyNotebooks when you create an Amazon EMR Notebook for a notebook with an ID of e-ABCDEFGHIJK1234567890ABCD (the EditorId of this request), and you specify a RelativePath of my_notebook_executions/notebook_execution.ipynb, the location of the file for the notebook execution is s3://MyBucket/MyNotebooks/e-ABCDEFGHIJK1234567890ABCD/my_notebook_executions/notebook_execution.ipynb.
- ServiceRole
-
- Required: Yes
- Type: string
The name or ARN of the IAM role that is used as the service role for Amazon EMR (the Amazon EMR role) for the notebook execution.
- Tags
-
- Type: Array of Tag structures
A list of tags associated with a notebook execution. Tags are user-defined key-value pairs that consist of a required key string with a maximum of 128 characters and an optional value string with a maximum of 256 characters.
Result Syntax
[ 'NotebookExecutionId' => '<string>', ]
Result Details
Members
- NotebookExecutionId
-
- Type: string
The unique identifier of the notebook execution.
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
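The RelativePath rule described for StartNotebookExecution can be sketched as a small helper. This is only an illustration under stated assumptions: the function name resolveNotebookLocation is hypothetical and not part of the SDK, and the sample values are the ones used in the RelativePath description.

```php
<?php
// Hypothetical helper (not part of the AWS SDK): joins the notebook's base
// path, the EditorId, and the RelativePath in the order given by the
// RelativePath description.
function resolveNotebookLocation(string $basePath, string $editorId, string $relativePath): string
{
    return rtrim($basePath, '/') . '/' . $editorId . '/' . ltrim($relativePath, '/');
}

// Sample values taken from the RelativePath description.
$location = resolveNotebookLocation(
    's3://MyBucket/MyNotebooks',
    'e-ABCDEFGHIJK1234567890ABCD',
    'my_notebook_executions/notebook_execution.ipynb'
);

echo $location, PHP_EOL;
```

The helper only concatenates strings; it does not check that the file exists in Amazon S3.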
StopNotebookExecution
$result = $client->stopNotebookExecution([/* ... */]);
$promise = $client->stopNotebookExecutionAsync([/* ... */]);
Stops a notebook execution.
Parameter Syntax
$result = $client->stopNotebookExecution([ 'NotebookExecutionId' => '<string>', // REQUIRED ]);
Parameter Details
Members
- NotebookExecutionId
-
- Required: Yes
- Type: string
The unique identifier of the notebook execution.
Result Syntax
[]
Result Details
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
TerminateJobFlows
$result = $client->terminateJobFlows([/* ... */]);
$promise = $client->terminateJobFlowsAsync([/* ... */]);
TerminateJobFlows shuts a list of clusters (job flows) down. When a job flow is shut down, any step not yet completed is canceled and the Amazon EC2 instances on which the cluster is running are stopped. Any log files not already saved are uploaded to Amazon S3 if a LogUri was specified when the cluster was created.
The maximum number of clusters allowed is 10. The call to TerminateJobFlows is asynchronous. Depending on the configuration of the cluster, it may take 1 to 5 minutes for the cluster to completely terminate and release allocated resources, such as Amazon EC2 instances.
Parameter Syntax
$result = $client->terminateJobFlows([ 'JobFlowIds' => ['<string>', ...], // REQUIRED ]);
Parameter Details
Members
- JobFlowIds
-
- Required: Yes
- Type: Array of strings
A list of job flows to be shut down.
Result Syntax
[]
Result Details
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
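If the limit of 10 clusters is read as a per-request limit (an assumption; the description does not say so explicitly), a longer list of cluster IDs has to be split before calling TerminateJobFlows. A minimal sketch follows; buildTerminateBatches is a hypothetical helper, and the IDs are placeholders.

```php
<?php
// Hypothetical helper: splits cluster IDs into parameter arrays that each
// hold at most $maxPerCall entries. The value 10 comes from the
// TerminateJobFlows description; treating it as a per-call limit is an
// assumption of this sketch.
function buildTerminateBatches(array $jobFlowIds, int $maxPerCall = 10): array
{
    $batches = [];
    $uniqueIds = array_values(array_unique($jobFlowIds));
    foreach (array_chunk($uniqueIds, $maxPerCall) as $chunk) {
        $batches[] = ['JobFlowIds' => $chunk];
    }
    return $batches;
}

// Placeholder IDs for illustration only.
$ids = [];
for ($i = 1; $i <= 23; $i++) {
    $ids[] = sprintf('j-EXAMPLE%04d', $i);
}

$batches = buildTerminateBatches($ids);

// With a configured Aws\Emr\EmrClient in $client, each batch could be sent as:
// foreach ($batches as $params) { $client->terminateJobFlows($params); }
```

Because the call is asynchronous, a successful response to each batch only means the shutdown was requested, not that the clusters have finished terminating.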
UpdateStudio
$result = $client->updateStudio([/* ... */]);
$promise = $client->updateStudioAsync([/* ... */]);
Updates an Amazon EMR Studio configuration, including attributes such as name, description, and subnets.
Parameter Syntax
$result = $client->updateStudio([
    'DefaultS3Location' => '<string>',
    'Description' => '<string>',
    'EncryptionKeyArn' => '<string>',
    'Name' => '<string>',
    'StudioId' => '<string>', // REQUIRED
    'SubnetIds' => ['<string>', ...],
]);
Parameter Details
Members
- DefaultS3Location
-
- Type: string
The Amazon S3 location to back up Workspaces and notebook files for the Amazon EMR Studio.
- Description
-
- Type: string
A detailed description to assign to the Amazon EMR Studio.
- EncryptionKeyArn
-
- Type: string
The KMS key identifier (ARN) used to encrypt Amazon EMR Studio workspace and notebook files when backed up to Amazon S3.
- Name
-
- Type: string
A descriptive name for the Amazon EMR Studio.
- StudioId
-
- Required: Yes
- Type: string
The ID of the Amazon EMR Studio to update.
- SubnetIds
-
- Type: Array of strings
A list of subnet IDs to associate with the Amazon EMR Studio. The list can include new subnet IDs, but must also include all of the subnet IDs previously associated with the Studio. The list order does not matter. A Studio can have a maximum of 5 subnets. The subnets must belong to the same VPC as the Studio.
Result Syntax
[]
Result Details
Errors
- InternalServerException:
This exception occurs when there is an internal failure in the Amazon EMR service.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
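The SubnetIds rules for UpdateStudio (keep every previously associated subnet, at most 5 in total) can be checked before the request is sent. The sketch below is an illustration only: mergeStudioSubnets is a hypothetical helper, and the Studio and subnet IDs are placeholders.

```php
<?php
// Hypothetical helper: builds the SubnetIds list for updateStudio by keeping
// every previously associated subnet and appending the new ones. The limit
// of 5 comes from the SubnetIds description.
function mergeStudioSubnets(array $currentSubnetIds, array $newSubnetIds, int $maxSubnets = 5): array
{
    $merged = array_values(array_unique(array_merge($currentSubnetIds, $newSubnetIds)));
    if (count($merged) > $maxSubnets) {
        throw new InvalidArgumentException(
            sprintf('A Studio can have at most %d subnets, got %d.', $maxSubnets, count($merged))
        );
    }
    return $merged;
}

// Placeholder IDs for illustration only.
$params = [
    'StudioId'  => 'es-EXAMPLE12345',
    'SubnetIds' => mergeStudioSubnets(
        ['subnet-aaaa1111', 'subnet-bbbb2222'],   // already associated
        ['subnet-bbbb2222', 'subnet-cccc3333']    // requested additions
    ),
];

// With a configured Aws\Emr\EmrClient in $client:
// $client->updateStudio($params);
```

The helper does not verify that the subnets belong to the same VPC as the Studio; that check is left to the service.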
UpdateStudioSessionMapping
$result = $client->updateStudioSessionMapping([/* ... */]);
$promise = $client->updateStudioSessionMappingAsync([/* ... */]);
Updates the session policy attached to the user or group for the specified Amazon EMR Studio.
Parameter Syntax
$result = $client->updateStudioSessionMapping([
    'IdentityId' => '<string>',
    'IdentityName' => '<string>',
    'IdentityType' => 'USER|GROUP', // REQUIRED
    'SessionPolicyArn' => '<string>', // REQUIRED
    'StudioId' => '<string>', // REQUIRED
]);
Parameter Details
Members
- IdentityId
-
- Type: string
- IdentityName
-
- Type: string
The name of the user or group to update. For more information, see UserName and DisplayName in the IAM Identity Center Identity Store API Reference. Either IdentityName or IdentityId must be specified.
- IdentityType
-
- Required: Yes
- Type: string
Specifies whether the identity to update is a user or a group.
- SessionPolicyArn
-
- Required: Yes
- Type: string
The Amazon Resource Name (ARN) of the session policy to associate with the specified user or group.
- StudioId
-
- Required: Yes
- Type: string
The ID of the Amazon EMR Studio.
Result Syntax
[]
Result Details
Errors
- InternalServerError:
Indicates that an error occurred while processing the request and that the request was not completed.
- InvalidRequestException:
This exception occurs when there is something wrong with user input.
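The "either IdentityName or IdentityId" rule for UpdateStudioSessionMapping is not visible in the parameter syntax, so a local pre-flight check can help. The following is a sketch under assumptions: checkSessionMappingParams is a hypothetical helper, and all IDs, names, and ARNs are placeholders.

```php
<?php
// Hypothetical pre-flight check for updateStudioSessionMapping parameters.
// Returns a list of problems; an empty list means the array looks complete.
function checkSessionMappingParams(array $params): array
{
    $problems = [];
    foreach (['IdentityType', 'SessionPolicyArn', 'StudioId'] as $required) {
        if (empty($params[$required])) {
            $problems[] = "$required is required";
        }
    }
    if (empty($params['IdentityName']) && empty($params['IdentityId'])) {
        $problems[] = 'Either IdentityName or IdentityId must be specified';
    }
    if (!empty($params['IdentityType'])
        && !in_array($params['IdentityType'], ['USER', 'GROUP'], true)) {
        $problems[] = 'IdentityType must be USER or GROUP';
    }
    return $problems;
}

// Placeholder values for illustration only.
$params = [
    'StudioId'         => 'es-EXAMPLE12345',
    'IdentityType'     => 'USER',
    'IdentityName'     => 'example-user',
    'SessionPolicyArn' => 'arn:aws:iam::123456789012:policy/ExampleSessionPolicy',
];

$problems = checkSessionMappingParams($params);

// If $problems is empty and $client is a configured Aws\Emr\EmrClient:
// $client->updateStudioSessionMapping($params);
```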
Shapes
Application
Description
With Amazon EMR release version 4.0 and later, the only accepted parameter is the application name. To pass arguments to applications, you use configuration classifications specified using configuration JSON objects. For more information, see Configuring Applications.
With earlier Amazon EMR releases, the application is any Amazon or third-party software that you can add to the cluster. This structure contains a list of strings that indicates the software to use with the cluster and accepts a user argument list. Amazon EMR accepts and forwards the argument list to the corresponding installation script as a bootstrap action argument.
Members
- AdditionalInfo
-
- Type: Associative array of custom strings keys (String) to strings
This option is for advanced users only. This is meta information about third-party applications that third-party vendors use for testing purposes.
- Args
-
- Type: Array of strings
Arguments for Amazon EMR to pass to the application.
- Name
-
- Type: string
The name of the application.
- Version
-
- Type: string
The version of the application.
AutoScalingPolicy
Description
An automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster. An automatic scaling policy defines how an instance group dynamically adds and terminates Amazon EC2 instances in response to the value of a CloudWatch metric. See PutAutoScalingPolicy.
Members
- Constraints
-
- Required: Yes
- Type: ScalingConstraints structure
The upper and lower Amazon EC2 instance limits for an automatic scaling policy. Automatic scaling activity will not cause an instance group to grow above or below these limits.
- Rules
-
- Required: Yes
- Type: Array of ScalingRule structures
The scale-in and scale-out rules that comprise the automatic scaling policy.
AutoScalingPolicyDescription
Description
An automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster. The automatic scaling policy defines how an instance group dynamically adds and terminates Amazon EC2 instances in response to the value of a CloudWatch metric. See PutAutoScalingPolicy.
Members
- Constraints
-
- Type: ScalingConstraints structure
The upper and lower Amazon EC2 instance limits for an automatic scaling policy. Automatic scaling activity will not cause an instance group to grow above or below these limits.
- Rules
-
- Type: Array of ScalingRule structures
The scale-in and scale-out rules that comprise the automatic scaling policy.
- Status
-
- Type: AutoScalingPolicyStatus structure
The status of an automatic scaling policy.
AutoScalingPolicyStateChangeReason
Description
The reason for an AutoScalingPolicyStatus change.
Members
- Code
-
- Type: string
The code indicating the reason for the change in status. USER_REQUEST indicates that the scaling policy status was changed by a user. PROVISION_FAILURE indicates that the status change was because the policy failed to provision. CLEANUP_FAILURE indicates an error.
- Message
-
- Type: string
A friendly, more verbose message that accompanies an automatic scaling policy state change.
AutoScalingPolicyStatus
Description
The status of an automatic scaling policy.
Members
- State
-
- Type: string
Indicates the status of the automatic scaling policy.
- StateChangeReason
-
- Type: AutoScalingPolicyStateChangeReason structure
The reason for a change in status.
AutoTerminationPolicy
Description
An auto-termination policy for an Amazon EMR cluster. An auto-termination policy defines the amount of idle time in seconds after which a cluster automatically terminates. For alternative cluster termination options, see Control cluster termination.
Members
- IdleTimeout
-
- Type: long (int|float)
Specifies the amount of idle time in seconds after which the cluster automatically terminates. You can specify a minimum of 60 seconds and a maximum of 604800 seconds (seven days).
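Since IdleTimeout is expressed in seconds with a fixed allowed range, converting from hours and validating locally avoids a rejected request. A minimal sketch follows; buildAutoTerminationPolicy is a hypothetical helper, not an SDK function.

```php
<?php
// Hypothetical helper: builds an AutoTerminationPolicy array from a number
// of idle hours, enforcing the 60 to 604800 second range stated in the
// IdleTimeout description.
function buildAutoTerminationPolicy(float $idleHours): array
{
    $seconds = (int) round($idleHours * 3600);
    if ($seconds < 60 || $seconds > 604800) {
        throw new InvalidArgumentException(
            "IdleTimeout must be between 60 and 604800 seconds, got $seconds."
        );
    }
    return ['IdleTimeout' => $seconds];
}

$policy = buildAutoTerminationPolicy(2); // two hours of idle time
```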
BlockPublicAccessConfiguration
Description
A configuration for Amazon EMR block public access. When BlockPublicSecurityGroupRules is set to true, Amazon EMR prevents cluster creation if one of the cluster's security groups has a rule that allows inbound traffic from 0.0.0.0/0 or ::/0 on a port, unless the port is specified as an exception using PermittedPublicSecurityGroupRuleRanges.
Members
- BlockPublicSecurityGroupRules
-
- Required: Yes
- Type: boolean
Indicates whether Amazon EMR block public access is enabled (true) or disabled (false). By default, the value is false for accounts that have created Amazon EMR clusters before July 2019. For accounts created after this, the default is true.
- PermittedPublicSecurityGroupRuleRanges
-
- Type: Array of PortRange structures
Specifies ports and port ranges that are permitted to have security group rules that allow inbound traffic from all public sources. For example, if Port 23 (Telnet) is specified for PermittedPublicSecurityGroupRuleRanges, Amazon EMR allows cluster creation if a security group associated with the cluster has a rule that allows inbound traffic on Port 23 from IPv4 0.0.0.0/0 or IPv6 ::/0 as the source.
By default, Port 22, which is used for SSH access to the cluster Amazon EC2 instances, is in the list of PermittedPublicSecurityGroupRuleRanges.
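The exception mechanism of BlockPublicAccessConfiguration can be illustrated with a configuration array and a local range check. This is a sketch under assumptions: isPortPermitted is a hypothetical helper, and the PortRange keys MinRange and MaxRange are assumed here because the PortRange structure is not described in this part of the page.

```php
<?php
// Hypothetical helper: reports whether a port falls inside any permitted
// range. Assumes each PortRange entry carries 'MinRange' and an optional
// 'MaxRange'; a missing MaxRange is treated here as a single-port range.
function isPortPermitted(int $port, array $permittedRanges): bool
{
    foreach ($permittedRanges as $range) {
        $min = $range['MinRange'];
        $max = $range['MaxRange'] ?? $min;
        if ($port >= $min && $port <= $max) {
            return true;
        }
    }
    return false;
}

$config = [
    'BlockPublicSecurityGroupRules' => true,
    'PermittedPublicSecurityGroupRuleRanges' => [
        ['MinRange' => 22, 'MaxRange' => 22], // SSH, the default exception
        ['MinRange' => 23],                   // Telnet, as in the example
    ],
];

$sshAllowed = isPortPermitted(22, $config['PermittedPublicSecurityGroupRuleRanges']);
$webAllowed = isPortPermitted(8080, $config['PermittedPublicSecurityGroupRuleRanges']);
```

The check mirrors the documented behavior only locally; the actual decision to block cluster creation is made by Amazon EMR.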
BlockPublicAccessConfigurationMetadata
Description
Properties that describe the Amazon Web Services principal that created the BlockPublicAccessConfiguration using the PutBlockPublicAccessConfiguration action as well as the date and time that the configuration was created. Each time a configuration for block public access is updated, Amazon EMR updates this metadata.
Members
- CreatedByArn
-
- Required: Yes
- Type: string
The Amazon Resource Name that created or last modified the configuration.
- CreationDateTime
-
- Required: Yes
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time that the configuration was created.
BootstrapActionConfig
Description
Configuration of a bootstrap action.
Members
- Name
-
- Required: Yes
- Type: string
The name of the bootstrap action.
- ScriptBootstrapAction
-
- Required: Yes
- Type: ScriptBootstrapActionConfig structure
The script run by the bootstrap action.
BootstrapActionDetail
Description
Reports the configuration of a bootstrap action in a cluster (job flow).
Members
- BootstrapActionConfig
-
- Type: BootstrapActionConfig structure
A description of the bootstrap action.
CancelStepsInfo
Description
Specification of the status of a CancelSteps request. Available only in Amazon EMR version 4.8.0 and later, excluding version 5.0.0.
Members
- Reason
-
- Type: string
The reason for the failure if the CancelSteps request fails.
- Status
-
- Type: string
The status of a CancelSteps Request. The value may be SUBMITTED or FAILED.
- StepId
-
- Type: string
The encrypted StepId of a step.
CloudWatchAlarmDefinition
Description
The definition of a CloudWatch metric alarm, which determines when an automatic scaling activity is triggered. When the defined alarm conditions are satisfied, scaling activity begins.
Members
- ComparisonOperator
-
- Required: Yes
- Type: string
Determines how the metric specified by MetricName is compared to the value specified by Threshold.
- Dimensions
-
- Type: Array of MetricDimension structures
A CloudWatch metric dimension.
- EvaluationPeriods
-
- Type: int
The number of periods, in five-minute increments, during which the alarm condition must exist before the alarm triggers automatic scaling activity. The default value is 1.
- MetricName
-
- Required: Yes
- Type: string
The name of the CloudWatch metric that is watched to determine an alarm condition.
- Namespace
-
- Type: string
The namespace for the CloudWatch metric. The default is AWS/ElasticMapReduce.
- Period
-
- Required: Yes
- Type: int
The period, in seconds, over which the statistic is applied. CloudWatch metrics for Amazon EMR are emitted every five minutes (300 seconds), so if you specify a CloudWatch metric, specify 300.
- Statistic
-
- Type: string
The statistic to apply to the metric associated with the alarm. The default is AVERAGE.
- Threshold
-
- Required: Yes
- Type: double
The value against which the specified statistic is compared.
- Unit
-
- Type: string
The unit of measure associated with the CloudWatch metric being watched. The value specified for Unit must correspond to the units specified in the CloudWatch metric.
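The defaults and required members of CloudWatchAlarmDefinition can be combined in a small builder. This is a sketch, not SDK code: buildAlarmDefinition and secondsBeforeScaling are hypothetical helpers, the ComparisonOperator value LESS_THAN and the metric name YARNMemoryAvailablePercentage are assumed values not listed in this section, and multiplying EvaluationPeriods by Period is this sketch's reading of "number of periods".

```php
<?php
// Hypothetical helper: fills in the defaults named in the member
// descriptions and verifies that the required members are present.
function buildAlarmDefinition(array $overrides): array
{
    $defaults = [
        'EvaluationPeriods' => 1,
        'Namespace'         => 'AWS/ElasticMapReduce',
        'Statistic'         => 'AVERAGE',
    ];
    $definition = array_merge($defaults, $overrides);
    foreach (['ComparisonOperator', 'MetricName', 'Period', 'Threshold'] as $required) {
        if (!isset($definition[$required])) {
            throw new InvalidArgumentException("$required is required");
        }
    }
    return $definition;
}

// Hypothetical helper: how long the alarm condition must hold, in seconds.
function secondsBeforeScaling(array $definition): int
{
    return $definition['EvaluationPeriods'] * $definition['Period'];
}

$alarm = buildAlarmDefinition([
    'ComparisonOperator' => 'LESS_THAN',                     // assumed enum value
    'MetricName'         => 'YARNMemoryAvailablePercentage', // assumed metric name
    'Period'             => 300,                             // as advised for Amazon EMR metrics
    'Threshold'          => 15.0,
    'EvaluationPeriods'  => 3,
]);

$holdSeconds = secondsBeforeScaling($alarm);
```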
Cluster
Description
The detailed description of the cluster.
Members
- Applications
-
- Type: Array of Application structures
The applications installed on this cluster.
- AutoScalingRole
-
- Type: string
An IAM role for automatic scaling policies. The default role is EMR_AutoScaling_DefaultRole. The IAM role provides permissions that the automatic scaling feature requires to launch and terminate Amazon EC2 instances in an instance group.
- AutoTerminate
-
- Type: boolean
Specifies whether the cluster should terminate after completing all steps.
- ClusterArn
-
- Type: string
The Amazon Resource Name of the cluster.
- Configurations
-
- Type: Array of Configuration structures
Applies only to Amazon EMR releases 4.x and later. The list of configurations that are supplied to the Amazon EMR cluster.
- CustomAmiId
-
- Type: string
Available only in Amazon EMR releases 5.7.0 and later. The ID of a custom Amazon EBS-backed Linux AMI if the cluster uses a custom AMI.
- EbsRootVolumeIops
-
- Type: int
The IOPS of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 6.15.0 and later.
- EbsRootVolumeSize
-
- Type: int
The size, in GiB, of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 4.x and later.
- EbsRootVolumeThroughput
-
- Type: int
The throughput, in MiB/s, of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 6.15.0 and later.
- Ec2InstanceAttributes
-
- Type: Ec2InstanceAttributes structure
Provides information about the Amazon EC2 instances in a cluster grouped by category. For example, key name, subnet ID, IAM instance profile, and so on.
- Id
-
- Type: string
The unique identifier for the cluster.
- InstanceCollectionType
-
- Type: string
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.
The instance group configuration of the cluster. A value of INSTANCE_GROUP indicates a uniform instance group configuration. A value of INSTANCE_FLEET indicates an instance fleets configuration.
- KerberosAttributes
-
- Type: KerberosAttributes structure
Attributes for Kerberos configuration when Kerberos authentication is enabled using a security configuration. For more information, see Use Kerberos Authentication in the Amazon EMR Management Guide.
- LogEncryptionKmsKeyId
-
- Type: string
The KMS key used for encrypting log files. This attribute is only available with Amazon EMR 5.30.0 and later, excluding Amazon EMR 6.0.0.
- LogUri
-
- Type: string
The path to the Amazon S3 location where logs for this cluster are stored.
- MasterPublicDnsName
-
- Type: string
The DNS name of the master node. If the cluster is on a private subnet, this is the private DNS name. On a public subnet, this is the public DNS name.
- Name
-
- Type: string
The name of the cluster. This parameter can't contain the characters <, >, $, |, or ` (backtick).
- NormalizedInstanceHours
-
- Type: int
An approximation of the cost of the cluster, represented in m1.small/hours. This value is incremented one time for every hour an m1.small instance runs. Larger instances are weighted more, so an Amazon EC2 instance that is roughly four times more expensive would result in the normalized instance hours being incremented by four. This result is only an approximation and does not reflect the actual billing rate.
- OSReleaseLabel
-
- Type: string
The Amazon Linux release specified in a cluster launch RunJobFlow request. If no Amazon Linux release was specified, the default Amazon Linux release is shown in the response.
- OutpostArn
-
- Type: string
The Amazon Resource Name (ARN) of the Outpost where the cluster is launched.
- PlacementGroups
-
- Type: Array of PlacementGroupConfig structures
Placement group configured for an Amazon EMR cluster.
- ReleaseLabel
-
- Type: string
The Amazon EMR release label, which determines the version of open-source application packages installed on the cluster. Release labels are in the form emr-x.x.x, where x.x.x is an Amazon EMR release version such as emr-5.14.0. For more information about Amazon EMR release versions and included application versions and features, see https://docs.aws.amazon.com/emr/latest/ReleaseGuide/. The release label applies only to Amazon EMR releases version 4.0 and later. Earlier versions use AmiVersion.
- RepoUpgradeOnBoot
-
- Type: string
Applies only when CustomAmiId is used. Specifies the type of updates that the Amazon Linux AMI package repositories apply when an instance boots using the AMI.
- RequestedAmiVersion
-
- Type: string
The AMI version requested for this cluster.
- RunningAmiVersion
-
- Type: string
The AMI version running on this cluster.
- ScaleDownBehavior
-
- Type: string
The way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized. TERMINATE_AT_INSTANCE_HOUR indicates that Amazon EMR terminates nodes at the instance-hour boundary, regardless of when the request to terminate the instance was submitted. This option is only available with Amazon EMR 5.1.0 and later and is the default for clusters created using that version. TERMINATE_AT_TASK_COMPLETION indicates that Amazon EMR adds nodes to a deny list and drains tasks from nodes before terminating the Amazon EC2 instances, regardless of the instance-hour boundary. With either behavior, Amazon EMR removes the least active nodes first and blocks instance termination if it could lead to HDFS corruption. TERMINATE_AT_TASK_COMPLETION is available only in Amazon EMR releases 4.1.0 and later, and is the default for versions of Amazon EMR earlier than 5.1.0.
- SecurityConfiguration
-
- Type: string
The name of the security configuration applied to the cluster.
- ServiceRole
-
- Type: string
The IAM role that Amazon EMR assumes in order to access Amazon Web Services resources on your behalf.
- Status
-
- Type: ClusterStatus structure
The current status details about the cluster.
- StepConcurrencyLevel
-
- Type: int
Specifies the number of steps that can be executed concurrently.
- Tags
-
- Type: Array of Tag structures
A list of tags associated with a cluster.
- TerminationProtected
-
- Type: boolean
Indicates whether Amazon EMR will lock the cluster to prevent the Amazon EC2 instances from being terminated by an API call or user intervention, or in the event of a cluster error.
- UnhealthyNodeReplacement
-
- Type: boolean
Indicates whether Amazon EMR should gracefully replace Amazon EC2 core instances that have degraded within the cluster.
- VisibleToAllUsers
-
- Type: boolean
Indicates whether the cluster is visible to IAM principals in the Amazon Web Services account associated with the cluster. When true, IAM principals in the Amazon Web Services account can perform Amazon EMR cluster actions on the cluster that their IAM policies allow. When false, only the IAM principal that created the cluster and the Amazon Web Services account root user can perform Amazon EMR actions, regardless of IAM permissions policies attached to other IAM principals.
The default value is true if a value is not provided when creating a cluster using the Amazon EMR API RunJobFlow command, the CLI create-cluster command, or the Amazon Web Services Management Console.
ClusterStateChangeReason
Description
The reason that the cluster changed to its current state.
Members
- Code
-
- Type: string
The programmatic code for the state change reason.
- Message
-
- Type: string
The descriptive message for the state change reason.
ClusterStatus
Description
The detailed status of the cluster.
Members
- ErrorDetails
-
- Type: Array of ErrorDetail structures
A list of tuples that provides information about the errors that caused a cluster to terminate. This structure can contain up to 10 different ErrorDetail tuples.
- State
-
- Type: string
The current state of the cluster.
- StateChangeReason
-
- Type: ClusterStateChangeReason structure
The reason for the cluster status change.
- Timeline
-
- Type: ClusterTimeline structure
A timeline that represents the status of a cluster over the lifetime of the cluster.
ClusterSummary
Description
The summary description of the cluster.
Members
- ClusterArn
-
- Type: string
The Amazon Resource Name of the cluster.
- Id
-
- Type: string
The unique identifier for the cluster.
- Name
-
- Type: string
The name of the cluster.
- NormalizedInstanceHours
-
- Type: int
An approximation of the cost of the cluster, represented in m1.small/hours. This value is incremented one time for every hour an m1.small instance runs. Larger instances are weighted more, so an Amazon EC2 instance that is roughly four times more expensive would result in the normalized instance hours being incremented by four. This result is only an approximation and does not reflect the actual billing rate.
- OutpostArn
-
- Type: string
The Amazon Resource Name (ARN) of the Outpost where the cluster is launched.
- Status
-
- Type: ClusterStatus structure
The details about the current status of the cluster.
ClusterTimeline
Description
Represents the timeline of the cluster's lifecycle.
Members
- CreationDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The creation date and time of the cluster.
- EndDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time when the cluster was terminated.
- ReadyDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time when the cluster was ready to run steps.
Command
Description
An entity describing an executable that runs on a cluster.
Members
- Args
-
- Type: Array of strings
Arguments for Amazon EMR to pass to the command for execution.
- Name
-
- Type: string
The name of the command.
- ScriptPath
-
- Type: string
The Amazon S3 location of the command script.
ComputeLimits
Description
The Amazon EC2 unit limits for a managed scaling policy. The managed scaling activity of a cluster cannot go above or below these limits. The limits apply only to the core and task nodes. The master node cannot be scaled after initial configuration.
Members
- MaximumCapacityUnits
-
- Required: Yes
- Type: int
The upper boundary of Amazon EC2 units. It is measured through vCPU cores or instances for instance groups and measured through units for instance fleets. Managed scaling activities are not allowed beyond this boundary. The limit only applies to the core and task nodes. The master node cannot be scaled after initial configuration.
- MaximumCoreCapacityUnits
-
- Type: int
The upper boundary of Amazon EC2 units for core node type in a cluster. It is measured through vCPU cores or instances for instance groups and measured through units for instance fleets. The core units are not allowed to scale beyond this boundary. The parameter is used to split capacity allocation between core and task nodes.
- MaximumOnDemandCapacityUnits
-
- Type: int
The upper boundary of On-Demand Amazon EC2 units. It is measured through vCPU cores or instances for instance groups and measured through units for instance fleets. The On-Demand units are not allowed to scale beyond this boundary. The parameter is used to split capacity allocation between On-Demand and Spot Instances.
- MinimumCapacityUnits
-
- Required: Yes
- Type: int
The lower boundary of Amazon EC2 units. It is measured through vCPU cores or instances for instance groups and measured through units for instance fleets. Managed scaling activities are not allowed beyond this boundary. The limit only applies to the core and task nodes. The master node cannot be scaled after initial configuration.
- UnitType
-
- Required: Yes
- Type: string
The unit type used for specifying a managed scaling policy.
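A ComputeLimits array can be sanity-checked before it is used in a managed scaling policy. The sketch below is illustrative only: checkComputeLimits is a hypothetical helper, the UnitType value Instances is an assumed enum value, and the comparisons between the boundaries are this sketch's own consistency checks rather than rules stated in the member descriptions.

```php
<?php
// Hypothetical consistency check for a ComputeLimits array.
// Returns a list of problems; an empty list means no issue was found.
function checkComputeLimits(array $limits): array
{
    $problems = [];
    foreach (['UnitType', 'MinimumCapacityUnits', 'MaximumCapacityUnits'] as $required) {
        if (!isset($limits[$required])) {
            $problems[] = "$required is required";
        }
    }
    if ($problems !== []) {
        return $problems;
    }
    if ($limits['MinimumCapacityUnits'] > $limits['MaximumCapacityUnits']) {
        $problems[] = 'MinimumCapacityUnits exceeds MaximumCapacityUnits';
    }
    // The optional boundaries split capacity inside the overall maximum.
    foreach (['MaximumCoreCapacityUnits', 'MaximumOnDemandCapacityUnits'] as $optional) {
        if (isset($limits[$optional]) && $limits[$optional] > $limits['MaximumCapacityUnits']) {
            $problems[] = "$optional exceeds MaximumCapacityUnits";
        }
    }
    return $problems;
}

$limits = [
    'UnitType'                     => 'Instances', // assumed enum value
    'MinimumCapacityUnits'         => 2,
    'MaximumCapacityUnits'         => 10,
    'MaximumCoreCapacityUnits'     => 4,
    'MaximumOnDemandCapacityUnits' => 6,
];

$problems = checkComputeLimits($limits);
```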
Configuration
Description
Amazon EMR releases 4.x or later.
An optional configuration specification to be used when provisioning cluster instances, which can include configurations for applications and software bundled with Amazon EMR. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file. For more information, see Configuring Applications.
Members
- Classification
-
- Type: string
The classification within a configuration.
- Configurations
-
- Type: Array of Configuration structures
A list of additional configurations to apply within a configuration object.
- Properties
-
- Type: Associative array of custom strings keys (String) to strings
A set of properties specified within a configuration classification.
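The relationship between a classification, its properties, and nested configurations is easier to see in a concrete array. The example below is a sketch: the classification names hadoop-env and export and the property shown are assumed sample values, and listClassifications is a hypothetical helper that walks the nesting.

```php
<?php
// Sample Configuration structure; classification names and the property
// are assumed values chosen for illustration.
$configuration = [
    'Classification' => 'hadoop-env',
    'Configurations' => [
        [
            'Classification' => 'export',
            'Properties'     => ['HADOOP_DATANODE_HEAPSIZE' => '2048'],
        ],
    ],
];

// Hypothetical helper: collects every classification, including nested
// ones, as a path from the outermost configuration.
function listClassifications(array $configuration, string $prefix = ''): array
{
    $path = $prefix . ($configuration['Classification'] ?? '(unnamed)');
    $paths = [$path];
    foreach ($configuration['Configurations'] ?? [] as $nested) {
        $paths = array_merge($paths, listClassifications($nested, $path . ' > '));
    }
    return $paths;
}

$paths = listClassifications($configuration);
```

An array like $configuration would be one element of a Configurations list, such as the Configurations member of the Cluster structure.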
Credentials
Description
The credentials that you can use to connect to cluster endpoints. Credentials consist of a username and a password.
Members
- UsernamePassword
-
- Type: UsernamePassword structure
The username and password that you use to connect to cluster endpoints.
EbsBlockDevice
Description
Configuration of requested EBS block device associated with the instance group.
Members
- Device
-
- Type: string
The device name that is exposed to the instance, such as /dev/sdh.
- VolumeSpecification
-
- Type: VolumeSpecification structure
EBS volume specifications such as volume type, IOPS, size (GiB) and throughput (MiB/s) that are requested for the EBS volume attached to an Amazon EC2 instance in the cluster.
EbsBlockDeviceConfig
Description
Configuration of requested EBS block device associated with the instance group with count of volumes that are associated to every instance.
Members
- VolumeSpecification
-
- Required: Yes
- Type: VolumeSpecification structure
EBS volume specifications such as volume type, IOPS, size (GiB) and throughput (MiB/s) that are requested for the EBS volume attached to an Amazon EC2 instance in the cluster.
- VolumesPerInstance
-
- Type: int
Number of EBS volumes with a specific volume configuration that are associated with every instance in the instance group.
EbsConfiguration
Description
The Amazon EBS configuration of a cluster instance.
Members
- EbsBlockDeviceConfigs
-
- Type: Array of EbsBlockDeviceConfig structures
An array of Amazon EBS volume specifications attached to a cluster instance.
- EbsOptimized
-
- Type: boolean
Indicates whether an Amazon EBS volume is EBS-optimized.
EbsVolume
Description
EBS block device that's attached to an Amazon EC2 instance.
Members
- Device
-
- Type: string
The device name that is exposed to the instance, such as /dev/sdh.
- VolumeId
-
- Type: string
The volume identifier of the EBS volume.
Ec2InstanceAttributes
Description
Provides information about the Amazon EC2 instances in a cluster grouped by category. For example, key name, subnet ID, IAM instance profile, and so on.
Members
- AdditionalMasterSecurityGroups
-
- Type: Array of strings
A list of additional Amazon EC2 security group IDs for the master node.
- AdditionalSlaveSecurityGroups
-
- Type: Array of strings
A list of additional Amazon EC2 security group IDs for the core and task nodes.
- Ec2AvailabilityZone
-
- Type: string
The Availability Zone in which the cluster will run.
- Ec2KeyName
-
- Type: string
The name of the Amazon EC2 key pair to use when connecting with SSH into the master node as a user named "hadoop".
- Ec2SubnetId
-
- Type: string
Set this parameter to the identifier of the Amazon VPC subnet where you want the cluster to launch. If you do not specify this value, and your account supports EC2-Classic, the cluster launches in EC2-Classic.
- EmrManagedMasterSecurityGroup
-
- Type: string
The identifier of the Amazon EC2 security group for the master node.
- EmrManagedSlaveSecurityGroup
-
- Type: string
The identifier of the Amazon EC2 security group for the core and task nodes.
- IamInstanceProfile
-
- Type: string
The IAM role that was specified when the cluster was launched. The Amazon EC2 instances of the cluster assume this role.
- RequestedEc2AvailabilityZones
-
- Type: Array of strings
Applies to clusters configured with the instance fleets option. Specifies one or more Availability Zones in which to launch Amazon EC2 cluster instances when the EC2-Classic network configuration is supported. Amazon EMR chooses the Availability Zone with the best fit from among the list of RequestedEc2AvailabilityZones, and then launches all cluster instances within that Availability Zone. If you do not specify this value, Amazon EMR chooses the Availability Zone for you. RequestedEc2SubnetIds and RequestedEc2AvailabilityZones cannot be specified together.
- RequestedEc2SubnetIds
-
- Type: Array of strings
Applies to clusters configured with the instance fleets option. Specifies the unique identifier of one or more Amazon EC2 subnets in which to launch Amazon EC2 cluster instances. Subnets must exist within the same VPC. Amazon EMR chooses the Amazon EC2 subnet with the best fit from among the list of RequestedEc2SubnetIds, and then launches all cluster instances within that subnet. If this value is not specified, and the account and Region support EC2-Classic networks, the cluster launches instances in the EC2-Classic network and uses RequestedEc2AvailabilityZones instead of this setting. If EC2-Classic is not supported, and no subnet is specified, Amazon EMR chooses the subnet for you. RequestedEc2SubnetIds and RequestedEc2AvailabilityZones cannot be specified together.
- ServiceAccessSecurityGroup
-
- Type: string
The identifier of the Amazon EC2 security group for the Amazon EMR service to access clusters in VPC private subnets.
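As a rough illustration of how the two Requested* lists relate, the sketch below reads an Ec2InstanceAttributes-shaped array and reports which placement input it carries. The function name describeRequestedPlacement and the sample subnet IDs are hypothetical; only the member names come from the structure above.

```php
<?php
// Hypothetical helper (not part of the SDK): summarizes the placement
// input found in an Ec2InstanceAttributes-shaped array.
function describeRequestedPlacement(array $attributes): string
{
    $subnets = $attributes['RequestedEc2SubnetIds'] ?? [];
    $zones = $attributes['RequestedEc2AvailabilityZones'] ?? [];

    if ($subnets !== [] && $zones !== []) {
        // The two lists cannot be specified together.
        throw new InvalidArgumentException(
            'RequestedEc2SubnetIds and RequestedEc2AvailabilityZones are mutually exclusive.'
        );
    }
    if ($subnets !== []) {
        return 'subnets: ' . implode(', ', $subnets);
    }
    if ($zones !== []) {
        return 'zones: ' . implode(', ', $zones);
    }
    // Neither list was given, so Amazon EMR chooses for you.
    return 'chosen by Amazon EMR';
}

// Sample data; the subnet IDs are placeholders.
$summary = describeRequestedPlacement([
    'RequestedEc2SubnetIds' => ['subnet-aaaa1111', 'subnet-bbbb2222'],
]);
```

Amazon EMR then picks the best fit from whichever list was supplied.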
ErrorDetail
Description
A tuple that provides information about an error that caused a cluster to terminate.
Members
- ErrorCode
-
- Type: string
The name or code associated with the error.
- ErrorData
-
- Type: Array of strings
A list of key-value pairs that provides contextual information about why an error occurred.
- ErrorMessage
-
- Type: string
A message that describes the error.
ExecutionEngineConfig
Description
Specifies the execution engine (cluster) to run the notebook and perform the notebook execution, for example, an Amazon EMR cluster.
Members
- ExecutionRoleArn
-
- Type: string
The execution role ARN required for the notebook execution.
- Id
-
- Required: Yes
- Type: string
The unique identifier of the execution engine. For an Amazon EMR cluster, this is the cluster ID.
- MasterInstanceSecurityGroupId
-
- Type: string
An optional unique ID of an Amazon EC2 security group to associate with the master instance of the Amazon EMR cluster for this notebook execution. For more information, see Specifying Amazon EC2 Security Groups for Amazon EMR Notebooks in the EMR Management Guide.
- Type
-
- Type: string
The type of execution engine. A value of EMR specifies an Amazon EMR cluster.
FailureDetails
Description
The details of the step failure. The service attempts to detect the root cause for many common failures.
Members
- LogFile
-
- Type: string
The path to the log file where the step failure root cause was originally recorded.
- Message
-
- Type: string
The descriptive message including the error the Amazon EMR service has identified as the cause of step failure. This is text from an error log that describes the root cause of the failure.
- Reason
-
- Type: string
The reason for the step failure. In the case where the service cannot successfully determine the root cause of the failure, it returns "Unknown Error" as a reason.
HadoopJarStepConfig
Description
A job flow step consisting of a JAR file whose main function will be executed. The main function submits a job for Hadoop to execute and waits for the job to finish or fail.
Members
- Args
-
- Type: Array of strings
A list of command line arguments passed to the JAR file's main function when executed.
- Jar
-
- Required: Yes
- Type: string
A path to a JAR file run during the step.
- MainClass
-
- Type: string
The name of the main class in the specified Java file. If not specified, the JAR file should specify a Main-Class in its manifest file.
- Properties
-
- Type: Array of KeyValue structures
A list of Java properties that are set when the step runs. You can use these properties to pass key-value pairs to your main function.
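The members above can be assembled into a plain PHP array, as in the following minimal sketch. The builder name buildHadoopJarStep, the S3 path, and the class name are placeholders, and the 'Key' and 'Value' member names of the KeyValue structure are an assumption, since that structure is not described in this section.

```php
<?php
// Hypothetical builder (not part of the SDK) for a HadoopJarStepConfig-shaped array.
function buildHadoopJarStep(
    string $jar,
    array $args = [],
    ?string $mainClass = null,
    array $properties = []
): array {
    // Jar is the only required member.
    $step = ['Jar' => $jar];

    if ($args !== []) {
        $step['Args'] = array_values(array_map('strval', $args));
    }
    if ($mainClass !== null) {
        // When omitted, the JAR manifest should name a Main-Class instead.
        $step['MainClass'] = $mainClass;
    }
    foreach ($properties as $key => $value) {
        // Assumption: a KeyValue structure has 'Key' and 'Value' members.
        $step['Properties'][] = ['Key' => (string) $key, 'Value' => (string) $value];
    }

    return $step;
}

// Placeholder path and class name, for illustration only.
$step = buildHadoopJarStep(
    's3://amzn-s3-demo-bucket/jars/wordcount.jar',
    ['--input', 's3://amzn-s3-demo-bucket/in'],
    'com.example.WordCount',
    ['mapreduce.job.reduces' => 4]
);
```

Optional members are left out of the array entirely rather than sent as empty values.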
HadoopStepConfig
Description
A cluster step consisting of a JAR file whose main function will be executed. The main function submits a job for Hadoop to execute and waits for the job to finish or fail.
Members
- Args
-
- Type: Array of strings
The list of command line arguments to pass to the JAR file's main function for execution.
- Jar
-
- Type: string
The path to the JAR file that runs during the step.
- MainClass
-
- Type: string
The name of the main class in the specified Java file. If not specified, the JAR file should specify a main class in its manifest file.
- Properties
-
- Type: Associative array of custom strings keys (String) to strings
The list of Java properties that are set when the step runs. You can use these properties to pass key-value pairs to your main function.
Instance
Description
Represents an Amazon EC2 instance provisioned as part of a cluster.
Members
- EbsVolumes
-
- Type: Array of EbsVolume structures
The list of Amazon EBS volumes that are attached to this instance.
- Ec2InstanceId
-
- Type: string
The unique identifier of the instance in Amazon EC2.
- Id
-
- Type: string
The unique identifier for the instance in Amazon EMR.
- InstanceFleetId
-
- Type: string
The unique identifier of the instance fleet to which an Amazon EC2 instance belongs.
- InstanceGroupId
-
- Type: string
The identifier of the instance group to which this instance belongs.
- InstanceType
-
- Type: string
The Amazon EC2 instance type, for example m3.xlarge.
- Market
-
- Type: string
The instance purchasing option. Valid values are ON_DEMAND or SPOT.
- PrivateDnsName
-
- Type: string
The private DNS name of the instance.
- PrivateIpAddress
-
- Type: string
The private IP address of the instance.
- PublicDnsName
-
- Type: string
The public DNS name of the instance.
- PublicIpAddress
-
- Type: string
The public IP address of the instance.
- Status
-
- Type: InstanceStatus structure
The current status of the instance.
InstanceFleet
Description
Describes an instance fleet, which is a group of Amazon EC2 instances that host a particular node type (master, core, or task) in an Amazon EMR cluster. Instance fleets can consist of a mix of instance types and On-Demand and Spot Instances, which are provisioned to meet a defined target capacity.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.
Members
- Context
-
- Type: string
Reserved.
- Id
-
- Type: string
The unique identifier of the instance fleet.
- InstanceFleetType
-
- Type: string
The node type that the instance fleet hosts. Valid values are MASTER, CORE, or TASK.
- InstanceTypeSpecifications
-
- Type: Array of InstanceTypeSpecification structures
An array of specifications for the instance types that comprise an instance fleet.
- LaunchSpecifications
-
- Type: InstanceFleetProvisioningSpecifications structure
Describes the launch specification for an instance fleet.
- Name
-
- Type: string
A friendly name for the instance fleet.
- ProvisionedOnDemandCapacity
-
- Type: int
The number of On-Demand units that have been provisioned for the instance fleet to fulfill TargetOnDemandCapacity. This provisioned capacity might be less than or greater than TargetOnDemandCapacity.
- ProvisionedSpotCapacity
-
- Type: int
The number of Spot units that have been provisioned for this instance fleet to fulfill TargetSpotCapacity. This provisioned capacity might be less than or greater than TargetSpotCapacity.
- ResizeSpecifications
-
- Type: InstanceFleetResizingSpecifications structure
The resize specification for the instance fleet.
- Status
-
- Type: InstanceFleetStatus structure
The current status of the instance fleet.
- TargetOnDemandCapacity
-
- Type: int
The target capacity of On-Demand units for the instance fleet, which determines how many On-Demand Instances to provision. When the instance fleet launches, Amazon EMR tries to provision On-Demand Instances as specified by InstanceTypeConfig. Each instance configuration has a specified WeightedCapacity. When an On-Demand Instance is provisioned, the WeightedCapacity units count toward the target capacity. Amazon EMR provisions instances until the target capacity is totally fulfilled, even if this results in an overage. For example, if there are 2 units remaining to fulfill capacity, and Amazon EMR can only provision an instance with a WeightedCapacity of 5 units, the instance is provisioned, and the target capacity is exceeded by 3 units. You can use InstanceFleet$ProvisionedOnDemandCapacity to determine the On-Demand capacity units that have been provisioned for the instance fleet.
If not specified or set to 0, only Spot Instances are provisioned for the instance fleet using TargetSpotCapacity. At least one of TargetSpotCapacity and TargetOnDemandCapacity should be greater than 0. For a master instance fleet, only one of TargetSpotCapacity and TargetOnDemandCapacity can be specified, and its value must be 1.
- TargetSpotCapacity
-
- Type: int
The target capacity of Spot units for the instance fleet, which determines how many Spot Instances to provision. When the instance fleet launches, Amazon EMR tries to provision Spot Instances as specified by InstanceTypeConfig. Each instance configuration has a specified WeightedCapacity. When a Spot Instance is provisioned, the WeightedCapacity units count toward the target capacity. Amazon EMR provisions instances until the target capacity is totally fulfilled, even if this results in an overage. For example, if there are 2 units remaining to fulfill capacity, and Amazon EMR can only provision an instance with a WeightedCapacity of 5 units, the instance is provisioned, and the target capacity is exceeded by 3 units. You can use InstanceFleet$ProvisionedSpotCapacity to determine the Spot capacity units that have been provisioned for the instance fleet.
If not specified or set to 0, only On-Demand Instances are provisioned for the instance fleet. At least one of TargetSpotCapacity and TargetOnDemandCapacity should be greater than 0. For a master instance fleet, only one of TargetSpotCapacity and TargetOnDemandCapacity can be specified, and its value must be 1.
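The overage rule described for the target capacities can be reproduced with a few lines of arithmetic. The following is a simplified sketch that assumes a fleet with a single instance type; the function name simulateFleetProvisioning is hypothetical, and the sketch does not model how Amazon EMR actually selects instance types.

```php
<?php
// Hypothetical simulation (not SDK code) of the documented overage rule,
// simplified to a fleet with a single instance type.
function simulateFleetProvisioning(int $targetCapacity, int $weightedCapacity): array
{
    if ($targetCapacity < 0 || $weightedCapacity < 1) {
        throw new InvalidArgumentException('Expected target >= 0 and weight >= 1.');
    }

    $instances = 0;
    $provisioned = 0;
    // Keep adding instances until the target is totally fulfilled.
    while ($provisioned < $targetCapacity) {
        $provisioned += $weightedCapacity;
        $instances++;
    }

    return [
        'instances' => $instances,
        'provisioned' => $provisioned,
        'overage' => $provisioned - $targetCapacity,
    ];
}

// The example from the text: 2 units remain and each instance provides 5.
$result = simulateFleetProvisioning(2, 5);
// $result is ['instances' => 1, 'provisioned' => 5, 'overage' => 3]
```

The 'provisioned' value plays the role that ProvisionedOnDemandCapacity or ProvisionedSpotCapacity plays in the structure above, which is why those members can exceed their targets.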
InstanceFleetConfig
Description
The configuration that defines an instance fleet.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.
Members
- Context
-
- Type: string
Reserved.
- InstanceFleetType
-
- Required: Yes
- Type: string
The node type that the instance fleet hosts. Valid values are MASTER, CORE, and TASK.
- InstanceTypeConfigs
-
- Type: Array of InstanceTypeConfig structures
The instance type configurations that define the Amazon EC2 instances in the instance fleet.
- LaunchSpecifications
-
- Type: InstanceFleetProvisioningSpecifications structure
The launch specification for the instance fleet.
- Name
-
- Type: string
The friendly name of the instance fleet.
- ResizeSpecifications
-
- Type: InstanceFleetResizingSpecifications structure
The resize specification for the instance fleet.
- TargetOnDemandCapacity
-
- Type: int
The target capacity of On-Demand units for the instance fleet, which determines how many On-Demand Instances to provision. When the instance fleet launches, Amazon EMR tries to provision On-Demand Instances as specified by InstanceTypeConfig. Each instance configuration has a specified WeightedCapacity. When an On-Demand Instance is provisioned, the WeightedCapacity units count toward the target capacity. Amazon EMR provisions instances until the target capacity is totally fulfilled, even if this results in an overage. For example, if there are 2 units remaining to fulfill capacity, and Amazon EMR can only provision an instance with a WeightedCapacity of 5 units, the instance is provisioned, and the target capacity is exceeded by 3 units.
If not specified or set to 0, only Spot Instances are provisioned for the instance fleet using TargetSpotCapacity. At least one of TargetSpotCapacity and TargetOnDemandCapacity should be greater than 0. For a master instance fleet, only one of TargetSpotCapacity and TargetOnDemandCapacity can be specified, and its value must be 1.
- TargetSpotCapacity
-
- Type: int
The target capacity of Spot units for the instance fleet, which determines how many Spot Instances to provision. When the instance fleet launches, Amazon EMR tries to provision Spot Instances as specified by InstanceTypeConfig. Each instance configuration has a specified WeightedCapacity. When a Spot Instance is provisioned, the WeightedCapacity units count toward the target capacity. Amazon EMR provisions instances until the target capacity is totally fulfilled, even if this results in an overage. For example, if there are 2 units remaining to fulfill capacity, and Amazon EMR can only provision an instance with a WeightedCapacity of 5 units, the instance is provisioned, and the target capacity is exceeded by 3 units.
If not specified or set to 0, only On-Demand Instances are provisioned for the instance fleet. At least one of TargetSpotCapacity and TargetOnDemandCapacity should be greater than 0. For a master instance fleet, only one of TargetSpotCapacity and TargetOnDemandCapacity can be specified, and its value must be 1.
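The constraints on the two target capacities can be checked locally before a request is sent. The sketch below is one possible pre-flight check, not SDK behavior: the function name validateFleetTargets is hypothetical, and it assumes that a capacity that is missing or set to 0 counts as "not specified".

```php
<?php
// Hypothetical pre-flight check (not SDK code) for an InstanceFleetConfig-shaped array.
// Assumption: a capacity that is missing or 0 counts as "not specified".
function validateFleetTargets(array $fleet): array
{
    $errors = [];
    $type = $fleet['InstanceFleetType'] ?? null;
    $spot = (int) ($fleet['TargetSpotCapacity'] ?? 0);
    $onDemand = (int) ($fleet['TargetOnDemandCapacity'] ?? 0);

    if (!in_array($type, ['MASTER', 'CORE', 'TASK'], true)) {
        $errors[] = 'InstanceFleetType must be MASTER, CORE, or TASK.';
    }
    if ($spot <= 0 && $onDemand <= 0) {
        $errors[] = 'At least one target capacity should be greater than 0.';
    }
    if ($type === 'MASTER') {
        if ($spot > 0 && $onDemand > 0) {
            $errors[] = 'A master fleet can specify only one target capacity.';
        } elseif (max($spot, $onDemand) > 1) {
            $errors[] = 'A master fleet target capacity must be 1.';
        }
    }

    return $errors;
}

// An empty array means no problems were found.
$errors = validateFleetTargets([
    'InstanceFleetType' => 'MASTER',
    'TargetOnDemandCapacity' => 1,
]);
```

Returning a list of messages rather than throwing lets a caller report every problem in one pass.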
InstanceFleetModifyConfig
Description
Configuration parameters for an instance fleet modification request.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.
Members
- Context
-
- Type: string
Reserved.
- InstanceFleetId
-
- Required: Yes
- Type: string
A unique identifier for the instance fleet.
- InstanceTypeConfigs
-
- Type: Array of InstanceTypeConfig structures
An array of InstanceTypeConfig objects that specify how Amazon EMR provisions Amazon EC2 instances when it fulfills On-Demand and Spot capacities. For more information, see InstanceTypeConfig.
- ResizeSpecifications
-
- Type: InstanceFleetResizingSpecifications structure
The resize specification for the instance fleet.
- TargetOnDemandCapacity
-
- Type: int
The target capacity of On-Demand units for the instance fleet. For more information, see InstanceFleetConfig$TargetOnDemandCapacity.
- TargetSpotCapacity
-
- Type: int
The target capacity of Spot units for the instance fleet. For more information, see InstanceFleetConfig$TargetSpotCapacity.
InstanceFleetProvisioningSpecifications
Description
The launch specification for On-Demand and Spot Instances in the fleet.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions. On-Demand and Spot instance allocation strategies are available in Amazon EMR releases 5.12.1 and later.
Members
- OnDemandSpecification
-
- Type: OnDemandProvisioningSpecification structure
The launch specification for On-Demand Instances in the instance fleet, which determines the allocation strategy and capacity reservation options.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions. On-Demand Instances allocation strategy is available in Amazon EMR releases 5.12.1 and later.
- SpotSpecification
-
- Type: SpotProvisioningSpecification structure
The launch specification for Spot instances in the fleet, which determines the allocation strategy, defined duration, and provisioning timeout behavior.
InstanceFleetResizingSpecifications
Description
The resize specification for On-Demand and Spot Instances in the fleet.
Members
- OnDemandResizeSpecification
-
- Type: OnDemandResizingSpecification structure
The resize specification for On-Demand Instances in the instance fleet, which contains the allocation strategy, capacity reservation options, and the resize timeout period.
- SpotResizeSpecification
-
- Type: SpotResizingSpecification structure
The resize specification for Spot Instances in the instance fleet, which contains the allocation strategy and the resize timeout period.
InstanceFleetStateChangeReason
Description
Provides status change reason details for the instance fleet.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.
Members
- Code
-
- Type: string
A code corresponding to the reason the state change occurred.
- Message
-
- Type: string
An explanatory message.
InstanceFleetStatus
Description
The status of the instance fleet.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.
Members
- State
-
- Type: string
A code representing the instance fleet status.
- PROVISIONING: The instance fleet is provisioning Amazon EC2 resources and is not yet ready to run jobs.
- BOOTSTRAPPING: Amazon EC2 instances and other resources have been provisioned and the bootstrap actions specified for the instances are underway.
- RUNNING: Amazon EC2 instances and other resources are running. They are either executing jobs or waiting to execute jobs.
- RESIZING: A resize operation is underway. Amazon EC2 instances are either being added or removed.
- SUSPENDED: A resize operation could not complete. Existing Amazon EC2 instances are running, but instances can't be added or removed.
- TERMINATING: The instance fleet is terminating Amazon EC2 instances.
- TERMINATED: The instance fleet is no longer active, and all Amazon EC2 instances have been terminated.
- StateChangeReason
-
- Type: InstanceFleetStateChangeReason structure
Provides status change reason details for the instance fleet.
- Timeline
-
- Type: InstanceFleetTimeline structure
Provides historical timestamps for the instance fleet, including the time of creation, the time it became ready to run jobs, and the time of termination.
InstanceFleetTimeline
Description
Provides historical timestamps for the instance fleet, including the time of creation, the time it became ready to run jobs, and the time of termination.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.
Members
- CreationDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The time and date the instance fleet was created.
- EndDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The time and date the instance fleet terminated.
- ReadyDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The time and date the instance fleet was ready to run jobs.
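Because a timestamp member may be a string or a DateTime object, code that compares two timeline entries usually normalizes them first. The following sketch shows one way to do that; the function names toUnixTime and secondsUntilReady are hypothetical, and the sample dates are made up.

```php
<?php
// Hypothetical helper (not SDK code): normalizes a timestamp member,
// which may be a DateTime object or a strtotime-parsable string.
function toUnixTime($value): ?int
{
    if ($value instanceof DateTimeInterface) {
        return $value->getTimestamp();
    }
    if (is_string($value)) {
        $parsed = strtotime($value);
        return $parsed === false ? null : $parsed;
    }
    return null; // Missing or unsupported value.
}

// Seconds between creation and readiness, or null if either is unknown.
function secondsUntilReady(array $timeline): ?int
{
    $created = toUnixTime($timeline['CreationDateTime'] ?? null);
    $ready = toUnixTime($timeline['ReadyDateTime'] ?? null);

    return ($created === null || $ready === null) ? null : $ready - $created;
}

// Made-up sample values mixing both accepted forms.
$seconds = secondsUntilReady([
    'CreationDateTime' => '2024-01-01T00:00:00Z',
    'ReadyDateTime' => new DateTimeImmutable('2024-01-01T00:05:30Z'),
]);
// $seconds is 330
```

A fleet that never became ready has no ReadyDateTime, so the helper returns null instead of a misleading duration.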
InstanceGroup
Description
This entity represents an instance group, which is a group of instances that have a common purpose. For example, the CORE instance group is used for HDFS.
Members
- AutoScalingPolicy
-
- Type: AutoScalingPolicyDescription structure
An automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster. The automatic scaling policy defines how an instance group dynamically adds and terminates Amazon EC2 instances in response to the value of a CloudWatch metric. See PutAutoScalingPolicy.
- BidPrice
-
- Type: string
If specified, indicates that the instance group uses Spot Instances. This is the maximum price you are willing to pay for Spot Instances. Specify OnDemandPrice to set the amount equal to the On-Demand price, or specify an amount in USD.
- Configurations
-
- Type: Array of Configuration structures
Amazon EMR releases 4.x or later.
The list of configurations supplied for an Amazon EMR cluster instance group. You can specify a separate configuration for each instance group (master, core, and task).
- ConfigurationsVersion
-
- Type: long (int|float)
The version number of the requested configuration specification for this instance group.
- CustomAmiId
-
- Type: string
The custom AMI ID to use for the provisioned instance group.
- EbsBlockDevices
-
- Type: Array of EbsBlockDevice structures
The EBS block devices that are mapped to this instance group.
- EbsOptimized
-
- Type: boolean
Indicates whether the instance group is EBS-optimized. An Amazon EBS-optimized instance uses an optimized configuration stack and provides additional, dedicated capacity for Amazon EBS I/O.
- Id
-
- Type: string
The identifier of the instance group.
- InstanceGroupType
-
- Type: string
The type of the instance group. Valid values are MASTER, CORE or TASK.
- InstanceType
-
- Type: string
The Amazon EC2 instance type for all instances in the instance group.
- LastSuccessfullyAppliedConfigurations
-
- Type: Array of Configuration structures
A list of the configurations that were most recently applied successfully to the instance group.
- LastSuccessfullyAppliedConfigurationsVersion
-
- Type: long (int|float)
The version number of the configuration specification that was most recently applied successfully to the instance group.
- Market
-
- Type: string
The marketplace to provision instances for this group. Valid values are ON_DEMAND or SPOT.
- Name
-
- Type: string
The name of the instance group.
- RequestedInstanceCount
-
- Type: int
The target number of instances for the instance group.
- RunningInstanceCount
-
- Type: int
The number of instances currently running in this instance group.
- ShrinkPolicy
-
- Type: ShrinkPolicy structure
Policy for customizing shrink operations.
- Status
-
- Type: InstanceGroupStatus structure
The current status of the instance group.
InstanceGroupConfig
Description
Configuration defining a new instance group.
Members
- AutoScalingPolicy
-
- Type: AutoScalingPolicy structure
An automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster. The automatic scaling policy defines how an instance group dynamically adds and terminates Amazon EC2 instances in response to the value of a CloudWatch metric. See PutAutoScalingPolicy.
- BidPrice
-
- Type: string
If specified, indicates that the instance group uses Spot Instances. This is the maximum price you are willing to pay for Spot Instances. Specify OnDemandPrice to set the amount equal to the On-Demand price, or specify an amount in USD.
- Configurations
-
- Type: Array of Configuration structures
Amazon EMR releases 4.x or later.
The list of configurations supplied for an Amazon EMR cluster instance group. You can specify a separate configuration for each instance group (master, core, and task).
- CustomAmiId
-
- Type: string
The custom AMI ID to use for the provisioned instance group.
- EbsConfiguration
-
- Type: EbsConfiguration structure
EBS configurations that will be attached to each Amazon EC2 instance in the instance group.
- InstanceCount
-
- Required: Yes
- Type: int
Target number of instances for the instance group.
- InstanceRole
-
- Required: Yes
- Type: string
The role of the instance group in the cluster.
- InstanceType
-
- Required: Yes
- Type: string
The Amazon EC2 instance type for all instances in the instance group.
- Market
-
- Type: string
Market type of the Amazon EC2 instances used to create a cluster node.
- Name
-
- Type: string
Friendly name given to the instance group.
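A new instance group needs only its three required members, with BidPrice added when Spot Instances are wanted. The sketch below builds such an array. The builder name buildInstanceGroupConfig is hypothetical, and it assumes that InstanceRole and Market accept the same MASTER, CORE, TASK and ON_DEMAND, SPOT values that are listed for InstanceGroup.

```php
<?php
// Hypothetical builder (not SDK code) for an InstanceGroupConfig-shaped array.
function buildInstanceGroupConfig(
    string $instanceRole,
    string $instanceType,
    int $instanceCount,
    ?string $bidPrice = null,
    ?string $name = null
): array {
    // The three required members.
    $config = [
        'InstanceRole' => $instanceRole,
        'InstanceType' => $instanceType,
        'InstanceCount' => $instanceCount,
    ];

    if ($bidPrice !== null) {
        // A BidPrice indicates Spot Instances: 'OnDemandPrice' or a USD amount.
        // Assumption: Market uses the ON_DEMAND / SPOT values listed for InstanceGroup.
        $config['Market'] = 'SPOT';
        $config['BidPrice'] = $bidPrice;
    }
    if ($name !== null) {
        $config['Name'] = $name;
    }

    return $config;
}

// Assumption: 'CORE' is a valid InstanceRole, as it is for InstanceGroupType.
$group = buildInstanceGroupConfig('CORE', 'm3.xlarge', 2, 'OnDemandPrice', 'core-spot');
```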
InstanceGroupDetail
Description
Detailed information about an instance group.
Members
- BidPrice
-
- Type: string
If specified, indicates that the instance group uses Spot Instances. This is the maximum price you are willing to pay for Spot Instances. Specify OnDemandPrice to set the amount equal to the On-Demand price, or specify an amount in USD.
- CreationDateTime
-
- Required: Yes
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date/time the instance group was created.
- CustomAmiId
-
- Type: string
The custom AMI ID to use for the provisioned instance group.
- EndDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date/time the instance group was terminated.
- InstanceGroupId
-
- Type: string
Unique identifier for the instance group.
- InstanceRequestCount
-
- Required: Yes
- Type: int
Target number of instances to run in the instance group.
- InstanceRole
-
- Required: Yes
- Type: string
Instance group role in the cluster.
- InstanceRunningCount
-
- Required: Yes
- Type: int
Actual count of running instances.
- InstanceType
-
- Required: Yes
- Type: string
Amazon EC2 instance type.
- LastStateChangeReason
-
- Type: string
Details regarding the state of the instance group.
- Market
-
- Required: Yes
- Type: string
Market type of the Amazon EC2 instances used to create a cluster node.
- Name
-
- Type: string
Friendly name for the instance group.
- ReadyDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date/time the instance group was available to the cluster.
- StartDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date/time the instance group was started.
- State
-
- Required: Yes
- Type: string
State of instance group. The following values are no longer supported: STARTING, TERMINATED, and FAILED.
InstanceGroupModifyConfig
Description
Modify the size or configurations of an instance group.
Members
- Configurations
-
- Type: Array of Configuration structures
A list of new or modified configurations to apply for an instance group.
- EC2InstanceIdsToTerminate
-
- Type: Array of strings
The Amazon EC2 InstanceIds to terminate. After you terminate the instances, the instance group will not return to its original requested size.
- InstanceCount
-
- Type: int
Target size for the instance group.
- InstanceGroupId
-
- Required: Yes
- Type: string
Unique ID of the instance group to modify.
- ReconfigurationType
-
- Type: string
Type of reconfiguration requested. Valid values are MERGE and OVERWRITE.
- ShrinkPolicy
-
- Type: ShrinkPolicy structure
Policy for customizing shrink operations.
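A reconfiguration request pairs the required InstanceGroupId with a list of configurations and a ReconfigurationType. The sketch below builds that array and rejects any type other than the two documented values. The builder name, the instance group ID, and the 'Classification' and 'Properties' members of the sample Configuration entry are assumptions, since the Configuration structure is not described in this section.

```php
<?php
// Hypothetical builder (not SDK code) for an InstanceGroupModifyConfig-shaped array.
function buildInstanceGroupReconfiguration(
    string $instanceGroupId,
    array $configurations,
    string $reconfigurationType
): array {
    // Only the two documented values are accepted.
    if (!in_array($reconfigurationType, ['MERGE', 'OVERWRITE'], true)) {
        throw new InvalidArgumentException('ReconfigurationType must be MERGE or OVERWRITE.');
    }

    return [
        'InstanceGroupId' => $instanceGroupId,
        'Configurations' => array_values($configurations),
        'ReconfigurationType' => $reconfigurationType,
    ];
}

// Placeholder ID; the Configuration member names are assumed, not taken from this section.
$modify = buildInstanceGroupReconfiguration(
    'ig-EXAMPLE',
    [['Classification' => 'spark-defaults', 'Properties' => ['spark.executor.memory' => '4g']]],
    'MERGE'
);
```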
InstanceGroupStateChangeReason
Description
The status change reason details for the instance group.
Members
- Code
-
- Type: string
The programmable code for the state change reason.
- Message
-
- Type: string
The status change reason description.
InstanceGroupStatus
Description
The details of the instance group status.
Members
- State
-
- Type: string
The current state of the instance group.
- StateChangeReason
-
- Type: InstanceGroupStateChangeReason structure
The status change reason details for the instance group.
- Timeline
-
- Type: InstanceGroupTimeline structure
The timeline of the instance group status over time.
InstanceGroupTimeline
Description
The timeline of the instance group lifecycle.
Members
- CreationDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The creation date and time of the instance group.
- EndDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time when the instance group terminated.
- ReadyDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time when the instance group became ready to perform tasks.
InstanceResizePolicy
Description
Custom policy for requesting termination protection or termination of specific instances when shrinking an instance group.
Members
- InstanceTerminationTimeout
-
- Type: int
Decommissioning timeout override for the specific list of instances to be terminated.
- InstancesToProtect
-
- Type: Array of strings
Specific list of instances to be protected when shrinking an instance group.
- InstancesToTerminate
-
- Type: Array of strings
Specific list of instances to be terminated when shrinking an instance group.
InstanceStateChangeReason
Description
The details of the status change reason for the instance.
Members
- Code
-
- Type: string
The programmable code for the state change reason.
- Message
-
- Type: string
The status change reason description.
InstanceStatus
Description
The instance status details.
Members
- State
-
- Type: string
The current state of the instance.
- StateChangeReason
-
- Type: InstanceStateChangeReason structure
The details of the status change reason for the instance.
- Timeline
-
- Type: InstanceTimeline structure
The timeline of the instance status over time.
InstanceTimeline
Description
The timeline of the instance lifecycle.
Members
- CreationDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The creation date and time of the instance.
- EndDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time when the instance was terminated.
- ReadyDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time when the instance was ready to perform tasks.
InstanceTypeConfig
Description
An instance type configuration for each instance type in an instance fleet, which determines the Amazon EC2 instances Amazon EMR attempts to provision to fulfill On-Demand and Spot target capacities. When you use an allocation strategy, you can include a maximum of 30 instance type configurations for a fleet. For more information about how to use an allocation strategy, see Configure Instance Fleets. Without an allocation strategy, you may specify a maximum of five instance type configurations for a fleet.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.
Members
- BidPrice
-
- Type: string
The bid price for each Amazon EC2 Spot Instance type as defined by InstanceType. Expressed in USD. If neither BidPrice nor BidPriceAsPercentageOfOnDemandPrice is provided, BidPriceAsPercentageOfOnDemandPrice defaults to 100%.
- BidPriceAsPercentageOfOnDemandPrice
-
- Type: double
The bid price, as a percentage of On-Demand price, for each Amazon EC2 Spot Instance as defined by InstanceType. Expressed as a number (for example, 20 specifies 20%). If neither BidPrice nor BidPriceAsPercentageOfOnDemandPrice is provided, BidPriceAsPercentageOfOnDemandPrice defaults to 100%.
- Configurations
-
- Type: Array of Configuration structures
A configuration classification that applies when provisioning cluster instances, which can include configurations for applications and software that run on the cluster.
- CustomAmiId
-
- Type: string
The custom AMI ID to use for the instance type.
- EbsConfiguration
-
- Type: EbsConfiguration structure
The configuration of Amazon Elastic Block Store (Amazon EBS) attached to each instance as defined by InstanceType.
- InstanceType
-
- Required: Yes
- Type: string
An Amazon EC2 instance type, such as m3.xlarge.
- Priority
-
- Type: double
The priority at which Amazon EMR launches the Amazon EC2 instances with this instance type. Priority starts at 0, which is the highest priority. Amazon EMR considers the highest priority first.
- WeightedCapacity
-
- Type: int
The number of units that a provisioned instance of this type provides toward fulfilling the target capacities defined in InstanceFleetConfig. This value is 1 for a master instance fleet, and must be 1 or greater for core and task instance fleets. Defaults to 1 if not specified.
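The bid-price default and the limits on the number of configurations can be expressed as two small helpers. This is a sketch under stated assumptions: the function names are hypothetical, and preferring BidPrice when both bid members are present is a choice made here for illustration, because the text only defines the case where neither is provided.

```php
<?php
// Hypothetical helpers (not SDK code) for InstanceTypeConfig-shaped arrays.

// Returns the Spot bid that applies to one instance type configuration.
// Assumption: when both members are present, BidPrice is preferred.
function effectiveSpotBid(array $config): array
{
    if (isset($config['BidPrice'])) {
        return ['unit' => 'USD', 'value' => (string) $config['BidPrice']];
    }
    // Defaults to 100% of the On-Demand price when nothing is provided.
    $percent = $config['BidPriceAsPercentageOfOnDemandPrice'] ?? 100.0;

    return ['unit' => 'percent', 'value' => (float) $percent];
}

// Maximum number of instance type configurations allowed for a fleet:
// 30 with an allocation strategy, 5 without one.
function maxInstanceTypeConfigs(bool $usesAllocationStrategy): int
{
    return $usesAllocationStrategy ? 30 : 5;
}

// Only the required member is set, so the 100% default applies.
$bid = effectiveSpotBid(['InstanceType' => 'm3.xlarge']);
// $bid is ['unit' => 'percent', 'value' => 100.0]
```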
InstanceTypeSpecification
Description
The configuration specification for each instance type in an instance fleet.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.
Members
- BidPrice
-
- Type: string
The bid price for each Amazon EC2 Spot Instance type as defined by InstanceType. Expressed in USD.
- BidPriceAsPercentageOfOnDemandPrice
-
- Type: double
The bid price, as a percentage of On-Demand price, for each Amazon EC2 Spot Instance as defined by InstanceType. Expressed as a number (for example, 20 specifies 20%).
- Configurations
-
- Type: Array of Configuration structures
A configuration classification that applies when provisioning cluster instances, which can include configurations for applications and software bundled with Amazon EMR.
- CustomAmiId
-
- Type: string
The custom AMI ID to use for the instance type.
- EbsBlockDevices
-
- Type: Array of EbsBlockDevice structures
The configuration of Amazon Elastic Block Store (Amazon EBS) attached to each instance as defined by InstanceType.
- EbsOptimized
-
- Type: boolean
Evaluates to TRUE when the specified InstanceType is EBS-optimized.
- InstanceType
-
- Type: string
The Amazon EC2 instance type, for example m3.xlarge.
- Priority
-
- Type: double
The priority at which Amazon EMR launches the Amazon EC2 instances with this instance type. Priority starts at 0, which is the highest priority. Amazon EMR considers the highest priority first.
- WeightedCapacity
-
- Type: int
The number of units that a provisioned instance of this type provides toward fulfilling the target capacities defined in InstanceFleetConfig. Capacity values represent performance characteristics such as vCPUs, memory, or I/O. If not specified, the default value is 1.
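As a rough illustration of how WeightedCapacity relates to a fleet's target capacity, the sketch below estimates how many instances of a single type would be needed to reach a target. The function name is hypothetical and the round-up rule is an assumption made for illustration; the way Amazon EMR mixes instance types while provisioning is not modeled here.

```php
<?php
// Hypothetical helper: estimate how many instances of one type are needed
// so that their combined weighted capacity meets or exceeds the target.
function instancesNeeded(int $targetCapacity, int $weightedCapacity = 1): int
{
    if ($weightedCapacity < 1) {
        throw new InvalidArgumentException('WeightedCapacity must be 1 or greater.');
    }
    // Assumption: round up, because a partially used instance still has to be launched.
    return (int) ceil($targetCapacity / $weightedCapacity);
}

// A target of 10 units with instances that each provide 4 units.
echo instancesNeeded(10, 4), PHP_EOL; // prints 3 (3 instances provide 12 units)
```

With the default WeightedCapacity of 1, the result is simply the target capacity itself.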
InternalServerError
Description
Indicates that an error occurred while processing the request and that the request was not completed.
Members
InternalServerException
Description
This exception occurs when there is an internal failure in the Amazon EMR service.
Members
- Message
-
- Type: string
The message associated with the exception.
InvalidRequestException
Description
This exception occurs when there is something wrong with user input.
Members
- ErrorCode
-
- Type: string
The error code associated with the exception.
- Message
-
- Type: string
The message associated with the exception.
JobFlowDetail
Description
A description of a cluster (job flow).
Members
- AmiVersion
-
- Type: string
Applies only to Amazon EMR AMI versions 3.x and 2.x. For Amazon EMR releases 4.0 and later, ReleaseLabel is used. To specify a custom AMI, use CustomAmiID.
- AutoScalingRole
-
- Type: string
An IAM role for automatic scaling policies. The default role is EMR_AutoScaling_DefaultRole. The IAM role provides a way for the automatic scaling feature to get the required permissions it needs to launch and terminate Amazon EC2 instances in an instance group.
- BootstrapActions
-
- Type: Array of BootstrapActionDetail structures
A list of the bootstrap actions run by the job flow.
- ExecutionStatusDetail
-
- Required: Yes
- Type: JobFlowExecutionStatusDetail structure
Describes the execution status of the job flow.
- Instances
-
- Required: Yes
- Type: JobFlowInstancesDetail structure
Describes the Amazon EC2 instances of the job flow.
- JobFlowId
-
- Required: Yes
- Type: string
The job flow identifier.
- JobFlowRole
-
- Type: string
The IAM role that was specified when the job flow was launched. The Amazon EC2 instances of the job flow assume this role.
- LogEncryptionKmsKeyId
-
- Type: string
The KMS key used for encrypting log files. This attribute is only available with Amazon EMR 5.30.0 and later, excluding 6.0.0.
- LogUri
-
- Type: string
The location in Amazon S3 where log files for the job are stored.
- Name
-
- Required: Yes
- Type: string
The name of the job flow.
- ScaleDownBehavior
-
- Type: string
The way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized. TERMINATE_AT_INSTANCE_HOUR indicates that Amazon EMR terminates nodes at the instance-hour boundary, regardless of when the request to terminate the instance was submitted. This option is only available with Amazon EMR 5.1.0 and later and is the default for clusters created using that version. TERMINATE_AT_TASK_COMPLETION indicates that Amazon EMR adds nodes to a deny list and drains tasks from nodes before terminating the Amazon EC2 instances, regardless of the instance-hour boundary. With either behavior, Amazon EMR removes the least active nodes first and blocks instance termination if it could lead to HDFS corruption. TERMINATE_AT_TASK_COMPLETION is available only in Amazon EMR releases 4.1.0 and later, and is the default for releases of Amazon EMR earlier than 5.1.0.
- ServiceRole
-
- Type: string
The IAM role that is assumed by the Amazon EMR service to access Amazon Web Services resources on your behalf.
- Steps
-
- Type: Array of StepDetail structures
A list of steps run by the job flow.
- SupportedProducts
-
- Type: Array of strings
A list of strings set by third-party software when the job flow is launched. If you are not using third-party software to manage the job flow, this value is empty.
- VisibleToAllUsers
-
- Type: boolean
Indicates whether the cluster is visible to IAM principals in the Amazon Web Services account associated with the cluster. When true, IAM principals in the Amazon Web Services account can perform Amazon EMR cluster actions that their IAM policies allow. When false, only the IAM principal that created the cluster and the Amazon Web Services account root user can perform Amazon EMR actions, regardless of IAM permissions policies attached to other IAM principals.
The default value is true if a value is not provided when creating a cluster using the Amazon EMR API RunJobFlow command, the CLI create-cluster command, or the Amazon Web Services Management Console.
JobFlowExecutionStatusDetail
Description
Describes the status of the cluster (job flow).
Members
- CreationDateTime
-
- Required: Yes
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The creation date and time of the job flow.
- EndDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The completion date and time of the job flow.
- LastStateChangeReason
-
- Type: string
A description of the job flow's last state change.
- ReadyDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time when the job flow was ready to start running bootstrap actions.
- StartDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The start date and time of the job flow.
- State
-
- Required: Yes
- Type: string
The state of the job flow.
JobFlowInstancesConfig
Description
A description of the Amazon EC2 instance on which the cluster (job flow) runs. A valid JobFlowInstancesConfig must contain either InstanceGroups or InstanceFleets. They cannot be used together. You may also have MasterInstanceType, SlaveInstanceType, and InstanceCount (all three must be present), but we don't recommend this configuration.
Members
- AdditionalMasterSecurityGroups
-
- Type: Array of strings
A list of additional Amazon EC2 security group IDs for the master node.
- AdditionalSlaveSecurityGroups
-
- Type: Array of strings
A list of additional Amazon EC2 security group IDs for the core and task nodes.
- Ec2KeyName
-
- Type: string
The name of the Amazon EC2 key pair that can be used to connect to the master node using SSH as the user called "hadoop."
- Ec2SubnetId
-
- Type: string
Applies to clusters that use the uniform instance group configuration. To launch the cluster in Amazon Virtual Private Cloud (Amazon VPC), set this parameter to the identifier of the Amazon VPC subnet where you want the cluster to launch. If you do not specify this value and your account supports EC2-Classic, the cluster launches in EC2-Classic.
- Ec2SubnetIds
-
- Type: Array of strings
Applies to clusters that use the instance fleet configuration. When multiple Amazon EC2 subnet IDs are specified, Amazon EMR evaluates them and launches instances in the optimal subnet.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.
- EmrManagedMasterSecurityGroup
-
- Type: string
The identifier of the Amazon EC2 security group for the master node. If you specify EmrManagedMasterSecurityGroup, you must also specify EmrManagedSlaveSecurityGroup.
- EmrManagedSlaveSecurityGroup
-
- Type: string
The identifier of the Amazon EC2 security group for the core and task nodes. If you specify EmrManagedSlaveSecurityGroup, you must also specify EmrManagedMasterSecurityGroup.
- HadoopVersion
-
- Type: string
Applies only to Amazon EMR release versions earlier than 4.0. The Hadoop version for the cluster. Valid inputs are "0.18" (no longer maintained), "0.20" (no longer maintained), "0.20.205" (no longer maintained), "1.0.3", "2.2.0", or "2.4.0". If you do not set this value, the default of 0.18 is used, unless the AmiVersion parameter is set in the RunJobFlow call, in which case the default version of Hadoop for that AMI version is used.
- InstanceCount
-
- Type: int
The number of Amazon EC2 instances in the cluster.
- InstanceFleets
-
- Type: Array of InstanceFleetConfig structures
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.
Describes the Amazon EC2 instances and instance configurations for clusters that use the instance fleet configuration.
- InstanceGroups
-
- Type: Array of InstanceGroupConfig structures
Configuration for the instance groups in a cluster.
- KeepJobFlowAliveWhenNoSteps
-
- Type: boolean
Specifies whether the cluster should remain available after completing all steps. Defaults to false. For more information about configuring cluster termination, see Control Cluster Termination in the EMR Management Guide.
- MasterInstanceType
-
- Type: string
The Amazon EC2 instance type of the master node.
- Placement
-
- Type: PlacementType structure
The Availability Zone in which the cluster runs.
- ServiceAccessSecurityGroup
-
- Type: string
The identifier of the Amazon EC2 security group for the Amazon EMR service to access clusters in VPC private subnets.
- SlaveInstanceType
-
- Type: string
The Amazon EC2 instance type of the core and task nodes.
- TerminationProtected
-
- Type: boolean
Specifies whether to lock the cluster to prevent the Amazon EC2 instances from being terminated by API call, user intervention, or in the event of a job-flow error.
- UnhealthyNodeReplacement
-
- Type: boolean
Indicates whether Amazon EMR should gracefully replace core nodes that have degraded within the cluster.
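The rule that InstanceGroups and InstanceFleets cannot be combined can be checked before a request is built. The following is a minimal sketch, not part of the SDK: the function name is hypothetical, the key pair name, subnet ID, and instance types are placeholders, and the InstanceGroups entries use the InstanceRole, InstanceType, and InstanceCount members of InstanceGroupConfig described elsewhere in this reference.

```php
<?php
// Hypothetical check: exactly one of InstanceGroups or InstanceFleets is set.
// The legacy MasterInstanceType/SlaveInstanceType/InstanceCount form is not covered.
function usesExactlyOneLayout(array $instances): bool
{
    $hasGroups = !empty($instances['InstanceGroups']);
    $hasFleets = !empty($instances['InstanceFleets']);
    return $hasGroups !== $hasFleets;
}

$instances = [
    'Ec2KeyName' => 'my-key-pair',                // placeholder
    'Ec2SubnetId' => 'subnet-0123456789abcdef0',  // placeholder; uniform instance groups only
    'KeepJobFlowAliveWhenNoSteps' => false,
    'TerminationProtected' => false,
    'InstanceGroups' => [
        ['InstanceRole' => 'MASTER', 'InstanceType' => 'm5.xlarge', 'InstanceCount' => 1],
        ['InstanceRole' => 'CORE', 'InstanceType' => 'm5.xlarge', 'InstanceCount' => 2],
    ],
];

var_dump(usesExactlyOneLayout($instances)); // bool(true)
```

An array like $instances would typically be supplied as the Instances parameter of a RunJobFlow call.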
JobFlowInstancesDetail
Description
Specify the type of Amazon EC2 instances that the cluster (job flow) runs on.
Members
- Ec2KeyName
-
- Type: string
The name of an Amazon EC2 key pair that can be used to connect to the master node using SSH.
- Ec2SubnetId
-
- Type: string
For clusters launched within Amazon Virtual Private Cloud, this is the identifier of the subnet where the cluster was launched.
- HadoopVersion
-
- Type: string
The Hadoop version for the cluster.
- InstanceCount
-
- Required: Yes
- Type: int
The number of Amazon EC2 instances in the cluster. If the value is 1, the same instance serves as both the master and core and task node. If the value is greater than 1, one instance is the master node and all others are core and task nodes.
- InstanceGroups
-
- Type: Array of InstanceGroupDetail structures
Details about the instance groups in a cluster.
- KeepJobFlowAliveWhenNoSteps
-
- Type: boolean
Specifies whether the cluster should remain available after completing all steps.
- MasterInstanceId
-
- Type: string
The Amazon EC2 instance identifier of the master node.
- MasterInstanceType
-
- Required: Yes
- Type: string
The Amazon EC2 master node instance type.
- MasterPublicDnsName
-
- Type: string
The DNS name of the master node. If the cluster is on a private subnet, this is the private DNS name. On a public subnet, this is the public DNS name.
- NormalizedInstanceHours
-
- Type: int
An approximation of the cost of the cluster, represented in m1.small/hours. This value is increased one time for every hour that an m1.small instance runs. Larger instances are weighted more heavily, so an Amazon EC2 instance that is roughly four times more expensive would result in the normalized instance hours being increased incrementally four times. This result is only an approximation and does not reflect the actual billing rate.
- Placement
-
- Type: PlacementType structure
The Amazon EC2 Availability Zone for the cluster.
- SlaveInstanceType
-
- Required: Yes
- Type: string
The Amazon EC2 core and task node instance type.
- TerminationProtected
-
- Type: boolean
Specifies whether the Amazon EC2 instances in the cluster are protected from termination by API calls, user intervention, or in the event of a job-flow error.
- UnhealthyNodeReplacement
-
- Type: boolean
Indicates whether Amazon EMR should gracefully replace core nodes that have degraded within the cluster.
KerberosAttributes
Description
Attributes for Kerberos configuration when Kerberos authentication is enabled using a security configuration. For more information, see Use Kerberos Authentication in the Amazon EMR Management Guide.
Members
- ADDomainJoinPassword
-
- Type: string
The Active Directory password for ADDomainJoinUser.
- ADDomainJoinUser
-
- Type: string
Required only when establishing a cross-realm trust with an Active Directory domain. A user with sufficient privileges to join resources to the domain.
- CrossRealmTrustPrincipalPassword
-
- Type: string
Required only when establishing a cross-realm trust with a KDC in a different realm. The cross-realm principal password, which must be identical across realms.
- KdcAdminPassword
-
- Required: Yes
- Type: string
The password used within the cluster for the kadmin service on the cluster-dedicated KDC, which maintains Kerberos principals, password policies, and keytabs for the cluster.
- Realm
-
- Required: Yes
- Type: string
The name of the Kerberos realm to which all nodes in a cluster belong. For example, EC2.INTERNAL.
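The required and conditionally required members of KerberosAttributes can be checked locally before a request is sent. This is a sketch under assumptions: the function name and the $crossRealmWithAd flag are hypothetical, and only the members marked as required in the descriptions above are checked.

```php
<?php
// Hypothetical validator: returns the names of required members that are missing or empty.
function missingKerberosMembers(array $attrs, bool $crossRealmWithAd = false): array
{
    $required = ['KdcAdminPassword', 'Realm'];
    if ($crossRealmWithAd) {
        // Required only when establishing a cross-realm trust with an Active Directory domain.
        $required[] = 'ADDomainJoinUser';
    }
    return array_values(array_filter(
        $required,
        fn (string $name): bool => !isset($attrs[$name]) || $attrs[$name] === ''
    ));
}

// Realm is present, but the KDC admin password has not been supplied.
echo implode(', ', missingKerberosMembers(['Realm' => 'EC2.INTERNAL'])), PHP_EOL; // prints KdcAdminPassword
```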
KeyValue
Description
A key-value pair.
Members
- Key
-
- Type: string
The unique identifier of a key-value pair.
- Value
-
- Type: string
The value part of the identified key.
ManagedScalingPolicy
Description
Managed scaling policy for an Amazon EMR cluster. The policy specifies the limits for resources that can be added or terminated from a cluster. The policy only applies to the core and task nodes. The master node cannot be scaled after initial configuration.
Members
- ComputeLimits
-
- Type: ComputeLimits structure
The Amazon EC2 unit limits for a managed scaling policy. The managed scaling activity of a cluster is not allowed to go above or below these limits. The limit only applies to the core and task nodes. The master node cannot be scaled after initial configuration.
MetricDimension
Description
A CloudWatch dimension, which is specified using a Key (known as a Name in CloudWatch), Value pair. By default, Amazon EMR uses one dimension whose Key is JobFlowID and Value is a variable representing the cluster ID, which is ${emr.clusterId}. This enables the rule to bootstrap when the cluster ID becomes available.
Members
- Key
-
- Type: string
The dimension name.
- Value
-
- Type: string
The dimension value.
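When the default dimension is written out in PHP, the quoting style matters. The sketch below shows one way to express it; the variable name is arbitrary.

```php
<?php
// Use single quotes: inside a double-quoted PHP string, "${...}" is interpolation
// syntax, so the placeholder would not reach Amazon EMR as literal text.
$dimension = [
    'Key' => 'JobFlowID',
    'Value' => '${emr.clusterId}',
];

echo $dimension['Value'], PHP_EOL; // prints ${emr.clusterId}
```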
NotebookExecution
Description
A notebook execution. An execution is a single run of an Amazon EMR Notebook, started using the StartNotebookExecution action.
Members
- Arn
-
- Type: string
The Amazon Resource Name (ARN) of the notebook execution.
- EditorId
-
- Type: string
The unique identifier of the Amazon EMR Notebook that is used for the notebook execution.
- EndTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The timestamp when notebook execution ended.
- EnvironmentVariables
-
- Type: Associative array of custom strings keys (XmlStringMaxLen256) to strings
The environment variables associated with the notebook execution.
- ExecutionEngine
-
- Type: ExecutionEngineConfig structure
The execution engine, such as an Amazon EMR cluster, used to run the Amazon EMR notebook and perform the notebook execution.
- LastStateChangeReason
-
- Type: string
The reason for the latest status change of the notebook execution.
- NotebookExecutionId
-
- Type: string
The unique identifier of a notebook execution.
- NotebookExecutionName
-
- Type: string
A name for the notebook execution.
- NotebookInstanceSecurityGroupId
-
- Type: string
The unique identifier of the Amazon EC2 security group associated with the Amazon EMR Notebook instance. For more information, see Specifying Amazon EC2 Security Groups for Amazon EMR Notebooks in the Amazon EMR Management Guide.
- NotebookParams
-
- Type: string
Input parameters in JSON format passed to the Amazon EMR Notebook at runtime for execution.
- NotebookS3Location
-
- Type: NotebookS3LocationForOutput structure
The Amazon S3 location that stores the notebook execution input.
- OutputNotebookFormat
-
- Type: string
The output format for the notebook execution.
- OutputNotebookS3Location
-
- Type: OutputNotebookS3LocationForOutput structure
The Amazon S3 location for the notebook execution output.
- OutputNotebookURI
-
- Type: string
The location of the notebook execution's output file in Amazon S3.
- StartTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The timestamp when notebook execution started.
- Status
-
- Type: string
The status of the notebook execution.
- START_PENDING indicates that the cluster has received the execution request but execution has not begun.
- STARTING indicates that the execution is starting on the cluster.
- RUNNING indicates that the execution is being processed by the cluster.
- FINISHING indicates that execution processing is in the final stages.
- FINISHED indicates that the execution has completed without error.
- FAILING indicates that the execution is failing and will not finish successfully.
- FAILED indicates that the execution failed.
- STOP_PENDING indicates that the cluster has received a StopNotebookExecution request and the stop is pending.
- STOPPING indicates that the cluster is in the process of stopping the execution as a result of a StopNotebookExecution request.
- STOPPED indicates that the execution stopped because of a StopNotebookExecution request.
- Tags
-
- Type: Array of Tag structures
A list of tags associated with a notebook execution. Tags are user-defined key-value pairs that consist of a required key string with a maximum of 128 characters and an optional value string with a maximum of 256 characters.
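Code that polls a notebook execution usually needs to know when to stop waiting. The helper below is a sketch with a hypothetical name; it assumes, based on the status descriptions above, that FINISHED, FAILED, and STOPPED are the only final states.

```php
<?php
// Assumption: FINISHED, FAILED, and STOPPED are final; every other status is in progress.
function isFinalNotebookStatus(string $status): bool
{
    return in_array($status, ['FINISHED', 'FAILED', 'STOPPED'], true);
}

foreach (['RUNNING', 'FINISHING', 'FINISHED'] as $status) {
    echo $status, ': ', isFinalNotebookStatus($status) ? 'final' : 'in progress', PHP_EOL;
}
```

A polling loop could call a helper like this on the Status member of each returned NotebookExecution and stop once it reports a final state.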
NotebookExecutionSummary
Description
Details for a notebook execution. The details include information such as the unique ID and status of the notebook execution.
Members
- EditorId
-
- Type: string
The unique identifier of the editor associated with the notebook execution.
- EndTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The timestamp when notebook execution ended.
- ExecutionEngineId
-
- Type: string
The unique ID of the execution engine for the notebook execution.
- NotebookExecutionId
-
- Type: string
The unique identifier of the notebook execution.
- NotebookExecutionName
-
- Type: string
The name of the notebook execution.
- NotebookS3Location
-
- Type: NotebookS3LocationForOutput structure
The Amazon S3 location that stores the notebook execution input.
- StartTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The timestamp when notebook execution started.
- Status
-
- Type: string
The status of the notebook execution.
- START_PENDING indicates that the cluster has received the execution request but execution has not begun.
- STARTING indicates that the execution is starting on the cluster.
- RUNNING indicates that the execution is being processed by the cluster.
- FINISHING indicates that execution processing is in the final stages.
- FINISHED indicates that the execution has completed without error.
- FAILING indicates that the execution is failing and will not finish successfully.
- FAILED indicates that the execution failed.
- STOP_PENDING indicates that the cluster has received a StopNotebookExecution request and the stop is pending.
- STOPPING indicates that the cluster is in the process of stopping the execution as a result of a StopNotebookExecution request.
- STOPPED indicates that the execution stopped because of a StopNotebookExecution request.
NotebookS3LocationForOutput
Description
The Amazon S3 location that stores the notebook execution input.
Members
- Bucket
-
- Type: string
The Amazon S3 bucket that stores the notebook execution input.
- Key
-
- Type: string
The key to the Amazon S3 location that stores the notebook execution input.
NotebookS3LocationFromInput
Description
The Amazon S3 location that stores the notebook execution input.
Members
- Bucket
-
- Type: string
The Amazon S3 bucket that stores the notebook execution input.
- Key
-
- Type: string
The key to the Amazon S3 location that stores the notebook execution input.
OSRelease
Description
The Amazon Linux release specified for a cluster in the RunJobFlow request.
Members
- Label
-
- Type: string
The Amazon Linux release specified for a cluster in the RunJobFlow request. The format is as shown in Amazon Linux 2 Release Notes. For example, 2.0.20220218.1.
OnDemandCapacityReservationOptions
Description
Describes the strategy for using unused Capacity Reservations for fulfilling On-Demand capacity.
Members
- CapacityReservationPreference
-
- Type: string
Indicates the instance's Capacity Reservation preferences. Possible preferences include:
- open - The instance can run in any open Capacity Reservation that has matching attributes (instance type, platform, Availability Zone).
- none - The instance avoids running in a Capacity Reservation even if one is available. The instance runs as an On-Demand Instance.
- CapacityReservationResourceGroupArn
-
- Type: string
The ARN of the Capacity Reservation resource group in which to run the instance.
- UsageStrategy
-
- Type: string
Indicates whether to use unused Capacity Reservations for fulfilling On-Demand capacity.
If you specify use-capacity-reservations-first, the fleet uses unused Capacity Reservations to fulfill On-Demand capacity up to the target On-Demand capacity. If multiple instance pools have unused Capacity Reservations, the On-Demand allocation strategy (lowest-price) is applied. If the number of unused Capacity Reservations is less than the On-Demand target capacity, the remaining On-Demand target capacity is launched according to the On-Demand allocation strategy (lowest-price).
If you do not specify a value, the fleet fulfills the On-Demand capacity according to the chosen On-Demand allocation strategy.
OnDemandProvisioningSpecification
Description
The launch specification for On-Demand Instances in the instance fleet, which determines the allocation strategy.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions. On-Demand Instances allocation strategy is available in Amazon EMR releases 5.12.1 and later.
Members
- AllocationStrategy
-
- Required: Yes
- Type: string
Specifies the strategy to use in launching On-Demand instance fleets. Available options are lowest-price and prioritized. lowest-price specifies to launch the instances with the lowest price first, and prioritized specifies that Amazon EMR should launch the instances with the highest priority first. The default is lowest-price.
- CapacityReservationOptions
-
- Type: OnDemandCapacityReservationOptions structure
The launch specification for On-Demand instances in the instance fleet, which determines the allocation strategy.
OnDemandResizingSpecification
Description
The resize specification for On-Demand Instances in the instance fleet, which contains the resize timeout period.
Members
- AllocationStrategy
-
- Type: string
Specifies the allocation strategy to use to launch On-Demand instances during a resize. The default is lowest-price.
- CapacityReservationOptions
-
- Type: OnDemandCapacityReservationOptions structure
Describes the strategy for using unused Capacity Reservations for fulfilling On-Demand capacity.
- TimeoutDurationMinutes
-
- Type: int
On-Demand resize timeout in minutes. If On-Demand Instances are not provisioned within this time, the resize workflow stops. The minimum value is 5 minutes, and the maximum value is 10,080 minutes (7 days). The timeout applies to all resize workflows on the Instance Fleet. The resize could be triggered by Amazon EMR Managed Scaling or by the customer (via Amazon EMR Console, Amazon EMR CLI modify-instance-fleet or Amazon EMR SDK ModifyInstanceFleet API) or by Amazon EMR due to Amazon EC2 Spot Reclamation.
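The documented bounds on TimeoutDurationMinutes can be checked client-side before the value is sent. This is a minimal sketch; the function name is hypothetical and the example values are placeholders.

```php
<?php
// Hypothetical range check using the documented limits: 5 to 10,080 minutes (7 days).
function isValidResizeTimeout(int $minutes): bool
{
    return $minutes >= 5 && $minutes <= 10080;
}

$resizeSpec = [
    'AllocationStrategy' => 'lowest-price', // the documented default
    'TimeoutDurationMinutes' => 60,         // placeholder value
];

echo isValidResizeTimeout($resizeSpec['TimeoutDurationMinutes']) ? 'ok' : 'out of range', PHP_EOL; // prints ok
```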
OutputNotebookS3LocationForOutput
Description
The Amazon S3 location that stores the notebook execution output.
Members
- Bucket
-
- Type: string
The Amazon S3 bucket that stores the notebook execution output.
- Key
-
- Type: string
The key to the Amazon S3 location that stores the notebook execution output.
OutputNotebookS3LocationFromInput
Description
The Amazon S3 location that stores the notebook execution output.
Members
- Bucket
-
- Type: string
The Amazon S3 bucket that stores the notebook execution output.
- Key
-
- Type: string
The key to the Amazon S3 location that stores the notebook execution output.
PlacementGroupConfig
Description
Placement group configuration for an Amazon EMR cluster. The configuration specifies the placement strategy that can be applied to instance roles during cluster creation.
To use this configuration, consider attaching managed policy AmazonElasticMapReducePlacementGroupPolicy to the Amazon EMR role.
Members
- InstanceRole
-
- Required: Yes
- Type: string
Role of the instance in the cluster.
Starting with Amazon EMR release 5.23.0, the only supported instance role is MASTER.
- PlacementStrategy
-
- Type: string
Amazon EC2 Placement Group strategy associated with instance role.
Starting with Amazon EMR release 5.23.0, the only supported placement strategy is SPREAD for the MASTER instance role.
PlacementType
Description
The Amazon EC2 Availability Zone configuration of the cluster (job flow).
Members
- AvailabilityZone
-
- Type: string
The Amazon EC2 Availability Zone for the cluster.
AvailabilityZone is used for uniform instance groups, while AvailabilityZones (plural) is used for instance fleets.
- AvailabilityZones
-
- Type: Array of strings
When multiple Availability Zones are specified, Amazon EMR evaluates them and launches instances in the optimal Availability Zone.
AvailabilityZones is used for instance fleets, while AvailabilityZone (singular) is used for uniform instance groups.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.
PortRange
Description
A list of port ranges that are permitted to allow inbound traffic from all public IP addresses. To specify a single port, use the same value for MinRange and MaxRange.
Members
- MaxRange
-
- Type: int
The largest port number in a specified range of port numbers.
- MinRange
-
- Required: Yes
- Type: int
The smallest port number in a specified range of port numbers.
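A small builder makes the single-port convention explicit. The function below is a hypothetical sketch, not an SDK call; it only enforces that MinRange does not exceed MaxRange.

```php
<?php
// Hypothetical builder: omit $max to describe a single port (MinRange === MaxRange).
function portRange(int $min, ?int $max = null): array
{
    $max = $max ?? $min;
    if ($min > $max) {
        throw new InvalidArgumentException('MinRange must not be greater than MaxRange.');
    }
    return ['MinRange' => $min, 'MaxRange' => $max];
}

echo json_encode(portRange(22)), PHP_EOL;         // prints {"MinRange":22,"MaxRange":22}
echo json_encode(portRange(8000, 8080)), PHP_EOL; // prints {"MinRange":8000,"MaxRange":8080}
```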
ReleaseLabelFilter
Description
The release label filters by application or version prefix.
Members
- Application
-
- Type: string
Optional release label application filter. For example, spark@2.1.0.
- Prefix
-
- Type: string
Optional release label version prefix filter. For example, emr-5.
ScalingAction
Description
The type of adjustment the automatic scaling activity makes when triggered, and the periodicity of the adjustment.
Members
- Market
-
- Type: string
Not available for instance groups. Instance groups use the market type specified for the group.
- SimpleScalingPolicyConfiguration
-
- Required: Yes
- Type: SimpleScalingPolicyConfiguration structure
The type of adjustment the automatic scaling activity makes when triggered, and the periodicity of the adjustment.
ScalingConstraints
Description
The upper and lower Amazon EC2 instance limits for an automatic scaling policy. Automatic scaling activities triggered by automatic scaling rules will not cause an instance group to grow above or shrink below these limits.
Members
- MaxCapacity
-
- Required: Yes
- Type: int
The upper boundary of Amazon EC2 instances in an instance group beyond which scaling activities are not allowed to grow. Scale-out activities will not add instances beyond this boundary.
- MinCapacity
-
- Required: Yes
- Type: int
The lower boundary of Amazon EC2 instances in an instance group below which scaling activities are not allowed to shrink. Scale-in activities will not terminate instances below this boundary.
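The effect of these limits on a requested instance count can be pictured as a clamp. The sketch below uses a hypothetical function name and models only the bounds, not the way Amazon EMR schedules scaling activities.

```php
<?php
// Hypothetical illustration: keep a desired instance count within MinCapacity and MaxCapacity.
function clampToConstraints(int $desired, array $constraints): int
{
    return max($constraints['MinCapacity'], min($constraints['MaxCapacity'], $desired));
}

$constraints = ['MinCapacity' => 2, 'MaxCapacity' => 10];

echo clampToConstraints(14, $constraints), PHP_EOL; // prints 10 (scale-out stops at MaxCapacity)
echo clampToConstraints(1, $constraints), PHP_EOL;  // prints 2 (scale-in stops at MinCapacity)
```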
ScalingRule
Description
A scale-in or scale-out rule that defines scaling activity, including the CloudWatch metric alarm that triggers activity, how Amazon EC2 instances are added or removed, and the periodicity of adjustments. The automatic scaling policy for an instance group can comprise one or more automatic scaling rules.
Members
- Action
-
- Required: Yes
- Type: ScalingAction structure
The conditions that trigger an automatic scaling activity.
- Description
-
- Type: string
A friendly, more verbose description of the automatic scaling rule.
- Name
-
- Required: Yes
- Type: string
The name used to identify an automatic scaling rule. Rule names must be unique within a scaling policy.
- Trigger
-
- Required: Yes
- Type: ScalingTrigger structure
The CloudWatch alarm definition that determines when automatic scaling activity is triggered.
ScalingTrigger
Description
The conditions that trigger an automatic scaling activity.
Members
- CloudWatchAlarmDefinition
-
- Required: Yes
- Type: CloudWatchAlarmDefinition structure
The definition of a CloudWatch metric alarm. When the defined alarm conditions are met along with other trigger parameters, scaling activity begins.
ScriptBootstrapActionConfig
Description
Configuration of the script to run during a bootstrap action.
Members
- Args
-
- Type: Array of strings
A list of command line arguments to pass to the bootstrap action script.
- Path
-
- Required: Yes
- Type: string
Location in Amazon S3 of the script to run during a bootstrap action.
SecurityConfigurationSummary
Description
The creation date and time, and name, of a security configuration.
Members
- CreationDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time the security configuration was created.
- Name
-
- Type: string
The name of the security configuration.
SessionMappingDetail
Description
Details for an Amazon EMR Studio session mapping including creation time, user or group ID, Studio ID, and so on.
Members
- CreationTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The time the session mapping was created.
- IdentityId
-
- Type: string
The globally unique identifier (GUID) of the user or group.
- IdentityName
-
- Type: string
The name of the user or group. For more information, see UserName and DisplayName in the IAM Identity Center Identity Store API Reference.
- IdentityType
-
- Type: string
Specifies whether the identity mapped to the Amazon EMR Studio is a user or a group.
- LastModifiedTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The time the session mapping was last modified.
- SessionPolicyArn
-
- Type: string
The Amazon Resource Name (ARN) of the session policy associated with the user or group.
- StudioId
-
- Type: string
The ID of the Amazon EMR Studio.
SessionMappingSummary
Description
Details for an Amazon EMR Studio session mapping. The details do not include the time the session mapping was last modified.
Members
- CreationTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The time the session mapping was created.
- IdentityId
-
- Type: string
The globally unique identifier (GUID) of the user or group from the IAM Identity Center Identity Store.
- IdentityName
-
- Type: string
The name of the user or group. For more information, see UserName and DisplayName in the IAM Identity Center Identity Store API Reference.
- IdentityType
-
- Type: string
Specifies whether the identity mapped to the Amazon EMR Studio is a user or a group.
- SessionPolicyArn
-
- Type: string
The Amazon Resource Name (ARN) of the session policy associated with the user or group.
- StudioId
-
- Type: string
The ID of the Amazon EMR Studio.
ShrinkPolicy
Description
Policy for customizing shrink operations. Allows configuration of decommissioning timeout and targeted instance shrinking.
Members
- DecommissionTimeout
-
- Type: int
The desired timeout for decommissioning an instance. Overrides the default YARN decommissioning timeout.
- InstanceResizePolicy
-
- Type: InstanceResizePolicy structure
Custom policy for requesting termination protection or termination of specific instances when shrinking an instance group.
SimpleScalingPolicyConfiguration
Description
An automatic scaling configuration, which describes how the policy adds or removes instances, the cooldown period, and the number of Amazon EC2 instances that will be added each time the CloudWatch metric alarm condition is satisfied.
Members
- AdjustmentType
-
- Type: string
The way in which Amazon EC2 instances are added (if ScalingAdjustment is a positive number) or terminated (if ScalingAdjustment is a negative number) each time the scaling activity is triggered. CHANGE_IN_CAPACITY is the default. CHANGE_IN_CAPACITY indicates that the Amazon EC2 instance count increments or decrements by ScalingAdjustment, which should be expressed as an integer. PERCENT_CHANGE_IN_CAPACITY indicates the instance count increments or decrements by the percentage specified by ScalingAdjustment, which should be expressed as an integer. For example, 20 indicates an increase in 20% increments of cluster capacity. EXACT_CAPACITY indicates the scaling activity results in an instance group with the number of Amazon EC2 instances specified by ScalingAdjustment, which should be expressed as a positive integer.
- CoolDown
-
- Type: int
The amount of time, in seconds, after a scaling activity completes before any further trigger-related scaling activities can start. The default value is 0.
- ScalingAdjustment
-
- Required: Yes
- Type: int
The amount by which to scale in or scale out, based on the specified AdjustmentType. A positive value adds to the instance group's Amazon EC2 instance count while a negative number removes instances. If AdjustmentType is set to EXACT_CAPACITY, the number should only be a positive integer. If AdjustmentType is set to PERCENT_CHANGE_IN_CAPACITY, the value should express the percentage as an integer. For example, -20 indicates a decrease in 20% increments of cluster capacity.
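The constraints on AdjustmentType and ScalingAdjustment described above can be checked before a request is sent. The following is a minimal sketch, not part of the SDK: the function name buildSimpleScalingConfig is hypothetical, and the checks cover only the rules stated in the member descriptions.

```php
<?php
/**
 * Hypothetical helper (not an SDK function): builds a
 * SimpleScalingPolicyConfiguration array and enforces the rules
 * stated in the member descriptions.
 */
function buildSimpleScalingConfig(
    int $scalingAdjustment,
    string $adjustmentType = 'CHANGE_IN_CAPACITY',
    int $coolDown = 0
): array {
    $allowed = ['CHANGE_IN_CAPACITY', 'PERCENT_CHANGE_IN_CAPACITY', 'EXACT_CAPACITY'];
    if (!in_array($adjustmentType, $allowed, true)) {
        throw new InvalidArgumentException("Unknown AdjustmentType: $adjustmentType");
    }
    // EXACT_CAPACITY names a target instance count, so it must be positive.
    if ($adjustmentType === 'EXACT_CAPACITY' && $scalingAdjustment <= 0) {
        throw new InvalidArgumentException('EXACT_CAPACITY requires a positive ScalingAdjustment');
    }

    return [
        'AdjustmentType'    => $adjustmentType,
        'CoolDown'          => $coolDown,
        'ScalingAdjustment' => $scalingAdjustment,
    ];
}

// Scale in by 20 percent, then wait 300 seconds before further scaling activity.
$config = buildSimpleScalingConfig(-20, 'PERCENT_CHANGE_IN_CAPACITY', 300);
```

The returned array has the shape of a SimpleScalingPolicyConfiguration structure and can be placed wherever a request expects one.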
SimplifiedApplication
Description
The returned release label application names or versions.
Members
- Name
-
- Type: string
The returned release label application name. For example, hadoop.
- Version
-
- Type: string
The returned release label application version. For example, 3.2.1.
SpotProvisioningSpecification
Description
The launch specification for Spot Instances in the instance fleet, which determines the defined duration, provisioning timeout behavior, and allocation strategy.
The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions. Spot Instance allocation strategy is available in Amazon EMR releases 5.12.1 and later.
Spot Instances with a defined duration (also known as Spot blocks) are no longer available to new customers from July 1, 2021. For customers who have previously used the feature, we will continue to support Spot Instances with a defined duration until December 31, 2022.
Members
- AllocationStrategy
-
- Type: string
Specifies one of the following strategies to launch Spot Instance fleets: capacity-optimized, price-capacity-optimized, lowest-price, diversified, or capacity-optimized-prioritized. For more information on the provisioning strategies, see Allocation strategies for Spot Instances in the Amazon EC2 User Guide for Linux Instances.
When you launch a Spot Instance fleet with the old console, it automatically launches with the capacity-optimized strategy. You can't change the allocation strategy from the old console.
- BlockDurationMinutes
-
- Type: int
The defined duration for Spot Instances (also known as Spot blocks) in minutes. When specified, the Spot Instance does not terminate before the defined duration expires, and defined duration pricing for Spot Instances applies. Valid values are 60, 120, 180, 240, 300, or 360. The duration period starts as soon as a Spot Instance receives its instance ID. At the end of the duration, Amazon EC2 marks the Spot Instance for termination and provides a Spot Instance termination notice, which gives the instance a two-minute warning before it terminates.
Spot Instances with a defined duration (also known as Spot blocks) are no longer available to new customers from July 1, 2021. For customers who have previously used the feature, we will continue to support Spot Instances with a defined duration until December 31, 2022.
- TimeoutAction
-
- Required: Yes
- Type: string
The action to take when TargetSpotCapacity has not been fulfilled when the TimeoutDurationMinutes has expired; that is, when all Spot Instances could not be provisioned within the Spot provisioning timeout. Valid values are TERMINATE_CLUSTER and SWITCH_TO_ON_DEMAND. SWITCH_TO_ON_DEMAND specifies that if no Spot Instances are available, On-Demand Instances should be provisioned to fulfill any remaining Spot capacity.
- TimeoutDurationMinutes
-
- Required: Yes
- Type: int
The Spot provisioning timeout period in minutes. If Spot Instances are not provisioned within this time period, the TimeoutAction is taken. Minimum value is 5 and maximum value is 1440. The timeout applies only during initial provisioning, when the cluster is first created.
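The value ranges listed for SpotProvisioningSpecification can be validated locally. The following is a minimal sketch under the constraints stated above; buildSpotProvisioningSpec is a hypothetical helper, not an SDK function. BlockDurationMinutes is left out because Spot blocks are no longer offered to new customers.

```php
<?php
/**
 * Hypothetical helper (not an SDK function): builds a
 * SpotProvisioningSpecification array after checking the documented ranges.
 */
function buildSpotProvisioningSpec(
    int $timeoutDurationMinutes,
    string $timeoutAction,
    ?string $allocationStrategy = null
): array {
    // Documented range: minimum 5, maximum 1440 minutes.
    if ($timeoutDurationMinutes < 5 || $timeoutDurationMinutes > 1440) {
        throw new InvalidArgumentException('TimeoutDurationMinutes must be between 5 and 1440');
    }
    if (!in_array($timeoutAction, ['TERMINATE_CLUSTER', 'SWITCH_TO_ON_DEMAND'], true)) {
        throw new InvalidArgumentException("Unknown TimeoutAction: $timeoutAction");
    }

    $spec = [
        'TimeoutAction'          => $timeoutAction,
        'TimeoutDurationMinutes' => $timeoutDurationMinutes,
    ];

    // AllocationStrategy is optional; include it only when the caller sets one.
    if ($allocationStrategy !== null) {
        $strategies = [
            'capacity-optimized',
            'price-capacity-optimized',
            'lowest-price',
            'diversified',
            'capacity-optimized-prioritized',
        ];
        if (!in_array($allocationStrategy, $strategies, true)) {
            throw new InvalidArgumentException("Unknown AllocationStrategy: $allocationStrategy");
        }
        $spec['AllocationStrategy'] = $allocationStrategy;
    }

    return $spec;
}

// Fall back to On-Demand Instances if Spot capacity is not fulfilled within 30 minutes.
$spec = buildSpotProvisioningSpec(30, 'SWITCH_TO_ON_DEMAND', 'price-capacity-optimized');
```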
SpotResizingSpecification
Description
The resize specification for Spot Instances in the instance fleet, which contains the resize timeout period.
Members
- AllocationStrategy
-
- Type: string
Specifies the allocation strategy to use to launch Spot instances during a resize. If you run Amazon EMR releases 6.9.0 or higher, the default is price-capacity-optimized. If you run Amazon EMR releases 6.8.0 or lower, the default is capacity-optimized.
- TimeoutDurationMinutes
-
- Type: int
Spot resize timeout in minutes. If Spot Instances are not provisioned within this time, the resize workflow will stop provisioning of Spot instances. Minimum value is 5 minutes and maximum value is 10,080 minutes (7 days). The timeout applies to all resize workflows on the Instance Fleet. The resize could be triggered by Amazon EMR Managed Scaling or by the customer (via Amazon EMR Console, Amazon EMR CLI modify-instance-fleet or Amazon EMR SDK ModifyInstanceFleet API) or by Amazon EMR due to Amazon EC2 Spot Reclamation.
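The release-dependent default for AllocationStrategy can be expressed as a version comparison. This sketch assumes release labels of the form emr-6.9.0 and treats every release below 6.9.0 as using capacity-optimized; both function names are hypothetical and not part of the SDK.

```php
<?php
/**
 * Hypothetical helper: returns the default resize allocation strategy
 * described above for a given Amazon EMR release.
 */
function defaultResizeAllocationStrategy(string $releaseLabel): string
{
    // Assumption: labels look like "emr-6.9.0"; strip the prefix if present.
    $version = preg_replace('/^emr-/', '', $releaseLabel);

    return version_compare($version, '6.9.0', '>=')
        ? 'price-capacity-optimized'
        : 'capacity-optimized';
}

/**
 * Hypothetical helper: builds a SpotResizingSpecification array and
 * checks the documented timeout range (5 to 10,080 minutes).
 */
function buildSpotResizeSpec(int $timeoutDurationMinutes, ?string $allocationStrategy = null): array
{
    if ($timeoutDurationMinutes < 5 || $timeoutDurationMinutes > 10080) {
        throw new InvalidArgumentException('TimeoutDurationMinutes must be between 5 and 10080');
    }
    $spec = ['TimeoutDurationMinutes' => $timeoutDurationMinutes];
    // Omit AllocationStrategy to let the service apply its release default.
    if ($allocationStrategy !== null) {
        $spec['AllocationStrategy'] = $allocationStrategy;
    }
    return $spec;
}

$resizeSpec = buildSpotResizeSpec(60);
$effective  = defaultResizeAllocationStrategy('emr-6.10.0');
```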
Step
Description
This represents a step in a cluster.
Members
- ActionOnFailure
-
- Type: string
The action to take when the cluster step fails. Possible values are TERMINATE_CLUSTER, CANCEL_AND_WAIT, and CONTINUE. TERMINATE_JOB_FLOW is provided for backward compatibility. We recommend using TERMINATE_CLUSTER instead.
If a cluster's StepConcurrencyLevel is greater than 1, do not use AddJobFlowSteps to submit a step with this parameter set to CANCEL_AND_WAIT or TERMINATE_CLUSTER. The step is not submitted and the action fails with a message that the ActionOnFailure setting is not valid.
If you change a cluster's StepConcurrencyLevel to be greater than 1 while a step is running, the ActionOnFailure parameter may not behave as you expect. In this case, for a step that fails with this parameter set to CANCEL_AND_WAIT, pending steps and the running step are not canceled; for a step that fails with this parameter set to TERMINATE_CLUSTER, the cluster does not terminate.
- Config
-
- Type: HadoopStepConfig structure
The Hadoop job configuration of the cluster step.
- ExecutionRoleArn
-
- Type: string
The Amazon Resource Name (ARN) of the runtime role for a step on the cluster. The runtime role can be a cross-account IAM role. The runtime role ARN is a combination of account ID, role name, and role type using the following format: arn:partition:service:region:account:resource.
For example, arn:aws:IAM::1234567890:role/ReadOnly is a correctly formatted runtime role ARN.
- Id
-
- Type: string
The identifier of the cluster step.
- Name
-
- Type: string
The name of the cluster step.
- Status
-
- Type: StepStatus structure
The current execution status details of the cluster step.
StepConfig
Description
Specification for a cluster (job flow) step.
Members
- ActionOnFailure
-
- Type: string
The action to take when the step fails. Use one of the following values:
- TERMINATE_CLUSTER - Shuts down the cluster.
- CANCEL_AND_WAIT - Cancels any pending steps and returns the cluster to the WAITING state.
- CONTINUE - Continues to the next step in the queue.
- TERMINATE_JOB_FLOW - Shuts down the cluster. TERMINATE_JOB_FLOW is provided for backward compatibility. We recommend using TERMINATE_CLUSTER instead.
If a cluster's StepConcurrencyLevel is greater than 1, do not use AddJobFlowSteps to submit a step with this parameter set to CANCEL_AND_WAIT or TERMINATE_CLUSTER. The step is not submitted and the action fails with a message that the ActionOnFailure setting is not valid.
If you change a cluster's StepConcurrencyLevel to be greater than 1 while a step is running, the ActionOnFailure parameter may not behave as you expect. In this case, for a step that fails with this parameter set to CANCEL_AND_WAIT, pending steps and the running step are not canceled; for a step that fails with this parameter set to TERMINATE_CLUSTER, the cluster does not terminate.
- HadoopJarStep
-
- Required: Yes
- Type: HadoopJarStepConfig structure
The JAR file used for the step.
- Name
-
- Required: Yes
- Type: string
The name of the step.
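The StepConcurrencyLevel restriction on ActionOnFailure can be checked on the client side before calling AddJobFlowSteps. The following is a minimal sketch: assertStepAllowed is a hypothetical helper, the Jar and Args members belong to the HadoopJarStepConfig structure, and the step name, bucket, and arguments are placeholder values.

```php
<?php
/**
 * Hypothetical helper (not an SDK function): rejects a StepConfig whose
 * ActionOnFailure is not valid for the cluster's StepConcurrencyLevel.
 */
function assertStepAllowed(array $step, int $stepConcurrencyLevel): void
{
    $action = $step['ActionOnFailure'] ?? null;
    $blocked = ['CANCEL_AND_WAIT', 'TERMINATE_CLUSTER'];

    // With concurrent steps, the service refuses these two settings.
    if ($stepConcurrencyLevel > 1 && in_array($action, $blocked, true)) {
        throw new InvalidArgumentException(
            "ActionOnFailure $action is not valid when StepConcurrencyLevel is greater than 1"
        );
    }
}

// A StepConfig array with the two required members, Name and HadoopJarStep.
$step = [
    'Name'            => 'example-step',          // placeholder name
    'ActionOnFailure' => 'CONTINUE',
    'HadoopJarStep'   => [
        'Jar'  => 'command-runner.jar',
        'Args' => ['spark-submit', 's3://amzn-s3-demo-bucket/job.py'], // placeholder arguments
    ],
];

assertStepAllowed($step, 2);

// With the SDK installed and $client an Aws\Emr\EmrClient, the step could then be
// submitted (the cluster ID is a placeholder):
// $client->addJobFlowSteps(['JobFlowId' => 'j-EXAMPLE', 'Steps' => [$step]]);
```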
StepDetail
Description
Combines the execution state and configuration of a step.
Members
- ExecutionStatusDetail
-
- Required: Yes
- Type: StepExecutionStatusDetail structure
The description of the step status.
- StepConfig
-
- Required: Yes
- Type: StepConfig structure
The step configuration.
StepExecutionStatusDetail
Description
The execution state of a step.
Members
- CreationDateTime
-
- Required: Yes
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The creation date and time of the step.
- EndDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The completion date and time of the step.
- LastStateChangeReason
-
- Type: string
A description of the step's current state.
- StartDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The start date and time of the step.
- State
-
- Required: Yes
- Type: string
The state of the step.
StepStateChangeReason
Description
The details of the step state change reason.
Members
- Code
-
- Type: string
The programmable code for the state change reason. Note: Currently, the service provides no code for the state change.
- Message
-
- Type: string
The descriptive message for the state change reason.
StepStatus
Description
The execution status details of the cluster step.
Members
- FailureDetails
-
- Type: FailureDetails structure
The details for the step failure including reason, message, and log file path where the root cause was identified.
- State
-
- Type: string
The execution state of the cluster step.
- StateChangeReason
-
- Type: StepStateChangeReason structure
The reason for the step execution status change.
- Timeline
-
- Type: StepTimeline structure
The timeline of the cluster step status over time.
StepSummary
Description
The summary of the cluster step.
Members
- ActionOnFailure
-
- Type: string
The action to take when the cluster step fails. Possible values are TERMINATE_CLUSTER, CANCEL_AND_WAIT, and CONTINUE. TERMINATE_JOB_FLOW is available for backward compatibility.
- Config
-
- Type: HadoopStepConfig structure
The Hadoop job configuration of the cluster step.
- Id
-
- Type: string
The identifier of the cluster step.
- Name
-
- Type: string
The name of the cluster step.
- Status
-
- Type: StepStatus structure
The current execution status details of the cluster step.
StepTimeline
Description
The timeline of the cluster step lifecycle.
Members
- CreationDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time when the cluster step was created.
- EndDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time when the cluster step execution completed or failed.
- StartDateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time when the cluster step execution started.
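A StepTimeline can be used to work out how long a step ran. This sketch assumes the timestamp members arrive as objects implementing DateTimeInterface and that StartDateTime and EndDateTime are absent while a step is pending or still running; stepDurationSeconds is a hypothetical helper and the sample timeline is made up.

```php
<?php
/**
 * Hypothetical helper (not an SDK function): returns the run time of a step
 * in seconds, or null when the step has not started or not finished yet.
 */
function stepDurationSeconds(array $timeline): ?int
{
    if (!isset($timeline['StartDateTime'], $timeline['EndDateTime'])) {
        return null;
    }

    return $timeline['EndDateTime']->getTimestamp()
        - $timeline['StartDateTime']->getTimestamp();
}

// Illustrative timeline, not real service output.
$timeline = [
    'CreationDateTime' => new DateTimeImmutable('2024-01-01T10:00:00Z'),
    'StartDateTime'    => new DateTimeImmutable('2024-01-01T10:01:00Z'),
    'EndDateTime'      => new DateTimeImmutable('2024-01-01T10:06:30Z'),
];

$seconds = stepDurationSeconds($timeline); // 330
```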
Studio
Description
Details for an Amazon EMR Studio including ID, creation time, name, and so on.
Members
- AuthMode
-
- Type: string
Specifies whether the Amazon EMR Studio authenticates users with IAM or IAM Identity Center.
- CreationTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The time the Amazon EMR Studio was created.
- DefaultS3Location
-
- Type: string
The Amazon S3 location to back up Amazon EMR Studio Workspaces and notebook files.
- Description
-
- Type: string
The detailed description of the Amazon EMR Studio.
- EncryptionKeyArn
-
- Type: string
The KMS key identifier (ARN) used to encrypt Amazon EMR Studio workspace and notebook files when backed up to Amazon S3.
- EngineSecurityGroupId
-
- Type: string
The ID of the Engine security group associated with the Amazon EMR Studio. The Engine security group allows inbound network traffic from resources in the Workspace security group.
- IdcInstanceArn
-
- Type: string
The ARN of the IAM Identity Center instance the Studio application belongs to.
- IdcUserAssignment
-
- Type: string
Indicates whether the Studio has REQUIRED or OPTIONAL IAM Identity Center user assignment. If the value is set to REQUIRED, users must be explicitly assigned to the Studio application to access the Studio.
- IdpAuthUrl
-
- Type: string
Your identity provider's authentication endpoint. Amazon EMR Studio redirects federated users to this endpoint for authentication when logging in to a Studio with the Studio URL.
- IdpRelayStateParameterName
-
- Type: string
The name of your identity provider's RelayState parameter.
- Name
-
- Type: string
The name of the Amazon EMR Studio.
- ServiceRole
-
- Type: string
The name of the IAM role assumed by the Amazon EMR Studio.
- StudioArn
-
- Type: string
The Amazon Resource Name (ARN) of the Amazon EMR Studio.
- StudioId
-
- Type: string
The ID of the Amazon EMR Studio.
- SubnetIds
-
- Type: Array of strings
The list of IDs of the subnets associated with the Amazon EMR Studio.
- Tags
-
- Type: Array of Tag structures
A list of tags associated with the Amazon EMR Studio.
- TrustedIdentityPropagationEnabled
-
- Type: boolean
Indicates whether the Studio has Trusted identity propagation enabled. The default value is false.
- Url
-
- Type: string
The unique access URL of the Amazon EMR Studio.
- UserRole
-
- Type: string
The name of the IAM role assumed by users logged in to the Amazon EMR Studio. A Studio only requires a UserRole when you use IAM authentication.
- VpcId
-
- Type: string
The ID of the VPC associated with the Amazon EMR Studio.
- WorkspaceSecurityGroupId
-
- Type: string
The ID of the Workspace security group associated with the Amazon EMR Studio. The Workspace security group allows outbound network traffic to resources in the Engine security group and to the internet.
StudioSummary
Description
Details for an Amazon EMR Studio, including ID, Name, VPC, and Description. To fetch additional details such as subnets, IAM roles, security groups, and tags for the Studio, use the DescribeStudio API.
Members
- AuthMode
-
- Type: string
Specifies whether the Studio authenticates users using IAM or IAM Identity Center.
- CreationTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The time when the Amazon EMR Studio was created.
- Description
-
- Type: string
The detailed description of the Amazon EMR Studio.
- Name
-
- Type: string
The name of the Amazon EMR Studio.
- StudioId
-
- Type: string
The ID of the Amazon EMR Studio.
- Url
-
- Type: string
The unique access URL of the Amazon EMR Studio.
- VpcId
-
- Type: string
The ID of the Virtual Private Cloud (Amazon VPC) associated with the Amazon EMR Studio.
SupportedInstanceType
Description
An instance type that the specified Amazon EMR release supports.
Members
- Architecture
-
- Type: string
The CPU architecture, for example X86_64 or AARCH64.
- EbsOptimizedAvailable
-
- Type: boolean
Indicates whether the SupportedInstanceType supports Amazon EBS optimization.
- EbsOptimizedByDefault
-
- Type: boolean
Indicates whether the SupportedInstanceType uses Amazon EBS optimization by default.
- EbsStorageOnly
-
- Type: boolean
Indicates whether the SupportedInstanceType only supports Amazon EBS.
- InstanceFamilyId
-
- Type: string
The Amazon EC2 family and generation for the SupportedInstanceType.
- Is64BitsOnly
-
- Type: boolean
Indicates whether the SupportedInstanceType only supports 64-bit architecture.
- MemoryGB
-
- Type: float
The amount of memory that is available to Amazon EMR from the SupportedInstanceType. The kernel and hypervisor software consume some memory, so this value might be lower than the overall memory for the instance type.
- NumberOfDisks
-
- Type: int
Number of disks for the SupportedInstanceType. This value is 0 for Amazon EBS-only instance types.
- StorageGB
-
- Type: int
StorageGB represents the storage capacity of the SupportedInstanceType. This value is 0 for Amazon EBS-only instance types.
- Type
-
- Type: string
The Amazon EC2 instance type, for example m5.xlarge, of the SupportedInstanceType.
- VCPU
-
- Type: int
The number of vCPUs available for the SupportedInstanceType.
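A list of SupportedInstanceType structures can be filtered on the members described above. The following is a minimal sketch: ebsOnlyTypes is a hypothetical helper, and the sample entries are illustrative values rather than real service output.

```php
<?php
/**
 * Hypothetical helper (not an SDK function): returns the Type of every
 * entry that matches the given architecture and is Amazon EBS-only.
 */
function ebsOnlyTypes(array $supportedInstanceTypes, string $architecture): array
{
    $matches = array_filter(
        $supportedInstanceTypes,
        function (array $type) use ($architecture) {
            return ($type['Architecture'] ?? null) === $architecture
                && !empty($type['EbsStorageOnly']);
        }
    );

    // array_filter keeps the original keys, so reindex the result.
    return array_values(array_map(function (array $type) {
        return $type['Type'];
    }, $matches));
}

// Illustrative entries only; EBS-only types report 0 disks and 0 GB of storage.
$sample = [
    ['Type' => 'm5.xlarge',  'Architecture' => 'X86_64',  'EbsStorageOnly' => true,  'NumberOfDisks' => 0, 'StorageGB' => 0],
    ['Type' => 'm6g.xlarge', 'Architecture' => 'AARCH64', 'EbsStorageOnly' => true,  'NumberOfDisks' => 0, 'StorageGB' => 0],
    ['Type' => 'm5d.xlarge', 'Architecture' => 'X86_64',  'EbsStorageOnly' => false, 'NumberOfDisks' => 1, 'StorageGB' => 150],
];

$names = ebsOnlyTypes($sample, 'X86_64'); // ['m5.xlarge']
```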
SupportedProductConfig
Description
The list of supported product configurations that allow user-supplied arguments. Amazon EMR accepts these arguments and forwards them to the corresponding installation script as bootstrap action arguments.
Members
- Args
-
- Type: Array of strings
The list of user-supplied arguments.
- Name
-
- Type: string
The name of the product configuration.
Tag
Description
A key-value pair containing user-defined metadata that you can associate with an Amazon EMR resource. Tags make it easier to associate clusters in various ways, such as grouping clusters to track your Amazon EMR resource allocation costs. For more information, see Tag Clusters.
Members
- Key
-
- Type: string
A user-defined key, which is the minimum required information for a valid tag. For more information, see Tag.
- Value
-
- Type: string
A user-defined value, which is optional in a tag. For more information, see Tag Clusters.
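Because each Tag is a structure with a Key and an optional Value, a plain PHP map has to be converted before it can be used as a list of Tag structures. The following is a minimal sketch; toEmrTags is a hypothetical helper and the tag names are placeholders.

```php
<?php
/**
 * Hypothetical helper (not an SDK function): converts a key => value map
 * into a list of Tag structures. Value is omitted when it is null or empty,
 * since only Key is required for a valid tag.
 */
function toEmrTags(array $map): array
{
    $tags = [];
    foreach ($map as $key => $value) {
        $tag = ['Key' => (string) $key];
        if ($value !== null && $value !== '') {
            $tag['Value'] = (string) $value;
        }
        $tags[] = $tag;
    }
    return $tags;
}

$tags = toEmrTags(['team' => 'analytics', 'temporary' => null]);
// [['Key' => 'team', 'Value' => 'analytics'], ['Key' => 'temporary']]
```

The resulting list has the shape expected wherever an operation takes an array of Tag structures.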
UsernamePassword
Description
The username and password that you use to connect to cluster endpoints.
Members
- Password
-
- Type: string
The password associated with the temporary credentials that you use to connect to cluster endpoints.
- Username
-
- Type: string
The username associated with the temporary credentials that you use to connect to cluster endpoints.
VolumeSpecification
Description
EBS volume specifications such as volume type, IOPS, size (GiB) and throughput (MiB/s) that are requested for the EBS volume attached to an Amazon EC2 instance in the cluster.
Members
- Iops
-
- Type: int
The number of I/O operations per second (IOPS) that the volume supports.
- SizeInGB
-
- Required: Yes
- Type: int
The volume size, in gibibytes (GiB). This can be a number from 1 to 1024. If the volume type is EBS-optimized, the minimum value is 10.
- Throughput
-
- Type: int
The throughput, in mebibytes per second (MiB/s). This optional parameter can be a number from 125 to 1000 and is valid only for gp3 volumes.
- VolumeType
-
- Required: Yes
- Type: string
The volume type. Volume types supported are gp3, gp2, io1, st1, sc1, and standard.
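The ranges given for VolumeSpecification can be checked before a request is built. The following is a minimal sketch; buildVolumeSpec is a hypothetical helper that enforces only the volume types, the 1 to 1024 GiB size range, and the gp3-only throughput range stated above. It does not attempt the EBS-optimized minimum size rule.

```php
<?php
/**
 * Hypothetical helper (not an SDK function): builds a VolumeSpecification
 * array after checking the documented constraints.
 */
function buildVolumeSpec(
    string $volumeType,
    int $sizeInGB,
    ?int $iops = null,
    ?int $throughput = null
): array {
    $types = ['gp3', 'gp2', 'io1', 'st1', 'sc1', 'standard'];
    if (!in_array($volumeType, $types, true)) {
        throw new InvalidArgumentException("Unsupported VolumeType: $volumeType");
    }
    if ($sizeInGB < 1 || $sizeInGB > 1024) {
        throw new InvalidArgumentException('SizeInGB must be between 1 and 1024');
    }

    $spec = ['VolumeType' => $volumeType, 'SizeInGB' => $sizeInGB];

    if ($iops !== null) {
        $spec['Iops'] = $iops;
    }
    if ($throughput !== null) {
        // Throughput is valid only for gp3 volumes, in the range 125 to 1000 MiB/s.
        if ($volumeType !== 'gp3') {
            throw new InvalidArgumentException('Throughput is valid only for gp3 volumes');
        }
        if ($throughput < 125 || $throughput > 1000) {
            throw new InvalidArgumentException('Throughput must be between 125 and 1000');
        }
        $spec['Throughput'] = $throughput;
    }

    return $spec;
}

$volume = buildVolumeSpec('gp3', 100, 3000, 250);
```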