EmrCreateCluster
-
class aws_cdk.aws_stepfunctions_tasks.EmrCreateCluster(scope, id, *, instances, name, additional_info=None, applications=None, auto_scaling_role=None, bootstrap_actions=None, cluster_role=None, configurations=None, custom_ami_id=None, ebs_root_volume_size=None, kerberos_attributes=None, log_uri=None, release_label=None, scale_down_behavior=None, security_configuration=None, service_role=None, step_concurrency_level=None, tags=None, visible_to_all_users=None, comment=None, heartbeat=None, input_path=None, integration_pattern=None, output_path=None, result_path=None, result_selector=None, timeout=None)
Bases: aws_cdk.aws_stepfunctions.TaskStateBase
A Step Functions Task to create an EMR Cluster.
The ClusterConfiguration is defined as Parameters in the state machine definition.
OUTPUT: the ClusterId.
Example:
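# Imports assumed by this snippet; `self` is the enclosing construct (for example a Stack).
from aws_cdk import aws_iam as iam
from aws_cdk import aws_stepfunctions as sfn
from aws_cdk import aws_stepfunctions_tasks as tasks

# Role assumed by the EC2 instances of the cluster.
cluster_role = iam.Role(self, "ClusterRole",
    assumed_by=iam.ServicePrincipal("ec2.amazonaws.com")
)

# Role assumed by the Amazon EMR service.
service_role = iam.Role(self, "ServiceRole",
    assumed_by=iam.ServicePrincipal("elasticmapreduce.amazonaws.com")
)

# Role used for automatic scaling policies.
auto_scaling_role = iam.Role(self, "AutoScalingRole",
    assumed_by=iam.ServicePrincipal("elasticmapreduce.amazonaws.com")
)

# Also allow Application Auto Scaling to assume the auto scaling role.
auto_scaling_role.assume_role_policy.add_statements(
    iam.PolicyStatement(
        effect=iam.Effect.ALLOW,
        principals=[
            iam.ServicePrincipal("application-autoscaling.amazonaws.com")
        ],
        actions=["sts:AssumeRole"]
    ))

tasks.EmrCreateCluster(self, "Create Cluster",
    instances=tasks.EmrCreateCluster.InstancesConfigProperty(),
    cluster_role=cluster_role,
    name=sfn.TaskInput.from_json_path_at("$.ClusterName").value,
    service_role=service_role,
    auto_scaling_role=auto_scaling_role
)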
- Parameters
scope (Construct) –
id (str) –
instances (InstancesConfigProperty) – A specification of the number and type of Amazon EC2 instances.
name (str) – The Name of the Cluster.
additional_info (Optional[str]) – A JSON string for selecting additional features. Default: - None
applications (Optional[Sequence[ApplicationConfigProperty]]) – A case-insensitive list of applications for Amazon EMR to install and configure when launching the cluster. Default: - EMR selected default
auto_scaling_role (Optional[IRole]) – An IAM role for automatic scaling policies. Default: - A role will be created.
bootstrap_actions (Optional[Sequence[BootstrapActionConfigProperty]]) – A list of bootstrap actions to run before Hadoop starts on the cluster nodes. Default: - None
cluster_role (Optional[IRole]) – Also called instance profile and EC2 role. An IAM role for an EMR cluster. The EC2 instances of the cluster assume this role. This attribute has been renamed from jobFlowRole to clusterRole to align with other EMR/StepFunction integration parameters. Default: - A Role will be created
configurations (Optional[Sequence[ConfigurationProperty]]) – The list of configurations supplied for the EMR cluster you are creating. Default: - None
custom_ami_id (Optional[str]) – The ID of a custom Amazon EBS-backed Linux AMI. Default: - None
ebs_root_volume_size (Optional[Size]) – The size of the EBS root device volume of the Linux AMI that is used for each EC2 instance. Default: - EMR selected default
kerberos_attributes (Optional[KerberosAttributesProperty]) – Attributes for Kerberos configuration when Kerberos authentication is enabled using a security configuration. Default: - None
log_uri (Optional[str]) – The location in Amazon S3 to write the log files of the job flow. Default: - None
release_label (Optional[str]) – The Amazon EMR release label, which determines the version of open-source application packages installed on the cluster. Default: - EMR selected default
scale_down_behavior (Optional[EmrClusterScaleDownBehavior]) – Specifies the way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized. Default: - EMR selected default
security_configuration (Optional[str]) – The name of a security configuration to apply to the cluster. Default: - None
service_role (Optional[IRole]) – The IAM role that will be assumed by the Amazon EMR service to access AWS resources on your behalf. Default: - A role will be created that Amazon EMR service can assume.
step_concurrency_level (Union[int, float, None]) – Specifies the step concurrency level to allow multiple steps to run in parallel. Requires EMR release label 5.28.0 or above. Must be in range [1, 256]. Default: 1 - no step concurrency allowed
tags (Optional[Mapping[str, str]]) – A list of tags to associate with a cluster and propagate to Amazon EC2 instances. Default: - None
visible_to_all_users (Optional[bool]) – A value of true indicates that all IAM users in the AWS account can perform cluster actions if they have the proper IAM policy permissions. Default: true
comment (Optional[str]) – An optional description for this state. Default: - No comment
heartbeat (Optional[Duration]) – Timeout for the heartbeat. Default: - None
input_path (Optional[str]) – JSONPath expression to select part of the state to be the input to this state. May also be the special value JsonPath.DISCARD, which will cause the effective input to be the empty object {}. Default: - The entire task input (JSON path ‘$’)
integration_pattern (Optional[IntegrationPattern]) – AWS Step Functions integrates with services directly in the Amazon States Language. You can control these AWS services using service integration patterns. Default: - IntegrationPattern.REQUEST_RESPONSE for most tasks. IntegrationPattern.RUN_JOB for the following exceptions: BatchSubmitJob, EmrAddStep, EmrCreateCluster, EmrTerminationCluster, and EmrContainersStartJobRun.
output_path (Optional[str]) – JSONPath expression to select a portion of the state output to pass to the next state. May also be the special value JsonPath.DISCARD, which will cause the effective output to be the empty object {}. Default: - The entire JSON node determined by the state input, the task result, and resultPath is passed to the next state (JSON path ‘$’)
result_path (Optional[str]) – JSONPath expression to indicate where to inject the state’s output. May also be the special value JsonPath.DISCARD, which will cause the state’s input to become its output. Default: - Replaces the entire input with the result (JSON path ‘$’)
result_selector (Optional[Mapping[str, Any]]) – The JSON that will replace the state’s raw result and become the effective result before ResultPath is applied. You can use ResultSelector to create a payload with values that are static or selected from the state’s raw result. Default: - None
timeout (Optional[Duration]) – Timeout for the state machine. Default: - None
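The way input_path, result_path and output_path combine can be easier to follow in code. The following is a minimal sketch in plain Python, not CDK or Step Functions code: select, inject, run_state and the DISCARD sentinel are hypothetical names introduced for illustration, only ‘$’ and simple dotted paths are handled, and the cluster ID is a placeholder.

```python
import copy

# Stand-in for JsonPath.DISCARD in this sketch (hypothetical sentinel).
DISCARD = object()


def select(path, doc):
    """Resolve a tiny JSONPath subset: '$' or dotted keys such as '$.a.b'."""
    if path is DISCARD:
        return {}  # DISCARD yields the empty object
    if path == "$":
        return doc
    node = doc
    for key in path[2:].split("."):
        node = node[key]
    return node


def inject(path, doc, value):
    """Place value into doc at path, following the result_path description."""
    if path is DISCARD:
        return doc  # the state's input becomes its output
    if path == "$":
        return value  # the result replaces the entire input
    out = copy.deepcopy(doc)
    node = out
    keys = path[2:].split(".")
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return out


def run_state(state_input, task, input_path="$", result_path="$", output_path="$"):
    """Apply input_path, run the task, then apply result_path and output_path."""
    effective_input = select(input_path, state_input)
    result = task(effective_input)
    combined = inject(result_path, state_input, result)
    return select(output_path, combined)


# Keep the original input and store the task result under "Cluster".
output = run_state(
    {"ClusterName": "demo"},
    lambda _input: {"ClusterId": "j-EXAMPLE"},  # placeholder result
    result_path="$.Cluster",
)
print(output)
```

With the default result_path of ‘$’ the task result replaces the input entirely, so setting a nested result_path is the usual way to keep values such as the cluster name available to later states.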
Methods
-
add_catch(handler, *, errors=None, result_path=None)
Add a recovery handler for this state.
When a particular error occurs, execution will continue at the error handler instead of failing the state machine execution.
- Parameters
handler (IChainable) –
errors (Optional[Sequence[str]]) – Errors to recover from by going to the given state. A list of error strings to retry, which can be either predefined errors (for example Errors.NoChoiceMatched) or a self-defined error. Default: All errors
result_path (Optional[str]) – JSONPath expression to indicate where to inject the error data. May also be the special value DISCARD, which will cause the error data to be discarded. Default: $
- Return type
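As a rough illustration of how a list of catchers selects a handler, here is a simplified sketch in plain Python. It assumes catchers are checked in the order they were added and that "States.ALL" is the wildcard behind the "All errors" default. pick_handler and the handler names are hypothetical, and the sketch ignores the few errors that the wildcard does not cover in the real service.

```python
ALL_ERRORS = "States.ALL"  # wildcard error name in the Amazon States Language


def pick_handler(catchers, error_name):
    """Return the handler of the first catcher matching error_name, else None."""
    for errors, handler in catchers:
        if ALL_ERRORS in errors or error_name in errors:
            return handler
    return None


# Hypothetical handlers: a specific one first, then a catch-all.
catchers = [
    (["States.Timeout"], "NotifyTimeout"),
    ([ALL_ERRORS], "CleanUp"),
]

print(pick_handler(catchers, "States.Timeout"))
print(pick_handler(catchers, "MyCustomError"))
```

Because the first match wins in this model, specific catchers need to be added before a catch-all one.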
-
add_prefix(x)
Add a prefix to the stateId of this state.
- Parameters
x (str) –
- Return type
None
-
add_retry(*, backoff_rate=None, errors=None, interval=None, max_attempts=None)
Add retry configuration for this state.
This controls if and how the execution will be retried if a particular error occurs.
- Parameters
backoff_rate (Union[int, float, None]) – Multiplication for how much longer the wait interval gets on every retry. Default: 2
errors (Optional[Sequence[str]]) – Errors to retry. A list of error strings to retry, which can be either predefined errors (for example Errors.NoChoiceMatched) or a self-defined error. Default: All errors
interval (Optional[Duration]) – How many seconds to wait initially before retrying. Default: Duration.seconds(1)
max_attempts (Union[int, float, None]) – How many times to retry this particular error. May be 0 to disable retry for specific errors (in case you have a catch-all retry policy). Default: 3
- Return type
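To make the interaction of interval, backoff_rate and max_attempts concrete, the sketch below computes the wait before each retry in plain Python. retry_waits is a hypothetical helper, not part of the CDK, and it assumes the wait grows by a factor of backoff_rate after every attempt, starting from interval.

```python
def retry_waits(interval_seconds=1.0, backoff_rate=2.0, max_attempts=3):
    """Return the wait in seconds before each retry attempt."""
    return [interval_seconds * backoff_rate ** attempt
            for attempt in range(max_attempts)]


# Defaults from the parameter list: 1 second, rate 2, 3 attempts.
print(retry_waits())

# A slower schedule: start at 10 seconds and grow by 50% each time.
print(retry_waits(interval_seconds=10, backoff_rate=1.5, max_attempts=3))
```

With max_attempts set to 0 the list is empty, which matches the documented way of disabling retries for specific errors.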
-
bind_to_graph(graph)
Register this state as part of the given graph.
Don’t call this. It will be called automatically when you work with states normally.
- Parameters
graph (StateGraph) –
- Return type
None
-
metric(metric_name, *, account=None, color=None, dimensions=None, dimensions_map=None, label=None, period=None, region=None, statistic=None, unit=None)
Return the given named metric for this Task.
- Parameters
metric_name (str) –
account (Optional[str]) – Account which this metric comes from. Default: - Deployment account.
color (Optional[str]) – The hex color code, prefixed with ‘#’ (e.g. ‘#00ff00’), to use when this metric is rendered on a graph. The Color class has a set of standard colors that can be used here. Default: - Automatic color
dimensions (Optional[Mapping[str, Any]]) – (deprecated) Dimensions of the metric. Default: - No dimensions.
dimensions_map (Optional[Mapping[str, str]]) – Dimensions of the metric. Default: - No dimensions.
label (Optional[str]) – Label for this metric when added to a Graph in a Dashboard. You can use dynamic labels to show summary information about the entire displayed time series in the legend. For example, if you use [max: ${MAX}] MyMetric as the metric label, the maximum value in the visible range will be shown next to the time series name in the graph’s legend. Default: - No label
period (Optional[Duration]) – The period over which the specified statistic is applied. Default: Duration.minutes(5)
region (Optional[str]) – Region which this metric comes from. Default: - Deployment region.
statistic (Optional[str]) – What function to use for aggregating. Can be one of the following: “Minimum” | “min”, “Maximum” | “max”, “Average” | “avg”, “Sum” | “sum”, “SampleCount” | “n”, “pNN.NN”. Default: Average
unit (Optional[Unit]) – Unit used to filter the metric stream. Only refer to datums emitted to the metric stream with the given unit and ignore all others. Only useful when datums are being emitted to the same metric stream under different units. The default is to use all metric datums in the stream, regardless of unit, which is recommended in nearly all cases. CloudWatch does not honor this property for graphs. Default: - All metric datums in the given metric stream
- Default
sum over 5 minutes
- Return type
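The accepted statistic strings listed above can be checked before they are passed to a metric method. The helper below is a hypothetical sketch in plain Python, not the parsing the CDK itself performs: it maps the short aliases to their long names and accepts percentiles only in the pNN or pNN.NN shape shown in the parameter description.

```python
import re

# Short aliases and the long names they stand for.
_ALIASES = {
    "min": "Minimum",
    "max": "Maximum",
    "avg": "Average",
    "sum": "Sum",
    "n": "SampleCount",
}

# Percentiles in the documented pNN.NN shape, for example p90 or p99.9.
_PERCENTILE = re.compile(r"p\d{1,2}(\.\d{1,2})?")


def normalize_statistic(stat):
    """Return the long statistic name, or the percentile string unchanged."""
    if _PERCENTILE.fullmatch(stat):
        return stat
    if stat in _ALIASES:
        return _ALIASES[stat]
    if stat in _ALIASES.values():
        return stat
    raise ValueError(f"unsupported statistic: {stat!r}")


print(normalize_statistic("avg"))
print(normalize_statistic("p99.9"))
```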
-
metric_failed(*, account=None, color=None, dimensions=None, dimensions_map=None, label=None, period=None, region=None, statistic=None, unit=None)
Metric for the number of times this activity fails.
- Parameters
account (Optional[str]) – Account which this metric comes from. Default: - Deployment account.
color (Optional[str]) – The hex color code, prefixed with ‘#’ (e.g. ‘#00ff00’), to use when this metric is rendered on a graph. The Color class has a set of standard colors that can be used here. Default: - Automatic color
dimensions (Optional[Mapping[str, Any]]) – (deprecated) Dimensions of the metric. Default: - No dimensions.
dimensions_map (Optional[Mapping[str, str]]) – Dimensions of the metric. Default: - No dimensions.
label (Optional[str]) – Label for this metric when added to a Graph in a Dashboard. You can use dynamic labels to show summary information about the entire displayed time series in the legend. For example, if you use [max: ${MAX}] MyMetric as the metric label, the maximum value in the visible range will be shown next to the time series name in the graph’s legend. Default: - No label
period (Optional[Duration]) – The period over which the specified statistic is applied. Default: Duration.minutes(5)
region (Optional[str]) – Region which this metric comes from. Default: - Deployment region.
statistic (Optional[str]) – What function to use for aggregating. Can be one of the following: “Minimum” | “min”, “Maximum” | “max”, “Average” | “avg”, “Sum” | “sum”, “SampleCount” | “n”, “pNN.NN”. Default: Average
unit (Optional[Unit]) – Unit used to filter the metric stream. Only refer to datums emitted to the metric stream with the given unit and ignore all others. Only useful when datums are being emitted to the same metric stream under different units. The default is to use all metric datums in the stream, regardless of unit, which is recommended in nearly all cases. CloudWatch does not honor this property for graphs. Default: - All metric datums in the given metric stream
- Default
sum over 5 minutes
- Return type
-
metric_heartbeat_timed_out(*, account=None, color=None, dimensions=None, dimensions_map=None, label=None, period=None, region=None, statistic=None, unit=None)
Metric for the number of times the heartbeat times out for this activity.
- Parameters
account (Optional[str]) – Account which this metric comes from. Default: - Deployment account.
color (Optional[str]) – The hex color code, prefixed with ‘#’ (e.g. ‘#00ff00’), to use when this metric is rendered on a graph. The Color class has a set of standard colors that can be used here. Default: - Automatic color
dimensions (Optional[Mapping[str, Any]]) – (deprecated) Dimensions of the metric. Default: - No dimensions.
dimensions_map (Optional[Mapping[str, str]]) – Dimensions of the metric. Default: - No dimensions.
label (Optional[str]) – Label for this metric when added to a Graph in a Dashboard. You can use dynamic labels to show summary information about the entire displayed time series in the legend. For example, if you use [max: ${MAX}] MyMetric as the metric label, the maximum value in the visible range will be shown next to the time series name in the graph’s legend. Default: - No label
period (Optional[Duration]) – The period over which the specified statistic is applied. Default: Duration.minutes(5)
region (Optional[str]) – Region which this metric comes from. Default: - Deployment region.
statistic (Optional[str]) – What function to use for aggregating. Can be one of the following: “Minimum” | “min”, “Maximum” | “max”, “Average” | “avg”, “Sum” | “sum”, “SampleCount” | “n”, “pNN.NN”. Default: Average
unit (Optional[Unit]) – Unit used to filter the metric stream. Only refer to datums emitted to the metric stream with the given unit and ignore all others. Only useful when datums are being emitted to the same metric stream under different units. The default is to use all metric datums in the stream, regardless of unit, which is recommended in nearly all cases. CloudWatch does not honor this property for graphs. Default: - All metric datums in the given metric stream
- Default
sum over 5 minutes
- Return type
-
metric_run_time(*, account=None, color=None, dimensions=None, dimensions_map=None, label=None, period=None, region=None, statistic=None, unit=None)
The interval, in milliseconds, between the time the Task starts and the time it closes.
- Parameters
account (Optional[str]) – Account which this metric comes from. Default: - Deployment account.
color (Optional[str]) – The hex color code, prefixed with ‘#’ (e.g. ‘#00ff00’), to use when this metric is rendered on a graph. The Color class has a set of standard colors that can be used here. Default: - Automatic color
dimensions (Optional[Mapping[str, Any]]) – (deprecated) Dimensions of the metric. Default: - No dimensions.
dimensions_map (Optional[Mapping[str, str]]) – Dimensions of the metric. Default: - No dimensions.
label (Optional[str]) – Label for this metric when added to a Graph in a Dashboard. You can use dynamic labels to show summary information about the entire displayed time series in the legend. For example, if you use [max: ${MAX}] MyMetric as the metric label, the maximum value in the visible range will be shown next to the time series name in the graph’s legend. Default: - No label
period (Optional[Duration]) – The period over which the specified statistic is applied. Default: Duration.minutes(5)
region (Optional[str]) – Region which this metric comes from. Default: - Deployment region.
statistic (Optional[str]) – What function to use for aggregating. Can be one of the following: “Minimum” | “min”, “Maximum” | “max”, “Average” | “avg”, “Sum” | “sum”, “SampleCount” | “n”, “pNN.NN”. Default: Average
unit (Optional[Unit]) – Unit used to filter the metric stream. Only refer to datums emitted to the metric stream with the given unit and ignore all others. Only useful when datums are being emitted to the same metric stream under different units. The default is to use all metric datums in the stream, regardless of unit, which is recommended in nearly all cases. CloudWatch does not honor this property for graphs. Default: - All metric datums in the given metric stream
- Default
average over 5 minutes
- Return type
-
metric_schedule_time(*, account=None, color=None, dimensions=None, dimensions_map=None, label=None, period=None, region=None, statistic=None, unit=None)
The interval, in milliseconds, for which the activity stays in the schedule state.
- Parameters
account (Optional[str]) – Account which this metric comes from. Default: - Deployment account.
color (Optional[str]) – The hex color code, prefixed with ‘#’ (e.g. ‘#00ff00’), to use when this metric is rendered on a graph. The Color class has a set of standard colors that can be used here. Default: - Automatic color
dimensions (Optional[Mapping[str, Any]]) – (deprecated) Dimensions of the metric. Default: - No dimensions.
dimensions_map (Optional[Mapping[str, str]]) – Dimensions of the metric. Default: - No dimensions.
label (Optional[str]) – Label for this metric when added to a Graph in a Dashboard. You can use dynamic labels to show summary information about the entire displayed time series in the legend. For example, if you use [max: ${MAX}] MyMetric as the metric label, the maximum value in the visible range will be shown next to the time series name in the graph’s legend. Default: - No label
period (Optional[Duration]) – The period over which the specified statistic is applied. Default: Duration.minutes(5)
region (Optional[str]) – Region which this metric comes from. Default: - Deployment region.
statistic (Optional[str]) – What function to use for aggregating. Can be one of the following: “Minimum” | “min”, “Maximum” | “max”, “Average” | “avg”, “Sum” | “sum”, “SampleCount” | “n”, “pNN.NN”. Default: Average
unit (Optional[Unit]) – Unit used to filter the metric stream. Only refer to datums emitted to the metric stream with the given unit and ignore all others. Only useful when datums are being emitted to the same metric stream under different units. The default is to use all metric datums in the stream, regardless of unit, which is recommended in nearly all cases. CloudWatch does not honor this property for graphs. Default: - All metric datums in the given metric stream
- Default
average over 5 minutes
- Return type
-
metric_scheduled(*, account=None, color=None, dimensions=None, dimensions_map=None, label=None, period=None, region=None, statistic=None, unit=None)
Metric for the number of times this activity is scheduled.
- Parameters
account (Optional[str]) – Account which this metric comes from. Default: - Deployment account.
color (Optional[str]) – The hex color code, prefixed with ‘#’ (e.g. ‘#00ff00’), to use when this metric is rendered on a graph. The Color class has a set of standard colors that can be used here. Default: - Automatic color
dimensions (Optional[Mapping[str, Any]]) – (deprecated) Dimensions of the metric. Default: - No dimensions.
dimensions_map (Optional[Mapping[str, str]]) – Dimensions of the metric. Default: - No dimensions.
label (Optional[str]) – Label for this metric when added to a Graph in a Dashboard. You can use dynamic labels to show summary information about the entire displayed time series in the legend. For example, if you use [max: ${MAX}] MyMetric as the metric label, the maximum value in the visible range will be shown next to the time series name in the graph’s legend. Default: - No label
period (Optional[Duration]) – The period over which the specified statistic is applied. Default: Duration.minutes(5)
region (Optional[str]) – Region which this metric comes from. Default: - Deployment region.
statistic (Optional[str]) – What function to use for aggregating. Can be one of the following: “Minimum” | “min”, “Maximum” | “max”, “Average” | “avg”, “Sum” | “sum”, “SampleCount” | “n”, “pNN.NN”. Default: Average
unit (Optional[Unit]) – Unit used to filter the metric stream. Only refer to datums emitted to the metric stream with the given unit and ignore all others. Only useful when datums are being emitted to the same metric stream under different units. The default is to use all metric datums in the stream, regardless of unit, which is recommended in nearly all cases. CloudWatch does not honor this property for graphs. Default: - All metric datums in the given metric stream
- Default
sum over 5 minutes
- Return type
-
metric_started(*, account=None, color=None, dimensions=None, dimensions_map=None, label=None, period=None, region=None, statistic=None, unit=None)
Metric for the number of times this activity is started.
- Parameters
account (Optional[str]) – Account which this metric comes from. Default: - Deployment account.
color (Optional[str]) – The hex color code, prefixed with ‘#’ (e.g. ‘#00ff00’), to use when this metric is rendered on a graph. The Color class has a set of standard colors that can be used here. Default: - Automatic color
dimensions (Optional[Mapping[str, Any]]) – (deprecated) Dimensions of the metric. Default: - No dimensions.
dimensions_map (Optional[Mapping[str, str]]) – Dimensions of the metric. Default: - No dimensions.
label (Optional[str]) – Label for this metric when added to a Graph in a Dashboard. You can use dynamic labels to show summary information about the entire displayed time series in the legend. For example, if you use [max: ${MAX}] MyMetric as the metric label, the maximum value in the visible range will be shown next to the time series name in the graph’s legend. Default: - No label
period (Optional[Duration]) – The period over which the specified statistic is applied. Default: Duration.minutes(5)
region (Optional[str]) – Region which this metric comes from. Default: - Deployment region.
statistic (Optional[str]) – What function to use for aggregating. Can be one of the following: “Minimum” | “min”, “Maximum” | “max”, “Average” | “avg”, “Sum” | “sum”, “SampleCount” | “n”, “pNN.NN”. Default: Average
unit (Optional[Unit]) – Unit used to filter the metric stream. Only refer to datums emitted to the metric stream with the given unit and ignore all others. Only useful when datums are being emitted to the same metric stream under different units. The default is to use all metric datums in the stream, regardless of unit, which is recommended in nearly all cases. CloudWatch does not honor this property for graphs. Default: - All metric datums in the given metric stream
- Default
sum over 5 minutes
- Return type
-
metric_succeeded(*, account=None, color=None, dimensions=None, dimensions_map=None, label=None, period=None, region=None, statistic=None, unit=None)
Metric for the number of times this activity succeeds.
- Parameters
account (Optional[str]) – Account which this metric comes from. Default: - Deployment account.
color (Optional[str]) – The hex color code, prefixed with ‘#’ (e.g. ‘#00ff00’), to use when this metric is rendered on a graph. The Color class has a set of standard colors that can be used here. Default: - Automatic color
dimensions (Optional[Mapping[str, Any]]) – (deprecated) Dimensions of the metric. Default: - No dimensions.
dimensions_map (Optional[Mapping[str, str]]) – Dimensions of the metric. Default: - No dimensions.
label (Optional[str]) – Label for this metric when added to a Graph in a Dashboard. You can use dynamic labels to show summary information about the entire displayed time series in the legend. For example, if you use [max: ${MAX}] MyMetric as the metric label, the maximum value in the visible range will be shown next to the time series name in the graph’s legend. Default: - No label
period (Optional[Duration]) – The period over which the specified statistic is applied. Default: Duration.minutes(5)
region (Optional[str]) – Region which this metric comes from. Default: - Deployment region.
statistic (Optional[str]) – What function to use for aggregating. Can be one of the following: “Minimum” | “min”, “Maximum” | “max”, “Average” | “avg”, “Sum” | “sum”, “SampleCount” | “n”, “pNN.NN”. Default: Average
unit (Optional[Unit]) – Unit used to filter the metric stream. Only refer to datums emitted to the metric stream with the given unit and ignore all others. Only useful when datums are being emitted to the same metric stream under different units. The default is to use all metric datums in the stream, regardless of unit, which is recommended in nearly all cases. CloudWatch does not honor this property for graphs. Default: - All metric datums in the given metric stream
- Default
sum over 5 minutes
- Return type
-
metric_time(*, account=None, color=None, dimensions=None, dimensions_map=None, label=None, period=None, region=None, statistic=None, unit=None)
The interval, in milliseconds, between the time the activity is scheduled and the time it closes.
- Parameters
account (Optional[str]) – Account which this metric comes from. Default: - Deployment account.
color (Optional[str]) – The hex color code, prefixed with ‘#’ (e.g. ‘#00ff00’), to use when this metric is rendered on a graph. The Color class has a set of standard colors that can be used here. Default: - Automatic color
dimensions (Optional[Mapping[str, Any]]) – (deprecated) Dimensions of the metric. Default: - No dimensions.
dimensions_map (Optional[Mapping[str, str]]) – Dimensions of the metric. Default: - No dimensions.
label (Optional[str]) – Label for this metric when added to a Graph in a Dashboard. You can use dynamic labels to show summary information about the entire displayed time series in the legend. For example, if you use [max: ${MAX}] MyMetric as the metric label, the maximum value in the visible range will be shown next to the time series name in the graph’s legend. Default: - No label
period (Optional[Duration]) – The period over which the specified statistic is applied. Default: Duration.minutes(5)
region (Optional[str]) – Region which this metric comes from. Default: - Deployment region.
statistic (Optional[str]) – What function to use for aggregating. Can be one of the following: “Minimum” | “min”, “Maximum” | “max”, “Average” | “avg”, “Sum” | “sum”, “SampleCount” | “n”, “pNN.NN”. Default: Average
unit (Optional[Unit]) – Unit used to filter the metric stream. Only refer to datums emitted to the metric stream with the given unit and ignore all others. Only useful when datums are being emitted to the same metric stream under different units. The default is to use all metric datums in the stream, regardless of unit, which is recommended in nearly all cases. CloudWatch does not honor this property for graphs. Default: - All metric datums in the given metric stream
- Default
average over 5 minutes
- Return type
-
metric_timed_out(*, account=None, color=None, dimensions=None, dimensions_map=None, label=None, period=None, region=None, statistic=None, unit=None)
Metric for the number of times this activity times out.
- Parameters
account (Optional[str]) – Account which this metric comes from. Default: - Deployment account.
color (Optional[str]) – The hex color code, prefixed with ‘#’ (e.g. ‘#00ff00’), to use when this metric is rendered on a graph. The Color class has a set of standard colors that can be used here. Default: - Automatic color
dimensions (Optional[Mapping[str, Any]]) – (deprecated) Dimensions of the metric. Default: - No dimensions.
dimensions_map (Optional[Mapping[str, str]]) – Dimensions of the metric. Default: - No dimensions.
label (Optional[str]) – Label for this metric when added to a Graph in a Dashboard. You can use dynamic labels to show summary information about the entire displayed time series in the legend. For example, if you use [max: ${MAX}] MyMetric as the metric label, the maximum value in the visible range will be shown next to the time series name in the graph’s legend. Default: - No label
period (Optional[Duration]) – The period over which the specified statistic is applied. Default: Duration.minutes(5)
region (Optional[str]) – Region which this metric comes from. Default: - Deployment region.
statistic (Optional[str]) – What function to use for aggregating. Can be one of the following: “Minimum” | “min”, “Maximum” | “max”, “Average” | “avg”, “Sum” | “sum”, “SampleCount” | “n”, “pNN.NN”. Default: Average
unit (Optional[Unit]) – Unit used to filter the metric stream. Only refer to datums emitted to the metric stream with the given unit and ignore all others. Only useful when datums are being emitted to the same metric stream under different units. The default is to use all metric datums in the stream, regardless of unit, which is recommended in nearly all cases. CloudWatch does not honor this property for graphs. Default: - All metric datums in the given metric stream
- Default
sum over 5 minutes
- Return type
-
next(next)
Continue normal execution with the given state.
- Parameters
next (IChainable) –
- Return type
-
to_state_json()
Return the Amazon States Language object for this state.
- Return type
Mapping[Any, Any]
-
to_string()
Returns a string representation of this construct.
- Return type
str
Attributes
-
auto_scaling_role
The autoscaling role for the EMR Cluster.
Only available after task has been added to a state machine.
- Return type
-
cluster_role
The instance role for the EMR Cluster.
Only available after task has been added to a state machine.
- Return type
-
id
Descriptive identifier for this chainable.
- Return type
str
-
node
The construct tree node associated with this construct.
- Return type
-
service_role
The service role for the EMR Cluster.
Only available after task has been added to a state machine.
- Return type
-
state_id
Tokenized string that evaluates to the state’s ID.
- Return type
str
Static Methods
-
classmethod filter_nextables(states)
Return only the states that allow chaining from an array of states.
-
classmethod find_reachable_end_states(start, *, include_error_handlers=None)
Find the set of end states reachable through transitions from the given start state.
-
classmethod find_reachable_states(start, *, include_error_handlers=None)
Find the set of states reachable through transitions from the given start state.
This does not retrieve states from within sub-graphs, such as states within a Parallel state’s branch.
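The reachability search described for find_reachable_states and find_reachable_end_states can be pictured as a breadth-first traversal. The sketch below is plain Python over a hypothetical dictionary of transitions, not the CDK implementation. The state names, the error_handlers mapping and both function names are assumptions made for illustration, and an end state is taken to be one with no outgoing normal transition.

```python
from collections import deque


def reachable_states(start, transitions, error_handlers=None,
                     include_error_handlers=False):
    """Return every state reachable from start, including start itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        targets = list(transitions.get(state, []))
        if include_error_handlers and error_handlers:
            targets += error_handlers.get(state, [])
        for target in targets:
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def reachable_end_states(start, transitions, error_handlers=None,
                         include_error_handlers=False):
    """Return the reachable states that have no outgoing normal transition."""
    found = reachable_states(start, transitions, error_handlers,
                             include_error_handlers)
    return {state for state in found if not transitions.get(state)}


# Hypothetical state machine: create a cluster, add a step, terminate.
transitions = {"Create": ["AddStep"], "AddStep": ["Terminate"], "Terminate": []}
error_handlers = {"AddStep": ["CleanUp"]}

print(sorted(reachable_states("Create", transitions)))
print(sorted(reachable_end_states("Create", transitions, error_handlers,
                                  include_error_handlers=True)))
```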
-
classmethod is_construct(x)
Return whether the given object is a Construct.
- Parameters
x (Any) –
- Return type
bool
-
classmethod prefix_states(root, prefix)
Add a prefix to the stateId of all States found in a construct tree.
- Parameters
root (IConstruct) –
prefix (str) –
- Return type
None
ApplicationConfigProperty
-
class EmrCreateCluster.ApplicationConfigProperty(*, name, additional_info=None, args=None, version=None)
Bases: object
Properties for the EMR Cluster Applications.
Applies to Amazon EMR releases 4.0 and later. A case-insensitive list of applications for Amazon EMR to install and configure when launching the cluster.
See the RunJobFlow API for complete documentation on input parameters
- Parameters
name (str) – The name of the application.
additional_info (Optional[Mapping[str, str]]) – This option is for advanced users only. This is meta information about third-party applications that third-party vendors use for testing purposes. Default: No additionalInfo
args (Optional[Sequence[str]]) – Arguments for Amazon EMR to pass to the application. Default: No args
version (Optional[str]) – The version of the application. Default: No version
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_Application.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks

application_config_property = stepfunctions_tasks.EmrCreateCluster.ApplicationConfigProperty(
    name="name",

    # the properties below are optional
    additional_info={
        "additional_info_key": "additionalInfo"
    },
    args=["args"],
    version="version"
)
Attributes
-
additional_info
¶ This option is for advanced users only.
This is meta information about third-party applications that third-party vendors use for testing purposes.
- Default
No additionalInfo
- Return type
Optional[Mapping[str, str]]
-
args
¶ Arguments for Amazon EMR to pass to the application.
- Default
No args
- Return type
Optional[List[str]]
-
name
¶ The name of the application.
- Return type
str
-
version
¶ The version of the application.
- Default
No version
- Return type
Optional[str]
AutoScalingPolicyProperty¶
-
class
EmrCreateCluster.
AutoScalingPolicyProperty
(*, constraints, rules)¶ Bases:
object
An automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster.
- Parameters
constraints (ScalingConstraintsProperty) – The upper and lower EC2 instance limits for an automatic scaling policy. Automatic scaling activity will not cause an instance group to grow above or shrink below these limits.
rules (Sequence[ScalingRuleProperty]) – The scale-in and scale-out rules that comprise the automatic scaling policy.
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_AutoScalingPolicy.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks
import aws_cdk.core as cdk

auto_scaling_policy_property = stepfunctions_tasks.EmrCreateCluster.AutoScalingPolicyProperty(
    constraints=stepfunctions_tasks.EmrCreateCluster.ScalingConstraintsProperty(
        max_capacity=123,
        min_capacity=123
    ),
    rules=[stepfunctions_tasks.EmrCreateCluster.ScalingRuleProperty(
        action=stepfunctions_tasks.EmrCreateCluster.ScalingActionProperty(
            simple_scaling_policy_configuration=stepfunctions_tasks.EmrCreateCluster.SimpleScalingPolicyConfigurationProperty(
                scaling_adjustment=123,

                # the properties below are optional
                adjustment_type=stepfunctions_tasks.EmrCreateCluster.ScalingAdjustmentType.CHANGE_IN_CAPACITY,
                cool_down=123
            ),

            # the properties below are optional
            market=stepfunctions_tasks.EmrCreateCluster.InstanceMarket.ON_DEMAND
        ),
        name="name",
        trigger=stepfunctions_tasks.EmrCreateCluster.ScalingTriggerProperty(
            cloud_watch_alarm_definition=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmDefinitionProperty(
                comparison_operator=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmComparisonOperator.GREATER_THAN_OR_EQUAL,
                metric_name="metricName",
                period=cdk.Duration.minutes(30),

                # the properties below are optional
                dimensions=[stepfunctions_tasks.EmrCreateCluster.MetricDimensionProperty(
                    key="key",
                    value="value"
                )],
                evaluation_periods=123,
                namespace="namespace",
                statistic=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmStatistic.SAMPLE_COUNT,
                threshold=123,
                unit=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmUnit.NONE
            )
        ),

        # the properties below are optional
        description="description"
    )]
)
Attributes
-
constraints
¶ The upper and lower EC2 instance limits for an automatic scaling policy.
Automatic scaling activity will not cause an instance group to grow above or shrink below these limits.
- Return type
ScalingConstraintsProperty
-
rules
¶ The scale-in and scale-out rules that comprise the automatic scaling policy.
- Return type
List[ScalingRuleProperty]
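The effect of the constraints on a scaling rule can be sketched as a clamp. This is a simplified plain-Python model, assuming a CHANGE_IN_CAPACITY adjustment; the function name and its arguments are hypothetical and do not correspond to any CDK or EMR call.

```python
def clamp_capacity(current, adjustment, min_capacity, max_capacity):
    """Apply a capacity change, then keep the result inside [min_capacity, max_capacity]."""
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")
    return max(min_capacity, min(max_capacity, current + adjustment))


# Illustrative values only: a group of 4 instances constrained to the range 2..6.
scaled_out = clamp_capacity(current=4, adjustment=3, min_capacity=2, max_capacity=6)
scaled_in = clamp_capacity(current=4, adjustment=-5, min_capacity=2, max_capacity=6)
```

A scale-out rule that asks for more than the maximum stops at the maximum, and a scale-in rule that asks for less than the minimum stops at the minimum.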
BootstrapActionConfigProperty¶
-
class
EmrCreateCluster.
BootstrapActionConfigProperty
(*, name, script_bootstrap_action)¶ Bases:
object
Configuration of a bootstrap action.
See the RunJobFlow API for complete documentation on input parameters.
- Parameters
name (str) – The name of the bootstrap action.
script_bootstrap_action (ScriptBootstrapActionConfigProperty) – The script run by the bootstrap action.
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_BootstrapActionConfig.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks

bootstrap_action_config_property = stepfunctions_tasks.EmrCreateCluster.BootstrapActionConfigProperty(
    name="name",
    script_bootstrap_action=stepfunctions_tasks.EmrCreateCluster.ScriptBootstrapActionConfigProperty(
        path="path",

        # the properties below are optional
        args=["args"]
    )
)
Attributes
-
name
¶ The name of the bootstrap action.
- Return type
str
-
script_bootstrap_action
¶ The script run by the bootstrap action.
- Return type
ScriptBootstrapActionConfigProperty
CloudWatchAlarmComparisonOperator¶
CloudWatchAlarmDefinitionProperty¶
-
class
EmrCreateCluster.
CloudWatchAlarmDefinitionProperty
(*, comparison_operator, metric_name, period, dimensions=None, evaluation_periods=None, namespace=None, statistic=None, threshold=None, unit=None)¶ Bases:
object
The definition of a CloudWatch metric alarm, which determines when an automatic scaling activity is triggered.
When the defined alarm conditions are satisfied, scaling activity begins.
- Parameters
comparison_operator (CloudWatchAlarmComparisonOperator) – Determines how the metric specified by MetricName is compared to the value specified by Threshold.
metric_name (str) – The name of the CloudWatch metric that is watched to determine an alarm condition.
period (Duration) – The period, in seconds, over which the statistic is applied. EMR CloudWatch metrics are emitted every five minutes (300 seconds), so if an EMR CloudWatch metric is specified, specify 300.
dimensions (Optional[Sequence[MetricDimensionProperty]]) – A CloudWatch metric dimension. Default: - No dimensions
evaluation_periods (Union[int, float, None]) – The number of periods, in five-minute increments, during which the alarm condition must exist before the alarm triggers automatic scaling activity. Default: 1
namespace (Optional[str]) – The namespace for the CloudWatch metric. Default: ‘AWS/ElasticMapReduce’
statistic (Optional[CloudWatchAlarmStatistic]) – The statistic to apply to the metric associated with the alarm. Default: CloudWatchAlarmStatistic.AVERAGE
threshold (Union[int, float, None]) – The value against which the specified statistic is compared. Default: - None
unit (Optional[CloudWatchAlarmUnit]) – The unit of measure associated with the CloudWatch metric being watched. The value specified for Unit must correspond to the units specified in the CloudWatch metric. Default: CloudWatchAlarmUnit.NONE
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_CloudWatchAlarmDefinition.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks
import aws_cdk.core as cdk

cloud_watch_alarm_definition_property = stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmDefinitionProperty(
    comparison_operator=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmComparisonOperator.GREATER_THAN_OR_EQUAL,
    metric_name="metricName",
    period=cdk.Duration.minutes(30),

    # the properties below are optional
    dimensions=[stepfunctions_tasks.EmrCreateCluster.MetricDimensionProperty(
        key="key",
        value="value"
    )],
    evaluation_periods=123,
    namespace="namespace",
    statistic=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmStatistic.SAMPLE_COUNT,
    threshold=123,
    unit=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmUnit.NONE
)
Attributes
-
comparison_operator
¶ Determines how the metric specified by MetricName is compared to the value specified by Threshold.
- Return type
CloudWatchAlarmComparisonOperator
-
dimensions
¶ A CloudWatch metric dimension.
- Default
No dimensions
- Return type
Optional[List[MetricDimensionProperty]]
-
evaluation_periods
¶ The number of periods, in five-minute increments, during which the alarm condition must exist before the alarm triggers automatic scaling activity.
- Default
1
- Return type
Union[int, float, None]
-
metric_name
¶ The name of the CloudWatch metric that is watched to determine an alarm condition.
- Return type
str
-
namespace
¶ The namespace for the CloudWatch metric.
- Default
‘AWS/ElasticMapReduce’
- Return type
Optional[str]
-
period
¶ The period, in seconds, over which the statistic is applied.
EMR CloudWatch metrics are emitted every five minutes (300 seconds), so if an EMR CloudWatch metric is specified, specify 300.
- Return type
Duration
-
statistic
¶ The statistic to apply to the metric associated with the alarm.
- Default
CloudWatchAlarmStatistic.AVERAGE
- Return type
Optional[CloudWatchAlarmStatistic]
-
threshold
¶ The value against which the specified statistic is compared.
- Default
None
- Return type
Union[int, float, None]
-
unit
¶ The unit of measure associated with the CloudWatch metric being watched.
The value specified for Unit must correspond to the units specified in the CloudWatch metric.
- Default
CloudWatchAlarmUnit.NONE
- Return type
Optional[CloudWatchAlarmUnit]
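The relationship between period and evaluation_periods reduces to a small piece of arithmetic. The sketch below is plain Python, not a CDK call: it assumes the 300-second EMR metric period stated above, and the helper name is hypothetical.

```python
from datetime import timedelta

# EMR CloudWatch metrics are emitted every five minutes (300 seconds).
EMR_METRIC_PERIOD = timedelta(seconds=300)


def time_condition_must_hold(evaluation_periods=1, period=EMR_METRIC_PERIOD):
    """Return how long the alarm condition must persist before scaling starts."""
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")
    return period * evaluation_periods


default_wait = time_condition_must_hold()
three_period_wait = time_condition_must_hold(evaluation_periods=3)
```

With the default of one evaluation period the condition has to hold for five minutes; three evaluation periods stretch that to fifteen.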
CloudWatchAlarmStatistic¶
CloudWatchAlarmUnit¶
-
class
EmrCreateCluster.
CloudWatchAlarmUnit
(value)¶ Bases:
enum.Enum
CloudWatch Alarm Units.
Attributes
-
BITS
¶ BITS.
-
BITS_PER_SECOND
¶ BITS_PER_SECOND.
-
BYTES
¶ BYTES.
-
BYTES_PER_SECOND
¶ BYTES_PER_SECOND.
-
COUNT
¶ COUNT.
-
COUNT_PER_SECOND
¶ COUNT_PER_SECOND.
-
GIGA_BITS
¶ GIGA_BITS.
-
GIGA_BITS_PER_SECOND
¶ GIGA_BITS_PER_SECOND.
-
GIGA_BYTES
¶ GIGA_BYTES.
-
GIGA_BYTES_PER_SECOND
¶ GIGA_BYTES_PER_SECOND.
-
KILO_BITS
¶ KILO_BITS.
-
KILO_BITS_PER_SECOND
¶ KILO_BITS_PER_SECOND.
-
KILO_BYTES
¶ KILO_BYTES.
-
KILO_BYTES_PER_SECOND
¶ KILO_BYTES_PER_SECOND.
-
MEGA_BITS
¶ MEGA_BITS.
-
MEGA_BITS_PER_SECOND
¶ MEGA_BITS_PER_SECOND.
-
MEGA_BYTES
¶ MEGA_BYTES.
-
MEGA_BYTES_PER_SECOND
¶ MEGA_BYTES_PER_SECOND.
-
MICRO_SECONDS
¶ MICRO_SECONDS.
-
MILLI_SECONDS
¶ MILLI_SECONDS.
-
NONE
¶ NONE.
-
PERCENT
¶ PERCENT.
-
SECONDS
¶ SECONDS.
-
TERA_BITS
¶ TERA_BITS.
-
TERA_BITS_PER_SECOND
¶ TERA_BITS_PER_SECOND.
-
TERA_BYTES
¶ TERA_BYTES.
-
TERA_BYTES_PER_SECOND
¶ TERA_BYTES_PER_SECOND.
-
ConfigurationProperty¶
-
class
EmrCreateCluster.
ConfigurationProperty
(*, classification=None, configurations=None, properties=None)¶ Bases:
object
An optional configuration specification to be used when provisioning cluster instances, which can include configurations for applications and software bundled with Amazon EMR.
See the RunJobFlow API for complete documentation on input parameters.
- Parameters
classification (Optional[str]) – The classification within a configuration. Default: No classification
configurations (Optional[Sequence[ConfigurationProperty]]) – A list of additional configurations to apply within a configuration object. Default: No configurations
properties (Optional[Mapping[str, str]]) – A set of properties specified within a configuration classification. Default: No properties
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_Configuration.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks

# configuration_property_: stepfunctions_tasks.EmrCreateCluster.ConfigurationProperty

configuration_property = stepfunctions_tasks.EmrCreateCluster.ConfigurationProperty(
    classification="classification",
    configurations=[stepfunctions_tasks.EmrCreateCluster.ConfigurationProperty(
        classification="classification",
        configurations=[configuration_property_],
        properties={
            "properties_key": "properties"
        }
    )],
    properties={
        "properties_key": "properties"
    }
)
Attributes
-
classification
¶ The classification within a configuration.
- Default
No classification
- Return type
Optional[str]
-
configurations
¶ A list of additional configurations to apply within a configuration object.
- Default
No configurations
- Return type
Optional[List[ConfigurationProperty]]
-
properties
¶ A set of properties specified within a configuration classification.
- Default
No properties
- Return type
Optional[Mapping[str, str]]
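Because a configuration can contain further configurations, the structure is recursive. The following plain-Python sketch shows that nesting using the key names of the EMR API Configuration type (Classification, Properties, Configurations). The `render_configuration` helper is hypothetical, and treating its output as what the task would emit in the state machine definition is an assumption.

```python
def render_configuration(classification=None, properties=None, configurations=None):
    """Build a nested dict in the shape of the EMR API Configuration type."""
    out = {}
    if classification is not None:
        out["Classification"] = classification
    if properties:
        out["Properties"] = dict(properties)
    if configurations:
        # Each nested entry is itself a configuration, so recurse.
        out["Configurations"] = [render_configuration(**c) for c in configurations]
    return out


# Illustrative values: a "spark-env" classification with a nested "export" block.
spark_env = render_configuration(
    classification="spark-env",
    configurations=[
        {
            "classification": "export",
            "properties": {"PYSPARK_PYTHON": "/usr/bin/python3"},
        }
    ],
)
```

Omitted fields are simply left out of the result, mirroring the "No classification", "No configurations", and "No properties" defaults listed above.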
EbsBlockDeviceConfigProperty¶
-
class
EmrCreateCluster.
EbsBlockDeviceConfigProperty
(*, volume_specification, volumes_per_instance=None)¶ Bases:
object
Configuration of a requested EBS block device for the instance group, together with the number of volumes to attach to every instance.
- Parameters
volume_specification (VolumeSpecificationProperty) – EBS volume specifications such as volume type, IOPS, and size (GiB) that will be requested for the EBS volume attached to an EC2 instance in the cluster.
volumes_per_instance (Union[int, float, None]) – Number of EBS volumes with a specific volume configuration that will be associated with every instance in the instance group. Default: EMR selected default
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_EbsBlockDeviceConfig.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks
import aws_cdk.core as cdk

# size: cdk.Size

ebs_block_device_config_property = stepfunctions_tasks.EmrCreateCluster.EbsBlockDeviceConfigProperty(
    volume_specification=stepfunctions_tasks.EmrCreateCluster.VolumeSpecificationProperty(
        volume_size=size,
        volume_type=stepfunctions_tasks.EmrCreateCluster.EbsBlockDeviceVolumeType.GP2,

        # the properties below are optional
        iops=123
    ),

    # the properties below are optional
    volumes_per_instance=123
)
Attributes
-
volume_specification
¶ EBS volume specifications such as volume type, IOPS, and size (GiB) that will be requested for the EBS volume attached to an EC2 instance in the cluster.
- Return type
VolumeSpecificationProperty
-
volumes_per_instance
¶ Number of EBS volumes with a specific volume configuration that will be associated with every instance in the instance group.
- Default
EMR selected default
- Return type
Union[int, float, None]
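Since volumes_per_instance multiplies a single volume specification, the EBS storage attached to each instance is a simple sum of products. The sketch below is plain Python with a hypothetical helper; it assumes every volume count is given explicitly rather than left to the EMR selected default.

```python
def ebs_gib_per_instance(block_devices):
    """Sum the EBS storage, in GiB, attached to one instance.

    `block_devices` is an iterable of (volume_size_gib, volumes_per_instance) pairs.
    """
    return sum(size_gib * count for size_gib, count in block_devices)


# Illustrative values: two 100 GiB volumes plus one 500 GiB volume per instance.
total_gib = ebs_gib_per_instance([(100, 2), (500, 1)])
```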
EbsBlockDeviceVolumeType¶
EbsConfigurationProperty¶
-
class
EmrCreateCluster.
EbsConfigurationProperty
(*, ebs_block_device_configs=None, ebs_optimized=None)¶ Bases:
object
The Amazon EBS configuration of a cluster instance.
- Parameters
ebs_block_device_configs (Optional[Sequence[EbsBlockDeviceConfigProperty]]) – An array of Amazon EBS volume specifications attached to a cluster instance. Default: - None
ebs_optimized (Optional[bool]) – Indicates whether an Amazon EBS volume is EBS-optimized. Default: - EMR selected default
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_EbsConfiguration.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks
import aws_cdk.core as cdk

# size: cdk.Size

ebs_configuration_property = stepfunctions_tasks.EmrCreateCluster.EbsConfigurationProperty(
    ebs_block_device_configs=[stepfunctions_tasks.EmrCreateCluster.EbsBlockDeviceConfigProperty(
        volume_specification=stepfunctions_tasks.EmrCreateCluster.VolumeSpecificationProperty(
            volume_size=size,
            volume_type=stepfunctions_tasks.EmrCreateCluster.EbsBlockDeviceVolumeType.GP2,

            # the properties below are optional
            iops=123
        ),

        # the properties below are optional
        volumes_per_instance=123
    )],
    ebs_optimized=False
)
Attributes
-
ebs_block_device_configs
¶ An array of Amazon EBS volume specifications attached to a cluster instance.
- Default
None
- Return type
Optional[List[EbsBlockDeviceConfigProperty]]
-
ebs_optimized
¶ Indicates whether an Amazon EBS volume is EBS-optimized.
- Default
EMR selected default
- Return type
Optional[bool]
EmrClusterScaleDownBehavior¶
-
class
EmrCreateCluster.
EmrClusterScaleDownBehavior
(value)¶ Bases:
enum.Enum
The Cluster ScaleDownBehavior specifies the way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized.
Attributes
-
TERMINATE_AT_INSTANCE_HOUR
¶ Indicates that Amazon EMR terminates nodes at the instance-hour boundary, regardless of when the request to terminate the instance was submitted.
This option is only available with Amazon EMR 5.1.0 and later and is the default for clusters created using that version.
-
TERMINATE_AT_TASK_COMPLETION
¶ Indicates that Amazon EMR adds nodes to a deny list and drains tasks from nodes before terminating the Amazon EC2 instances, regardless of the instance-hour boundary.
-
InstanceFleetConfigProperty¶
-
class
EmrCreateCluster.
InstanceFleetConfigProperty
(*, instance_fleet_type, instance_type_configs=None, launch_specifications=None, name=None, target_on_demand_capacity=None, target_spot_capacity=None)¶ Bases:
object
The configuration that defines an instance fleet.
- Parameters
instance_fleet_type (InstanceRoleType) – The node type that the instance fleet hosts. Valid values are MASTER, CORE, and TASK.
instance_type_configs (Optional[Sequence[InstanceTypeConfigProperty]]) – The instance type configurations that define the EC2 instances in the instance fleet. Default: No instanceTypeConfigs
launch_specifications (Optional[InstanceFleetProvisioningSpecificationsProperty]) – The launch specification for the instance fleet. Default: No launchSpecifications
name (Optional[str]) – The friendly name of the instance fleet. Default: No name
target_on_demand_capacity (Union[int, float, None]) – The target capacity of On-Demand units for the instance fleet, which determines how many On-Demand instances to provision. Default: No targetOnDemandCapacity
target_spot_capacity (Union[int, float, None]) – The target capacity of Spot units for the instance fleet, which determines how many Spot instances to provision. Default: No targetSpotCapacity
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_InstanceFleetConfig.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks
import aws_cdk.core as cdk

# configuration_property_: stepfunctions_tasks.EmrCreateCluster.ConfigurationProperty
# size: cdk.Size

instance_fleet_config_property = stepfunctions_tasks.EmrCreateCluster.InstanceFleetConfigProperty(
    instance_fleet_type=stepfunctions_tasks.EmrCreateCluster.InstanceRoleType.MASTER,

    # the properties below are optional
    instance_type_configs=[stepfunctions_tasks.EmrCreateCluster.InstanceTypeConfigProperty(
        instance_type="instanceType",

        # the properties below are optional
        bid_price="bidPrice",
        bid_price_as_percentage_of_on_demand_price=123,
        configurations=[stepfunctions_tasks.EmrCreateCluster.ConfigurationProperty(
            classification="classification",
            configurations=[configuration_property_],
            properties={
                "properties_key": "properties"
            }
        )],
        ebs_configuration=stepfunctions_tasks.EmrCreateCluster.EbsConfigurationProperty(
            ebs_block_device_configs=[stepfunctions_tasks.EmrCreateCluster.EbsBlockDeviceConfigProperty(
                volume_specification=stepfunctions_tasks.EmrCreateCluster.VolumeSpecificationProperty(
                    volume_size=size,
                    volume_type=stepfunctions_tasks.EmrCreateCluster.EbsBlockDeviceVolumeType.GP2,

                    # the properties below are optional
                    iops=123
                ),

                # the properties below are optional
                volumes_per_instance=123
            )],
            ebs_optimized=False
        ),
        weighted_capacity=123
    )],
    launch_specifications=stepfunctions_tasks.EmrCreateCluster.InstanceFleetProvisioningSpecificationsProperty(
        spot_specification=stepfunctions_tasks.EmrCreateCluster.SpotProvisioningSpecificationProperty(
            timeout_action=stepfunctions_tasks.EmrCreateCluster.SpotTimeoutAction.SWITCH_TO_ON_DEMAND,
            timeout_duration_minutes=123,

            # the properties below are optional
            allocation_strategy=stepfunctions_tasks.EmrCreateCluster.SpotAllocationStrategy.CAPACITY_OPTIMIZED,
            block_duration_minutes=123
        )
    ),
    name="name",
    target_on_demand_capacity=123,
    target_spot_capacity=123
)
Attributes
-
instance_fleet_type
¶ The node type that the instance fleet hosts.
Valid values are MASTER, CORE, and TASK.
- Return type
InstanceRoleType
-
instance_type_configs
¶ The instance type configurations that define the EC2 instances in the instance fleet.
- Default
No instanceTypeConfigs
- Return type
Optional[List[InstanceTypeConfigProperty]]
-
launch_specifications
¶ The launch specification for the instance fleet.
- Default
No launchSpecifications
- Return type
Optional[InstanceFleetProvisioningSpecificationsProperty]
-
name
¶ The friendly name of the instance fleet.
- Default
No name
- Return type
Optional[str]
-
target_on_demand_capacity
¶ The target capacity of On-Demand units for the instance fleet, which determines how many On-Demand instances to provision.
- Default
No targetOnDemandCapacity
- Return type
Union[int, float, None]
-
target_spot_capacity
¶ The target capacity of Spot units for the instance fleet, which determines how many Spot instances to provision.
- Default
No targetSpotCapacity
- Return type
Union[int, float, None]
InstanceFleetProvisioningSpecificationsProperty¶
-
class
EmrCreateCluster.
InstanceFleetProvisioningSpecificationsProperty
(*, spot_specification)¶ Bases:
object
The launch specification for Spot instances in the fleet, which determines the defined duration and provisioning timeout behavior.
- Parameters
spot_specification (SpotProvisioningSpecificationProperty) – The launch specification for Spot instances in the fleet, which determines the defined duration and provisioning timeout behavior.
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_InstanceFleetProvisioningSpecifications.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks

instance_fleet_provisioning_specifications_property = stepfunctions_tasks.EmrCreateCluster.InstanceFleetProvisioningSpecificationsProperty(
    spot_specification=stepfunctions_tasks.EmrCreateCluster.SpotProvisioningSpecificationProperty(
        timeout_action=stepfunctions_tasks.EmrCreateCluster.SpotTimeoutAction.SWITCH_TO_ON_DEMAND,
        timeout_duration_minutes=123,

        # the properties below are optional
        allocation_strategy=stepfunctions_tasks.EmrCreateCluster.SpotAllocationStrategy.CAPACITY_OPTIMIZED,
        block_duration_minutes=123
    )
)
Attributes
-
spot_specification
¶ The launch specification for Spot instances in the fleet, which determines the defined duration and provisioning timeout behavior.
- Return type
SpotProvisioningSpecificationProperty
InstanceGroupConfigProperty¶
-
class
EmrCreateCluster.
InstanceGroupConfigProperty
(*, instance_count, instance_role, instance_type, auto_scaling_policy=None, bid_price=None, configurations=None, ebs_configuration=None, market=None, name=None)¶ Bases:
object
Configuration defining a new instance group.
- Parameters
instance_count (Union[int, float]) – Target number of instances for the instance group.
instance_role (InstanceRoleType) – The role of the instance group in the cluster.
instance_type (str) – The EC2 instance type for all instances in the instance group.
auto_scaling_policy (Optional[AutoScalingPolicyProperty]) – An automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster. Default: - None
bid_price (Optional[str]) – The bid price for each EC2 Spot instance type as defined by InstanceType. Expressed in USD. Default: - None
configurations (Optional[Sequence[ConfigurationProperty]]) – The list of configurations supplied for an EMR cluster instance group. Default: - None
ebs_configuration (Optional[EbsConfigurationProperty]) – EBS configurations that will be attached to each EC2 instance in the instance group. Default: - None
market (Optional[InstanceMarket]) – Market type of the EC2 instances used to create a cluster node. Default: - EMR selected default
name (Optional[str]) – Friendly name given to the instance group. Default: - None
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_InstanceGroupConfig.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks
import aws_cdk.core as cdk

# configuration_property_: stepfunctions_tasks.EmrCreateCluster.ConfigurationProperty
# size: cdk.Size

instance_group_config_property = stepfunctions_tasks.EmrCreateCluster.InstanceGroupConfigProperty(
    instance_count=123,
    instance_role=stepfunctions_tasks.EmrCreateCluster.InstanceRoleType.MASTER,
    instance_type="instanceType",

    # the properties below are optional
    auto_scaling_policy=stepfunctions_tasks.EmrCreateCluster.AutoScalingPolicyProperty(
        constraints=stepfunctions_tasks.EmrCreateCluster.ScalingConstraintsProperty(
            max_capacity=123,
            min_capacity=123
        ),
        rules=[stepfunctions_tasks.EmrCreateCluster.ScalingRuleProperty(
            action=stepfunctions_tasks.EmrCreateCluster.ScalingActionProperty(
                simple_scaling_policy_configuration=stepfunctions_tasks.EmrCreateCluster.SimpleScalingPolicyConfigurationProperty(
                    scaling_adjustment=123,

                    # the properties below are optional
                    adjustment_type=stepfunctions_tasks.EmrCreateCluster.ScalingAdjustmentType.CHANGE_IN_CAPACITY,
                    cool_down=123
                ),

                # the properties below are optional
                market=stepfunctions_tasks.EmrCreateCluster.InstanceMarket.ON_DEMAND
            ),
            name="name",
            trigger=stepfunctions_tasks.EmrCreateCluster.ScalingTriggerProperty(
                cloud_watch_alarm_definition=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmDefinitionProperty(
                    comparison_operator=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmComparisonOperator.GREATER_THAN_OR_EQUAL,
                    metric_name="metricName",
                    period=cdk.Duration.minutes(30),

                    # the properties below are optional
                    dimensions=[stepfunctions_tasks.EmrCreateCluster.MetricDimensionProperty(
                        key="key",
                        value="value"
                    )],
                    evaluation_periods=123,
                    namespace="namespace",
                    statistic=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmStatistic.SAMPLE_COUNT,
                    threshold=123,
                    unit=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmUnit.NONE
                )
            ),

            # the properties below are optional
            description="description"
        )]
    ),
    bid_price="bidPrice",
    configurations=[stepfunctions_tasks.EmrCreateCluster.ConfigurationProperty(
        classification="classification",
        configurations=[configuration_property_],
        properties={
            "properties_key": "properties"
        }
    )],
    ebs_configuration=stepfunctions_tasks.EmrCreateCluster.EbsConfigurationProperty(
        ebs_block_device_configs=[stepfunctions_tasks.EmrCreateCluster.EbsBlockDeviceConfigProperty(
            volume_specification=stepfunctions_tasks.EmrCreateCluster.VolumeSpecificationProperty(
                volume_size=size,
                volume_type=stepfunctions_tasks.EmrCreateCluster.EbsBlockDeviceVolumeType.GP2,

                # the properties below are optional
                iops=123
            ),

            # the properties below are optional
            volumes_per_instance=123
        )],
        ebs_optimized=False
    ),
    market=stepfunctions_tasks.EmrCreateCluster.InstanceMarket.ON_DEMAND,
    name="name"
)
Attributes
-
auto_scaling_policy
¶ An automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster.
- Default
None
- Return type
Optional[AutoScalingPolicyProperty]
-
bid_price
¶ The bid price for each EC2 Spot instance type as defined by InstanceType.
Expressed in USD.
- Default
None
- Return type
Optional[str]
-
configurations
¶ The list of configurations supplied for an EMR cluster instance group.
- Default
None
- Return type
Optional[List[ConfigurationProperty]]
-
ebs_configuration
¶ EBS configurations that will be attached to each EC2 instance in the instance group.
- Default
None
- Return type
Optional[EbsConfigurationProperty]
-
instance_count
¶ Target number of instances for the instance group.
- Return type
Union[int, float]
-
instance_role
¶ The role of the instance group in the cluster.
- Return type
InstanceRoleType
-
instance_type
¶ The EC2 instance type for all instances in the instance group.
- Return type
str
-
market
¶ Market type of the EC2 instances used to create a cluster node.
- Default
EMR selected default
- Return type
Optional[InstanceMarket]
-
name
¶ Friendly name given to the instance group.
- Default
None
- Return type
Optional[str]
InstanceMarket¶
InstanceRoleType¶
InstanceTypeConfigProperty¶
-
class
EmrCreateCluster.
InstanceTypeConfigProperty
(*, instance_type, bid_price=None, bid_price_as_percentage_of_on_demand_price=None, configurations=None, ebs_configuration=None, weighted_capacity=None)¶ Bases:
object
An instance type configuration for each instance type in an instance fleet, which determines the EC2 instances Amazon EMR attempts to provision to fulfill On-Demand and Spot target capacities.
- Parameters
instance_type (str) – An EC2 instance type.
bid_price (Optional[str]) – The bid price for each EC2 Spot instance type as defined by InstanceType. Expressed in USD. Default: - None
bid_price_as_percentage_of_on_demand_price (Union[int, float, None]) – The bid price, as a percentage of On-Demand price. Default: - None
configurations (Optional[Sequence[ConfigurationProperty]]) – A configuration classification that applies when provisioning cluster instances, which can include configurations for applications and software that run on the cluster. Default: - None
ebs_configuration (Optional[EbsConfigurationProperty]) – The configuration of Amazon Elastic Block Storage (EBS) attached to each instance as defined by InstanceType. Default: - None
weighted_capacity (Union[int, float, None]) – The number of units that a provisioned instance of this type provides toward fulfilling the target capacities defined in the InstanceFleetConfig. Default: - None
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_InstanceTypeConfig.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks
import aws_cdk.core as cdk

# configuration_property_: stepfunctions_tasks.EmrCreateCluster.ConfigurationProperty
# size: cdk.Size

instance_type_config_property = stepfunctions_tasks.EmrCreateCluster.InstanceTypeConfigProperty(
    instance_type="instanceType",

    # the properties below are optional
    bid_price="bidPrice",
    bid_price_as_percentage_of_on_demand_price=123,
    configurations=[stepfunctions_tasks.EmrCreateCluster.ConfigurationProperty(
        classification="classification",
        configurations=[configuration_property_],
        properties={
            "properties_key": "properties"
        }
    )],
    ebs_configuration=stepfunctions_tasks.EmrCreateCluster.EbsConfigurationProperty(
        ebs_block_device_configs=[stepfunctions_tasks.EmrCreateCluster.EbsBlockDeviceConfigProperty(
            volume_specification=stepfunctions_tasks.EmrCreateCluster.VolumeSpecificationProperty(
                volume_size=size,
                volume_type=stepfunctions_tasks.EmrCreateCluster.EbsBlockDeviceVolumeType.GP2,

                # the properties below are optional
                iops=123
            ),

            # the properties below are optional
            volumes_per_instance=123
        )],
        ebs_optimized=False
    ),
    weighted_capacity=123
)
Attributes
-
bid_price
¶ The bid price for each EC2 Spot instance type as defined by InstanceType.
Expressed in USD.
- Default
None
- Return type
Optional[str]
-
bid_price_as_percentage_of_on_demand_price
¶ The bid price, as a percentage of On-Demand price.
- Default
None
- Return type
Union
[int
,float
,None
]
-
configurations
¶ A configuration classification that applies when provisioning cluster instances, which can include configurations for applications and software that run on the cluster.
- Default
None
- Return type
Optional
[List
[ConfigurationProperty
]]
-
ebs_configuration
¶ The configuration of Amazon Elastic Block Storage (EBS) attached to each instance as defined by InstanceType.
- Default
None
- Return type
Optional
[EbsConfigurationProperty
]
-
instance_type
¶ An EC2 instance type.
- Return type
str
-
weighted_capacity
¶ The number of units that a provisioned instance of this type provides toward fulfilling the target capacities defined in the InstanceFleetConfig.
- Default
None
- Return type
Union
[int
,float
,None
]
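As a rough illustration of how weighted_capacity relates to a fleet's target capacity, the sketch below computes how many instances of a single type would be needed to cover a target. The helper name `instances_needed` is hypothetical, and rounding up is an assumption made for illustration; how EMR actually mixes instance types is described in the InstanceFleetConfig API reference, not here.

```python
import math


def instances_needed(target_capacity: int, weighted_capacity: int) -> int:
    """Return how many instances of one type cover ``target_capacity`` units.

    Assumption: each instance contributes ``weighted_capacity`` units and the
    count is rounded up so that the target is met or exceeded.
    """
    if weighted_capacity <= 0:
        raise ValueError("weighted_capacity must be positive")
    return math.ceil(target_capacity / weighted_capacity)


# A target of 10 units with instances worth 4 units each.
print(instances_needed(10, 4))
```

With a weight of 4 and a target of 10 the sketch rounds up, so the provisioned units (12) overshoot the target slightly.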
InstancesConfigProperty¶
-
class
EmrCreateCluster.
InstancesConfigProperty
(*, additional_master_security_groups=None, additional_slave_security_groups=None, ec2_key_name=None, ec2_subnet_id=None, ec2_subnet_ids=None, emr_managed_master_security_group=None, emr_managed_slave_security_group=None, hadoop_version=None, instance_count=None, instance_fleets=None, instance_groups=None, master_instance_type=None, placement=None, service_access_security_group=None, slave_instance_type=None, termination_protected=None)¶ Bases:
object
A specification of the number and type of Amazon EC2 instances.
See the RunJobFlow API for complete documentation on input parameters.
- Parameters
additional_master_security_groups (Optional[Sequence[str]]) – A list of additional Amazon EC2 security group IDs for the master node. Default: - None
additional_slave_security_groups (Optional[Sequence[str]]) – A list of additional Amazon EC2 security group IDs for the core and task nodes. Default: - None
ec2_key_name (Optional[str]) – The name of the EC2 key pair that can be used to SSH to the master node as the user “hadoop”. Default: - None
ec2_subnet_id (Optional[str]) – Applies to clusters that use the uniform instance group configuration. To launch the cluster in Amazon Virtual Private Cloud (Amazon VPC), set this parameter to the identifier of the Amazon VPC subnet where you want the cluster to launch. Default: EMR selected default
ec2_subnet_ids (Optional[Sequence[str]]) – Applies to clusters that use the instance fleet configuration. When multiple EC2 subnet IDs are specified, Amazon EMR evaluates them and launches instances in the optimal subnet. Default: EMR selected default
emr_managed_master_security_group (Optional[str]) – The identifier of the Amazon EC2 security group for the master node. Default: - None
emr_managed_slave_security_group (Optional[str]) – The identifier of the Amazon EC2 security group for the core and task nodes. Default: - None
hadoop_version (Optional[str]) – Applies only to Amazon EMR release versions earlier than 4.0. The Hadoop version for the cluster. Default: - 0.18 if the AmiVersion parameter is not set. If AmiVersion is set, the version of Hadoop for that AMI version is used.
instance_count (Union[int, float, None]) – The number of EC2 instances in the cluster. Default: 0
instance_fleets (Optional[Sequence[InstanceFleetConfigProperty]]) – Describes the EC2 instances and instance configurations for clusters that use the instance fleet configuration. The instance fleet configuration is available only in Amazon EMR versions 4.8.0 and later, excluding 5.0.x versions. Default: - None
instance_groups (Optional[Sequence[InstanceGroupConfigProperty]]) – Configuration for the instance groups in a cluster. Default: - None
master_instance_type (Optional[str]) – The EC2 instance type of the master node. Default: - None
placement (Optional[PlacementTypeProperty]) – The Availability Zone in which the cluster runs. Default: - EMR selected default
service_access_security_group (Optional[str]) – The identifier of the Amazon EC2 security group for the Amazon EMR service to access clusters in VPC private subnets. Default: - None
slave_instance_type (Optional[str]) – The EC2 instance type of the core and task nodes. Default: - None
termination_protected (Optional[bool]) – Specifies whether to lock the cluster to prevent the Amazon EC2 instances from being terminated by API call, user intervention, or in the event of a job-flow error. Default: false
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_JobFlowInstancesConfig.html
- ExampleMetadata
infused
Example:
cluster_role = iam.Role(self, "ClusterRole",
    assumed_by=iam.ServicePrincipal("ec2.amazonaws.com")
)

service_role = iam.Role(self, "ServiceRole",
    assumed_by=iam.ServicePrincipal("elasticmapreduce.amazonaws.com")
)

auto_scaling_role = iam.Role(self, "AutoScalingRole",
    assumed_by=iam.ServicePrincipal("elasticmapreduce.amazonaws.com")
)

auto_scaling_role.assume_role_policy.add_statements(
    iam.PolicyStatement(
        effect=iam.Effect.ALLOW,
        principals=[
            iam.ServicePrincipal("application-autoscaling.amazonaws.com")
        ],
        actions=["sts:AssumeRole"]
    ))

tasks.EmrCreateCluster(self, "Create Cluster",
    instances=tasks.EmrCreateCluster.InstancesConfigProperty(),
    cluster_role=cluster_role,
    name=sfn.TaskInput.from_json_path_at("$.ClusterName").value,
    service_role=service_role,
    auto_scaling_role=auto_scaling_role
)
Attributes
-
additional_master_security_groups
¶ A list of additional Amazon EC2 security group IDs for the master node.
- Default
None
- Return type
Optional
[List
[str
]]
-
additional_slave_security_groups
¶ A list of additional Amazon EC2 security group IDs for the core and task nodes.
- Default
None
- Return type
Optional
[List
[str
]]
-
ec2_key_name
¶ The name of the EC2 key pair that can be used to SSH to the master node as the user “hadoop”.
- Default
None
- Return type
Optional
[str
]
-
ec2_subnet_id
¶ Applies to clusters that use the uniform instance group configuration.
To launch the cluster in Amazon Virtual Private Cloud (Amazon VPC), set this parameter to the identifier of the Amazon VPC subnet where you want the cluster to launch.
- Default
EMR selected default
- Return type
Optional
[str
]
-
ec2_subnet_ids
¶ Applies to clusters that use the instance fleet configuration.
When multiple EC2 subnet IDs are specified, Amazon EMR evaluates them and launches instances in the optimal subnet.
- Default
EMR selected default
- Return type
Optional
[List
[str
]]
-
emr_managed_master_security_group
¶ The identifier of the Amazon EC2 security group for the master node.
- Default
None
- Return type
Optional
[str
]
-
emr_managed_slave_security_group
¶ The identifier of the Amazon EC2 security group for the core and task nodes.
- Default
None
- Return type
Optional
[str
]
-
hadoop_version
¶ Applies only to Amazon EMR release versions earlier than 4.0. The Hadoop version for the cluster.
- Default
0.18 if the AmiVersion parameter is not set. If AmiVersion is set, the version of Hadoop for that AMI version is used.
- Return type
Optional
[str
]
-
instance_count
¶ The number of EC2 instances in the cluster.
- Default
0
- Return type
Union
[int
,float
,None
]
-
instance_fleets
¶ Describes the EC2 instances and instance configurations for clusters that use the instance fleet configuration.
The instance fleet configuration is available only in Amazon EMR versions 4.8.0 and later, excluding 5.0.x versions.
- Default
None
- Return type
Optional
[List
[InstanceFleetConfigProperty
]]
-
instance_groups
¶ Configuration for the instance groups in a cluster.
- Default
None
- Return type
Optional
[List
[InstanceGroupConfigProperty
]]
-
master_instance_type
¶ The EC2 instance type of the master node.
- Default
None
- Return type
Optional
[str
]
-
placement
¶ The Availability Zone in which the cluster runs.
- Default
EMR selected default
- Return type
Optional
[PlacementTypeProperty
]
-
service_access_security_group
¶ The identifier of the Amazon EC2 security group for the Amazon EMR service to access clusters in VPC private subnets.
- Default
None
- Return type
Optional
[str
]
-
slave_instance_type
¶ The EC2 instance type of the core and task nodes.
- Default
None
- Return type
Optional
[str
]
-
termination_protected
¶ Specifies whether to lock the cluster to prevent the Amazon EC2 instances from being terminated by API call, user intervention, or in the event of a job-flow error.
- Default
false
- Return type
Optional
[bool
]
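Several of the attributes above apply only to the uniform instance group configuration (ec2_subnet_id, instance_groups) and others only to the instance fleet configuration (ec2_subnet_ids, instance_fleets). The sketch below is a small pre-flight check that flags a configuration mixing the two styles. The helper `mixed_style_keys` is hypothetical and not part of the CDK; it assumes the two styles should not be combined in one cluster.

```python
from typing import Any, Dict, List

# Keyword arguments that the reference above ties to each configuration style.
GROUP_ONLY = {"ec2_subnet_id", "instance_groups"}
FLEET_ONLY = {"ec2_subnet_ids", "instance_fleets"}


def mixed_style_keys(config: Dict[str, Any]) -> List[str]:
    """Return the conflicting keys (sorted), or an empty list if none conflict."""
    # Only arguments that were actually set (not None) count as "used".
    used = {key for key, value in config.items() if value is not None}
    group_keys = used & GROUP_ONLY
    fleet_keys = used & FLEET_ONLY
    if group_keys and fleet_keys:
        return sorted(group_keys | fleet_keys)
    return []


# "subnet-1" is a placeholder subnet ID.
print(mixed_style_keys({"ec2_subnet_id": "subnet-1", "instance_fleets": [{}]}))
```

A check like this could run on the keyword arguments before they are passed to InstancesConfigProperty.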
KerberosAttributesProperty¶
-
class
EmrCreateCluster.
KerberosAttributesProperty
(*, realm, ad_domain_join_password=None, ad_domain_join_user=None, cross_realm_trust_principal_password=None, kdc_admin_password=None)¶ Bases:
object
Attributes for Kerberos configuration when Kerberos authentication is enabled using a security configuration.
See the RunJobFlow API for complete documentation on input parameters.
- Parameters
realm (str) – The name of the Kerberos realm to which all nodes in a cluster belong. For example, EC2.INTERNAL.
ad_domain_join_password (Optional[str]) – The Active Directory password for ADDomainJoinUser. Default: No adDomainJoinPassword
ad_domain_join_user (Optional[str]) – Required only when establishing a cross-realm trust with an Active Directory domain. A user with sufficient privileges to join resources to the domain. Default: No adDomainJoinUser
cross_realm_trust_principal_password (Optional[str]) – Required only when establishing a cross-realm trust with a KDC in a different realm. The cross-realm principal password, which must be identical across realms. Default: No crossRealmTrustPrincipalPassword
kdc_admin_password (Optional[str]) – The password used within the cluster for the kadmin service on the cluster-dedicated KDC, which maintains Kerberos principals, password policies, and keytabs for the cluster. Default: No kdcAdminPassword
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_KerberosAttributes.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks

kerberos_attributes_property = stepfunctions_tasks.EmrCreateCluster.KerberosAttributesProperty(
    realm="realm",
    # the properties below are optional
    ad_domain_join_password="adDomainJoinPassword",
    ad_domain_join_user="adDomainJoinUser",
    cross_realm_trust_principal_password="crossRealmTrustPrincipalPassword",
    kdc_admin_password="kdcAdminPassword"
)
Attributes
-
ad_domain_join_password
¶ The Active Directory password for ADDomainJoinUser.
- Default
No adDomainJoinPassword
- Return type
Optional
[str
]
-
ad_domain_join_user
¶ Required only when establishing a cross-realm trust with an Active Directory domain.
A user with sufficient privileges to join resources to the domain.
- Default
No adDomainJoinUser
- Return type
Optional
[str
]
-
cross_realm_trust_principal_password
¶ Required only when establishing a cross-realm trust with a KDC in a different realm.
The cross-realm principal password, which must be identical across realms.
- Default
No crossRealmTrustPrincipalPassword
- Return type
Optional
[str
]
-
kdc_admin_password
¶ The password used within the cluster for the kadmin service on the cluster-dedicated KDC, which maintains Kerberos principals, password policies, and keytabs for the cluster.
- Default
No kdcAdminPassword
- Return type
Optional
[str
]
-
realm
¶ The name of the Kerberos realm to which all nodes in a cluster belong.
For example, EC2.INTERNAL.
- Return type
str
MetricDimensionProperty¶
-
class
EmrCreateCluster.
MetricDimensionProperty
(*, key, value)¶ Bases:
object
A CloudWatch dimension, which is specified using a Key (known as a Name in CloudWatch), Value pair.
By default, Amazon EMR uses one dimension whose Key is JobFlowID and Value is a variable representing the cluster ID, which is ${emr.clusterId}. This enables the rule to bootstrap when the cluster ID becomes available.
- Parameters
key (str) – The dimension name.
value (str) – The dimension value.
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_MetricDimension.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks

metric_dimension_property = stepfunctions_tasks.EmrCreateCluster.MetricDimensionProperty(
    key="key",
    value="value"
)
Attributes
-
key
¶ The dimension name.
- Return type
str
-
value
¶ The dimension value.
- Return type
str
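To make the default dimension described above concrete, the sketch below writes it as a plain key/value mapping and mimics the substitution of the ${emr.clusterId} variable. The substitution itself is performed by Amazon EMR; `resolve_dimension` and the cluster ID "j-EXAMPLE" are hypothetical and exist only for illustration.

```python
from typing import Dict

# The default dimension named in the description above.
DEFAULT_DIMENSION = {"key": "JobFlowID", "value": "${emr.clusterId}"}


def resolve_dimension(dimension: Dict[str, str], cluster_id: str) -> Dict[str, str]:
    """Return a copy of ``dimension`` with the cluster ID variable filled in."""
    return {
        "key": dimension["key"],
        "value": dimension["value"].replace("${emr.clusterId}", cluster_id),
    }


# "j-EXAMPLE" is a made-up cluster ID.
print(resolve_dimension(DEFAULT_DIMENSION, "j-EXAMPLE"))
```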
PlacementTypeProperty¶
-
class
EmrCreateCluster.
PlacementTypeProperty
(*, availability_zone=None, availability_zones=None)¶ Bases:
object
The Amazon EC2 Availability Zone configuration of the cluster (job flow).
- Parameters
availability_zone (Optional[str]) – The Amazon EC2 Availability Zone for the cluster. AvailabilityZone is used for uniform instance groups, while AvailabilityZones (plural) is used for instance fleets. Default: - EMR selected default
availability_zones (Optional[Sequence[str]]) – When multiple Availability Zones are specified, Amazon EMR evaluates them and launches instances in the optimal Availability Zone. AvailabilityZones is used for instance fleets, while AvailabilityZone (singular) is used for uniform instance groups. Default: - EMR selected default
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_PlacementType.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks

placement_type_property = stepfunctions_tasks.EmrCreateCluster.PlacementTypeProperty(
    availability_zone="availabilityZone",
    availability_zones=["availabilityZones"]
)
Attributes
-
availability_zone
¶ The Amazon EC2 Availability Zone for the cluster.
AvailabilityZone is used for uniform instance groups, while AvailabilityZones (plural) is used for instance fleets.
- Default
EMR selected default
- Return type
Optional
[str
]
-
availability_zones
¶ When multiple Availability Zones are specified, Amazon EMR evaluates them and launches instances in the optimal Availability Zone.
AvailabilityZones is used for instance fleets, while AvailabilityZone (singular) is used for uniform instance groups.
- Default
EMR selected default
- Return type
Optional
[List
[str
]]
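Because availability_zone is meant for uniform instance groups and availability_zones for instance fleets, it can help to pick the keyword argument in one place. The sketch below does that with a hypothetical helper, `placement_kwargs`, whose result could be unpacked into PlacementTypeProperty; the zone names are placeholders, and an empty result is assumed to leave the choice to EMR.

```python
from typing import Dict, List


def placement_kwargs(zones: List[str], use_instance_fleets: bool) -> Dict[str, object]:
    """Choose the placement argument that matches the configuration style."""
    if not zones:
        # No preference given: leave placement to the EMR selected default.
        return {}
    if use_instance_fleets:
        return {"availability_zones": list(zones)}
    if len(zones) != 1:
        raise ValueError("uniform instance groups take exactly one Availability Zone")
    return {"availability_zone": zones[0]}


print(placement_kwargs(["us-east-1a", "us-east-1b"], use_instance_fleets=True))
print(placement_kwargs(["us-east-1a"], use_instance_fleets=False))
```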
ScalingActionProperty¶
-
class
EmrCreateCluster.
ScalingActionProperty
(*, simple_scaling_policy_configuration, market=None)¶ Bases:
object
The type of adjustment the automatic scaling activity makes when triggered, and the periodicity of the adjustment.
Also includes an automatic scaling configuration, which describes how the policy adds or removes instances, the cooldown period, and the number of EC2 instances that will be added each time the CloudWatch metric alarm condition is satisfied.
- Parameters
simple_scaling_policy_configuration (SimpleScalingPolicyConfigurationProperty) – The type of adjustment the automatic scaling activity makes when triggered, and the periodicity of the adjustment.
market (Optional[InstanceMarket]) – Not available for instance groups. Instance groups use the market type specified for the group. Default: - EMR selected default
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_ScalingAction.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks

scaling_action_property = stepfunctions_tasks.EmrCreateCluster.ScalingActionProperty(
    simple_scaling_policy_configuration=stepfunctions_tasks.EmrCreateCluster.SimpleScalingPolicyConfigurationProperty(
        scaling_adjustment=123,
        # the properties below are optional
        adjustment_type=stepfunctions_tasks.EmrCreateCluster.ScalingAdjustmentType.CHANGE_IN_CAPACITY,
        cool_down=123
    ),
    # the properties below are optional
    market=stepfunctions_tasks.EmrCreateCluster.InstanceMarket.ON_DEMAND
)
Attributes
-
market
¶ Not available for instance groups.
Instance groups use the market type specified for the group.
- Default
EMR selected default
- Return type
Optional
[InstanceMarket
]
-
simple_scaling_policy_configuration
¶ The type of adjustment the automatic scaling activity makes when triggered, and the periodicity of the adjustment.
- Return type
SimpleScalingPolicyConfigurationProperty
ScalingAdjustmentType¶
ScalingConstraintsProperty¶
-
class
EmrCreateCluster.
ScalingConstraintsProperty
(*, max_capacity, min_capacity)¶ Bases:
object
The upper and lower EC2 instance limits for an automatic scaling policy.
Automatic scaling activities triggered by automatic scaling rules will not cause an instance group to grow above or below these limits.
- Parameters
max_capacity (Union[int, float]) – The upper boundary of EC2 instances in an instance group beyond which scaling activities are not allowed to grow. Scale-out activities will not add instances beyond this boundary.
min_capacity (Union[int, float]) – The lower boundary of EC2 instances in an instance group below which scaling activities are not allowed to shrink. Scale-in activities will not terminate instances below this boundary.
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_ScalingConstraints.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks

scaling_constraints_property = stepfunctions_tasks.EmrCreateCluster.ScalingConstraintsProperty(
    max_capacity=123,
    min_capacity=123
)
Attributes
-
max_capacity
¶ The upper boundary of EC2 instances in an instance group beyond which scaling activities are not allowed to grow.
Scale-out activities will not add instances beyond this boundary.
- Return type
Union
[int
,float
]
-
min_capacity
¶ The lower boundary of EC2 instances in an instance group below which scaling activities are not allowed to shrink.
Scale-in activities will not terminate instances below this boundary.
- Return type
Union
[int
,float
]
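The effect of the two boundaries can be pictured as a clamp on the instance count a scaling rule asks for. The sketch below is a local model of that rule, not EMR's implementation; `clamp_capacity` is a hypothetical helper name.

```python
def clamp_capacity(desired: int, min_capacity: int, max_capacity: int) -> int:
    """Keep ``desired`` within the inclusive range [min_capacity, max_capacity]."""
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")
    return max(min_capacity, min(desired, max_capacity))


# A scale-out request beyond the upper boundary is capped at max_capacity.
print(clamp_capacity(12, min_capacity=2, max_capacity=10))
# A scale-in request below the lower boundary is held at min_capacity.
print(clamp_capacity(1, min_capacity=2, max_capacity=10))
```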
ScalingRuleProperty¶
-
class
EmrCreateCluster.
ScalingRuleProperty
(*, action, name, trigger, description=None)¶ Bases:
object
A scale-in or scale-out rule that defines scaling activity, including the CloudWatch metric alarm that triggers activity, how EC2 instances are added or removed, and the periodicity of adjustments.
- Parameters
action (ScalingActionProperty) – The conditions that trigger an automatic scaling activity.
name (str) – The name used to identify an automatic scaling rule. Rule names must be unique within a scaling policy.
trigger (ScalingTriggerProperty) – The CloudWatch alarm definition that determines when automatic scaling activity is triggered.
description (Optional[str]) – A friendly, more verbose description of the automatic scaling rule. Default: - None
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_ScalingRule.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks
import aws_cdk.core as cdk

scaling_rule_property = stepfunctions_tasks.EmrCreateCluster.ScalingRuleProperty(
    action=stepfunctions_tasks.EmrCreateCluster.ScalingActionProperty(
        simple_scaling_policy_configuration=stepfunctions_tasks.EmrCreateCluster.SimpleScalingPolicyConfigurationProperty(
            scaling_adjustment=123,
            # the properties below are optional
            adjustment_type=stepfunctions_tasks.EmrCreateCluster.ScalingAdjustmentType.CHANGE_IN_CAPACITY,
            cool_down=123
        ),
        # the properties below are optional
        market=stepfunctions_tasks.EmrCreateCluster.InstanceMarket.ON_DEMAND
    ),
    name="name",
    trigger=stepfunctions_tasks.EmrCreateCluster.ScalingTriggerProperty(
        cloud_watch_alarm_definition=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmDefinitionProperty(
            comparison_operator=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmComparisonOperator.GREATER_THAN_OR_EQUAL,
            metric_name="metricName",
            period=cdk.Duration.minutes(30),
            # the properties below are optional
            dimensions=[stepfunctions_tasks.EmrCreateCluster.MetricDimensionProperty(
                key="key",
                value="value"
            )],
            evaluation_periods=123,
            namespace="namespace",
            statistic=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmStatistic.SAMPLE_COUNT,
            threshold=123,
            unit=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmUnit.NONE
        )
    ),
    # the properties below are optional
    description="description"
)
Attributes
-
action
¶ The conditions that trigger an automatic scaling activity.
- Return type
ScalingActionProperty
-
description
¶ A friendly, more verbose description of the automatic scaling rule.
- Default
None
- Return type
Optional
[str
]
-
name
¶ The name used to identify an automatic scaling rule.
Rule names must be unique within a scaling policy.
- Return type
str
-
trigger
¶ The CloudWatch alarm definition that determines when automatic scaling activity is triggered.
- Return type
ScalingTriggerProperty
ScalingTriggerProperty¶
-
class
EmrCreateCluster.
ScalingTriggerProperty
(*, cloud_watch_alarm_definition)¶ Bases:
object
The conditions that trigger an automatic scaling activity and the definition of a CloudWatch metric alarm.
When the defined alarm conditions are met along with other trigger parameters, scaling activity begins.
- Parameters
cloud_watch_alarm_definition (CloudWatchAlarmDefinitionProperty) – The definition of a CloudWatch metric alarm. When the defined alarm conditions are met along with other trigger parameters, scaling activity begins.
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_ScalingTrigger.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks
import aws_cdk.core as cdk

scaling_trigger_property = stepfunctions_tasks.EmrCreateCluster.ScalingTriggerProperty(
    cloud_watch_alarm_definition=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmDefinitionProperty(
        comparison_operator=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmComparisonOperator.GREATER_THAN_OR_EQUAL,
        metric_name="metricName",
        period=cdk.Duration.minutes(30),
        # the properties below are optional
        dimensions=[stepfunctions_tasks.EmrCreateCluster.MetricDimensionProperty(
            key="key",
            value="value"
        )],
        evaluation_periods=123,
        namespace="namespace",
        statistic=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmStatistic.SAMPLE_COUNT,
        threshold=123,
        unit=stepfunctions_tasks.EmrCreateCluster.CloudWatchAlarmUnit.NONE
    )
)
Attributes
-
cloud_watch_alarm_definition
¶ The definition of a CloudWatch metric alarm.
When the defined alarm conditions are met along with other trigger parameters, scaling activity begins.
- Return type
CloudWatchAlarmDefinitionProperty
ScriptBootstrapActionConfigProperty¶
-
class
EmrCreateCluster.
ScriptBootstrapActionConfigProperty
(*, path, args=None)¶ Bases:
object
Configuration of the script to run during a bootstrap action.
- Parameters
path (str) – Location of the script to run during a bootstrap action. Can be either a location in Amazon S3 or on a local file system.
args (Optional[Sequence[str]]) – A list of command line arguments to pass to the bootstrap action script. Default: No args
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_ScriptBootstrapActionConfig.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks

script_bootstrap_action_config_property = stepfunctions_tasks.EmrCreateCluster.ScriptBootstrapActionConfigProperty(
    path="path",
    # the properties below are optional
    args=["args"]
)
Attributes
-
args
¶ A list of command line arguments to pass to the bootstrap action script.
- Default
No args
- Return type
Optional
[List
[str
]]
-
path
¶ Location of the script to run during a bootstrap action.
Can be either a location in Amazon S3 or on a local file system.
- Return type
str
SimpleScalingPolicyConfigurationProperty¶
-
class
EmrCreateCluster.
SimpleScalingPolicyConfigurationProperty
(*, scaling_adjustment, adjustment_type=None, cool_down=None)¶ Bases:
object
An automatic scaling configuration, which describes how the policy adds or removes instances, the cooldown period, and the number of EC2 instances that will be added each time the CloudWatch metric alarm condition is satisfied.
- Parameters
scaling_adjustment (Union[int, float]) – The amount by which to scale in or scale out, based on the specified AdjustmentType. A positive value adds to the instance group’s EC2 instance count while a negative number removes instances. If AdjustmentType is set to EXACT_CAPACITY, the number should only be a positive integer.
adjustment_type (Optional[ScalingAdjustmentType]) – The way in which EC2 instances are added (if ScalingAdjustment is a positive number) or terminated (if ScalingAdjustment is a negative number) each time the scaling activity is triggered. Default: - None
cool_down (Union[int, float, None]) – The amount of time, in seconds, after a scaling activity completes before any further trigger-related scaling activities can start. Default: 0
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_SimpleScalingPolicyConfiguration.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks

simple_scaling_policy_configuration_property = stepfunctions_tasks.EmrCreateCluster.SimpleScalingPolicyConfigurationProperty(
    scaling_adjustment=123,
    # the properties below are optional
    adjustment_type=stepfunctions_tasks.EmrCreateCluster.ScalingAdjustmentType.CHANGE_IN_CAPACITY,
    cool_down=123
)
Attributes
-
adjustment_type
¶ The way in which EC2 instances are added (if ScalingAdjustment is a positive number) or terminated (if ScalingAdjustment is a negative number) each time the scaling activity is triggered.
- Default
None
- Return type
Optional
[ScalingAdjustmentType
]
-
cool_down
¶ The amount of time, in seconds, after a scaling activity completes before any further trigger-related scaling activities can start.
- Default
0
- Return type
Union
[int
,float
,None
]
-
scaling_adjustment
¶ The amount by which to scale in or scale out, based on the specified AdjustmentType.
A positive value adds to the instance group’s EC2 instance count while a negative number removes instances. If AdjustmentType is set to EXACT_CAPACITY, the number should only be a positive integer.
- Return type
Union
[int
,float
]
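The interaction between scaling_adjustment and adjustment_type can be sketched as a small function. The version below covers only the two adjustment types named on this page, CHANGE_IN_CAPACITY and EXACT_CAPACITY, passed as plain strings. `apply_adjustment` is a hypothetical helper, and flooring the result at zero is an assumption; this is a local model, not what EMR runs.

```python
def apply_adjustment(current: int, scaling_adjustment: int, adjustment_type: str) -> int:
    """Return the instance count after one scaling activity (illustrative model)."""
    if adjustment_type == "CHANGE_IN_CAPACITY":
        # Positive values add instances, negative values remove them.
        # Assumption: the count never drops below zero.
        return max(0, current + scaling_adjustment)
    if adjustment_type == "EXACT_CAPACITY":
        # The adjustment is the new count itself and should be positive.
        if scaling_adjustment <= 0:
            raise ValueError("EXACT_CAPACITY expects a positive integer")
        return scaling_adjustment
    raise ValueError(f"adjustment type not covered by this sketch: {adjustment_type}")


print(apply_adjustment(4, 2, "CHANGE_IN_CAPACITY"))
print(apply_adjustment(4, -1, "CHANGE_IN_CAPACITY"))
print(apply_adjustment(4, 8, "EXACT_CAPACITY"))
```

In a real policy the result would still be bounded by the ScalingConstraintsProperty limits.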
SpotAllocationStrategy¶
-
class
EmrCreateCluster.
SpotAllocationStrategy
(value)¶ Bases:
enum.Enum
Spot Allocation Strategies.
Specifies the strategy to use in launching Spot Instance fleets. For example, “capacity-optimized” launches instances from Spot Instance pools with optimal capacity for the number of instances that are launching.
Attributes
-
CAPACITY_OPTIMIZED
¶ Capacity-optimized, which launches instances from Spot Instance pools with optimal capacity for the number of instances that are launching.
SpotProvisioningSpecificationProperty¶
-
class
EmrCreateCluster.
SpotProvisioningSpecificationProperty
(*, timeout_action, timeout_duration_minutes, allocation_strategy=None, block_duration_minutes=None)¶ Bases:
object
The launch specification for Spot instances in the instance fleet, which determines the defined duration and provisioning timeout behavior.
- Parameters
timeout_action (SpotTimeoutAction) – The action to take when TargetSpotCapacity has not been fulfilled when the TimeoutDurationMinutes has expired.
timeout_duration_minutes (Union[int, float]) – The spot provisioning timeout period in minutes.
allocation_strategy (Optional[SpotAllocationStrategy]) – Specifies the strategy to use in launching Spot Instance fleets. Default: - No allocation strategy, i.e. spot instance type will be chosen based on current price only
block_duration_minutes (Union[int, float, None]) – The defined duration for Spot instances (also known as Spot blocks) in minutes. Default: - No blockDurationMinutes
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_SpotProvisioningSpecification.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks

spot_provisioning_specification_property = stepfunctions_tasks.EmrCreateCluster.SpotProvisioningSpecificationProperty(
    timeout_action=stepfunctions_tasks.EmrCreateCluster.SpotTimeoutAction.SWITCH_TO_ON_DEMAND,
    timeout_duration_minutes=123,
    # the properties below are optional
    allocation_strategy=stepfunctions_tasks.EmrCreateCluster.SpotAllocationStrategy.CAPACITY_OPTIMIZED,
    block_duration_minutes=123
)
Attributes
-
allocation_strategy
¶ Specifies the strategy to use in launching Spot Instance fleets.
- Default
No allocation strategy, i.e. spot instance type will be chosen based on current price only
- Return type
Optional
[SpotAllocationStrategy
]
-
block_duration_minutes
¶ The defined duration for Spot instances (also known as Spot blocks) in minutes.
- Default
No blockDurationMinutes
- Return type
Union
[int
,float
,None
]
-
timeout_action
¶ The action to take when TargetSpotCapacity has not been fulfilled when the TimeoutDurationMinutes has expired.
- Return type
SpotTimeoutAction
-
timeout_duration_minutes
¶ The spot provisioning timeout period in minutes.
- Return type
Union
[int
,float
]
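How timeout_action and timeout_duration_minutes work together can be modelled as a single decision. In the sketch below, the function name `provisioning_outcome` and the labels "FULFILLED" and "WAITING" are hypothetical and used only for illustration; the timeout action is passed as a plain string such as "SWITCH_TO_ON_DEMAND". It is a reading of the descriptions above, not EMR's provisioning logic.

```python
def provisioning_outcome(
    elapsed_minutes: int,
    timeout_duration_minutes: int,
    fulfilled_spot_capacity: int,
    target_spot_capacity: int,
    timeout_action: str,
) -> str:
    """Return what happens to a Spot request at a given point in time."""
    if fulfilled_spot_capacity >= target_spot_capacity:
        return "FULFILLED"  # target reached, the timeout no longer matters
    if elapsed_minutes < timeout_duration_minutes:
        return "WAITING"  # still inside the provisioning window
    # Timeout expired with the target unmet: the configured action applies.
    return timeout_action


print(provisioning_outcome(5, 20, 3, 10, "SWITCH_TO_ON_DEMAND"))
print(provisioning_outcome(20, 20, 3, 10, "SWITCH_TO_ON_DEMAND"))
```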
SpotTimeoutAction¶
VolumeSpecificationProperty¶
-
class
EmrCreateCluster.
VolumeSpecificationProperty
(*, volume_size, volume_type, iops=None)¶ Bases:
object
EBS volume specifications such as volume type, IOPS, and size (GiB) that will be requested for the EBS volume attached to an EC2 instance in the cluster.
- Parameters
volume_size (Size) – The volume size. If the volume type is EBS-optimized, the minimum value is 10GiB. Maximum size is 1TiB.
volume_type (EbsBlockDeviceVolumeType) – The volume type. Volume types supported are gp2, io1, standard.
iops (Union[int, float, None]) – The number of I/O operations per second (IOPS) that the volume supports. Default: - EMR selected default
- See
https://docs.aws.amazon.com/emr/latest/APIReference/API_VolumeSpecification.html
- ExampleMetadata
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
import aws_cdk.aws_stepfunctions_tasks as stepfunctions_tasks
import aws_cdk.core as cdk

# size: cdk.Size

volume_specification_property = stepfunctions_tasks.EmrCreateCluster.VolumeSpecificationProperty(
    volume_size=size,
    volume_type=stepfunctions_tasks.EmrCreateCluster.EbsBlockDeviceVolumeType.GP2,
    # the properties below are optional
    iops=123
)
Attributes
-
iops
¶ The number of I/O operations per second (IOPS) that the volume supports.
- Default
EMR selected default
- Return type
Union
[int
,float
,None
]
-
volume_size
¶ The volume size.
If the volume type is EBS-optimized, the minimum value is 10GiB. Maximum size is 1TiB.
- Return type
Size
-
volume_type
¶ The volume type.
Volume types supported are gp2, io1, standard.
- Return type
EbsBlockDeviceVolumeType
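The size limits stated for volume_size (at least 10 GiB when EBS-optimized, at most 1 TiB) can be checked before a value is passed on. The sketch below works on a plain GiB number rather than a Size object; `is_valid_volume_size` and its `ebs_optimized` flag are hypothetical, and 1 TiB is taken as 1024 GiB.

```python
MIN_EBS_OPTIMIZED_GIB = 10  # minimum stated for EBS-optimized volumes
MAX_VOLUME_GIB = 1024       # 1 TiB expressed in GiB


def is_valid_volume_size(size_gib: int, ebs_optimized: bool = False) -> bool:
    """Check ``size_gib`` against the limits stated in the volume_size description."""
    if size_gib <= 0 or size_gib > MAX_VOLUME_GIB:
        return False
    if ebs_optimized and size_gib < MIN_EBS_OPTIMIZED_GIB:
        return False
    return True


print(is_valid_volume_size(5, ebs_optimized=True))
print(is_valid_volume_size(100, ebs_optimized=True))
print(is_valid_volume_size(2048))
```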