JobDriver

class aws_cdk.aws_stepfunctions_tasks.JobDriver(*, spark_submit_job_driver)

Bases: object

Specify the driver that the EMR Containers job runs on.

The job driver supplies the input for the job run, such as the entry point of the Spark application.

Parameters

spark_submit_job_driver (Union[SparkSubmitJobDriver, Dict[str, Any]]) – The job driver parameters for a Spark submit job.

Example:

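from aws_cdk import aws_stepfunctions as sfn
from aws_cdk import aws_stepfunctions_tasks as tasks

# Runs inside a Stack or Construct, where `self` is the construct scope.
tasks.EmrContainersStartJobRun(self, "EMR Containers Start Job Run",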
    virtual_cluster=tasks.VirtualClusterInput.from_virtual_cluster_id("de92jdei2910fwedz"),
    release_label=tasks.ReleaseLabel.EMR_6_2_0,
    job_name="EMR-Containers-Job",
    job_driver=tasks.JobDriver(
        spark_submit_job_driver=tasks.SparkSubmitJobDriver(
            entry_point=sfn.TaskInput.from_text("local:///usr/lib/spark/examples/src/main/python/pi.py")
        )
    ),
    application_config=[tasks.ApplicationConfiguration(
        classification=tasks.Classification.SPARK_DEFAULTS,
        properties={
            "spark.executor.instances": "1",
            "spark.executor.memory": "512M"
        }
    )]
)
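
To make the role of the job driver more concrete, the sketch below builds the kind of JobDriver fragment that a Step Functions task for EMR on EKS passes to StartJobRun. It does not use the CDK. The helper name `spark_submit_job_driver_params` is hypothetical, and the exact output that `JobDriver` renders at synthesis time is an assumption here. The field names (EntryPoint, EntryPointArguments, SparkSubmitParameters) follow the SparkSubmitJobDriver API reference in the PascalCase form that Step Functions parameters use.

```python
import json
from typing import Any, Dict, List, Optional


def spark_submit_job_driver_params(
    entry_point: str,
    entry_point_arguments: Optional[List[str]] = None,
    spark_submit_parameters: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper, not part of the CDK.

    Builds a JobDriver fragment with a single SparkSubmitJobDriver entry.
    Only EntryPoint is required; optional fields are left out when unset.
    """
    driver: Dict[str, Any] = {"EntryPoint": entry_point}
    if entry_point_arguments:
        driver["EntryPointArguments"] = entry_point_arguments
    if spark_submit_parameters:
        driver["SparkSubmitParameters"] = spark_submit_parameters
    return {"SparkSubmitJobDriver": driver}


# Same entry point as in the example above; no optional fields set.
fragment = spark_submit_job_driver_params(
    "local:///usr/lib/spark/examples/src/main/python/pi.py"
)
print(json.dumps(fragment, indent=2))
```

The fragment has exactly one key, SparkSubmitJobDriver, mirroring the single `spark_submit_job_driver` parameter that `JobDriver` accepts.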

Attributes

spark_submit_job_driver

The job driver parameters for a Spark submit job.

See

https://docs.aws.amazon.com/emr-on-eks/latest/APIReference/API_SparkSubmitJobDriver.html

Return type

SparkSubmitJobDriver