JobExecutable

class aws_cdk.aws_glue_alpha.JobExecutable(*args: Any, **kwargs)

Bases: object

(experimental) The executable properties related to the Glue job’s GlueVersion, JobType and code.

Stability:

experimental

Example:

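from os import path
import aws_cdk.aws_glue_alpha as glue

# Run inside a Stack or Construct, where `self` is the construct scope.
glue.Job(self, "EnableSparkUI",
    job_name="EtlJobWithSparkUIPrefix",
    spark_ui=glue.SparkUIProps(
        enabled=True
    ),
    executable=glue.JobExecutable.python_etl(
        glue_version=glue.GlueVersion.V3_0,
        python_version=glue.PythonVersion.THREE,
        # Resolve the script relative to this source file.
        script=glue.Code.from_asset(path.join(path.dirname(__file__), "job-script", "hello_world.py"))
    )
)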

Methods

bind()

(experimental) Called during Job initialization to get JobExecutableConfig.

Stability:

experimental

Return type:

JobExecutableConfig

Static Methods

classmethod of(*, glue_version, language, script, type, class_name=None, extra_files=None, extra_jars=None, extra_jars_first=None, extra_python_files=None, python_version=None, runtime=None, s3_python_modules=None)

(experimental) Create a custom JobExecutable.

Parameters:
  • glue_version (GlueVersion) – (experimental) Glue version.

  • language (JobLanguage) – (experimental) The language of the job (Scala or Python). Equivalent to a job parameter --job-language.

  • script (Code) – (experimental) The script that is executed by a job.

  • type (JobType) – (experimental) The type of the job, for example an Apache Spark ETL job, an Apache Spark streaming job, or a Python shell job.

  • class_name (Optional[str]) – (experimental) The Scala class that serves as the entry point for the job. This applies only if the job language is Scala. Equivalent to a job parameter --class. Default: - no Scala className specified

  • extra_files (Optional[Sequence[Code]]) – (experimental) Additional files, such as configuration files, that AWS Glue copies to the working directory of your script before executing it. Equivalent to a job parameter --extra-files. Default: - no extra files specified.

  • extra_jars (Optional[Sequence[Code]]) – (experimental) Additional Java .jar files that AWS Glue adds to the Java classpath before executing your script. Equivalent to a job parameter --extra-jars. Default: - no extra jars specified.

  • extra_jars_first (Optional[bool]) – (experimental) Setting this value to true prioritizes the customer’s extra JAR files in the classpath. Equivalent to a job parameter --user-jars-first. Default: - extra jars are not prioritized.

  • extra_python_files (Optional[Sequence[Code]]) – (experimental) Additional Python files that AWS Glue adds to the Python path before executing your script. Equivalent to a job parameter --extra-py-files. Default: - no extra python files specified.

  • python_version (Optional[PythonVersion]) – (experimental) The Python version to use. Default: - no python version specified

  • runtime (Optional[Runtime]) – (experimental) The Runtime to use. Default: - no runtime specified

  • s3_python_modules (Optional[Sequence[Code]]) – (experimental) Additional Python modules that AWS Glue adds to the Python path before executing your script. Equivalent to a job parameter --s3-py-modules. Default: - no extra Python modules specified.

Stability:

experimental

Return type:

JobExecutable
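The parameter descriptions above pair several options with an equivalent Glue job parameter. The sketch below spells out that pairing. It is an illustration only, not the library's implementation: equivalent_job_arguments is a hypothetical helper, it takes plain S3 URL strings instead of Code objects, the bucket and class names are made up, and joining multiple values with commas is an assumption about the value format.

```python
from typing import Dict, Optional, Sequence


def equivalent_job_arguments(
    language: str,
    class_name: Optional[str] = None,
    extra_files: Optional[Sequence[str]] = None,
    extra_jars: Optional[Sequence[str]] = None,
    extra_jars_first: Optional[bool] = None,
    extra_python_files: Optional[Sequence[str]] = None,
    s3_python_modules: Optional[Sequence[str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: map executable options to Glue job parameters."""
    args: Dict[str, str] = {"--job-language": language}
    if class_name:
        args["--class"] = class_name
    # List-valued options become comma-separated S3 URLs (assumed format).
    list_options = {
        "--extra-files": extra_files,
        "--extra-jars": extra_jars,
        "--extra-py-files": extra_python_files,
        "--s3-py-modules": s3_python_modules,
    }
    for flag, values in list_options.items():
        if values:
            args[flag] = ",".join(values)
    # Options left unset are omitted, matching the "not specified" defaults.
    if extra_jars_first:
        args["--user-jars-first"] = "true"
    return args


example = equivalent_job_arguments(
    language="scala",
    class_name="com.example.HelloWorld",
    extra_jars=["s3://my-bucket/libs/a.jar", "s3://my-bucket/libs/b.jar"],
    extra_jars_first=True,
)
print(example)
```

Options that are not passed do not appear in the result, which mirrors the documented defaults where no argument is set.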

classmethod python_etl(*, glue_version, python_version, script, extra_files=None, extra_jars=None, extra_jars_first=None, extra_python_files=None, runtime=None)

(experimental) Create Python executable props for an Apache Spark ETL job.

Parameters:
  • glue_version (GlueVersion) – (experimental) Glue version.

  • python_version (PythonVersion) – (experimental) The Python version to use.

  • script (Code) – (experimental) The script that executes a job.

  • extra_files (Optional[Sequence[Code]]) – (experimental) Additional files, such as configuration files, that AWS Glue copies to the working directory of your script before executing it. Only individual files are supported, directories are not supported. Equivalent to a job parameter --extra-files. Default: [] - no extra files are copied to the working directory

  • extra_jars (Optional[Sequence[Code]]) – (experimental) Additional Java .jar files that AWS Glue adds to the Java classpath before executing your script. Only individual files are supported, directories are not supported. Equivalent to a job parameter --extra-jars. Default: [] - no extra jars are added to the classpath

  • extra_jars_first (Optional[bool]) – (experimental) Setting this value to true prioritizes the customer’s extra JAR files in the classpath. Equivalent to a job parameter --user-jars-first. Default: false - priority is not given to user-provided jars

  • extra_python_files (Optional[Sequence[Code]]) – (experimental) Additional Python files that AWS Glue adds to the Python path before executing your script. Only individual files are supported, directories are not supported. Equivalent to a job parameter --extra-py-files. Default: - no extra python files and argument is not set

  • runtime (Optional[Runtime]) – (experimental) Runtime. It is required for Ray jobs.

Stability:

experimental

Return type:

JobExecutable

classmethod python_ray(*, glue_version, python_version, script, extra_files=None, runtime=None, s3_python_modules=None)

(experimental) Create Python executable props for Ray jobs.

Parameters:
  • glue_version (GlueVersion) – (experimental) Glue version.

  • python_version (PythonVersion) – (experimental) The Python version to use.

  • script (Code) – (experimental) The script that executes a job.

  • extra_files (Optional[Sequence[Code]]) – (experimental) Additional files, such as configuration files, that AWS Glue copies to the working directory of your script before executing it. Only individual files are supported, directories are not supported. Equivalent to a job parameter --extra-files. Default: [] - no extra files are copied to the working directory

  • runtime (Optional[Runtime]) – (experimental) Runtime. It is required for Ray jobs.

  • s3_python_modules (Optional[Sequence[Code]]) – (experimental) Additional Python modules that AWS Glue adds to the Python path before executing your script. Equivalent to a job parameter --s3-py-modules. Default: - no extra Python modules and argument is not set

Stability:

experimental

Return type:

JobExecutable
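A minimal sketch of a python_ray call, using only the keywords listed in the signature above. The enum members GlueVersion.V4_0, PythonVersion.THREE_NINE and Runtime.RAY_TWO_FOUR, the bucket name, and the object keys are assumptions chosen for illustration; check them against the version of the module you have installed. build_ray_executable is a hypothetical wrapper, and the imports sit inside it so the definition loads even where the CDK packages are absent.

```python
def build_ray_executable(scope):
    """Return a Ray JobExecutable; `scope` is any CDK construct (for example a Stack)."""
    # Deferred imports: the CDK packages are only needed when this is called.
    import aws_cdk.aws_glue_alpha as glue
    import aws_cdk.aws_s3 as s3

    # Hypothetical bucket that already holds the script and module archive.
    bucket = s3.Bucket.from_bucket_name(scope, "RayScripts", "my-script-bucket")

    return glue.JobExecutable.python_ray(
        glue_version=glue.GlueVersion.V4_0,            # assumed enum member
        python_version=glue.PythonVersion.THREE_NINE,  # assumed enum member
        runtime=glue.Runtime.RAY_TWO_FOUR,             # runtime is required for Ray jobs
        script=glue.Code.from_bucket(bucket, "jobs/ray_job.py"),
        s3_python_modules=[glue.Code.from_bucket(bucket, "modules/helpers.zip")],
    )
```

The returned object is meant to be passed as the executable argument of glue.Job, as in the example at the top of this page.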

classmethod python_shell(*, glue_version, python_version, script, extra_files=None, extra_python_files=None, runtime=None)

(experimental) Create Python executable props for Python shell jobs.

Parameters:
  • glue_version (GlueVersion) – (experimental) Glue version.

  • python_version (PythonVersion) – (experimental) The Python version to use.

  • script (Code) – (experimental) The script that executes a job.

  • extra_files (Optional[Sequence[Code]]) – (experimental) Additional files, such as configuration files, that AWS Glue copies to the working directory of your script before executing it. Only individual files are supported, directories are not supported. Equivalent to a job parameter --extra-files. Default: [] - no extra files are copied to the working directory

  • extra_python_files (Optional[Sequence[Code]]) – (experimental) Additional Python files that AWS Glue adds to the Python path before executing your script. Only individual files are supported, directories are not supported. Equivalent to a job parameter --extra-py-files. Default: - no extra python files and argument is not set

  • runtime (Optional[Runtime]) – (experimental) Runtime. It is required for Ray jobs.

Stability:

experimental

Return type:

JobExecutable

classmethod python_streaming(*, glue_version, python_version, script, extra_files=None, extra_jars=None, extra_jars_first=None, extra_python_files=None, runtime=None)

(experimental) Create Python executable props for an Apache Spark Streaming job.

Parameters:
  • glue_version (GlueVersion) – (experimental) Glue version.

  • python_version (PythonVersion) – (experimental) The Python version to use.

  • script (Code) – (experimental) The script that executes a job.

  • extra_files (Optional[Sequence[Code]]) – (experimental) Additional files, such as configuration files, that AWS Glue copies to the working directory of your script before executing it. Only individual files are supported, directories are not supported. Equivalent to a job parameter --extra-files. Default: [] - no extra files are copied to the working directory

  • extra_jars (Optional[Sequence[Code]]) – (experimental) Additional Java .jar files that AWS Glue adds to the Java classpath before executing your script. Only individual files are supported, directories are not supported. Equivalent to a job parameter --extra-jars. Default: [] - no extra jars are added to the classpath

  • extra_jars_first (Optional[bool]) – (experimental) Setting this value to true prioritizes the customer’s extra JAR files in the classpath. Equivalent to a job parameter --user-jars-first. Default: false - priority is not given to user-provided jars

  • extra_python_files (Optional[Sequence[Code]]) – (experimental) Additional Python files that AWS Glue adds to the Python path before executing your script. Only individual files are supported, directories are not supported. Equivalent to a job parameter --extra-py-files. Default: - no extra python files and argument is not set

  • runtime (Optional[Runtime]) – (experimental) Runtime. It is required for Ray jobs.

Stability:

experimental

Return type:

JobExecutable

classmethod scala_etl(*, class_name, glue_version, script, extra_files=None, extra_jars=None, extra_jars_first=None, runtime=None)

(experimental) Create Scala executable props for an Apache Spark ETL job.

Parameters:
  • class_name (str) – (experimental) The fully qualified Scala class name that serves as the entry point for the job. Equivalent to a job parameter --class.

  • glue_version (GlueVersion) – (experimental) Glue version.

  • script (Code) – (experimental) The script that executes a job.

  • extra_files (Optional[Sequence[Code]]) – (experimental) Additional files, such as configuration files, that AWS Glue copies to the working directory of your script before executing it. Only individual files are supported, directories are not supported. Equivalent to a job parameter --extra-files. Default: [] - no extra files are copied to the working directory

  • extra_jars (Optional[Sequence[Code]]) – (experimental) Additional Java .jar files that AWS Glue adds to the Java classpath before executing your script. Only individual files are supported, directories are not supported. Equivalent to a job parameter --extra-jars. Default: [] - no extra jars are added to the classpath

  • extra_jars_first (Optional[bool]) – (experimental) Setting this value to true prioritizes the customer’s extra JAR files in the classpath. Equivalent to a job parameter --user-jars-first. Default: false - priority is not given to user-provided jars

  • runtime (Optional[Runtime]) – (experimental) Runtime. It is required for Ray jobs.

Stability:

experimental

Return type:

JobExecutable
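A minimal sketch of a scala_etl call that also prioritizes user-provided jars. The bucket name, object keys, and class name are hypothetical, and build_scala_executable is a wrapper introduced here for illustration. Imports are deferred into the function so the definition loads even where the CDK packages are absent.

```python
def build_scala_executable(scope):
    """Return a Scala ETL JobExecutable; `scope` is any CDK construct (for example a Stack)."""
    # Deferred imports: the CDK packages are only needed when this is called.
    import aws_cdk.aws_glue_alpha as glue
    import aws_cdk.aws_s3 as s3

    # Hypothetical bucket that already holds the script and the extra jar.
    bucket = s3.Bucket.from_bucket_name(scope, "ScalaScripts", "my-script-bucket")

    return glue.JobExecutable.scala_etl(
        # Fully qualified entry point, equivalent to the --class job parameter.
        class_name="com.example.HelloWorld",
        glue_version=glue.GlueVersion.V3_0,
        script=glue.Code.from_bucket(bucket, "jobs/HelloWorld.scala"),
        # Individual jar files only; directories are not supported.
        extra_jars=[glue.Code.from_bucket(bucket, "libs/helper.jar")],
        # Equivalent to the --user-jars-first job parameter.
        extra_jars_first=True,
    )
```

Calling the function requires aws-cdk-lib and a release of aws-cdk.aws-glue-alpha that exposes JobExecutable.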

classmethod scala_streaming(*, class_name, glue_version, script, extra_files=None, extra_jars=None, extra_jars_first=None, runtime=None)

(experimental) Create Scala executable props for an Apache Spark Streaming job.

Parameters:
  • class_name (str) – (experimental) The fully qualified Scala class name that serves as the entry point for the job. Equivalent to a job parameter --class.

  • glue_version (GlueVersion) – (experimental) Glue version.

  • script (Code) – (experimental) The script that executes a job.

  • extra_files (Optional[Sequence[Code]]) – (experimental) Additional files, such as configuration files, that AWS Glue copies to the working directory of your script before executing it. Only individual files are supported, directories are not supported. Equivalent to a job parameter --extra-files. Default: [] - no extra files are copied to the working directory

  • extra_jars (Optional[Sequence[Code]]) – (experimental) Additional Java .jar files that AWS Glue adds to the Java classpath before executing your script. Only individual files are supported, directories are not supported. Equivalent to a job parameter --extra-jars. Default: [] - no extra jars are added to the classpath

  • extra_jars_first (Optional[bool]) – (experimental) Setting this value to true prioritizes the customer’s extra JAR files in the classpath. Equivalent to a job parameter --user-jars-first. Default: false - priority is not given to user-provided jars

  • runtime (Optional[Runtime]) – (experimental) Runtime. It is required for Ray jobs.

Stability:

experimental

Return type:

JobExecutable
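The factory methods above accept different sets of optional keywords. The sketch below collects them from the signatures on this page so that a set of options can be checked against a factory before calling it. It is a standalone lookup table, not part of the library, and unsupported_keywords is a hypothetical helper name.

```python
from typing import Dict, FrozenSet, Iterable, List

# Optional keywords copied from the signatures documented on this page.
_SPARK_PYTHON = frozenset(
    {"extra_files", "extra_jars", "extra_jars_first", "extra_python_files", "runtime"}
)
_SPARK_SCALA = frozenset({"extra_files", "extra_jars", "extra_jars_first", "runtime"})

OPTIONAL_KEYWORDS: Dict[str, FrozenSet[str]] = {
    "python_etl": _SPARK_PYTHON,
    "python_streaming": _SPARK_PYTHON,
    "python_shell": frozenset({"extra_files", "extra_python_files", "runtime"}),
    "python_ray": frozenset({"extra_files", "runtime", "s3_python_modules"}),
    "scala_etl": _SPARK_SCALA,
    "scala_streaming": _SPARK_SCALA,
}


def unsupported_keywords(factory: str, keywords: Iterable[str]) -> List[str]:
    """Return, sorted, the keywords that the named factory does not list as optional."""
    return sorted(set(keywords) - OPTIONAL_KEYWORDS[factory])


shell_result = unsupported_keywords("python_shell", ["extra_files", "extra_jars"])
ray_result = unsupported_keywords("python_ray", ["s3_python_modules", "extra_python_files"])
print(shell_result)  # ['extra_jars']
print(ray_result)    # ['extra_python_files']
```

When an option is not accepted by any specific factory for the job at hand, the of method documented earlier takes the full set of keywords.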