AWS Glue
Developer Guide

Jobs

Data Types

Job Structure

Specifies a job definition.

Fields

  • Name – String, matching the Single-line string pattern.

    The name you assign to this job definition.

  • Description – Description string, matching the URI address multi-line string pattern.

    Description of the job being defined.

  • LogUri – String.

    This field is reserved for future use.

  • Role – String.

    The name or ARN of the IAM role associated with this job.

  • CreatedOn – Timestamp.

    The time and date that this job definition was created.

  • LastModifiedOn – Timestamp.

    The last point in time when this job definition was modified.

  • ExecutionProperty – An ExecutionProperty object.

    An ExecutionProperty specifying the maximum number of concurrent runs allowed for this job.

  • Command – A JobCommand object.

    The JobCommand that executes this job.

  • DefaultArguments – An array of UTF-8 string–to–UTF-8 string mappings.

    The default arguments for this job, specified as name-value pairs.

    You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes.

    For information about how to specify and consume your own job arguments, see the Calling AWS Glue APIs in Python topic in the developer guide.

    For information about the key-value pairs that AWS Glue consumes to set up your job, see the Special Parameters Used by AWS Glue topic in the developer guide.

  • Connections – A ConnectionsList object.

    The connections used for this job.

  • MaxRetries – Number (integer).

    The maximum number of times to retry this job after a JobRun fails.

  • AllocatedCapacity – Number (integer).

    The number of AWS Glue data processing units (DPUs) allocated to runs of this job. From 2 to 100 DPUs can be allocated; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.

  • Timeout – Number (integer).

    The job timeout in minutes. This is the maximum time that a job run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).
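
The capacity and timeout figures above reduce to simple arithmetic. The following is a minimal sketch, assuming only the limits stated in the field descriptions (2 to 100 DPUs, 4 vCPUs and 16 GB of memory per DPU, a default timeout of 2,880 minutes). The helper names `dpu_resources` and `timeout_hours` are illustrative and are not part of the AWS Glue API.

```python
# Illustrative helpers only; these names are not part of the AWS Glue API.
VCPUS_PER_DPU = 4            # from the AllocatedCapacity description
MEMORY_GB_PER_DPU = 16       # from the AllocatedCapacity description
DEFAULT_DPUS = 10            # documented default for AllocatedCapacity
DEFAULT_TIMEOUT_MINUTES = 2880  # documented default for Timeout


def dpu_resources(allocated_capacity=DEFAULT_DPUS):
    """Return (vCPUs, memory in GB) for a given AllocatedCapacity."""
    if not 2 <= allocated_capacity <= 100:
        raise ValueError("AllocatedCapacity must be between 2 and 100 DPUs")
    return (allocated_capacity * VCPUS_PER_DPU,
            allocated_capacity * MEMORY_GB_PER_DPU)


def timeout_hours(timeout_minutes=DEFAULT_TIMEOUT_MINUTES):
    """Convert a Timeout value in minutes to hours."""
    return timeout_minutes / 60


print(dpu_resources())   # resources for the default capacity of 10 DPUs
print(timeout_hours())   # the default timeout expressed in hours
```

With the defaults, this works out to 40 vCPUs and 160 GB of memory, and a timeout of 48 hours.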

ExecutionProperty Structure

An execution property of a job.

Fields

  • MaxConcurrentRuns – Number (integer).

    The maximum number of concurrent runs allowed for the job. The default is 1. An error is returned when this threshold is reached. The maximum value you can specify is controlled by a service limit.

JobCommand Structure

Specifies code executed when a job is run.

Fields

  • Name – String.

    The name of the job command. This must be glueetl.

  • ScriptLocation – String.

    Specifies the S3 path to a script that executes a job (required).
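
As a rough illustration, a JobCommand can be represented in Python as a plain mapping with the two fields above. This sketch assumes the S3 path is written as an `s3://` URI; the helper name `make_job_command` and the bucket and key shown are hypothetical placeholders.

```python
def make_job_command(script_location):
    """Build a JobCommand mapping for a script stored in Amazon S3."""
    # Assumption: the S3 path is given in s3://bucket/key form.
    if not script_location.startswith("s3://"):
        raise ValueError("ScriptLocation must be an S3 path")
    # Name must be glueetl, per the field description above.
    return {"Name": "glueetl", "ScriptLocation": script_location}


# The bucket and key below are placeholders, not real resources.
command = make_job_command("s3://example-bucket/scripts/etl_job.py")
print(command)
```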

ConnectionsList Structure

Specifies the connections used by a job.

Fields

  • Connections – An array of UTF-8 strings.

    A list of connections used by the job.

JobUpdate Structure

Specifies information used to update an existing job definition. The previous job definition is completely overwritten by this information.

Fields

  • Description – Description string, matching the URI address multi-line string pattern.

    Description of the job being defined.

  • LogUri – String.

    This field is reserved for future use.

  • Role – String.

    The name or ARN of the IAM role associated with this job (required).

  • ExecutionProperty – An ExecutionProperty object.

    An ExecutionProperty specifying the maximum number of concurrent runs allowed for this job.

  • Command – A JobCommand object.

    The JobCommand that executes this job (required).

  • DefaultArguments – An array of UTF-8 string–to–UTF-8 string mappings.

    The default arguments for this job.

    You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes.

    For information about how to specify and consume your own job arguments, see the Calling AWS Glue APIs in Python topic in the developer guide.

    For information about the key-value pairs that AWS Glue consumes to set up your job, see the Special Parameters Used by AWS Glue topic in the developer guide.

  • Connections – A ConnectionsList object.

    The connections used for this job.

  • MaxRetries – Number (integer).

    The maximum number of times to retry this job if it fails.

  • AllocatedCapacity – Number (integer).

    The number of AWS Glue data processing units (DPUs) to allocate to this job. From 2 to 100 DPUs can be allocated; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.

  • Timeout – Number (integer).

    The job timeout in minutes. This is the maximum time that a job run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).
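
Because a JobUpdate replaces the whole job definition, one common approach is to start from the existing Job and carry every field over before applying changes. The sketch below illustrates that idea with plain mappings. The helper name `job_to_update` and all sample values are hypothetical, and the list of fields to drop is inferred from comparing the Job and JobUpdate field lists above.

```python
# Fields that appear in Job but not in JobUpdate, per the field lists above.
READ_ONLY_FIELDS = ("Name", "CreatedOn", "LastModifiedOn")


def job_to_update(job, **changes):
    """Copy an existing Job mapping into a JobUpdate mapping, then apply changes.

    Because an update overwrites the whole definition, every field that
    should survive is carried over explicitly.
    """
    update = {k: v for k, v in job.items() if k not in READ_ONLY_FIELDS}
    update.update(changes)
    # Role and Command are marked as required in JobUpdate.
    for required in ("Role", "Command"):
        if required not in update:
            raise ValueError(f"JobUpdate requires {required}")
    return update


existing_job = {
    "Name": "example-job",  # placeholder values throughout
    "Role": "ExampleGlueRole",
    "Command": {"Name": "glueetl",
                "ScriptLocation": "s3://example-bucket/scripts/etl_job.py"},
    "MaxRetries": 1,
    "Timeout": 2880,
}
job_update = job_to_update(existing_job, Timeout=60)
print(job_update)
```

Here only Timeout changes, while Role, Command, and MaxRetries are preserved in the resulting JobUpdate.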

Operations

CreateJob Action (Python: create_job)

Creates a new job definition.

Request

  • Name – String, matching the Single-line string pattern. Required.

    The name you assign to this job definition. It must be unique in your account.

  • Description – Description string, matching the URI address multi-line string pattern.

    Description of the job being defined.

  • LogUri – String.

    This field is reserved for future use.

  • Role – String. Required.

    The name or ARN of the IAM role associated with this job.

  • ExecutionProperty – An ExecutionProperty object.

    An ExecutionProperty specifying the maximum number of concurrent runs allowed for this job.

  • Command – A JobCommand object. Required.

    The JobCommand that executes this job.

  • DefaultArguments – An array of UTF-8 string–to–UTF-8 string mappings.

    The default arguments for this job.

    You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes.

    For information about how to specify and consume your own job arguments, see the Calling AWS Glue APIs in Python topic in the developer guide.

    For information about the key-value pairs that AWS Glue consumes to set up your job, see the Special Parameters Used by AWS Glue topic in the developer guide.

  • Connections – A ConnectionsList object.

    The connections used for this job.

  • MaxRetries – Number (integer).

    The maximum number of times to retry this job if it fails.

  • AllocatedCapacity – Number (integer).

    The number of AWS Glue data processing units (DPUs) to allocate to this job. From 2 to 100 DPUs can be allocated; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.

  • Timeout – Number (integer).

    The job timeout in minutes. This is the maximum time that a job run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).
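
The request fields above map directly onto keyword arguments of the Python `create_job` call. The following sketch assembles such a request as a plain dictionary so that it can be inspected before it is sent. The helper name `build_create_job_request`, the job, role, bucket, and connection names, and the `--input_path` script argument are all hypothetical. Defaults for capacity, timeout, and concurrent runs are taken from the field descriptions.

```python
def build_create_job_request(name, role, script_location, *,
                             default_arguments=None, connections=None,
                             max_retries=None, max_concurrent_runs=1,
                             allocated_capacity=10, timeout=2880):
    """Assemble keyword arguments for create_job from the fields above."""
    request = {
        "Name": name,                      # required
        "Role": role,                      # required
        "Command": {                       # required
            "Name": "glueetl",
            "ScriptLocation": script_location,
        },
        "ExecutionProperty": {"MaxConcurrentRuns": max_concurrent_runs},
        "AllocatedCapacity": allocated_capacity,
        "Timeout": timeout,
    }
    # Optional fields are included only when the caller supplies them.
    if default_arguments:
        request["DefaultArguments"] = dict(default_arguments)
    if connections:
        request["Connections"] = {"Connections": list(connections)}
    if max_retries is not None:
        request["MaxRetries"] = max_retries
    return request


request = build_create_job_request(
    "example-job", "ExampleGlueRole", "s3://example-bucket/scripts/etl_job.py",
    default_arguments={"--input_path": "s3://example-bucket/input/"},
    connections=["example-connection"],
)
print(request)

# With boto3 installed and credentials configured, the call would be:
#   import boto3
#   glue = boto3.client("glue")
#   glue.create_job(**request)
```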

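Response

  • Name – String, matching the Single-line string pattern.

    The unique name that was provided for this job definition.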

Errors

  • InvalidInputException

  • IdempotentParameterMismatchException

  • AlreadyExistsException

  • InternalServiceException

  • OperationTimeoutException

  • ResourceNumberLimitExceededException

  • ConcurrentModificationException
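
One way to handle the AlreadyExistsException listed above is to treat it as a signal that the job definition is already in place. The sketch below assumes `glue` behaves like a boto3 Glue client, which exposes modeled service errors under `glue.exceptions`. The helper name `create_job_if_absent` is hypothetical.

```python
def create_job_if_absent(glue, **request):
    """Call create_job and report whether a new job definition was created.

    `glue` is expected to behave like a boto3 Glue client, which exposes
    modeled errors such as AlreadyExistsException under `glue.exceptions`.
    """
    try:
        glue.create_job(**request)
    except glue.exceptions.AlreadyExistsException:
        return False  # a job definition with this name already exists
    return True       # a new job definition was created
```

Other errors in the list, such as InvalidInputException, are deliberately left to propagate, since they usually indicate a problem with the request itself.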

UpdateJob Action (Python: update_job)

Updates an existing job definition.

Request

  • JobName – String, matching the Single-line string pattern. Required.

    Name of the job definition to update.

  • JobUpdate – A JobUpdate object. Required.

    Specifies the values with which to update the job definition.

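Response

  • JobName – String, matching the Single-line string pattern.

    The name of the job definition that was updated.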

Errors

  • InvalidInputException

  • EntityNotFoundException

  • InternalServiceException

  • OperationTimeoutException

  • ConcurrentModificationException

GetJob Action (Python: get_job)

Retrieves an existing job definition.

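Request

  • JobName – String, matching the Single-line string pattern. Required.

    The name of the job definition to retrieve.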

Response

  • Job – A Job object.

    The requested job definition.

Errors

  • InvalidInputException

  • EntityNotFoundException

  • InternalServiceException

  • OperationTimeoutException

GetJobs Action (Python: get_jobs)

Retrieves all current job definitions.

Request

  • NextToken – String.

    A continuation token, if this is a continuation call.

  • MaxResults – Number (integer).

    The maximum number of job definitions to return in the response.

Response

  • Jobs – An array of Jobs.

    A list of job definitions.

  • NextToken – String.

    A continuation token, if not all job definitions have yet been returned.
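
The NextToken fields in the request and response support a simple pagination loop: call `get_jobs`, consume the returned Jobs, and repeat with the returned NextToken until none is returned. The following is a minimal sketch of that loop, assuming `glue` behaves like a boto3 Glue client. The helper name `iter_jobs` and its `page_size` parameter are hypothetical.

```python
def iter_jobs(glue, page_size=None):
    """Yield every job definition, following NextToken until it is absent."""
    kwargs = {}
    if page_size is not None:
        kwargs["MaxResults"] = page_size
    while True:
        page = glue.get_jobs(**kwargs)
        yield from page.get("Jobs", [])
        token = page.get("NextToken")
        if not token:
            break  # no continuation token: all definitions were returned
        kwargs["NextToken"] = token  # continuation call for the next page
```

A caller could then write `for job in iter_jobs(glue): print(job["Name"])` without handling tokens directly.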

Errors

  • InvalidInputException

  • EntityNotFoundException

  • InternalServiceException

  • OperationTimeoutException

DeleteJob Action (Python: delete_job)

Deletes a specified job definition. If the job definition is not found, no exception is thrown.

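Request

  • JobName – String, matching the Single-line string pattern. Required.

    The name of the job definition to delete.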

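Response

  • JobName – String, matching the Single-line string pattern.

    The name of the job definition that was deleted.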

Errors

  • InvalidInputException

  • InternalServiceException

  • OperationTimeoutException
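
Because DeleteJob does not raise an exception for a missing job definition, cleanup code can call it without first checking that the job exists. The sketch below assumes `glue` behaves like a boto3 Glue client whose `delete_job` call takes a `JobName` keyword argument. The helper name `delete_jobs` is hypothetical.

```python
def delete_jobs(glue, job_names):
    """Delete each named job definition and return the names processed.

    A name that does not match any job definition is not treated as an
    error, so running this twice with the same names is safe.
    """
    processed = []
    for name in job_names:
        response = glue.delete_job(JobName=name)
        # Fall back to the requested name if the response omits JobName.
        processed.append(response.get("JobName", name))
    return processed
```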