AWS Glue
Developer Guide

Workflows

The Workflows API describes the data types and operations for creating, updating, and viewing workflows in AWS Glue.

Data Types

JobNodeDetails Structure

The details of a Job node present in the workflow.

Fields

  • JobRuns – An array of JobRun objects.

    The information for the job runs represented by the job node.

CrawlerNodeDetails Structure

The details of a Crawler node present in the workflow.

Fields

  • Crawls – An array of Crawl objects.

    A list of crawls represented by the crawl node.

TriggerNodeDetails Structure

The details of a Trigger node present in the workflow.

Fields

  • Trigger – A Trigger object.

    The information of the trigger represented by the trigger node.

Crawl Structure

The details of a crawl in the workflow.

Fields

  • State – UTF-8 string (valid values: RUNNING | SUCCEEDED | CANCELLED | FAILED).

    The state of the crawler.

  • StartedOn – Timestamp.

    The date and time on which the crawl started.

  • CompletedOn – Timestamp.

    The date and time on which the crawl completed.

  • ErrorMessage – Description string, not more than 2048 bytes long, matching the URI address multi-line string pattern.

    The error message associated with the crawl.

  • LogGroup – UTF-8 string, not less than 1 or more than 512 bytes long, matching the Log group string pattern.

    The log group associated with the crawl.

  • LogStream – UTF-8 string, not less than 1 or more than 512 bytes long, matching the Log-stream string pattern.

    The log stream associated with the crawl.

Node Structure

A node represents an AWS Glue component, such as a trigger, job, or crawler, that is part of a workflow.

Fields

  • Type – UTF-8 string (valid values: CRAWLER | JOB | TRIGGER).

    The type of AWS Glue component represented by the node.

  • Name – UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    The name of the AWS Glue component represented by the node.

  • UniqueId – UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    The unique Id assigned to the node within the workflow.

  • TriggerDetails – A TriggerNodeDetails object.

    Details of the Trigger when the node represents a Trigger.

  • JobDetails – A JobNodeDetails object.

    Details of the Job when the node represents a Job.

  • CrawlerDetails – A CrawlerNodeDetails object.

    Details of the crawler when the node represents a crawler.
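
A node carries one details field that corresponds to its Type. The following is a minimal sketch of selecting that field, assuming the Node arrives as a Python dict keyed by the field names above. The helper name node_details and the sample values are hypothetical.

```python
# Maps each Node Type value to the field that holds its details.
DETAILS_FIELD_BY_TYPE = {
    "TRIGGER": "TriggerDetails",
    "JOB": "JobDetails",
    "CRAWLER": "CrawlerDetails",
}


def node_details(node):
    """Return the details dict matching the node's Type, or None if absent."""
    node_type = node.get("Type")
    field = DETAILS_FIELD_BY_TYPE.get(node_type)
    if field is None:
        raise ValueError(f"unknown node type: {node_type!r}")
    return node.get(field)


# Hypothetical sample node for illustration only.
job_node = {
    "Type": "JOB",
    "Name": "nightly-etl",
    "UniqueId": "node-1",
    "JobDetails": {"JobRuns": []},
}
print(node_details(job_node))
```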

Edge Structure

An edge represents a directed connection between two AWS Glue components which are part of the workflow the edge belongs to.

Fields

  • SourceId – UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    The unique ID of the node within the workflow where the edge starts.

  • DestinationId – UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    The unique ID of the node within the workflow where the edge ends.

WorkflowGraph Structure

A workflow graph represents the complete workflow containing all the AWS Glue components present in the workflow and all the directed connections between them.

Fields

  • Nodes – An array of Node objects.

    A list of the AWS Glue components that belong to the workflow, represented as nodes.

  • Edges – An array of Edge objects.

    A list of all the directed connections between the nodes belonging to the workflow.
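
Because edges refer to nodes by UniqueId, a WorkflowGraph can be turned into an adjacency map for traversal. The sketch below assumes the graph is a Python dict with Nodes and Edges lists. The helper names build_adjacency and start_nodes, and the sample IDs, are hypothetical.

```python
def build_adjacency(graph):
    """Map each node's UniqueId to the UniqueIds its outgoing edges point to."""
    successors = {node["UniqueId"]: [] for node in graph.get("Nodes", [])}
    for edge in graph.get("Edges", []):
        successors.setdefault(edge["SourceId"], []).append(edge["DestinationId"])
    return successors


def start_nodes(graph):
    """Return the UniqueIds of nodes that no edge points to."""
    targets = {edge["DestinationId"] for edge in graph.get("Edges", [])}
    return [n["UniqueId"] for n in graph.get("Nodes", []) if n["UniqueId"] not in targets]


# Hypothetical graph: start trigger -> job -> follow-up trigger -> crawler.
sample_graph = {
    "Nodes": [
        {"Type": "TRIGGER", "Name": "start", "UniqueId": "t1"},
        {"Type": "JOB", "Name": "etl", "UniqueId": "j1"},
        {"Type": "TRIGGER", "Name": "after-etl", "UniqueId": "t2"},
        {"Type": "CRAWLER", "Name": "catalog", "UniqueId": "c1"},
    ],
    "Edges": [
        {"SourceId": "t1", "DestinationId": "j1"},
        {"SourceId": "j1", "DestinationId": "t2"},
        {"SourceId": "t2", "DestinationId": "c1"},
    ],
}
print(build_adjacency(sample_graph))
print(start_nodes(sample_graph))
```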

WorkflowRun Structure

A workflow run is an execution of a workflow providing all the runtime information.

Fields

  • Name – UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    Name of the workflow which was executed.

  • WorkflowRunId – UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    The ID of this workflow run.

  • WorkflowRunProperties – A map array of key-value pairs.

    Each key is a UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    Each value is a UTF-8 string.

    The workflow run properties which were set during the run.

  • StartedOn – Timestamp.

    The date and time when the workflow run was started.

  • CompletedOn – Timestamp.

    The date and time when the workflow run completed.

  • Status – UTF-8 string (valid values: RUNNING | COMPLETED).

    The status of the workflow run.

  • Statistics – A WorkflowRunStatistics object.

    The statistics of the run.

  • Graph – A WorkflowGraph object.

    The graph representing all the AWS Glue components that belong to the workflow as nodes and directed connections between them as edges.

WorkflowRunStatistics Structure

Workflow run statistics provides statistics about the workflow run.

Fields

  • TotalActions – Number (integer).

    Total number of Actions in the workflow run.

  • TimeoutActions – Number (integer).

    Total number of Actions which timed out.

  • FailedActions – Number (integer).

    Total number of Actions which have failed.

  • StoppedActions – Number (integer).

    Total number of Actions which have stopped.

  • SucceededActions – Number (integer).

    Total number of Actions which have succeeded.

  • RunningActions – Number (integer).

    Total number of Actions in running state.
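
The counters above can be combined into a quick health summary. This is a sketch only: it assumes the statistics arrive as a Python dict and that the timed-out, failed, stopped, and succeeded counters do not overlap, which this reference does not state. The helper name summarize_statistics is hypothetical.

```python
def summarize_statistics(stats):
    """Return (finished, unsuccessful) action counts from a WorkflowRunStatistics dict.

    Assumes the individual counters are disjoint (an assumption, not documented).
    """
    unsuccessful = (
        stats.get("TimeoutActions", 0)
        + stats.get("FailedActions", 0)
        + stats.get("StoppedActions", 0)
    )
    finished = unsuccessful + stats.get("SucceededActions", 0)
    return finished, unsuccessful


# Hypothetical sample values for illustration only.
sample_stats = {
    "TotalActions": 10,
    "TimeoutActions": 1,
    "FailedActions": 2,
    "StoppedActions": 0,
    "SucceededActions": 5,
    "RunningActions": 2,
}
finished, unsuccessful = summarize_statistics(sample_stats)
print(finished, unsuccessful)
```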

Workflow Structure

A workflow represents a flow in which AWS Glue components should be executed to complete a logical task.

Fields

  • Name – UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    The name of the workflow representing the flow.

  • Description – UTF-8 string.

    A description of the workflow.

  • DefaultRunProperties – A map array of key-value pairs.

    Each key is a UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    Each value is a UTF-8 string.

    A collection of properties to be used as part of each execution of the workflow.

  • CreatedOn – Timestamp.

    The date and time when the workflow was created.

  • LastModifiedOn – Timestamp.

    The date and time when the workflow was last modified.

  • LastRun – A WorkflowRun object.

    The information about the last execution of the workflow.

  • Graph – A WorkflowGraph object.

    The graph representing all the AWS Glue components that belong to the workflow as nodes and directed connections between them as edges.

  • CreationStatus – UTF-8 string (valid values: CREATING | CREATED | CREATION_FAILED).

    The creation status of the workflow.

Operations

CreateWorkflow Action (Python: create_workflow)

Creates a new workflow.

Request

  • Name – Required: UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    The name to be assigned to the workflow. It should be unique within your account.

  • Description – UTF-8 string.

    A description of the workflow.

  • DefaultRunProperties – A map array of key-value pairs.

    Each key is a UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    Each value is a UTF-8 string.

    A collection of properties to be used as part of each execution of the workflow.

  • Tags – A map array of key-value pairs, not more than 50 pairs.

    Each key is a UTF-8 string, not less than 1 or more than 128 bytes long.

    Each value is a UTF-8 string, not more than 256 bytes long.

    The tags to be used with this workflow.

Response

  • Name – UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    The name of the workflow which was provided as part of the request.

Errors

  • AlreadyExistsException

  • InvalidInputException

  • InternalServiceException

  • OperationTimeoutException

  • ResourceNumberLimitExceededException

  • ConcurrentModificationException
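
The size limits in the CreateWorkflow request can be checked on the client before the call is made. The sketch below builds the request as a Python dict and validates only the byte and count limits listed above. The Single-line string pattern is not checked because its definition is not given here. The helper name build_create_workflow_request and the sample values are hypothetical.

```python
def _byte_len(text):
    """Length of a string in UTF-8 bytes, which is how the limits are expressed."""
    return len(text.encode("utf-8"))


def build_create_workflow_request(name, description=None,
                                  default_run_properties=None, tags=None):
    """Build a CreateWorkflow request dict, enforcing the documented size limits."""
    if not 1 <= _byte_len(name) <= 255:
        raise ValueError("Name must be 1 to 255 bytes long")
    request = {"Name": name}
    if description is not None:
        request["Description"] = description
    if default_run_properties:
        for key in default_run_properties:
            if not 1 <= _byte_len(key) <= 255:
                raise ValueError(f"run property key {key!r} must be 1 to 255 bytes long")
        request["DefaultRunProperties"] = dict(default_run_properties)
    if tags:
        if len(tags) > 50:
            raise ValueError("no more than 50 tags are allowed")
        for key, value in tags.items():
            if not 1 <= _byte_len(key) <= 128:
                raise ValueError(f"tag key {key!r} must be 1 to 128 bytes long")
            if _byte_len(value) > 256:
                raise ValueError(f"tag value for {key!r} must be at most 256 bytes long")
        request["Tags"] = dict(tags)
    return request


# Hypothetical workflow name, property, and tag.
request = build_create_workflow_request(
    "nightly-etl",
    description="Loads the previous day's data",
    default_run_properties={"stage": "extract"},
    tags={"team": "analytics"},
)
print(request)
```

Assuming glue is a boto3 Glue client, the resulting dict could then be passed as glue.create_workflow(**request).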

UpdateWorkflow Action (Python: update_workflow)

Updates an existing workflow.

Request

  • Name – Required: UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    Name of the workflow to be updated.

  • Description – UTF-8 string.

    The description of the workflow.

  • DefaultRunProperties – A map array of key-value pairs.

    Each key is a UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    Each value is a UTF-8 string.

    A collection of properties to be used as part of each execution of the workflow.

Response

  • Name – UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    The name of the workflow which was specified in input.

Errors

  • InvalidInputException

  • EntityNotFoundException

  • InternalServiceException

  • OperationTimeoutException

  • ConcurrentModificationException

DeleteWorkflow Action (Python: delete_workflow)

Deletes a workflow.

Request

  • Name – Required: UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    Name of the workflow to be deleted.

Response

  • Name – UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    Name of the workflow specified in input.

Errors

  • InvalidInputException

  • InternalServiceException

  • OperationTimeoutException

  • ConcurrentModificationException

ListWorkflows Action (Python: list_workflows)

Lists names of workflows created in the account.

Request

  • NextToken – UTF-8 string.

    A continuation token, if this is a continuation request.

  • MaxResults – Number (integer), not less than 1 or more than 1000.

    The maximum size of a list to return.

Response

  • Workflows – An array of UTF-8 strings, not less than 1 or more than 25 strings.

    List of names of workflows in the account.

  • NextToken – UTF-8 string.

    A continuation token, if not all workflow names have been returned.

Errors

  • InvalidInputException

  • InternalServiceException

  • OperationTimeoutException
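
ListWorkflows returns names one page at a time, so a caller that wants every name has to follow NextToken until it is no longer returned. The following is a minimal sketch of that loop. The glue argument is assumed to be a client object exposing list_workflows (for example, a boto3 Glue client). The helper name list_all_workflow_names and the FakeGlue stand-in are hypothetical and exist only so the sketch runs without AWS access.

```python
def list_all_workflow_names(glue, page_size=25):
    """Collect all workflow names by following NextToken across pages."""
    names = []
    kwargs = {"MaxResults": page_size}
    while True:
        page = glue.list_workflows(**kwargs)
        names.extend(page.get("Workflows", []))
        token = page.get("NextToken")
        if not token:
            return names
        kwargs["NextToken"] = token


class FakeGlue:
    """In-memory stand-in for a Glue client, for demonstration only."""

    def __init__(self, names):
        self._names = list(names)

    def list_workflows(self, MaxResults=25, NextToken=None):
        start = int(NextToken or 0)
        end = start + MaxResults
        page = {"Workflows": self._names[start:end]}
        if end < len(self._names):
            page["NextToken"] = str(end)  # opaque to the caller
        return page


all_names = list_all_workflow_names(FakeGlue(["wf-a", "wf-b", "wf-c"]), page_size=2)
print(all_names)
```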

BatchGetWorkflows Action (Python: batch_get_workflows)

Returns a list of resource metadata for a given list of workflow names. After calling the ListWorkflows operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.

Request

  • Names – Required: An array of UTF-8 strings, not less than 1 or more than 25 strings.

    A list of workflow names, which may be the names returned from the ListWorkflows operation.

  • IncludeGraph – Boolean.

    Specifies whether to include a graph when returning the workflow resource metadata.

Response

  • Workflows – An array of Workflow objects, not less than 1 or more than 25 structures.

    A list of workflow resource metadata.

  • MissingWorkflows – An array of UTF-8 strings, not less than 1 or more than 25 strings.

    A list of names of workflows not found.

Errors

  • InternalServiceException

  • OperationTimeoutException

  • InvalidInputException
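
Because Names accepts at most 25 strings per request, a longer list of names has to be split into batches. The sketch below does that and merges the Workflows and MissingWorkflows results. The glue argument is assumed to be a client object exposing batch_get_workflows (for example, a boto3 Glue client). The helper name batch_get_all_workflows and the FakeGlue stand-in are hypothetical.

```python
def batch_get_all_workflows(glue, names, include_graph=False, batch_size=25):
    """Fetch workflow metadata in batches of at most batch_size names."""
    found, missing = [], []
    for start in range(0, len(names), batch_size):
        chunk = names[start:start + batch_size]
        response = glue.batch_get_workflows(Names=chunk, IncludeGraph=include_graph)
        found.extend(response.get("Workflows", []))
        missing.extend(response.get("MissingWorkflows", []))
    return found, missing


class FakeGlue:
    """In-memory stand-in for a Glue client, for demonstration only."""

    def __init__(self, existing):
        self._existing = set(existing)
        self.calls = 0

    def batch_get_workflows(self, Names, IncludeGraph=False):
        if not 1 <= len(Names) <= 25:
            raise ValueError("Names must contain 1 to 25 strings")
        self.calls += 1
        return {
            "Workflows": [{"Name": n} for n in Names if n in self._existing],
            "MissingWorkflows": [n for n in Names if n not in self._existing],
        }


# Hypothetical names: 30 requested, of which the last two do not exist.
requested = [f"wf-{i}" for i in range(30)]
fake = FakeGlue(existing=requested[:28])
found, missing = batch_get_all_workflows(fake, requested)
print(len(found), missing, fake.calls)
```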

GetWorkflowRun Action (Python: get_workflow_run)

Retrieves the metadata for a given workflow run.

Request

  • Name – Required: UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    Name of the workflow being run.

  • RunId – Required: UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    The ID of the workflow run.

  • IncludeGraph – Boolean.

    Specifies whether to include the workflow graph in the response.

Response

  • Run – A WorkflowRun object.

    The requested workflow run metadata.

Errors

  • InvalidInputException

  • EntityNotFoundException

  • InternalServiceException

  • OperationTimeoutException
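
A common use of GetWorkflowRun is to poll a run until its Status changes from RUNNING to COMPLETED. The following is a minimal sketch of such a loop. The glue argument is assumed to be a client object exposing get_workflow_run (for example, a boto3 Glue client). The helper name wait_for_workflow_run, its polling parameters, the FakeGlue stand-in, and the sample IDs are hypothetical.

```python
import time


def wait_for_workflow_run(glue, name, run_id, poll_seconds=30, max_polls=120,
                          sleep=time.sleep):
    """Poll GetWorkflowRun until the run reports COMPLETED, then return it."""
    for _ in range(max_polls):
        response = glue.get_workflow_run(Name=name, RunId=run_id, IncludeGraph=False)
        run = response["Run"]
        if run["Status"] == "COMPLETED":
            return run
        sleep(poll_seconds)
    raise TimeoutError(f"workflow run {run_id} did not complete after {max_polls} polls")


class FakeGlue:
    """In-memory stand-in that replays a fixed sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def get_workflow_run(self, Name, RunId, IncludeGraph=False):
        status = next(self._statuses)
        return {"Run": {"Name": Name, "WorkflowRunId": RunId, "Status": status}}


finished_run = wait_for_workflow_run(
    FakeGlue(["RUNNING", "RUNNING", "COMPLETED"]),
    "nightly-etl",
    "run-123",
    poll_seconds=0,
    sleep=lambda seconds: None,  # skip real waiting in the demonstration
)
print(finished_run["Status"])
```

Passing the sleep function as a parameter keeps the loop testable without real delays.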

GetWorkflowRuns Action (Python: get_workflow_runs)

Retrieves metadata for all runs of a given workflow.

Request

  • Name – Required: UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    Name of the workflow whose run metadata should be returned.

  • IncludeGraph – Boolean.

    Specifies whether to include the workflow graph in the response.

  • NextToken – UTF-8 string.

    A continuation token, if this is a continuation request.

  • MaxResults – Number (integer), not less than 1 or more than 1000.

    The maximum number of workflow runs to be included in the response.

Response

  • Runs – An array of WorkflowRun objects, not less than 1 or more than 1000 structures.

    A list of workflow run metadata objects.

  • NextToken – UTF-8 string.

    A continuation token, if not all requested workflow runs have been returned.

Errors

  • InvalidInputException

  • EntityNotFoundException

  • InternalServiceException

  • OperationTimeoutException

GetWorkflowRunProperties Action (Python: get_workflow_run_properties)

Retrieves the workflow run properties which were set during the run.

Request

  • Name – Required: UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    Name of the workflow which was run.

  • RunId – Required: UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    The ID of the workflow run whose run properties should be returned.

Response

  • RunProperties – A map array of key-value pairs.

    Each key is a UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    Each value is a UTF-8 string.

    The workflow run properties which were set during the specified run.

Errors

  • InvalidInputException

  • EntityNotFoundException

  • InternalServiceException

  • OperationTimeoutException

PutWorkflowRunProperties Action (Python: put_workflow_run_properties)

Puts the specified workflow run properties for the given workflow run. If a property already exists for the specified run, its value is overridden; otherwise, the property is added to the existing properties.

Request

  • Name – Required: UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    Name of the workflow which was run.

  • RunId – Required: UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    The ID of the workflow run for which the run properties should be updated.

  • RunProperties – Required: A map array of key-value pairs.

    Each key is a UTF-8 string, not less than 1 or more than 255 bytes long, matching the Single-line string pattern.

    Each value is a UTF-8 string.

    The properties to put for the specified run.

Response

  • No Response parameters.

Errors

  • AlreadyExistsException

  • EntityNotFoundException

  • InvalidInputException

  • InternalServiceException

  • OperationTimeoutException

  • ResourceNumberLimitExceededException

  • ConcurrentModificationException
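
The override-or-add behavior described for PutWorkflowRunProperties can be pictured as a dictionary update. The sketch below puts some properties and reads them back with GetWorkflowRunProperties. The glue argument is assumed to be a client object exposing both calls (for example, a boto3 Glue client). The helper name put_and_read_back, the sample values, and the FakeGlue stand-in are hypothetical. FakeGlue only models the merge as described above and is not the service's implementation.

```python
def put_and_read_back(glue, name, run_id, updates):
    """Put run properties, then return the full property map for the run."""
    glue.put_workflow_run_properties(Name=name, RunId=run_id, RunProperties=updates)
    response = glue.get_workflow_run_properties(Name=name, RunId=run_id)
    return response["RunProperties"]


class FakeGlue:
    """In-memory stand-in modelling the documented override-or-add semantics."""

    def __init__(self, initial):
        self._properties = dict(initial)

    def put_workflow_run_properties(self, Name, RunId, RunProperties):
        # Existing keys are overridden; new keys are added.
        self._properties.update(RunProperties)
        return {}

    def get_workflow_run_properties(self, Name, RunId):
        return {"RunProperties": dict(self._properties)}


fake = FakeGlue({"stage": "extract", "date": "2020-01-01"})
properties = put_and_read_back(fake, "nightly-etl", "run-123",
                               {"stage": "load", "rows": "42"})
print(properties)
```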