AWS Glue job run statuses

You can view the status of an AWS Glue extract, transform, and load (ETL) job while it is running or after it has stopped. You can view the status using the AWS Glue console, the AWS Command Line Interface (AWS CLI), or the GetJobRun action in the AWS Glue API.

Possible job run statuses are STARTING, RUNNING, STOPPING, STOPPED, SUCCEEDED, FAILED, ERROR, WAITING, and TIMEOUT.

The following table lists the statuses that indicate abnormal job termination.

Job run status    Description
FAILED            The job exceeded its maximum allowed concurrent runs, or terminated with an unknown exit code.
ERROR             A workflow, schedule trigger, or event trigger attempted to run a deleted job.
TIMEOUT           The job run time exceeded its specified timeout value.
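The table above can be carried into monitoring code as a simple lookup. The following is a sketch only: the names `ABNORMAL_STATUSES` and `describe_abnormal_status` are hypothetical, and the descriptions are copied from the table rather than returned by any AWS API.

```python
# Statuses that indicate abnormal termination, with the causes listed above.
ABNORMAL_STATUSES = {
    "FAILED": "The job exceeded its maximum allowed concurrent runs, "
              "or terminated with an unknown exit code.",
    "ERROR": "A workflow, schedule trigger, or event trigger attempted "
             "to run a deleted job.",
    "TIMEOUT": "The job run time exceeded its specified timeout value.",
}


def describe_abnormal_status(status):
    """Return the description for an abnormal status, or None otherwise."""
    return ABNORMAL_STATUSES.get(status)
```

A caller might log the returned description and raise an alert whenever the result is not None.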

The WAITING status indicates a job run is waiting for resources. The following table describes wait behavior for different classes of jobs.

Job type Behavior
Spark jobs (Standard) Jobs that have not been configured to retry based on your maxRetries configuration may enter the WAITING state. A new job run will be in the WAITING state if the service is not able to acquire enough resources to start the run. This may occur because of service quotas for your account or capacity limits in your Region, in one of the following error cases:
  • Max concurrent job runs per account exceeded

  • Max concurrent job runs per job exceeded (includes the account level service quota as well as the limit you specify on the job with MaxConcurrentRuns)

  • Max concurrent compute (DPU usage) exceeded

  • Resource unavailable

For more information about AWS Glue service quotas, see AWS Glue endpoints and quotas. The time AWS Glue will wait for resources may differ based on circumstances. A job may transition between non-terminal statuses as it attempts to acquire resources. Eventually, the job will transition to FAILED if it cannot acquire resources. AWS Glue will retry for a maximum of 15 minutes or 10 attempts, whichever comes first.
Spark jobs (Flex) A new job run will be in the WAITING state if the service is not able to acquire enough resources to start the run, which delays the start of the run. The run will be in the WAITING state for a maximum of 20 minutes (timeout controlled by the service). After 15 minutes, the service attempts a force start; depending on available capacity, the run may start or fail with an appropriate error message.
Python shell jobs Same behavior as standard jobs using Spark.
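Because a run can sit in WAITING and move between non-terminal statuses before it settles, client code usually polls rather than reading the status once. The sketch below is a client-side illustration, not how AWS Glue itself retries. It assumes that STOPPED, SUCCEEDED, FAILED, ERROR, and TIMEOUT are the terminal statuses, and the names `wait_for_terminal_status`, `poll_seconds`, and `max_polls` are hypothetical.

```python
import time

# Assumption: a run does not leave these statuses once it reaches them.
# STARTING, RUNNING, STOPPING, and WAITING are treated as non-terminal.
TERMINAL_STATUSES = {"STOPPED", "SUCCEEDED", "FAILED", "ERROR", "TIMEOUT"}


def wait_for_terminal_status(fetch_status, poll_seconds=30, max_polls=60,
                             sleep=time.sleep):
    """Call fetch_status() until it returns a terminal status.

    Gives up after max_polls calls and returns the last status seen, so the
    caller must check whether the result is actually terminal.
    """
    status = fetch_status()
    polls = 1
    while status not in TERMINAL_STATUSES and polls < max_polls:
        sleep(poll_seconds)  # pause between status checks
        status = fetch_status()
        polls += 1
    return status
```

Here `fetch_status` is any zero-argument callable that returns the current status string, for example a small wrapper around a GetJobRun call. The `sleep` parameter is injectable so the loop can be tested without real delays.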

The following state diagram outlines the expected state transitions through the lifecycle of an AWS Glue job. This information is applicable to all job types.


   A state diagram that outlines the state transitions an AWS Glue job may undergo.