AWS Glue Release Notes

You configure the Glue version parameter when you add or update a job. The Glue version determines the versions of Apache Spark and Python that are available to the job. The Python version indicates the version supported for jobs of type Spark. The following table lists the available Glue versions, the corresponding Spark and Python versions, and other changes in functionality.
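As a rough illustration of setting the Glue version when adding a job, the sketch below builds the keyword arguments for the `create_job` call of the boto3 Glue client and checks the requested Spark job's Python version against the table in this topic. The helper names (`build_job_definition`, `create_job`), the job name, the role ARN, and the script location are hypothetical placeholders, and the mapping of Python 2.7 to `"2"` and Python 3.6/3.7 to `"3"` reflects the `PythonVersion` values accepted by the job command.

```python
# Python versions (as Command.PythonVersion values) allowed per Glue version,
# taken from the AWS Glue Versions table in this topic.
ALLOWED_PYTHON = {
    "0.9": {"2"},        # Spark 2.2.1, Python 2.7
    "1.0": {"2", "3"},   # Spark 2.4.3, Python 2.7 or 3.6
    "2.0": {"3"},        # Spark 2.4.3, Python 3.7
}


def build_job_definition(name, role_arn, script_location,
                         glue_version="2.0", python_version="3"):
    """Return keyword arguments for the Glue CreateJob API (hypothetical helper)."""
    if glue_version not in ALLOWED_PYTHON:
        raise ValueError(f"Unknown Glue version: {glue_version}")
    if python_version not in ALLOWED_PYTHON[glue_version]:
        raise ValueError(
            f"Glue {glue_version} does not support Python {python_version}"
        )
    return {
        "Name": name,
        "Role": role_arn,
        # Set the version explicitly; jobs created without one default to Glue 0.9.
        "GlueVersion": glue_version,
        "Command": {
            "Name": "glueetl",  # job of type Spark
            "ScriptLocation": script_location,
            "PythonVersion": python_version,
        },
    }


def create_job(definition):
    """Submit the definition. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without the SDK

    return boto3.client("glue").create_job(**definition)
```

A call such as `create_job(build_job_definition("demo-job", "arn:aws:iam::123456789012:role/GlueDemoRole", "s3://example-bucket/scripts/demo.py"))` would then request a Glue 2.0 job, assuming the placeholder role and script exist in your account.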

AWS Glue Versions

Glue version | Supported Spark and Python versions | Changes in functionality
Glue 0.9
  • Spark 2.2.1

  • Python 2.7

Jobs that were created without specifying a Glue version default to Glue 0.9.

Glue 1.0
  • Spark 2.4.3

  • Python 2.7

  • Python 3.6

You can maintain job bookmarks for Parquet and ORC formats in Glue ETL jobs (using Glue Version 1.0). Previously, you could bookmark only common Amazon S3 source formats such as JSON, CSV, Apache Avro, and XML in AWS Glue ETL jobs.

When setting format options for ETL inputs and outputs, you can specify Apache Avro reader/writer format version 1.8 to support reading and writing Avro logical types (using Glue Version 1.0). Previously, only version 1.7 of the Avro reader/writer format was supported.

The DynamoDB connection type supports a writer option (using Glue Version 1.0).
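The three Glue 1.0 changes above might look roughly like this inside an ETL script. This is a sketch, not a complete job: the function names, the `transformation_ctx` value, and the paths and table name in the usage note are hypothetical, and `glue_context` is assumed to be a `GlueContext` created by the script. Bookmarks also assume that the job runs with `--job-bookmark-option` set to `job-bookmark-enable` and that the script calls `job.init` and `job.commit`.

```python
def read_parquet_with_bookmark(glue_context, input_path):
    """Read Parquet from Amazon S3; transformation_ctx keys the bookmark state."""
    return glue_context.create_dynamic_frame.from_options(
        connection_type="s3",
        connection_options={"paths": [input_path]},
        format="parquet",
        transformation_ctx="parquet_source",  # hypothetical context name
    )


def write_avro_1_8(glue_context, frame, output_path):
    """Write Avro with the 1.8 reader/writer format to keep logical types."""
    return glue_context.write_dynamic_frame.from_options(
        frame=frame,
        connection_type="s3",
        connection_options={"path": output_path},
        format="avro",
        format_options={"version": "1.8"},
    )


def write_to_dynamodb(glue_context, frame, table_name):
    """Write a DynamicFrame through the DynamoDB connection type's writer."""
    return glue_context.write_dynamic_frame.from_options(
        frame=frame,
        connection_type="dynamodb",
        connection_options={"dynamodb.output.tableName": table_name},
    )
```

In a job script, these could be chained as `frame = read_parquet_with_bookmark(glue_context, "s3://example-bucket/input/")` followed by `write_avro_1_8(glue_context, frame, "s3://example-bucket/output/")` or `write_to_dynamodb(glue_context, frame, "example-table")`.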

Glue 2.0
  • Spark 2.4.3

  • Python 3.7

In addition to the features provided in Glue Version 1.0, Glue Version 2.0 provides:

  • An upgraded infrastructure for running Apache Spark ETL jobs in AWS Glue with reduced startup times.

  • Real-time default logging, with separate streams for drivers and executors, and for outputs and errors.

  • Support for specifying additional Python modules or different versions at the job level.

For more information about Glue 2.0 features and limitations, see Running Spark ETL Jobs with Reduced Startup Times.
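As an illustration of specifying additional Python modules at the job level in Glue 2.0, the sketch below builds the `--additional-python-modules` job argument as a comma-separated list of package specifiers. The helper name `additional_modules_argument` and the example packages and version pin are hypothetical.

```python
def additional_modules_argument(modules):
    """Build the job argument that lists extra Python modules.

    `modules` maps a package name to a version string, or to None when no
    specific version is requested.
    """
    specs = [
        f"{name}=={version}" if version else name
        for name, version in modules.items()
    ]
    return {"--additional-python-modules": ",".join(specs)}


# Example: pin one package and request another without a version.
default_arguments = additional_modules_argument(
    {"scikit-learn": "0.21.3", "pyarrow": None}
)
```

The resulting dictionary can be passed as the job's `DefaultArguments` when the job is created or updated with Glue version 2.0.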