AWS Glue
Developer Guide

Document History for AWS Glue

The following table describes important changes to the documentation for AWS Glue.

  • Latest API version: 2018-07-10

  • Latest documentation update: July 13, 2018

Change Description Date
Support for Apache Spark job metrics

Added information about using Apache Spark metrics to debug and profile ETL jobs. From the AWS Glue console, you can track runtime metrics such as bytes read and written, memory usage and CPU load of the driver and executors, and data shuffles among executors. For more information, see Monitoring AWS Glue Using CloudWatch Metrics, Job Monitoring and Debugging, and Working with Jobs on the AWS Glue Console.

July 13, 2018
Support for DynamoDB as a data source

Added information about crawling DynamoDB and using it as a data source for ETL jobs. For more information, see Cataloging Tables with a Crawler and Connection Parameters.

July 10, 2018
Updates to create notebook server procedure

Updated information about how to create a notebook server on an Amazon EC2 instance associated with a development endpoint. For more information, see Creating a Notebook Server Associated with a Development Endpoint.

July 9, 2018
Updates now available over RSS

You can now subscribe to an RSS feed to receive notifications about updates to the AWS Glue Developer Guide.

June 25, 2018
Support delay notifications for jobs

Added information about configuring a delay threshold when a job runs. For more information, see Adding Jobs in AWS Glue.

May 25, 2018
Configure a crawler to append new columns

Added information about a new configuration option for crawlers, MergeNewColumns. For more information, see Configuring a Crawler.

May 7, 2018
Support for job timeouts

Added information about setting a timeout threshold when a job runs. For more information, see Adding Jobs in AWS Glue.

April 10, 2018
Support for Scala ETL scripts and for triggering jobs based on additional run states

Added information about using Scala as the ETL programming language. In addition, the trigger API now supports firing when any condition is met (in addition to firing when all conditions are met). Jobs can also be triggered based on a "failed" or "stopped" job run (in addition to a "succeeded" job run).

January 12, 2018

Earlier Updates

The following table describes the important changes in each release of the AWS Glue Developer Guide before January 2018.

Change Description Date
Support XML data sources and new crawler configuration option

Added information about classifying XML data sources and a new crawler option for partition changes.

November 16, 2017
New transforms, support for additional Amazon RDS database engines, and development endpoint enhancements

Added information about the map and filter transforms, support for Amazon RDS Microsoft SQL Server and Amazon RDS Oracle, and new features for development endpoints.

September 29, 2017
AWS Glue initial release

This is the initial release of the AWS Glue Developer Guide.

August 14, 2017
