What's new? - Amazon EMR

What's new?

This topic covers new features and resolved issues in the current releases of the Amazon EMR 6.x and 5.x series. These release notes are also available on the Release 6.8.0 Tab and Release 5.36.0 Tab, along with the application versions, component versions, and available configuration classifications for each release.

Subscribe to the RSS feed for Amazon EMR release notes at https://docs.aws.amazon.com/emr/latest/ReleaseGuide/amazon-emr-release-notes.rss to receive updates when a new Amazon EMR release version is available.
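As an alternative to a feed reader, the feed can be polled from a script. The following is a minimal sketch, assuming the feed is a plain RSS 2.0 document whose entries are `<item>` elements with a `<title>` child and no XML namespaces; the function names `item_titles` and `fetch_feed` are illustrative and are not part of any AWS tooling.

```python
import urllib.request
import xml.etree.ElementTree as ET

# URL taken from the paragraph above.
FEED_URL = "https://docs.aws.amazon.com/emr/latest/ReleaseGuide/amazon-emr-release-notes.rss"


def item_titles(rss_text: str) -> list:
    """Return the <title> text of every <item> in an RSS 2.0 document.

    Assumes un-namespaced RSS 2.0 elements (an assumption about the feed).
    """
    root = ET.fromstring(rss_text)
    return [item.findtext("title", default="") for item in root.iter("item")]


def fetch_feed(url: str = FEED_URL) -> str:
    """Download the feed and return it as text (requires network access)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")
```

Calling `item_titles(fetch_feed())` would then list the titles of the published entries, provided the feed matches the assumed layout.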

For earlier release notes going back to release version 4.2.0, see Amazon EMR what's new history.

Note

Twenty-five previous Amazon EMR release versions now use AWS Signature Version 4 to authenticate requests to Amazon S3. The use of AWS Signature Version 2 is being phased out, and new S3 buckets created after June 24, 2020, will not support Signature Version 2 signed requests. Existing buckets will continue to support Signature Version 2. We recommend migrating to an Amazon EMR release that supports Signature Version 4 so you can continue accessing new S3 buckets and avoid any potential interruption to your workloads.

The following Amazon EMR releases that support Signature Version 4 are now available: emr-4.7.4, emr-4.8.5, emr-4.9.6, emr-4.10.1, emr-5.1.1, emr-5.2.3, emr-5.3.2, emr-5.4.1, emr-5.5.4, emr-5.6.1, emr-5.7.1, emr-5.8.3, emr-5.9.1, emr-5.10.1, emr-5.11.4, emr-5.12.3, emr-5.13.1, emr-5.14.2, emr-5.15.1, emr-5.16.1, emr-5.17.2, emr-5.18.1, emr-5.19.1, emr-5.20.1, and emr-5.21.2. Amazon EMR release 5.22.0 and later already support Signature Version 4.

You do not need to change your application code to use Signature Version 4 if you are using applications included with Amazon EMR, such as Apache Spark, Apache Hive, and Presto. If you are using custom applications that are not included with Amazon EMR, you may need to update your code to use Signature Version 4. For more information about what updates may be required, see Moving from Signature Version 2 to Signature Version 4.
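To audit a list of cluster release labels against this note, a small check like the following may help. It is a sketch only: it treats a label as supported if it is one of the 25 patch releases named above or if its version is 5.22.0 or later, and it reports any other label (including a patch release not named in the note) as unsupported. The function names are illustrative.

```python
import re
from typing import Tuple

# The 25 patch releases named in the note as supporting Signature Version 4.
SIGV4_PATCH_RELEASES = {
    "emr-4.7.4", "emr-4.8.5", "emr-4.9.6", "emr-4.10.1",
    "emr-5.1.1", "emr-5.2.3", "emr-5.3.2", "emr-5.4.1", "emr-5.5.4",
    "emr-5.6.1", "emr-5.7.1", "emr-5.8.3", "emr-5.9.1", "emr-5.10.1",
    "emr-5.11.4", "emr-5.12.3", "emr-5.13.1", "emr-5.14.2", "emr-5.15.1",
    "emr-5.16.1", "emr-5.17.2", "emr-5.18.1", "emr-5.19.1", "emr-5.20.1",
    "emr-5.21.2",
}

# Per the note, release 5.22.0 and later already support Signature Version 4.
FIRST_NATIVE_SIGV4 = (5, 22, 0)


def parse_release_label(label: str) -> Tuple[int, ...]:
    """Turn a label such as 'emr-5.20.1' into the tuple (5, 20, 1)."""
    match = re.fullmatch(r"emr-(\d+)\.(\d+)\.(\d+)", label)
    if match is None:
        raise ValueError(f"not a release label of the form emr-X.Y.Z: {label!r}")
    return tuple(int(part) for part in match.groups())


def supports_sigv4(label: str) -> bool:
    """Return True if the note above lists or implies Signature Version 4 support."""
    version = parse_release_label(label)
    return label in SIGV4_PATCH_RELEASES or version >= FIRST_NATIVE_SIGV4
```

For example, `supports_sigv4("emr-5.20.0")` is False under this check, which would flag a cluster on that label as a candidate for moving to emr-5.20.1 or later.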

Release 6.8.0 (latest version of Amazon EMR 6.x series)

New Amazon EMR release versions are made available in different Regions over a period of several days, beginning with the first Region on the initial release date. The latest release version may not be available in your Region during this period.

The following release notes include information for Amazon EMR release version 6.8.0. Changes are relative to 6.7.0.

New Features

  • The Amazon EMR steps feature now supports Apache Livy endpoints and JDBC/ODBC clients. For more information, see Configure runtime roles for Amazon EMR steps.

  • Amazon EMR release 6.8.0 comes with Apache HBase release 2.4.12. With this HBase release, you can both archive and delete your HBase tables. The Amazon S3 archive process renames all table files to the archive directory. This can be a costly and lengthy process. Now, you can skip the archive process and quickly drop and delete large tables. For more information, see Using the HBase shell.

Changes, Enhancements, and Resolved Issues

  • When Amazon EMR release 6.5.0, 6.6.0, or 6.7.0 read Apache Phoenix tables through the Apache Spark shell, Amazon EMR produced a NoSuchMethodError. Amazon EMR release 6.8.0 fixes this issue.

  • Amazon EMR release 6.8.0 comes with Apache Hudi 0.11.1; however, Amazon EMR 6.8.0 clusters are also compatible with the open-source hudi-spark3.3-bundle_2.12 from Hudi 0.12.0.

  • Amazon EMR release 6.8.0 comes with Apache Spark 3.3.0. This Spark release uses Apache Log4j 2 and the log4j2.properties file to configure Log4j in Spark processes. If you use Spark in the cluster or create EMR clusters with custom configuration parameters, and you want to upgrade to Amazon EMR release 6.8.0, you must migrate to the new spark-log4j2 configuration classification and key format for Apache Log4j 2. For more information, see Migrating from Apache Log4j 1.x to Log4j 2.x.

  • With Amazon EMR release 6.6.0 and later, when you launch new Amazon EMR clusters with the default Amazon Linux (AL) AMI option, Amazon EMR automatically uses the latest Amazon Linux AMI. In earlier releases, Amazon EMR does not update the Amazon Linux AMIs after the initial release. See Using the default Amazon Linux AMI for Amazon EMR.

    OsReleaseLabel (Amazon Linux Version)   Amazon Linux Kernel Version   Available Date
    2.0.20220912.1                          4.14.291                      2022-09-06

For more information on the release timeline, see the Change log for 6.8.0 release and release notes.
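For the Log4j 2 migration item above, the following sketch builds a cluster configuration object that uses the spark-log4j2 classification. The `rootLogger.level` key follows Log4j 2 properties syntax; the helper name and the chosen level are illustrative assumptions, and the exact set of keys your cluster needs should be taken from the migration guide.

```python
import json


def spark_log4j2_configuration(root_level: str = "info") -> list:
    """Build a configurations list that sets the Spark root logger level.

    Illustrative helper; only the classification name comes from the
    release notes above.
    """
    return [
        {
            "Classification": "spark-log4j2",
            "Properties": {
                # Log4j 2 properties-style key (Log4j 1.x configs used
                # keys such as log4j.rootCategory instead).
                "rootLogger.level": root_level,
            },
        }
    ]


# Print the JSON that could be supplied as the cluster's configurations.
print(json.dumps(spark_log4j2_configuration("warn"), indent=2))
```

Keeping the configuration in a small builder like this makes it easier to swap the old spark-log4j entries for spark-log4j2 ones in a single place when upgrading to release 6.8.0.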

Release 5.36.0 (latest version of Amazon EMR 5.x series)

New Amazon EMR release versions are made available in different Regions over a period of several days, beginning with the first Region on the initial release date. The latest release version may not be available in your Region during this period.

The following release notes include information for Amazon EMR release version 5.36.0. Changes are relative to 5.35.0.

Initial release date: June 15, 2022

New Features

  • Amazon EMR release 5.36.0 adds support for data definition language (DDL) with Apache Spark on Apache Ranger-enabled clusters. This allows you to use Apache Ranger to manage access for operations such as creating, altering, and dropping databases and tables from an Amazon EMR cluster.

  • Amazon EMR 5.36.0 supports automatic Amazon Linux updates for clusters using a default AMI. See Using the default Amazon Linux AMI for Amazon EMR.

    OsReleaseLabel (Amazon Linux Version)   Amazon Linux Kernel Version   Available Date
    2.0.20220426.0                          4.14.281                      6/14/2022

Changes, Enhancements, and Resolved Issues

  • Amazon EMR 5.36.0 is upgraded to support the following versions: aws-java-sdk 1.12.206, Hadoop 2.10.1-amzn-4, Hive 2.3.9-amzn-2, Hudi 0.10.1-amzn-1, Spark 2.4.8-amzn-2, Presto 0.267-amzn-1, Amazon Glue connector 1.18.0, and EMRFS 2.51.0.