What's new? - Amazon EMR

What's new?

This page describes the changes and functionality available in the latest releases of Amazon EMR 7.x, 6.x, and 5.x.

These release notes are also available on the Amazon EMR 7.2.0, Amazon EMR 6.15.0, and Amazon EMR 5.36.2 pages, along with the application versions, component versions, and available configuration classifications for each release.

Note

Later releases of Amazon EMR use AWS Signature Version 4 (SigV4) to authenticate requests to Amazon S3. We recommend that you use an Amazon EMR release that supports SigV4 so that you can access new S3 buckets and avoid interruption to your workloads. For more information and a list of Amazon EMR releases that support SigV4, see Amazon EMR and AWS Signature Version 4.

Amazon EMR 7.2.0 (latest release of 7.x series)

New Amazon EMR releases are made available in different Regions over a period of several days, beginning with the first Region on the initial release date. The latest release version may not be available in your Region during this period.

The following release notes include information for Amazon EMR release 7.2.0. Changes are relative to 7.1.0.

New features
  • Application upgrades – Amazon EMR 7.2.0 application upgrades include Iceberg 1.5.0-amzn-0 and Delta 3.1.0.

  • Amazon EMR adds support so that you can use other applications such as HBase, Flink, and Hive with the Amazon S3 Express One Zone storage class.

  • This release adds the capability to read restored objects, so you can read Glacier objects from an S3 location with the S3A protocol. This capability works with Spark, Flink, and Hive.
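As a rough illustration of the restored-object capability above, the following Python sketch builds a configuration classification that could be supplied when you create a cluster. The property name `fs.s3a.glacier.read.restored.objects` and its values are taken from the upstream Hadoop S3A connector documentation and are an assumption here, because these release notes don't name the setting; confirm them against the Amazon EMR documentation for your release. The helper name `glacier_read_configuration` is hypothetical.

```python
import json

# Assumed S3A property and allowed values (from upstream Hadoop S3A docs,
# not from these release notes); verify them before use.
GLACIER_PROPERTY = "fs.s3a.glacier.read.restored.objects"
ALLOWED_MODES = {"READ_ALL", "SKIP_ALL_GLACIER", "READ_RESTORED_GLACIER_OBJECTS"}


def glacier_read_configuration(mode="READ_RESTORED_GLACIER_OBJECTS"):
    """Return a core-site configuration classification as a list of dicts."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported mode: {mode}")
    return [
        {
            "Classification": "core-site",
            "Properties": {GLACIER_PROPERTY: mode},
        }
    ]


if __name__ == "__main__":
    # Print the classification as JSON, the form used for cluster configurations.
    print(json.dumps(glacier_read_configuration(), indent=2))
```

The sketch only produces the configuration structure; it doesn't contact AWS or restore any objects.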

Known issues
  • Python 3.11 isn't supported with EMR Studio.

Changes, enhancements, and resolved issues
  • This release fixes a deadlock that can occur during internal step cleanup operations, which manage the life cycle of steps as they complete on the EMR cluster. The deadlock affects critical Amazon EMR operations, such as step operations and scaling.

  • This release resolves an issue where clusters with custom AMIs that have certain pre-existing log files can cause the Amazon EMR log management daemon to fail.

  • Amazon EMR 7.2.0 upgrades the Amazon EMR daemon responsible for cluster management and monitoring activities from AWS SDK v1 to v2.

  • When you launch a cluster with the latest patch release of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see Using the default Amazon Linux AMI for Amazon EMR.

    OsReleaseLabel (Amazon Linux version) | Amazon Linux kernel version | Available date | Supported Regions
    2023.5.20240708.0 | 6.1.96-102.177.amzn2023 | July 8, 2024 | US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Europe (Stockholm), Europe (Milan), Europe (Spain), Europe (Frankfurt), Europe (Zurich), Europe (Ireland), Europe (London), Europe (Paris), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Hyderabad), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Osaka), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Jakarta), Asia Pacific (Melbourne), Africa (Cape Town), South America (São Paulo), Middle East (Bahrain), Middle East (UAE), Canada (Central), Israel (Tel Aviv), Canada West (Calgary), AWS GovCloud (US-West), AWS GovCloud (US-East), China (Beijing), China (Ningxia)
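To make the default-AMI behavior concrete, here is a minimal Python sketch that assembles request parameters for the Amazon EMR `RunJobFlow` API (exposed in boto3 as `run_job_flow`). It assumes you want to pin a specific Amazon Linux release with the `OSReleaseLabel` parameter rather than take the latest default. The cluster name, instance types, and role names are placeholders, and the function `build_cluster_request` is hypothetical. The sketch only builds the dictionary and doesn't call AWS.

```python
def build_cluster_request(release_label, os_release_label=None):
    """Build keyword arguments for an EMR RunJobFlow request.

    When os_release_label is None, the key is omitted so that Amazon EMR
    chooses the default Amazon Linux release for the given EMR release.
    """
    request = {
        "Name": "example-cluster",  # placeholder name
        "ReleaseLabel": release_label,
        "Instances": {
            "InstanceGroups": [
                {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 2},
            ],
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",  # placeholder instance profile
        "ServiceRole": "EMR_DefaultRole",      # placeholder service role
    }
    if os_release_label is not None:
        # Pin the Amazon Linux release instead of using the latest default.
        request["OSReleaseLabel"] = os_release_label
    return request


if __name__ == "__main__":
    print(build_cluster_request("emr-7.2.0", "2023.5.20240708.0"))
```

With valid credentials and roles in place, the resulting dictionary could be passed as `boto3.client("emr").run_job_flow(**request)`.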

Amazon EMR 6.15.0 (latest release of 6.x series)

New Amazon EMR releases are made available in different Regions over a period of several days, beginning with the first Region on the initial release date. The latest release version may not be available in your Region during this period.

The following release notes include information for Amazon EMR release 6.15.0. Changes are relative to 6.14.0. For information on the release timeline, see the 6.15.0 change log.

New features
  • Application upgrades – Amazon EMR 6.15.0 application upgrades include Apache Hadoop 3.3.6, Apache Hudi 0.14.0-amzn-0, Iceberg 1.4.0-amzn-0, and Trino 426.

  • Faster launches for EMR clusters that run on EC2 – It's now up to 35% faster to launch an Amazon EMR on EC2 cluster. With this improvement, most customers can launch their clusters in 5 minutes or less.

  • CodeWhisperer for EMR Studio – You can now use Amazon CodeWhisperer with Amazon EMR Studio to get real-time recommendations as you write code in JupyterLab. CodeWhisperer can complete your comments, finish single lines of code, make line-by-line recommendations, and generate fully-formed functions.

  • Faster job restart times with Flink – With Amazon EMR 6.15.0 and higher, several new mechanisms are available for Apache Flink to improve the job restart time during task recovery or scaling operations. This optimizes the speed of recovery and restart of execution graphs to improve job stability.

  • Table-level and fine-grained access control for open-table formats – With Amazon EMR 6.15.0 and higher, when you run Spark jobs on Amazon EMR on EC2 clusters that access data in the AWS Glue Data Catalog, you can use AWS Lake Formation to apply table, row, column, and cell level permissions on Hudi, Iceberg, or Delta Lake based tables.

  • Hadoop upgrade – Amazon EMR 6.15.0 includes an upgrade of Apache Hadoop to version 3.3.6. Hadoop 3.3.6, released by Apache in June 2023, was the latest version at the time of the Amazon EMR 6.15 deployment. Prior releases of Amazon EMR (6.9.0 to 6.14.x) used Hadoop 3.3.3.

    The upgrade includes hundreds of improvements and fixes, along with features such as reconfigurable datanode parameters, a DFSAdmin option to initiate bulk reconfiguration operations on all live datanodes, and a vectored API that allows seek-heavy readers to specify multiple ranges to read. Hadoop 3.3.6 also adds support for HDFS APIs and semantics for its write-ahead log (WAL), so that HBase can run on other storage system implementations. For more information, see the changelogs for versions 3.3.4, 3.3.5, and 3.3.6 in the Apache Hadoop documentation.

  • Support for AWS SDK for Java, version 2 – Amazon EMR 6.15.0 applications can use AWS SDK for Java versions 1.12.569 or 2.20.160 if the application supports v2. The AWS SDK for Java 2.x is a major rewrite of the version 1.x code base. It’s built on top of Java 8+ and adds several frequently requested features. These include support for non-blocking I/O, and the ability to plug in a different HTTP implementation at runtime. For more information, including a Migration Guide from SDK for Java v1 to v2, see the AWS SDK for Java, version 2 guide.

Changes, enhancements, and resolved issues
  • To improve your high-availability EMR clusters, this release enables connectivity to Amazon EMR daemons on local host that use IPv6 endpoints.

  • This release enables TLS 1.2 for communication with ZooKeeper provisioned on all the primary nodes of your high-availability cluster.

  • This release improves the management of ZooKeeper transaction log files that are maintained on primary nodes to minimize scenarios where the log files grow out of bounds and interrupt cluster operations.

  • This release makes intra-node communication more resilient for high-availability EMR clusters. This improvement reduces the chance of bootstrap action failures or cluster start failures.

  • Tez in Amazon EMR 6.15.0 introduces configurations that you can specify to asynchronously open the input splits in a Tez grouped split. This results in faster performance of read queries when there are a large number of input splits in a single Tez grouped split. For more information, see Tez asynchronous split opening.

  • When you launch a cluster with the latest patch release of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see Using the default Amazon Linux AMI for Amazon EMR.

    OsReleaseLabel (Amazon Linux version) | Amazon Linux kernel version | Available date | Supported Regions
    2.0.20240709.1 | 4.14.348 | July 23, 2024 | US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Europe (Stockholm), Europe (Milan), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Osaka), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Jakarta), Africa (Cape Town), South America (São Paulo), Middle East (Bahrain), Canada (Central), AWS GovCloud (US-West), AWS GovCloud (US-East), China (Beijing), China (Ningxia), Asia Pacific (Hyderabad), Middle East (UAE), Europe (Spain), Europe (Zurich), Asia Pacific (Melbourne), Israel (Tel Aviv), Canada West (Calgary)
    2.0.20240223.0 | 4.14.336 | March 8, 2024 | US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Europe (Stockholm), Europe (Milan), Europe (Spain), Europe (Frankfurt), Europe (Zurich), Europe (Ireland), Europe (London), Europe (Paris), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Hyderabad), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Osaka), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Jakarta), Asia Pacific (Melbourne), Africa (Cape Town), South America (São Paulo), Middle East (Bahrain), Middle East (UAE), Canada (Central), Israel (Tel Aviv), AWS GovCloud (US-West), AWS GovCloud (US-East), China (Beijing), China (Ningxia), Canada West (Calgary)
    2.0.20240131.0 | 4.14.336 | February 14, 2024 | US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Europe (Stockholm), Europe (Milan), Europe (Spain), Europe (Frankfurt), Europe (Zurich), Europe (Ireland), Europe (London), Europe (Paris), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Hyderabad), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Osaka), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Jakarta), Asia Pacific (Melbourne), Africa (Cape Town), South America (São Paulo), Middle East (Bahrain), Middle East (UAE), Canada (Central), Israel (Tel Aviv), AWS GovCloud (US-West), AWS GovCloud (US-East), China (Beijing), China (Ningxia), Canada West (Calgary)
    2.0.20240124.0 | 4.14.336 | February 7, 2024 | US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Europe (Stockholm), Europe (Milan), Europe (Spain), Europe (Frankfurt), Europe (Zurich), Europe (Ireland), Europe (London), Europe (Paris), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Hyderabad), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Osaka), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Jakarta), Asia Pacific (Melbourne), Africa (Cape Town), South America (São Paulo), Middle East (Bahrain), Middle East (UAE), Canada (Central), Israel (Tel Aviv), AWS GovCloud (US-West), AWS GovCloud (US-East), China (Beijing), China (Ningxia), Canada West (Calgary)
    2.0.20240109.0 | 4.14.334 | January 24, 2024 | US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Europe (Stockholm), Europe (Milan), Europe (Spain), Europe (Frankfurt), Europe (Zurich), Europe (Ireland), Europe (London), Europe (Paris), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Hyderabad), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Osaka), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Jakarta), Asia Pacific (Melbourne), Africa (Cape Town), South America (São Paulo), Middle East (Bahrain), Middle East (UAE), Canada (Central), Israel (Tel Aviv), AWS GovCloud (US-West), AWS GovCloud (US-East), China (Beijing), China (Ningxia), Canada West (Calgary)
    2.0.20231218.0 | 4.14.330 | January 2, 2024 | US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Europe (Stockholm), Europe (Milan), Europe (Spain), Europe (Frankfurt), Europe (Zurich), Europe (Ireland), Europe (London), Europe (Paris), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Hyderabad), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Osaka), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Jakarta), Asia Pacific (Melbourne), Africa (Cape Town), South America (São Paulo), Middle East (Bahrain), Middle East (UAE), Canada (Central), Israel (Tel Aviv), AWS GovCloud (US-West), AWS GovCloud (US-East), China (Beijing), China (Ningxia)
    2.0.20231206.0 | 4.14.330 | December 22, 2023 | US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Europe (Stockholm), Europe (Milan), Europe (Spain), Europe (Frankfurt), Europe (Zurich), Europe (Ireland), Europe (London), Europe (Paris), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Hyderabad), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Osaka), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Jakarta), Asia Pacific (Melbourne), Africa (Cape Town), South America (São Paulo), Middle East (Bahrain), Middle East (UAE), Canada (Central), Israel (Tel Aviv), AWS GovCloud (US-West), AWS GovCloud (US-East), China (Beijing), China (Ningxia)
    2.0.20231116.0 | 4.14.328 | December 11, 2023 | US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Europe (Stockholm), Europe (Milan), Europe (Spain), Europe (Frankfurt), Europe (Zurich), Europe (Ireland), Europe (London), Europe (Paris), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Hyderabad), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Osaka), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Jakarta), Asia Pacific (Melbourne), Africa (Cape Town), South America (São Paulo), Middle East (Bahrain), Middle East (UAE), Canada (Central), Israel (Tel Aviv), AWS GovCloud (US-West), AWS GovCloud (US-East), China (Beijing), China (Ningxia)
    2.0.20231101.0 | 4.14.327 | November 13, 2023 | US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Europe (Stockholm), Europe (Milan), Europe (Spain), Europe (Frankfurt), Europe (Zurich), Europe (Ireland), Europe (London), Europe (Paris), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Hyderabad), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Osaka), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Jakarta), Asia Pacific (Melbourne), Africa (Cape Town), South America (São Paulo), Middle East (Bahrain), Middle East (UAE), Canada (Central), Israel (Tel Aviv), AWS GovCloud (US-West), AWS GovCloud (US-East), China (Beijing), China (Ningxia)

Amazon EMR 5.36.2 (latest release of 5.x series)

New Amazon EMR releases are made available in different Regions over a period of several days, beginning with the first Region on the initial release date. The latest release version may not be available in your Region during this period.

The following release notes include information for Amazon EMR release 5.36.2. Changes are relative to 5.36.1. For information on the release timeline, see the change log.

Changes, enhancements, and resolved issues
  • This release improves cluster scale-down logic so that Amazon EMR doesn't scale down core nodes below the HDFS replication factor setting for the cluster. This improvement fulfills data redundancy requirements, and reduces the chance that a scaling operation might stall.

  • This release adds a new retry mechanism to the cluster scaling workflow for clusters that run Presto or Trino. This improvement reduces the risk that a cluster resize runs indefinitely due to a single failed resize operation. It also improves cluster utilization, because your cluster scales up and down faster.

  • Fixes an issue where cluster scale-down operations might stall when a core node becomes unhealthy while Amazon EMR is gracefully decommissioning it.

  • Improves the stability of a node in a high-availability cluster with multiple primary nodes when Amazon EMR restarts a single node.

  • Optimizes log management with Amazon EMR running on Amazon EC2. As a result, you might see a slight reduction in storage costs for your cluster logs.

  • Improves the management of ZooKeeper transaction log files that are maintained on primary nodes to minimize scenarios where the log files grow out of bounds and interrupt cluster operations.

  • Fixes a rare bug that can cause a high-availability cluster with multiple primary nodes to fail because it can't communicate with the YARN ResourceManager.

  • When you launch a cluster with the latest patch release of Amazon EMR 5.36 or higher, 6.6 or higher, or 7.0 or higher, Amazon EMR uses the latest Amazon Linux 2023 or Amazon Linux 2 release for the default Amazon EMR AMI. For more information, see Using the default Amazon Linux AMI for Amazon EMR.

    OsReleaseLabel (Amazon Linux version) | Amazon Linux kernel version | Available date | Supported Regions
    2.0.20240709.1 | 4.14.348 | July 23, 2024 | US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Europe (Stockholm), Europe (Milan), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Osaka), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Jakarta), Africa (Cape Town), South America (São Paulo), Middle East (Bahrain), Canada (Central), AWS GovCloud (US-West), AWS GovCloud (US-East), China (Beijing), China (Ningxia), Asia Pacific (Hyderabad), Middle East (UAE), Europe (Spain), Europe (Zurich), Asia Pacific (Melbourne), Israel (Tel Aviv), Canada West (Calgary)
    2.0.20240503.0 | 4.14.343 | xxxxxx, 2024 | US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Europe (Stockholm), Europe (Milan), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Osaka), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Jakarta), Africa (Cape Town), South America (São Paulo), Middle East (Bahrain), Canada (Central), AWS GovCloud (US-West), AWS GovCloud (US-East), China (Beijing), China (Ningxia)
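The scale-down safeguard described in the first item of the list above can be pictured as a floor on the core node count. The following Python sketch only illustrates that rule and is not Amazon EMR's implementation; the function name `clamp_core_target` and its parameters are hypothetical.

```python
def clamp_core_target(requested_core_nodes, hdfs_replication_factor):
    """Return the core node count after applying the replication-factor floor.

    A scale-down request below the HDFS replication factor is raised to the
    replication factor, so every block can keep its configured replica count.
    """
    if requested_core_nodes < 0 or hdfs_replication_factor < 1:
        raise ValueError("node count must be >= 0 and replication factor >= 1")
    return max(requested_core_nodes, hdfs_replication_factor)


if __name__ == "__main__":
    # A request to shrink to 1 core node with replication factor 3 stops at 3.
    print(clamp_core_target(1, 3))
```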

Amazon EMR and AWS Signature Version 4

Amazon EMR releases use AWS Signature Version 4 (SigV4) to authenticate requests to Amazon S3. Buckets created in Amazon S3 after June 24, 2020 don't support requests signed by Signature Version 2 (SigV2). Buckets created on or before June 24, 2020 will continue to support SigV2. We recommend that you migrate to an Amazon EMR release that supports SigV4 so that you can access new S3 buckets and avoid interruption to your workloads.

If you use applications that are included with Amazon EMR, such as Apache Spark, Apache Hive, and Presto, you don't need to change your application code to use SigV4. If you use custom applications that are not included with Amazon EMR, you might need to update your code to use SigV4. For more information, see Moving from Signature Version 2 to Signature Version 4 in the Amazon S3 User Guide.

The following Amazon EMR releases support SigV4: emr-4.7.4, emr-4.8.5, emr-4.9.6, emr-4.10.1, emr-5.1.1, emr-5.2.3, emr-5.3.2, emr-5.4.1, emr-5.5.4, emr-5.6.1, emr-5.7.1, emr-5.8.3, emr-5.9.1, emr-5.10.1, emr-5.11.4, emr-5.12.3, emr-5.13.1, emr-5.14.2, emr-5.15.1, emr-5.16.1, emr-5.17.2, emr-5.18.1, emr-5.19.1, emr-5.20.1, emr-5.21.2, and emr-5.22.0 and higher. All 6.x and 7.x releases support SigV4.
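The release list above can be turned into a small lookup. The following Python sketch is one possible way to check a release label against it; the function name `supports_sigv4` is hypothetical, and the sketch deliberately returns `True` only for releases that the list names explicitly, for emr-5.22.0 and higher in the 5.x series, and for 6.x and 7.x releases.

```python
import re

# 4.x and 5.x releases named explicitly in the list above, as (major, minor, patch).
LISTED_RELEASES = frozenset({
    (4, 7, 4), (4, 8, 5), (4, 9, 6), (4, 10, 1),
    (5, 1, 1), (5, 2, 3), (5, 3, 2), (5, 4, 1), (5, 5, 4), (5, 6, 1),
    (5, 7, 1), (5, 8, 3), (5, 9, 1), (5, 10, 1), (5, 11, 4), (5, 12, 3),
    (5, 13, 1), (5, 14, 2), (5, 15, 1), (5, 16, 1), (5, 17, 2), (5, 18, 1),
    (5, 19, 1), (5, 20, 1), (5, 21, 2),
})


def supports_sigv4(release_label):
    """Return True if the release label is covered by the list above."""
    match = re.fullmatch(r"emr-(\d+)\.(\d+)\.(\d+)", release_label)
    if match is None:
        raise ValueError(f"unrecognized release label: {release_label}")
    major, minor, patch = (int(part) for part in match.groups())
    if major in (6, 7):
        return True  # all 6.x and 7.x releases
    if major == 5 and minor >= 22:
        return True  # emr-5.22.0 and higher
    return (major, minor, patch) in LISTED_RELEASES


if __name__ == "__main__":
    for label in ("emr-5.36.2", "emr-5.21.0", "emr-4.7.4"):
        print(label, supports_sigv4(label))
```

Release labels outside the list, including any series not mentioned on this page, return `False` because the list makes no statement about them.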