Apache Hudi
Hudi is integrated with Apache Spark.
With Amazon EMR release version 5.28.0 and later, Amazon EMR installs Hudi components by default when Spark, Hive, Presto, or Flink is installed. You can use Spark or the Hudi DeltaStreamer utility to create or update Hudi datasets. You can use Hive, Spark, Presto, or Flink to query a Hudi dataset interactively or to build data processing pipelines using incremental pull. Incremental pull refers to the ability to pull only the data that changed between two actions.
These features make Hudi suitable for the following use cases:
- Working with streaming data from sensors and other Internet of Things (IoT) devices that require specific data insertion and update events.
- Complying with data privacy regulations in applications where users might choose to be forgotten or modify their consent for how their data can be used.
- Implementing a change data capture (CDC) system that allows you to apply changes to a dataset over time.
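
As a rough illustration of writing to a Hudi dataset with Spark and reading it back with incremental pull, the following sketch builds the option maps that the open-source Hudi Spark datasource accepts. The helper names (`hudi_write_options`, `hudi_incremental_read_options`) and the example table and field names are hypothetical and chosen only for this example. The option keys come from the open-source Hudi configuration, so check them against the Hudi version in your Amazon EMR release before relying on them.

```python
from typing import Dict, Optional

# Write operations covered by this sketch (not an exhaustive list).
_WRITE_OPERATIONS = {"insert", "upsert", "bulk_insert", "delete"}


def hudi_write_options(
    table_name: str,
    record_key: str,
    precombine_field: str,
    partition_field: str,
    operation: str = "upsert",
) -> Dict[str, str]:
    """Build Spark datasource options for writing to a Hudi dataset.

    Hypothetical helper: it only assembles a dictionary and does not
    talk to Spark or Hudi.
    """
    if operation not in _WRITE_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.operation": operation,
    }


def hudi_incremental_read_options(
    begin_instant: str, end_instant: Optional[str] = None
) -> Dict[str, str]:
    """Build options for an incremental pull between two commit instants.

    If end_instant is omitted, the end option is left unset so the read
    is not bounded above by this helper.
    """
    options = {
        "hoodie.datasource.query.type": "incremental",
        "hoodie.datasource.read.begin.instanttime": begin_instant,
    }
    if end_instant is not None:
        options["hoodie.datasource.read.end.instanttime"] = end_instant
    return options


# Assumed usage on a cluster where Spark and Hudi are installed
# (df, spark, and base_path are placeholders):
#
#   write_opts = hudi_write_options("sensor_readings", "device_id", "ts", "dt")
#   df.write.format("hudi").options(**write_opts).mode("append").save(base_path)
#
#   read_opts = hudi_incremental_read_options("20240101000000")
#   changed = spark.read.format("hudi").options(**read_opts).load(base_path)
```

Passing `operation="delete"` with the same record key produces the options for removing records, which is one way the "right to be forgotten" use case above might be handled.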
The following table lists the version of Hudi included in the latest release of the Amazon EMR 7.x series, along with the components that Amazon EMR installs with Hudi.
For the version of components installed with Hudi in this release, see Release 7.2.0 Component Versions.
Hudi version information for emr-7.2.0

Amazon EMR Release Label | Hudi Version | Components Installed With Hudi
---|---|---
emr-7.2.0 | Hudi 0.14.1-amzn-1 | Not available.

The following table lists the version of Hudi included in the latest release of the Amazon EMR 6.x series, along with the components that Amazon EMR installs with Hudi.
For the version of components installed with Hudi in this release, see Release 6.15.0 Component Versions.
Hudi version information for emr-6.15.0

Amazon EMR Release Label | Hudi Version | Components Installed With Hudi
---|---|---
emr-6.15.0 | Hudi 0.14.0-amzn-0 | Not available.

Note
Amazon EMR release 6.8.0 comes with Apache Hudi 0.11.1; however, Amazon EMR 6.8.0 clusters are also compatible with the open-source hudi-spark3.3-bundle_2.12 from Hudi 0.12.0.
The following table lists the version of Hudi included in the latest release of the Amazon EMR 5.x series, along with the components that Amazon EMR installs with Hudi.
For the version of components installed with Hudi in this release, see Release 5.36.2 Component Versions.
Hudi version information for emr-5.36.2

Amazon EMR Release Label | Hudi Version | Components Installed With Hudi
---|---|---
emr-5.36.2 | Hudi 0.10.1-amzn-1 | Not available.