Improve Hive performance - Amazon EMR

Improve Hive performance

Amazon EMR offers features that help optimize performance when you use Hive to query, read, and write data stored in Amazon S3.

S3 Select can improve query performance for CSV and JSON files in some applications by “pushing down” processing to Amazon S3, so that filtering happens in S3 and only the matching data is returned to the cluster.
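
As a rough sketch of what this can look like in HiveQL, the statements below define a table over CSV data using the S3 Select input format and then turn on pushdown for the session. The table name, columns, and S3 location are hypothetical placeholders. The input format class and the `s3select.*` properties are given as recalled from the Amazon EMR S3 Select documentation, so check them against your EMR release before relying on them.

```sql
-- Hypothetical table over CSV files in S3; names and location are placeholders.
CREATE TABLE example_s3select_table (
  col1 string,
  col2 int,
  col3 boolean
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS
  INPUTFORMAT 'com.amazonaws.emr.s3select.hive.S3SelectableTextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://amzn-s3-demo-bucket/path/to/csv/'
TBLPROPERTIES (
  "s3select.format" = "csv",          -- assumed value for CSV input
  "s3select.headerInfo" = "ignore"    -- assumed: skip the header row
);

-- Ask Hive to push the filter down to S3 for this session.
SET s3select.filter=true;

-- The WHERE clause is the part that pushdown is meant to evaluate in S3.
SELECT col1, col2 FROM example_s3select_table WHERE col2 > 10;
```

Pushdown tends to pay off when the query filters out most of the data; a query that reads nearly every row gains little from it.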

The EMRFS S3 optimized committer is an alternative to the OutputCommitter class that eliminates list and rename operations to improve performance when writing files to Amazon S3 using EMRFS.
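
A minimal sketch of opting in from a Hive session follows. It assumes that the property `hive.blobstore.use.output-committer` controls the committer for Hive and that it can be set per session; on a real cluster it may instead need to be set in the `hive-site` configuration, and availability depends on the EMR release. The table names are hypothetical.

```sql
-- Assumption: this property enables the EMRFS S3 optimized committer for Hive.
SET hive.blobstore.use.output-committer=true;

-- Hypothetical write to an S3-backed table; with the committer enabled,
-- the write is intended to avoid the list and rename steps.
INSERT OVERWRITE TABLE example_s3_target
SELECT * FROM example_source;
```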