Job tuning considerations

The EMRFS S3-optimized committer consumes a small amount of memory for each file written by a task attempt until the task attempt is committed or aborted. In most jobs, the amount of memory consumed is negligible. For jobs with long-running tasks that write a large number of files, the memory that the committer consumes may be noticeable and may require adjusting the memory allocated to Spark executors. You can tune executor memory using the spark.executor.memory property. As a guideline, a single task writing 100,000 files typically requires an additional 100 MB of memory. For more information, see Application properties in the Apache Spark Configuration documentation.
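For example, the following is a minimal sketch of raising executor memory for such a job using spark.executor.memory. The application name and the 6g value are illustrative; size the memory to your own workload using the guideline above (your baseline plus roughly 100 MB per 100,000 files written by a single task).

# Minimal sketch: raise executor memory for a job whose tasks
# write a very large number of files through the EMRFS
# S3-optimized committer. The values below are illustrative.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("many-files-write")            # hypothetical app name
    .config("spark.executor.memory", "6g")  # illustrative value
    .getOrCreate()
)

You can also pass the property on the command line (for example, with --conf spark.executor.memory=6g on spark-submit) or through an EMR configuration classification, whichever matches how you already configure your cluster.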