Amazon EMR
Amazon EMR Release Guide

Configuring Flink

You can configure Flink through its configuration files. The main configuration file for Flink is flink-conf.yaml, and you can set its properties using the Amazon EMR configuration API with the flink-conf classification.

To configure the number of task slots used for Flink using the AWS CLI

  1. Create a file, configurations.json, with the following content:

    [
      {
        "Classification": "flink-conf",
        "Properties": {
          "taskmanager.numberOfTaskSlots": "2"
        }
      }
    ]
  2. Next, create a cluster with the following configuration:

    aws emr create-cluster --release-label emr-5.26.0 \
    --applications Name=Flink \
    --configurations file://./configurations.json \
    --region us-east-1 \
    --log-uri s3://myLogUri \
    --instance-type m4.large \
    --instance-count 2 \
    --service-role EMR_DefaultRole \
    --ec2-attributes KeyName=YourKeyName,InstanceProfile=EMR_EC2_DefaultRole


You can also change some configurations using the Flink API. For more information, see Basic API Concepts in the Flink documentation.

With Amazon EMR version 5.21.0 and later, you can override cluster configurations and specify additional configuration classifications for each instance group in a running cluster. You do this by using the Amazon EMR console, the AWS Command Line Interface (AWS CLI), or the AWS SDK. For more information, see Supplying a Configuration for an Instance Group in a Running Cluster.
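
As a rough illustration of the AWS CLI route, the following sketch writes a reconfiguration request for a single instance group and prints the command that would submit it. The cluster ID, the instance group ID, the file name instanceGroups.json, and the slot count of 4 are placeholders chosen for this example, not values from this guide. The sketch only prints the command, so nothing changes until you run it yourself against a real cluster.

```shell
#!/bin/sh
# Placeholder identifiers (assumptions): substitute the values for your cluster.
CLUSTER_ID="j-XXXXXXXXXXXXX"
INSTANCE_GROUP_ID="ig-XXXXXXXXXXXXX"

# Write the reconfiguration request for one instance group.
# The classification and property name match the flink-conf example above;
# the slot count of 4 is an arbitrary illustrative value.
cat > instanceGroups.json <<EOF
[
  {
    "InstanceGroupId": "${INSTANCE_GROUP_ID}",
    "Configurations": [
      {
        "Classification": "flink-conf",
        "Properties": {
          "taskmanager.numberOfTaskSlots": "4"
        },
        "Configurations": []
      }
    ]
  }
]
EOF

# Print, rather than run, the command that submits the request.
echo "aws emr modify-instance-groups --cluster-id ${CLUSTER_ID} --instance-groups file://instanceGroups.json"
```

Flink applications that are already running may need to be restarted before they use the new value; treat that as something to verify on your own cluster.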

As the owner of your application, you know best what resources to assign to tasks within Flink. For the examples in this documentation, use the same number of tasks as the worker (non-master) instances that you use for the application. We generally recommend this as the initial level of parallelism. You can also increase the granularity of parallelism using task slots, which should generally not exceed the number of virtual cores per instance. For more information about Flink’s architecture, see Concepts in the Flink documentation.
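
The sizing guidance above can be sketched as a small calculation. The helper below, suggest_parallelism, is hypothetical and not part of Flink or Amazon EMR. It assumes one TaskManager per instance that runs tasks, caps the task slots at the number of virtual cores per instance, and multiplies the result by the instance count. Treat the result as a starting point to tune, not as a required setting.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not an EMR or Flink tool).
# Usage: suggest_parallelism INSTANCES VCORES_PER_INSTANCE [SLOTS_PER_INSTANCE]
# INSTANCES counts only the instances that run tasks (not the master).
# The function body runs in a subshell so its variables do not leak.
suggest_parallelism() (
  instances="$1"
  vcores="$2"
  slots="${3:-1}"   # default: one task slot per instance

  # Task slots should generally not exceed the virtual cores per instance.
  if [ "$slots" -gt "$vcores" ]; then
    slots="$vcores"
  fi

  # Assumes one TaskManager per instance, each offering $slots slots.
  echo $(( instances * slots ))
)

suggest_parallelism 4 4      # one slot per instance: prints 4
suggest_parallelism 4 4 2    # two slots per instance: prints 8
suggest_parallelism 4 4 16   # slots capped at 4 vcores: prints 16
```

The per-instance slot count corresponds to the taskmanager.numberOfTaskSlots property shown earlier, so a value chosen this way would go into the flink-conf classification.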

Currently, the files that are configurable within the Amazon EMR configuration API are:

  • flink-conf.yaml