Managed node groups - Amazon EKS

Managed node groups

Amazon EKS managed node groups automate the provisioning and lifecycle management of nodes (Amazon EC2 instances) for Amazon EKS Kubernetes clusters.

With Amazon EKS managed node groups, you don’t need to separately provision or register the Amazon EC2 instances that provide compute capacity to run your Kubernetes applications. You can create, automatically update, or terminate nodes for your cluster with a single operation. Nodes run using the latest Amazon EKS optimized AMIs in your AWS account. Node updates and terminations automatically and gracefully drain nodes to ensure that your applications stay available.

All managed nodes are provisioned as part of an Amazon EC2 Auto Scaling group that's managed for you by Amazon EKS. All resources including the instances and Auto Scaling groups run within your AWS account. Each node group run across multiple Availability Zones that you define.

You can add a managed node group to new or existing clusters using the Amazon EKS console, eksctl, AWS CLI; AWS API, or infrastructure as code tools including AWS CloudFormation. Nodes launched as part of a managed node group are automatically tagged for auto-discovery by the Kubernetes cluster autoscaler. You can use the node group to apply Kubernetes labels to nodes and update them at any time.

There are no additional costs to use Amazon EKS managed node groups, you only pay for the AWS resources you provision. These include Amazon EC2 instances, Amazon EBS volumes, Amazon EKS cluster hours, and any other AWS infrastructure. There are no minimum fees and no upfront commitments.

To get started with a new Amazon EKS cluster and managed node group, see Getting started with Amazon EKS – AWS Management Console and AWS CLI.

To add a managed node group to an existing cluster, see Creating a managed node group.

Managed node groups concepts

  • Amazon EKS managed node groups create and manage Amazon EC2 instances for you.

  • All managed nodes are provisioned as part of an Amazon EC2 Auto Scaling group that's managed for you by Amazon EKS. Moreover, all resources including Amazon EC2 instances and Auto Scaling groups run within your AWS account.

  • A managed node group's Auto Scaling group spans all of the subnets that you specify when you create the group.

  • Amazon EKS tags managed node group resources so that they are configured to use the Kubernetes Cluster Autoscaler.

    Important

    If you are running a stateful application across multiple Availability Zones that is backed by Amazon EBS volumes and using the Kubernetes Cluster Autoscaler, you should configure multiple node groups, each scoped to a single Availability Zone. In addition, you should enable the --balance-similar-node-groups feature.

  • By default, instances in a managed node group use the latest version of the Amazon EKS optimized Amazon Linux 2 AMI for its cluster's Kubernetes version. You can choose between standard and GPU variants of the Amazon EKS optimized Amazon Linux 2 AMI. If you deploy using a launch template, you can also use a custom AMI. For more information, see Launch template support.

  • Amazon EKS follows the shared responsibility model for CVEs and security patches on managed node groups. When managed nodes run an Amazon EKS optimized AMI, Amazon EKS is responsible for building patched versions of the AMI when bugs or issues are reported. We can publish a fix. However, you're responsible for deploying these patched AMI versions to your managed node groups. When managed nodes run a custom AMI, you're responsible for building patched versions of the AMI when bugs or issues are reported and then deploying the AMI. For more information, see Updating a managed node group.

  • Amazon EKS managed node groups can be launched in both public and private subnets. If you launch a managed node group in a public subnet on or after April 22, 2020, the subnet must have MapPublicIpOnLaunch set to true for the instances to successfully join a cluster. If the public subnet was created using eksctl or the Amazon EKS vended AWS CloudFormation templates on or after March 26, 2020, then this setting is already set to true. If the public subnets were created before March 26, 2020, then you need to change the setting manually. For more information, see Modifying the public IPv4 addressing attribute for your subnet.

  • When using VPC endpoints in private subnets, you must create endpoints for com.amazonaws.region.ecr.api, com.amazonaws.region.ecr.dkr, and a gateway endpoint for Amazon S3. For more information, see Amazon ECR interface VPC endpoints (AWS PrivateLink).

  • Managed node groups can't be deployed on AWS Outposts or in AWS Wavelength or AWS Local Zones.

  • You can create multiple managed node groups within a single cluster. For example, you can create one node group with the standard Amazon EKS optimized Amazon Linux 2 AMI for some workloads and another with the GPU variant for workloads that require GPU support.

  • If your managed node group encounters a health issue, Amazon EKS returns an error message to help you to diagnose the issue. For more information, see Managed node group errors.

  • Amazon EKS adds Kubernetes labels to managed node group instances. These Amazon EKS provided labels are prefixed with eks.amazonaws.com.

  • Amazon EKS automatically drains nodes using the Kubernetes API during terminations or updates. Updates respect the pod disruption budgets that you set for your pods.

  • There are no additional costs to use Amazon EKS managed node groups. You only pay for the AWS resources that you provision.

  • If you want to encrypt Amazon EBS volumes for your nodes, you can deploy the nodes using a launch template. To deploy managed nodes with encrypted Amazon EBS volumes without using a launch template, you must encrypt all new Amazon EBS volumes created in your account by default. For more information, see Encryption by default in the Amazon EC2 User Guide for Linux Instances.

Managed node group capacity types

When creating a managed node group, you can choose either the On-Demand or Spot capacity type. Amazon EKS deploys a managed node group with an Amazon EC2 Auto Scaling Group that either contains only On-Demand or only Amazon EC2 Spot Instances. You can schedule pods for fault tolerant applications to Spot managed node groups, and fault intolerant applications to On-Demand node groups within a single Kubernetes cluster. By default, a managed node group deploys On-Demand Amazon EC2 instances.

On-Demand

With On-Demand Instances, you pay for compute capacity by the second, with no long-term commitments.

How it works

By default, if you don’t specify a Capacity Type, the managed node group is provisioned with On-Demand Instances. A managed node group configures an Amazon EC2 Auto Scaling group on your behalf with the following settings applied:

  • The allocation strategy to provision On-Demand capacity is set to prioritized. Managed node groups use the order of instance types passed in the API to determine which instance type to use first when fulfilling On-Demand capacity. For example, you might specify three instance types in the following order: c5.large, c4.large, and c3.large. When your On-Demand Instances are launched, the managed node group fulfills On-Demand capacity by starting with c5.large, then c4.large, and then c3.large. For more information, see Amazon EC2 Auto Scaling group in the Amazon EC2 Auto Scaling User Guide.

  • Amazon EKS adds the following Kubernetes label to all nodes in your managed node group that specifies the capacity type: eks.amazonaws.com/capacityType: ON_DEMAND. You can use this label to schedule stateful or fault intolerant applications on On-Demand nodes.

Spot

Amazon EC2 Spot Instances are spare Amazon EC2 capacity that offers steep discounts off of On-Demand prices. Amazon EC2 Spot Instances can be interrupted with a two-minute interruption notice when EC2 needs the capacity back. For more information, see Spot Instances in the Amazon EC2 User Guide for Linux Instances. You can configure a managed node group with Amazon EC2 Spot Instances to optimize costs for the compute nodes running in your Amazon EKS cluster.

How it works

To use Spot Instances inside a managed node group, you need to create a managed node group by setting the capacity type as spot. A managed node group configures an Amazon EC2 Auto Scaling group on your behalf with the following Spot best practices applied:

  • The allocation strategy to provision Spot capacity is set to capacity-optimized to ensure that your Spot nodes are provisioned in the optimal Spot capacity pools. To increase the number of Spot capacity pools available for allocating capacity from, we recommend that you configure a managed node group to use multiple instance types.

  • Amazon EC2 Spot Capacity Rebalancing is enabled so that Amazon EKS can gracefully drain and rebalance your Spot nodes to minimize application disruption when a Spot node is at elevated risk of interruption. For more information, see Amazon EC2 Auto Scaling Capacity Rebalancing in the Amazon EC2 Auto Scaling User Guide.

    • When a Spot node receives a rebalance recommendation, Amazon EKS automatically attempts to launch a new replacement Spot node and waits until it successfully joins the cluster.

    • When a replacement Spot node is bootstrapped and in the Ready state on Kubernetes, Amazon EKS cordons and drains the Spot node that received the rebalance recommendation. Cordoning the Spot node ensures that the service controller doesn't send any new requests to this Spot node. It also removes it from its list of healthy, active Spot nodes. Draining the Spot node ensures that running pods are evicted gracefully.

    • If a Spot two-minute interruption notice arrives before the replacement Spot node is in a Ready state, Amazon EKS starts draining the Spot node that received the rebalance recommendation.

  • Amazon EKS adds the following Kubernetes label to all nodes in your managed node group that specifies the capacity type: eks.amazonaws.com/capacityType: SPOT. You can use this label to schedule fault tolerant applications on Spot nodes.

Considerations for selecting a capacity type

When deciding whether to deploy a node group with On-Demand or Spot capacity, you should consider the following conditions:

  • Spot Instances are a good fit for stateless, fault-tolerant, flexible applications such as batch and machine learning training workloads, big data ETLs such as Apache Spark, queue processing applications, and stateless API endpoints. Because Spot is spare Amazon EC2 capacity, which can change over time, we recommend that you use Spot capacity for interruption-tolerant workloads that can tolerate periods where the required capacity is not available.

  • We recommend that you use On-Demand for applications that are fault intolerant, including cluster management tools such as monitoring and operational tools, deployments that require StatefulSets, and stateful applications, such as databases.

  • To maximize the availability of your applications while using Spot Instances, we recommend that you configure a Spot managed node group to use multiple instance types. We recommend applying the following rules when using multiple instance types:

    • Within a managed node group, if you're using the Cluster Autoscaler, we recommend using a flexible set of instance types with the same amount of vCPU and memory resources to ensure that the nodes in your cluster scale as expected. For example, if you need four vCPUs and eight GiB memory, we recommend that you use c3.xlarge, c4.xlarge, c5.xlarge, c5d.xlarge, c5a.xlarge, c5n.xlarge, or other similar instance types.

    • To enhance application availability, we recommend deploying multiple Spot managed node groups, each using a flexible set of instance types that have the same vCPU and memory resources. For example, if you need 4 vCPUs and 8 GiB memory, we recommend that you create one managed node group with c3.xlarge, c4.xlarge, c5.xlarge, c5d.xlarge, c5a.xlarge, c5n.xlarge, or other similar instance types, and a second managed node group with m3.xlarge, m4.xlarge, m5.xlarge, m5d.xlarge, m5a.xlarge, m5n.xlarge or other similar instance types.

    • When deploying your node group with the Spot capacity type that's using a custom launch template, use the API to pass multiple instance types instead of passing a single instance type through the launch template. For more information about deploying a node group using a launch template, see Launch template support.