Amazon EKS
User Guide

Migrating to a New Worker Node Group

This topic helps you to create a new worker node group, gracefully migrate your existing applications to the new group, and then remove the old worker node group from your cluster.

eksctl

To migrate your applications to a new worker node group with eksctl

This procedure assumes that you have installed eksctl, and that your eksctl version is at least 0.7.0. You can check your version with the following command:

eksctl version

For more information on installing or upgrading eksctl, see Installing or Upgrading eksctl.

Note

This procedure only works for clusters and worker node groups that were created with eksctl.

  1. Retrieve the name of your existing worker node groups, substituting default with your cluster name.

    eksctl get nodegroups --cluster=default

    Output:

    CLUSTER  NODEGROUP         CREATED               MIN SIZE  MAX SIZE  DESIRED CAPACITY  INSTANCE TYPE  IMAGE ID
    default  standard-workers  2019-05-01T22:26:58Z  1         4         3                 t3.medium      ami-05a71d034119ffc12
  2. Launch a new worker node group with the following eksctl command, substituting the example values with your own values.

    Note

    For more available flags and their descriptions, see https://eksctl.io/.

    eksctl create nodegroup \
      --cluster default \
      --version 1.14 \
      --name standard-1-14 \
      --node-type t3.medium \
      --nodes 3 \
      --nodes-min 1 \
      --nodes-max 4 \
      --node-ami auto
  3. When the previous command completes, verify that all of your worker nodes have reached the Ready state with the following command:

    kubectl get nodes
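
    If you would rather block until every node reports Ready instead of polling, a command along these lines (not part of the original procedure, and assuming kubectl 1.11 or later) performs the same check:

    kubectl wait --for=condition=Ready nodes --all --timeout=300s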
  4. Delete the original node group with the following command, substituting the example values with your cluster and nodegroup names:

    eksctl delete nodegroup --cluster default --name standard-workers
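
    As an optional sanity check after the delete completes, you can list pods with their node assignments to confirm that your workloads were rescheduled onto the new node group:

    kubectl get pods --all-namespaces -o wide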
AWS Management Console

To migrate your applications to a new worker node group with the AWS Management Console

  1. Launch a new worker node group by following the steps outlined in Launching Amazon EKS Linux Worker Nodes.

  2. When your stack has finished creating, select it in the console and choose Outputs.

  3. Record the NodeInstanceRole for the node group that was created. You need this to add the new Amazon EKS worker nodes to your cluster.

    Note

    If you have attached any additional IAM policies to your old node group IAM role, such as adding permissions for the Kubernetes Cluster Autoscaler, you should attach those same policies to your new node group IAM role to maintain that functionality on the new group.
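
    If you prefer to read this value with the AWS CLI instead of the console, a command along these lines should return it, where <new_node_CFN_stack_name> is the name of your new worker node stack:

    newNodes="<new_node_CFN_stack_name>"
    aws cloudformation describe-stacks --stack-name $newNodes \
    --query "Stacks[0].Outputs[?OutputKey=='NodeInstanceRole'].OutputValue" \
    --output text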

  4. Update the security groups for both worker node groups so that they can communicate with each other. For more information, see Cluster Security Group Considerations.

    1. Record the security group IDs for both worker node groups. This is shown as the NodeSecurityGroup value in the AWS CloudFormation stack outputs.

      You can use the following AWS CLI commands to get the security group IDs from the stack names. In these commands, oldNodes is the AWS CloudFormation stack name for your older worker node stack, and newNodes is the name of the stack that you are migrating to.

      oldNodes="<old_node_CFN_stack_name>"
      newNodes="<new_node_CFN_stack_name>"

      oldSecGroup=$(aws cloudformation describe-stack-resources --stack-name $oldNodes \
      --query 'StackResources[?ResourceType==`AWS::EC2::SecurityGroup`].PhysicalResourceId' \
      --output text)
      newSecGroup=$(aws cloudformation describe-stack-resources --stack-name $newNodes \
      --query 'StackResources[?ResourceType==`AWS::EC2::SecurityGroup`].PhysicalResourceId' \
      --output text)
    2. Add ingress rules to each worker node security group so that they accept traffic from each other.

      The following AWS CLI commands add ingress rules to each security group that allow all traffic on all protocols from the other security group. This configuration allows pods in each worker node group to communicate with each other while you are migrating your workload to the new group.

      aws ec2 authorize-security-group-ingress --group-id $oldSecGroup \
      --source-group $newSecGroup --protocol -1
      aws ec2 authorize-security-group-ingress --group-id $newSecGroup \
      --source-group $oldSecGroup --protocol -1
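
      To confirm that both rules are in place, you can describe the two security groups and review their inbound rules, assuming the variables from the previous step are still set:

      aws ec2 describe-security-groups --group-ids $oldSecGroup $newSecGroup \
      --query 'SecurityGroups[*].[GroupId,IpPermissions]'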
  5. Edit the aws-auth configmap to map the new worker node instance role in RBAC.

    kubectl edit configmap -n kube-system aws-auth

    Add a new mapRoles entry for the new worker node group.

    apiVersion: v1
    data:
      mapRoles: |
        - rolearn: <ARN of instance role (not instance profile)>
          username: system:node:{{EC2PrivateDNSName}}
          groups:
            - system:bootstrappers
            - system:nodes
        - rolearn: arn:aws:iam::111122223333:role/workers-1-10-NodeInstanceRole-U11V27W93CX5
          username: system:node:{{EC2PrivateDNSName}}
          groups:
            - system:bootstrappers
            - system:nodes

    Replace the <ARN of instance role (not instance profile)> snippet with the NodeInstanceRole value that you recorded in Step 3, then save and close the file to apply the updated configmap.
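
    To review the mapping without reopening an editor, you can print the configmap and confirm that both node instance roles appear:

    kubectl describe configmap -n kube-system aws-auth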

  6. Watch the status of your nodes and wait for your new worker nodes to join your cluster and reach the Ready status.

    kubectl get nodes --watch
  7. (Optional) If you are using the Kubernetes Cluster Autoscaler, scale the deployment down to 0 replicas to avoid conflicting scaling actions.

    kubectl scale deployments/cluster-autoscaler --replicas=0 -n kube-system
  8. Use the following command to taint each of the nodes that you want to remove with NoSchedule so that new pods are not scheduled or rescheduled on the nodes you are replacing:

    kubectl taint nodes node_name key=value:NoSchedule

    If you are upgrading your worker nodes to a new Kubernetes version, you can identify and taint all of the nodes of a particular Kubernetes version (in this case, 1.10.3) with the following code snippet.

    K8S_VERSION=1.10.3
    nodes=$(kubectl get nodes -o jsonpath="{.items[?(@.status.nodeInfo.kubeletVersion==\"v$K8S_VERSION\")].metadata.name}")
    for node in ${nodes[@]}
    do
        echo "Tainting $node"
        kubectl taint nodes $node key=value:NoSchedule
    done
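
    If you want to confirm which nodes now carry the taint, one option (not part of the original steps) is to list the taints for each node:

    kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints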
  9. Determine your cluster's DNS provider.

    kubectl get deployments -l k8s-app=kube-dns -n kube-system

    Output (this cluster is using kube-dns for DNS resolution, but your cluster may return coredns instead):

    NAME       DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
    kube-dns   1         1         1            1           31m
  10. If your current deployment is running fewer than two replicas, scale out the deployment to two replicas. Substitute coredns for kube-dns if your previous command output returned that instead.

    kubectl scale deployments/kube-dns --replicas=2 -n kube-system
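
    To verify that the additional replica has started, you can list the DNS pods; the same k8s-app=kube-dns label used in the previous step should also select the pods for either provider:

    kubectl get pods -n kube-system -l k8s-app=kube-dns -o wide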
  11. Drain each of the nodes that you want to remove from your cluster with the following command:

    kubectl drain node_name --ignore-daemonsets --delete-local-data

    If you are upgrading your worker nodes to a new Kubernetes version, you can identify and drain all of the nodes of a particular Kubernetes version (in this case, 1.10.3) with the following code snippet.

    K8S_VERSION=1.10.3
    nodes=$(kubectl get nodes -o jsonpath="{.items[?(@.status.nodeInfo.kubeletVersion==\"v$K8S_VERSION\")].metadata.name}")
    for node in ${nodes[@]}
    do
        echo "Draining $node"
        kubectl drain $node --ignore-daemonsets --delete-local-data
    done
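
    To confirm that a drained node is running only DaemonSet pods before you terminate it, you can filter pods by node name, substituting node_name as in the previous commands:

    kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=node_name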
  12. After your old worker nodes have finished draining, revoke the security group ingress rules you authorized earlier, and then delete the AWS CloudFormation stack to terminate the instances.

    Note

    If you have attached any additional IAM policies to your old node group IAM role, such as adding permissions for the Kubernetes Cluster Autoscaler, you must detach those additional policies from the role before you can delete your AWS CloudFormation stack.

    1. Revoke the ingress rules that you created for your worker node security groups earlier. In these commands, oldNodes is the AWS CloudFormation stack name for your older worker node stack, and newNodes is the name of the stack that you are migrating to.

      oldNodes="<old_node_CFN_stack_name>"
      newNodes="<new_node_CFN_stack_name>"

      oldSecGroup=$(aws cloudformation describe-stack-resources --stack-name $oldNodes \
      --query 'StackResources[?ResourceType==`AWS::EC2::SecurityGroup`].PhysicalResourceId' \
      --output text)
      newSecGroup=$(aws cloudformation describe-stack-resources --stack-name $newNodes \
      --query 'StackResources[?ResourceType==`AWS::EC2::SecurityGroup`].PhysicalResourceId' \
      --output text)

      aws ec2 revoke-security-group-ingress --group-id $oldSecGroup \
      --source-group $newSecGroup --protocol -1
      aws ec2 revoke-security-group-ingress --group-id $newSecGroup \
      --source-group $oldSecGroup --protocol -1
    2. Open the AWS CloudFormation console at https://console.aws.amazon.com/cloudformation.

    3. Select your old worker node stack.

    4. Choose Actions, then Delete stack.

  13. Edit the aws-auth configmap to remove the old worker node instance role from RBAC.

    kubectl edit configmap -n kube-system aws-auth

    Delete the mapRoles entry for the old worker node group.

    apiVersion: v1
    data:
      mapRoles: |
        - rolearn: arn:aws:iam::111122223333:role/workers-1-11-NodeInstanceRole-W70725MZQFF8
          username: system:node:{{EC2PrivateDNSName}}
          groups:
            - system:bootstrappers
            - system:nodes
        - rolearn: arn:aws:iam::111122223333:role/workers-1-10-NodeInstanceRole-U11V27W93CX5
          username: system:node:{{EC2PrivateDNSName}}
          groups:
            - system:bootstrappers
            - system:nodes

    Save and close the file to apply the updated configmap.

  14. (Optional) If you are using the Kubernetes Cluster Autoscaler, scale the deployment back to one replica.

    Note

    You must also tag your new Auto Scaling group appropriately (for example, k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>) and update your Cluster Autoscaler deployment's command to point to the newly tagged Auto Scaling group. For more information, see Cluster Autoscaler on AWS.

    kubectl scale deployments/cluster-autoscaler --replicas=1 -n kube-system
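
    One way to apply the tags described in the note is with the AWS CLI; this is a sketch only, so substitute your new Auto Scaling group name and cluster name and adjust the tag values to match your Cluster Autoscaler configuration:

    aws autoscaling create-or-update-tags --tags \
    "ResourceId=<new_ASG_name>,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true" \
    "ResourceId=<new_ASG_name>,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>,Value=owned,PropagateAtLaunch=true"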
  15. (Optional) Verify that you are using the latest version of the Amazon VPC CNI plugin for Kubernetes. You may need to update your CNI version to take advantage of the latest supported instance types. For more information, see Amazon VPC CNI Plugin for Kubernetes Upgrades.
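
    To check which CNI version is currently running in your cluster, you can inspect the image tag on the aws-node daemonset (assuming the default daemonset name):

    kubectl describe daemonset aws-node -n kube-system | grep Image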

  16. If your cluster is using kube-dns for DNS resolution (see Step 9), scale in the kube-dns deployment to one replica.

    kubectl scale deployments/kube-dns --replicas=1 -n kube-system