Windows support

Before deploying Windows nodes, be aware of the following considerations.

Considerations

  • Amazon EC2 instance types C3, C4, D2, I2, M4 (excluding m4.16xlarge), and R3 instances are not supported for Windows workloads.

  • Host networking mode is not supported for Windows workloads.

  • Amazon EKS clusters must contain one or more Linux or Fargate nodes to run core system pods that only run on Linux, such as CoreDNS.

  • The kubelet and kube-proxy event logs are redirected to the EKS Windows Event Log and are set to a 200 MB limit.

  • You can't use Security groups for pods with pods running on Windows nodes.

  • You can't use custom networking with Windows nodes.

  • You can't use IP prefixes with Windows nodes.

  • Windows nodes support one elastic network interface per node. The number of pods that you can run per Windows node is equal to the number of IP addresses available per elastic network interface for the node's instance type, minus one. For more information, see IP addresses per network interface per instance type in the Amazon EC2 User Guide for Windows Instances.

  • In an Amazon EKS cluster, a single service with a load balancer can support up to 64 back-end pods. Each pod has its own unique IP address. This is a limitation of the Windows operating system on the Amazon EC2 nodes.

  • You can't deploy Windows managed or Fargate nodes. You can only create self-managed Windows nodes. For more information, see Launching self-managed Windows nodes.

  • You can't retrieve logs from the vpc-resource-controller Pod. You previously could when you deployed the controller to the data plane.

  • There is a cool down period before an IPv4 address is assigned to a new Pod. This prevents traffic from flowing to an older Pod with the same IPv4 address due to stale kube-proxy rules.

  • The source for the controller is managed on GitHub. To contribute to the controller, or to file issues against it, visit the project on GitHub.
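
The pod-capacity rule in the considerations above (addresses per network interface, minus one) can be sketched as a small shell helper. The function name `max_windows_pods` is hypothetical, and the per-interface address count is an input that you look up for your instance type; nothing here queries AWS.

```shell
# Pods per Windows node = IP addresses per network interface - 1,
# because a Windows node uses a single elastic network interface.
# $1: number of IPv4 addresses per interface for the node's instance type
max_windows_pods() {
  echo $(( $1 - 1 ))
}
```

For example, assuming an instance type that allows 10 addresses per interface, `max_windows_pods 10` prints 9.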

Prerequisites

  • An existing cluster. The cluster must be running one of the Kubernetes versions and platform versions listed in the following table. Any Kubernetes and platform versions later than those listed are also supported. If your cluster or platform version is earlier than one of the following versions, you need to enable legacy Windows support on your cluster's data plane. Once your cluster is at one of the following Kubernetes and platform versions, or later, you can remove legacy Windows support and enable Windows support on your control plane.

    Kubernetes version    Platform version
    1.21                  eks.3
    1.20                  eks.3
    1.19                  eks.7
    1.18                  eks.9
    1.17                  eks.10

    Your cluster must have at least one (we recommend at least two) Linux node or Fargate pod to run CoreDNS. If you enable legacy Windows support, you must use a Linux node (you can't use a Fargate pod) to run CoreDNS.

  • An existing Amazon EKS cluster IAM role.
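
As a rough illustration of the version rule in the prerequisites, the following sketch decides which procedure applies to a cluster. The function name `windows_support_mode` is hypothetical, and the sketch assumes version strings of the form `1.N` and `eks.N`.

```shell
# Prints "control-plane" when the cluster meets the minimum versions in the
# prerequisites table, otherwise "legacy".
# $1: Kubernetes version, for example 1.19
# $2: platform version, for example eks.7
windows_support_mode() (
  minor="${1#1.}"          # 1.19  -> 19
  platform="${2#eks.}"     # eks.7 -> 7
  case "$minor" in
    21|20) min=3 ;;
    19)    min=7 ;;
    18)    min=9 ;;
    17)    min=10 ;;
    *)
      # Versions later than 1.21 are supported; earlier than 1.17 are not.
      if [ "$minor" -gt 21 ]; then echo control-plane; else echo legacy; fi
      exit 0 ;;
  esac
  if [ "$platform" -ge "$min" ]; then echo control-plane; else echo legacy; fi
)
```

The two inputs can be read from a live cluster with `aws eks describe-cluster --name my-cluster --query 'cluster.[version,platformVersion]' --output text`, where my-cluster is a placeholder for your cluster name.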

Enabling Windows support

If your cluster is not at, or later than, one of the Kubernetes and platform versions listed in the Prerequisites, you must enable legacy Windows support instead. For more information, see Enabling legacy Windows support.

If you've never enabled Windows support on your cluster, skip to the next step.

If you enabled Windows support on a cluster that is earlier than a Kubernetes or platform version listed in the Prerequisites, then you must first remove the vpc-resource-controller and vpc-admission-webhook from your data plane. They're deprecated and no longer needed. For more information, see Removing legacy Windows support from your data plane.

To enable Windows support for your cluster

  1. If you don't have Amazon Linux nodes in your cluster and use security groups for pods, skip to the next step. Otherwise, confirm that the AmazonEKSVPCResourceController managed policy is attached to your cluster role. Replace eksClusterRole with your cluster role name.

    aws iam list-attached-role-policies --role-name eksClusterRole

    Output

    {
        "AttachedPolicies": [
            {
                "PolicyName": "AmazonEKSClusterPolicy",
                "PolicyArn": "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
            },
            {
                "PolicyName": "AmazonEKSVPCResourceController",
                "PolicyArn": "arn:aws:iam::aws:policy/AmazonEKSVPCResourceController"
            }
        ]
    }

    If the policy is attached, as it is in the previous output, skip the next step.

  2. Attach the AmazonEKSVPCResourceController managed policy to your Amazon EKS cluster IAM role. Replace eksClusterRole with your cluster role name.

    aws iam attach-role-policy \
      --role-name eksClusterRole \
      --policy-arn arn:aws:iam::aws:policy/AmazonEKSVPCResourceController
  3. Create a file named vpc-resource-controller-configmap.yaml with the following contents.

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: amazon-vpc-cni
      namespace: kube-system
    data:
      enable-windows-ipam: "true"
  4. Apply the ConfigMap to your cluster.

    kubectl apply -f vpc-resource-controller-configmap.yaml
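
The policy check in step 1 can be scripted. The following is a minimal sketch: the function name `vpc_controller_policy_status` is hypothetical, and it uses a plain text match on the JSON output rather than a real JSON parser, so treat it as illustrative only.

```shell
# Reads the JSON printed by `aws iam list-attached-role-policies` on stdin
# and prints "attached" or "missing" for AmazonEKSVPCResourceController.
# A text match is an illustrative shortcut, not a robust JSON parse.
vpc_controller_policy_status() {
  if grep -q '"PolicyName": *"AmazonEKSVPCResourceController"'; then
    echo attached
  else
    echo missing
  fi
}
```

Usage, with eksClusterRole replaced by your cluster role name: `aws iam list-attached-role-policies --role-name eksClusterRole | vpc_controller_policy_status`. If it prints missing, continue with step 2.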

Removing legacy Windows support from your data plane

If you enabled Windows support on a cluster that is earlier than a Kubernetes or platform version listed in the Prerequisites, then you must first remove the vpc-resource-controller and vpc-admission-webhook from your data plane. They're deprecated and no longer needed because the functionality that they provided is now enabled on the control plane.

  1. Uninstall the vpc-resource-controller with the following command. Use this command regardless of which tool you originally installed it with. Replace us-west-2 (only the instance of that text after /manifests/) with the Region that your cluster is in.

    kubectl delete -f https://amazon-eks.s3.us-west-2.amazonaws.com/manifests/us-west-2/vpc-resource-controller/latest/vpc-resource-controller.yaml
  2. Uninstall the vpc-admission-webhook using the instructions for the tool that you installed it with.

    eksctl

    Run the following commands.

    kubectl delete deployment -n kube-system vpc-admission-webhook
    kubectl delete service -n kube-system vpc-admission-webhook
    kubectl delete mutatingwebhookconfigurations.admissionregistration.k8s.io vpc-admission-webhook-cfg
    kubectl on macOS or Windows

    Run the following command. Replace us-west-2 (only the instance of that text after /manifests/) with the Region that your cluster is in.

    kubectl delete -f https://amazon-eks.s3.us-west-2.amazonaws.com/manifests/us-west-2/vpc-admission-webhook/latest/vpc-admission-webhook-deployment.yaml
  3. Enable Windows support for your cluster on the control plane.
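
The Region substitution described in the steps above is easy to get wrong, because us-west-2 appears twice in each URL and only the second occurrence changes. A small helper can make that explicit. The function name `eks_manifest_url` is hypothetical.

```shell
# Builds a manifest URL for a given Region. Only the path segment after
# /manifests/ changes; the bucket host keeps us-west-2, as in the steps above.
# $1: Region code of your cluster
# $2: path below the Region segment
eks_manifest_url() {
  printf '%s\n' "https://amazon-eks.s3.us-west-2.amazonaws.com/manifests/$1/$2"
}
```

For example, `kubectl delete -f "$(eks_manifest_url eu-west-1 vpc-resource-controller/latest/vpc-resource-controller.yaml)"` would target a cluster in eu-west-1.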

Disabling Windows support

To disable Windows support on your cluster

  1. If your cluster contains Amazon Linux nodes and you use security groups for pods with them, then skip this step.

    Remove the AmazonEKSVPCResourceController managed IAM policy from your cluster role. Replace eksClusterRole with the name of your cluster role.

    aws iam detach-role-policy \
      --role-name eksClusterRole \
      --policy-arn arn:aws:iam::aws:policy/AmazonEKSVPCResourceController
  2. Disable Windows IPAM in the amazon-vpc-cni ConfigMap.

    kubectl patch configmap/amazon-vpc-cni \
      -n kube-system \
      --type merge \
      -p '{"data":{"enable-windows-ipam":"false"}}'

Deploying Pods

When you deploy Pods to your cluster, you need to specify the operating system that they use if you're running a mixture of node types.

For Linux pods, use the following node selector text in your manifests.

nodeSelector:
  kubernetes.io/os: linux
  kubernetes.io/arch: amd64

For Windows pods, use the following node selector text in your manifests.

nodeSelector:
  kubernetes.io/os: windows
  kubernetes.io/arch: amd64

You can deploy a sample application to see the node selectors in use.
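
To show where the node selector sits in a complete manifest, the following sketch prints a minimal Pod manifest for either operating system. The function name `render_pod` and its arguments are hypothetical; the image is supplied by the caller and must match the operating system that you select.

```shell
# Prints a minimal Pod manifest whose nodeSelector pins it to one OS.
# $1: pod name   $2: linux or windows   $3: container image for that OS
render_pod() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: $1
spec:
  nodeSelector:
    kubernetes.io/os: $2
    kubernetes.io/arch: amd64
  containers:
    - name: $1
      image: $3
EOF
}
```

The output can be piped straight to the cluster, for example `render_pod demo windows my-windows-image | kubectl apply -f -`, where my-windows-image is a placeholder for an image that you have available.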

Enabling legacy Windows support

If your cluster is at, or later than, one of the Kubernetes and platform versions listed in the Prerequisites, then we recommend that you enable Windows support on your control plane instead. For more information, see Enabling Windows support.

The following steps help you to enable legacy Windows support for your Amazon EKS cluster's data plane if your cluster or platform version is earlier than the versions listed in the Prerequisites. Once your cluster and platform version are at, or later than, a version listed in the Prerequisites, we recommend that you remove legacy Windows support and enable it for your control plane.

You can use eksctl, a Windows client, or a macOS or Linux client to enable legacy Windows support for your cluster.

eksctl

To enable legacy Windows support for your cluster with eksctl

Prerequisite

This procedure requires eksctl version 0.74.0 or later. You can check your version with the following command.

eksctl version

For more information about installing or upgrading eksctl, see Installing or upgrading eksctl.
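
The version prerequisite can be checked in a script. The sketch below compares dot-separated numeric fields; the function names are hypothetical, and it assumes that `eksctl version` prints a bare version string such as 0.74.0 with no suffix.

```shell
# Succeeds when version $1 is at least version $2.
# Compares up to three dot-separated numeric fields, so 0.100.0 ranks
# above 0.74.0 (a plain string comparison would get that wrong).
version_at_least() (
  have="$1"
  want="$2"
  for field in major minor patch; do
    h="${have%%.*}"
    w="${want%%.*}"
    if [ "$h" -gt "$w" ]; then exit 0; fi
    if [ "$h" -lt "$w" ]; then exit 1; fi
    # Drop the field just compared; treat a missing field as 0.
    case "$have" in *.*) have="${have#*.}" ;; *) have=0 ;; esac
    case "$want" in *.*) want="${want#*.}" ;; *) want=0 ;; esac
  done
  exit 0
)

# Prints "ok" or "upgrade" for an eksctl version string.
eksctl_version_status() {
  if version_at_least "$1" 0.74.0; then echo ok; else echo upgrade; fi
}
```

Usage: `eksctl_version_status "$(eksctl version)"`.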

  1. Enable Windows support for your Amazon EKS cluster with the following eksctl command. Replace my-cluster with the name of your cluster. This command deploys the VPC resource controller and VPC admission controller webhook that are required on Amazon EKS clusters to run Windows workloads.

    eksctl utils install-vpc-controllers --cluster my-cluster --approve
    Important

    The VPC admission controller webhook is signed with a certificate that expires one year after the date of issue. To avoid downtime, make sure to renew the certificate before it expires. For more information, see Renewing the VPC admission webhook certificate.

  2. After you have enabled Windows support, you can launch a Windows node group into your cluster. For more information, see Launching self-managed Windows nodes.

Windows

To enable legacy Windows support for your cluster with a Windows client

In the following steps, replace us-west-2 with the Region that your cluster resides in.

  1. Deploy the VPC resource controller to your cluster.

    kubectl apply -f https://amazon-eks.s3.us-west-2.amazonaws.com/manifests/us-west-2/vpc-resource-controller/latest/vpc-resource-controller.yaml
  2. Deploy the VPC admission controller webhook to your cluster.

    1. Download the required scripts and deployment files.

      curl -o vpc-admission-webhook-deployment.yaml https://amazon-eks.s3.us-west-2.amazonaws.com/manifests/us-west-2/vpc-admission-webhook/latest/vpc-admission-webhook-deployment.yaml;
      curl -o Setup-VPCAdmissionWebhook.ps1 https://amazon-eks.s3.us-west-2.amazonaws.com/manifests/us-west-2/vpc-admission-webhook/latest/Setup-VPCAdmissionWebhook.ps1;
      curl -o webhook-create-signed-cert.ps1 https://amazon-eks.s3.us-west-2.amazonaws.com/manifests/us-west-2/vpc-admission-webhook/latest/webhook-create-signed-cert.ps1;
      curl -o webhook-patch-ca-bundle.ps1 https://amazon-eks.s3.us-west-2.amazonaws.com/manifests/us-west-2/vpc-admission-webhook/latest/webhook-patch-ca-bundle.ps1;
    2. Install OpenSSL and jq.

    3. Set up and deploy the VPC admission webhook.

      ./Setup-VPCAdmissionWebhook.ps1 -DeploymentTemplate ".\vpc-admission-webhook-deployment.yaml"
      Important

      The VPC admission controller webhook is signed with a certificate that expires one year after the date of issue. To avoid downtime, make sure to renew the certificate before it expires. For more information, see Renewing the VPC admission webhook certificate.

  3. Determine if your cluster has the required cluster role binding.

    kubectl get clusterrolebinding eks:kube-proxy-windows

    If output similar to the following example output is returned, then the cluster has the necessary role binding.

    NAME                      AGE
    eks:kube-proxy-windows    10d

    If the output includes Error from server (NotFound), then the cluster does not have the necessary cluster role binding. Add the binding by creating a file named eks-kube-proxy-windows-crb.yaml with the following content.

    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1beta1
    metadata:
      name: eks:kube-proxy-windows
      labels:
        k8s-app: kube-proxy
        eks.amazonaws.com/component: kube-proxy
    subjects:
      - kind: Group
        name: "eks:kube-proxy-windows"
    roleRef:
      kind: ClusterRole
      name: system:node-proxier
      apiGroup: rbac.authorization.k8s.io

    Apply the configuration to the cluster.

    kubectl apply -f eks-kube-proxy-windows-crb.yaml
  4. After you have enabled Windows support, you can launch a Windows node group into your cluster. For more information, see Launching self-managed Windows nodes.

macOS and Linux

To enable legacy Windows support for your cluster with a macOS or Linux client

This procedure requires that the openssl library and jq JSON processor are installed on your client system.

In the following steps, replace us-west-2 with the Region that your cluster resides in.

  1. Deploy the VPC resource controller to your cluster.

    kubectl apply -f https://amazon-eks.s3.us-west-2.amazonaws.com/manifests/us-west-2/vpc-resource-controller/latest/vpc-resource-controller.yaml
  2. Create the VPC admission controller webhook manifest for your cluster.

    1. Download the required scripts and deployment files.

      curl -o webhook-create-signed-cert.sh https://amazon-eks.s3.us-west-2.amazonaws.com/manifests/us-west-2/vpc-admission-webhook/latest/webhook-create-signed-cert.sh
      curl -o webhook-patch-ca-bundle.sh https://amazon-eks.s3.us-west-2.amazonaws.com/manifests/us-west-2/vpc-admission-webhook/latest/webhook-patch-ca-bundle.sh
      curl -o vpc-admission-webhook-deployment.yaml https://amazon-eks.s3.us-west-2.amazonaws.com/manifests/us-west-2/vpc-admission-webhook/latest/vpc-admission-webhook-deployment.yaml
    2. Add permissions to the shell scripts so that they can be run.

      chmod +x webhook-create-signed-cert.sh webhook-patch-ca-bundle.sh
    3. Create a secret for secure communication.

      ./webhook-create-signed-cert.sh
    4. Verify the secret.

      kubectl get secret -n kube-system vpc-admission-webhook-certs
    5. Configure the webhook and create a deployment file.

      cat ./vpc-admission-webhook-deployment.yaml | ./webhook-patch-ca-bundle.sh > vpc-admission-webhook.yaml
  3. Deploy the VPC admission webhook.

    kubectl apply -f vpc-admission-webhook.yaml
    Important

    The VPC admission controller webhook is signed with a certificate that expires one year after the date of issue. To avoid downtime, make sure to renew the certificate before it expires. For more information, see Renewing the VPC admission webhook certificate.

  4. Determine if your cluster has the required cluster role binding.

    kubectl get clusterrolebinding eks:kube-proxy-windows

    If output similar to the following example output is returned, then the cluster has the necessary role binding.

    NAME                     ROLE                              AGE
    eks:kube-proxy-windows   ClusterRole/system:node-proxier   19h

    If the output includes Error from server (NotFound), then the cluster does not have the necessary cluster role binding. Add the binding by creating a file named eks-kube-proxy-windows-crb.yaml with the following content.

    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1beta1
    metadata:
      name: eks:kube-proxy-windows
      labels:
        k8s-app: kube-proxy
        eks.amazonaws.com/component: kube-proxy
    subjects:
      - kind: Group
        name: "eks:kube-proxy-windows"
    roleRef:
      kind: ClusterRole
      name: system:node-proxier
      apiGroup: rbac.authorization.k8s.io

    Apply the configuration to the cluster.

    kubectl apply -f eks-kube-proxy-windows-crb.yaml
  5. After you have enabled Windows support, you can launch a Windows node group into your cluster. For more information, see Launching self-managed Windows nodes.

Renewing the VPC admission webhook certificate

The certificate used by the VPC admission webhook expires one year after issue. To avoid downtime, renew the certificate before it expires. You can check the expiration date of your current certificate with the following command.

kubectl get secret \
  -n kube-system \
  vpc-admission-webhook-certs -o json | \
  jq -r '.data."cert.pem"' | \
  base64 --decode | \
  openssl x509 \
  -noout \
  -enddate | \
  cut -d= -f2

Output

May 28 14:23:00 2022 GMT
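
To turn that date into a number that a scheduled check can act on, the following sketch converts it to days remaining. The function name `cert_days_remaining` is hypothetical, and the sketch assumes GNU date, whose -d option parses this format; the date command on macOS takes different options.

```shell
# Prints the whole days between "now" and a certificate end date.
# $1: end date as printed by the command above
# $2: optional current time in epoch seconds (defaults to the system clock)
# Assumes GNU date; BSD/macOS date does not accept -d in this form.
cert_days_remaining() (
  end=$(date -u -d "$1" +%s) || exit 1
  now="${2:-$(date -u +%s)}"
  echo $(( (end - now) / 86400 ))
)
```

A result of 0 or less means that the certificate has expired or expires within a day. Passing the output of the previous command as the first argument gives the days left for the live certificate.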

You can renew the certificate using eksctl or a Windows or Linux/macOS computer. Follow the instructions for the tool you originally used to install the VPC admission webhook. For example, if you originally installed the VPC admission webhook using eksctl, then you should renew the certificate using the instructions on the eksctl tab.

eksctl
  1. Reinstall the certificate. Replace <cluster-name> (including <>) with the name of your cluster.

    eksctl utils install-vpc-controllers --cluster <cluster-name> --approve
  2. Verify that you receive the following output.

    2021/05/28 05:24:59 [INFO] generate received request
    2021/05/28 05:24:59 [INFO] received CSR
    2021/05/28 05:24:59 [INFO] generating key: rsa-2048
    2021/05/28 05:24:59 [INFO] encoded CSR
  3. Restart the webhook deployment.

    kubectl rollout restart deployment -n kube-system vpc-admission-webhook
  4. If the certificate that you renewed had already expired, and you have Windows pods stuck in the ContainerCreating state, then you must delete and redeploy those pods.

Windows
  1. Download the script that generates a new certificate.

    curl -o webhook-create-signed-cert.ps1 https://amazon-eks.s3.us-west-2.amazonaws.com/manifests/us-west-2/vpc-admission-webhook/latest/webhook-create-signed-cert.ps1;
  2. Run the script with the required parameters.

    ./webhook-create-signed-cert.ps1 -ServiceName vpc-admission-webhook-svc -SecretName vpc-admission-webhook-certs -Namespace kube-system
  3. Restart the webhook deployment.

    kubectl rollout restart deployment -n kube-system vpc-admission-webhook-deployment
  4. If the certificate that you renewed had already expired, and you have Windows pods stuck in the ContainerCreating state, then you must delete and redeploy those pods.

Linux and macOS

Prerequisite

You must have OpenSSL and jq installed on your computer.

  1. Download the script that generates a new certificate.

    curl -o webhook-create-signed-cert.sh \
      https://amazon-eks.s3.us-west-2.amazonaws.com/manifests/us-west-2/vpc-admission-webhook/latest/webhook-create-signed-cert.sh
  2. Change the permissions.

    chmod +x webhook-create-signed-cert.sh
  3. Run the script.

    ./webhook-create-signed-cert.sh
  4. Restart the webhook.

    kubectl rollout restart deployment -n kube-system vpc-admission-webhook-deployment
  5. If the certificate that you renewed had already expired, and you have Windows pods stuck in the ContainerCreating state, then you must delete and redeploy those pods.