Configure pod eviction time - AWS Prescriptive Guidance

Configure pod eviction time

Pod eviction time matters when you design for resiliency against a control plane or Availability Zone failure. During Availability Zone failure testing, in which the subnets lose network connectivity, all the impacted Amazon EKS nodes lose connectivity to the Amazon EKS control plane. Within 1 minute, the impacted nodes are marked with the NotReady status, and their pod endpoints (or EndpointSlices) are removed from the service endpoints. However, the pods running on the impacted nodes remain in the Running status for the default 5 minutes. Only then are the pods marked as Terminating and replacement pods scheduled.
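The timeline above can be approximated with simple arithmetic. The following Python sketch is illustrative only: the function name and parameters are hypothetical, the 60-second detection figure is the upper bound quoted above, and adding the two durations together is a simplifying assumption, not a statement of how Kubernetes measures them internally.

```python
# Rough estimate of how long pods on a failed node keep the Running status.
# Assumption: detection delay and eviction wait simply add up.

DEFAULT_EVICTION_WAIT_SECONDS = 300  # the default 5 minutes described above


def seconds_until_eviction(detection_seconds: float = 60.0,
                           eviction_wait_seconds: float = DEFAULT_EVICTION_WAIT_SECONDS) -> float:
    """Return an approximate upper bound, in seconds, before pods are evicted."""
    if detection_seconds < 0 or eviction_wait_seconds < 0:
        raise ValueError("durations must be non-negative")
    return detection_seconds + eviction_wait_seconds


print(seconds_until_eviction())                          # 360.0 with the defaults
print(seconds_until_eviction(eviction_wait_seconds=2))   # 62.0 with a 2-second wait
```

Under these assumptions, shortening the eviction wait leaves the detection delay as the dominant term, which is why the rest of this section focuses on the wait period.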

The pod-eviction-timeout parameter of the Kubernetes controller manager defaults to 5 minutes and can normally be changed in the control plane configuration. However, because Amazon EKS is a managed Kubernetes service, you cannot modify pod-eviction-timeout.

As a workaround, you can use taint-based evictions. When a node goes down or the kubelet stops posting status, the node is tainted with node.kubernetes.io/unreachable. By default, pods tolerate this taint for 5 minutes, but you can override the default with a standard toleration that specifies a longer or shorter time. To define a custom toleration duration, add tolerations that set tolerationSeconds for the node.kubernetes.io/unreachable and node.kubernetes.io/not-ready taints to the pod template of each deployment. The following code provides an example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      tolerations:
      - key: "node.kubernetes.io/unreachable"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 2
      - key: "node.kubernetes.io/not-ready"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 2
      containers:
      - image: busybox
        command:
        - sleep
        - "3600"
        imagePullPolicy: IfNotPresent
        name: busybox
      restartPolicy: Always
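Because the tolerations must be attached to each deployment, it can help to add them programmatically. The following Python sketch shows one possible approach, using only the standard library and a manifest that is already loaded as a dictionary. The helper name with_eviction_tolerations and the example.com/dedicated toleration are hypothetical and are used only for illustration.

```python
import copy
import json

# The two node-failure taints discussed above.
NODE_FAILURE_TAINTS = (
    "node.kubernetes.io/unreachable",
    "node.kubernetes.io/not-ready",
)


def with_eviction_tolerations(deployment: dict, seconds: int) -> dict:
    """Return a copy of a Deployment manifest with NoExecute tolerations set."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    patched = copy.deepcopy(deployment)  # leave the caller's manifest untouched
    pod_spec = patched["spec"]["template"]["spec"]
    # Keep unrelated tolerations; replace existing entries for the two taints.
    kept = [t for t in (pod_spec.get("tolerations") or [])
            if t.get("key") not in NODE_FAILURE_TAINTS]
    added = [
        {
            "key": key,
            "operator": "Exists",
            "effect": "NoExecute",
            "tolerationSeconds": seconds,
        }
        for key in NODE_FAILURE_TAINTS
    ]
    pod_spec["tolerations"] = kept + added
    return patched


if __name__ == "__main__":
    # Minimal stand-in for the busybox Deployment shown above.
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "busybox", "namespace": "default"},
        "spec": {
            "replicas": 2,
            "selector": {"matchLabels": {"app": "busybox"}},
            "template": {
                "metadata": {"labels": {"app": "busybox"}},
                "spec": {
                    "tolerations": [
                        {"key": "example.com/dedicated", "operator": "Exists"}
                    ],
                    "containers": [{"name": "busybox", "image": "busybox"}],
                },
            },
        },
    }
    print(json.dumps(with_eviction_tolerations(deployment, 2), indent=2))
```

The helper returns a new dictionary instead of modifying its input, so the same base manifest can be reused with different durations. The JSON output is intended to be saved to a file and applied in the same way as a YAML manifest.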