Amazon EKS optimized Amazon Linux AMIs

The Amazon EKS optimized Amazon Linux AMI is built on top of Amazon Linux 2 (AL2) and Amazon Linux 2023 (AL2023). It's configured to serve as the base image for Amazon EKS nodes and includes the following components:

  • kubelet

  • AWS IAM Authenticator

  • Docker (Amazon EKS version 1.23 and earlier)

  • containerd

Note
  • You can track security or privacy events for AL2 at the Amazon Linux security center or subscribe to the associated RSS feed. Security and privacy events include an overview of the issue, what packages are affected, and how to update your instances to correct the issue.

  • Before deploying an accelerated or Arm AMI, review the information in Amazon EKS optimized accelerated Amazon Linux AMIs and Amazon EKS optimized Arm Amazon Linux AMIs.

  • For Kubernetes version 1.23, you can use an optional bootstrap flag to test migration from Docker to containerd. For more information, see Test migration from Docker to containerd.

  • Starting with Kubernetes version 1.25, you will no longer be able to use Amazon EC2 P2 instances with the Amazon EKS optimized accelerated Amazon Linux AMIs out of the box. These AMIs for Kubernetes versions 1.25 or later will support NVIDIA 525 series or later drivers, which are incompatible with the P2 instances. However, NVIDIA 525 series or later drivers are compatible with the P3, P4, and P5 instances, so you can use those instances with the AMIs for Kubernetes version 1.25 or later. Before your Amazon EKS clusters are upgraded to version 1.25, migrate any P2 instances to P3, P4, or P5 instances. You should also proactively upgrade your applications to work with the NVIDIA 525 series or later drivers. We plan to backport the newer NVIDIA 525 series or later drivers to Kubernetes versions 1.23 and 1.24 in late January 2024.

  • Starting with Amazon EKS version 1.30, any newly created node groups will automatically default to using AL2023 as the node operating system across all Amazon EKS versions. Previously, new node groups would default to AL2. You can continue to use AL2 by choosing it as the AMI type when creating a new node group.

  • Support for AL2 will end on June 30th, 2025. For more information, see Amazon Linux 2 FAQs.

Upgrade from AL2 to AL2023

The Amazon EKS optimized AMI is available in two families based on AL2 and AL2023. AL2023 is a new Linux-based operating system designed to provide a secure, stable, and high-performance environment for your cloud applications. It's the next generation of Amazon Linux from Amazon Web Services and is available across all supported Amazon EKS versions, including versions 1.23 and 1.24 in extended support. Amazon EKS accelerated AMIs based on AL2023 will be available at a later date. If you have accelerated workloads, you should continue to use the AL2 accelerated AMI or Bottlerocket.

AL2023 offers several improvements over AL2. For a full comparison, see Comparing AL2 and Amazon Linux 2023 in the Amazon Linux 2023 User Guide. Several packages have been added, upgraded, and removed from AL2. It's highly recommended to test your applications with AL2023 before upgrading. For a list of all package changes in AL2023, see Package changes in Amazon Linux 2023 in the Amazon Linux 2023 Release Notes.

In addition to these changes, you should be aware of the following:

  • AL2023 introduces a new node initialization process, nodeadm, that uses a YAML configuration schema. If you're using self-managed node groups or an AMI with a launch template, you'll now need to provide additional cluster metadata explicitly when creating a new node group. An example of the minimum required parameters is as follows, where apiServerEndpoint, certificateAuthority, and the service CIDR (cidr) are now required:

    ---
    apiVersion: node.eks.aws/v1alpha1
    kind: NodeConfig
    spec:
      cluster:
        name: my-cluster
        apiServerEndpoint: https://example.com
        certificateAuthority: Y2VydGlmaWNhdGVBdXRob3JpdHk=
        cidr: 10.100.0.0/16

    In AL2, the metadata from these parameters was discovered from the Amazon EKS DescribeCluster API call. With AL2023, this behavior has changed because the additional API call risks throttling during large node scale-ups. This change doesn't affect you if you're using managed node groups without a launch template or if you're using Karpenter. For more information on certificateAuthority and the service CIDR, see DescribeCluster in the Amazon EKS API Reference.
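
    If you need to look up these values for an existing cluster, one option is the AWS CLI, as in the following sketch (my-cluster is a placeholder for your cluster name):

    # Retrieve the API server endpoint, base64-encoded CA data, and service CIDR
    aws eks describe-cluster --name my-cluster \
      --query "cluster.{endpoint: endpoint, certificateAuthority: certificateAuthority.data, serviceCidr: kubernetesNetworkConfig.serviceIpv4Cidr}" \
      --output json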

  • Docker isn't supported in AL2023 on any supported Amazon EKS version. In AL2, support for Docker ended and was removed starting with Amazon EKS version 1.24. For more information on the deprecation, see Amazon EKS ended support for Dockershim.

  • Amazon VPC CNI version 1.16.2 or greater is required for AL2023.

  • AL2023 requires IMDSv2 by default. IMDSv2 has several benefits that help improve security posture. It uses a session-oriented authentication method that requires the creation of a secret token in a simple HTTP PUT request to start the session. A session's token can be valid for anywhere between 1 second and 6 hours. For more information on how to transition from IMDSv1 to IMDSv2, see Transition to using Instance Metadata Service Version 2 and Get the full benefits of IMDSv2 and disable IMDSv1 across your AWS infrastructure. If you would like to use IMDSv1, you can still do so by manually overriding the settings using instance metadata option launch properties.

    Note

    For IMDSv2, the default hop count for managed node groups is set to 1. This means that containers won't have access to the node's credentials using IMDS. If you require container access to the node's credentials, you can still do so by manually overriding the HttpPutResponseHopLimit in a custom Amazon EC2 launch template, increasing it to 2. Alternatively, you can use Amazon EKS Pod Identity to provide credentials instead of IMDSv2.
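
    As a hedged sketch, the hop limit can be raised on a running instance with the AWS CLI (the instance ID below is a placeholder):

    # Require IMDSv2 tokens and let metadata responses travel one extra hop (into containers)
    aws ec2 modify-instance-metadata-options --instance-id i-1234567890abcdef0 \
      --http-tokens required --http-put-response-hop-limit 2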

  • AL2023 features the next generation of unified control group hierarchy (cgroupv2). cgroupv2 is used by systemd and to implement the container runtime. While AL2023 still includes code that can make the system run using cgroupv1, this isn't a recommended or supported configuration and it will be completely removed in a future major release of Amazon Linux.
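
    To confirm which cgroup hierarchy a node is running, one common check (a general Linux command, not EKS-specific tooling) is the following; it prints cgroup2fs on a cgroupv2 host and tmpfs on a cgroupv1 host.

    stat -fc %T /sys/fs/cgroup/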

For previously existing managed node groups, you can either perform an in-place upgrade or a blue/green upgrade depending on how you're using a launch template:

  • If you're using a custom AMI with a managed node group, you can perform an in-place upgrade by swapping the AMI ID in the launch template. Before using this upgrade strategy, ensure that your applications and any user data carry over to AL2023.

  • If you're using managed node groups with either the standard launch template or a custom launch template that doesn't specify the AMI ID, you're required to upgrade using a blue/green strategy. A blue/green upgrade is typically more complex: it involves creating an entirely new node group where you specify AL2023 as the AMI type (a sketch of one eksctl command for this follows the list). The new node group then needs to be carefully configured to ensure that all custom data from the AL2 node group is compatible with the new OS. Once the new node group has been tested and validated with your applications, you can migrate Pods from the old node group to the new one. Once the migration is complete, you can delete the old node group.
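
As a hedged sketch, a new AL2023 node group for a blue/green upgrade might be created with eksctl as follows (the cluster and node group names are placeholders; verify that your eksctl version supports the AmazonLinux2023 AMI family):

# Create a replacement node group running AL2023 alongside the existing AL2 node group
eksctl create nodegroup --cluster my-cluster --name my-al2023-nodegroup \
  --node-ami-family AmazonLinux2023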

If you're using Karpenter and want to use AL2023, set the amiFamily field in your AWSNodeTemplate to AL2023. Drift is enabled in Karpenter by default, so once the amiFamily field has been changed, Karpenter automatically updates your worker nodes to the latest AMI when it's available.
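
A minimal sketch of such an AWSNodeTemplate follows. The discovery tag values are placeholders, and you should verify that your Karpenter version supports the AL2023 AMI family:

apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: al2023
spec:
  amiFamily: AL2023                      # switch worker nodes to AL2023 AMIs
  subnetSelector:
    karpenter.sh/discovery: my-cluster   # placeholder discovery tag
  securityGroupSelector:
    karpenter.sh/discovery: my-cluster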

Amazon EKS optimized accelerated Amazon Linux AMIs

Note

Amazon EKS accelerated AMIs based on AL2023 will be available at a later date. If you have accelerated workloads, you should continue to use the AL2 accelerated AMI or Bottlerocket.

The Amazon EKS optimized accelerated Amazon Linux AMI is built on top of the standard Amazon EKS optimized Amazon Linux AMI. It's configured to serve as an optional image for Amazon EKS nodes to support GPU, Inferentia, and Trainium based workloads.

In addition to the standard Amazon EKS optimized AMI configuration, the accelerated AMI includes the following:

  • NVIDIA drivers

  • The nvidia-container-runtime (as the default runtime)

  • AWS Neuron container runtime

For a list of the latest components included in the accelerated AMI, see the amazon-eks-ami Releases on GitHub.

Note
  • The Amazon EKS optimized accelerated AMI only supports GPU and Inferentia based instance types. Make sure to specify these instance types in your node AWS CloudFormation template. By using the Amazon EKS optimized accelerated AMI, you agree to NVIDIA's user license agreement (EULA).

  • The Amazon EKS optimized accelerated AMI was previously referred to as the Amazon EKS optimized AMI with GPU support.

  • Previous versions of the Amazon EKS optimized accelerated AMI installed the nvidia-docker repository. The repository is no longer included in Amazon EKS AMI version v20200529 and later.

To enable GPU based workloads

The following procedure describes how to run a workload on a GPU based instance with the Amazon EKS optimized accelerated AMI.

  1. After your GPU nodes join your cluster, you must apply the NVIDIA device plugin for Kubernetes as a DaemonSet on your cluster. Replace vX.X.X with your desired NVIDIA/k8s-device-plugin version before running the following command.

    kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/vX.X.X/nvidia-device-plugin.yml
  2. You can verify that your nodes have allocatable GPUs with the following command.

    kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"
To deploy a Pod to test that your GPU nodes are configured properly
  1. Create a file named nvidia-smi.yaml with the following contents. Replace tag with your desired tag for nvidia/cuda. This manifest launches an NVIDIA CUDA container that runs nvidia-smi on a node.

    apiVersion: v1
    kind: Pod
    metadata:
      name: nvidia-smi
    spec:
      restartPolicy: OnFailure
      containers:
        - name: nvidia-smi
          image: nvidia/cuda:tag
          args:
            - "nvidia-smi"
          resources:
            limits:
              nvidia.com/gpu: 1
  2. Apply the manifest with the following command.

    kubectl apply -f nvidia-smi.yaml
  3. After the Pod has finished running, view its logs with the following command.

    kubectl logs nvidia-smi

    An example output is as follows.

    Mon Aug  6 20:23:31 20XX
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI XXX.XX                 Driver Version: XXX.XX                    |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  Tesla V100-SXM2...  On   | 00000000:00:1C.0 Off |                    0 |
    | N/A   46C    P0    47W / 300W |      0MiB / 16160MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+
    
    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+

Amazon EKS optimized Arm Amazon Linux AMIs

Arm instances deliver significant cost savings for scale-out and Arm-based applications such as web servers, containerized microservices, caching fleets, and distributed data stores. When adding Arm nodes to your cluster, review the following considerations.

Considerations
  • If your cluster was deployed before August 17, 2020, you must do a one-time upgrade of critical cluster add-on manifests. This is so that Kubernetes can pull the correct image for each hardware architecture in use in your cluster. For more information about updating cluster add-ons, see Update the Kubernetes version for your Amazon EKS cluster. If you deployed your cluster on or after August 17, 2020, then your CoreDNS, kube-proxy, and Amazon VPC CNI plugin for Kubernetes add-ons are already multi-architecture capable.

  • Applications deployed to Arm nodes must be compiled for Arm.

  • If you have DaemonSets that are deployed in an existing cluster, or you want to deploy them to a new cluster that you also want to deploy Arm nodes in, then verify that your DaemonSet can run on all hardware architectures in your cluster.

  • You can run Arm node groups and x86 node groups in the same cluster. If you do, consider deploying multi-architecture container images to a container repository such as Amazon Elastic Container Registry and then adding node selectors to your manifests so that Kubernetes knows what hardware architecture a Pod can be deployed to (a minimal sketch follows this list). For more information, see Pushing a multi-architecture image in the Amazon ECR User Guide and the Introducing multi-architecture container images for Amazon ECR blog post.
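
For example, a minimal Pod sketch that pins scheduling to Arm nodes using the well-known kubernetes.io/arch label (the image shown is a placeholder for any multi-architecture image):

apiVersion: v1
kind: Pod
metadata:
  name: arm64-example
spec:
  nodeSelector:
    kubernetes.io/arch: arm64    # schedule only on Arm nodes; use amd64 for x86 nodes
  containers:
    - name: app
      image: public.ecr.aws/nginx/nginx:latest   # placeholder multi-architecture image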

Test migration from Docker to containerd

Amazon EKS ended support for Docker starting with the Kubernetes version 1.24 launch. For more information, see Amazon EKS ended support for Dockershim.

For Kubernetes version 1.23, you can use an optional bootstrap flag to enable the containerd runtime for Amazon EKS optimized AL2 AMIs. This feature gives you a clear path to migrate to containerd when updating to version 1.24 or later. The containerd runtime is widely adopted in the Kubernetes community and is a graduated project with the CNCF. You can test it by adding a node group to a new or existing cluster.

You can enable the bootstrap flag by creating one of the following types of node groups.

Self-managed

Create the node group using the instructions in Launching self-managed Amazon Linux nodes. Specify an Amazon EKS optimized AMI and the following text for the BootstrapArguments parameter.

--container-runtime containerd
Managed

If you use eksctl, create a file named my-nodegroup.yaml with the following contents. Replace every example value with your own values. The node group name can't be longer than 63 characters. It must start with a letter or digit, but can also include hyphens and underscores for the remaining characters. To retrieve an optimized AMI ID to use in place of ami-1234567890abcdef0, see Retrieving Amazon EKS optimized Amazon Linux AMI IDs.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster
  region: region-code
  version: "1.23"
managedNodeGroups:
  - name: my-nodegroup
    ami: ami-1234567890abcdef0
    overrideBootstrapCommand: |
      #!/bin/bash
      /etc/eks/bootstrap.sh my-cluster --container-runtime containerd
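
One way to look up the optimized AMI ID for the ami field is to query the documented SSM parameter for AL2 AMIs (replace 1.23 and region-code with your own values):

# Returns the current recommended Amazon EKS optimized AL2 AMI ID for the given Kubernetes version
aws ssm get-parameter --name /aws/service/eks/optimized-ami/1.23/amazon-linux-2/recommended/image_id \
  --region region-code --query "Parameter.Value" --output text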
Note

If you launch many nodes simultaneously, you may also want to specify values for the --apiserver-endpoint, --b64-cluster-ca, and --dns-cluster-ip bootstrap arguments to avoid errors. For more information, see Specifying an AMI.
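
A hedged sketch of an overrideBootstrapCommand that passes these arguments follows. Every value is a placeholder; substitute your cluster's real API server endpoint, base64-encoded certificate authority data, and cluster DNS IP.

#!/bin/bash
# Pass cluster metadata explicitly so each node skips the DescribeCluster API call
/etc/eks/bootstrap.sh my-cluster --container-runtime containerd \
  --apiserver-endpoint https://example.gr7.region-code.eks.amazonaws.com \
  --b64-cluster-ca Y2VydGlmaWNhdGVBdXRob3JpdHk= \
  --dns-cluster-ip 10.100.0.10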

Run the following command to create the node group.

eksctl create nodegroup -f my-nodegroup.yaml

If you prefer to use a different tool to create your managed node group, you must deploy the node group using a launch template. In your launch template, specify an Amazon EKS optimized AMI ID, then provide the following user data when you deploy the node group. This user data passes arguments into the bootstrap.sh file. For more information about the bootstrap file, see bootstrap.sh on GitHub.

/etc/eks/bootstrap.sh my-cluster --container-runtime containerd
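
Launch template user data for managed node groups is typically supplied in MIME multi-part format; a hedged sketch wrapping the command above:

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="

--==MYBOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
/etc/eks/bootstrap.sh my-cluster --container-runtime containerd

--==MYBOUNDARY==--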

More information

For more information about using Amazon EKS optimized Amazon Linux AMIs, see the following sections: