Overview of Machine Learning on Amazon EKS - Amazon EKS

Amazon Elastic Kubernetes Service (EKS) is a managed Kubernetes platform for deploying, managing, and scaling AI and machine learning (ML) workloads with fine-grained control over cluster configuration. Built on the open source Kubernetes ecosystem, EKS lets you apply your existing Kubernetes expertise while integrating with open source tools and AWS services.

Whether you’re training large-scale models, running real-time online inference, or deploying generative AI applications, EKS delivers the performance, scalability, and cost efficiency your AI/ML projects demand.

Why Choose EKS for AI/ML?

EKS helps you deploy and manage complex AI/ML workloads, providing the control and scalability that advanced projects need. For teams new to AI/ML deployments, existing Kubernetes skills transfer directly, allowing efficient orchestration of multiple workloads.

EKS supports everything from operating system customizations to compute scaling, and its open source foundation promotes technological flexibility, preserving choice for future infrastructure decisions. The platform provides the performance and tuning options AI/ML workloads require, supporting features such as:

  • Full cluster control to fine-tune costs and configurations without hidden abstractions

  • Sub-second latency for real-time inference workloads in production

  • Advanced customizations like multi-instance GPUs, multi-cloud strategies, and OS-level tuning

  • Ability to centralize workloads using EKS as a unified orchestrator across AI/ML pipelines

Key use cases

Amazon EKS provides a platform for a wide range of AI/ML workloads, including large-scale model training, real-time online inference, and generative AI applications, and supports various technologies and deployment patterns.

Case studies

Customers choose Amazon EKS for various reasons, such as optimizing GPU usage or running real-time inference workloads with sub-second latency, as demonstrated in the following case studies. For a list of all case studies for Amazon EKS, see AWS Customer Success Stories.

  • Unitary processes 26 million videos daily using AI for content moderation, which requires high-throughput, low-latency inference. It achieved an 80% reduction in container boot times, ensuring fast response to scaling events as traffic fluctuates.

  • Miro, the visual collaboration platform supporting 70 million users worldwide, reported an 80% reduction in compute costs compared to its previous self-managed Kubernetes clusters.

  • Synthesia, which offers generative AI video creation as a service for customers to create realistic videos from text prompts, achieved a 30x improvement in ML model training throughput.

  • Harri, providing HR technology for the hospitality industry, achieved 90% faster scaling in response to spikes in demand and reduced its compute costs by 30% by migrating to AWS Graviton processors.

  • Ada Support, an AI-powered customer service automation company, achieved a 15% reduction in compute costs alongside a 30% increase in compute efficiency.

  • Snorkel AI, which equips enterprises to build and adapt foundation models and large language models, achieved over 40% cost savings by implementing intelligent scaling mechanisms for its GPU resources.

Start using Machine Learning on EKS

To start planning and running ML platforms and workloads on EKS, proceed to the Get started with ML section.