Capacity Blocks for ML
Capacity Blocks for ML allow you to reserve highly sought-after GPU instances on a future date to support your short-duration machine learning (ML) workloads. Instances that run inside a Capacity Block are automatically placed close together inside Amazon EC2 UltraClusters.
With Capacity Blocks, you can see when GPU instance capacity is available on future dates, and you can schedule a Capacity Block to start at a time that works best for you. When you reserve a Capacity Block, you get predictable capacity assurance for GPU instances while paying only for the amount of time that you need. We recommend Capacity Blocks when you need GPUs to support your ML workloads for days or weeks at a time and don't want to pay for a reservation while your GPU instances aren't in use.
The following are some common use cases for Capacity Blocks.
- ML model training and fine-tuning – Get uninterrupted access to the GPU instances that you reserved to complete ML model training and fine-tuning.
- ML experiments and prototypes – Run experiments and build prototypes that require GPU instances for short durations.
Capacity Blocks are currently available for p5.48xlarge, p5e.48xlarge, p4d.24xlarge, and trn1.32xlarge instances. The p5.48xlarge instances are available in the US East (N. Virginia) and US East (Ohio) Regions. The p5e.48xlarge instances are available in the US East (Ohio) Region. The p4d.24xlarge instances are available in the US East (Ohio) and US West (Oregon) Regions. The trn1.32xlarge instances are available in the Asia Pacific (Melbourne) Region. You can reserve a Capacity Block with a reservation start time up to eight weeks in the future.
You can use Capacity Blocks to reserve p5, p5e, p4d, and trn1 instances with the following reservation duration and instance quantity options.
- Reservation durations in 1-day increments up to 14 days, and in 7-day increments up to 28 days total
- Reservation instance quantity options of 1, 2, 4, 8, 16, 32, or 64 instances
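The options above can be checked locally before you search for an offering. The following is a minimal Python sketch, not an AWS API: the function names are hypothetical, and it assumes that "7-day increments up to 28 days" means that 21 and 28 days are the only durations allowed beyond 14 days.

```python
# Instance quantities listed in the options above.
ALLOWED_INSTANCE_COUNTS = {1, 2, 4, 8, 16, 32, 64}


def is_valid_duration_days(days: int) -> bool:
    """True for 1..14 days, or for 21 or 28 days (assumed reading of the list)."""
    if 1 <= days <= 14:
        return True
    return days in (21, 28)


def is_valid_instance_count(count: int) -> bool:
    """True if the count is one of the listed reservation quantities."""
    return count in ALLOWED_INSTANCE_COUNTS


def duration_hours(days: int) -> int:
    """Convert a valid duration in days to hours (hypothetical helper)."""
    if not is_valid_duration_days(days):
        raise ValueError(f"unsupported duration: {days} days")
    return days * 24
```

For example, `is_valid_duration_days(15)` is rejected under this reading, while `duration_hours(2)` returns the hour count for a two-day reservation.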
To reserve a Capacity Block, you start by specifying your capacity needs, including the instance type, the number of instances, amount of time, earliest start date, and latest end date that you need. Then, you can see an available Capacity Block offering that meets your specifications. The Capacity Block offering includes details such as start time, Availability Zone, and reservation price. The price of a Capacity Block offering depends on available supply and demand at the time the offering was delivered. After you reserve a Capacity Block, the price doesn't change. For more information, see Capacity Blocks pricing and billing.
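The search step described above might look like the following Python sketch. It assumes the boto3 EC2 client and its `describe_capacity_block_offerings` operation; the parameter names reflect that operation as we understand it, so verify them against your SDK version. The helper names, the dates, and the example values are illustrative only.

```python
from datetime import datetime, timezone


def build_offering_query(instance_type, instance_count, duration_days,
                         earliest_start, latest_end):
    """Build keyword arguments for a Capacity Block offering search."""
    return {
        "InstanceType": instance_type,
        "InstanceCount": instance_count,
        # The API expresses the duration in hours (assumption: whole days * 24).
        "CapacityDurationHours": duration_days * 24,
        "StartDateRange": earliest_start,
        "EndDateRange": latest_end,
    }


def find_offerings(query, region_name):
    """Send the query to EC2. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    response = ec2.describe_capacity_block_offerings(**query)
    return response["CapacityBlockOfferings"]


# Illustrative values: four p5.48xlarge instances for two days.
example_query = build_offering_query(
    "p5.48xlarge", 4, 2,
    datetime(2030, 1, 1, tzinfo=timezone.utc),
    datetime(2030, 1, 15, tzinfo=timezone.utc),
)
```

Calling `find_offerings(example_query, "us-east-2")` would return the matching offerings, if any, each with details such as start time, Availability Zone, and price.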
When you purchase a Capacity Block offering, your reservation is created for the date and number of instances that you selected. When your Capacity Block reservation begins, you can target instance launches by specifying the reservation ID in your launch requests.
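As a rough illustration of targeting a reservation at launch, the following Python sketch builds a request for the boto3 `run_instances` operation. It assumes that the launch uses the `capacity-block` market type together with a capacity reservation target. The helper name and all IDs shown are placeholders, not real resources.

```python
def build_launch_request(reservation_id, instance_type, image_id, count=1):
    """Build run_instances arguments that target a Capacity Block reservation."""
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
        # Mark the launch as a Capacity Block purchase option.
        "InstanceMarketOptions": {"MarketType": "capacity-block"},
        # Target the reservation explicitly by its ID.
        "CapacityReservationSpecification": {
            "CapacityReservationTarget": {
                "CapacityReservationId": reservation_id,
            }
        },
    }


# Placeholder IDs for illustration only.
request = build_launch_request(
    reservation_id="cr-EXAMPLE",
    instance_type="p5.48xlarge",
    image_id="ami-EXAMPLE",
    count=2,
)
```

With credentials configured, the request could be passed as `boto3.client("ec2").run_instances(**request)` after the reservation becomes active.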
You can use all the instances you reserved until 30 minutes before the end time of the Capacity Block. With 30 minutes left in your Capacity Block reservation, we begin terminating any instances that are running in the Capacity Block. We use this time to clean up your instances before delivering the Capacity Block to the next customer. The price of the Capacity Block doesn't include a charge for the last 30 minutes of the reservation. We emit an event through EventBridge 10 minutes before the termination process begins (that is, 40 minutes before the end time). For more information, see Monitor Capacity Blocks using EventBridge.
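The end-of-reservation timing in the preceding paragraph can be worked out with simple date arithmetic. The following Python sketch is only an illustration; the function and key names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Termination begins 30 minutes before the reservation end time.
TERMINATION_LEAD = timedelta(minutes=30)
# The EventBridge event is emitted 10 minutes before termination begins.
WARNING_LEAD = timedelta(minutes=10)


def end_of_block_timeline(end_time: datetime) -> dict:
    """Return the key timestamps leading up to a Capacity Block's end time."""
    termination_start = end_time - TERMINATION_LEAD
    warning_event = termination_start - WARNING_LEAD
    return {
        "warning_event": warning_event,
        "termination_start": termination_start,
        "end": end_time,
    }


# Example: a reservation that ends at 11:30 UTC.
timeline = end_of_block_timeline(
    datetime(2030, 1, 10, 11, 30, tzinfo=timezone.utc)
)
```

Here the warning event falls at 10:50 UTC and termination begins at 11:00 UTC, so a job should checkpoint before the warning arrives rather than rely on the final 30 minutes.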
Supported platforms
Capacity Blocks for ML currently support p5.48xlarge, p5e.48xlarge, p4d.24xlarge, and trn1.32xlarge instances with default tenancy. When you use the AWS Management Console to purchase a Capacity Block, the default platform option is Linux/UNIX. When you use the AWS Command Line Interface (AWS CLI) or AWS SDK to purchase a Capacity Block, the following platform options are available:
- Linux/UNIX
- Red Hat Enterprise Linux
- RHEL with HA
- SUSE Linux
- Ubuntu Pro
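When you purchase through an SDK, the platform is passed as part of the purchase request. The following Python sketch assumes the boto3 `purchase_capacity_block` operation and its `CapacityBlockOfferingId` and `InstancePlatform` parameters; the helper name and the offering ID are placeholders, and the platform strings are taken from the list above.

```python
# Platform options listed above for AWS CLI and SDK purchases.
SUPPORTED_PLATFORMS = (
    "Linux/UNIX",
    "Red Hat Enterprise Linux",
    "RHEL with HA",
    "SUSE Linux",
    "Ubuntu Pro",
)


def build_purchase_request(offering_id, platform="Linux/UNIX"):
    """Build purchase_capacity_block arguments after checking the platform."""
    if platform not in SUPPORTED_PLATFORMS:
        raise ValueError(f"unsupported platform: {platform!r}")
    return {
        "CapacityBlockOfferingId": offering_id,
        "InstancePlatform": platform,
    }


# Placeholder offering ID for illustration only.
purchase = build_purchase_request("cbr-EXAMPLE", platform="Ubuntu Pro")
```

The resulting dictionary could be passed as `boto3.client("ec2").purchase_capacity_block(**purchase)`. Because Capacity Blocks can't be modified or canceled, validating the request locally before sending it is a cheap safeguard.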
Considerations
Before you use Capacity Blocks, consider the following details and limitations.
- Capacity Blocks start and end at 11:30 AM Coordinated Universal Time (UTC).
- The termination process for instances running in a Capacity Block begins at 11:00 AM Coordinated Universal Time (UTC) on the final day of the reservation.
- Capacity Blocks can be reserved with a start time up to 8 weeks in the future.
- Capacity Block modifications and cancellations aren't allowed.
- Capacity Blocks can't be shared across AWS accounts or within your AWS Organization.
- Capacity Blocks can't be used in a capacity reservation group.
- The total number of instances that can be reserved in Capacity Blocks across all accounts in your AWS Organization can't exceed 64 instances on a particular date.
- To use a Capacity Block, instances must specifically target the reservation ID.
- Instances in a Capacity Block don't count against your On-Demand Instances limits.
- For P5 instances using a custom AMI, ensure that you have the required software and configuration for EFA.
- For Amazon EKS managed node groups, see Create a managed node group with Amazon EC2 Capacity Blocks for ML. For Amazon EKS self-managed node groups, see Use Capacity Blocks for ML with self-managed nodes.
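The start-time considerations above can be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, and it assumes that a reservation of N days runs from 11:30 UTC on the start date to 11:30 UTC N days later.

```python
from datetime import date, datetime, time, timedelta, timezone

# Capacity Blocks start and end at 11:30 UTC.
BLOCK_BOUNDARY_UTC = time(11, 30)
# A start time can be at most 8 weeks in the future.
MAX_LEAD = timedelta(weeks=8)


def block_window(start_day: date, duration_days: int):
    """Return (start, end) datetimes in UTC for a reservation."""
    start = datetime.combine(start_day, BLOCK_BOUNDARY_UTC, tzinfo=timezone.utc)
    end = start + timedelta(days=duration_days)
    return start, end


def can_reserve(start: datetime, now: datetime) -> bool:
    """True if the start time is not in the past and within 8 weeks of now."""
    return now <= start <= now + MAX_LEAD


start, end = block_window(date(2030, 3, 1), 3)
```

For the example values, the window runs from 11:30 UTC on March 1 to 11:30 UTC on March 4, and `can_reserve(start, now)` holds only when `now` is no more than 56 days before the start.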
Related resources
After you create a Capacity Block, you can do the following:
- Launch instances into the Capacity Block. For more information, see Launch instances into Capacity Blocks.
- Create an Amazon EC2 Auto Scaling group. For more information, see Use Capacity Blocks for machine learning workloads in the Amazon EC2 Auto Scaling User Guide.
  Note: If you use Amazon EC2 Auto Scaling or Amazon EKS, you can schedule scaling to run at the start of the Capacity Block reservation. With scheduled scaling, AWS automatically handles retries for you, so you don't need to implement retry logic to handle transient failures.
- Enhance ML workflows with AWS ParallelCluster. For more information, see Enhancing ML workflows with AWS ParallelCluster and Amazon EC2 Capacity Blocks for ML. For more information about AWS ParallelCluster, see What is AWS ParallelCluster.