Compute environment
Job queues are mapped to one or more compute environments. Compute environments contain the Amazon ECS container
instances that are used to run containerized batch jobs. A specific compute environment can also be mapped to one or
more job queues. Within a job queue, the associated compute environments each have an order that's used by the
scheduler to determine where jobs that are ready to be run will run. If the first compute environment has a status of
VALID and has available resources, the job is scheduled to a container instance within that compute environment. If the
first compute environment has a status of INVALID or can't provide a suitable compute resource, the scheduler attempts
to run the job on the next compute environment.
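The ordering rule described above can be sketched in a few lines of Python. This is only an illustrative model, not the AWS Batch scheduler itself. The ComputeEnvironment class, its has_capacity flag, and the pick_compute_environment function are hypothetical names introduced for this example, and the sketch assumes that a lower order value is tried first.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ComputeEnvironment:
    # Hypothetical model of one job queue entry; not an AWS Batch API type.
    name: str
    order: int          # position within the job queue (assumed: lower is tried first)
    status: str         # "VALID" or "INVALID"
    has_capacity: bool  # stand-in for "can provide a suitable compute resource"


def pick_compute_environment(envs: List[ComputeEnvironment]) -> Optional[ComputeEnvironment]:
    """Return the first environment, by order, that is VALID and has capacity."""
    for env in sorted(envs, key=lambda e: e.order):
        if env.status == "VALID" and env.has_capacity:
            return env
    return None  # no environment can take the job right now


# Placeholder environments: the first one is INVALID, so the second is chosen.
queue = [
    ComputeEnvironment("spot-ce", order=1, status="INVALID", has_capacity=True),
    ComputeEnvironment("on-demand-ce", order=2, status="VALID", has_capacity=True),
]
chosen = pick_compute_environment(queue)
print(chosen.name if chosen else "job stays queued")
```

In this sketch, a job that finds no VALID environment with capacity is simply left waiting, which is a simplification of the real scheduling behavior.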
Topics
- Managed compute environments
- Unmanaged compute environments
- Compute resource AMIs
- Launch template support
- Creating a compute environment
- Compute environment template
- Compute environment parameters
- EC2 Configurations
- Allocation strategies
- Updating compute environments
- Amazon EKS compute environments
- Compute Resource Memory Management
Managed compute environments
You can use a managed compute environment to have AWS Batch manage the capacity and instance types of the compute resources within the environment. This is based on the compute resource specifications that you define when you create the compute environment. You can choose to use Amazon EC2 On-Demand Instances and Amazon EC2 Spot Instances. Alternatively, you can use Fargate and Fargate Spot capacity in your managed compute environment. When using Spot Instances, you can optionally set a maximum price. This way, Spot Instances only launch when the Spot Instance price is under a specified percentage of the On-Demand price.
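To make the maximum price rule concrete, the following Python sketch works through the arithmetic. It's an illustration only: the spot_launch_allowed function, its max_percentage argument, and the prices are hypothetical and don't come from AWS Batch.

```python
def spot_launch_allowed(spot_price: float, on_demand_price: float, max_percentage: int) -> bool:
    """Return True when the Spot price is under max_percentage percent of the On-Demand price."""
    if not 0 < max_percentage <= 100:
        raise ValueError("max_percentage must be between 1 and 100")
    # The ceiling is a fraction of the On-Demand price, not an absolute amount.
    ceiling = on_demand_price * max_percentage / 100
    return spot_price < ceiling


# Placeholder prices in USD per hour: a 60 percent limit on a 0.10 On-Demand price
# gives a ceiling of about 0.06.
print(spot_launch_allowed(spot_price=0.05, on_demand_price=0.10, max_percentage=60))
print(spot_launch_allowed(spot_price=0.07, on_demand_price=0.10, max_percentage=60))
```

Because the limit is expressed as a percentage, the effective ceiling moves whenever the On-Demand price changes.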
Important
Fargate Spot isn't supported for Windows containers on AWS Fargate. A job queue is blocked if a FargateWindows job is submitted to a job queue that only uses Fargate Spot compute environments.
Managed compute environments launch Amazon EC2 instances into the VPC and subnets that you specify and then register them with an Amazon ECS cluster. The Amazon EC2 instances need external network access to communicate with the Amazon ECS service endpoint. Some subnets don't provide Amazon EC2 instances with public IP addresses. If your Amazon EC2 instances don't have a public IP address, they must use network address translation (NAT) to gain this access. For more information, see NAT gateways in the Amazon VPC User Guide. For more information about how to create a VPC, see Creating a virtual private cloud.
By default, AWS Batch managed compute environments use a recent, approved version of the Amazon ECS optimized AMI for compute resources. However, you might want to create your own AMI to use for your managed compute environments for various reasons. For more information, see Compute resource AMIs.
Note
AWS Batch doesn't automatically upgrade the AMIs in a compute environment after it's created. For example, it doesn't update the AMIs in your compute environment when a newer version of the Amazon ECS optimized AMI is released. You're responsible for the management of the guest operating system. This includes any updates and security patches. You're also responsible for any additional application software or utilities that you install on the compute resources. There are two ways to use a new AMI for your AWS Batch jobs. The original method is to complete these steps:
- Create a new compute environment with the new AMI.
- Add the compute environment to an existing job queue.
- Remove the earlier compute environment from your job queue.
- Delete the earlier compute environment.
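The four steps above can be written down as an ordered plan of AWS Batch API operations. The following Python sketch only builds that plan as data and doesn't call AWS. The ami_replacement_plan function and all resource names are hypothetical, the order values are an assumption, and only the parameters relevant to the swap are shown, so the CreateComputeEnvironment entry isn't a complete request. In practice, the earlier environment might also need to be disabled before it can be deleted.

```python
from typing import Dict, List


def ami_replacement_plan(job_queue: str, old_ce: str, new_ce: str, new_ami_id: str) -> List[Dict]:
    """Return the ordered operations for moving a job queue to a new AMI (partial parameters)."""
    return [
        # Step 1: create a new compute environment with the new AMI.
        {"operation": "CreateComputeEnvironment",
         "params": {"computeEnvironmentName": new_ce,
                    "computeResources": {"imageId": new_ami_id}}},
        # Step 2: add the new environment to the existing job queue.
        # Assumption: the new environment is placed ahead of the earlier one.
        {"operation": "UpdateJobQueue",
         "params": {"jobQueue": job_queue,
                    "computeEnvironmentOrder": [
                        {"order": 1, "computeEnvironment": new_ce},
                        {"order": 2, "computeEnvironment": old_ce}]}},
        # Step 3: remove the earlier environment from the job queue.
        {"operation": "UpdateJobQueue",
         "params": {"jobQueue": job_queue,
                    "computeEnvironmentOrder": [
                        {"order": 1, "computeEnvironment": new_ce}]}},
        # Step 4: delete the earlier compute environment.
        {"operation": "DeleteComputeEnvironment",
         "params": {"computeEnvironment": old_ce}},
    ]


# Placeholder names and AMI ID, for illustration only.
plan = ami_replacement_plan("example-queue", "ce-old", "ce-new", "ami-0123456789abcdef0")
for step in plan:
    print(step["operation"])
```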
In April 2022, AWS Batch added enhanced support for updating compute environments. For more information, see Updating compute environments. To use the enhanced updating of compute environments to update AMIs, follow these rules:
- Either don't set the service role (serviceRole) parameter or set it to the AWSServiceRoleForBatch service-linked role.
- Set the allocation strategy (allocationStrategy) parameter to BEST_FIT_PROGRESSIVE, SPOT_CAPACITY_OPTIMIZED, or SPOT_PRICE_CAPACITY_OPTIMIZED.
- Set the update to latest image version (updateToLatestImageVersion) parameter to true.
- Don't specify an AMI ID in imageId, imageIdOverride (in ec2Configuration), or in the launch template (launchTemplate). In that case, AWS Batch selects the latest Amazon ECS optimized AMI that's supported by AWS Batch at the time the infrastructure update is initiated. Alternatively, you can specify the AMI ID in the imageId or imageIdOverride parameters, or the launch template identified by the LaunchTemplate properties. Changing any of these properties starts an infrastructure update. If the AMI ID is specified in the launch template, it can't be replaced by specifying an AMI ID in either the imageId or imageIdOverride parameters. It can only be replaced by specifying a different launch template. Or, if the launch template version is set to $Default or $Latest, by setting either a new default version for the launch template (if it's $Default) or by adding a new version to the launch template (if it's $Latest).
If these rules are followed, any update that starts an infrastructure update causes the AMI ID to be re-selected. If the
version setting in the launch template (launchTemplate) is set to $Latest or $Default, the latest or default version of
the launch template is evaluated at the time of the infrastructure update, even if the launchTemplate wasn't updated.
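As a rough aid, the rules above can be checked against a compute environment definition before you request an update. The following Python sketch is a hypothetical helper, not part of AWS Batch or any SDK. It assumes the definition is a plain dictionary that uses the parameter names mentioned above, and it takes a launch_template_has_ami flag from the caller because the launch template contents aren't visible in the definition itself.

```python
from typing import Dict, List

SUPPORTED_STRATEGIES = {
    "BEST_FIT_PROGRESSIVE",
    "SPOT_CAPACITY_OPTIMIZED",
    "SPOT_PRICE_CAPACITY_OPTIMIZED",
}


def latest_ami_blockers(ce: Dict, launch_template_has_ami: bool = False) -> List[str]:
    """List the rules that stop AWS Batch from selecting the latest AMI for this definition."""
    blockers = []
    resources = ce.get("computeResources", {})

    # Rule 1: serviceRole is unset or is the service-linked role (name or ARN).
    role = ce.get("serviceRole")
    if role and role.split("/")[-1] != "AWSServiceRoleForBatch":
        blockers.append("serviceRole is a custom role")

    # Rule 2: allocationStrategy is one of the supported values.
    if resources.get("allocationStrategy") not in SUPPORTED_STRATEGIES:
        blockers.append("allocationStrategy is not a supported value")

    # Rule 3: updateToLatestImageVersion is true.
    if resources.get("updateToLatestImageVersion") is not True:
        blockers.append("updateToLatestImageVersion is not true")

    # Rule 4: no AMI ID is specified in imageId, imageIdOverride, or the launch template.
    overrides = [c.get("imageIdOverride") for c in resources.get("ec2Configuration", [])]
    if resources.get("imageId") or any(overrides) or launch_template_has_ami:
        blockers.append("an AMI ID is specified explicitly")

    return blockers


# Placeholder definitions, for illustration only.
auto_ce = {"computeResources": {"allocationStrategy": "BEST_FIT_PROGRESSIVE",
                                "updateToLatestImageVersion": True}}
pinned_ce = {"computeResources": {"allocationStrategy": "BEST_FIT_PROGRESSIVE",
                                  "updateToLatestImageVersion": True,
                                  "imageId": "ami-0123456789abcdef0"}}
print(latest_ami_blockers(auto_ce))
print(latest_ami_blockers(pinned_ce))
```

An explicitly specified AMI ID is reported here only because it prevents automatic selection of the latest AMI. As described above, specifying the AMI ID yourself is still a valid way to use the enhanced update.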
Consideration when creating multi-node parallel jobs
AWS Batch recommends creating separate, dedicated compute environments for multi-node parallel (MNP) jobs and for
non-MNP jobs. This is due to the way compute capacity is created in your managed compute environment. When creating
a new managed compute environment, if you specify a minvCpus value greater than zero, then AWS Batch creates an
instance pool for use with non-MNP jobs only. If a multi-node parallel job is submitted, AWS Batch creates new instance
capacity to run the multi-node parallel jobs. In cases where both single-node and multi-node parallel jobs run in the
same compute environment and either a minvCpus or maxvCpus value is set, if the required compute resources are
unavailable, AWS Batch waits for the current jobs to finish before creating the compute resources necessary to run the
new jobs.
Unmanaged compute environments
In an unmanaged compute environment, you manage your own compute resources. You must verify that the AMI you use for your compute resources meets the Amazon ECS container instance AMI specification. For more information, see Compute resource AMI specification and Creating a compute resource AMI.
Note
AWS Fargate resources aren't supported in unmanaged compute environments.
After you create your unmanaged compute environment, use the DescribeComputeEnvironments API operation to view the compute environment details. Find the Amazon ECS cluster that's associated with the environment and then manually launch your container instances into that Amazon ECS cluster.
The following AWS CLI command also provides the Amazon ECS cluster ARN.
$ aws batch describe-compute-environments \
    --compute-environments unmanagedCE \
    --query "computeEnvironments[].ecsClusterArn"
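If you process the full JSON output of describe-compute-environments yourself instead of using --query, the same selection can be sketched in Python. The ecs_cluster_arns function is a hypothetical helper, and the sample response is a trimmed placeholder with a made-up ARN, not real output.

```python
import json
from typing import List


def ecs_cluster_arns(describe_output: str) -> List[str]:
    """Collect ecsClusterArn from each entry, like computeEnvironments[].ecsClusterArn."""
    response = json.loads(describe_output)
    return [
        ce["ecsClusterArn"]
        for ce in response.get("computeEnvironments", [])
        if ce.get("ecsClusterArn") is not None  # skip entries without a cluster ARN
    ]


# Trimmed placeholder response; real output has many more fields per environment.
sample = json.dumps({
    "computeEnvironments": [
        {
            "computeEnvironmentName": "unmanagedCE",
            "ecsClusterArn": "arn:aws:ecs:us-east-1:123456789012:cluster/example-cluster",
        },
        {"computeEnvironmentName": "no-cluster-yet"},
    ]
})
arns = ecs_cluster_arns(sample)
print(arns)
```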
For more information, see Launching an Amazon ECS container instance in the Amazon Elastic Container Service Developer
Guide. When you launch your compute resources, specify the Amazon ECS cluster that the resources register with by using
the following Amazon EC2 user data. Replace ecsClusterArn with the cluster ARN that you obtained with the previous
command.
#!/bin/bash
echo "ECS_CLUSTER=ecsClusterArn" >> /etc/ecs/ecs.config