Release notes for Slurm versions in AWS PCS
This topic describes important changes for each Slurm version currently supported in AWS PCS. We recommend you review the changes between the old and new versions when you upgrade your cluster.
Slurm 24.11
Changes implemented in AWS PCS
- AWS PCS supports Slurm accounting. For more information, see Slurm accounting in AWS PCS.
For more information about Slurm 24.11, see the following publications:
Slurm 24.05
Changes implemented in AWS PCS
- The new Slurm Step Manager module is enabled by default in AWS PCS. The module offloads step management from the central controller to compute nodes, which substantially improves concurrency in environments with heavy step usage. To support this configuration and better isolate Prolog and Epilog process execution, the prolog flags Contain and Alloc are enabled.
- Hierarchical communication from the controller to compute nodes is enabled to optimize Slurm inter-node communication, which improves scalability and performance. In addition, the routing configuration now uses partition node lists for communications from the controller, instead of the plugin's default routing algorithm, which improves system resiliency.
- The new hash plugin hash/sha3 (HashPlugin=hash/sha3) replaces the previous hash/k12 plugin and is enabled by default in AWS PCS clusters.
- Slurm controller logs now include enhanced auditing for all inbound remote procedure calls (RPCs) to slurmctld. The logs record the source address, authenticated user, and RPC type of each call before the connection is processed.
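One way to confirm these defaults after an upgrade is to inspect the cluster's effective Slurm configuration. The following is a minimal sketch in Python, assuming you have captured configuration text in `Key = Value` form (for example, from `scontrol show config`). The function names and the sample text are hypothetical and only illustrate the check for the PrologFlags and HashPlugin values named above.

```python
def parse_slurm_config(text):
    """Parse 'Key = Value' lines into a dict. Lines without '=' are ignored."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip():
            settings[key.strip()] = value.strip()
    return settings


def check_24_05_defaults(settings):
    """Return a list of problems; an empty list means both defaults are present."""
    problems = []

    # PrologFlags is a comma-separated list; compare case-insensitively.
    flags = {
        flag.strip().lower()
        for flag in settings.get("PrologFlags", "").split(",")
        if flag.strip()
    }
    for required in ("Contain", "Alloc"):
        if required.lower() not in flags:
            problems.append("PrologFlags is missing " + required)

    if settings.get("HashPlugin") != "hash/sha3":
        problems.append("HashPlugin is not hash/sha3")

    return problems


# Illustrative sample only; this is not captured output from a real cluster.
SAMPLE_CONFIG = """
HashPlugin              = hash/sha3
PrologFlags             = Alloc,Contain
"""

if __name__ == "__main__":
    issues = check_24_05_defaults(parse_slurm_config(SAMPLE_CONFIG))
    print("OK" if not issues else "\n".join(issues))
```

The check reports missing values instead of raising an error, so you can run it against several clusters and collect the results.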
For more information about Slurm 24.05, see the following publications:
Slurm settings you can change in AWS PCS
- SuspendTime defaults to 60 seconds. Use the AWS PCS scaleDownIdleTimeInSeconds configuration parameter to set it. For more information, see the scaleDownIdleTimeInSeconds parameter of the ClusterSlurmConfiguration data type in the AWS PCS API Reference.
- MaxJobCount and MaxArraySize are based on the size you choose for the cluster. For more information, see the size parameter of the CreateCluster API action in the AWS PCS API Reference.
- The SelectTypeParameters Slurm setting defaults to CR_CPU. To set it, provide it as a value for slurmCustomSettings when you create a cluster. For more information, see the slurmCustomSettings parameter of the CreateCluster API action and SlurmCustomSetting in the AWS PCS API Reference.
- You can set Prolog and Epilog at the cluster level. To set them, provide them as values for slurmCustomSettings when you create a cluster. For more information, see CreateCluster and SlurmCustomSetting in the AWS PCS API Reference.
- You can set Weight and RealMemory at the compute node group level. To set them, provide them as values for slurmCustomSettings when you create a compute node group. For more information, see CreateComputeNodeGroup and SlurmCustomSetting in the AWS PCS API Reference.
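The settings above can be assembled into request data before you call the API. The following is a minimal sketch in Python that builds the Slurm configuration portion of a CreateCluster or CreateComputeNodeGroup request. The helper name `build_slurm_configuration`, the `parameterName` and `parameterValue` field names, and all example values (including the script path) are assumptions for illustration. Verify the exact request shape against the AWS PCS API Reference before you use it.

```python
def build_slurm_configuration(scale_down_idle_seconds=None, custom_settings=None):
    """Build a Slurm configuration dict for an AWS PCS create request.

    scale_down_idle_seconds: cluster-level idle time, which sets SuspendTime.
    custom_settings: mapping of Slurm setting name to value.
    The field names used here are assumptions; check the API Reference.
    """
    config = {}
    if scale_down_idle_seconds is not None:
        config["scaleDownIdleTimeInSeconds"] = scale_down_idle_seconds
    if custom_settings:
        # Each custom setting becomes one name/value entry, with the value as a string.
        config["slurmCustomSettings"] = [
            {"parameterName": name, "parameterValue": str(value)}
            for name, value in custom_settings.items()
        ]
    return config


# Cluster-level settings (example values; the Prolog path is hypothetical).
cluster_config = build_slurm_configuration(
    scale_down_idle_seconds=120,
    custom_settings={
        "SelectTypeParameters": "CR_CPU_Memory",
        "Prolog": "/path/to/prolog.sh",
    },
)

# Compute node group-level settings (example values).
node_group_config = build_slurm_configuration(
    custom_settings={"Weight": 10, "RealMemory": 64000},
)

if __name__ == "__main__":
    print(cluster_config)
    print(node_group_config)
```

Building the configuration as plain data keeps it easy to review or store in version control before you pass it to an SDK or the AWS CLI.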