InferenceComponentComputeResourceRequirements

Defines the compute resources to allocate to run a model, plus any adapter models, that you assign to an inference component. These resources include CPU cores, accelerators, and memory.

MinMemoryRequiredInMb

The minimum amount of memory, in MB, to allocate to run a model that you assign to an inference component.

Type: Integer

Valid Range: Minimum value of 128.

Required: Yes

MaxMemoryRequiredInMb

The maximum amount of memory, in MB, to allocate to run a model that you assign to an inference component.

Type: Integer

Valid Range: Minimum value of 128.

Required: No

NumberOfAcceleratorDevicesRequired

The number of accelerators to allocate to run a model that you assign to an inference component. Accelerators include GPUs and AWS Inferentia devices.

Type: Float

Valid Range: Minimum value of 1.

Required: No

NumberOfCpuCoresRequired

The number of CPU cores to allocate to run a model that you assign to an inference component.

Type: Float

Valid Range: Minimum value of 0.25.

Required: No
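
Example

The following is a minimal sketch, not an official sample, of how the fields above might be assembled and checked against their documented valid ranges before being sent in a request. The helper name `build_compute_requirements` and its parameter names are hypothetical; only the dictionary keys and the minimum values come from this page.

```python
from typing import Optional


def build_compute_requirements(
    min_memory_mb: int,
    max_memory_mb: Optional[int] = None,
    accelerators: Optional[float] = None,
    cpu_cores: Optional[float] = None,
) -> dict:
    """Return a dict shaped like InferenceComponentComputeResourceRequirements.

    Hypothetical helper: it only enforces the minimums listed on this page.
    """
    # MinMemoryRequiredInMb is the only required field (minimum 128).
    if min_memory_mb < 128:
        raise ValueError("MinMemoryRequiredInMb must be at least 128")
    requirements = {"MinMemoryRequiredInMb": min_memory_mb}

    # Optional fields are included only when the caller supplies them.
    if max_memory_mb is not None:
        if max_memory_mb < 128:
            raise ValueError("MaxMemoryRequiredInMb must be at least 128")
        requirements["MaxMemoryRequiredInMb"] = max_memory_mb

    if accelerators is not None:
        if accelerators < 1:
            raise ValueError("NumberOfAcceleratorDevicesRequired must be at least 1")
        requirements["NumberOfAcceleratorDevicesRequired"] = float(accelerators)

    if cpu_cores is not None:
        if cpu_cores < 0.25:
            raise ValueError("NumberOfCpuCoresRequired must be at least 0.25")
        requirements["NumberOfCpuCoresRequired"] = float(cpu_cores)

    return requirements


# Example: 1 GiB minimum memory, 2 GiB maximum, one accelerator, two CPU cores.
example = build_compute_requirements(1024, max_memory_mb=2048, accelerators=1, cpu_cores=2)
```

The resulting dictionary is intended to be passed wherever a request expects this data type. Validating locally is a design convenience only; the service remains the authority on which values it accepts.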
