InferenceComponentComputeResourceRequirements
Defines the compute resources to allocate to run a model, plus any adapter models, that you assign to an inference component. These resources include CPU cores, accelerators, and memory.
Contents
- MinMemoryRequiredInMb
The minimum amount of memory, in MB, to allocate to run a model that you assign to an inference component.
Type: Integer
Valid Range: Minimum value of 128.
Required: Yes
- MaxMemoryRequiredInMb
The maximum amount of memory, in MB, to allocate to run a model that you assign to an inference component.
Type: Integer
Valid Range: Minimum value of 128.
Required: No
- NumberOfAcceleratorDevicesRequired
The number of accelerators to allocate to run a model that you assign to an inference component. Accelerators include GPUs and AWS Inferentia devices.
Type: Float
Valid Range: Minimum value of 1.
Required: No
- NumberOfCpuCoresRequired
The number of CPU cores to allocate to run a model that you assign to an inference component.
Type: Float
Valid Range: Minimum value of 0.25.
Required: No
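Example
The following is a minimal Python sketch of how the fields above might be assembled into a request fragment. The helper name build_compute_requirements and its parameter names are hypothetical and are not part of any AWS SDK; the checks mirror only the types, ranges, and required flags listed in this reference. The resulting dictionary is intended to have the shape of this data type, for example when supplied as the ComputeResourceRequirements value in a CreateInferenceComponent request.

```python
from typing import Optional


def build_compute_requirements(
    min_memory_mb: int,
    max_memory_mb: Optional[int] = None,
    accelerators: Optional[float] = None,
    cpu_cores: Optional[float] = None,
) -> dict:
    """Hypothetical helper: build a dict shaped like
    InferenceComponentComputeResourceRequirements, enforcing the
    minimum values documented above."""
    # MinMemoryRequiredInMb is the only required field (minimum 128).
    if min_memory_mb < 128:
        raise ValueError("MinMemoryRequiredInMb must be at least 128")
    requirements = {"MinMemoryRequiredInMb": min_memory_mb}

    # Optional fields are included only when the caller supplies them.
    if max_memory_mb is not None:
        if max_memory_mb < 128:
            raise ValueError("MaxMemoryRequiredInMb must be at least 128")
        requirements["MaxMemoryRequiredInMb"] = max_memory_mb

    if accelerators is not None:
        if accelerators < 1:
            raise ValueError("NumberOfAcceleratorDevicesRequired must be at least 1")
        requirements["NumberOfAcceleratorDevicesRequired"] = accelerators

    if cpu_cores is not None:
        if cpu_cores < 0.25:
            raise ValueError("NumberOfCpuCoresRequired must be at least 0.25")
        requirements["NumberOfCpuCoresRequired"] = cpu_cores

    return requirements


# Example values are illustrative only.
requirements = build_compute_requirements(
    1024, max_memory_mb=4096, accelerators=1, cpu_cores=2
)
```

Validating on the client side in this way is a design choice, not a requirement: the service applies its own validation, so a helper like this only surfaces range errors earlier.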
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: