InferenceMetrics - Amazon SageMaker

InferenceMetrics

The metrics for an existing endpoint compared in an Inference Recommender job.

Contents

MaxInvocations

The expected maximum number of requests per minute for the instance.

Type: Integer

Required: Yes

ModelLatency

The expected model latency for the instance when it receives the maximum number of invocations per minute.

Type: Integer

Required: Yes
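The two fields above can be sketched as a small client-side type. This is a minimal illustration, not part of any AWS SDK: the `InferenceMetrics` dataclass, its `from_dict` helper, and the sample values are all hypothetical, and the input is assumed to be a plain dictionary that uses the field names documented on this page.

```python
from dataclasses import dataclass
from typing import Any, Mapping

# Field names as documented above; both are required integers.
_REQUIRED_FIELDS = ("MaxInvocations", "ModelLatency")


@dataclass(frozen=True)
class InferenceMetrics:
    """Hypothetical local mirror of the InferenceMetrics structure."""

    max_invocations: int  # expected maximum requests per minute
    model_latency: int    # expected latency at that request rate

    @classmethod
    def from_dict(cls, data: Mapping[str, Any]) -> "InferenceMetrics":
        # Both fields are marked "Required: Yes", so reject missing keys.
        missing = [key for key in _REQUIRED_FIELDS if key not in data]
        if missing:
            raise ValueError(f"missing required field(s): {', '.join(missing)}")

        for key in _REQUIRED_FIELDS:
            value = data[key]
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                raise TypeError(f"{key} must be an integer, got {type(value).__name__}")

        return cls(
            max_invocations=data["MaxInvocations"],
            model_latency=data["ModelLatency"],
        )


# Made-up sample values, for illustration only.
sample = {"MaxInvocations": 1200, "ModelLatency": 45}
metrics = InferenceMetrics.from_dict(sample)
print(metrics.max_invocations, metrics.model_latency)
```

Validating on construction keeps a missing or non-integer field from propagating silently into later comparisons between endpoints.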
