Target tracking
With target tracking, you can adjust endpoint provisioning to fit your capacity needs based on usage. The number of inference units automatically adjust so that the utilized capacity is within a target percentage of the provisioned capacity. You can use target tracking to accommodate temporary surges of use for your document classification endpoints and entity recognizer endpoints. For more information, see Target tracking scaling policies for Application Auto Scaling.
Note
The following examples are formatted for Unix, Linux, and macOS. For Windows, replace the backslash (\) Unix continuation character at the end of each line with a caret (^).
Setting up target tracking
To set up target tracking for an endpoint, you use AWS CLI commands to register a scalable target and then create a scaling policy. The scalable target defines inference units as the resource used to adjust endpoint provisioning, and the scaling policy defines the metrics that control the auto scaling of the provisioned capacity.
To set up target tracking
-
Register a scalable target. The following examples register a scalable target to adjust endpoint provisioning with a minimum capacity of 1 inference unit and a maximum capacity of 2 inference units.
For a document classification endpoint, use the following AWS CLI command:
aws application-autoscaling register-scalable-target \ --service-namespace comprehend \ --resource-id arn:aws:comprehend:
region
:account-id
:document-classifier-endpoint/name
\ --scalable-dimension comprehend:document-classifier-endpoint:DesiredInferenceUnits \ --min-capacity 1 \ --max-capacity 2For an entity recognizer endpoint, use the following AWS CLI command:
aws application-autoscaling register-scalable-target \ --service-namespace comprehend \ --resource-id arn:aws:comprehend:
region
:account-id
:entity-recognizer-endpoint/name
\ --scalable-dimension comprehend:entity-recognizer-endpoint:DesiredInferenceUnits \ --min-capacity 1 \ --max-capacity 2 -
To verify the registration of the scalable target, use the following AWS CLI command:
aws application-autoscaling describe-scalable-targets \ --service-namespace comprehend \ --resource-id
endpoint ARN
-
Create a target tracking configuration for the scaling policy and save the configuration in a file called
config.json
. The following is an example of a target tracking configuration that automatically adjusts the number of inference units so that utilized capacity is always 70% of the provisioned capacity.{ "TargetValue": 70, "PredefinedMetricSpecification": { "PredefinedMetricType": "ComprehendInferenceUtilization" } }
-
Create a scaling policy. The following examples create a scaling policy based on the target tracking configuration defined in the
config.json
file.For a document classification endpoint, use the following AWS CLI command:
aws application-autoscaling put-scaling-policy \ --service-namespace comprehend \ --resource-id arn:aws:comprehend:
region
:account-id
:document-classifier-endpoint/name
\ --scalable-dimension comprehend:document-classifier-endpoint:DesiredInferenceUnits \ --policy-nameTestPolicy
\ --policy-type TargetTrackingScaling \ --target-tracking-scaling-policy-configuration file://config.jsonFor an entity recognizer endpoint, use the following AWS CLI command:
aws application-autoscaling put-scaling-policy \ --service-namespace comprehend \ --resource-id arn:aws:comprehend:
region
:account-id
:entity-recognizer-endpoint/name
\ --scalable-dimension comprehend:entity-recognizer-endpoint:DesiredInferenceUnits \ --policy-nameTestPolicy
\ --policy-type TargetTrackingScaling \ --target-tracking-scaling-policy-configuration file://config.json
Removing target tracking
To remove target tracking for an endpoint, you use AWS CLI commands to delete the scaling policy and then deregister the scalable target.
To remove target tracking
-
Delete the scaling policy. The following examples delete a specified scaling policy.
For a document classification endpoint, use the following AWS CLI command:
aws application-autoscaling delete-scaling-policy \ --service-namespace comprehend \ --resource-id arn:aws:comprehend:
region
:account-id
:document-classifier-endpoint/name
\ --scalable-dimension comprehend:document-classifier-endpoint:DesiredInferenceUnits \ --policy-nameTestPolicy
\For an entity recognizer endpoint, use the following AWS CLI command:
aws application-autoscaling delete-scaling-policy \ --service-namespace comprehend \ --resource-id arn:aws:comprehend:
region
:account-id
:entity-recognizer-endpoint/name
\ --scalable-dimension comprehend:entity-recognizer-endpoint:DesiredInferenceUnits \ --policy-nameTestPolicy
-
Deregister the scalable target. The following examples deregister a specified scalable target.
For a document classification endpoint, use the following AWS CLI command:
aws application-autoscaling deregister-scalable-target \ --service-namespace comprehend \ --resource-id arn:aws:comprehend:
region
:account-id
:document-classifier-endpoint/name
\ --scalable-dimension comprehend:document-classifier-endpoint:DesiredInferenceUnitsFor an entity recognizer endpoint, use the following AWS CLI command:
aws application-autoscaling deregister-scalable-target \ --service-namespace comprehend \ --resource-id arn:aws:comprehend:
region
:account-id
:entity-recognizer-endpoint/name
\ --scalable-dimension comprehend:entity-recognizer-endpoint:DesiredInferenceUnits