MLSEC-11: Protect against adversarial and malicious activities
Add protection inside and outside the deployed code to detect malicious inputs that might result in incorrect predictions. Automatically detect unauthorized changes by examining the inputs in detail. Repair and validate inputs before they are added back to the data pool.
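For example, a lightweight validation layer can sit in front of both the inference path and the data pool. The following is a minimal sketch, assuming a tabular model whose features have known valid ranges; the feature names, bounds, and repair strategy are illustrative assumptions, not a prescribed implementation.

```python
import numpy as np

# Minimal input-validation sketch: reject or repair out-of-range feature
# values before they reach the model or re-enter the training data pool.
# FEATURE_BOUNDS and the feature names are illustrative assumptions.
FEATURE_BOUNDS = {
    "age": (0.0, 120.0),
    "transaction_amount": (0.0, 50_000.0),
}

def validate_and_repair(record: dict) -> tuple[dict, list[str]]:
    """Clamp out-of-range values and report which features were repaired."""
    repaired = dict(record)
    issues = []
    for feature, (low, high) in FEATURE_BOUNDS.items():
        value = record.get(feature)
        if value is None or not np.isfinite(value):
            issues.append(f"{feature}: missing or non-numeric")
            repaired[feature] = low  # conservative default
        elif not (low <= value <= high):
            issues.append(f"{feature}: {value} outside [{low}, {high}]")
            repaired[feature] = float(np.clip(value, low, high))
    return repaired, issues

record, issues = validate_and_repair({"age": -5, "transaction_amount": 1e9})
if issues:
    print("Flagged for review before re-entering the data pool:", issues)
```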
Implementation plan
- Evaluate the robustness of the algorithm - Evaluate your use case and determine what a bad prediction or classification would look like. Use sensitivity analysis to evaluate the robustness of the algorithm against increasingly perturbed inputs and understand its susceptibility to manipulated inputs (a sensitivity-analysis sketch follows this list).
- Build for robustness from the start - Select diverse features to improve the algorithm’s ability to handle outliers. Consider using models in an ensemble for increased diversity in decisions and for robustness around decision points (an ensemble sketch follows this list).
- Identify repeats - Detect similar, repeated inputs to the model, which can indicate probing of the decision boundaries. Use Amazon SageMaker Model Monitor to run a SageMaker processing job on a periodic schedule that analyzes the captured inference data (a repeat-detection sketch follows this list). Probing can take the form of model brute forcing, where a threat actor iterates over only a limited set of variables to determine what influences decision points and to derive feature importance.
- Lineage tracking - If you retrain on untrusted or unvalidated inputs, make sure any model skew can be traced back to the contributing data, and prune that data before retraining a replacement model (a lineage-pruning sketch follows this list).
- Use secure inference API endpoints - Host the model so that consumers can perform inference against it securely. Permitting consumers only through the API lets you define the relationship, restrict access to the base model, and monitor model interactions (an endpoint-access sketch follows this list).
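Sensitivity-analysis sketch (first item above). This is a minimal illustration rather than a definitive method: it assumes a scikit-learn style classifier and uses a synthetic dataset as a stand-in for your validation data, then measures how often predictions flip as Gaussian noise of increasing scale is added to the inputs.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Stand-ins for your own validation data and trained estimator.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

baseline = model.predict(X)
rng = np.random.default_rng(0)
for noise_scale in (0.01, 0.05, 0.1, 0.5, 1.0):
    # Perturb every feature with Gaussian noise of increasing scale.
    perturbed = X + rng.normal(scale=noise_scale, size=X.shape)
    flipped = np.mean(model.predict(perturbed) != baseline)
    print(f"noise={noise_scale:<4} fraction of predictions flipped: {flipped:.2%}")
```

A model whose predictions flip at very small noise scales is more susceptible to manipulated inputs near its decision boundaries.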
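Ensemble sketch (second item). A minimal illustration of combining structurally different estimators so that a single manipulated input is less likely to push the combined decision across a boundary; the estimators and synthetic data are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Structurally diverse estimators make it harder for one crafted input
# to move every member of the ensemble the same way.
ensemble = VotingClassifier(
    estimators=[
        ("logreg", LogisticRegression(max_iter=1000)),
        ("forest", RandomForestClassifier(random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
    ],
    voting="soft",  # average predicted probabilities across the members
)
ensemble.fit(X, y)
print("ensemble accuracy on training data:", ensemble.score(X, y))
```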
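Repeat-detection sketch (third item). It assumes inference requests have been captured as JSON Lines records with a "features" array, for example via endpoint data capture, and that the analysis runs on a periodic schedule such as a SageMaker processing job; the record layout, threshold, and rounding granularity are all assumptions.

```python
import json
from collections import Counter

import numpy as np

REPEAT_THRESHOLD = 50     # how many near-identical requests warrant a flag
ROUNDING_DECIMALS = 2     # coarseness used to group "similar" inputs

def scan_capture_file(path: str) -> dict[tuple, int]:
    """Count near-identical feature vectors in a captured-inference file."""
    counts: Counter = Counter()
    with open(path) as f:
        for line in f:
            record = json.loads(line)            # assumed: {"features": [...]}
            features = np.asarray(record["features"], dtype=float)
            key = tuple(np.round(features, ROUNDING_DECIMALS))
            counts[key] += 1
    return {key: n for key, n in counts.items() if n >= REPEAT_THRESHOLD}

suspicious = scan_capture_file("captured_inference_data.jsonl")
for key, n in suspicious.items():
    print(f"{n} near-identical requests around {key} -- possible brute forcing")
```

Many near-identical requests that vary only a few variables are consistent with an attempt to map decision points and derive feature importance.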
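Lineage-pruning sketch (fourth item). It assumes every training record carries lineage metadata, here represented by hypothetical source_batch and validated columns, so that records from suspect sources can be dropped before a replacement model is retrained.

```python
import pandas as pd

# Each record is assumed to carry lineage metadata that traces it back to
# its source; the file and column names here are illustrative.
training_data = pd.read_csv("training_data_with_lineage.csv")

# Batches implicated (for example, by drift or skew analysis) as suspect.
suspect_batches = {"inference-feedback-2024-06-03", "partner-upload-17"}

clean = training_data[
    ~training_data["source_batch"].isin(suspect_batches)
    & training_data["validated"]
]
print(f"kept {len(clean)} of {len(training_data)} records for retraining")
clean.to_csv("training_data_pruned.csv", index=False)
```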
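Endpoint-access sketch (fifth item). It scopes a consumer's IAM permissions to sagemaker:InvokeEndpoint on a single endpoint and then calls that endpoint through the SageMaker runtime API; the account ID, region, endpoint name, and payload are placeholders.

```python
import json

import boto3

# Allow a consumer to invoke one endpoint only; they cannot read, copy, or
# modify the underlying model artifacts. ARN values are placeholders.
consumer_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sagemaker:InvokeEndpoint",
            "Resource": "arn:aws:sagemaker:us-east-1:111122223333:endpoint/my-endpoint",
        }
    ],
}
iam = boto3.client("iam")
iam.create_policy(
    PolicyName="InvokeMyEndpointOnly",
    PolicyDocument=json.dumps(consumer_policy),
)

# A consumer with that policy calls the endpoint through the runtime API.
# Requests are signed and authorized by IAM, and interactions can be
# monitored through CloudWatch endpoint metrics and data capture.
runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName="my-endpoint",
    ContentType="text/csv",
    Body="34.2,0.8,1.0",
)
print(response["Body"].read())
```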