MLREL-08: Ensure model validation with relevant data - Machine Learning Lens

Put processes in place to include real, representative data in testing and validation. Data that does not cover the patterns and scenarios the model will encounter leads to failures once the model is in production. Check for distribution mismatch among the training, validation, and test data, and between those datasets and the inference data.
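One way to check for distribution mismatch on a single numeric feature is the Population Stability Index (PSI), which compares the binned distribution of a reference sample (for example, training data) with that of another sample (for example, captured inference data). The sketch below is a minimal illustration using only the Python standard library. The function name `population_stability_index`, the bin count, and the synthetic data are assumptions made for this example and are not part of any AWS service.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """Return the PSI between a reference sample and a comparison sample.

    Bins are equal-width over the range of the reference sample; values in
    the comparison sample that fall outside that range are clipped into the
    first or last bin.
    """
    lo, hi = min(expected), max(expected)
    # Fall back to a width of 1.0 if the reference sample is constant.
    width = (hi - lo) / bins or 1.0

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical data: a feature seen in training, and the same feature
# shifted upward as it might appear in inference traffic.
train_feature = [i / 100 for i in range(1000)]
inference_feature = [v + 5 for v in train_feature]

print(population_stability_index(train_feature, train_feature))
print(population_stability_index(train_feature, inference_feature))
```

Identical samples give a PSI of zero, and larger values indicate a larger shift. Any threshold for "too much drift" is a judgment call that depends on the feature and the model, so treat it as a tunable assumption rather than a fixed rule.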

Implementation plan

  • Use Amazon SageMaker Experiments - Your models should be tested and validated using data that is representative of what they will encounter in production. This data can include both real-world data and engineered data. You should account for all scenarios in your training data so that you can avoid errors when your model is deployed to production. Use Amazon SageMaker Experiments to organize, track, compare, and evaluate your machine learning experiments.

  • Use Amazon SageMaker Model Monitor - Consider implementing a plan to periodically test endpoints for deviations in model quality. Early detection of deviations can help you determine when to take corrective actions. SageMaker Model Monitor continually monitors the quality of Amazon SageMaker ML models in production. With Model Monitor, you can set alerts that notify you when there are deviations in the model quality.
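The idea behind periodically testing an endpoint for deviations in model quality can be sketched as a comparison between a baseline metric and the same metric computed on recently labeled production data. The example below is plain Python for illustration only. It does not use the SageMaker Model Monitor API, and the names `accuracy`, `quality_deviates`, and the `tolerance` value are assumptions made for this sketch.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if not labels or len(labels) != len(predictions):
        raise ValueError("labels and predictions must be non-empty and equal in length")
    return sum(l == p for l, p in zip(labels, predictions)) / len(labels)


def quality_deviates(
    baseline_accuracy: float,
    labels: Sequence[int],
    predictions: Sequence[int],
    tolerance: float = 0.05,
) -> bool:
    """Return True when accuracy on recent data drops more than
    `tolerance` below the baseline measured at validation time."""
    return baseline_accuracy - accuracy(labels, predictions) > tolerance


# Hypothetical recent ground-truth labels and endpoint predictions.
recent_labels = [1, 0, 1, 1]
recent_predictions = [1, 0, 0, 1]

if quality_deviates(0.90, recent_labels, recent_predictions):
    print("Model quality deviation detected; consider corrective action.")
```

In a managed setup, a check of this kind would typically run on a schedule and raise an alert rather than print a message. The tolerance should reflect how much degradation the application can accept.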
