Domain 3: Modeling (36% of the exam content) - AWS Certification

Domain 3: Modeling (36% of the exam content)

This domain accounts for 36% of the exam content.

Task 3.1: Frame business problems as ML problems

  • Determine when to use and when not to use ML.

  • Know the difference between supervised and unsupervised learning.

  • Select from among classification, regression, forecasting, clustering, recommendation, and foundation models.

Task 3.2: Select the appropriate model(s) for a given ML problem

  • XGBoost, logistic regression, k-means, linear regression, decision trees, random forests, RNN, CNN, ensemble, transfer learning, and large language models (LLMs)

  • Express the intuition behind models.

Task 3.3: Train ML models

  • Split data between training and validation (for example, cross validation).

  • Understand optimization techniques for ML training (for example, gradient descent, loss functions, convergence).

  • Choose appropriate compute resources (for example GPU or CPU, distributed or non-distributed).

    • Choose appropriate compute platforms (Spark or non-Spark).

  • Update and retrain models.

    • Batch or real-time/online

Task 3.4: Perform hyperparameter optimization

  • Perform regularization.

    • Dropout

    • L1/L2

  • Perform cross-validation.

  • Initialize models.

  • Understand neural network architecture (layers and nodes), learning rate, and activation functions.

  • Understand tree-based models (number of trees, number of levels).

  • Understand linear models (learning rate).

Task 3.5: Evaluate ML models

  • Avoid overfitting or underfitting.

    • Detect and handle bias and variance.

  • Evaluate metrics (for example, area under curve [AUC]-receiver operating characteristics [ROC], accuracy, precision, recall, Root Mean Square Error [RMSE], F1 score).

  • Interpret confusion matrices.

  • Perform offline and online model evaluation (A/B testing).

  • Compare models by using metrics (for example, time to train a model, quality of model, engineering costs).

  • Perform cross-validation.