Domain 3: Modeling (36% of the exam content)

This domain accounts for 36% of the exam content.

Task 3.1: Frame business problems as ML problems

Determine when to use and when not to use ML.
Know the difference between supervised and unsupervised learning.
Select from among classification, regression, forecasting, clustering, recommendation, and foundation models.

Task 3.2: Select the appropriate model(s) for a given ML problem

XGBoost, logistic regression, k-means, linear regression, decision trees, random forests, RNN, CNN, ensemble, transfer learning, and large language models (LLMs)
Express the intuition behind models.

Task 3.3: Train ML models

Split data between training and validation (for example, cross-validation).
Understand optimization techniques for ML training (for example, gradient descent, loss functions, convergence).
Choose appropriate compute resources (for example, GPU or CPU, distributed or non-distributed).
- Choose appropriate compute platforms (Spark or non-Spark).
Update and retrain models.
- Batch or real-time/online
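
The data-splitting and optimization items above can be illustrated with a small, dependency-free sketch. It holds out a validation set, then fits a one-feature linear model by batch gradient descent on a mean squared error loss, stopping once the loss stops improving. The function names, the synthetic data (y = 3x + 1), the learning rate, and the tolerance are all assumptions chosen for illustration, not values taken from the exam guide.

```python
import random

def train_val_split(rows, val_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for validation."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]

def mse(rows, w, b):
    """Mean squared error loss of the model y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in rows) / len(rows)

def fit_linear(rows, lr=0.05, max_epochs=5000, tol=1e-12):
    """Batch gradient descent; stops when the loss change falls below tol."""
    w, b = 0.0, 0.0
    prev_loss = mse(rows, w, b)
    n = len(rows)
    for _ in range(max_epochs):
        # Gradients of the MSE loss with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in rows) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in rows) / n
        w -= lr * grad_w
        b -= lr * grad_b
        loss = mse(rows, w, b)
        if abs(prev_loss - loss) < tol:  # simple convergence check
            break
        prev_loss = loss
    return w, b

# Hypothetical noise-free data drawn from y = 3x + 1.
data = [(k / 10, 3.0 * (k / 10) + 1.0) for k in range(50)]
train, val = train_val_split(data)
w, b = fit_linear(train)
val_loss = mse(val, w, b)
```

Evaluating the loss on `val` rather than `train` is what makes the held-out split useful: a model that only memorizes the training rows would show a low training loss but a high validation loss.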

Task 3.4: Perform hyperparameter optimization

Perform regularization.
- Dropout
- L1/L2
Perform cross-validation.
Initialize models.
Understand neural network architecture (layers and nodes), learning rate, and activation functions.
Understand tree-based models (number of trees, number of levels).
Understand linear models (learning rate).
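
As a rough illustration of how regularization and cross-validation fit together, the sketch below uses k-fold cross-validation to compare several L2 penalty strengths for a one-feature ridge regression without an intercept. The closed-form weight w = sum(x*y) / (sum(x*x) + lam) applies only to this simplified case. The data, the candidate penalties, and all function names are assumptions made for the example.

```python
import random

def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds over range(n).

    Assumes the rows are already in random order; shuffle first otherwise.
    """
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, val_idx
        start += size

def ridge_fit(xs, ys, lam):
    """Closed-form L2-regularized weight for y ~ w * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def cv_error(xs, ys, lam, k=5):
    """Average validation MSE across the k folds for one penalty value."""
    fold_errors = []
    for train_idx, val_idx in k_fold_indices(len(xs), k):
        w = ridge_fit([xs[i] for i in train_idx], [ys[i] for i in train_idx], lam)
        fold_mse = sum((w * xs[i] - ys[i]) ** 2 for i in val_idx) / len(val_idx)
        fold_errors.append(fold_mse)
    return sum(fold_errors) / len(fold_errors)

# Hypothetical noisy data around y = 2x.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(30)]
ys = [2.0 * x + rng.gauss(0.0, 0.3) for x in xs]

candidates = [0.0, 0.1, 1.0, 10.0, 100.0]
scores = {lam: cv_error(xs, ys, lam) for lam in candidates}
best_lam = min(scores, key=scores.get)
```

A larger penalty shrinks the weight toward zero, trading variance for bias; cross-validation is one way to pick the penalty without touching a final test set.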

Task 3.5: Evaluate ML models

Avoid overfitting or underfitting.
- Detect and handle bias and variance.
Evaluate metrics (for example, area under the receiver operating characteristic curve [AUC-ROC], accuracy, precision, recall, root mean square error [RMSE], F1 score).
Interpret confusion matrices.
Perform offline and online model evaluation (A/B testing).
Compare models by using metrics (for example, time to train a model, quality of model, engineering costs).
Perform cross-validation.
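
Several of the metrics listed above follow directly from the four cells of a binary confusion matrix. The sketch below computes accuracy, precision, recall, and F1 from those counts, plus RMSE for a regression example. The labels, predictions, and function names are made up for illustration, and the positive class is assumed to be encoded as 1.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 derived from the confusion matrix."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def rmse(y_true, y_pred):
    """Root mean square error for regression predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical labels: 4 positives and 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)   # (3, 1, 1, 5)
metrics = classification_metrics(y_true, y_pred)
error = rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```

In this example one positive is missed and one negative is flagged, so precision and recall are both 3/4 while accuracy is 8/10. On imbalanced data, accuracy alone can look good even when recall on the minority class is poor.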
