Template option 1: Single account deployment
The MLOps Workload Orchestrator solution's Amazon API Gateway has two main API endpoints: /provisionpipeline, used to provision a pipeline, and /pipelinestatus, used to get the status of a provisioned pipeline.
- /provisionpipeline
  - Method: POST
  - Body:
    - pipeline_type: Type of the pipeline to provision. The solution currently supports byom_realtime_builtin (real-time inference with Amazon SageMaker built-in algorithms pipeline), model_training_builtin (model training using an Amazon SageMaker training job pipeline), model_tuner_builtin (Amazon SageMaker hyperparameter tuning pipeline), model_autopilot_training (Amazon SageMaker Autopilot pipeline), byom_realtime_custom (real-time inference with custom algorithms pipeline), byom_batch_builtin (batch transform with built-in algorithms pipeline), byom_batch_custom (batch transform with custom algorithms pipeline), byom_data_quality_monitor (data quality monitor pipeline), byom_model_quality_monitor (model quality monitor pipeline), byom_model_bias_monitor (model bias monitor pipeline), byom_model_explainability_monitor (model explainability monitor pipeline), byom_image_builder (custom algorithm Docker image builder pipeline), and the model card operations (create_model_card, describe_model_card, update_model_card, delete_model_card, list_model_cards, and export_model_cards).
    - custom_algorithm_docker: Path to a zip file inside the S3 assets bucket, containing the necessary files (for example, Dockerfile, assets, etc.) to create a Docker image that can be used by Amazon SageMaker to deploy a model trained using the custom algorithm. For more information, refer to the Example Notebooks: Use Your Own Algorithm or Model in the Amazon SageMaker Developer Guide.
    - custom_image_uri: URI of a custom algorithm image in an Amazon ECR repository.
    - ecr_repo_name: Name of an Amazon ECR repository where the custom algorithm image, created by the byom_image_builder pipeline, will be stored.
    - image_tag: Custom algorithm's image tag to assign to the image created using the byom_image_builder pipeline.
    - model_framework: Name of the built-in algorithm used to train the model.
    - model_framework_version: Version number of the built-in algorithm used to train the model.
    - model_name: Arbitrary model name for the deploying model. The solution uses this parameter to create an Amazon SageMaker model, endpoint configuration, and endpoint with extensions on the model name, such as <model_name>-endpoint-config and <model_name>-endpoint. The model_name is also used in the name of the deployed AWS CloudFormation stack for all pipelines.
    - model_artifact_location: Path to a file in the S3 assets bucket containing the model artifact file (the output file after training a model).
    - model_package_name: Amazon SageMaker model package name (for example, "arn:aws:sagemaker:<region>:<account_id>:model-package/<model_package_group_name>/<model_version>").
    - baseline_data: Path to a csv file in the S3 assets bucket containing the data with feature names used for training the model (for data quality, model bias, and model explainability monitors), or model predictions and ground truth labels (for model quality monitor); for example, a csv file with the header "prediction, probability, label" for a BinaryClassification problem.
    - inference_instance: Instance type for inference (real-time or batch). Refer to Amazon SageMaker Pricing for a complete list of machine learning instance types.
    - data_capture_location: Path to a prefix in an S3 bucket (including the bucket's name, for example <bucket-name>/<prefix>) to store the data captured by the real-time Amazon SageMaker inference endpoint.
    - batch_inference_data: Path to a file in an S3 bucket (including the bucket's name, for example <bucket-name>/<path-to-file>) containing the data for batch inference. This parameter is not required if your inference type is set to real-time.
    - batch_job_output_location: Path to a prefix in an S3 bucket (including the bucket's name, for example <bucket-name>/<prefix>) to store the output of the batch transform job. This parameter is not required if your inference type is set to real-time.
    - instance_type: Instance type used by the data baseline and model monitoring jobs.
    - instance_volume_size: Size of the EC2 volume in GB to use for the baseline and monitoring job. The size must be enough to hold your training data and create the data baseline.
    - instance_count: The number of EC2 instances used by the training job.
    - endpoint_name: The name of the deployed Amazon SageMaker endpoint to monitor when deploying data and model quality monitor pipelines. Optionally, provide the endpoint_name when creating a real-time inference pipeline, which will be used to name the created Amazon SageMaker endpoint. If you do not provide endpoint_name, it will be automatically generated.
    - baseline_job_output_location: Path to a prefix in an S3 bucket (including the bucket's name, for example <bucket-name>/<prefix>) to store the output of the data baseline job.
    - monitoring_output_location: Path to a prefix in an S3 bucket (including the bucket's name, for example <bucket-name>/<prefix>) to store the output of the monitoring job.
    - schedule_expression: Cron expression to run the monitoring job. For example, cron(0 * ? * * *) will run the monitoring job hourly, cron(0 0 ? * * *) daily, etc.
    - baseline_max_runtime_seconds: Specifies the maximum time, in seconds, the baseline job is allowed to run. If the attribute is not provided, the job will run until it finishes.
    - monitor_max_runtime_seconds: Specifies the maximum time, in seconds, the monitoring job is allowed to run. For data quality and model explainability monitors, the value can be up to 3300 seconds for an hourly schedule. For model quality and model bias hourly schedules, this can be up to 1800 seconds.
    - kms_key_arn: Optional customer managed AWS Key Management Service (AWS KMS) key to encrypt captured data from the real-time Amazon SageMaker endpoint, output of batch transform and data baseline jobs, output of model monitor, and the Amazon Elastic Compute Cloud (Amazon EC2) instance's volume used by Amazon SageMaker to run the solution's pipelines. This attribute may be included in the API calls of byom_realtime_builtin, byom_realtime_custom, byom_batch_builtin, byom_batch_custom, and byom_<monitor-type>_monitor pipelines.
    - baseline_inference_attribute: Index or JSON path to locate predicted label(s) required for Regression or MulticlassClassification problems. The attribute is used by the model quality baseline. If baseline_probability_attribute and probability_threshold_attribute are provided, baseline_inference_attribute is not required for a BinaryClassification problem.
    - baseline_probability_attribute: Index or JSON path to locate predicted probabilities. The attribute is used by the model quality baseline. If baseline_probability_attribute and probability_threshold_attribute are provided, baseline_inference_attribute is not required for a BinaryClassification problem.
    - baseline_ground_truth_attribute: Index or JSON path to locate actual label(s). Used by the model quality baseline.
    - problem_type: Type of machine learning problem. Valid values are "Regression", "BinaryClassification", or "MulticlassClassification". Used by the model quality, model bias, and model explainability monitoring schedules. It is an optional attribute for the model_autopilot_training pipeline. If not provided, the autopilot job will infer the problem type from the target_attribute. If provided, the job_objective attribute must be provided too.
    - job_objective: (optional) Metric to optimize, used by the model_autopilot_training pipeline. If provided, the problem_type must be provided. Valid values: "Accuracy", "MSE", "F1", "F1macro", "AUC".
    - job_name: (optional) The name of the training job. If not provided, a name will be automatically generated by the solution. Used by all training pipelines. Note: The given name must be unique (no previous jobs created with the same name).
    - training_data: The S3 file key/prefix of the training data in the solution's S3 assets bucket. This attribute is required by all training pipelines. Note: For the model_training_builtin and model_tuner_builtin pipelines, the csv file should not have a header, and the target attribute should be the first column. For the model_autopilot_training pipeline, the file should have a header.
    - validation_data: (optional) The S3 file key/prefix of the validation data in the solution's S3 assets bucket. This attribute is used by the model_training_builtin and model_tuner_builtin pipelines.
    - target_attribute: Target attribute name in the training data. Required by the model_autopilot_training pipeline.
    - compression_type: (optional) Compression type used with the training/validation data. Valid value: "Gzip".
    - content_type: (optional) The MIME type of the training data. Default: "csv".
    - s3_data_type: (optional) Training S3 data type. Valid values: "S3Prefix", "ManifestFile", or "AugmentedManifestFile". Used by the model_training_builtin and model_tuner_builtin pipelines. Default: "S3Prefix".
    - data_distribution: (optional) Data distribution. Valid values: "FullyReplicated" or "ShardedByS3Key". Used by the model_training_builtin and model_tuner_builtin pipelines. Default: "FullyReplicated".
    - data_input_mode: (optional) Training data input mode. Valid values: "File", "Pipe", "FastFile". Used by the model_training_builtin and model_tuner_builtin pipelines. Default: "File".
    - data_record_wrapping: (optional) Training data record wrapping, if any. Valid value: "RecordIO". Used by the model_training_builtin and model_tuner_builtin pipelines.
    - attribute_names: (optional) List of one or more attribute names to use that are found in a specified AugmentedManifestFile (if s3_data_type = "AugmentedManifestFile"). Used by the model_training_builtin and model_tuner_builtin pipelines.
    - job_output_location: S3 prefix in the solution's S3 assets bucket, where the output of the training jobs will be saved.
    - job_max_candidates: (optional) Maximum number of candidates to be tried by the autopilot job. Default: 10.
    - max_runtime_per_job: (optional) Maximum runtime in seconds the training job is allowed to run. Default: 86400.
    - total_max_runtime: (optional) Autopilot total runtime in seconds allowed for the job. Default: 2592000.
    - generate_definition_only: (optional) Generate candidate definitions only by the autopilot job. Used by the model_autopilot_training pipeline. Default: "False".
    - encrypt_inner_traffic: (optional) Encrypt inter-container traffic for the job. Used by training pipelines. Default: "True".
    - use_spot_instances: (optional) Use managed Spot instances with the training job. Used by the model_training_builtin and model_tuner_builtin pipelines. Default: "True".
    - Max_wait_time_spot_instances: (optional) Maximum wait time in seconds for Spot instances (required if use_spot_instances = True). Must be greater than max_runtime_per_job. Default: 172800.
    - algo_hyperparamaters: Amazon SageMaker built-in algorithm hyperparameters provided as a JSON object. Used by the model_training_builtin and model_tuner_builtin pipelines. Example: {"eval_metric": "auc", "objective": "binary:logistic", "num_round": 400, "rate_drop": 0.3}.
    - tuner_configs: sagemaker.tuner.HyperparameterTuner configs (objective_metric_name, metric_definitions, strategy, objective_type, max_jobs, max_parallel_jobs, base_tuning_job_name=None, early_stopping_type) provided as a JSON object. Required by the model_tuner_builtin pipeline. Note: Some configs have default values and are not required to be specified. Example: {"early_stopping_type": "Auto", "objective_metric_name": "validation:auc", "max_jobs": 10, "max_parallel_jobs": 2}.
    - hyperparamaters_ranges: Algorithm hyperparameter ranges used by the hyperparameter tuning job, provided as a JSON object where the key is the hyperparameter name and the value is a list with the first item the type ("continuous"|"integer"|"categorical") and the second item a list of [min_value, max_value] for "continuous"|"integer", or a list of values for "categorical". Required by the model_tuner_builtin pipeline. Example: {"min_child_weight": ["continuous", [0, 120]], "max_depth": ["integer", [1, 15]], "optimizer": ["categorical", ["sgd", "Adam"]]}.
    - monitor_inference_attribute: Index or JSON path to locate predicted label(s). Required for Regression or MulticlassClassification problems, and not required for a BinaryClassification problem. Used by the model quality, model bias, and model explainability monitoring schedules.
    - monitor_probability_attribute: Index or JSON path to locate probabilities. Used only with a BinaryClassification problem. Used by the model quality monitoring schedule.
    - probability_threshold_attribute: Threshold to convert probabilities to binaries. Used by the model quality monitoring schedule, and only with a BinaryClassification problem.
    - monitor_ground_truth_input: Used by the model quality and model bias monitoring schedules to locate the ground truth labels. The solution expects you to use eventId to label the data captured by the Amazon SageMaker endpoint. For more information, refer to the Amazon SageMaker Developer Guide on how to Ingest Ground Truth Labels and Merge Them with Predictions.
    - bias_config: A JSON object representing the attributes of sagemaker.clarify.BiasConfig. Required only for the model bias monitor pipeline.
    - model_predicted_label_config: A JSON object representing the attributes of sagemaker.clarify.ModelPredictedLabelConfig. Required only for the model bias monitor pipeline when problem_type is BinaryClassification or MulticlassClassification.
    - shap_config: A JSON object representing the attributes of sagemaker.clarify.SHAPConfig. Required only for the model explainability monitor. For the "baseline" attribute, you can provide a list of lists or an S3 csv file's key (representing feature values to be used as the baseline dataset in the kernel SHAP algorithm). If a file key is provided, the file must be uploaded to the solution's S3 assets bucket before making the API call.
    - name: A unique name of the model card.
    - status: (optional) The status of the model card. Possible values include: Approved, Archived, Draft (default), and PendingReview.
    - version: (optional) The model card version (integer).
    - created_by: (optional) A JSON object, the group or individual that created the model card.
    - last_modified_by: (optional) A JSON object, the group or individual that last modified the model card.
    - model_overview: (optional) A JSON object, an overview of the model (used with model card operations), with the following attributes:
      - model_name: (optional) The name of an existing SageMaker model. If provided, the model overview will be automatically extracted from the model.
      - model_id: (optional) A SageMaker model ARN or non-SageMaker model ID.
      - model_description: (optional) A description of the model.
      - model_version: (optional) The model version (integer or float).
      - problem_type: (optional) The type of problem that the model solves. For example, Binary Classification, Multiclass Classification, Linear Regression, Computer Vision, or Natural Language Processing.
      - algorithm_type: (optional) The algorithm used to solve the problem type.
      - model_creator: (optional) The organization, research group, or authors that created the model.
      - model_owner: (optional) The individual or group that maintains the model in your organization.
      - model_artifact: (optional) A list of model artifact location URIs. The maximum list size is 15.
      - inference_environment: (optional) A list of the model's inference Docker image(s).
    - intended_uses: (optional) A JSON object (used with model card operations) with the following attributes:
      - purpose_of_model: (optional) The general purpose of this model.
      - intended_uses: (optional) The intended use cases for this model.
      - factors_affecting_model_efficiency: (optional) Factors affecting model efficacy.
      - risk_rating: (optional) Your organization's risk rating for this model. Possible values include: High, Low, Medium, or Unknown.
      - explanations_for_risk_rating: (optional) An explanation of why your organization categorizes this model with this risk rating.
    - training_details: (optional) A JSON object (used with model card operations) with the following attributes:
      - model_name: (optional) An existing SageMaker model name. If provided, training details are auto-discovered from model_overview.
      - training_job_name: (optional) The SageMaker training job name used to train the model. If provided, training details are auto-discovered.
      - objective_function: (optional) A JSON object with the following attributes:
        - function: (optional) The optimization direction of the model's objective function. Possible values include Maximize or Minimize.
        - facet: (optional) The metric of the model's objective function. Possible values include Accuracy, AUC, Loss, MAE, or RMSE.
        - condition: (optional) Description of your objective function metric conditions.
        - notes: (optional) Additional notes about the objective function.
      - training_observations: (optional) Observations about training.
      - training_job_details: (optional) A JSON object with the following attributes:
        - training_arn: (optional) The SageMaker training job ARN.
        - training_datasets: (optional) A list of Amazon S3 bucket URLs for the datasets used to train the model. The maximum list size is 15.
        - training_environment: (optional) A list of SageMaker training image URIs.
        - training_metrics: (optional) A JSON object with the following attributes:
          - name: The metric name.
          - value: The metric value (integer or float).
          - notes: (optional) Notes on the metric.
        - user_provided_training_metrics: (optional) A list of training_metrics JSON objects. The maximum list length is 50.
    - evaluation_details: (optional) A list of JSON object(s) (used with model card operations). Each JSON object has the following attributes:
      - name: The evaluation job name.
      - metric_file_s3_url: (optional) The metric file's Amazon S3 bucket URL, which the solution uses to auto-discover evaluation metrics. The file must be uploaded to the solution's S3 assets bucket. If provided, evaluation metrics are extracted from the file.
      - metric_type: (required if metric_file_s3_url is provided) The type of evaluation. Possible values include model_card_metric_schema, clarify_bias, clarify_explainability, regression, binary_classification, or multiclass_classification.
      - evaluation_observation: (optional) Observations made during model evaluation.
      - evaluation_job_arn: (optional) The ARN of the evaluation job.
      - datasets: (optional) A list of evaluation dataset Amazon S3 bucket URLs. The maximum list length is 10.
      - metadata: (optional) A JSON object with additional attributes associated with the evaluation results.
      - metric_groups: (optional) A JSON object with the following attributes:
        - name: The metric group name.
        - metric_data: A list of JSON object(s) with the following attributes:
          - name: The name of the metric.
          - type: Metric type. Possible values include: bar_chart, boolean, linear_graph, matrix, number, or string.
          - value: The value of the metric (integer, float, string, boolean, or list).
          - notes: (optional) Notes to add to the metric.
          - x_axis_name: The name of the x axis.
          - y_axis_name: The name of the y axis.
    - additional_information: (optional) A JSON object (used with model card operations) with the following attributes:
      - ethical_considerations: (optional) Ethical considerations to document about the model.
      - caveats_and_recommendations: (optional) Caveats and recommendations for users who might use this model in their applications.
      - custom_details: (optional) A JSON object of any additional custom information to document about the model.
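Because /provisionpipeline accepts a different set of required attributes per pipeline_type, a small client-side check can catch a missing field before the API call. A minimal sketch, assuming the required-attribute sets shown in the example bodies below; the REQUIRED map and helper name are illustrative, not part of the solution's API:

```python
# Illustrative client-side validation of a /provisionpipeline request body.
# The REQUIRED map below is an assumption derived from the documented example
# bodies; extend it with the other pipeline types as needed.
REQUIRED = {
    "model_training_builtin": {
        "model_name", "model_framework", "model_framework_version",
        "job_output_location", "training_data",
    },
    "model_autopilot_training": {
        "model_name", "job_output_location", "training_data", "target_attribute",
    },
    "byom_realtime_custom": {
        "custom_image_uri", "model_name", "model_artifact_location",
        "data_capture_location", "inference_instance",
    },
}

def missing_attributes(body: dict) -> set:
    """Return the required attributes absent from a request body."""
    pipeline_type = body.get("pipeline_type")
    if pipeline_type not in REQUIRED:
        raise ValueError(f"unknown or unsupported pipeline_type: {pipeline_type}")
    return REQUIRED[pipeline_type] - body.keys()

body = {
    "pipeline_type": "model_autopilot_training",
    "model_name": "my-model",
    "job_output_location": "autopilot-output",
    "training_data": "train/data.csv",
}
print(missing_attributes(body))  # → {'target_attribute'}
```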
- Required attributes per pipeline type (Amazon SageMaker model registry is not used):
  - Model training using Amazon SageMaker training job (with required attributes):
    { "pipeline_type": "model_training_builtin", "model_name": "<my-model-name>", "model_framework": "xgboost", "model_framework_version": "1", "job_output_location": "<s3-prefix-in-assets-bucket>", "training_data": "<path/to/training_data.csv>", "validation_data": "<path/to/validation_data.csv>", "algo_hyperparamaters": "<algo-hyperparameters-json-object>" }
  - Model training using Amazon SageMaker hyperparameter tuning job (with required attributes):
    { "pipeline_type": "model_tuner_builtin", "model_name": "<my-model-name>", "model_framework": "xgboost", "model_framework_version": "1", "job_output_location": "<s3-prefix-in-assets-bucket>", "training_data": "<path/to/training_data.csv>", "validation_data": "<path/to/validation_data.csv>", "algo_hyperparamaters": "<algo-hyperparameters-json-object>", "tuner_configs": "<tuner-configs-json-object>", "hyperparamaters_ranges": "<hyperparamaters-ranges-json-object>" }
  - Model training using Amazon SageMaker autopilot job (with required attributes):
    { "pipeline_type": "model_autopilot_training", "model_name": "<my-model-name>", "job_output_location": "<s3-prefix-in-assets-bucket>", "training_data": "<path/to/training_data.csv>", "target_attribute": "<target-attribute-name>" }
  - Real-time inference with a custom algorithm for a machine learning model:
    { "pipeline_type": "byom_realtime_custom", "custom_image_uri": "<docker-image-uri-in-Amazon-ECR-repo>", "model_name": "<my-model-name>", "model_artifact_location": "<path/to/model.tar.gz>", "data_capture_location": "<bucket-name>/<prefix>", "inference_instance": "ml.m5.large", "endpoint_name": "<custom-endpoint-name>" }
  - Real-time inference with an Amazon SageMaker built-in model:
    { "pipeline_type": "byom_realtime_builtin", "model_framework": "xgboost", "model_framework_version": "1", "model_name": "<my-model-name>", "model_artifact_location": "<path/to/model.tar.gz>", "data_capture_location": "<bucket-name>/<prefix>", "inference_instance": "ml.m5.large", "endpoint_name": "<custom-endpoint-name>" }
  - Batch inference with a custom algorithm for a machine learning model:
    { "pipeline_type": "byom_batch_custom", "custom_image_uri": "<docker-image-uri-in-Amazon-ECR-repo>", "model_name": "<my-model-name>", "model_artifact_location": "<path/to/model.tar.gz>", "inference_instance": "ml.m5.large", "batch_inference_data": "<bucket-name>/<prefix>/inference_data.csv", "batch_job_output_location": "<bucket-name>/<prefix>" }
  - Batch inference with an Amazon SageMaker built-in model:
    { "pipeline_type": "byom_batch_builtin", "model_framework": "xgboost", "model_framework_version": "1", "model_name": "<my-model-name>", "model_artifact_location": "<path/to/model.tar.gz>", "inference_instance": "ml.m5.large", "batch_inference_data": "<bucket-name>/<prefix>/inference_data.csv", "batch_job_output_location": "<bucket-name>/<prefix>" }
  - Data quality monitor pipeline:
    { "pipeline_type": "byom_data_quality_monitor", "model_name": "<my-model-name>", "endpoint_name": "xgb-churn-prediction-endpoint", "baseline_data": "<path/to/training_data_with_header.csv>", "baseline_job_output_location": "<bucket-name>/<prefix>", "data_capture_location": "<bucket-name>/<prefix>", "monitoring_output_location": "<bucket-name>/<prefix>", "schedule_expression": "cron(0 * ? * * *)", "instance_type": "ml.m5.large", "instance_volume_size": "20", "baseline_max_runtime_seconds": "3300", "monitor_max_runtime_seconds": "3300" }
  - Model quality monitor pipeline (BinaryClassification problem):
    { "pipeline_type": "byom_model_quality_monitor", "model_name": "<my-model-name>", "endpoint_name": "xgb-churn-prediction-endpoint", "baseline_data": "<path/to/baseline_dataset.csv>", "baseline_job_output_location": "<bucket-name>/<prefix>", "data_capture_location": "<bucket-name>/<prefix>", "monitoring_output_location": "<bucket-name>/<prefix>", "schedule_expression": "cron(0 0 ? * * *)", "instance_type": "ml.m5.large", "instance_volume_size": "20", "baseline_max_runtime_seconds": "3300", "monitor_max_runtime_seconds": "1800", "baseline_inference_attribute": "prediction", "baseline_probability_attribute": "probability", "baseline_ground_truth_attribute": "label", "probability_threshold_attribute": "0.5", "problem_type": "BinaryClassification", "monitor_probability_attribute": "0", "monitor_ground_truth_input": "<bucket-name>/<prefix>/<yyyy>/<mm>/<dd>/<hh>" }
  - Model quality monitor pipeline (Regression or MulticlassClassification problem):
    { "pipeline_type": "byom_model_quality_monitor", "model_name": "<my-model-name>", "endpoint_name": "xgb-churn-prediction-endpoint", "baseline_data": "<path/to/baseline_data.csv>", "baseline_job_output_location": "<bucket-name>/<prefix>", "data_capture_location": "<bucket-name>/<prefix>", "monitoring_output_location": "<bucket-name>/<prefix>", "schedule_expression": "cron(0 0 ? * * *)", "instance_type": "ml.m5.large", "instance_volume_size": "20", "baseline_max_runtime_seconds": "3300", "monitor_max_runtime_seconds": "1800", "baseline_inference_attribute": "prediction", "baseline_ground_truth_attribute": "label", "problem_type": "Regression", "monitor_inference_attribute": "0", "monitor_ground_truth_input": "<bucket-name>/<prefix>/<yyyy>/<mm>/<dd>/<hh>" }
  - Model bias monitor pipeline (BinaryClassification problem):
    { "pipeline_type": "byom_model_bias_monitor", "model_name": "<my-model-name>", "endpoint_name": "xgb-churn-prediction-endpoint", "baseline_data": "<path/to/training_data_with_header.csv>", "baseline_job_output_location": "<bucket-name>/<prefix>", "data_capture_location": "<bucket-name>/<prefix>", "monitoring_output_location": "<bucket-name>/<prefix>", "schedule_expression": "cron(0 0 ? * * *)", "instance_type": "ml.m5.large", "instance_volume_size": "20", "baseline_max_runtime_seconds": "3300", "monitor_max_runtime_seconds": "1800", "probability_threshold_attribute": "0.5", "problem_type": "BinaryClassification", "monitor_probability_attribute": "0", "bias_config": { "label_values_or_threshold": "<value>", "facet_name": "<value>", "facet_values_or_threshold": "<value>" }, "model_predicted_label_config": {"probability": 0}, "monitor_ground_truth_input": "<bucket-name>/<prefix>/<yyyy>/<mm>/<dd>/<hh>" }
  - Model bias monitor pipeline (Regression problem):
    { "pipeline_type": "byom_model_bias_monitor", "model_name": "<my-model-name>", "endpoint_name": "xgb-churn-prediction-endpoint", "baseline_data": "<path/to/training_data_with_header.csv>", "baseline_job_output_location": "<bucket-name>/<prefix>", "data_capture_location": "<bucket-name>/<prefix>", "monitoring_output_location": "<bucket-name>/<prefix>", "schedule_expression": "cron(0 0 ? * * *)", "instance_type": "ml.m5.large", "instance_volume_size": "20", "baseline_max_runtime_seconds": "3300", "monitor_max_runtime_seconds": "1800", "problem_type": "Regression", "monitor_inference_attribute": "0", "bias_config": { "label_values_or_threshold": "<value>", "facet_name": "<value>", "facet_values_or_threshold": "<value>" }, "monitor_ground_truth_input": "<bucket-name>/<prefix>/<yyyy>/<mm>/<dd>/<hh>" }
  - Model explainability monitor pipeline (BinaryClassification problem):
    { "pipeline_type": "byom_model_explainability_monitor", "model_name": "<my-model-name>", "endpoint_name": "xgb-churn-prediction-endpoint", "baseline_data": "<path/to/training_data_with_header.csv>", "baseline_job_output_location": "<bucket-name>/<prefix>", "data_capture_location": "<bucket-name>/<prefix>", "monitoring_output_location": "<bucket-name>/<prefix>", "schedule_expression": "cron(0 0 ? * * *)", "instance_type": "ml.m5.large", "instance_volume_size": "20", "baseline_max_runtime_seconds": "3300", "monitor_max_runtime_seconds": "1800", "probability_threshold_attribute": "0.5", "problem_type": "BinaryClassification", "monitor_probability_attribute": "0", "shap_config": { "baseline": "<path/to/shap_baseline_dataset.csv>", "num_samples": "<value>", "agg_method": "mean_abs|mean_sq|median" } }
  - Custom algorithm image builder pipeline:
    { "pipeline_type": "byom_image_builder", "custom_algorithm_docker": "<path/to/custom_image.zip>", "ecr_repo_name": "<name-of-Amazon-ECR-repository>", "image_tag": "<image-tag>" }
  - Model card's create operation:
    { "pipeline_type": "create_model_card", "name": "<model-card-name>", "model_overview": { "model_name": "<name-of-existing-model>", "model_description": "<model description>", "model_version": <version number>, "problem_type": "<type of problem the model solves>", "algorithm_type": "<algorithm name>", "model_creator": "<name of the model creator>", "model_owner": "<model owner>", "model_artifact": ["<model artifact>"], "inference_environment": ["<image used for inference>"] }, "intended_uses": { "purpose_of_model": "<description of purpose of model>", "intended_uses": "<description of intended uses>", "factors_affecting_model_efficiency": "<any factors>", "risk_rating": "Low", "explanations_for_risk_rating": "<risk rating>" }, "training_details": { "training_job_name": "<training job name>", "objective_function": { "function": "<one of Maximize|Minimize>", "facet": "<one of Accuracy|AUC|Loss|MAE|RMSE>", "condition": "<description of any conditions>", "notes": "<any notes>" }, "training_observations": "<any observations>", "training_job_details": { "user_provided_training_metrics": [{"name": "<metric-name>", "value": <metric value>, "notes": "<metric notes>"}] } }, "evaluation_details": [ { "name": "<evaluation name>", "metric_file_s3_url": "<s3 url for the JSON evaluation file in the solution's asset S3 bucket>", "metric_type": "<one of model_card_metric_schema|clarify_bias|clarify_explainability|regression|binary_classification|multiclass_classification>" }, { "name": "<evaluation name>", "evaluation_observation": "<any-observation>", "evaluation_job_arn": "<job-arn>", "datasets": ["<s3 url for training data>"], "metadata": {"key": "value"}, "metric_groups": [{"name": "<group-name>", "metric_data": [{"name": "<metric-name>", "type": "<one of bar_chart|boolean|linear_graph|matrix|number|string>", "value": <value>, "notes": "<metric notes>"}]}] }], "additional_information": { "ethical_considerations": "make sure data is representative", "caveats_and_recommendations": "some recommendations", "custom_details": { "key": "value" } } }
  - Model card's describe operation:
    { "pipeline_type": "describe_model_card", "name": "<model card name>" }
  - Model card's delete operation:
    { "pipeline_type": "delete_model_card", "name": "<model card name>" }
  - Model card's update operation:
    { "pipeline_type": "update_model_card", "name": "<model card name>", "status": "<status>", "training_details": { "training_job_name": "<training job name>" } }
  - Model card's export operation:
    { "pipeline_type": "export_model_card", "name": "<model card name>" }
  - Model card's list operation:
    { "pipeline_type": "list_model_cards" }
- Required attributes per pipeline type when the Amazon SageMaker model registry is used. When the model registry is used, the following attributes must be modified:
  - Real-time inference and batch pipelines with custom algorithms:
    - Remove custom_image_uri and model_artifact_location
    - Add model_package_name
  - Real-time inference and batch pipelines with Amazon SageMaker built-in algorithms:
    - Remove model_framework, model_framework_version, and model_artifact_location
    - Add model_package_name
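The modifications above can be applied mechanically to an existing request body. A small illustrative helper (the function name is not part of the solution) that rewrites a non-registry body into its model-registry form:

```python
def to_model_registry_body(body: dict, model_package_name: str) -> dict:
    """Rewrite a /provisionpipeline body to use the SageMaker model registry.

    Drops the attributes the registry supersedes (whichever are present) and
    adds model_package_name, per the modifications listed above.
    """
    registry_body = dict(body)  # leave the caller's body untouched
    for attribute in ("custom_image_uri", "model_artifact_location",
                      "model_framework", "model_framework_version"):
        registry_body.pop(attribute, None)
    registry_body["model_package_name"] = model_package_name
    return registry_body

realtime_custom = {
    "pipeline_type": "byom_realtime_custom",
    "custom_image_uri": "<docker-image-uri>",
    "model_name": "my-model",
    "model_artifact_location": "path/to/model.tar.gz",
    "data_capture_location": "bucket/prefix",
    "inference_instance": "ml.m5.large",
}
registry = to_model_registry_body(
    realtime_custom,
    "arn:aws:sagemaker:us-east-1:111122223333:model-package/group/1",
)
```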
Expected responses of API requests to
/provisonpipeline
:-
If the pipeline is provisioned for the first time (that is, if no existing pipeline with the same name), the response is:
-
{ "message": "success: stack creation started", "pipeline_id": "arn:aws:cloudformation:
<region>
:<account-id>
:stack/<stack-id>
" } -
If the pipeline is already provisioned, the response is:
{ "message": "Pipeline
<stack-name>
is already provisioned. Updating template parameters.", "pipeline_id": "arn:aws:cloudformation:<region>
:<account-id>
:stack/<stack-id>
" }
  - If the pipeline is already provisioned, the pipeline_type is byom_image_builder, and there are updates to be performed, the response is:
    { "message": "Pipeline <stack-name> is being updated.", "pipeline_id": "arn:aws:cloudformation:<region>:<account-id>:stack/<stack-id>" }
  - If the pipeline is already provisioned, the pipeline_type is byom_image_builder, and there are no updates to be performed, the response is:
    { "message": "Pipeline <stack-name> is already provisioned. No updates are to be performed.", "pipeline_id": "arn:aws:cloudformation:<region>:<account-id>:stack/<stack-id>" }
  - If the pipeline type is one of the model card operations (create, describe, update, delete, export, and list), the response is:
    { "message": "<message based on the model card operation>" }
- /pipelinestatus
  - Method: POST
  - Body:
    - pipeline_id: The ARN of the created CloudFormation stack after provisioning a pipeline. (This information can be retrieved from /provisionpipeline.)
  - Example structure:
    { "pipeline_id": "arn:aws:cloudformation:us-west-1:123456789123:stack/my-mlops-pipeline/12abcdef-abcd-1234-ab12-abcdef123456" }
- Expected responses of API requests to /pipelinestatus:
  - The returned response depends on the solution's deployment option (single- or multi-account). Example response for the single-account option:

    {
      "pipelineName": "<pipeline-name>",
      "pipelineVersion": 1,
      "stageStates": [
        {
          "stageName": "Source",
          "inboundTransitionState": { "enabled": true },
          "actionStates": [
            {
              "actionName": "S3Source",
              "currentRevision": { "revisionId": "<version-id>" },
              "latestExecution": {
                "actionExecutionId": "<execution-id>",
                "status": "Succeeded",
                "summary": "Amazon S3 version id: <id>",
                "lastStatusChange": "<timestamp>",
                "externalExecutionId": "<execution-id>"
              },
              "entityUrl": "https://console.aws.amazon.com/s3/home?region=<region>#"
            }
          ],
          "latestExecution": { "pipelineExecutionId": "<execution-id>", "status": "Succeeded" }
        },
        {
          "stageName": "DeployCloudFormation",
          "inboundTransitionState": { "enabled": true },
          "actionStates": [
            {
              "actionName": "deploy_stack",
              "latestExecution": {
                "actionExecutionId": "<execution-id>",
                "status": "Succeeded",
                "summary": "Stack <pipeline-name> was created.",
                "lastStatusChange": "<timestamp>",
                "externalExecutionId": "<stack-id>",
                "externalExecutionUrl": "<stack-url>"
              },
              "entityUrl": "https://console.aws.amazon.com/cloudformation/home?region=<region>#/"
            }
          ],
          "latestExecution": { "pipelineExecutionId": "<execution-id>", "status": "Succeeded" }
        }
      ],
      "created": "<timestamp>",
      "updated": "<timestamp>",
      "ResponseMetadata": {
        "RequestId": "<request-id>",
        "HTTPStatusCode": 200,
        "HTTPHeaders": {
          "x-amzn-requestid": "<request-id>",
          "date": "<date>",
          "content-type": "application/x-amz-json-1.1",
          "content-length": "<number>"
        },
        "RetryAttempts": 0
      }
    }
- You can use the following API method to get inferences from the deployed real-time inference pipeline. The API Gateway URL can be found in the outputs of the pipeline's AWS CloudFormation stack.
- /inference
  - Method: POST
  - Body:
    - payload: The data to be sent for inference.
    - content_type: MIME content type for the payload.
      { "payload": "1.0, 2.0, 3.2", "content_type": "text/csv" }
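A sketch of serializing a row of numeric features into the body shown above (the helper is illustrative, not part of the solution):

```python
import json

def inference_body(data_point):
    """Serialize one data point (a sequence of numbers) as the CSV
    payload the /inference endpoint expects."""
    payload = ", ".join(str(v) for v in data_point)
    return json.dumps({"payload": payload, "content_type": "text/csv"})

print(inference_body([1.0, 2.0, 3.2]))
```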
- Expected responses of API requests to /inference:
  - The request returns a single prediction value if one data point was sent in the request, and multiple prediction values (separated by ",") if several data points were sent.
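Because multiple predictions come back as one comma-separated string, a client usually splits them back into numbers; a minimal sketch:

```python
def parse_predictions(raw):
    """Split an /inference response into a list of float predictions.
    A single data point yields a one-element list."""
    return [float(part) for part in raw.split(",")]
```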
- API responses with error messages:
  - If an API request to any of the solution's API endpoints results in an exception or error, the expected body of the API response is:
    { "message": "<general error message>", "detailedMessage": "<detailed error message>" }
  - The detailedMessage attribute in the body of the API response is only included if the solution was configured to allow detailed error messages. Refer to the template's parameters table for more details.
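A client-side sketch of surfacing these errors (the exception class is illustrative; as noted above, detailedMessage is only present when the solution is configured to allow detailed error messages):

```python
class MLOpsApiError(Exception):
    """Raised when a solution endpoint returns an error body."""

def check_response(status_code, body):
    """Return the body on success; raise with the error message (and the
    detailed message, when the solution is configured to include it)."""
    if status_code < 400:
        return body
    detail = body.get("detailedMessage")
    message = body["message"] + (f" ({detail})" if detail else "")
    raise MLOpsApiError(message)
```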