
Template option 1: Single account deployment

The MLOps Workload Orchestrator solution’s Amazon API Gateway exposes two main API endpoints: /provisionpipeline, used to provision a pipeline, and /pipelinestatus, used to get the status of a provisioned pipeline.

  • /provisionpipeline

    • Method: POST

    • Body:

      • pipeline_type: Type of the pipeline to provision. The solution currently supports byom_realtime_builtin (real-time inference with Amazon SageMaker built-in algorithms pipeline), model_training_builtin (model training using Amazon SageMaker training pipeline), model_tuner_builtin (Amazon SageMaker hyperparameter tuning pipeline), model_autopilot_training (Amazon SageMaker Autopilot pipeline), byom_realtime_custom (real-time inference with custom algorithms pipeline), byom_batch_builtin (batch transform with built-in algorithms pipeline), byom_batch_custom (batch transform with custom algorithms pipeline), byom_data_quality_monitor (data quality monitor pipeline), byom_model_quality_monitor (model quality monitor pipeline), byom_model_bias_monitor (model bias monitor pipeline), byom_model_explainability_monitor (model explainability monitor pipeline), byom_image_builder (custom algorithm Docker image builder pipeline), and the model card operations (create_model_card, describe_model_card, update_model_card, delete_model_card, list_model_cards, and export_model_cards).

      • custom_algorithm_docker: Path to a zip file inside the S3 assets bucket, containing the necessary files (for example, Dockerfile, assets, etc.) to create a Docker image that can be used by Amazon SageMaker to deploy a model trained using the custom algorithm. For more information, refer to the Example Notebooks: Use Your Own Algorithm or Model in the Amazon SageMaker Developer Guide.

      • custom_image_uri: URI of a custom algorithm image in an Amazon ECR repository.

      • ecr_repo_name: Name of an Amazon ECR repository where the custom algorithm image, created by the byom_image_builder pipeline, will be stored.

      • image_tag: custom algorithm’s image tag to assign to the created image using the byom_image_builder pipeline.

      • model_framework: Name of the built-in algorithm used to train the model.

      • model_framework_version: Version number of the built-in algorithm used to train the model.

      • model_name: Arbitrary name for the model being deployed. The solution uses this parameter to create an Amazon SageMaker model, endpoint configuration, and endpoint with extensions on the model name, such as <model_name>-endpoint-config and <model_name>-endpoint. The model_name is also used in the name of the deployed AWS CloudFormation stack for all pipelines.

      • model_artifact_location: Path to a file in S3 assets bucket containing the model artifact file (the output file after training a model).

      • model_package_name: Amazon SageMaker model package name (for example, "arn:aws:sagemaker:<region>:<account_id>:model-package/<model_package_group_name>/<model_version>").

      • baseline_data: Path to a csv file in the S3 assets bucket containing the data with feature names used for training the model (for the data quality, model bias, and model explainability monitors), or model predictions and ground truth labels (for the model quality monitor); for example, a csv file with the header “prediction, probability, label” for a BinaryClassification problem.

      • inference_instance: Instance type for inference (real-time or batch). Refer to Amazon SageMaker Pricing for a complete list of machine learning instance types.

      • data_capture_location: Path to a prefix in an S3 Bucket (including the bucket’s name, for example <bucket-name>/<prefix>) to store the data captured by the real-time Amazon SageMaker inference endpoint.

      • batch_inference_data: Path to a file in an S3 Bucket (including the bucket’s name, for example <bucket-name>/<path-to-file>) containing the data for batch inference. This parameter is not required if your inference type is set to real-time.

      • batch_job_output_location: Path to a prefix in an S3 bucket (including the bucket’s name, for example <bucket-name>/<prefix>) to store the output of the batch transform job. This parameter is not required if your inference type is set to real-time.

      • instance_type: Instance type used by the data baseline and model monitoring jobs.

      • instance_volume_size: Size of the EC2 volume in GB to use for the baseline and monitoring job. The size must be enough to hold your training data and create the data baseline.

      • instance_count: The number of EC2 instances used by the training job.

      • endpoint_name: The name of the deployed Amazon SageMaker endpoint to monitor when deploying data and model quality monitor pipelines. Optionally, provide the endpoint_name when creating a real-time inference pipeline which will be used to name the created Amazon SageMaker endpoint. If you do not provide endpoint_name, it will be automatically generated.

      • baseline_job_output_location: Path to a prefix in an S3 bucket (including the bucket’s name, for example <bucket-name>/<prefix>) to store the output of the data baseline job.

      • monitoring_output_location: Path to a prefix in an S3 bucket (including the bucket’s name, for example <bucket-name>/<prefix>) to store the output of the monitoring job.

      • schedule_expression: Cron job expression to run the monitoring job. For example, cron(0 * ? * * *) will run the monitoring job hourly, cron(0 0 ? * * *) daily, etc.

      • baseline_max_runtime_seconds: Specifies the maximum time, in seconds, the baseline job is allowed to run. If the attribute is not provided, the job will run until it finishes.

      • monitor_max_runtime_seconds: Specifies the maximum time, in seconds, the monitoring job is allowed to run. For data quality and model explainability monitors, the value can be up to 3300 seconds for an hourly schedule. For model quality and model bias hourly schedules, this can be up to 1800 seconds.

      • kms_key_arn: Optional customer managed AWS Key Management Service (AWS KMS) key to encrypt captured data from the real-time Amazon SageMaker endpoint, output of batch transform and data baseline jobs, output of model monitor, and Amazon Elastic Compute Cloud (Amazon EC2) instance's volume used by Amazon SageMaker to run the solution's pipelines. This attribute may be included in the API calls of byom_realtime_builtin, byom_realtime_custom, byom_batch_builtin, byom_batch_custom, and byom_<monitor-type>_monitor pipelines.

      • baseline_inference_attribute: Index or JSON path to locate predicted label(s) required for Regression or MulticlassClassification problems. The attribute is used by the model quality baseline. If baseline_probability_attribute and probability_threshold_attribute are provided, baseline_inference_attribute is not required for a BinaryClassification problem.

      • baseline_probability_attribute: Index or JSON path to locate predicted probabilities. The attribute is used by the model quality baseline, together with probability_threshold_attribute, for a BinaryClassification problem.

      • baseline_ground_truth_attribute: Index or JSON path to locate actual label(s). Used by the model quality baseline.

      • problem_type: Type of machine learning problem. Valid values are “Regression”, “BinaryClassification”, or “MulticlassClassification”. Used by the model quality, model bias, and model explainability monitoring schedules. It is an optional attribute for the model_autopilot_training pipeline. If not provided, the autopilot job will infer the problem type from the target_attribute. If provided, the job_objective attribute must be provided too.

      • job_objective: (optional) Metric to optimize, used by the model_autopilot_training pipeline. If provided, the problem_type must be provided too. Valid values are "Accuracy", "MSE", "F1", "F1macro", and "AUC".

      • job_name: (optional) The name of the training job. If not provided, a name will be automatically generated by the solution. Used by all training pipelines. Note: The given name must be unique (no previous jobs created by the same name).

      • training_data: The S3 file key/prefix of the training data in the solution’s S3 assets bucket. This attribute is required by all training pipelines. Note: For model_training_builtin and model_tuner_builtin pipelines, the csv should not have a header. The target attribute should be the first column. For model_autopilot_training pipeline, the file should have a header.

      • validation_data: (optional) The S3 file key/prefix of the validation data in the solution’s S3 assets bucket. This attribute is used by the model_training_builtin and model_tuner_builtin pipelines.

      • target_attribute: Target attribute name in the training data. Required by the model_autopilot_training pipeline.

      • compression_type: (optional) Compression type used with the training/validation data. Valid value: “Gzip”.

      • content_type: (optional) The MIME type of the training data. Default: “csv”.

      • s3_data_type: (optional) Training S3 data type. Valid values “S3Prefix”, “ManifestFile”, or “AugmentedManifestFile”. Used by the model_training_builtin and model_tuner_builtin pipelines. Default: “S3Prefix”.

      • data_distribution: (optional) Data distribution. Valid values “FullyReplicated” or “ShardedByS3Key”. Used by the model_training_builtin and model_tuner_builtin pipelines. Default: “FullyReplicated”.

      • data_input_mode: (optional) Training data input mode. Valid values: “File”, “Pipe”, or “FastFile”. Used by the model_training_builtin and model_tuner_builtin pipelines. Default: “File”.

      • data_record_wrapping: (optional) Training data record wrapping, if any. Valid values “RecordIO”. Used by the model_training_builtin and model_tuner_builtin pipelines.

      • attribute_names: (optional) List of one or more attribute names to use that are found in a specified AugmentedManifestFile (if s3_data_type = “AugmentedManifestFile”). Used by the model_training_builtin and model_tuner_builtin pipelines.

      • job_output_location: S3 prefix in the solution’s S3 assets bucket, where the output of the training jobs will be saved.

      • job_max_candidates: (optional) Maximum number of candidates to be tried by the autopilot job. Default: 10.

      • max_runtime_per_job: (optional) Maximum runtime in seconds the training job is allowed to run. Default: 86400.

      • total_max_runtime: (optional) Autopilot total runtime in seconds allowed for the job. Default: 2592000.

      • generate_definition_only: (optional) Generate candidate definitions only by the autopilot job. Used by the model_autopilot_training pipeline. Default: “False”.

      • encrypt_inner_traffic: (optional) Encrypt inner-container traffic for the job. Used by training pipelines. Default: “True”.

      • use_spot_instances: (optional) Use managed spot instances with the training job. Used by the model_training_builtin and model_tuner_builtin pipelines. Default: “True”.

      • max_wait_time_spot_instances: (optional) Maximum wait time in seconds for Spot instances (required if use_spot_instances = True). Must be greater than max_runtime_per_job. Default: 172800.

      • algo_hyperparamaters: Amazon SageMaker built-in Algorithm hyperparameters provided as a JSON object. Used by the model_training_builtin and model_tuner_builtin pipelines. Example: {"eval_metric": "auc", "objective": "binary:logistic", "num_round": 400, "rate_drop": 0.3}.

      • tuner_configs: sagemaker.tuner.HyperparameterTuner configs (objective_metric_name, metric_definitions, strategy, objective_type, max_jobs, max_parallel_jobs, base_tuning_job_name, early_stopping_type) provided as a JSON object. Required by the model_tuner_builtin pipeline.

        Note: Some attributes have default values and are not required to be specified. Example: {"early_stopping_type": "Auto", "objective_metric_name": "validation:auc", "max_jobs": 10, "max_parallel_jobs": 2}.

      • hyperparamaters_ranges: Algorithm hyperparameter ranges used by the hyperparameter tuning job, provided as a JSON object where each key is a hyperparameter name and each value is a list: the first item is the type ("continuous"|"integer"|"categorical"), and the second item is a list of [min_value, max_value] for "continuous"|"integer" or a list of values for "categorical". Required by the model_tuner_builtin pipeline.

        Example: {"min_child_weight": ["continuous", [0, 120]], "max_depth": ["integer", [1, 15]], "optimizer": ["categorical", ["sgd", "Adam"]]}

      • monitor_inference_attribute: Index or JSON path to locate predicted label(s). Required for Regression or MulticlassClassification problems, and not required for a BinaryClassification problem. Used by the model quality, model bias, and model explainability monitoring schedules.

      • monitor_probability_attribute: Index or JSON path to locate probabilities. Used only with a BinaryClassification problem. Used by the model quality monitoring schedule.

      • probability_threshold_attribute: Threshold to convert probabilities to binaries. Used by the model quality monitoring schedule, and only with a BinaryClassification problem.

      • monitor_ground_truth_input: Used by the model quality and model bias monitoring schedules to locate the ground truth labels. The solution expects you to use eventId to label the captured data by the Amazon SageMaker endpoint. For more information, refer to the Amazon SageMaker developer guide on how to Ingest Ground Truth Labels and Merge Them with Predictions.

      • bias_config: A JSON object representing the attributes of sagemaker.clarify.BiasConfig. Required only for the model bias monitor pipeline.

      • model_predicted_label_config: A JSON object representing the attributes of sagemaker.clarify.ModelPredictedLabelConfig. Required only for the model bias monitor pipeline when problem_type is BinaryClassification or MulticlassClassification.

      • shap_config: A JSON object representing the attributes of sagemaker.clarify.SHAPConfig. Required only for the model explainability monitor. For the “baseline” attribute, you can provide a list of lists or an S3 csv file key (representing feature values to be used as the baseline dataset in the kernel SHAP algorithm). If a file key is provided, the file must be uploaded to the solution’s S3 assets bucket before making the API call.

      • name: A unique name of the model card.

      • status: (optional) The status of the model card. Possible values include: Approved, Archived, Draft (default), and PendingReview.

      • version: (optional) The model card version (integer).

      • created_by: (optional) A JSON object, the group or individual that created the model card.

      • last_modified_by: (optional) A JSON object, the group or individual that last modified the model card.

      • model_overview: (optional) A JSON object, an overview of the model (used with model card operations) with the following attributes:

        • model_name: (optional) The name of an existing SageMaker model. If provided, the model overview will be automatically extracted from the model.

        • model_id: (optional) A SageMaker model ARN or non-SageMaker model ID.

        • model_description: (optional) A description of the model.

        • model_version: (optional) The model version (integer or float).

        • problem_type: (optional) The type of problem that the model solves. For example, Binary Classification, Multiclass Classification, Linear Regression, Computer Vision, or Natural Language Processing.

        • algorithm_type: (optional) The algorithm used to solve the problem type.

        • model_creator: (optional) The organization, research group, or authors that created the model.

        • model_owner: (optional) The individual or group that maintains the model in your organization.

        • model_artifact: (optional) A list of model artifact location URIs. The maximum list size is 15.

        • inference_environment: (optional) A list of a model’s inference docker image(s).

      • intended_uses: (optional) A JSON object (used with model card operations) with the following attributes:

        • purpose_of_model: (optional) The general purpose of this model.

        • intended_uses: (optional) The intended use cases for this model.

        • factors_affecting_model_efficiency: (optional) Factors affecting model efficacy.

        • risk_rating: (optional) Your organization’s risk rating for this model. Possible values include: High, Low, Medium, or Unknown.

        • explanations_for_risk_rating: (optional) An explanation of why your organization categorizes this model with this risk rating.

      • training_details: (optional) A JSON object (used with model card operations) with the following attributes:

        • model_name: (optional) An existing SageMaker model name. If provided, training details are auto-discovered from model_overview.

        • training_job_name: (optional) SageMaker training job name used to train the model. If provided, training details are auto-discovered.

        • objective_function: (optional) A JSON object with the following attributes:

          • function: (optional) The optimization direction of the model’s objective function. Possible values include Maximize or Minimize.

          • facet: (optional) The metric of the model’s objective function. Possible values include Accuracy, AUC, Loss, MAE, or RMSE.

          • condition: (optional) Description of your objective function metric conditions.

          • notes: (optional) Additional notes about the objective function.

        • training_observations: (optional) Observations about training.

        • training_job_details: (optional) A JSON object with the following attributes:

          • training_arn: (optional) The SageMaker training job ARN.

          • training_datasets: (optional) A list of Amazon S3 bucket URLs for the datasets used to train the model. The maximum list size is 15.

          • training_environment: (optional) A list of SageMaker training image URIs.

          • training_metrics: (optional) A JSON object with the following attributes:

            • name: The metric name.

            • value: The metric value (integer or float).

            • notes: (optional) Notes on the metric.

          • user_provided_training_metrics: (optional) A list of training_metrics JSON objects. The maximum list length is 50.

      • evaluation_details: (optional) A list of JSON object(s) (used with model card operations). Each JSON object has the following attributes:

        • name: The evaluation job name.

        • metric_file_s3_url: (optional) The metric file’s Amazon S3 bucket URL, which the solution uses to auto-discover evaluation metrics. The file must be uploaded to the solution’s Amazon S3 Assets bucket. If provided, evaluation metrics are extracted from the file.

        • metric_type: (required if metric_file_s3_url is provided) The type of evaluation. Possible values include model_card_metric_schema, clarify_bias, clarify_explainability, regression, binary_classification, or multiclass_classification.

        • evaluation_observation: (optional) Observations made during model evaluation.

        • evaluation_job_arn: (optional) The ARN of the evaluation job.

        • datasets: (optional) A list of evaluation dataset Amazon S3 bucket URLs. Maximum list length is 10.

        • metadata: (optional) A JSON object with additional attributes associated with the evaluation results.

        • metric_groups: (optional) A JSON object with the following attributes:

          • name: The metric group name.

          • metric_data: A list of JSON object(s) with the following attributes:

            • name: The name of the metric.

            • type: Metric type. Possible values include: bar_chart, boolean, linear_graph, matrix, number, or string.

            • value: The value of the metric. The data type can be integer, float, string, boolean, or list.

            • notes: (optional) Notes to add to the metric.

            • x_axis_name: The name of the x axis.

            • y_axis_name: The name of the y axis.

      • additional_information: (optional) A JSON object (used with model card operations). The JSON object has the following attributes:

        • ethical_considerations: (optional) Ethical considerations to document about the model.

        • caveats_and_recommendations: (optional) Caveats and recommendations for users who might use this model in their applications.

        • custom_details: (optional) A JSON object of any additional custom information to document about the model.
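As an illustration of the hyperparamaters_ranges structure described above, a small client-side validator might look like the following. This is a hypothetical helper, not part of the solution; the solution performs its own validation.

```python
def validate_hyperparameter_ranges(ranges):
    """Check a hyperparamaters_ranges object: each value must be
    [type, spec] where type is continuous|integer|categorical."""
    for name, (param_type, spec) in ranges.items():
        if param_type in ("continuous", "integer"):
            # spec must be [min_value, max_value] with min < max
            if len(spec) != 2 or not all(isinstance(v, (int, float)) for v in spec):
                raise ValueError(f"{name}: expected [min_value, max_value]")
            if spec[0] >= spec[1]:
                raise ValueError(f"{name}: min_value must be less than max_value")
        elif param_type == "categorical":
            if not spec:
                raise ValueError(f"{name}: expected a non-empty list of values")
        else:
            raise ValueError(f"{name}: unknown type {param_type!r}")
    return True

# The example ranges from the attribute description above.
ranges = {
    "min_child_weight": ["continuous", [0, 120]],
    "max_depth": ["integer", [1, 15]],
    "optimizer": ["categorical", ["sgd", "Adam"]],
}
assert validate_hyperparameter_ranges(ranges)
```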

    • Required attributes per pipeline type (Amazon SageMaker model registry is not used):

      • Model training using Amazon SageMaker training job (with required attributes):

        { "pipeline_type": "model_training_builtin", "model_name": "<my-model-name>", "model_framework": "xgboost", "model_framework_version": "1", "job_output_location": "<s3-prefix-in-assets-bucket>", "training_data": "<path/to/training_data.csv>", "validation_data": "<path/to/validation_data.csv>", "algo_hyperparamaters": "<algo-hyperparameters-json-object>" }
      • Model training using Amazon SageMaker hyperparameter tuning Job (with required attributes):

        { "pipeline_type": "model_tuner_builtin", "model_name": "<my-model-name>", "model_framework": "xgboost", "model_framework_version": "1", "job_output_location": "<s3-prefix-in-assets-bucket>", "training_data": "<path/to/training_data.csv>", "validation_data": "<path/to/validation_data.csv>", "algo_hyperparamaters": "<algo-hyperparameters-json-object>", "tuner_configs": "<tuner-configs-json-object>", "hyperparamaters_ranges": "<hyperparamaters-ranges-json-object>" }
      • Model training using Amazon SageMaker autopilot job (with required attributes):

        { "pipeline_type": "model_autopilot_training", "model_name": "<my-model-name>", "job_output_location": "<s3-prefix-in-assets-bucket>", "training_data": "<path/to/training_data.csv>", "target_attribute": "<target-attribute-name>" }
      • Real-time inference with a custom algorithm for a machine learning model:

        { "pipeline_type": "byom_realtime_custom", "custom_image_uri": "<docker-image-uri-in-Amazon-ECR-repo>", "model_name": "<my-model-name>", "model_artifact_location": "<path/to/model.tar.gz>", "data_capture_location": "<bucket-name>/<prefix>", "inference_instance": "ml.m5.large", "endpoint_name": "<custom-endpoint-name>" }
      • Real-time inference with an Amazon SageMaker built-in model:

        { "pipeline_type": "byom_realtime_builtin", "model_framework": "xgboost", "model_framework_version": "1", "model_name": "<my-model-name>", "model_artifact_location": "<path/to/model.tar.gz>", "data_capture_location": "<bucket-name>/<prefix>", "inference_instance": "ml.m5.large", "endpoint_name": "<custom-endpoint-name>" }
      • Batch inference with a custom algorithm for a machine learning model:

        { "pipeline_type": "byom_batch_custom", "custom_image_uri": "<docker-image-uri-in-Amazon-ECR-repo>", "model_name": "<my-model-name>", "model_artifact_location": "<path/to/model.tar.gz>", "inference_instance": "ml.m5.large", "batch_inference_data": "<bucket-name>/<prefix>/inference_data.csv", "batch_job_output_location": "<bucket-name>/<prefix>" }
      • Batch inference with an Amazon SageMaker built-in model:

        { "pipeline_type": "byom_batch_builtin", "model_framework": "xgboost", "model_framework_version": "1", "model_name": "<my-model-name>", "model_artifact_location": "<path/to/model.tar.gz>", "inference_instance": "ml.m5.large", "batch_inference_data": "<bucket-name>/<prefix>/inference_data.csv", "batch_job_output_location": "<bucket-name>/<prefix>" }
      • Data quality monitor pipeline:

        { "pipeline_type": "byom_data_quality_monitor", "model_name": "<my-model-name>", "endpoint_name": "xgb-churn-prediction-endpoint", "baseline_data": "<path/to/training_data_with_header.csv>", "baseline_job_output_location": "<bucket-name>/<prefix>", "data_capture_location": "<bucket-name>/<prefix>", "monitoring_output_location": "<bucket-name>/<prefix>", "schedule_expression": "cron(0 * ? * * *)", "instance_type": "ml.m5.large", "instance_volume_size": "20", "baseline_max_runtime_seconds": "3300", "monitor_max_runtime_seconds": "3300" }
      • Model quality monitor pipeline (BinaryClassification problem):

        { "pipeline_type": "byom_model_quality_monitor", "model_name": "<my-model-name>", "endpoint_name": "xgb-churn-prediction-endpoint", "baseline_data": "<path/to/baseline_dataset.csv>", "baseline_job_output_location": "<bucket-name>/<prefix>", "data_capture_location": "<bucket-name>/<prefix>", "monitoring_output_location": "<bucket-name>/<prefix>", "schedule_expression": "cron(0 0 ? * * *)", "instance_type": "ml.m5.large", "instance_volume_size": "20", "baseline_max_runtime_seconds": "3300", "monitor_max_runtime_seconds": "1800", "baseline_inference_attribute": "prediction", "baseline_probability_attribute": "probability", "baseline_ground_truth_attribute": "label", "probability_threshold_attribute": "0.5", "problem_type": "BinaryClassification", "monitor_probability_attribute": "0", "monitor_ground_truth_input": "<bucket-name>/<prefix>/<yyyy>/<mm>/<dd>/<hh>" }
      • Model quality monitor pipeline (Regression or MulticlassClassification problem):

        { "pipeline_type": "byom_model_quality_monitor", "model_name": "<my-model-name>", "endpoint_name": "xgb-churn-prediction-endpoint", "baseline_data": "<path/to/baseline_data.csv>", "baseline_job_output_location": "<bucket-name>/<prefix>", "data_capture_location": "<bucket-name>/<prefix>", "monitoring_output_location": "<bucket-name>/<prefix>", "schedule_expression": "cron(0 0 ? * * *)", "instance_type": "ml.m5.large", "instance_volume_size": "20", "baseline_max_runtime_seconds": "3300", "monitor_max_runtime_seconds": "1800", "baseline_inference_attribute": "prediction", "baseline_ground_truth_attribute": "label", "problem_type": "Regression", "monitor_inference_attribute": "0", "monitor_ground_truth_input": "<bucket-name>/<prefix>/<yyyy>/<mm>/<dd>/<hh>" }
      • Model bias monitor pipeline (BinaryClassification problem):

        { "pipeline_type": "byom_model_bias_monitor", "model_name": "<my-model-name>", "endpoint_name": "xgb-churn-prediction-endpoint", "baseline_data": "<path/to/training_data_with_header.csv>", "baseline_job_output_location": "<bucket-name>/<prefix>", "data_capture_location": "<bucket-name>/<prefix>", "monitoring_output_location": "<bucket-name>/<prefix>", "schedule_expression": "cron(0 0 ? * * *)", "instance_type": "ml.m5.large", "instance_volume_size": "20", "baseline_max_runtime_seconds": "3300", "monitor_max_runtime_seconds": "1800", "probability_threshold_attribute": "0.5", "problem_type": "BinaryClassification", "monitor_probability_attribute": "0", "bias_config": { "label_values_or_threshold": "<value>", "facet_name": "<value>", "facet_values_or_threshold": "<value>" }, "model_predicted_label_config": {"probability": 0}, "monitor_ground_truth_input": "<bucket-name>/<prefix>/<yyyy>/<mm>/<dd>/<hh>" }
      • Model bias monitor pipeline (Regression problem):

        { "pipeline_type": "byom_model_bias_monitor", "model_name": "<my-model-name>", "endpoint_name": "xgb-churn-prediction-endpoint", "baseline_data": "<path/to/training_data_with_header.csv>", "baseline_job_output_location": "<bucket-name>/<prefix>", "data_capture_location": "<bucket-name>/<prefix>", "monitoring_output_location": "<bucket-name>/<prefix>", "schedule_expression": "cron(0 0 ? * * *)", "instance_type": "ml.m5.large", "instance_volume_size": "20", "baseline_max_runtime_seconds": "3300", "monitor_max_runtime_seconds": "1800", "problem_type": "Regression", "monitor_inference_attribute": "0", "bias_config": { "label_values_or_threshold": "<value>", "facet_name": "<value>", "facet_values_or_threshold": "<value>" }, "monitor_ground_truth_input": "<bucket-name>/<prefix>/<yyyy>/<mm>/<dd>/<hh>" }
      • Model explainability monitor pipeline (BinaryClassification problem):

        { "pipeline_type": "byom_model_explainability_monitor", "model_name": "<my-model-name>", "endpoint_name": "xgb-churn-prediction-endpoint", "baseline_data": "<path/to/training_data_with_header.csv>", "baseline_job_output_location": "<bucket-name>/<prefix>", "data_capture_location": "<bucket-name>/<prefix>", "monitoring_output_location": "<bucket-name>/<prefix>", "schedule_expression": "cron(0 0 ? * * *)", "instance_type": "ml.m5.large", "instance_volume_size": "20", "baseline_max_runtime_seconds": "3300", "monitor_max_runtime_seconds": "1800", "probability_threshold_attribute": "0.5", "problem_type": "BinaryClassification", "monitor_probability_attribute": "0", "shap_config": { "baseline": "<path/to/shap_baseline_dataset.csv>", "num_samples": "<value>", "agg_method": "mean_abs|mean_sq|median" } }
      • Custom algorithm image builder pipeline:

        { "pipeline_type": "byom_image_builder", "custom_algorithm_docker": "<path/to/custom_image.zip>", "ecr_repo_name": "<name-of-Amazon-ECR-repository>", "image_tag": "<image-tag>" }
      • Model card's create operation:

        { "pipeline_type": "create_model_card", "name": "<model-card-name>", "model_overview": { "model_name": "<name-of-existing-model>", "model_description": "<model description>", "model_version": <version number>, "problem_type": "<type of problem the model solves>", "algorithm_type": "<algorithm name>", "model_creator": "<name of the model creator>", "model_owner": "<model owner>", "model_artifact": ["<model artifact>"], "inference_environment": ["<image used for inference>"] }, "intended_uses": { "purpose_of_model": "<description of purpose of model>", "intended_uses": "<description of intended uses>", "factors_affecting_model_efficiency": "<any factors>", "risk_rating": "Low", "explanations_for_risk_rating": "<risk rating>" }, "training_details": { "training_job_name": "<training job name>", "objective_function": { "function": "<one of Maximize|Minimize>", "facet": "<one of Accuracy|AUC|Loss|MAE|RMSE>", "condition": "<description of any conditions>", "notes": "<any notes>" }, "training_observations": "<any observations>", "training_job_details": { "user_provided_training_metrics": [{"name": "<metric-name>", "value": <metric value>, "notes": "<metric notes>"}] } }, "evaluation_details": [ { "name": "<evaluation name>", "metric_file_s3_url": "<s3 url for the JSON evaluation file in the solution's asset S3 bucket>", "metric_type": "<one of model_card_metric_schema|clarify_bias|clarify_explainability|regression|binary_classification|multiclass_classification>" }, { "name": "<evaluation name>", "evaluation_observation": "<any-observation>", "evaluation_job_arn": "<job-arn>", "datasets": ["<s3 url for training data>"], "metadata": {"key": "value"}, "metric_groups": [{"name": "<group-name>", "metric_data": [{"name": "<metric-name>", "type": "<one of bar_chart|boolean|linear_graph|matrix|number|string>", "value": <value>, "notes": "<metric notes>"}]}] }], "additional_information": { "ethical_considerations": "make sure data is representative", "caveats_and_recommendations": "some recommendations", "custom_details": { "key": "value" } } }
      • Model card's describe operation:

        { "pipeline_type": "describe_model_card", "name": "<model card name>" }
      • Model card's delete operation:

        { "pipeline_type": "delete_model_card", "name": "<model card name>" }
      • Model card's update operation:

        { "pipeline_type": "update_model_card", "name": "<model card name>", "status": "<status>", "training_details":{ "training_job_name": "<training job name>" } }
      • Model card's export operation:

        { "pipeline_type": "export_model_card", "name": "<model card name>" }
      • Model card's list operation:

        { "pipeline_type": "list_model_cards" }
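As a sketch of how these bodies are submitted, the following Python snippet POSTs a model card operation body to `/provisionpipeline` using only the standard library. The endpoint URL and model card name are placeholders, and any authentication the API requires (for example, SigV4 request signing) is omitted:

```python
import json
import urllib.request

def post_provision(api_url: str, body: dict) -> dict:
    """POST a request body to the solution's /provisionpipeline endpoint
    and return the parsed JSON response. Authentication headers, if the
    API requires them, are omitted in this sketch."""
    req = urllib.request.Request(
        f"{api_url}/provisionpipeline",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Example body for the describe operation (the card name is a placeholder):
describe_body = {"pipeline_type": "describe_model_card", "name": "my-model-card"}
```

The other model card operations use the same call with the bodies shown above.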

    Required attributes per pipeline type when the Amazon SageMaker model registry is used. In this case, the following attributes must be modified:

    • Real-time inference and batch pipelines with custom algorithms:

      • Remove custom_image_uri and model_artifact_location

      • Add model_package_name

    • Real-time inference and batch pipelines with Amazon SageMaker built-in algorithms:

      • Remove model_framework, model_framework_version, and model_artifact_location

      • Add model_package_name
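    To illustrate the substitution for the custom-algorithm case, the hypothetical helper below rewrites a request body for use with the model registry (the function name and the model package ARN are placeholders, not part of the solution):

```python
def to_registry_body(body: dict, model_package_name: str) -> dict:
    """Return a copy of a custom-algorithm pipeline request body adapted for
    the SageMaker model registry: custom_image_uri and model_artifact_location
    are dropped, and model_package_name is added."""
    registry_body = {
        k: v
        for k, v in body.items()
        if k not in ("custom_image_uri", "model_artifact_location")
    }
    registry_body["model_package_name"] = model_package_name
    return registry_body

body = {
    "pipeline_type": "byom_realtime_custom",
    "model_name": "my-model",
    "custom_image_uri": "<ecr-image-uri>",
    "model_artifact_location": "<s3-path>",
}
registry_body = to_registry_body(
    body, "arn:aws:sagemaker:<region>:<account-id>:model-package/<group-name>/1"
)
```

    For built-in algorithms, the same substitution removes model_framework, model_framework_version, and model_artifact_location instead.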

    Expected responses of API requests to /provisionpipeline:

    • If the pipeline is provisioned for the first time (that is, no existing pipeline with the same name exists), the response is:

      { "message": "success: stack creation started", "pipeline_id": "arn:aws:cloudformation:<region>:<account-id>:stack/<stack-id>" }
    • If the pipeline is already provisioned, the response is:

      { "message": "Pipeline <stack-name> is already provisioned. Updating template parameters.", "pipeline_id": "arn:aws:cloudformation:<region>:<account-id>:stack/<stack-id>" }
    • If the pipeline is already provisioned, the pipeline_type is byom_image_builder, and there are updates to be performed, the response is:

      { "message": "Pipeline <stack-name> is being updated.", "pipeline_id": "arn:aws:cloudformation:<region>:<account-id>:stack/<stack-id>" }
    • If the pipeline is already provisioned, the pipeline_type is byom_image_builder, and there are no updates to be performed, the response is:

      { "message": "Pipeline <stack-name> is already provisioned. No updates are to be performed.", "pipeline_id": "arn:aws:cloudformation:<region>:<account-id>:stack/<stack-id>" }
    • If the pipeline type is one of the model card operations (create, describe, update, delete, export, and list model cards), the response is:

      { "message": "<message based on the model card operation>" }
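    The messages above can be used to classify a /provisionpipeline response programmatically. A minimal sketch, matching the message strings listed in this section (the function name is illustrative):

```python
def classify_provision_response(response: dict) -> str:
    """Classify a /provisionpipeline response by its message field,
    based on the response messages documented for the endpoint."""
    message = response.get("message", "")
    if message.startswith("success"):
        return "created"    # stack creation started
    if "is being updated" in message:
        return "updating"   # byom_image_builder stack update in progress
    if "already provisioned" in message:
        return "exists"     # stack already exists (parameters may be updated)
    return "other"          # e.g. a model card operation message
```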
    • /pipelinestatus

      • Method: POST

      • Body

        • pipeline_id: The ARN of the created CloudFormation stack after provisioning a pipeline. (This information can be retrieved from /provisionpipeline.)

      • Example structure:

        { "pipeline_id": "arn:aws:cloudformation:us-west-1:123456789123:stack/my-mlops-pipeline/12abcdef-abcd-1234-ab12-abcdef123456" }
    • Expected responses of API requests to /pipelinestatus:

      • The returned response depends on the solution’s option (single- or multi-account deployment). Example response for the single-account option:

        {
          "pipelineName": "<pipeline-name>",
          "pipelineVersion": 1,
          "stageStates": [
            {
              "stageName": "Source",
              "inboundTransitionState": { "enabled": true },
              "actionStates": [
                {
                  "actionName": "S3Source",
                  "currentRevision": { "revisionId": "<version-id>" },
                  "latestExecution": {
                    "actionExecutionId": "<execution-id>",
                    "status": "Succeeded",
                    "summary": "Amazon S3 version id: <id>",
                    "lastStatusChange": "<timestamp>",
                    "externalExecutionId": "<execution-id>"
                  },
                  "entityUrl": "https://console.aws.amazon.com/s3/home?region=<region>#"
                }
              ],
              "latestExecution": { "pipelineExecutionId": "<execution-id>", "status": "Succeeded" }
            },
            {
              "stageName": "DeployCloudFormation",
              "inboundTransitionState": { "enabled": true },
              "actionStates": [
                {
                  "actionName": "deploy_stack",
                  "latestExecution": {
                    "actionExecutionId": "<execution-id>",
                    "status": "Succeeded",
                    "summary": "Stack <pipeline-name> was created.",
                    "lastStatusChange": "<timestamp>",
                    "externalExecutionId": "<stack-id>",
                    "externalExecutionUrl": "<stack-url>"
                  },
                  "entityUrl": "https://console.aws.amazon.com/cloudformation/home?region=<region>#/"
                }
              ],
              "latestExecution": { "pipelineExecutionId": "<execution-id>", "status": "Succeeded" }
            }
          ],
          "created": "<timestamp>",
          "updated": "<timestamp>",
          "ResponseMetadata": {
            "RequestId": "<request-ID>",
            "HTTPStatusCode": 200,
            "HTTPHeaders": {
              "x-amzn-requestid": "<request-id>",
              "date": "<date>",
              "content-type": "application/x-amz-json-1.1",
              "content-length": "<number>"
            },
            "RetryAttempts": 0
          }
        }
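    Because the single-account response carries per-stage execution details, the status of each stage can be extracted with a small helper. A sketch, assuming the response has already been parsed into a Python dict (the function name and sample values are illustrative):

```python
def stage_statuses(status_response: dict) -> dict:
    """Map each stage name in a single-account /pipelinestatus response
    to the status of its latest execution."""
    return {
        stage["stageName"]: stage.get("latestExecution", {}).get("status", "Unknown")
        for stage in status_response.get("stageStates", [])
    }

sample = {
    "pipelineName": "my-mlops-pipeline",
    "stageStates": [
        {"stageName": "Source", "latestExecution": {"status": "Succeeded"}},
        {"stageName": "DeployCloudFormation", "latestExecution": {"status": "InProgress"}},
    ],
}
# stage_statuses(sample) -> {"Source": "Succeeded", "DeployCloudFormation": "InProgress"}
```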

    You can use the following API method to get inferences from the deployed real-time inference pipeline. The Amazon API Gateway URL can be found in the outputs of the pipeline’s AWS CloudFormation stack.

    • /inference

      • Method: POST

      • Body

        • payload: The data to be sent for inference.

        • content_type: MIME content type for the payload.

          { "payload": "1.0, 2.0, 3.2", "content_type": "text/csv" }
      • Expected responses of API requests to /inference:

        • The request returns a single prediction value if one data point was sent in the request, and multiple prediction values (separated by “,”) if several data points were sent in the API request.
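A sketch of building the request body and splitting the comma-separated predictions. The newline-per-row layout for multiple CSV data points is an assumption, and both helper names are illustrative:

```python
def inference_body(rows) -> dict:
    """Build a /inference request body for CSV data. Each row is one data
    point; multiple rows are assumed to be newline-separated in the payload."""
    payload = "\n".join(",".join(str(v) for v in row) for row in rows)
    return {"payload": payload, "content_type": "text/csv"}

def parse_predictions(response_text: str) -> list:
    """Split a comma-separated /inference response into one float per data point."""
    return [float(v) for v in response_text.split(",")]

body = inference_body([[1.0, 2.0, 3.2]])
# body == {"payload": "1.0,2.0,3.2", "content_type": "text/csv"}
```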

API responses with error messages:

  • If an API request to any one of the solution’s API endpoints results in an exception/error, the expected body of the API response is:

    { "message": "<general error message>", "detailedMessage": "<detailed error message>" }
  • The detailedMessage attribute in the body of the API response is only included if the solution was configured to allow detailed error messages. Refer to the template’s parameters table for more details.
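  Since detailedMessage is only present when detailed error messages are enabled, clients should read it defensively. A minimal sketch (the helper name is illustrative):

```python
def error_messages(error_body: dict):
    """Return (message, detailedMessage) from a solution API error response.
    detailedMessage is None unless the solution was configured to return
    detailed error messages."""
    return error_body.get("message"), error_body.get("detailedMessage")

# error_messages({"message": "bad request"}) -> ("bad request", None)
```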