Container Contract Inputs
The Amazon SageMaker Model Monitor platform invokes your container code according to a specified schedule. If you choose to write your own container code, the following environment variables are available. In your container code, you can analyze the current dataset, optionally evaluate the constraints, and emit metrics where applicable.
The available environment variables are the same for real-time endpoints and batch transform jobs, except for the dataset_format variable. If you are using a real-time endpoint, the dataset_format variable supports the following options:
{\"sagemakerCaptureJson\": {\"captureIndexNames\": [\"endpointInput\",\"endpointOutput\"]}}
If you are using a batch transform job, the dataset_format variable supports the following options:
{\"csv\": {\"header\": [\"true\",\"false\"]}}
{\"json\": {\"line\": [\"true\",\"false\"]}}
{\"parquet\": {}}
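Because dataset_format arrives as a JSON string, container code typically has to decode it before choosing how to read the data. The following is a minimal sketch, assuming the value is a JSON object with exactly one top-level key, as in the options listed above. The helper name parse_dataset_format is hypothetical and not part of any SageMaker API.

```python
import json
from typing import Any, Dict, Tuple


def parse_dataset_format(raw: str) -> Tuple[str, Dict[str, Any]]:
    """Split a dataset_format JSON string into (format name, options)."""
    parsed = json.loads(raw)
    # Assumption: each value is an object with exactly one top-level key,
    # such as "sagemakerCaptureJson", "csv", "json", or "parquet".
    if not isinstance(parsed, dict) or len(parsed) != 1:
        raise ValueError(f"unexpected dataset_format value: {raw!r}")
    name, options = next(iter(parsed.items()))
    return name, options


# Values shaped like the options listed above.
realtime = parse_dataset_format(
    '{"sagemakerCaptureJson": {"captureIndexNames": ["endpointInput", "endpointOutput"]}}'
)
batch = parse_dataset_format('{"parquet": {}}')
print(realtime[0], realtime[1]["captureIndexNames"])
print(batch)
```

Inside a container, the input would presumably come from os.environ["dataset_format"] rather than a literal string.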
The following code sample shows the complete set of environment variables available to your container code. It uses the dataset_format value for a real-time endpoint.
"Environment": {
    "dataset_format": "{\"sagemakerCaptureJson\": {\"captureIndexNames\": [\"endpointInput\",\"endpointOutput\"]}}",
    "dataset_source": "/opt/ml/processing/endpointdata",
    "end_time": "2019-12-01T16:20:00Z",
    "output_path": "/opt/ml/processing/resultdata",
    "publish_cloudwatch_metrics": "Disabled",
    "sagemaker_endpoint_name": "endpoint-name",
    "sagemaker_monitoring_schedule_name": "schedule-name",
    "start_time": "2019-12-01T15:20:00Z"
}
Parameters
Parameter Name | Description |
---|---|
dataset_format | For a job started from a |
dataset_source | If you are using a real-time endpoint, the local path in which the data corresponding to the monitoring period, as specified by We sometimes download more than what is specified by the start and end times. It is up to the container code to parse the data as required. |
output_path | The local path to write output reports and other files. You specify this parameter in the |
publish_cloudwatch_metrics | For a job launched by |
sagemaker_endpoint_name | If you are using a real-time endpoint, the name of the |
sagemaker_monitoring_schedule_name | The name of the |
sagemaker_endpoint_datacapture_prefix | If you are using a real-time endpoint, the prefix specified in the |
start_time, end_time | The time window for this analysis run. For example, for a job scheduled to run at 05:00 UTC and a job that runs on 20/02/2020, |
baseline_constraints | The local path of the baseline constraint file specified in |
baseline_statistics | The local path to the baseline statistics file specified in |
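The dataset_source description notes that more data may be downloaded than the start and end times specify, and that the container code must parse the data as required. The following is a minimal sketch of one way to read start_time and end_time and keep only records inside the window. It rests on several assumptions: timestamps look like 2019-12-01T15:20:00Z, the window is treated as half-open (start inclusive, end exclusive), and each record is a hypothetical (timestamp, payload) pair rather than the actual captured-data layout. The function names are illustrative only.

```python
import os
from datetime import datetime, timezone
from typing import Iterable, List, Mapping, Optional, Tuple

# Assumption: start_time and end_time use this UTC layout.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_utc(value: str) -> datetime:
    """Parse a 'Z'-suffixed timestamp into a timezone-aware UTC datetime."""
    return datetime.strptime(value, TIME_FORMAT).replace(tzinfo=timezone.utc)


def load_window(env: Optional[Mapping[str, str]] = None) -> Tuple[datetime, datetime]:
    """Read the analysis window from the environment (or a supplied mapping)."""
    env = os.environ if env is None else env
    return parse_utc(env["start_time"]), parse_utc(env["end_time"])


def filter_to_window(
    records: Iterable[Tuple[str, str]], start: datetime, end: datetime
) -> List[Tuple[str, str]]:
    """Keep records whose timestamp falls in [start, end)."""
    # Assumption: the window is half-open; adjust if your schedule differs.
    return [(ts, payload) for ts, payload in records if start <= parse_utc(ts) < end]


# A stand-in for os.environ so the sketch runs outside a container.
sample_env = {"start_time": "2019-12-01T15:20:00Z", "end_time": "2019-12-01T16:20:00Z"}
start, end = load_window(sample_env)

# Hypothetical records; real captured data has its own structure.
records = [
    ("2019-12-01T15:10:00Z", "before window"),
    ("2019-12-01T15:45:00Z", "inside window"),
    ("2019-12-01T16:20:00Z", "at end_time"),
]
kept = filter_to_window(records, start, end)
print(kept)
```

Calling load_window() with no argument reads from os.environ, which is how it would presumably run inside the container. Results would then be written under the directory given by output_path.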