Run batch inference jobs
Batch inferencing, also known as offline inferencing, generates model predictions on a
batch of observations. Batch inference is a good option for large datasets or if you don't
need an immediate response to a model prediction request. By contrast, online inference (real-time inferencing) generates predictions in real time. You can make batch inferences from an Autopilot model using the SageMaker Python SDK
The following tabs show three options for deploying your model: using the SageMaker APIs, using the Autopilot UI, or using the APIs to deploy from a different account than the one in which the model was created. These instructions assume that you have already created a model in Autopilot. If you don't have a model, see Create Regression or Classification Jobs for Tabular Data Using the AutoML API. To see examples for each option, open each tab.
The Autopilot UI contains helpful dropdown menus, toggles, tooltips, and more to help you navigate through model deployment.
The following steps show how to deploy a model from an Autopilot experiment for batch predictions.
1. Sign in to the Amazon SageMaker console at https://console.aws.amazon.com/sagemaker/.

2. On the left navigation pane, choose Studio.
3. Under Get started, select the Domain that you want to launch the Studio application in. If your user profile only belongs to one Domain, you do not see the option for selecting a Domain.

4. Select the user profile that you want to launch the Studio Classic application for. If there is no user profile in the Domain, choose Create user profile. For more information, see Add user profiles.

5. Choose Launch Studio. If the user profile belongs to a shared space, choose Open Spaces.

6. When the SageMaker Studio Classic console opens, choose the Launch SageMaker Studio button.
7. Select AutoML from the left navigation pane.

8. Under Name, select the Autopilot experiment corresponding to the model that you want to deploy. This opens a new AUTOPILOT JOB tab.

9. In the Model name section, select the model that you want to deploy.

10. Choose Deploy model. This opens a new tab.

11. Choose Make batch predictions at the top of the page.
12. For Batch transform job configuration, enter the Instance type, Instance count, and other optional information.

13. In the Input data configuration section, open the dropdown menu.

   - For S3 data type, choose ManifestFile or S3Prefix.
   - For Split type, choose Line, RecordIO, TFRecord, or None.
   - For Compression, choose Gzip or None.

14. For S3 location, enter the Amazon S3 bucket location of the input data and other optional information.

15. Under Output data configuration, enter the S3 bucket for the output data, and choose how to assemble the output of your job.

   - For Additional configuration (optional), you can enter a MIME type and an S3 encryption key.
16. For Input/output filtering and data joins (optional), enter a JSONPath expression to filter your input data, join the input source data with your output data, and enter a JSONPath expression to filter your output data.

   - For examples of each type of filter, see the DataProcessing API and the sketch that follows these steps.

17. To perform batch predictions on your input dataset, select Create batch transform job. A new Batch Transform Jobs tab appears.

18. In the Batch Transform Jobs tab, locate the name of your job in the Status section and check the progress of the job.
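The filtering and join fields in the optional step above correspond to the DataProcessing settings of a transform job. The following is a minimal sketch of those settings as a Python dictionary; the JSONPath expressions and the choice to join on the input are illustrative assumptions, not values from this guide.

# Sketch of DataProcessing settings for input/output filtering and data joins.
# The expressions below are placeholders chosen for illustration.
data_processing = {
    "InputFilter": "$[1:]",     # assumption: drop the first input column (for example, an ID column)
    "JoinSource": "Input",      # join each prediction with the input record that produced it
    "OutputFilter": "$[0,-1]",  # assumption: keep the first input column and the prediction
}

You pass these settings as the DataProcessing parameter when creating a transform job, as shown in the API examples in the next tab.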
To use the SageMaker APIs for batch inferencing, there are three steps:
1. Obtain candidate definitions

Candidate definitions from InferenceContainers are used to create a SageMaker model.

The following AWS CLI command shows how to use the DescribeAutoMLJob API to obtain the candidate definitions for the best model candidate.
aws sagemaker describe-auto-ml-job --auto-ml-job-name <job-name> --region <region>
Use the ListCandidatesForAutoMLJob API to list all candidates. See the following AWS CLI command as an example.
aws sagemaker list-candidates-for-auto-ml-job --auto-ml-job-name <job-name> --region <region>
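If you use the AWS SDK for Python (Boto3) instead of the AWS CLI, the equivalent lookups look roughly like the following sketch; the Region and job name are placeholders.

import boto3

# Placeholders: replace the Region and AutoML job name with your own values.
sm = boto3.client("sagemaker", region_name="<region>")

# Describe the AutoML job and read the inference containers of the best candidate.
job = sm.describe_auto_ml_job(AutoMLJobName="<job-name>")
inference_containers = job["BestCandidate"]["InferenceContainers"]

# Or list every candidate that the job produced.
candidates = sm.list_candidates_for_auto_ml_job(AutoMLJobName="<job-name>")["Candidates"]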
2. Create a SageMaker model

To create a SageMaker model using the CreateModel API, use the container definitions from the previous step. See the following AWS CLI command as an example.
aws sagemaker create-model --model-name '<your-custom-model-name>' \
    --containers '[<container-definition1>, <container-definition2>, <container-definition3>]' \
    --execution-role-arn '<execution-role-arn>' --region '<region>'
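A rough Boto3 equivalent, reusing the sm client and the inference_containers list from the previous step (the model name and role ARN are placeholders):

# Create a SageMaker model from the best candidate's container definitions.
sm.create_model(
    ModelName="<your-custom-model-name>",
    Containers=inference_containers,
    ExecutionRoleArn="<execution-role-arn>",
)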
3. Create a SageMaker transform job

The following AWS CLI command creates a SageMaker transform job with the CreateTransformJob API.
aws sagemaker create-transform-job --transform-job-name '<your-custom-transform-job-name>' \
    --model-name '<your-custom-model-name-from-last-step>' \
    --transform-input '{
          "DataSource": {
            "S3DataSource": {
              "S3DataType": "S3Prefix",
              "S3Uri": "<your-input-data>"
            }
          },
          "ContentType": "text/csv",
          "SplitType": "Line"
        }' \
    --transform-output '{
          "S3OutputPath": "<your-output-path>",
          "AssembleWith": "Line"
        }' \
    --transform-resources '{
          "InstanceType": "<instance-type>",
          "InstanceCount": 1
        }' \
    --region '<region>'
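The corresponding Boto3 call is sketched below, again reusing the sm client from the earlier sketch; the instance type is an assumption, and the names and S3 paths are placeholders.

# Start a batch transform job that reads CSV input line by line and
# assembles the predictions line by line in the output.
sm.create_transform_job(
    TransformJobName="<your-custom-transform-job-name>",
    ModelName="<your-custom-model-name-from-last-step>",
    TransformInput={
        "DataSource": {
            "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": "<your-input-data>"}
        },
        "ContentType": "text/csv",
        "SplitType": "Line",
    },
    TransformOutput={"S3OutputPath": "<your-output-path>", "AssembleWith": "Line"},
    TransformResources={"InstanceType": "ml.m5.xlarge", "InstanceCount": 1},  # assumed instance type
)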
Check the progress of your transform job using the DescribeTransformJob API. See the following AWS CLI command as an example.
aws sagemaker describe-transform-job --transform-job-name '<your-custom-transform-job-name>' --region <region>
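In Boto3, you can read the job status from DescribeTransformJob, or block until the job finishes with the built-in waiter. The sketch below assumes the sm client from the earlier sketch; the job name is a placeholder.

# Read the current status of the transform job.
status = sm.describe_transform_job(
    TransformJobName="<your-custom-transform-job-name>"
)["TransformJobStatus"]

# Or wait until the job completes or stops.
waiter = sm.get_waiter("transform_job_completed_or_stopped")
waiter.wait(TransformJobName="<your-custom-transform-job-name>")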
After the job is finished, the predicted result will be available in <your-output-path>. The output file name has the following format: <input_data_file_name>.out. As an example, if your input file is text_x.csv, the output name will be text_x.csv.out.
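Because of this naming convention, you can download a prediction file directly from Amazon S3 after the job completes. The following sketch assumes a hypothetical output bucket and prefix, using the text_x.csv example above.

# Read the predictions for text_x.csv from the transform job's output location.
# The bucket and prefix below are hypothetical placeholders.
s3 = boto3.client("s3")
obj = s3.get_object(Bucket="<your-output-bucket>", Key="<your-output-prefix>/text_x.csv.out")
predictions = obj["Body"].read().decode("utf-8")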
The preceding steps can also be performed with the SageMaker Python SDK or the AWS SDK for Python (Boto3) instead of the AWS CLI.

To create a batch inferencing job in a different account than the one that the model was generated in, follow the instructions in Deploy models from different accounts. Then you can create models and transform jobs by following the steps in Deploy using SageMaker APIs.