Run Batch Transforms with Inference Pipelines

To get inferences on an entire dataset, run a batch transform on a trained model. You can use the same inference pipeline model that you created and deployed to an endpoint for real-time processing. In a batch transform job, SageMaker downloads the input data from Amazon S3 and sends it in one or more HTTP requests to the inference pipeline model. For an example that shows how to prepare data for a batch transform, see "Section 2 - Preprocess the raw housing data using Scikit Learn" of the Amazon SageMaker Multi-Model Endpoints using Linear Learner sample notebook. For information about Amazon SageMaker batch transforms, see Use Batch Transform.
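For example, a minimal sketch of staging an input CSV file in Amazon S3 for the transform job might look like the following. The local file name and S3 prefix are hypothetical placeholders; S3Uploader is part of the SageMaker Python SDK.

import sagemaker
from sagemaker.s3 import S3Uploader

session = sagemaker.Session()

# Upload a local CSV file (hypothetical name) under a hypothetical
# prefix in the session's default bucket.
input_s3_uri = S3Uploader.upload(
    local_path="validation_data.csv",
    desired_s3_uri="s3://{}/{}".format(session.default_bucket(), "batch-input"),
    sagemaker_session=session,
)
print(input_s3_uri)  # use this URI as the transform job's input data path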

Note

To use custom Docker images in a pipeline that includes Amazon SageMaker built-in algorithms, you need an Amazon Elastic Container Registry (ECR) policy. Your Amazon ECR repository must grant SageMaker permission to pull the image. For more information, see Troubleshoot Amazon ECR Permissions for Inference Pipelines.
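For reference, a repository policy of the following shape grants the SageMaker service permission to pull an image. This sketch applies the policy with boto3; the repository name is a placeholder.

import json

import boto3

ecr = boto3.client("ecr")

# Repository policy that lets the SageMaker service pull the image.
policy = {
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "AllowSageMakerToPull",
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": [
                "ecr:GetDownloadUrlForLayer",
                "ecr:BatchGetImage",
                "ecr:BatchCheckLayerAvailability",
            ],
        }
    ],
}

ecr.set_repository_policy(
    repositoryName="my-custom-image",  # placeholder repository name
    policyText=json.dumps(policy),
)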

The following example shows how to run a transform job using the Amazon SageMaker Python SDK. In this example, model_name is the name of the inference pipeline model that combines the SparkML and XGBoost models created in previous examples. The Amazon S3 location specified by input_data_path contains the input data, in CSV format, to be downloaded and sent to the SparkML model. After the transform job has finished, the Amazon S3 location specified by output_data_path contains the output data returned by the XGBoost model, also in CSV format.

import sagemaker

# model_name and default_bucket are defined in the previous examples.
# The SDK expects a MIME type string for the content type and accept headers.
CONTENT_TYPE_CSV = 'text/csv'

input_data_path = 's3://{}/{}/{}'.format(default_bucket, 'key', 'file_name')
output_data_path = 's3://{}/{}'.format(default_bucket, 'key')

transform_job = sagemaker.transformer.Transformer(
    model_name=model_name,
    instance_count=1,
    instance_type='ml.m4.xlarge',
    strategy='SingleRecord',       # send one record per request
    assemble_with='Line',          # join output records with newlines
    output_path=output_data_path,
    base_transform_job_name='inference-pipelines-batch',
    sagemaker_session=sagemaker.Session(),
    accept=CONTENT_TYPE_CSV)

transform_job.transform(data=input_data_path,
                        content_type=CONTENT_TYPE_CSV,
                        split_type='Line')  # split input on newlines
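Depending on the SDK version, transform() can return before the job completes. A minimal follow-up, assuming the SageMaker Python SDK's Transformer.wait() and S3Downloader helpers, blocks until the job finishes and then lists the output objects (batch transform appends a .out suffix to each input file name):

from sagemaker.s3 import S3Downloader

# Block until the transform job completes.
transform_job.wait()

# List the result objects written under output_data_path.
for uri in S3Downloader.list(output_data_path):
    print(uri)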