Step 5: Deploy the Model to Amazon EC2 - Amazon SageMaker

To get predictions, deploy your model to Amazon EC2 using Amazon SageMaker.

Deploy the Model to SageMaker Hosting Services

To host a model through Amazon EC2 using Amazon SageMaker, deploy the model that you trained in Create and Run a Training Job by calling the deploy method of the xgb_model estimator. When you call the deploy method, you must specify the number and type of EC2 ML instances that you want to use for hosting an endpoint.

import sagemaker
from sagemaker.serializers import CSVSerializer

xgb_predictor = xgb_model.deploy(
    initial_instance_count=1,
    instance_type='ml.t2.medium',
    serializer=CSVSerializer()
)
  • initial_instance_count (int) – The number of instances on which to deploy the model.

  • instance_type (str) – The type of instance on which to run your deployed model.

  • serializer (BaseSerializer) – An object that serializes input data of various formats (a NumPy array, list, file, or buffer) to a CSV-formatted string. This example uses CSVSerializer because the XGBoost algorithm accepts input data in CSV format.
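To make the role of the serializer concrete, the following minimal sketch shows the kind of transformation a CSV serializer performs on a list of rows. It uses only the Python standard library and is not the SageMaker SDK implementation; the rows_to_csv helper and the sample values are hypothetical and exist only for illustration.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize a list of rows (each a list of values) to a CSV string.

    Illustrative stand-in for the output of a CSV serializer; this is
    not the SageMaker SDK implementation.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    # Drop the trailing newline so the payload ends with the last value.
    return buffer.getvalue().rstrip("\n")


# Hypothetical numeric rows standing in for feature vectors.
payload = rows_to_csv([[39, 13, 2174], [50, 13, 0]])
print(payload)
```

Each row becomes one comma-separated line, which is the shape of request body that a CSV-consuming endpoint expects.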

The deploy method creates a deployable model, configures the SageMaker hosting services endpoint, and launches the endpoint to host the model. For more information, see the deploy method of the SageMaker generic Estimator class in the Amazon SageMaker Python SDK. To retrieve the name of the endpoint that's generated by the deploy method, run the following code:

xgb_predictor.endpoint_name

This should return the endpoint name of the xgb_predictor. The format of the endpoint name is "sagemaker-xgboost-YYYY-MM-DD-HH-MM-SS-SSS". The endpoint stays active on the ML instance, so you can make real-time predictions at any time until you shut it down. Copy this endpoint name and save it so that you can reuse it to make real-time predictions elsewhere in SageMaker Studio or SageMaker notebook instances.
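Before reusing a saved endpoint name, it can help to check that it matches the default format quoted above. The following sketch is one possible check, assuming the name follows that format exactly; the endpoint_creation_time helper and the example name are hypothetical and are not part of the SageMaker SDK.

```python
import re
from datetime import datetime

# Pattern derived from the default name format quoted above:
# sagemaker-xgboost-YYYY-MM-DD-HH-MM-SS-SSS
ENDPOINT_NAME_PATTERN = re.compile(
    r"^sagemaker-xgboost-(\d{4})-(\d{2})-(\d{2})-(\d{2})-(\d{2})-(\d{2})-(\d{3})$"
)


def endpoint_creation_time(endpoint_name):
    """Return the timestamp embedded in a default endpoint name, or None."""
    match = ENDPOINT_NAME_PATTERN.match(endpoint_name)
    if match is None:
        return None
    year, month, day, hour, minute, second, millis = map(int, match.groups())
    try:
        return datetime(year, month, day, hour, minute, second, millis * 1000)
    except ValueError:
        # The digits matched the pattern but do not form a valid date.
        return None


# Hypothetical endpoint name used only for illustration.
print(endpoint_creation_time("sagemaker-xgboost-2023-01-15-10-30-45-123"))
```

A return value of None signals that the string is not a default-format endpoint name, for example because a custom name was supplied at deployment.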


To learn more about compiling and optimizing your model for deployment to Amazon EC2 instances or edge devices, see Compile and Deploy Models with Neo.

(Optional) Use SageMaker Predictor to Reuse the Hosted Endpoint

After you deploy the model to an endpoint, you can set up a new SageMaker predictor by pairing it with the endpoint, and continue to make real-time predictions from any other notebook. The following example code shows how to use the SageMaker Predictor class to set up a new predictor object using the same endpoint. Reuse the endpoint name that you used for the xgb_predictor.

import sagemaker

xgb_predictor_reuse = sagemaker.predictor.Predictor(
    endpoint_name="sagemaker-xgboost-YYYY-MM-DD-HH-MM-SS-SSS",
    sagemaker_session=sagemaker.Session(),
    serializer=sagemaker.serializers.CSVSerializer()
)

The xgb_predictor_reuse Predictor behaves exactly the same as the original xgb_predictor. For more information, see the SageMaker Predictor class in the Amazon SageMaker Python SDK.
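Once a predictor is available, a common pattern is to send rows in small chunks and collect the scores. The following is a minimal sketch of that pattern, not the SDK's own method. It assumes that predict() returns bytes holding comma- or newline-separated numbers, which is typical when no deserializer is configured but should be verified for your container. The predict_in_batches helper and the FakePredictor stand-in are hypothetical, so the example runs without an AWS account.

```python
def predict_in_batches(predictor, rows, batch_size=500):
    """Send rows to an endpoint in chunks and return predictions as floats.

    Assumes predictor.predict() returns bytes that hold comma- or
    newline-separated numbers.
    """
    predictions = []
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        text = predictor.predict(chunk).decode("utf-8")
        # Accept either separator, because the response layout is an assumption.
        for token in text.replace("\n", ",").split(","):
            if token.strip():
                predictions.append(float(token))
    return predictions


class FakePredictor:
    """Offline stand-in for a Predictor; returns 0.5 for every input row."""

    def predict(self, data):
        return ",".join("0.5" for _ in data).encode("utf-8")


scores = predict_in_batches(FakePredictor(), [[1, 2], [3, 4], [5, 6]], batch_size=2)
print(scores)
```

In a notebook with a live endpoint, you would pass xgb_predictor or xgb_predictor_reuse in place of the FakePredictor instance.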

(Optional) Make Predictions with Batch Transform

Instead of hosting an endpoint in production, you can run a one-time batch inference job to make predictions on a test dataset using SageMaker batch transform. After your model training has completed, you can create a transformer object from the estimator. The transformer object is based on the SageMaker Transformer class. The batch transformer reads input data from a specified S3 bucket and makes predictions.

To run a batch transform job
  1. Run the following code to convert the feature columns of the test dataset to a CSV file and upload it to the S3 bucket:

    import os
    import boto3

    X_test.to_csv('test.csv', index=False, header=False)
    boto3.Session().resource('s3').Bucket(bucket).Object(
        os.path.join(prefix, 'test/test.csv')).upload_file('test.csv')
  2. Specify the S3 bucket URIs of the input and output for the batch transform job, as shown in the following code:

    # The location of the test dataset
    batch_input = 's3://{}/{}/test'.format(bucket, prefix)

    # The location to store the results of the batch transform job
    batch_output = 's3://{}/{}/batch-prediction'.format(bucket, prefix)
  3. Create a transformer object, specifying the minimum set of parameters: instance_count and instance_type to run the batch transform job, and output_path to save the prediction data, as shown in the following code:

    transformer = xgb_model.transformer(
        instance_count=1,
        instance_type='ml.m4.xlarge',
        output_path=batch_output
    )
  4. Initiate the batch transform job by calling the transform() method of the transformer object, as shown in the following code:

    transformer.transform(
        data=batch_input,
        data_type='S3Prefix',
        content_type='text/csv',
        split_type='Line'
    )
    transformer.wait()
  5. When the batch transform job is complete, SageMaker saves the test.csv.out prediction data in the batch_output path, which should be in the following format: s3://sagemaker-<region>-111122223333/demo-sagemaker-xgboost-adult-income-prediction/batch-prediction. Run the following AWS CLI command to download the output data of the batch transform job:

    ! aws s3 cp {batch_output} ./ --recursive

    This should create the test.csv.out file in the current working directory. The file contains the float values that are predicted based on the logistic regression of the XGBoost training job.
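After the download, the float values usually need to be turned into class labels. The following sketch shows one way to do that, assuming the output file holds one score per line and that 0.5 is an acceptable cutoff; both are assumptions to verify against your own output. The load_scores and to_labels helpers and the sample content are hypothetical.

```python
def load_scores(lines):
    """Parse one float score per non-empty line of batch transform output."""
    return [float(line) for line in lines if line.strip()]


def to_labels(scores, threshold=0.5):
    """Map each score to 1 when it is at or above the threshold, else 0."""
    return [1 if score >= threshold else 0 for score in scores]


# Hypothetical sample content standing in for test.csv.out.
sample_output = "0.0271\n0.9134\n0.5000\n"
scores = load_scores(sample_output.splitlines())
labels = to_labels(scores)
print(labels)
```

To process the real file, open test.csv.out and pass the file object to load_scores, because iterating over a text file yields one line at a time. The threshold is a design choice: raising it trades recall for precision on the positive class.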