Use the following code examples to request inferences from your deployed service based on the framework you used to train your model. The code examples for the different frameworks are similar. The main difference is that TensorFlow requires application/json as the content type.
PyTorch and MXNet
If you are using PyTorch v1.4 or later or MXNet 1.7.0 or later and you have an Amazon SageMaker AI endpoint InService, you can make inference requests using the predictor package of the SageMaker AI SDK for Python.
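Before sending requests, you can confirm that the endpoint is InService by describing it with the AWS SDK for Python (Boto3). The following is a minimal sketch; the endpoint name is a placeholder:

import boto3

# Check the endpoint status before sending inference requests
sagemaker_client = boto3.client('sagemaker')
response = sagemaker_client.describe_endpoint(EndpointName='insert name of your endpoint here')
print(response['EndpointStatus'])  # Should print 'InService'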
Note
The API varies based on the SageMaker AI SDK for Python version:
- For version 1.x, use the RealTimePredictor and Predict API.
- For version 2.x, use the Predictor and Predict API.
The following code example shows how to use these APIs to send an image for inference:
from sagemaker.predictor import RealTimePredictor

endpoint = 'insert name of your endpoint here'

# Read image into memory
with open("image.jpg", 'rb') as f:
    payload = f.read()

predictor = RealTimePredictor(endpoint=endpoint, content_type='application/x-image')
inference_response = predictor.predict(data=payload)
print(inference_response)
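With version 2.x of the SageMaker AI SDK for Python, the same request can be made with the Predictor class; the content type is supplied through a serializer rather than a constructor argument. The following is a minimal sketch along the lines of the version 1.x example above:

from sagemaker.predictor import Predictor
from sagemaker.serializers import IdentitySerializer

endpoint = 'insert name of your endpoint here'

# Read image into memory
with open("image.jpg", 'rb') as f:
    payload = f.read()

# IdentitySerializer passes the raw bytes through and sets the content type
predictor = Predictor(endpoint, serializer=IdentitySerializer(content_type='application/x-image'))
inference_response = predictor.predict(data=payload)
print(inference_response)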
TensorFlow
The following code example shows how to use the SageMaker Python SDK API to send an image for inference:
from sagemaker.predictor import Predictor
from sagemaker.serializers import IdentitySerializer
from PIL import Image
import numpy as np
import json

endpoint = 'insert the name of your endpoint here'
input_file = 'image.jpg'  # Path to your input image

# Read image into memory and preprocess it to the model's expected input
image = Image.open(input_file)
batch_size = 1
image = np.asarray(image.resize((224, 224)))
image = image / 128 - 1  # Scale pixel values to approximately [-1, 1)
image = np.concatenate([image[np.newaxis, :, :]] * batch_size)
body = json.dumps({"instances": image.tolist()})

# TensorFlow requires application/json as the content type
predictor = Predictor(endpoint, serializer=IdentitySerializer(content_type='application/json'))
inference_response = predictor.predict(data=body)
print(inference_response)
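By default, predict returns the raw response body as bytes. Continuing the example above, and assuming a TensorFlow Serving-style response with a "predictions" key (the exact shape depends on your model), you could decode it as follows:

# Decode the raw bytes; the "predictions" key follows the TensorFlow
# Serving convention and may differ for your model
result = json.loads(inference_response.decode('utf-8'))
print(np.argmax(result["predictions"][0]))  # Index of the highest-scoring class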