PyTorch Elastic Inference with Python
The Amazon Elastic Inference enabled version of PyTorch lets you use Elastic Inference seamlessly, with few changes to your PyTorch code. The following tutorial shows how to perform inference using an Elastic Inference accelerator.
Elastic Inference enabled PyTorch is only available with Amazon Deep Learning Containers version 27 and later.
Install Elastic Inference Enabled PyTorch
Preinstalled Elastic Inference Enabled PyTorch
The Elastic Inference enabled packages are available in the AWS Deep Learning AMI.
Installing Elastic Inference Enabled PyTorch
If you're not using an AWS Deep Learning AMI instance, you can download the packages from the Amazon S3 bucket.
Activate the PyTorch Elastic Inference Environment
If you are using the AWS Deep Learning AMI, activate the Python 3 Elastic Inference enabled PyTorch environment. Python 2 is not supported for Elastic Inference enabled PyTorch.
For Python 3, run the following to activate the environment:
source activate amazonei_pytorch_p36
If you are using a different AMI or a container, access the environment where PyTorch is installed.
The remaining parts of this guide assume you are using the amazonei_pytorch_p36 environment. If you are switching from the MXNet or TensorFlow Elastic Inference environments, you must stop and then start your instance to reattach the Elastic Inference accelerator. Rebooting is not sufficient because switching frameworks requires a complete shutdown.
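A script can confirm that it is running in the expected environment before it attempts inference. The following is a minimal sketch, assuming the environment was activated through conda, which exports the active environment name as CONDA_DEFAULT_ENV. The helper name in_expected_env is hypothetical and is not part of Elastic Inference or PyTorch.

```python
import os

EXPECTED_ENV = "amazonei_pytorch_p36"  # environment name used in this guide


def in_expected_env(environ=None, expected=EXPECTED_ENV):
    """Return True if the active conda environment matches `expected`.

    Assumption: conda exports the active environment name as
    CONDA_DEFAULT_ENV. A container or custom AMI may not set it at all.
    """
    env = os.environ if environ is None else environ
    return env.get("CONDA_DEFAULT_ENV") == expected


if __name__ == "__main__":
    if in_expected_env():
        print("Running in", EXPECTED_ENV)
    else:
        print("Not in", EXPECTED_ENV, "- run: source activate", EXPECTED_ENV)
```

If you are using a different AMI or a container, this check may report a mismatch even when Elastic Inference enabled PyTorch is installed, so treat it as a convenience rather than a guarantee.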
Use Elastic Inference with PyTorch for inference
With Elastic Inference enabled PyTorch, the inference API is largely unchanged. However, you must use the with torch.jit.optimized_execution() context to trace or script your models into TorchScript, then perform inference.
Run Inference with a ResNet-50 Model
To run inference using Elastic Inference enabled PyTorch, do the following.
-
Download a picture of a cat to your current directory.
curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
-
Download a list of ImageNet class mappings to your current directory.
wget https://aws-dlc-sample-models.s3.amazonaws.com/pytorch/imagenet_classes.txt
-
Use the built-in EI Tool to get the device ordinal number of all attached Elastic Inference accelerators. For more information on EI Tool, see Monitoring Elastic Inference Accelerators.
/opt/amazon/ei/ei_tools/bin/ei describe-accelerators --json
Your output should look like the following:
{
  "ei_client_version": "1.5.0",
  "time": "Fri Nov 1 03:09:38 2019",
  "attached_accelerators": 2,
  "devices": [
    {
      "ordinal": 0,
      "type": "eia1.xlarge",
      "id": "eia-679e4c622d584803aed5b42ab6a97706",
      "status": "healthy"
    },
    {
      "ordinal": 1,
      "type": "eia1.xlarge",
      "id": "eia-6c414c6ee37a4d93874afc00825c2f28",
      "status": "healthy"
    }
  ]
}
You use the device ordinal of your desired Elastic Inference accelerator to run inference.
-
Use your preferred text editor to create a script that has the following content. Name it pytorch_resnet50_inference.py. This script uses ImageNet pretrained TorchVision model weights for ResNet-50, a popular convolutional neural network for image classification. It traces the model with an image tensor and saves it. The script then loads the saved model, performs inference on the input, and prints out the top predicted ImageNet classes.
This script uses the torch.jit.optimized_execution context, which is necessary to use the Elastic Inference accelerator. If you don't use the torch.jit.optimized_execution context correctly, then inference will run entirely on the client instance and won't use the attached accelerator. The Elastic Inference enabled PyTorch framework accepts two parameters for this context, while the vanilla PyTorch framework accepts only one parameter. The second parameter is used to specify the accelerator device ordinal. target_device should be set to the device's ordinal number, not its ID. Ordinals are numbered beginning with 0.
Note: This script specifies the CPU device when loading the model. This avoids potential problems if the model was traced and saved using a GPU context.
import torch, torchvision
import PIL
from torchvision import transforms
from PIL import Image

def get_image(filename):
    im = Image.open(filename)
    # ImageNet pretrained models require input images to have width/height of 224
    # and color channels normalized according to ImageNet distribution.
    im_process = transforms.Compose([transforms.Resize([224, 224]),
                                     transforms.ToTensor(),
                                     transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                                          std=[0.229, 0.224, 0.225])])
    im = im_process(im)  # 3 x 224 x 224
    return im.unsqueeze(0)  # Add dimension to become 1 x 3 x 224 x 224

im = get_image('kitten.jpg')

# eval() toggles inference mode
model = torchvision.models.resnet50(pretrained=True).eval()

# Compile model to TorchScript via tracing
# Here we want to use the first attached accelerator, so we specify ordinal 0.
with torch.jit.optimized_execution(True, {'target_device': 'eia:0'}):
    # You can trace with any input
    model = torch.jit.trace(model, im)

# Serialize model
torch.jit.save(model, 'resnet50_traced.pt')

# Deserialize model
model = torch.jit.load('resnet50_traced.pt', map_location=torch.device('cpu'))

# Perform inference. Make sure to disable autograd and use EI execution context.
# Replace "device ordinal" below with the ordinal of your accelerator.
with torch.no_grad():
    with torch.jit.optimized_execution(True, {'target_device': 'eia:device ordinal'}):
        probs = model(im)

# Torchvision implementation doesn't have Softmax as last layer.
# Use Softmax to convert activations to range 0-1 (probabilities)
probs = torch.nn.Softmax(dim=1)(probs)

# Get top 5 predicted classes
classes = eval(open('imagenet_classes.txt').read())
pred_probs, pred_indices = torch.topk(probs, 5)
pred_probs = pred_probs.squeeze().numpy()
pred_indices = pred_indices.squeeze().numpy()

for i in range(len(pred_indices)):
    curr_class = classes[pred_indices[i]]
    curr_prob = pred_probs[i]
    print('{}: {:.4f}'.format(curr_class, curr_prob))

Note: You don't have to save and load your model. You can compile your model, then directly do inference with it. The benefit of saving your model is that it saves time for future inference jobs.
-
Run the inference script.
python pytorch_resnet50_inference.py
Your output should be similar to the following. The model predicts that the image is most likely to be a tabby cat, followed by a tiger cat.
Using Amazon Elastic Inference Client Library Version: 1.6.2
Number of Elastic Inference Accelerators Available: 1
Elastic Inference Accelerator ID: eia-53ab0670550948e88d7aac0bd331a583
Elastic Inference Accelerator Type: eia2.medium
Elastic Inference Accelerator Ordinal: 0

tabby, tabby cat: 0.4674
tiger cat: 0.4526
Egyptian cat: 0.0667
plastic bag: 0.0025
lynx, catamount: 0.0007
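The device ordinal used for target_device can be read from the EI Tool output programmatically instead of by hand. The following is a minimal sketch, assuming the JSON has the shape shown earlier in this guide (a devices list whose entries carry ordinal and status fields). The helper name healthy_target_devices is hypothetical, and treating only the status "healthy" as usable is an assumption, because other status values are not listed in this guide.

```python
import json


def healthy_target_devices(describe_output):
    """Return 'eia:<ordinal>' strings for accelerators reported as healthy.

    `describe_output` is the JSON text printed by
    `ei describe-accelerators --json`, in the shape shown in this guide.
    """
    report = json.loads(describe_output)
    return [
        "eia:{}".format(device["ordinal"])
        for device in report.get("devices", [])
        if device.get("status") == "healthy"  # assumption: only "healthy" is usable
    ]


# Sample trimmed from the EI Tool output shown earlier in this guide.
sample = """
{
  "attached_accelerators": 2,
  "devices": [
    {"ordinal": 0, "type": "eia1.xlarge", "status": "healthy"},
    {"ordinal": 1, "type": "eia1.xlarge", "status": "healthy"}
  ]
}
"""

devices = healthy_target_devices(sample)
print(devices)  # prints ['eia:0', 'eia:1']
```

The first entry could then be passed to the context as {'target_device': devices[0]}. In practice the JSON text would come from running the EI Tool command, for example by capturing its standard output.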
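The probabilities in this output come from applying Softmax to the model's raw activations and keeping the five largest values. The following pure-Python sketch illustrates that post-processing without PyTorch. The scores and the three-entry class mapping are made up for illustration, and the sketch assumes that imagenet_classes.txt holds a Python literal that maps class indices to labels. It parses the mapping with ast.literal_eval, which accepts literals only and so cannot execute code from the file the way eval() can.

```python
import ast
import math


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k):
    """Return (index, probability) pairs for the k largest probabilities."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical stand-in for the contents of imagenet_classes.txt.
classes = ast.literal_eval("{0: 'tabby', 1: 'tiger cat', 2: 'plastic bag'}")

scores = [2.0, 1.0, -1.0]  # made-up activations, not real model output
for index, prob in top_k(softmax(scores), 2):
    print("{}: {:.4f}".format(classes[index], prob))
```

In the inference script, torch.nn.Softmax and torch.topk perform the same two steps on tensors.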