
Using PyTorch Elastic Inference accelerators on Amazon EC2

When using Elastic Inference, you can use the same Amazon EC2 instance to serve models from multiple frameworks. To switch frameworks, use the console to stop the Amazon EC2 instance and then restart it, instead of rebooting it. The Elastic Inference accelerator doesn't detach when you reboot the instance.
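
If you prefer the command line to the console, the same stop-and-start cycle works with the AWS CLI. The following is a minimal sketch, assuming the AWS CLI is configured with permission to manage the instance; the instance ID is a placeholder.

    # Stop the instance (stopping, unlike rebooting, detaches the accelerator), then start it again
    aws ec2 stop-instances --instance-ids i-0123456789abcdef0
    aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0
    aws ec2 start-instances --instance-ids i-0123456789abcdef0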

To use the Elastic Inference accelerators with PyTorch

  1. From the terminal of your Amazon EC2 instance, pull the Elastic Inference-enabled PyTorch image from Amazon Elastic Container Registry (Amazon ECR) with the following code. To select an image, see Deep Learning Containers Images. If the pull fails with an authentication error, log in to the registry first; see the login sketch after these steps.

    docker pull 763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-inference-eia:<image_tag>
  2. Run the container with the following command. You can get the <image_id> by running the docker images command. To check that the server started and that the model registered, see the verification sketch after these steps.

    docker run -itd --name pytorch_inference_eia -p 80:8080 -p 8081:8081 <image_id> \
        mxnet-model-server --start --foreground \
        --mms-config /home/model-server/config.properties \
        --models densenet-eia=https://aws-dlc-sample-models.s3.amazonaws.com/pytorch/densenet_eia/densenet_eia.mar
  3. Download an image of a flower to use as the input image for the test.

    curl -O https://s3.amazonaws.com/model-server/inputs/flower.jpg
  4. Run inference by sending a request to the REST API.

    curl -X POST http://127.0.0.1:80/predictions/densenet-eia -T flower.jpg
  5. The results should look something like the following.

    [ [ "pot, flowerpot", 14.690367698669434 ], [ "sulphur butterfly, sulfur butterfly", 9.29893970489502 ], [ "bee", 8.29178237915039 ], [ "vase", 6.987090587615967 ], [ "hummingbird", 4.341294765472412 ] ]