Inference

This section shows how to run inference on AWS Deep Learning Containers for Amazon Elastic Compute Cloud (Amazon EC2) using PyTorch and TensorFlow.

PyTorch Inference

Deep Learning Containers with PyTorch version 1.6 and later use TorchServe for inference calls. Deep Learning Containers with PyTorch version 1.5 and earlier use multi-model-server for inference calls.

PyTorch 1.6 and later

To run inference with PyTorch, this example uses a model pretrained on ImageNet from a public S3 bucket. Inference is served using TorchServe. For more information, see this blog on Deploying PyTorch inference with TorchServe.

For CPU instances:

$ docker run -itd --name torchserve -p 80:8080 -p 8081:8081 <your container image id> \
torchserve --start --ts-config /home/model-server/config.properties \
--models pytorch-densenet=https://torchserve.s3.amazonaws.com/mar_files/densenet161.mar

For GPU instances:

$ nvidia-docker run -itd --name torchserve -p 80:8080 -p 8081:8081 <your container image id> \
torchserve --start --ts-config /home/model-server/config.properties \
--models pytorch-densenet=https://torchserve.s3.amazonaws.com/mar_files/densenet161.mar

If you have docker-ce version 19.03 or later, you can use the --gpus flag when you start Docker.
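For example, a sketch of the equivalent command using the --gpus flag in place of nvidia-docker (this assumes the NVIDIA Container Toolkit is installed; all other arguments are unchanged):

$ docker run -itd --gpus all --name torchserve -p 80:8080 -p 8081:8081 <your container image id> \
torchserve --start --ts-config /home/model-server/config.properties \
--models pytorch-densenet=https://torchserve.s3.amazonaws.com/mar_files/densenet161.mar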

The configuration file is included in the container.

With your server started, you can now run inference from a separate terminal window using the following commands.

$ curl -O https://s3.amazonaws.com/model-server/inputs/flower.jpg
$ curl -X POST http://127.0.0.1:80/predictions/pytorch-densenet -T flower.jpg
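Optionally, you can confirm that the model registered correctly by querying the TorchServe management API, which the commands above map to port 8081. The response should list pytorch-densenet.

$ curl http://127.0.0.1:8081/models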

After you are done using your container, you can remove it using the following.

$ docker rm -f torchserve

PyTorch 1.5 and earlier

To run inference with PyTorch, this example uses a model pretrained on ImageNet from a public S3 bucket. Inference is served using multi-model-server, which can support any framework as the backend. For more information, see multi-model-server.

For CPU instances:

$ docker run -itd --name mms -p 80:8080 -p 8081:8081 <your container image id> \
multi-model-server --start --mms-config /home/model-server/config.properties \
--models densenet=https://dlc-samples.s3.amazonaws.com/pytorch/multi-model-server/densenet/densenet.mar

For GPU instances:

$ nvidia-docker run -itd --name mms -p 80:8080 -p 8081:8081 <your container image id> \
multi-model-server --start --mms-config /home/model-server/config.properties \
--models densenet=https://dlc-samples.s3.amazonaws.com/pytorch/multi-model-server/densenet/densenet.mar

If you have docker-ce version 19.03 or later, you can use the --gpus flag when you start Docker.
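As above, a sketch of the equivalent command using the --gpus flag (assuming the NVIDIA Container Toolkit is installed; all other arguments are unchanged):

$ docker run -itd --gpus all --name mms -p 80:8080 -p 8081:8081 <your container image id> \
multi-model-server --start --mms-config /home/model-server/config.properties \
--models densenet=https://dlc-samples.s3.amazonaws.com/pytorch/multi-model-server/densenet/densenet.mar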

The configuration file is included in the container.

With your server started, you can now run inference from a separate terminal window using the following commands.

$ curl -O https://s3.amazonaws.com/model-server/inputs/flower.jpg
$ curl -X POST http://127.0.0.1/predictions/densenet -T flower.jpg
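Like TorchServe, multi-model-server exposes a management API on port 8081, so you can list the registered models to confirm that densenet loaded:

$ curl http://127.0.0.1:8081/models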

After you are done using your container, you can remove it using the following.

$ docker rm -f mms

TensorFlow Inference

To demonstrate how to use Deep Learning Containers for inference, this example uses a simple half plus two model (which returns 0.5 * x + 2 for each input) with TensorFlow 2 Serving. We recommend using the Deep Learning Base AMI for TensorFlow 2. After you log in to your instance, run the following.

$ git clone -b r2.0 https://github.com/tensorflow/serving.git
$ cd serving

Use the commands here to start TensorFlow Serving with the Deep Learning Containers for this model. Unlike the Deep Learning Containers for training, model serving starts immediately upon running the container and runs as a background process.

  • For CPU instances:

    $ docker run -p 8500:8500 -p 8501:8501 --name tensorflow-inference --mount type=bind,source=$(pwd)/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu,target=/models/saved_model_half_plus_two -e MODEL_NAME=saved_model_half_plus_two -d <cpu inference container>

    For example:

    $ docker run -p 8500:8500 -p 8501:8501 --name tensorflow-inference --mount type=bind,source=$(pwd)/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu,target=/models/saved_model_half_plus_two -e MODEL_NAME=saved_model_half_plus_two -d 763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.0.0-cpu-py36-ubuntu18.04
  • For GPU instances:

    $ nvidia-docker run -p 8500:8500 -p 8501:8501 --name tensorflow-inference --mount type=bind,source=$(pwd)/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_gpu,target=/models/saved_model_half_plus_two -e MODEL_NAME=saved_model_half_plus_two -d <gpu inference container>

    For example:

    $ nvidia-docker run -p 8500:8500 -p 8501:8501 --name tensorflow-inference --mount type=bind,source=$(pwd)/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_gpu,target=/models/saved_model_half_plus_two -e MODEL_NAME=saved_model_half_plus_two -d 763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.0.0-gpu-py36-cu100-ubuntu18.04
    Note

    Loading the GPU model server may take some time. You can check the model's loading status as shown below.
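Before sending requests, you can confirm that the model finished loading by querying TensorFlow Serving's model status endpoint on the REST port (8501) mapped above:

$ curl http://127.0.0.1:8501/v1/models/saved_model_half_plus_two

When the response reports a state of AVAILABLE, the model is ready to serve.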

Next, run inference with the Deep Learning Containers.

$ curl -d '{"instances": [1.0, 2.0, 5.0]}' -X POST http://127.0.0.1:8501/v1/models/saved_model_half_plus_two:predict

The output is similar to the following.

{ "predictions": [2.5, 3.0, 4.5 ] }
Note

To debug the container's output, you can attach to it by name as shown in the following command:

$ docker attach <your docker container name>

This example used tensorflow-inference.
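If you only want to read the server logs without attaching an interactive terminal, you can also use docker logs:

$ docker logs tensorflow-inference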

Next steps

To learn about using custom entrypoints with Deep Learning Containers on Amazon ECS, see Custom Entrypoints.
