Using TensorFlow Elastic Inference accelerators on Amazon ECS
To use the Elastic Inference accelerator with TensorFlow
-
Create an Amazon ECS cluster named tensorflow-eia in an AWS Region that has access to Elastic Inference.
    aws ecs create-cluster --cluster-name tensorflow-eia \
        --region <region>
-
Create a text file called tf_script.txt and add the following text.
    #!/bin/bash
    echo ECS_CLUSTER=tensorflow-eia >> /etc/ecs/ecs.config
-
Create a text file called my_mapping.txt and add the following text.
    [
      {
        "DeviceName": "/dev/xvda",
        "Ebs": {
          "VolumeSize": 100
        }
      }
    ]
-
Launch an Amazon EC2 instance into the cluster that you created in Step 1, without attaching an Elastic Inference accelerator. To find an image ID, see Amazon ECS-optimized AMIs.
    aws ec2 run-instances --image-id <ECS_Optimized_AMI> \
        --count 1 \
        --instance-type <cpu_instance_type> \
        --key-name <name_of_key_pair_on_ec2_console> \
        --security-group-ids <sg_created_with_vpc> \
        --iam-instance-profile Name="ecsInstanceRole" \
        --user-data file://tf_script.txt \
        --block-device-mappings file://my_mapping.txt \
        --region <region> \
        --subnet-id <subnet_with_ei_endpoint>
-
For all Amazon EC2 instances that you launch, use the ecsInstanceRole IAM role. Make note of the public IPv4 address when the instance is started.
-
Create a TensorFlow inference task definition in a file named tf_task_def.json. Set "image" to any TensorFlow image name. To select an image, see Prebuilt Amazon SageMaker Docker Images. For "deviceType" options, see Launching an Instance with Elastic Inference.
    {
      "requiresCompatibilities": [
        "EC2"
      ],
      "containerDefinitions": [
        {
          "entryPoint": [
            "/bin/bash",
            "-c",
            "mkdir -p /test && cd /test && git clone -b r1.14 https://github.com/tensorflow/serving.git && cd / && /usr/bin/tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=saved_model_half_plus_three --model_base_path=/test/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_three"
          ],
          "name": "tensorflow-inference-container",
          "image": "<tensorflow-image-uri>",
          "memory": 8111,
          "cpu": 256,
          "essential": true,
          "portMappings": [
            {
              "hostPort": 8500,
              "protocol": "tcp",
              "containerPort": 8500
            },
            {
              "hostPort": 8501,
              "protocol": "tcp",
              "containerPort": 8501
            },
            {
              "containerPort": 80,
              "protocol": "tcp"
            }
          ],
          "healthCheck": {
            "retries": 2,
            "command": [
              "CMD-SHELL",
              "LD_LIBRARY_PATH=/opt/ei_health_check/lib /opt/ei_health_check/health_check"
            ],
            "timeout": 5,
            "interval": 30,
            "startPeriod": 60
          },
          "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
              "awslogs-group": "/ecs/tensorflow-inference-eia",
              "awslogs-region": "<region>",
              "awslogs-stream-prefix": "half-plus-three",
              "awslogs-create-group": "true"
            }
          },
          "resourceRequirements": [
            {
              "type": "InferenceAccelerator",
              "value": "device_1"
            }
          ]
        }
      ],
      "inferenceAccelerators": [
        {
          "deviceName": "device_1",
          "deviceType": "<EIA_instance_type>"
        }
      ],
      "volumes": [],
      "networkMode": "bridge",
      "placementConstraints": [],
      "family": "tensorflow-eia"
    }
-
Register the TensorFlow inference task definition. Note the task definition family and revision number from the output of the following command.
    aws ecs register-task-definition --cli-input-json file://tf_task_def.json \
        --region <region>
-
Create a TensorFlow inference service.
    aws ecs create-service --cluster tensorflow-eia \
        --service-name tf-eia1 \
        --task-definition tensorflow-eia:<revision_number> \
        --desired-count 1 \
        --scheduling-strategy="REPLICA" \
        --region <region>
-
Begin inference by sending a query to the REST API.
    curl -d '{"instances": [1.0, 2.0, 5.0]}' -X POST http://<public-ec2-ip-address>:8501/v1/models/saved_model_half_plus_three:predict
-
The results should look something like the following. The half_plus_three test model returns x/2 + 3 for each input value.
    { "predictions": [3.5, 4.0, 5.5] }
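The files and commands in this procedure contain placeholders such as <region> and <tensorflow-image-uri> that must be replaced before use. As a minimal sketch, the following shell function scans a file for leftover placeholder tokens. The function name check_placeholders is hypothetical and is not part of the AWS CLI; it assumes that placeholders always take the form of a name in angle brackets, as they do in this procedure.

```shell
#!/bin/bash
# Hypothetical helper (not an AWS CLI command): report any <placeholder>
# tokens that are still present in a file such as tf_task_def.json.
check_placeholders() {
    local file="$1"
    # Assumption: placeholders look like <region> or <tensorflow-image-uri>.
    # grep -n prints each matching line with its line number.
    if grep -n -E '<[A-Za-z_][A-Za-z0-9_-]*>' "$file"; then
        echo "Unreplaced placeholders found in $file" >&2
        return 1
    fi
    echo "No placeholders left in $file"
}
```

For example, running check_placeholders tf_task_def.json before registering the task definition prints the offending lines and returns a nonzero status if any placeholder remains.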