Using TensorFlow Elastic Inference accelerators on Amazon ECS

To use the Elastic Inference accelerator with TensorFlow

Create an Amazon ECS cluster named tensorflow-eia on AWS in an AWS Region that has access to Elastic Inference.
```
aws ecs create-cluster --cluster-name tensorflow-eia \
                       --region <region>
```
Create a text file called tf_script.txt and add the following text.
```
#!/bin/bash
echo ECS_CLUSTER=tensorflow-eia >> /etc/ecs/ecs.config
```

Create a text file called my_mapping.txt and add the following text.


```
[
    {
        "DeviceName": "/dev/xvda",
        "Ebs": {
            "VolumeSize": 100
        }
    }
]
```
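If you prefer to generate the two helper files rather than type them by hand, the following is a minimal sketch that writes both with the same contents shown above. The function name `write_launch_files` and its parameters are illustrative assumptions, not part of any AWS tooling; building both files from one place only helps keep the cluster name consistent with the cluster you created.

```python
import json
from pathlib import Path


def write_launch_files(directory, cluster_name="tensorflow-eia", volume_size_gib=100):
    """Write tf_script.txt and my_mapping.txt into `directory`; return both paths.

    Hypothetical helper: it only reproduces the file contents from this procedure.
    """
    directory = Path(directory)

    # User data script that registers the instance with the ECS cluster.
    user_data = (
        "#!/bin/bash\n"
        f"echo ECS_CLUSTER={cluster_name} >> /etc/ecs/ecs.config\n"
    )

    # Block device mapping that sets the size of the root volume.
    mapping = [{"DeviceName": "/dev/xvda", "Ebs": {"VolumeSize": volume_size_gib}}]

    script_path = directory / "tf_script.txt"
    mapping_path = directory / "my_mapping.txt"
    script_path.write_text(user_data)
    mapping_path.write_text(json.dumps(mapping, indent=4) + "\n")
    return script_path, mapping_path


if __name__ == "__main__":
    for path in write_launch_files("."):
        print(f"wrote {path}")
```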

Launch an Amazon EC2 instance into the tensorflow-eia cluster that you created earlier, without attaching an Elastic Inference accelerator. For the image ID, use an Amazon ECS-optimized AMI.


```
aws ec2 run-instances --image-id <ECS_Optimized_AMI> \
                      --count 1 \
                      --instance-type <cpu_instance_type> \
                      --key-name <name_of_key_pair_on_ec2_console> \
                      --security-group-ids <sg_created_with_vpc> \
                      --iam-instance-profile Name="ecsInstanceRole" \
                      --user-data file://tf_script.txt \
                      --block-device-mappings file://my_mapping.txt \
                      --region <region> \
                      --subnet-id <subnet_with_ei_endpoint>
```

For all Amazon EC2 instances that you launch, use the ecsInstanceRole IAM role. Note the instance's public IPv4 address once it has started; you need it to send the inference request.

Create a TensorFlow inference task definition file named tf_task_def.json. Set "image" to any TensorFlow image name. To select an image, see Prebuilt Amazon SageMaker Docker Images. For "deviceType" options, see Launching an Instance with Elastic Inference.


```
{
    "requiresCompatibilities":[
        "EC2"
    ],
    "containerDefinitions":[
        {
            "entryPoint":[
                "/bin/bash",
                "-c",
                "mkdir -p /test && cd /test && git clone -b r1.14 https://github.com/tensorflow/serving.git && cd / && /usr/bin/tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=saved_model_half_plus_three --model_base_path=/test/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_three"
            ],
            "name":"tensorflow-inference-container",
            "image":"<tensorflow-image-uri>",
            "memory":8111,
            "cpu":256,
            "essential":true,
            "portMappings":[
                {
                    "hostPort":8500,
                    "protocol":"tcp",
                    "containerPort":8500
                },
                {
                    "hostPort":8501,
                    "protocol":"tcp",
                    "containerPort":8501
                },
                {
                    "containerPort":80,
                    "protocol":"tcp"
                }
            ],
            "healthCheck":{
                "retries":2,
                "command":[
                    "CMD-SHELL",
                    "LD_LIBRARY_PATH=/opt/ei_health_check/lib /opt/ei_health_check/health_check"
                ],
                "timeout":5,
                "interval":30,
                "startPeriod":60
            },
            "logConfiguration":{
                "logDriver":"awslogs",
                "options":{
                    "awslogs-group":"/ecs/tensorflow-inference-eia",
                    "awslogs-region":"<region>",
                    "awslogs-stream-prefix":"half-plus-three",
                    "awslogs-create-group":"true"
                }
            },
            "resourceRequirements":[
                {
                    "type":"InferenceAccelerator",
                    "value":"device_1"
                }
            ]
        }
    ],
    "inferenceAccelerators":[
        {
            "deviceName":"device_1",
            "deviceType":"<EIA_instance_type>"
        }
    ],
    "volumes":[],
    "networkMode":"bridge",
    "placementConstraints":[],
    "family":"tensorflow-eia"
}
```
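Before registering the file, it can help to confirm two things: that no `<placeholder>` values remain, and that every container's "InferenceAccelerator" requirement names a device declared under "inferenceAccelerators". The following is a minimal sketch of such a check using only the Python standard library. The function names are illustrative assumptions, and the check is not an official validator; it covers only the two conditions described here.

```python
import json
import re


def find_placeholders(text):
    """Return the sorted, de-duplicated <placeholder> tokens still present in text."""
    return sorted(set(re.findall(r"<[A-Za-z0-9_-]+>", text)))


def check_accelerator_wiring(task_def):
    """Return a list of problems; an empty list means the wiring is consistent."""
    declared = {acc["deviceName"] for acc in task_def.get("inferenceAccelerators", [])}
    problems = []
    for container in task_def.get("containerDefinitions", []):
        for req in container.get("resourceRequirements", []):
            # Each accelerator requirement must point at a declared deviceName.
            if req.get("type") == "InferenceAccelerator" and req.get("value") not in declared:
                problems.append(
                    f"{container.get('name')}: unknown accelerator {req.get('value')!r}"
                )
    return problems


if __name__ == "__main__":
    with open("tf_task_def.json") as f:
        raw = f.read()
    for token in find_placeholders(raw):
        print(f"placeholder not replaced: {token}")
    for problem in check_accelerator_wiring(json.loads(raw)):
        print(problem)
```

Run against the task definition above, the wiring check should report nothing, because "device_1" appears in both places; the placeholder check lists any values you still need to fill in.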

Register the TensorFlow inference task definition. Note the task definition family and revision number from the output of the following command.
```
aws ecs register-task-definition --cli-input-json file://tf_task_def.json --region <region>
```
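The family and revision appear under the "taskDefinition" key of the command's JSON output. As a minimal sketch, assuming you redirected that output to a file (the name `register_output.json` is a hypothetical choice), the following helper builds the `family:revision` string that the create-service step expects.

```python
import json


def task_definition_ref(register_output):
    """Build 'family:revision' from the JSON text printed by register-task-definition."""
    task_def = json.loads(register_output)["taskDefinition"]
    return f"{task_def['family']}:{task_def['revision']}"


if __name__ == "__main__":
    # Assumes the command output was saved first, for example:
    #   aws ecs register-task-definition ... > register_output.json
    with open("register_output.json") as f:
        print(task_definition_ref(f.read()))
```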

Create a TensorFlow inference service.


```
aws ecs create-service --cluster tensorflow-eia \
                       --service-name tf-eia1 \
                       --task-definition tensorflow-eia:<revision_number> \
                       --desired-count 1 \
                       --scheduling-strategy="REPLICA" \
                       --region <region>
```

Send an inference request through the REST API, using the public IPv4 address that you noted earlier.


```
curl -d '{"instances": [1.0, 2.0, 5.0]}' -X POST http://<public-ec2-ip-address>:8501/v1/models/saved_model_half_plus_three:predict
```

The results should look something like the following.
```
{
    "predictions": [3.5, 4.0, 5.5
    ]
}
```
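The same request can be sent from Python. The following is a minimal sketch that uses only the standard library and mirrors the curl command above; the function names and the default timeout are illustrative assumptions. It expects the response to carry a "predictions" key, as in the sample result.

```python
import json
import urllib.request


def build_predict_request(host, instances, model="saved_model_half_plus_three", port=8501):
    """Build the POST request for the TensorFlow Serving REST predict endpoint."""
    url = f"http://{host}:{port}/v1/models/{model}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(host, instances, timeout=10):
    """Send the request and return the list stored under "predictions"."""
    request = build_predict_request(host, instances)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["predictions"]


if __name__ == "__main__":
    # Replace the placeholder with the public IPv4 address of your instance.
    print(predict("<public-ec2-ip-address>", [1.0, 2.0, 5.0]))
```

Separating request construction from sending lets you inspect the URL and body without a running service.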
