Prepare Model for Compilation
SageMaker Neo requires machine learning models to satisfy specific input data shapes. The input shape required for compilation depends on the deep learning framework you use. Once your model input shape is correctly formatted, save your model according to the requirements below. Then compress the saved model artifacts into a tar file.
What input data shapes does SageMaker Neo expect?
Before you compile your model, make sure it is formatted correctly. Neo expects the name and shape of the expected data inputs for your trained model in JSON format or list format. The expected inputs are framework specific.
Below are the input shapes SageMaker Neo expects:
Keras
Specify the name and shape (NCHW format) of the expected data inputs using a dictionary format for your trained model. Note that while Keras model artifacts should be uploaded in NHWC (channel-last) format, DataInputConfig should be specified in NCHW (channel-first) format. The required dictionary formats are as follows:
For one input:
{'input_1':[1,3,224,224]}
For two inputs:
{'input_1': [1,3,224,224], 'input_2':[1,3,224,224]}
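Because Keras artifacts are stored channel-last while DataInputConfig must be channel-first, it can help to derive the NCHW shape from the NHWC shape your model actually uses. The following is an illustrative sketch; the helper name nhwc_to_nchw is not part of any SDK:

```python
# Convert an NHWC (channel-last) shape, as used by Keras model artifacts,
# to the NCHW (channel-first) shape that DataInputConfig expects.
# This helper is illustrative only; it is not part of the SageMaker SDK.
def nhwc_to_nchw(shape):
    n, h, w, c = shape
    return [n, c, h, w]

# A Keras model with input shape (1, 224, 224, 3) maps to:
data_input_config = {'input_1': nhwc_to_nchw([1, 224, 224, 3])}
print(data_input_config)  # {'input_1': [1, 3, 224, 224]}
```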
MXNet/ONNX
Specify the name and shape (NCHW format) of the expected data inputs using a dictionary format for your trained model. The required dictionary formats are as follows:
For one input:
{'data':[1,3,1024,1024]}
For two inputs:
{'var1': [1,1,28,28], 'var2':[1,1,28,28]}
PyTorch
For a PyTorch model, you don't need to provide the name and shape of the expected data inputs if you meet both of the following conditions:
- You created your model definition file by using PyTorch 2.0 or later. For more information about how to create the definition file, see the PyTorch section under Saving Models for SageMaker Neo.
- You are compiling your model for a cloud instance. For more information about the instance types that SageMaker Neo supports, see Supported Instance Types and Frameworks.
If you meet these conditions, SageMaker Neo gets the input configuration from the model definition file (.pt or .pth) that you create with PyTorch.
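These two conditions can be expressed as a simple check. The sketch below is illustrative only (not part of the SageMaker SDK), and it assumes cloud instance targets can be identified by the ml_ prefix used in target_instance_family values such as ml_c5:

```python
# Illustrative check (not part of the SageMaker SDK) of when SageMaker Neo
# can derive the input configuration from a PyTorch definition file:
# the model was saved with PyTorch 2.0+ AND the compilation target is a
# cloud instance (assumed here to be the ml_* target_instance_family values).
def input_config_required(torch_version, target_instance_family):
    major = int(torch_version.split('.')[0])
    is_cloud_target = target_instance_family.startswith('ml_')
    return not (major >= 2 and is_cloud_target)

print(input_config_required('2.1.0', 'ml_c5'))       # False: Neo derives the config
print(input_config_required('1.13.1', 'ml_c5'))      # True: you must supply it
print(input_config_required('2.1.0', 'jetson_nano')) # True: edge target
```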
Otherwise, you must do the following:
Specify the name and shape (NCHW format) of the expected data inputs using a dictionary format for your trained model. Alternatively, you can specify the shape only using a list format. The dictionary formats required are as follows:
For one input in dictionary format:
{'input0':[1,3,224,224]}
For one input in list format:
[[1,3,224,224]]
For two inputs in dictionary format:
{'input0':[1,3,224,224], 'input1':[1,3,224,224]}
For two inputs in list format:
[[1,3,224,224], [1,3,224,224]]
TensorFlow
Specify the name and shape (NHWC format) of the expected data inputs using a dictionary format for your trained model. The required dictionary formats are as follows:
For one input:
{'input':[1,1024,1024,3]}
For two inputs:
{'data1': [1,28,28,1], 'data2':[1,28,28,1]}
TensorFlow Lite
Specify the name and shape (NHWC format) of the expected data inputs using a dictionary format for your trained model. The required dictionary formats are as follows:
For one input:
{'input':[1,224,224,3]}
Note
SageMaker Neo only supports TensorFlow Lite for edge device targets. For a list of supported SageMaker Neo edge device targets, see the SageMaker Neo Devices page. For a list of supported SageMaker Neo cloud instance targets, see the SageMaker Neo Supported Instance Types and Frameworks page.
XGBoost
An input data name and shape are not needed.
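When you create a compilation job, the shapes above are passed as a JSON-serialized string in the DataInputConfig field of the job's InputConfig. The following sketch builds that structure with the standard library; the S3 URI is a placeholder, and the boto3 call is shown only in a comment:

```python
import json

# DataInputConfig is a JSON-serialized string, not a Python dict.
data_input_config = json.dumps({'data': [1, 3, 1024, 1024]})

input_config = {
    'S3Uri': 's3://your-bucket/model/model.tar.gz',  # placeholder URI
    'DataInputConfig': data_input_config,
    'Framework': 'MXNET',
}

# A boto3 SageMaker client would receive this dict when creating the job, e.g.:
# sagemaker_client.create_compilation_job(..., InputConfig=input_config, ...)
print(input_config['DataInputConfig'])  # {"data": [1, 3, 1024, 1024]}
```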
Saving Models for SageMaker Neo
The following code examples show how to save your model to make it compatible with Neo.
Models must be packaged as compressed tar files (*.tar.gz).
Keras models require one model definition file (.h5).
There are two options for saving your Keras model to make it compatible with SageMaker Neo:
- Export to .h5 format with model.save("<model-name>", save_format="h5").
- Freeze the SavedModel after exporting.
Below is an example of how to export a tf.keras model as a frozen graph (option two):
import os
import tensorflow as tf
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras import backend

tf.keras.backend.set_learning_phase(0)
model = tf.keras.applications.ResNet50(weights='imagenet', include_top=False,
                                       input_shape=(224, 224, 3), pooling='avg')
model.summary()

# Save as a SavedModel
export_dir = 'saved_model/'
model.save(export_dir, save_format='tf')

# Freeze saved model
input_node_names = [inp.name.split(":")[0] for inp in model.inputs]
output_node_names = [output.name.split(":")[0] for output in model.outputs]
print("Input names: ", input_node_names)
with tf.Session() as sess:
    loaded = tf.saved_model.load(sess, export_dir=export_dir, tags=["serve"])
    frozen_graph = tf.graph_util.convert_variables_to_constants(sess, sess.graph.as_graph_def(), output_node_names)
    tf.io.write_graph(graph_or_graph_def=frozen_graph, logdir=".", name="frozen_graph.pb", as_text=False)

import tarfile
tar = tarfile.open("frozen_graph.tar.gz", "w:gz")
tar.add("frozen_graph.pb")
tar.close()
Warning
Do not export your model with the SavedModel class using model.save(<path>, save_format='tf'). This format is suitable for training, but it is not suitable for inference.
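Option one (exporting to .h5) only requires packaging the single .h5 file into a tar archive. The sketch below shows that packaging step; it assumes you have already run model.save("model.h5", save_format="h5"), so a placeholder file stands in for the real Keras artifact to keep the example self-contained:

```python
import tarfile

# Stand-in for a real Keras artifact produced by
# model.save("model.h5", save_format="h5"); replace with your actual file.
with open('model.h5', 'wb') as f:
    f.write(b'placeholder for a real Keras .h5 artifact')

# Package the .h5 definition file as a compressed tar file for Neo
with tarfile.open('model.tar.gz', 'w:gz') as tar:
    tar.add('model.h5')

# Verify the archive contents
with tarfile.open('model.tar.gz', 'r:gz') as tar:
    names = tar.getnames()
print(names)  # ['model.h5']
```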
MXNet models must be saved as a single symbol file (*-symbol.json) and a single parameter file (*.params).
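Before tarring MXNet artifacts, it can help to confirm they match this layout. The helper below is illustrative only (not part of any SDK) and checks a list of filenames for exactly one symbol file and one parameter file:

```python
import fnmatch

# Illustrative helper (not part of any SDK): check that a set of MXNet
# artifacts matches what SageMaker Neo expects before you tar them --
# exactly one *-symbol.json file and exactly one *.params file.
def valid_mxnet_artifacts(filenames):
    symbols = fnmatch.filter(filenames, '*-symbol.json')
    params = fnmatch.filter(filenames, '*.params')
    return len(symbols) == 1 and len(params) == 1

print(valid_mxnet_artifacts(['resnet-symbol.json', 'resnet-0000.params']))  # True
print(valid_mxnet_artifacts(['resnet-symbol.json']))                        # False
```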
PyTorch models must be saved as a definition file (.pt or .pth) with an input datatype of float32.
To save your model, use the torch.jit.trace method followed by the torch.save method. This process saves an object to a disk file and by default uses python pickle (pickle_module=pickle) to save the objects and some metadata. Next, convert the saved model to a compressed tar file.
import torchvision
import torch

model = torchvision.models.resnet18(pretrained=True)
model.eval()
inp = torch.rand(1, 3, 224, 224)
model_trace = torch.jit.trace(model, inp)

# Save your model. The following code saves it with the .pth file extension
model_trace.save('model.pth')

# Save as a compressed tar file
import tarfile
with tarfile.open('model.tar.gz', 'w:gz') as f:
    f.add('model.pth')
If you save your model with PyTorch 2.0 or later, SageMaker Neo derives the input configuration for the model (the name and shape for its input) from the definition file. In that case, you don't need to specify the data input configuration to SageMaker AI when you compile the model.
If you want to prevent SageMaker Neo from deriving the input configuration, you can set the _store_inputs parameter of torch.jit.trace to False. If you do this, you must specify the data input configuration to SageMaker AI when you compile the model.
For more information about the torch.jit.trace method, see torch.jit.trace in the PyTorch documentation.
TensorFlow requires one .pb or one .pbtxt file and a variables directory that contains variables. For frozen models, only one .pb or .pbtxt file is required.
The following code example shows how to use the tar Linux command to compress your model. Run the following in your terminal or in a Jupyter notebook (if you use a Jupyter notebook, insert the ! magic command at the beginning of the statement):
# Download SSD_Mobilenet trained model
!wget http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_coco_2018_03_29.tar.gz

# Unzip the compressed tar file
!tar xvf ssd_mobilenet_v2_coco_2018_03_29.tar.gz

# Compress the frozen graph and save it as 'model.tar.gz'
!tar czvf model.tar.gz ssd_mobilenet_v2_coco_2018_03_29/frozen_inference_graph.pb
The command flags used in this example accomplish the following:
- c: Create an archive
- z: Compress the archive with gzip
- v: Display archive progress
- f: Specify the filename of the archive
Built-in estimators are made by either framework-specific containers or algorithm-specific containers. Estimator objects for both the built-in algorithms and the framework-specific estimators save the model in the correct format for you when you train the model using the built-in .fit method.
For example, you can use sagemaker.TensorFlow to define a TensorFlow estimator:
from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(entry_point='mnist.py',
                       role=role,  # param role can be arn of a sagemaker execution role
                       framework_version='1.15.3',
                       py_version='py3',
                       training_steps=1000,
                       evaluation_steps=100,
                       instance_count=2,
                       instance_type='ml.c4.xlarge')
Then train the model with the built-in .fit method:
estimator.fit(inputs)
Finally, compile the model with the built-in compile_model method:
# Specify output path of the compiled model
output_path = '/'.join(estimator.output_path.split('/')[:-1])

# Compile model
optimized_estimator = estimator.compile_model(target_instance_family='ml_c5',
                                              input_shape={'data': [1, 784]},  # Batch size 1, flattened 28x28 images
                                              output_path=output_path,
                                              framework='tensorflow',
                                              framework_version='1.15.3')
You can also use the sagemaker.estimator.Estimator class to initialize an estimator object for training and compiling a built-in algorithm with the compile_model method from the SageMaker Python SDK:
import sagemaker
from sagemaker.image_uris import retrieve

sagemaker_session = sagemaker.Session()
aws_region = sagemaker_session.boto_region_name

# Specify built-in algorithm training image
training_image = retrieve(framework='image-classification', region=aws_region, image_scope='training')

# Create estimator object for training
estimator = sagemaker.estimator.Estimator(image_uri=training_image,
                                          role=role,  # param role can be arn of a sagemaker execution role
                                          instance_count=1,
                                          instance_type='ml.p3.8xlarge',
                                          volume_size=50,
                                          max_run=360000,
                                          input_mode='File',
                                          output_path=s3_training_output_location,
                                          base_job_name='image-classification-training'
                                          )

# Setup the input data_channels to be used later for training.
train_data = sagemaker.inputs.TrainingInput(s3_training_data_location,
                                            content_type='application/x-recordio',
                                            s3_data_type='S3Prefix')
validation_data = sagemaker.inputs.TrainingInput(s3_validation_data_location,
                                                 content_type='application/x-recordio',
                                                 s3_data_type='S3Prefix')
data_channels = {'train': train_data, 'validation': validation_data}

# Train model
estimator.fit(inputs=data_channels, logs=True)

# Compile model with Neo
optimized_estimator = estimator.compile_model(target_instance_family='ml_c5',
                                              input_shape={'data': [1, 3, 224, 224], 'softmax_label': [1]},
                                              output_path=s3_compilation_output_location,
                                              framework='mxnet',
                                              framework_version='1.7')
For more information about compiling models with the SageMaker Python SDK, see Compile a Model (Amazon SageMaker AI SDK).