
MLPER-11: Evaluate cloud versus edge options for machine learning deployment

Evaluate whether your machine learning applications require near-instantaneous inference results or must produce inferences without network connectivity. Achieving the lowest possible latency can require removing costly round trips to the nearest API endpoint, which you can do by running inference directly on the device itself (at the edge). A common use case for this requirement is predictive maintenance in factories.

Implementation plan

  • Optimize model deployment on the edge - Training and optimizing machine learning models requires massive computing resources, so it is a natural fit for the cloud. Inference requires far less computing power and is often performed in real time as new data becomes available. When you need inference results with very low latency, confirm that your IoT applications can respond quickly to local events. Evaluate the following options and choose the one that meets your business requirements.

    • Amazon SageMaker Edge enables machine learning on edge devices by optimizing, securing, and deploying models to the edge, and then monitoring those models across your fleet of devices, such as smart cameras, robots, and other smart electronics, to reduce ongoing operational costs. Customers who train models in TensorFlow, MXNet, PyTorch, XGBoost, and TensorFlow Lite can use SageMaker Edge to improve model performance, deploy models on edge devices, and monitor their health throughout their lifecycle. SageMaker Edge Compiler optimizes the trained model so it can run on an edge device. SageMaker Edge Agent lets you run multiple models on the same device; it collects prediction data based on logic that you control, such as sampling intervals, and uploads it to the cloud so that you can periodically retrain your models over time. SageMaker Edge also cryptographically signs your models so you can verify that they were not tampered with as they move from the cloud to edge devices (a packaging sketch appears after this list).

    • Amazon SageMaker Neo enables ML models to be trained once and then run anywhere in the cloud and at the edge. SageMaker Neo consists of a compiler and a runtime. The compilation API reads models exported from various frameworks, converts them into a framework-agnostic representation, and generates optimized binary code that runs faster with no loss in accuracy. The compiler uses a machine learning model to apply the performance optimizations that extract the best available performance for your model on the target cloud instance or edge device. The runtime for each target platform then loads and runs the compiled model.

    • SageMaker Neo optimizes machine learning models for inference on cloud instances and edge devices by compiling the trained model into an executable. You then deploy the model as a SageMaker endpoint or on a supported edge device and start making predictions (a compilation sketch appears after this list).

    • AWS IoT Greengrass enables ML inference on edge devices using models trained in the cloud. These models can be built with Amazon SageMaker, AWS Deep Learning AMIs, or AWS Deep Learning Containers, and stored in Amazon S3 before being deployed to edge devices (a device-side inference sketch appears after this list).
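
As a concrete illustration of the Neo workflow described above, the following sketch starts a compilation job with the boto3 SageMaker client. The job name, role ARN, S3 locations, input shape, and target device are placeholder assumptions; substitute values from your own account and hardware.

```python
import boto3

sm = boto3.client("sagemaker")

# Hypothetical names and ARNs -- replace with values from your account.
sm.create_compilation_job(
    CompilationJobName="mlper11-neo-demo",
    RoleArn="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
    InputConfig={
        # Trained model artifact exported from your framework of choice.
        "S3Uri": "s3://example-bucket/models/model.tar.gz",
        # Input name and shape expected by the model (framework specific).
        "DataInputConfig": '{"input0": [1, 3, 224, 224]}',
        "Framework": "PYTORCH",
    },
    OutputConfig={
        "S3OutputLocation": "s3://example-bucket/compiled/",
        # Compile once per target; "jetson_nano" is only an example device.
        "TargetDevice": "jetson_nano",
    },
    StoppingCondition={"MaxRuntimeInSeconds": 900},
)
```

You can poll describe_compilation_job until the job reaches COMPLETED; the compiled artifact is then written to the S3 output location.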
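
Once a compilation job has finished, a SageMaker Edge packaging job signs the compiled model and prepares it for deployment to a device fleet, as described in the SageMaker Edge item above. This is again a minimal sketch with placeholder names; it assumes the compilation job from the previous example.

```python
import boto3

sm = boto3.client("sagemaker")

# Packages (and cryptographically signs) the Neo-compiled model for the edge.
sm.create_edge_packaging_job(
    EdgePackagingJobName="mlper11-edge-pkg-demo",
    CompilationJobName="mlper11-neo-demo",  # job from the previous sketch
    ModelName="demo-model",
    ModelVersion="1.0",
    RoleArn="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
    OutputConfig={"S3OutputLocation": "s3://example-bucket/edge-packages/"},
)
```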
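
On the device itself, an AWS IoT Greengrass component (or any local process) can load the Neo-compiled model with the open-source DLR runtime and serve predictions without a network round trip. The model directory and input shape below are assumptions carried over from the compilation example.

```python
import dlr          # DLR, the Neo runtime for edge devices (pip install dlr)
import numpy as np

# Hypothetical directory where the deployed component unpacked the model.
MODEL_DIR = "/greengrass/v2/work/demo-model"

model = dlr.DLRModel(MODEL_DIR, dev_type="cpu")

# Dummy image batch matching the shape used at compilation time.
frame = np.random.rand(1, 3, 224, 224).astype("float32")

# Run inference locally -- no call to a cloud endpoint is involved.
predictions = model.run(frame)
print(predictions[0].shape)
```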
