
MLOE-05: Prepare an ML profile template - Machine Learning Lens


Prepare an ML profile template that captures workload artifacts across the ML lifecycle phases. The template enables you to evaluate the current maturity of a workload and plan improvements accordingly. Example artifacts to capture for the deployment phase include model instance size, model update schedule, and model deployment location. The template should define artifact metrics with thresholds that are used to evaluate and rank the level of maturity. Use the template to reflect workload maturity with snapshots of the existing profile and alternative target profiles, and provide documentation with the rationale for choosing one option over another to meet the business requirements.

Implementation plan

  • Capture ML workload deployment characteristics - Capture the most impactful deployment characteristics of your ML workload. This paper highlights these characteristics as a sample profile template on AWS. The collected design and provisioning characteristics help identify the optimal deployment architecture, including compute and inference instance types and sizes.

  • Map ML workload characteristics across a spectrum from lower to higher ranges - Ideally, generate at least two profile templates for each workload characteristic. One ML profile template provides a snapshot of the current workload profile. Another can be instantiated to capture the target, or future, characteristics of the ML workload.

  Documentation should provide the rationale for justifying the characteristic values in the target profile.
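The current-versus-target profile pairing described above can be sketched as a small data structure. The field names, the maturity thresholds, and the example values below are illustrative assumptions for this sketch, not a prescribed AWS schema:

```python
from dataclasses import dataclass

# Illustrative maturity thresholds keyed by retraining frequency
# (assumed ranking: more frequent updates => higher maturity).
RETRAIN_MATURITY = {"per-event": 4, "hourly": 3, "daily": 2, "weekly": 1, "monthly": 0}

@dataclass
class MLProfile:
    """Snapshot of one workload's deployment characteristics."""
    name: str
    model_size_bytes: int
    models_per_endpoint: int
    instance_type: str
    retrain_frequency: str
    deployment_location: str
    rationale: str = ""  # why these values meet the business requirements

    def maturity_score(self) -> int:
        # Rank maturity against the threshold table above.
        return RETRAIN_MATURITY.get(self.retrain_frequency, 0)

# Hypothetical workload: one current snapshot, one target profile.
current = MLProfile("fraud-model", 850_000_000, 1, "ml.m5.xlarge",
                    "monthly", "Amazon EC2", rationale="baseline snapshot")
target = MLProfile("fraud-model", 850_000_000, 4, "ml.r5dn.4xlarge",
                   "daily", "container",
                   rationale="multi-model endpoint reduces idle cost")

print(target.maturity_score() - current.maturity_score())  # maturity gap: 2
```

Comparing the two instances makes the maturity gap explicit, and the `rationale` field carries the documentation that justifies the target values.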

Sample design, architecture, and provisioning characteristics include:

  • Model deployment sample characteristics include:

    • Model size (model.tar.gz) in bytes

    • Number of models deployed per endpoint

    • Instance size (for example, r5dn.4xlarge) as suggested by the inference recommender

    • Retraining and model endpoint update frequency (hourly, daily, weekly, monthly, or per-event)

    • Model deployment location (on premises, Amazon EC2, container, serverless, or edge)

  • Architectural deployment sample characteristics for the underlying algorithm or neural architecture include:

    • Inference pipeline architecture (single endpoint, or chained endpoints)

    • Neural architecture (single framework, for example Scikit-learn; or multi-framework, for example PyTorch + Scikit-learn + TensorFlow)

    • Containers (SageMaker AI prebuilt container, bring your own container)

    • Location of the containers and models (on premises, cloud, or hybrid)

    • Serverless inferencing (pay as you go) 

  • Traffic pattern deployment sample characteristics include:

    • Traffic pattern (steady, or spiky)

    • Input size (number of bytes)

    • Latency (low, medium, high, or batch)

    • Concurrency (single thread, or multi-thread)
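Taken together, the sample characteristic groups above can be recorded as one structured document per workload. The values below are illustrative placeholders, not recommendations:

```python
import json

# Illustrative snapshot covering the characteristic groups listed above
# (all values are made-up examples for one hypothetical workload).
sample_profile = {
    "model_deployment": {
        "model_size_bytes": 1_200_000_000,    # size of model.tar.gz
        "models_per_endpoint": 2,
        "instance_type": "ml.r5dn.4xlarge",   # as suggested by the inference recommender
        "update_frequency": "daily",
        "deployment_location": "container",
    },
    "architecture": {
        "inference_pipeline": "chained endpoints",
        "frameworks": ["PyTorch", "Scikit-learn"],
        "container": "SageMaker AI prebuilt",
        "location": "cloud",
        "serverless": False,
    },
    "traffic_pattern": {
        "pattern": "spiky",
        "input_size_bytes": 4096,
        "latency_class": "low",
        "concurrency": "multi-thread",
    },
}

# Serialize so the profile can be versioned alongside the workload.
print(json.dumps(sample_profile, indent=2))
```

Keeping the profile as plain data makes it easy to diff a current snapshot against a target profile in version control.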

  • Cold start tolerance characteristics - Determine and document the workload's tolerance for the various aspects of cold start, in milliseconds.

  • Network deployment characteristics - Check for the applicability of network deployment characteristics including AWS KMS encryption, multi-variant endpoints, network isolation, and third-party Docker repositories.

  • Cost considerations - Discuss and document the cost considerations for elements, such as Amazon EC2 Spot Instances.

  • Determine provisioning matrix - Critical ML workloads might be vying for resources from cloud providers. For staging and production environments, include a matrix of the expected capacity requirements. This matrix captures the number of instances of each type per AWS Region across training, batch inference, real-time inference, and notebooks.
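The provisioning matrix described above might look like the following. The Regions, instance types, and counts are made-up placeholders, not sizing guidance:

```python
# Expected capacity per AWS Region across ML lifecycle activities
# (illustrative placeholder numbers only).
provisioning_matrix = {
    "us-east-1": {
        "training":            {"ml.p3.2xlarge": 4},
        "batch_inference":     {"ml.m5.xlarge": 8},
        "real_time_inference": {"ml.r5dn.4xlarge": 2},
        "notebooks":           {"ml.t3.medium": 10},
    },
    "eu-west-1": {
        "training":            {"ml.p3.2xlarge": 2},
        "batch_inference":     {"ml.m5.xlarge": 4},
        "real_time_inference": {"ml.r5dn.4xlarge": 1},
        "notebooks":           {"ml.t3.medium": 5},
    },
}

# Aggregate a capacity request, for example total real-time inference instances:
total_rt = sum(sum(region["real_time_inference"].values())
               for region in provisioning_matrix.values())
print(total_rt)  # 3
```

Rolling the matrix up per activity gives the numbers needed when requesting service quotas or capacity reservations per Region.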
