Service restrictions and quotas - AWS Marketplace

This section describes restrictions and quotas on your machine learning (ML) products in AWS Marketplace.

Network isolation

For security purposes, when a buyer subscribes to your containerized product, the Docker containers are run in an isolated environment without network access. When you create your containers, don't rely on making outgoing calls over the internet because they will fail. Calls to AWS services will also fail.
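One way to approximate this isolation before publishing is to start the container locally with Docker networking disabled. The following is a minimal sketch, not an official validation step. The helper name `offline_run_command` and the image name are hypothetical. The sketch assumes that the container follows the SageMaker convention of being started with a `serve` or `train` argument.

```python
from typing import List


def offline_run_command(image: str, entrypoint_arg: str = "serve") -> List[str]:
    """Build a `docker run` command that starts the container without networking.

    `image` and `entrypoint_arg` are illustrative inputs; adjust them to your product.
    """
    if not image:
        raise ValueError("image must be a non-empty string")
    # `--network none` leaves the container with only a loopback interface,
    # so outgoing calls (including calls to AWS services) fail.
    return ["docker", "run", "--rm", "--network", "none", image, entrypoint_arg]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without Docker.
    print(" ".join(offline_run_command("my-inference-image:latest")))
```

The returned list can be passed to `subprocess.run`. Because `--network none` also prevents port publishing, this check mainly shows whether startup and model loading succeed without network access. Test requests would have to be sent from inside the container.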

Image size

Your Docker image size is governed by the Amazon Elastic Container Registry (Amazon ECR) service quotas. The Docker image size affects the startup time during training jobs, batch-transform jobs, and endpoint creation. For better performance, keep your Docker image as small as practical.

Storage size

When you create an endpoint, Amazon SageMaker attaches an Amazon Elastic Block Store (Amazon EBS) storage volume to each ML compute instance that hosts the endpoint. (An endpoint is also known as real-time inference or Amazon SageMaker hosting service.) The size of the storage volume depends on the instance type. For more information, see Host Instance Storage Volumes in the Amazon SageMaker Developer Guide.

For batch transform, see Storage in Batch Transform in the Amazon SageMaker Developer Guide.

Instance size

SageMaker provides a selection of instance types that are optimized to fit different ML use cases. Instance types are composed of varying combinations of CPU, GPU, memory, and networking capacity. Instance types give you the flexibility to choose the appropriate mix of resources for building, training, and deploying your ML models. For more information, see Amazon SageMaker ML Instance Types.

Payload size for inference

For an endpoint, limit the maximum size of the input data per invocation to 6 MB. This value can't be adjusted.

For batch transform, the maximum size of the input data per invocation is 100 MB. This value can't be adjusted.

Processing time for inference

For an endpoint, the maximum processing time per invocation is 60 seconds. This value can't be adjusted.

For batch transform, the maximum processing time per invocation is 60 minutes. This value can't be adjusted.
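The payload and processing-time limits above can be captured in a small pre-flight check. This is a sketch under stated assumptions: it treats 1 MB as 2**20 bytes, which the text above does not specify. The names `LIMITS` and `check_invocation` are hypothetical and not part of any AWS SDK.

```python
from typing import List

MB = 1024 * 1024  # assumption: "MB" is interpreted as 2**20 bytes

# Limits quoted from the sections above: (max payload in bytes, max processing time in seconds).
LIMITS = {
    "endpoint": (6 * MB, 60),
    "batch_transform": (100 * MB, 60 * 60),
}


def check_invocation(mode: str, payload_bytes: int, expected_seconds: float) -> List[str]:
    """Return a list of limit violations for one invocation (empty if none)."""
    if mode not in LIMITS:
        raise ValueError(f"unknown mode: {mode!r}")
    max_bytes, max_seconds = LIMITS[mode]
    problems = []
    if payload_bytes > max_bytes:
        problems.append(f"payload of {payload_bytes} bytes exceeds {max_bytes} bytes")
    if expected_seconds > max_seconds:
        problems.append(f"processing time of {expected_seconds} s exceeds {max_seconds} s")
    return problems


if __name__ == "__main__":
    # An 8 MB request is too large for an endpoint but fits batch transform.
    print(check_invocation("endpoint", 8 * MB, 5))
    print(check_invocation("batch_transform", 8 * MB, 5))
```

A check like this is most useful in product documentation or sample notebooks, so that buyers learn about an oversized request before they send it.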

Service quotas

For more information about quotas related to training and inference, see Amazon SageMaker Service Quotas.

Asynchronous inference

Model packages and algorithms published in AWS Marketplace can't be deployed to endpoints configured for Amazon SageMaker Asynchronous Inference. Endpoints configured for asynchronous inference require models to have network connectivity. All AWS Marketplace models operate in network isolation. For more information, see No network access.

Serverless inference

Model packages and algorithms published in AWS Marketplace can't be deployed to endpoints configured for Amazon SageMaker Serverless Inference. Endpoints configured for serverless inference require models to have network connectivity. All AWS Marketplace models operate in network isolation. For more information, see No network access.

Managed spot training

For all algorithms from AWS Marketplace, the value of MaxWaitTimeInSeconds is set to 3,600 seconds (60 minutes), even if the checkpoint for managed spot training is implemented. This value can't be adjusted.
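In the SageMaker `CreateTrainingJob` API, `MaxWaitTimeInSeconds` sits in the `StoppingCondition` structure alongside `MaxRuntimeInSeconds`, and the wait time must be greater than or equal to the run time. The sketch below assumes that the fixed 3,600-second value therefore acts as an upper bound on the run time that you can request for a managed spot training job. The helper name `spot_stopping_condition` is hypothetical.

```python
from typing import Dict

# Value stated above for algorithms from AWS Marketplace.
MARKETPLACE_MAX_WAIT_SECONDS = 3600


def spot_stopping_condition(max_runtime_seconds: int) -> Dict[str, int]:
    """Build a StoppingCondition-shaped dict for a managed spot training job."""
    if max_runtime_seconds <= 0:
        raise ValueError("max_runtime_seconds must be positive")
    # Assumption: because the wait time must be at least the run time,
    # a run time above the fixed wait time would be rejected.
    if max_runtime_seconds > MARKETPLACE_MAX_WAIT_SECONDS:
        raise ValueError(
            f"max_runtime_seconds must not exceed {MARKETPLACE_MAX_WAIT_SECONDS}"
        )
    return {
        "MaxRuntimeInSeconds": max_runtime_seconds,
        "MaxWaitTimeInSeconds": MARKETPLACE_MAX_WAIT_SECONDS,
    }


if __name__ == "__main__":
    print(spot_stopping_condition(1800))
```

The resulting dictionary has the shape expected by the `StoppingCondition` parameter of a training job request that also sets `EnableManagedSpotTraining` to true.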

Docker images and AWS accounts

For publishing, images must be stored in Amazon ECR repositories owned by the AWS account of the seller. It isn't possible to publish images that are stored in a repository owned by another AWS account.
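Because an Amazon ECR image URI begins with the ID of the account that owns the repository, ownership can be checked before you start the listing process. The following is a minimal sketch with hypothetical helper names. It assumes the standard URI form `<account-id>.dkr.ecr.<region>.amazonaws.com/<repository>` and doesn't handle other partitions or endpoint variants.

```python
import re
from typing import Tuple

# Standard private ECR image URI: <account-id>.dkr.ecr.<region>.amazonaws.com/<repository>[:tag]
_ECR_URI = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/(?P<repo>.+)$"
)


def parse_ecr_uri(uri: str) -> Tuple[str, str]:
    """Return (account_id, region) extracted from an ECR image URI."""
    match = _ECR_URI.match(uri)
    if match is None:
        raise ValueError(f"not a recognized ECR image URI: {uri!r}")
    return match.group("account"), match.group("region")


def image_owned_by(uri: str, seller_account_id: str) -> bool:
    """Return True if the image's repository belongs to the seller account."""
    account_id, _region = parse_ecr_uri(uri)
    return account_id == seller_account_id


if __name__ == "__main__":
    # Placeholder account ID and repository name for illustration only.
    uri = "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-model:1.0"
    print(image_owned_by(uri, "123456789012"))
```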

Publishing model packages from built-in algorithms or AWS Marketplace

Model packages created from training jobs using an Amazon SageMaker built-in algorithm or an algorithm from an AWS Marketplace subscription can't be published.

You can still use the model artifacts from the training job, but your own inference image is required for publishing model packages.

Supported AWS Regions for publishing

AWS Marketplace supports publishing model package and algorithm resources from AWS Regions where the following are both true:

All assets required for publishing a model package or algorithm product must be stored in the same Region that you choose to publish from. This includes the following:

  • Model package and algorithm resources that are created in Amazon SageMaker

  • Inference and training images that are uploaded to Amazon ECR repositories

  • Model artifacts (if any) that are stored in Amazon Simple Storage Service (Amazon S3) and dynamically loaded during model deployment for model package resources

  • Test data for inference and training validation that are stored in Amazon S3
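The same-Region requirement for the assets listed above can be checked with a short script. This sketch relies on the standard ARN layout (`arn:partition:service:region:account:resource`) and on the standard Amazon ECR image URI form. The function names are hypothetical. The Region of an Amazon S3 bucket can't be read from its URI, so the sketch assumes that you supply it yourself.

```python
from typing import Dict, List


def region_from_arn(arn: str) -> str:
    """Extract the Region field from an ARN such as a SageMaker model package ARN."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or not parts[3]:
        raise ValueError(f"ARN without a Region field: {arn!r}")
    return parts[3]


def region_from_ecr_uri(uri: str) -> str:
    """Extract the Region from <account-id>.dkr.ecr.<region>.amazonaws.com/<repository>."""
    labels = uri.split("/", 1)[0].split(".")
    if len(labels) < 6 or labels[1:3] != ["dkr", "ecr"]:
        raise ValueError(f"not a recognized ECR image URI: {uri!r}")
    return labels[3]


def assets_outside_region(asset_regions: Dict[str, str], publish_region: str) -> List[str]:
    """Return the sorted names of assets that are not in the publishing Region."""
    return sorted(
        name for name, region in asset_regions.items() if region != publish_region
    )


if __name__ == "__main__":
    # Placeholder identifiers for illustration only.
    regions = {
        "model_package": region_from_arn(
            "arn:aws:sagemaker:us-east-2:123456789012:model-package/demo"
        ),
        "inference_image": region_from_ecr_uri(
            "123456789012.dkr.ecr.us-east-2.amazonaws.com/demo:1"
        ),
        "model_artifacts": "us-west-2",  # bucket Region supplied by the caller
    }
    print(assets_outside_region(regions, "us-east-2"))
```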

You can develop and train your product in any Region that SageMaker supports. However, before you can publish, you must copy all assets to, and re-create all resources in, a Region that AWS Marketplace supports publishing from.

During the listing process, regardless of the AWS Region that you publish from, you can choose the Regions that you want to publish to and make your product available in.