Quotas - QnABot on AWS

Quotas

Service quotas, also referred to as limits, are the maximum number of service resources or operations for your AWS account.

Quotas for AWS services in this solution

Make sure you have sufficient quota for each of the services implemented in this solution. For more information, see AWS service quotas.

To view the service quotas for all AWS services in the documentation without switching pages, see the Service endpoints and quotas page in the PDF.

AWS CloudFormation quotas

Your AWS account has AWS CloudFormation quotas that you should be aware of when launching the stack in this solution. Understanding these quotas helps you avoid quota-related errors that would prevent you from deploying this solution successfully. For more information, see AWS CloudFormation quotas in the AWS CloudFormation User Guide.

Amazon SageMaker endpoint quota

The provided LLM SageMaker API requires an ml.g5.12xlarge SageMaker instance type, which is not enabled in AWS accounts by default and must be requested on a per-Region basis. If you plan to deploy the default LLM SageMaker API model, you must request a quota increase before deploying the solution.

Sign in to the AWS Management Console, open Service Quotas, and select Amazon SageMaker from the AWS services list. Then search for the quota named ml.g5.12xlarge for endpoint usage. At a minimum, request a quota increase to one; you can request more to accommodate high-volume production deployments.

Note

The ml.g5.12xlarge instance type is not available in the ap-southeast-1 Region.
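
The console steps above can also be scripted. The following is a minimal sketch, not part of the solution itself, that uses the boto3 Service Quotas client to look up the quota and request an increase only when the current value is too low. It assumes that the Service Quotas service code for Amazon SageMaker is sagemaker, that the quota name matches the text above exactly, and that the listed quota value reflects what your account currently has. The helper names (find_quota, ensure_quota, main) and the default Region are hypothetical and chosen for illustration.

```python
"""Sketch: check and request the SageMaker endpoint quota via Service Quotas."""

SERVICE_CODE = "sagemaker"  # assumption: Service Quotas code for Amazon SageMaker
QUOTA_NAME = "ml.g5.12xlarge for endpoint usage"  # quota name as given in this guide


def find_quota(client, service_code=SERVICE_CODE, quota_name=QUOTA_NAME):
    """Return the quota dict whose QuotaName matches, or None if it is absent."""
    paginator = client.get_paginator("list_service_quotas")
    for page in paginator.paginate(ServiceCode=service_code):
        for quota in page.get("Quotas", []):
            if quota.get("QuotaName") == quota_name:
                return quota
    return None


def ensure_quota(client, desired=1.0):
    """Request an increase only if the listed quota value is below `desired`.

    Returns True if a request was submitted, False if the quota already suffices.
    """
    quota = find_quota(client)
    if quota is None:
        raise LookupError(f"Quota {QUOTA_NAME!r} was not found in this Region")
    if quota.get("Value", 0.0) >= desired:
        return False
    client.request_service_quota_increase(
        ServiceCode=SERVICE_CODE,
        QuotaCode=quota["QuotaCode"],
        DesiredValue=float(desired),
    )
    return True


def main(region="us-east-1"):
    """Hypothetical entry point; replace the Region with your deployment Region."""
    # boto3 is imported here so the helpers above can be exercised with a stub client.
    import boto3

    client = boto3.client("service-quotas", region_name=region)
    submitted = ensure_quota(client, desired=1.0)
    print("Increase requested" if submitted else "Quota already sufficient")
```

Because quotas are granted per Region, call main once for each Region where you plan to deploy. A submitted request is not approved immediately, so confirm its status in Service Quotas before you launch the stack.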