Increase model invocation capacity with Provisioned Throughput in Amazon Bedrock

Throughput refers to the number and rate of inputs and outputs that a model processes and returns. You can purchase Provisioned Throughput to provision a higher level of throughput for a model at a fixed cost. If you customized a model, you must purchase Provisioned Throughput to be able to use it.

You're billed hourly for each Provisioned Throughput that you purchase. For detailed information about pricing, see Amazon Bedrock Pricing. The price per hour depends on the following factors:

- The model that you choose. For a custom model, pricing is the same as for the base model that it was customized from.
- The number of Model Units (MUs) that you specify for the Provisioned Throughput. An MU delivers a specific throughput level for the specified model. The throughput level of an MU specifies the following:
  - The number of input tokens that an MU can process across all requests within a span of one minute.
  - The number of output tokens that an MU can generate across all requests within a span of one minute.
  Note
  To learn what an MU specifies or the pricing per MU, or to request limit increases, contact your AWS account manager.
- The length of time that you commit to keeping the Provisioned Throughput. The longer the commitment, the lower the hourly price. You can choose between the following levels of commitment:
  - No commitment – You can delete the Provisioned Throughput at any time.
  - 1 month – You can't delete the Provisioned Throughput until the one-month commitment term is over.
  - 6 months – You can't delete the Provisioned Throughput until the six-month commitment term is over.
Note
Billing continues until you delete the Provisioned Throughput.
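
The pricing factors above can be turned into a rough sizing estimate. The following Python sketch is illustrative only. It assumes that throughput and price both scale linearly with the number of MUs, which this page does not state, and every number in it is a made-up placeholder. The real per-MU token rates and hourly prices come from your AWS account manager and the Amazon Bedrock Pricing page. The function and parameter names are hypothetical.

```python
import math


def model_units_needed(peak_input_tpm, peak_output_tpm, mu_input_tpm, mu_output_tpm):
    """Return the smallest whole number of MUs covering both token rates.

    Assumption: N MUs deliver N times the per-minute input and output
    token rates of a single MU.
    """
    for_input = math.ceil(peak_input_tpm / mu_input_tpm)
    for_output = math.ceil(peak_output_tpm / mu_output_tpm)
    # The tighter of the two limits decides how many MUs are required.
    return max(1, for_input, for_output)


def estimated_cost(model_units, hourly_price_per_mu, hours):
    """Estimate the bill for keeping the Provisioned Throughput for `hours`.

    Assumption: the hourly price is the per-MU price times the MU count.
    Billing runs until the Provisioned Throughput is deleted.
    """
    return model_units * hourly_price_per_mu * hours


# Placeholder figures, not real Amazon Bedrock values.
units = model_units_needed(
    peak_input_tpm=250_000,   # expected peak input tokens per minute
    peak_output_tpm=30_000,   # expected peak output tokens per minute
    mu_input_tpm=100_000,     # hypothetical input tokens per minute per MU
    mu_output_tpm=20_000,     # hypothetical output tokens per minute per MU
)
monthly = estimated_cost(units, hourly_price_per_mu=10.0, hours=24 * 30)
print(units, monthly)
```

With these placeholder inputs, the input rate is the binding limit (three MUs for input, two for output), so the estimate uses three MUs.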

The following steps outline the process of setting up and using Provisioned Throughput.

1. Determine the number of MUs to purchase for the Provisioned Throughput and the length of time for which you want to commit to it.
2. Purchase Provisioned Throughput for a base or custom model.
3. After the provisioned model is created, use it to run model inference.
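
As a rough illustration of these steps, the following Python sketch uses the AWS SDK for Python (Boto3). It is a minimal outline, not a complete workflow. The helper names, the `region_name` parameter, and the polling interval are hypothetical, and the inference request body depends on the model, so it is passed through unchanged. Running the AWS calls requires Boto3, valid credentials, and the appropriate permissions, and a successful purchase starts hourly billing.

```python
import json
import time

VALID_COMMITMENTS = ("OneMonth", "SixMonths")


def build_purchase_request(name, model_id, model_units, commitment=None):
    """Assemble keyword arguments for create_provisioned_model_throughput.

    Leaving `commitment` as None requests the no-commitment option.
    """
    if model_units < 1:
        raise ValueError("model_units must be at least 1")
    request = {
        "provisionedModelName": name,
        "modelId": model_id,
        "modelUnits": model_units,
    }
    if commitment is not None:
        if commitment not in VALID_COMMITMENTS:
            raise ValueError(f"commitment must be one of {VALID_COMMITMENTS}")
        request["commitmentDuration"] = commitment
    return request


def purchase_and_wait(request, region_name, poll_seconds=60):
    """Purchase the Provisioned Throughput and poll until it is in service."""
    import boto3  # imported lazily so the helper above works without Boto3

    bedrock = boto3.client("bedrock", region_name=region_name)
    arn = bedrock.create_provisioned_model_throughput(**request)["provisionedModelArn"]
    while True:
        status = bedrock.get_provisioned_model_throughput(provisionedModelId=arn)["status"]
        if status == "InService":
            return arn
        if status == "Failed":
            raise RuntimeError(f"Provisioned Throughput {arn} failed to create")
        time.sleep(poll_seconds)


def invoke(provisioned_model_arn, body, region_name):
    """Run inference by passing the provisioned model ARN as the model ID."""
    import boto3

    runtime = boto3.client("bedrock-runtime", region_name=region_name)
    response = runtime.invoke_model(modelId=provisioned_model_arn, body=json.dumps(body))
    return json.loads(response["body"].read())


# Steps 1 and 2: choose the MU count and commitment, then build the request.
# The name and model ID below are placeholders.
request = build_purchase_request("example-provisioned-model", "example-model-id", 2, "OneMonth")
print(request)
```

Only `build_purchase_request` runs in the example at the bottom, so the sketch can be read and tested without touching an AWS account. Calling `purchase_and_wait(request, region_name)` and then `invoke(arn, body, region_name)` would complete steps 2 and 3.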
