Run inference using a Provisioned Throughput

After you purchase a Provisioned Throughput, you can use it to run model inference with increased throughput. If you want, you can first test the Provisioned Throughput in an Amazon Bedrock console playground. When you're ready to deploy the Provisioned Throughput, set up your application to invoke the provisioned model. Select the tab corresponding to your method of choice and follow the steps.

Console
To use a Provisioned Throughput in the Amazon Bedrock console playground
  1. Sign in to the AWS Management Console, and open the Amazon Bedrock console at https://console.aws.amazon.com/bedrock/.

  2. From the left navigation pane, select Chat, Text, or Image under Playgrounds, depending on your use case.

  3. Choose Select model.

  4. In the 1. Category column, select a provider or custom model category. Then, in the 2. Model column, select the model that your Provisioned Throughput is associated with.

  5. In the 3. Throughput column, select your Provisioned Throughput.

  6. Choose Apply.

To learn how to use the Amazon Bedrock playgrounds, see Playgrounds.

API

To run inference using a Provisioned Throughput, send an InvokeModel or InvokeModelWithResponseStream request (see the link for request and response formats and field details) to an Amazon Bedrock runtime endpoint. Specify the ARN of the provisioned model as the modelId parameter. To see the request body requirements for different models, see Inference parameters for foundation models.

See code examples