
Code samples for Provisioned Throughput in Amazon Bedrock

The following code examples demonstrate how to create, use, and manage a Provisioned Throughput with the AWS CLI and the Python SDK.

AWS CLI

Create a no-commitment Provisioned Throughput called MyPT, based on a custom model called MyCustomModel that was customized from the Anthropic Claude v2.1 model, by running the following command in a terminal.

aws bedrock create-provisioned-model-throughput \
    --model-units 1 \
    --provisioned-model-name MyPT \
    --model-id arn:aws:bedrock:us-east-1::custom-model/anthropic.claude-v2:1:200k/MyCustomModel

The response returns a provisioned-model-arn. Allow some time for the creation to complete. To check its status, provide the name or ARN of the provisioned model as the provisioned-model-id in the following command.

aws bedrock get-provisioned-model-throughput \
    --provisioned-model-id MyPT

Change the name of the Provisioned Throughput and associate it with a different model customized from Anthropic Claude v2.1.

aws bedrock update-provisioned-model-throughput \
    --provisioned-model-id MyPT \
    --desired-provisioned-model-name MyPT2 \
    --desired-model-id arn:aws:bedrock:us-east-1::custom-model/anthropic.claude-v2:1:200k/MyCustomModel2

Run inference with your updated provisioned model with the following command. You must provide the ARN of the provisioned model, returned in the GetProvisionedModelThroughput response, as the model-id. The output is written to a file named output.txt in your current folder.

aws bedrock-runtime invoke-model \
    --model-id ${provisioned-model-arn} \
    --body '{"inputText": "What is AWS?", "textGenerationConfig": {"temperature": 0.5}}' \
    --cli-binary-format raw-in-base64-out \
    output.txt

Delete the Provisioned Throughput using the following command. You'll no longer be charged for the Provisioned Throughput.

aws bedrock delete-provisioned-model-throughput --provisioned-model-id MyPT2
Python (Boto)

Create a no-commitment Provisioned Throughput called MyPT, based on a custom model called MyCustomModel that was customized from the Anthropic Claude v2.1 model, by running the following code snippet.

import boto3

bedrock = boto3.client(service_name='bedrock')

bedrock.create_provisioned_model_throughput(
    modelUnits=1,
    provisionedModelName='MyPT',
    modelId='arn:aws:bedrock:us-east-1::custom-model/anthropic.claude-v2:1:200k/MyCustomModel'
)

The response returns a provisionedModelArn. Allow some time for the creation to complete. You can check its status with the following code snippet. You can provide either the name of the Provisioned Throughput or the ARN returned from the CreateProvisionedModelThroughput response as the provisionedModelId.

bedrock.get_provisioned_model_throughput(provisionedModelId='MyPT')
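Creation can take some time, so you may want to block until the Provisioned Throughput is ready rather than checking manually. The following is a minimal polling sketch, not part of the AWS documentation; it assumes the response's status field reads 'Creating' until provisioning finishes, and the get_status parameter is a hypothetical hook that exists only so the waiting logic can be exercised without an AWS account.

```python
import time


def wait_for_provisioned_model(provisioned_model_id, get_status=None,
                               poll_seconds=60, max_polls=60):
    """Poll GetProvisionedModelThroughput until creation finishes.

    get_status is a zero-argument callable returning the current status
    string; by default it queries Amazon Bedrock.
    """
    if get_status is None:
        import boto3  # deferred so the waiting logic is testable offline

        bedrock = boto3.client(service_name='bedrock')

        def get_status():
            return bedrock.get_provisioned_model_throughput(
                provisionedModelId=provisioned_model_id)['status']

    for _ in range(max_polls):
        status = get_status()
        if status != 'Creating':
            return status  # for example, 'InService' or 'Failed'
        time.sleep(poll_seconds)

    raise TimeoutError(
        f"{provisioned_model_id} was still creating after "
        f"{poll_seconds * max_polls} seconds")


# Example:
# status = wait_for_provisioned_model('MyPT')
```

Check the returned status before running inference; a 'Failed' status means the Provisioned Throughput is not usable.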

Change the name of the Provisioned Throughput and associate it with a different model customized from Anthropic Claude v2.1. Then send a GetProvisionedModelThroughput request and save the ARN of the provisioned model to a variable to use for inference.

bedrock.update_provisioned_model_throughput(
    provisionedModelId='MyPT',
    desiredProvisionedModelName='MyPT2',
    desiredModelId='arn:aws:bedrock:us-east-1::custom-model/anthropic.claude-v2:1:200k/MyCustomModel2'
)

arn_MyPT2 = bedrock.get_provisioned_model_throughput(
    provisionedModelId='MyPT2').get('provisionedModelArn')

Run inference with your updated provisioned model with the following code snippet. You must provide the ARN of the provisioned model as the modelId.

import json
import logging

import boto3
from botocore.exceptions import ClientError


class ImageError(Exception):
    "Custom exception for errors returned by the model"

    def __init__(self, message):
        self.message = message


logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)


def generate_text(model_id, body):
    """
    Generate text using your provisioned custom model.
    Args:
        model_id (str): The model ID to use.
        body (str) : The request body to use.
    Returns:
        response (json): The response from the model.
    """

    logger.info(
        "Generating text with your provisioned custom model %s", model_id)

    brt = boto3.client(service_name='bedrock-runtime')

    accept = "application/json"
    content_type = "application/json"

    response = brt.invoke_model(
        body=body, modelId=model_id, accept=accept, contentType=content_type
    )

    response_body = json.loads(response.get("body").read())

    finish_reason = response_body.get("error")

    if finish_reason is not None:
        raise ImageError(f"Text generation error. Error is {finish_reason}")

    logger.info(
        "Successfully generated text with provisioned custom model %s", model_id)

    return response_body


def main():
    """
    Entrypoint for example.
    """

    try:
        logging.basicConfig(level=logging.INFO,
                            format="%(levelname)s: %(message)s")

        model_id = arn_MyPT2

        body = json.dumps({
            "inputText": "what is AWS?"
        })

        response_body = generate_text(model_id, body)
        print(f"Input token count: {response_body['inputTextTokenCount']}")

        for result in response_body['results']:
            print(f"Token count: {result['tokenCount']}")
            print(f"Output text: {result['outputText']}")
            print(f"Completion reason: {result['completionReason']}")

    except ClientError as err:
        message = err.response["Error"]["Message"]
        logger.error("A client error occurred: %s", message)
        print("A client error occurred: " + format(message))

    except ImageError as err:
        logger.error(err.message)
        print(err.message)

    else:
        print(
            f"Finished generating text with your provisioned custom model {model_id}.")


if __name__ == "__main__":
    main()

Delete the Provisioned Throughput with the following code snippet. You'll no longer be charged for the Provisioned Throughput.

bedrock.delete_provisioned_model_throughput(provisionedModelId='MyPT2')