Submit a single prompt with InvokeModel

Run inference on a model through the API by sending an InvokeModel or InvokeModelWithResponseStream request. To check if a model supports streaming, send a GetFoundationModel or ListFoundationModels request and check the value in the responseStreamingSupported field.
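As a minimal sketch of that check, the following Python helper reads the responseStreamingSupported field from a GetFoundationModel response. It assumes the boto3 `bedrock` control-plane client and its `get_foundation_model` call. The helper name `supports_streaming` and the model ID are illustrative only.

```python
def supports_streaming(bedrock_client, model_id):
    """Return True if GetFoundationModel reports streaming support for model_id."""
    response = bedrock_client.get_foundation_model(modelIdentifier=model_id)
    details = response.get('modelDetails', {})
    return bool(details.get('responseStreamingSupported', False))


def main():
    # Requires boto3 and AWS credentials; the model ID is only an example.
    import boto3
    bedrock = boto3.client(service_name='bedrock')
    print(supports_streaming(bedrock, 'anthropic.claude-v2'))
```

Calling `main()` runs the check against your account. The helper itself only needs an object that exposes `get_foundation_model`.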

The following fields are required:

Field	Use case
modelId	To specify the model, inference profile, or prompt from Prompt management to use. To learn how to find this value, see Submit prompts and generate responses using the API.
body	To specify the inference parameters for a model. To see inference parameters for different models, see Inference request parameters and response fields for foundation models. If you specify a prompt from Prompt management in the `modelId` field, omit this field (if you include it, it will be ignored).

The following fields are optional:

Field	Use case
accept	To specify the media type for the response body. For more information, see Media Types on the Swagger website.
contentType	To specify the media type for the request body. For more information, see Media Types on the Swagger website.
explicitPromptCaching	To specify whether prompt caching is enabled or disabled. For more information, see Prompt caching for faster model inference.
guardrailIdentifier	To specify a guardrail to apply to the prompt and response. For more information, see Test a guardrail.
guardrailVersion	To specify the version of the guardrail to apply to the prompt and response. For more information, see Test a guardrail.
trace	To specify whether to return the trace for the guardrail you specify. For more information, see Test a guardrail.
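The required and optional fields above correspond to keyword arguments of the boto3 `invoke_model` call. The sketch below assembles them into a dictionary. The helper name `build_invoke_kwargs` and the guardrail ID are hypothetical, and explicitPromptCaching is left out because its availability in a given SDK version is not assumed here.

```python
import json


def build_invoke_kwargs(model_id, body, guardrail_id=None,
                        guardrail_version=None, trace=None):
    """Assemble keyword arguments for an InvokeModel request."""
    kwargs = {
        'modelId': model_id,
        'body': json.dumps(body),
        'contentType': 'application/json',  # media type of the request body
        'accept': 'application/json',       # media type wanted for the response body
    }
    if guardrail_id is not None:
        # A guardrail is identified by its ID together with a version.
        if guardrail_version is None:
            raise ValueError('guardrail_version is required with guardrail_id')
        kwargs['guardrailIdentifier'] = guardrail_id
        kwargs['guardrailVersion'] = guardrail_version
    if trace is not None:
        kwargs['trace'] = trace  # for example 'ENABLED' or 'DISABLED'
    return kwargs


kwargs = build_invoke_kwargs(
    'anthropic.claude-v2',
    {'prompt': '\n\nHuman: Hello\n\nAssistant:', 'max_tokens_to_sample': 100},
    guardrail_id='my-guardrail-id',  # hypothetical guardrail ID
    guardrail_version='1',
    trace='ENABLED',
)
```

The resulting dictionary could then be passed as `client.invoke_model(**kwargs)` on a `bedrock-runtime` client.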

Invoke model code examples

The following examples show how to run inference with the InvokeModel API. For examples with different models, see the inference parameter reference for the desired model (Inference request parameters and response fields for foundation models).
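A minimal Python sketch of a non-streaming request follows. It assumes the anthropic.claude-v2 text completions request body (prompt and max_tokens_to_sample) and a response JSON that carries the generated text in a `completion` field. The helper name `invoke_text` and the prompt are illustrative.

```python
import json


def invoke_text(client, prompt, max_tokens=300):
    """Send one prompt with InvokeModel and return the generated text."""
    body = json.dumps({
        'prompt': f'\n\nHuman: {prompt}\n\nAssistant:',
        'max_tokens_to_sample': max_tokens,
    })
    response = client.invoke_model(modelId='anthropic.claude-v2', body=body)
    # The response body is a streaming object; read() returns the full JSON payload.
    payload = json.loads(response['body'].read())
    return payload.get('completion', '')


def main():
    # Requires boto3 and AWS credentials with access to the model.
    import boto3
    brt = boto3.client(service_name='bedrock-runtime')
    print(invoke_text(brt, 'Explain what a foundation model is in one sentence.'))
```

Calling `main()` sends the request. Other models expect different request and response fields, so the body and the `completion` lookup would need to change accordingly.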

Invoke model with streaming code example

Note

The AWS CLI does not support streaming.

The following example shows how to use the InvokeModelWithResponseStream API in Python to stream generated text for the prompt "write an essay for living on mars in 1000 words".


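import json

import boto3

# Create a Bedrock Runtime client (uses the default AWS credentials and Region).
brt = boto3.client(service_name='bedrock-runtime')

# Request body in the Anthropic Claude text completions format.
body = json.dumps({
    'prompt': '\n\nHuman: write an essay for living on mars in 1000 words\n\nAssistant:',
    'max_tokens_to_sample': 4000
})

response = brt.invoke_model_with_response_stream(
    modelId='anthropic.claude-v2',
    body=body
)

# The response body is an event stream; each event may carry a chunk of JSON bytes.
stream = response.get('body')
if stream:
    for event in stream:
        chunk = event.get('chunk')
        if chunk:
            print(json.loads(chunk.get('bytes').decode()))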
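
The loop above prints each decoded chunk as a dictionary. To assemble the full text instead, a sketch like the following could be used. It assumes each chunk's JSON carries the generated text in a `completion` field, as in the Anthropic Claude text completions format. The helper name `collect_completion` and the sample events are made up for illustration.

```python
import json


def collect_completion(stream):
    """Join the 'completion' text from every chunk event in the stream."""
    parts = []
    for event in stream:
        chunk = event.get('chunk')
        if chunk:
            payload = json.loads(chunk['bytes'].decode())
            parts.append(payload.get('completion', ''))
    return ''.join(parts)


# Stand-in events shaped like the chunk events handled above (made-up data).
fake_stream = [
    {'chunk': {'bytes': b'{"completion": " Living"}'}},
    {'chunk': {'bytes': b'{"completion": " on Mars"}'}},
]
print(collect_completion(fake_stream))
```

With a real response, `collect_completion(response.get('body'))` would take the place of the printing loop.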
