Submit a single prompt
Run inference on a model through the API by sending an InvokeModel or InvokeModelWithResponseStream request. You can specify the media type for the request and response bodies in the contentType and accept fields. The default value for both fields is application/json if you don't specify a value.
Streaming is supported for all text output models except AI21 Labs Jurassic-2 models. To check if a model supports streaming, send a GetFoundationModel or ListFoundationModels request and check the value in the responseStreamingSupported field.
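For example, a minimal Python (boto3) sketch of that check might look like the following. The model ID anthropic.claude-v2 is used only as an illustration, and the calls assume credentials with Amazon Bedrock permissions.

import boto3

# The 'bedrock' control-plane client exposes model discovery operations;
# 'bedrock-runtime' is used only for inference requests.
bedrock = boto3.client(service_name='bedrock')

# Check a single model (the model ID here is illustrative).
details = bedrock.get_foundation_model(modelIdentifier='anthropic.claude-v2')
print(details['modelDetails']['responseStreamingSupported'])

# Or list text output models and print their streaming support.
for summary in bedrock.list_foundation_models(byOutputModality='TEXT')['modelSummaries']:
    print(summary['modelId'], summary.get('responseStreamingSupported'))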
Specify the following fields, depending on the model that you use.
- modelId – Use the ID or Amazon Resource Name (ARN) of a model or throughput. The method for finding the ID or ARN depends on the type of model or throughput that you use (a sketch of the corresponding API calls follows this list):
  - Base model – Do one of the following:
    - To see a list of model IDs for all base models supported by Amazon Bedrock, see Amazon Bedrock base model IDs (on-demand throughput).
    - Send a ListFoundationModels request and find the modelId or modelArn of the model to use in the response.
    - In the Amazon Bedrock console, select a model in Providers and find the modelId in the API request example.
  - Inference profile – Do one of the following:
    - Send a ListInferenceProfiles request and find the inferenceProfileArn of the model to use in the response.
    - In the Amazon Bedrock console, select Cross-region inference from the left navigation pane and find the ID or ARN of the inference profile in the Inference profiles section.
  - Provisioned Throughput – If you've created Provisioned Throughput for a base or custom model, do one of the following:
    - Send a ListProvisionedModelThroughputs request and find the provisionedModelArn of the model to use in the response.
    - In the Amazon Bedrock console, select Provisioned Throughput from the left navigation pane, select a Provisioned Throughput in the Provisioned throughput section, and then find the ID or ARN of the Provisioned Throughput in the Model details section.
  - Custom model – Purchase Provisioned Throughput for the custom model (for more information, see Provisioned Throughput for Amazon Bedrock) and find the model ID or ARN of the provisioned model.
- body – Each base model has its own inference parameters that you set in the body field. The inference parameters for a custom or provisioned model depend on the base model from which it was created. For more information, see Inference parameters for foundation models.
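If you prefer to look up these identifiers programmatically, the following Python (boto3) sketch shows one way to call the list operations mentioned above. It assumes credentials with Amazon Bedrock permissions; the response field names are those returned by boto3 for ListFoundationModels, ListInferenceProfiles, and ListProvisionedModelThroughputs.

import boto3

# The 'bedrock' control-plane client exposes the discovery operations;
# inference itself goes through the 'bedrock-runtime' client.
bedrock = boto3.client(service_name='bedrock')

# Base models: find the modelId or modelArn to use with InvokeModel.
for model in bedrock.list_foundation_models()['modelSummaries']:
    print(model['modelId'], model['modelArn'])

# Inference profiles: find the inferenceProfileArn.
for profile in bedrock.list_inference_profiles()['inferenceProfileSummaries']:
    print(profile['inferenceProfileArn'])

# Provisioned Throughput: find the provisionedModelArn.
for pt in bedrock.list_provisioned_model_throughputs()['provisionedModelSummaries']:
    print(pt['provisionedModelArn'])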
Invoke model code examples
The following examples show how to run inference with the InvokeModel API. For examples with different models, see the inference parameter reference for the desired model (Inference parameters for foundation models).
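As a minimal illustration, the following Python sketch sends a single prompt to Anthropic Claude 2 with InvokeModel. The model ID, prompt, and body fields are illustrative; the body must match the inference parameters of the model you actually call.

import boto3
import json

# Runtime client used for inference requests
brt = boto3.client(service_name='bedrock-runtime')

# Request body in the Anthropic Claude text-completion format (illustrative)
body = json.dumps({
    'prompt': '\n\nHuman: write an essay for living on mars in 1000 words\n\nAssistant:',
    'max_tokens_to_sample': 4000
})

response = brt.invoke_model(
    modelId='anthropic.claude-v2',
    body=body,
    contentType='application/json',
    accept='application/json'
)

# The response body is a stream; read and parse it as JSON.
response_body = json.loads(response['body'].read())
print(response_body.get('completion'))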
Invoke model with streaming code example
Note
The AWS CLI does not support streaming.
The following example shows how to use the InvokeModelWithResponseStream API to generate streaming text with Python, using the prompt "write an essay for living on mars in 1000 words".
import boto3
import json

# Runtime client used for inference requests
brt = boto3.client(service_name='bedrock-runtime')

# Request body in the Anthropic Claude text-completion format
body = json.dumps({
    'prompt': '\n\nHuman: write an essay for living on mars in 1000 words\n\nAssistant:',
    'max_tokens_to_sample': 4000
})

response = brt.invoke_model_with_response_stream(
    modelId='anthropic.claude-v2',
    body=body
)

# Iterate over the event stream and print each chunk as it arrives.
stream = response.get('body')
if stream:
    for event in stream:
        chunk = event.get('chunk')
        if chunk:
            print(json.loads(chunk.get('bytes').decode()))
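Each streamed chunk is a JSON object. For the Anthropic Claude text-completion format used here, the generated text arrives in the completion field of each chunk, so a sketch that assembles the full response (replacing the printing loop above) might look like this:

full_text = ''
stream = response.get('body')
if stream:
    for event in stream:
        chunk = event.get('chunk')
        if chunk:
            payload = json.loads(chunk.get('bytes').decode())
            # Each Claude text-completion chunk carries the next piece of
            # generated text in its 'completion' field.
            full_text += payload.get('completion', '')
print(full_text)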