Use the Converse API

You can use the Amazon Bedrock Converse API to create conversational applications that send and receive messages to and from an Amazon Bedrock model. For example, you can create a chatbot that maintains a conversation over many turns and uses a persona or tone customized to your needs, such as a helpful technical support assistant.

To use the Converse API, you call the Converse or ConverseStream (for streaming responses) operations to send messages to a model. It is possible to use the existing inference operations (InvokeModel or InvokeModelWithResponseStream) for conversational applications. However, we recommend using the Converse API because it provides a consistent API that works with all Amazon Bedrock models that support messages. This means you can write code once and use it with different models. If a model has unique inference parameters, the Converse API also allows you to pass those parameters in a model-specific structure.

You can use the Converse API to implement tool use and guardrails in your applications.

Note

With Mistral AI and Meta open source models, the Converse API embeds your input in a model-specific prompt template that enables conversations.

Supported models and model features

The Converse API supports the following Amazon Bedrock models and model features. The Converse API doesn't support any embedding models (such as Titan Embeddings G1 - Text) or image generation models (such as Stability AI).

| Model | Converse | ConverseStream | System prompts | Document chat | Vision | Tool use | Streaming tool use | Guardrails |
|---|---|---|---|---|---|---|---|---|
| AI21 Jamba-Instruct | Yes | No | Yes | No | No | No | No | No |
| AI21 Labs Jurassic-2 (Text) | Limited. No chat support. | No | No | No | No | No | No | Yes |
| Amazon Titan models | Yes | Yes | No | Yes (except Titan Text Premier) | No | No | No | Yes |
| Anthropic Claude 2 and earlier | Yes | Yes | Yes | Yes | No | No | No | Yes |
| Anthropic Claude 3 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Anthropic Claude 3.5 | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes |
| Cohere Command | Limited. No chat support. | Limited. No chat support. | No | Yes | No | No | No | Yes |
| Cohere Command Light | Limited. No chat support. | Limited. No chat support. | No | No | No | No | No | Yes |
| Cohere Command R and Command R+ | Yes | Yes | Yes | Yes | No | Yes | No | No |
| Meta Llama 2 and Llama 3 | Yes | Yes | Yes | Yes | No | No | No | Yes |
| Mistral AI Instruct | Yes | Yes | No | Yes | No | No | No | Yes |
| Mistral Large | Yes | Yes | Yes | Yes | No | Yes | No | Yes |
| Mistral Small | Yes | Yes | Yes | No | No | Yes | No | Yes |
Note

Cohere Command (Text) and AI21 Labs Jurassic-2 (Text) don't support chat with the Converse API. The models can only handle one user message at a time and can't maintain the history of a conversation. You get an error if you attempt to pass more than one message.

Using the Converse API

To use the Converse API, you call the Converse or ConverseStream operations to send messages to a model. To call Converse, you require permission for the bedrock:InvokeModel operation. To call ConverseStream, you require permission for the bedrock:InvokeModelWithResponseStream operation.
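As a sketch, an identity-based IAM policy that grants both permissions might look like the following. The Resource value here is a broad illustration; in practice, scope it to the ARNs of the specific models you use.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "arn:aws:bedrock:*::foundation-model/*"
    }
  ]
}
```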

Request

You specify the model you want to use by setting the modelId field. For a list of model IDs that Amazon Bedrock supports, see Amazon Bedrock model IDs.

A conversation is a series of messages between the user and the model. You start a conversation by sending a message as a user (user role) to the model. The model, acting as an assistant (assistant role), then generates a response that it returns in a message. If desired, you can continue the conversation by sending further user role messages to the model. To maintain the conversation context, be sure to include any assistant role messages that you receive from the model in subsequent requests. For example code, see Converse API examples.

You provide the messages that you want to pass to a model in the messages field, which maps to an array of Message objects. Each Message contains the content for the message and the role that the message plays in the conversation.
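As a sketch, a multi-turn messages array might be maintained in Python like this. The prompt strings are invented for illustration:

```python
# Start the conversation with a user message.
messages = [
    {"role": "user", "content": [{"text": "Create a list of 3 pop songs."}]}
]

# After calling Converse, append the assistant message from the response so
# the model keeps the conversation context, then add the next user turn.
assistant_reply = {"role": "assistant", "content": [{"text": "1. ..."}]}
messages.append(assistant_reply)
messages.append(
    {"role": "user", "content": [{"text": "Only songs released in the 1980s, please."}]}
)

# Roles alternate: user, assistant, user, ...
print([m["role"] for m in messages])
```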

Note

Amazon Bedrock doesn't store any text, images, or documents that you provide as content. The data is only used to generate the response.

You store the content for the message in the content field, which maps to an array of ContentBlock objects. Within each ContentBlock, you can specify one of the following fields (to see what models support what modalities, see Supported models and model features):

text

The text field maps to a string specifying the prompt. The text field is interpreted alongside other fields that are specified in the same ContentBlock.

The following shows a Message object with a content array containing only a text ContentBlock:

{
  "role": "user | assistant",
  "content": [
    {
      "text": "string"
    }
  ]
}
image

The image field maps to an ImageBlock. Pass the raw bytes, encoded in base64, for an image in the bytes field. If you use an AWS SDK, you don't need to encode the bytes in base64.

If you exclude the text field, the model will describe the image.

The following shows a Message object with a content array containing only an image ContentBlock:

{
  "role": "user",
  "content": [
    {
      "image": {
        "format": "png | jpeg | gif | webp",
        "source": {
          "bytes": "image in bytes"
        }
      }
    }
  ]
}
document

The document field maps to a DocumentBlock. If you include a DocumentBlock, check that your request conforms to the following restrictions:

  • In the content field of the Message object, you must also include a text field with a prompt related to the document.

  • Pass the raw bytes, encoded in base64, for the document in the bytes field. If you use an AWS SDK, you don't need to encode the document bytes in base64.

  • The name field can only contain the following characters:

    • Alphanumeric characters

    • Whitespace characters (no more than one in a row)

    • Hyphens

    • Parentheses

    • Square brackets

    Note

    The name field is vulnerable to prompt injections, because the model might inadvertently interpret it as instructions. Therefore, we recommend that you specify a neutral name.

The following shows a Message object with a content array containing only a document ContentBlock and a required accompanying text ContentBlock.

{
  "role": "user",
  "content": [
    {
      "text": "string"
    },
    {
      "document": {
        "format": "pdf | csv | doc | docx | xls | xlsx | html | txt | md",
        "name": "string",
        "source": {
          "bytes": "document in bytes"
        }
      }
    }
  ]
}

The other fields in ContentBlock are for tool use.

You specify the role in the role field. The role can be one of the following:

  • user — The human that is sending messages to the model.

  • assistant — The model that is sending messages back to the human user.

Note

The following restrictions pertain to the content field:

  • You can include up to 20 images. Each image's size, height, and width must be no more than 3.75 MB, 8,000 px, and 8,000 px, respectively.

  • You can include up to five documents. Each document's size must be no more than 4.5 MB.

  • You can only include images and documents if the role is user.
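These limits can be checked client-side before a request is sent. The helper below is a sketch of our own (the function name and error messages are not part of the SDK); it does not check pixel dimensions, which would require decoding the image:

```python
MAX_IMAGES = 20
MAX_IMAGE_BYTES = int(3.75 * 1024 * 1024)    # 3.75 MB per image
MAX_DOCUMENTS = 5
MAX_DOCUMENT_BYTES = int(4.5 * 1024 * 1024)  # 4.5 MB per document


def validate_content(role, content):
    """Raise ValueError if a content array breaks the documented limits."""
    images = [b for b in content if "image" in b]
    documents = [b for b in content if "document" in b]
    if (images or documents) and role != "user":
        raise ValueError("images and documents are only allowed in user messages")
    if len(images) > MAX_IMAGES:
        raise ValueError("more than 20 images")
    if len(documents) > MAX_DOCUMENTS:
        raise ValueError("more than 5 documents")
    for block in images:
        if len(block["image"]["source"]["bytes"]) > MAX_IMAGE_BYTES:
            raise ValueError("image larger than 3.75 MB")
    for block in documents:
        if len(block["document"]["source"]["bytes"]) > MAX_DOCUMENT_BYTES:
            raise ValueError("document larger than 4.5 MB")
```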

In the following messages example, the user asks for a list of three pop songs, and the model generates a list of songs.

[
  {
    "role": "user",
    "content": [
      {
        "text": "Create a list of 3 pop songs."
      }
    ]
  },
  {
    "role": "assistant",
    "content": [
      {
        "text": "Here is a list of 3 pop songs by artists from the United Kingdom:\n\n1. \"As It Was\" by Harry Styles\n2. \"Easy On Me\" by Adele\n3. \"Unholy\" by Sam Smith and Kim Petras"
      }
    ]
  }
]

A system prompt is a type of prompt that provides instructions or context to the model about the task it should perform, or the persona it should adopt during the conversation. You can specify a list of system prompts for the request in the system (SystemContentBlock) field, as shown in the following example.

[
  {
    "text": "You are an app that creates playlists for a radio station that plays rock and pop music. Only return song names and the artist."
  }
]

Inference parameters

The Converse API supports a base set of inference parameters that you set in the inferenceConfig field (InferenceConfiguration). The base inference parameters are:

  • maxTokens – The maximum number of tokens to allow in the generated response.

  • stopSequences – A list of stop sequences. A stop sequence is a sequence of characters that causes the model to stop generating the response.

  • temperature – The likelihood of the model selecting higher-probability options while generating a response.

  • topP – The percentage of most-likely candidates that the model considers for the next token.

For more information, see Inference parameters.

The following example JSON sets the temperature inference parameter.

{"temperature": 0.5}

If the model you are using has additional inference parameters, you can set those parameters by specifying them as JSON in the additionalModelRequestFields field. The following example JSON shows how to set top_k, which is available in Anthropic Claude models, but isn't a base inference parameter in the messages API.

{"top_k": 200}

You can specify the paths for additional model parameters in the additionalModelResponseFieldPaths field, as shown in the following example.

[ "/stop_sequence" ]

The API returns the additional fields that you request in the additionalModelResponseFields field.
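Putting the pieces together, the request fields covered above might be assembled like this. The maxTokens value and model ID are illustrative, and top_k applies only to models (such as Anthropic Claude) that accept it:

```python
request = {
    "modelId": "anthropic.claude-3-sonnet-20240229-v1:0",
    "messages": [
        {"role": "user", "content": [{"text": "Create a list of 3 pop songs."}]}
    ],
    # Base inference parameters, common to every model.
    "inferenceConfig": {"temperature": 0.5, "maxTokens": 512},
    # Model-specific parameters, passed through as-is.
    "additionalModelRequestFields": {"top_k": 200},
    # Paths for extra fields to return in additionalModelResponseFields.
    "additionalModelResponseFieldPaths": ["/stop_sequence"],
}

# With a Boto3 Bedrock runtime client, this would be sent as:
#   response = bedrock_client.converse(**request)
```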

Response

The response you get from the Converse API depends on which operation you call, Converse or ConverseStream.

Converse response

In the response from Converse, the output field (ConverseOutput) contains the message (Message) that the model generates. The message content is in the content (ContentBlock) field and the role (user or assistant) that the message corresponds to is in the role field.

The metrics field (ConverseMetrics) includes metrics for the call. To determine why the model stopped generating content, check the stopReason field. You can get information about the tokens passed to the model in the request, and the tokens generated in the response, by checking the usage field (TokenUsage). If you specified additional response fields in the request, the API returns them as JSON in the additionalModelResponseFields field.

The following example shows the response from Converse when you pass the prompt discussed in Request.

{
  "output": {
    "message": {
      "role": "assistant",
      "content": [
        {
          "text": "Here is a list of 3 pop songs by artists from the United Kingdom:\n\n1. \"Wannabe\" by Spice Girls\n2. \"Bitter Sweet Symphony\" by The Verve\n3. \"Don't Look Back in Anger\" by Oasis"
        }
      ]
    }
  },
  "stopReason": "end_turn",
  "usage": {
    "inputTokens": 125,
    "outputTokens": 60,
    "totalTokens": 185
  },
  "metrics": {
    "latencyMs": 1175
  }
}
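Given a response shaped like the example above, the fields can be picked apart in Python as follows. The dict below is a trimmed copy of that sample, kept inline so the sketch is self-contained:

```python
response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [{"text": "Here is a list of 3 pop songs..."}],
        }
    },
    "stopReason": "end_turn",
    "usage": {"inputTokens": 125, "outputTokens": 60, "totalTokens": 185},
    "metrics": {"latencyMs": 1175},
}

message = response["output"]["message"]

# A message can hold several content blocks; join all the text blocks.
text = "".join(block["text"] for block in message["content"] if "text" in block)

print(f"Role: {message['role']}")
print(f"Stop reason: {response['stopReason']}")
print(f"Total tokens: {response['usage']['totalTokens']}")
print(text)
```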

ConverseStream response

If you call ConverseStream to stream the response from a model, the stream is returned in the stream response field. The stream emits the following events in the following order.

  1. messageStart (MessageStartEvent). The start event for a message. Includes the role for the message.

  2. contentBlockStart (ContentBlockStartEvent). The start event for a content block. Tool use only.

  3. contentBlockDelta (ContentBlockDeltaEvent). A content block delta event. Includes the partial text that the model generates, or the partial input JSON for tool use.

  4. contentBlockStop (ContentBlockStopEvent). The stop event for a content block.

  5. messageStop (MessageStopEvent). The stop event for the message. Includes the reason why the model stopped generating output.

  6. metadata (ConverseStreamMetadataEvent). Metadata for the request. The metadata includes the token usage in usage (TokenUsage) and metrics for the call in metrics (ConverseStreamMetrics).

ConverseStream streams a complete content block as a ContentBlockStartEvent event, one or more ContentBlockDeltaEvent events, and a ContentBlockStopEvent event. Use the contentBlockIndex field as an index to correlate the events that make up a content block.
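The index-based correlation can be sketched with plain dict events; the events below are synthetic but shaped like the real ones:

```python
from collections import defaultdict

events = [
    {"messageStart": {"role": "assistant"}},
    {"contentBlockDelta": {"delta": {"text": "Hello"}, "contentBlockIndex": 0}},
    {"contentBlockDelta": {"delta": {"text": " world"}, "contentBlockIndex": 0}},
    {"contentBlockStop": {"contentBlockIndex": 0}},
    {"messageStop": {"stopReason": "end_turn"}},
]

# Accumulate partial text per content block, keyed by contentBlockIndex.
blocks = defaultdict(str)
for event in events:
    if "contentBlockDelta" in event:
        delta = event["contentBlockDelta"]
        blocks[delta["contentBlockIndex"]] += delta["delta"].get("text", "")

print(dict(blocks))  # → {0: 'Hello world'}
```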

The following example is a partial response from ConverseStream.

{'messageStart': {'role': 'assistant'}}
{'contentBlockDelta': {'delta': {'text': ''}, 'contentBlockIndex': 0}}
{'contentBlockDelta': {'delta': {'text': ' Title'}, 'contentBlockIndex': 0}}
{'contentBlockDelta': {'delta': {'text': ':'}, 'contentBlockIndex': 0}}
. . .
{'contentBlockDelta': {'delta': {'text': ' The'}, 'contentBlockIndex': 0}}
{'messageStop': {'stopReason': 'max_tokens'}}
{'metadata': {'usage': {'inputTokens': 47, 'outputTokens': 20, 'totalTokens': 67}, 'metrics': {'latencyMs': 100.0}}}

Converse API examples

The following examples show you how to use the Converse and ConverseStream operations.

    Conversation with text message example

    This example shows how to call the Converse operation with the Anthropic Claude 3 Sonnet model. The example shows how to send the input text, inference parameters, and additional parameters that are unique to the model. The code starts a conversation by asking the model to create a list of songs. It then continues the conversation by asking for songs by artists from the United Kingdom.

# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0
"""
Shows how to use the Converse API with Anthropic Claude 3 Sonnet (on demand).
"""

import logging
import boto3

from botocore.exceptions import ClientError

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)


def generate_conversation(bedrock_client, model_id, system_prompts, messages):
    """
    Sends messages to a model.
    Args:
        bedrock_client: The Boto3 Bedrock runtime client.
        model_id (str): The model ID to use.
        system_prompts (JSON): The system prompts for the model to use.
        messages (JSON): The messages to send to the model.

    Returns:
        response (JSON): The conversation that the model generated.
    """

    logger.info("Generating message with model %s", model_id)

    # Inference parameters to use.
    temperature = 0.5
    top_k = 200

    # Base inference parameters to use.
    inference_config = {"temperature": temperature}
    # Additional inference parameters to use.
    additional_model_fields = {"top_k": top_k}

    # Send the message.
    response = bedrock_client.converse(
        modelId=model_id,
        messages=messages,
        system=system_prompts,
        inferenceConfig=inference_config,
        additionalModelRequestFields=additional_model_fields
    )

    # Log token usage.
    token_usage = response['usage']
    logger.info("Input tokens: %s", token_usage['inputTokens'])
    logger.info("Output tokens: %s", token_usage['outputTokens'])
    logger.info("Total tokens: %s", token_usage['totalTokens'])
    logger.info("Stop reason: %s", response['stopReason'])

    return response


def main():
    """
    Entrypoint for Anthropic Claude 3 Sonnet example.
    """

    logging.basicConfig(level=logging.INFO,
                        format="%(levelname)s: %(message)s")

    model_id = "anthropic.claude-3-sonnet-20240229-v1:0"

    # Setup the system prompts and messages to send to the model.
    system_prompts = [{"text": "You are an app that creates playlists for a radio station "
                               "that plays rock and pop music. Only return song names and the artist."}]
    message_1 = {
        "role": "user",
        "content": [{"text": "Create a list of 3 pop songs."}]
    }
    message_2 = {
        "role": "user",
        "content": [{"text": "Make sure the songs are by artists from the United Kingdom."}]
    }
    messages = []

    try:
        bedrock_client = boto3.client(service_name='bedrock-runtime')

        # Start the conversation with the 1st message.
        messages.append(message_1)
        response = generate_conversation(
            bedrock_client, model_id, system_prompts, messages)

        # Add the response message to the conversation.
        output_message = response['output']['message']
        messages.append(output_message)

        # Continue the conversation with the 2nd message.
        messages.append(message_2)
        response = generate_conversation(
            bedrock_client, model_id, system_prompts, messages)

        output_message = response['output']['message']
        messages.append(output_message)

        # Show the complete conversation.
        for message in messages:
            print(f"Role: {message['role']}")
            for content in message['content']:
                print(f"Text: {content['text']}")
            print()

    except ClientError as err:
        message = err.response['Error']['Message']
        logger.error("A client error occurred: %s", message)
        print(f"A client error occurred: {message}")

    else:
        print(f"Finished generating text with model {model_id}.")


if __name__ == "__main__":
    main()
    Conversation with image example

    This example shows how to send an image as part of a message and ask the model to describe the image. The example uses the Converse operation and the Anthropic Claude 3 Sonnet model.

# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0
"""
Shows how to send an image with the Converse API to Anthropic Claude 3 Sonnet (on demand).
"""

import logging
import boto3

from botocore.exceptions import ClientError

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)


def generate_conversation(bedrock_client, model_id, input_text, input_image):
    """
    Sends a message to a model.
    Args:
        bedrock_client: The Boto3 Bedrock runtime client.
        model_id (str): The model ID to use.
        input_text (str): The input message.
        input_image (str): The path to the input image.

    Returns:
        response (JSON): The conversation that the model generated.
    """

    logger.info("Generating message with model %s", model_id)

    # Message to send.
    with open(input_image, "rb") as f:
        image = f.read()

    message = {
        "role": "user",
        "content": [
            {
                "text": input_text
            },
            {
                "image": {
                    "format": 'png',
                    "source": {
                        "bytes": image
                    }
                }
            }
        ]
    }

    messages = [message]

    # Send the message.
    response = bedrock_client.converse(
        modelId=model_id,
        messages=messages
    )

    return response


def main():
    """
    Entrypoint for Anthropic Claude 3 Sonnet example.
    """

    logging.basicConfig(level=logging.INFO,
                        format="%(levelname)s: %(message)s")

    model_id = "anthropic.claude-3-sonnet-20240229-v1:0"
    input_text = "What's in this image?"
    input_image = "path/to/image"

    try:
        bedrock_client = boto3.client(service_name="bedrock-runtime")

        response = generate_conversation(
            bedrock_client, model_id, input_text, input_image)

        output_message = response['output']['message']

        print(f"Role: {output_message['role']}")

        for content in output_message['content']:
            print(f"Text: {content['text']}")

        token_usage = response['usage']
        print(f"Input tokens: {token_usage['inputTokens']}")
        print(f"Output tokens: {token_usage['outputTokens']}")
        print(f"Total tokens: {token_usage['totalTokens']}")
        print(f"Stop reason: {response['stopReason']}")

    except ClientError as err:
        message = err.response['Error']['Message']
        logger.error("A client error occurred: %s", message)
        print(f"A client error occurred: {message}")

    else:
        print(f"Finished generating text with model {model_id}.")


if __name__ == "__main__":
    main()
    Conversation with document example

    This example shows how to send a document as part of a message and ask the model to describe the contents of the document. The example uses the Converse operation and the Anthropic Claude 3 Sonnet model.

# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0
"""
Shows how to send a document as part of a message to Anthropic Claude 3 Sonnet (on demand).
"""

import logging
import boto3

from botocore.exceptions import ClientError

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)


def generate_message(bedrock_client, model_id, input_text, input_document):
    """
    Sends a message to a model.
    Args:
        bedrock_client: The Boto3 Bedrock runtime client.
        model_id (str): The model ID to use.
        input_text (str): The input message.
        input_document (str): The path to the input document.

    Returns:
        response (JSON): The conversation that the model generated.
    """

    logger.info("Generating message with model %s", model_id)

    # Read the document bytes. The Boto3 SDK base64 encodes them for you.
    with open(input_document, "rb") as f:
        document_bytes = f.read()

    # Message to send.
    message = {
        "role": "user",
        "content": [
            {
                "text": input_text
            },
            {
                "document": {
                    "name": "MyDocument",
                    "format": "pdf",
                    "source": {
                        "bytes": document_bytes
                    }
                }
            }
        ]
    }

    messages = [message]

    # Send the message.
    response = bedrock_client.converse(
        modelId=model_id,
        messages=messages
    )

    return response


def main():
    """
    Entrypoint for Anthropic Claude 3 Sonnet example.
    """

    logging.basicConfig(level=logging.INFO,
                        format="%(levelname)s: %(message)s")

    model_id = "anthropic.claude-3-sonnet-20240229-v1:0"
    input_text = "What's in this document?"
    input_document = 'path/to/document.pdf'

    try:
        bedrock_client = boto3.client(service_name="bedrock-runtime")

        response = generate_message(
            bedrock_client, model_id, input_text, input_document)

        output_message = response['output']['message']

        print(f"Role: {output_message['role']}")

        for content in output_message['content']:
            print(f"Text: {content['text']}")

        token_usage = response['usage']
        print(f"Input tokens: {token_usage['inputTokens']}")
        print(f"Output tokens: {token_usage['outputTokens']}")
        print(f"Total tokens: {token_usage['totalTokens']}")
        print(f"Stop reason: {response['stopReason']}")

    except ClientError as err:
        message = err.response['Error']['Message']
        logger.error("A client error occurred: %s", message)
        print(f"A client error occurred: {message}")

    else:
        print(f"Finished generating text with model {model_id}.")


if __name__ == "__main__":
    main()
    Conversation streaming example

    This example shows how to call the ConverseStream operation with the Anthropic Claude 3 Sonnet model. The example shows how to send the input text, inference parameters, and additional parameters that are unique to the model.

# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0
"""
Shows how to use the Converse API to stream a response from Anthropic Claude 3 Sonnet (on demand).
"""

import logging
import boto3

from botocore.exceptions import ClientError

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)


def stream_conversation(bedrock_client,
                        model_id,
                        messages,
                        system_prompts,
                        inference_config,
                        additional_model_fields):
    """
    Sends messages to a model and streams the response.
    Args:
        bedrock_client: The Boto3 Bedrock runtime client.
        model_id (str): The model ID to use.
        messages (JSON): The messages to send.
        system_prompts (JSON): The system prompts to send.
        inference_config (JSON): The inference configuration to use.
        additional_model_fields (JSON): Additional model fields to use.

    Returns:
        Nothing.
    """

    logger.info("Streaming messages with model %s", model_id)

    response = bedrock_client.converse_stream(
        modelId=model_id,
        messages=messages,
        system=system_prompts,
        inferenceConfig=inference_config,
        additionalModelRequestFields=additional_model_fields
    )

    stream = response.get('stream')
    if stream:
        for event in stream:

            if 'messageStart' in event:
                print(f"\nRole: {event['messageStart']['role']}")

            if 'contentBlockDelta' in event:
                print(event['contentBlockDelta']['delta']['text'], end="")

            if 'messageStop' in event:
                print(f"\nStop reason: {event['messageStop']['stopReason']}")

            if 'metadata' in event:
                metadata = event['metadata']
                if 'usage' in metadata:
                    print("\nToken usage")
                    print(f"Input tokens: {metadata['usage']['inputTokens']}")
                    print(f"Output tokens: {metadata['usage']['outputTokens']}")
                    print(f"Total tokens: {metadata['usage']['totalTokens']}")
                if 'metrics' in metadata:
                    print(f"Latency: {metadata['metrics']['latencyMs']} milliseconds")


def main():
    """
    Entrypoint for streaming message API response example.
    """

    logging.basicConfig(level=logging.INFO,
                        format="%(levelname)s: %(message)s")

    model_id = "anthropic.claude-3-sonnet-20240229-v1:0"

    system_prompt = """You are an app that creates playlists for a radio station
      that plays rock and pop music. Only return song names and the artist."""

    # Message to send to the model.
    input_text = "Create a list of 3 pop songs."

    message = {
        "role": "user",
        "content": [{"text": input_text}]
    }
    messages = [message]

    # System prompts.
    system_prompts = [{"text": system_prompt}]

    # Inference parameters to use.
    temperature = 0.5
    top_k = 200

    # Base inference parameters.
    inference_config = {"temperature": temperature}
    # Additional model inference parameters.
    additional_model_fields = {"top_k": top_k}

    try:
        bedrock_client = boto3.client(service_name='bedrock-runtime')

        stream_conversation(bedrock_client, model_id, messages,
                            system_prompts, inference_config, additional_model_fields)

    except ClientError as err:
        message = err.response['Error']['Message']
        logger.error("A client error occurred: %s", message)
        print("A client error occurred: " + format(message))

    else:
        print(f"Finished streaming messages with model {model_id}.")


if __name__ == "__main__":
    main()