Running inference on a model
The following examples show how to run inference on a model with the InvokeModel operation and, in Python, how to run streaming inference with the InvokeModelWithResponseStream operation (see the streaming example at the end of this section).
Note
The AWS CLI does not support streaming.
For information about the parameters each model supports, see Inference parameters for foundation models. For information about writing prompts, see Prompt engineering guidelines.
Base model inference examples
The following Python (Boto) examples show how you can perform inference with the InvokeModel operation on different Amazon Bedrock base models.
AI21 Labs Jurassic-2
This example shows how to call the AI21 Labs Jurassic-2 Mid model.
import boto3 import json brt = boto3.client(service_name='bedrock-runtime') body = json.dumps({ "prompt": "Translate to spanish: 'Amazon Bedrock is the easiest way to build and scale generative AI applications with base models (FMs)'.", "maxTokens": 200, "temperature": 0.5, "topP": 0.5 }) modelId = 'ai21.j2-mid-v1' accept = 'application/json' contentType = 'application/json' response = brt.invoke_model( body=body, modelId=modelId, accept=accept, contentType=contentType ) response_body = json.loads(response.get('body').read()) # text print(response_body.get('completions')[0].get('data').get('text'))
Cohere Command
This example shows how to call the Cohere Command model.
import boto3 import json brt = boto3.client(service_name='bedrock-runtime') body = json.dumps({ "prompt": "How do you tie a tie?", "max_tokens": 200, "temperature": 0.5, "p": 0.5 }) modelId = 'cohere.command-text-v14' accept = 'application/json' contentType = 'application/json' response = brt.invoke_model( body=body, modelId=modelId, accept=accept, contentType=contentType ) response_body = json.loads(response.get('body').read()) # text print(response_body.get('generations')[0].get('text'))
Meta Llama 2
This example shows how to call the Llama 2 Chat 13B model.
import boto3
import json

bedrock = boto3.client(service_name='bedrock-runtime', region_name='us-east-1')

body = json.dumps({
    "prompt": "What is the average lifespan of a Llama?",
    "max_gen_len": 128,
    "temperature": 0.1,
    "top_p": 0.9,
})

modelId = 'meta.llama2-13b-chat-v1'
accept = 'application/json'
contentType = 'application/json'

response = bedrock.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)

response_body = json.loads(response.get('body').read())

print(response_body)
Stability AI Stable Diffusion XL
This example shows how to call the Stability AI Stable Diffusion XL model.
import boto3 import json brt = boto3.client(service_name='bedrock-runtime') prompt_data = "A photograph of an dog on the top of a mountain covered in snow." body = json.dumps({ "text_prompts": [ { "text": prompt_data } ], "cfg_scale":10, "seed":20, "steps":50 }) modelId = "stability.stable-diffusion-xl-v0" accept = "application/json" contentType = "application/json" response = brt.invoke_model( body=body, modelId=modelId, accept=accept, contentType=contentType ) response_body = json.loads(response.get("body").read()) print(response_body['result']) print(f'{response_body.get("artifacts")[0].get("base64")[0:80]}...')