Supported models and regions for Amazon Bedrock Knowledge Bases
You can choose which models you want to use for knowledge bases and which Regions are available to you.
If you use the Amazon Bedrock API, take note of the Amazon Resource Name (ARN) of your model, which is required for converting your data into vector embeddings and for knowledge base retrieval and response generation. Copy the model ID for your chosen model and construct the model ARN from the model (resource) ID, following the provided ARN examples for your model resource type.
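For example, a foundation model ARN combines the Region and the model ID. The following minimal Python sketch assumes the Amazon Titan Text Embeddings V2 model ID; the Region is a placeholder:

```python
# A minimal sketch of constructing a foundation model ARN from a model ID.
# The Region and model ID are illustrative; substitute your own values.
region = "us-east-1"
model_id = "amazon.titan-embed-text-v2:0"  # Amazon Titan Text Embeddings V2

# Foundation model ARNs have no account ID segment.
model_arn = f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
print(model_arn)
# arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0
```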
If you use the Amazon Bedrock console, you are not required to construct a model ARN, as you can select an available model as part of the steps for creating a knowledge base.
Knowledge bases use an embedding model to convert your data into vector embeddings and store the embeddings in a vector database. You can retrieve the data using the Retrieve API operation.
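For example, a minimal Retrieve request with the AWS SDK for Python (boto3) might look like the following sketch; the knowledge base ID and query text are placeholders:

```python
import boto3

# Runtime client for Amazon Bedrock Knowledge Bases operations.
client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

# Query the knowledge base; this returns matching chunks only,
# without a generated answer.
response = client.retrieve(
    knowledgeBaseId="KB12345678",  # placeholder knowledge base ID
    retrievalQuery={"text": "What is our refund policy?"},
    retrievalConfiguration={
        "vectorSearchConfiguration": {"numberOfResults": 5}
    },
)

for result in response["retrievalResults"]:
    print(result["content"]["text"])
```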
To generate search queries from user prompts and to summarize results, you can use the RetrieveAndGenerate API operation. Response generation uses a text or multimodal model. Generating responses after data retrieval is supported with the following throughputs (a request sketch follows the list):
- On-demand – Sends model inference requests to your current Region. The rate or volume of your requests might be limited during peak utilization bursts. Choose on-demand throughput in the console or specify the model ID in a RetrieveAndGenerate request.
- Cross-region inference – Distributes model inference requests across a set of Regions to allow higher throughput and facilitate greater resilience. Specify an inference profile, which defines the regional endpoints to send model invocation requests to, in a RetrieveAndGenerate or CreateDataSource request. For more information, see Set up a model invocation resource using inference profiles.
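As a minimal sketch with the AWS SDK for Python (boto3), the following RetrieveAndGenerate request passes a foundation model ARN for on-demand throughput; swapping in an inference profile ARN selects cross-region inference instead. The knowledge base ID, account ID, and model identifiers are placeholders:

```python
import boto3

client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

# On-demand throughput: pass a foundation model ARN. For cross-region
# inference, pass an inference profile ARN instead (commented out below).
# All IDs shown are placeholders.
model_arn = (
    "arn:aws:bedrock:us-east-1::foundation-model/"
    "anthropic.claude-3-sonnet-20240229-v1:0"
)
# model_arn = ("arn:aws:bedrock:us-east-1:111122223333:inference-profile/"
#              "us.anthropic.claude-3-5-sonnet-20241022-v2:0")

response = client.retrieve_and_generate(
    input={"text": "Summarize our refund policy."},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB12345678",  # placeholder
            "modelArn": model_arn,
        },
    },
)
print(response["output"]["text"])
```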
Important
If you use cross-region inference, your data can be shared across Regions.
Amazon Bedrock Knowledge Bases is supported in the following Regions (for more information about Regions supported in Amazon Bedrock, see Amazon Bedrock endpoints and quotas):
- US East (N. Virginia)
- US East (Ohio)
- US West (Oregon)
- Asia Pacific (Tokyo)
- Asia Pacific (Seoul)
- Asia Pacific (Mumbai)
- Asia Pacific (Singapore) (Gated)
- Asia Pacific (Sydney)
- Canada (Central)
- Europe (Frankfurt)
- Europe (Ireland) (Gated)
- Europe (London)
- Europe (Paris)
- South America (São Paulo)
- AWS GovCloud (US-West)
You can use the following foundation models for creating vector embeddings from a data source (to see which Regions support each model, refer to Supported foundation models in Amazon Bedrock):
- Amazon Titan Embeddings G1 - Text
- Amazon Titan Text Embeddings V2
- Cohere Embed English
- Cohere Embed Multilingual
You can use the following foundation models for querying a knowledge base (to see which Regions support each model, refer to Supported foundation models in Amazon Bedrock):
- AI21 Labs Jamba 1.5 Large
- AI21 Labs Jamba 1.5 Mini
- AI21 Labs Jamba-Instruct
- Amazon Titan Text G1 - Premier
- Anthropic Claude 2.1
- Anthropic Claude 2
- Anthropic Claude 3 Haiku
- Anthropic Claude 3 Sonnet
- Anthropic Claude 3.5 Haiku
- Anthropic Claude 3.5 Sonnet v2
- Anthropic Claude 3.5 Sonnet
- Cohere Command R+
- Cohere Command R
- Meta Llama 3 70B Instruct
- Meta Llama 3 8B Instruct
- Meta Llama 3.1 405B Instruct
- Meta Llama 3.1 70B Instruct
- Meta Llama 3.1 8B Instruct
- Meta Llama 3.2 11B Instruct
- Meta Llama 3.2 90B Instruct
- Mistral AI Mistral Large (24.02)
- Mistral AI Mistral Large (24.07)
- Mistral AI Mistral Small (24.02)
You can also customize the orchestration prompt, which turns the user's prompt into a search query, and the generation prompt, which summarizes results. Knowledge bases provide default prompts for the supported generative AI models listed above. For orchestration prompts, use the $conversation_history$ and $output_format_instructions$ variables to include the conversation text and standardized formatting instructions in the prompt. For generation prompts, use the $search_results$ variable to pull search results into the prompt.
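As a sketch with boto3, a custom generation prompt can be supplied through the generationConfiguration prompt template (an orchestration prompt template is set analogously under orchestrationConfiguration); the knowledge base ID and model ARN are placeholders:

```python
import boto3

client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

# A custom generation prompt; Amazon Bedrock replaces $search_results$
# with the retrieved passages before invoking the model.
generation_template = (
    "Answer the question using only these search results:\n"
    "$search_results$\n"
    "If the results don't contain the answer, say that you don't know."
)

response = client.retrieve_and_generate(
    input={"text": "How long do refunds take?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB12345678",  # placeholder
            "modelArn": (
                "arn:aws:bedrock:us-east-1::foundation-model/"
                "anthropic.claude-3-sonnet-20240229-v1:0"
            ),
            "generationConfiguration": {
                "promptTemplate": {"textPromptTemplate": generation_template}
            },
        },
    },
)
print(response["output"]["text"])
```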
The RetrieveAndGenerate API queries the knowledge base and uses supported Amazon Bedrock knowledge base models to generate responses from the information it retrieves. The Retrieve API only queries the knowledge base; it doesn't generate responses. Therefore, after retrieving results with the Retrieve API, you can use the results in an InvokeModel request with any Amazon Bedrock or SageMaker model to generate responses.
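A minimal sketch of that two-step pattern with boto3 follows; it retrieves passages and forwards them to Anthropic Claude 3 Sonnet through InvokeModel using the Anthropic Messages request format. The knowledge base ID and query are placeholders:

```python
import json
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Step 1: retrieve relevant chunks from the knowledge base (no generation).
retrieval = agent_runtime.retrieve(
    knowledgeBaseId="KB12345678",  # placeholder
    retrievalQuery={"text": "How long do refunds take?"},
)
context = "\n\n".join(
    r["content"]["text"] for r in retrieval["retrievalResults"]
)

# Step 2: pass the retrieved text to a model of your choice via InvokeModel.
# The request body format is model-specific; this uses the Anthropic
# Messages format for Claude 3 models on Amazon Bedrock.
body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "messages": [{
        "role": "user",
        "content": f"Using this context:\n{context}\n\nHow long do refunds take?",
    }],
})
response = bedrock_runtime.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    body=body,
)
print(json.loads(response["body"].read())["content"][0]["text"])
```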
You can use other models, including models that you train on your own data. When you use a custom model, you must specify the orchestration and generation prompts yourself, and your prompts must include the variables described above to access the user's input and context.