Query a knowledge base and generate responses based on the retrieved data
Important
Guardrails are applied only to the input and the generated response from the LLM. They are not applied to the references retrieved from Knowledge Bases at runtime.
After your knowledge base is set up, you can query it and generate responses based on the chunks retrieved from your source data by using the RetrieveAndGenerate API operation. The responses are returned with citations to the original source data. You can also use a reranking model instead of the default Amazon Bedrock Knowledge Bases ranker to rank source chunks for relevance during retrieval.
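As a rough illustration of the call described above, the following Python sketch builds a RetrieveAndGenerate request and pulls the answer and cited source text out of a response. It assumes the AWS SDK for Python (boto3) and its bedrock-agent-runtime client. The helper names, the default region, and the default number of results are illustrative assumptions, and the knowledge base ID and model ARN are placeholders that you supply. Reranking configuration is left out.

```python
from typing import Any, Dict, List, Tuple


def build_rag_request(query: str, knowledge_base_id: str, model_arn: str,
                      number_of_results: int = 5) -> Dict[str, Any]:
    """Build keyword arguments for retrieve_and_generate (hypothetical helper)."""
    return {
        "input": {"text": query},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
                "retrievalConfiguration": {
                    # Maximum number of source chunks to retrieve.
                    "vectorSearchConfiguration": {
                        "numberOfResults": number_of_results
                    }
                },
            },
        },
    }


def extract_answer_and_sources(response: Dict[str, Any]) -> Tuple[str, List[str]]:
    """Return the generated text and the text of each cited source chunk."""
    answer = response.get("output", {}).get("text", "")
    sources: List[str] = []
    for citation in response.get("citations", []):
        for reference in citation.get("retrievedReferences", []):
            sources.append(reference.get("content", {}).get("text", ""))
    return answer, sources


def query_knowledge_base(query: str, knowledge_base_id: str, model_arn: str,
                         region: str = "us-east-1") -> Tuple[str, List[str]]:
    """Call the service. Requires boto3 and AWS credentials (assumed setup)."""
    import boto3  # imported here so the helpers above work without the SDK

    client = boto3.client("bedrock-agent-runtime", region_name=region)
    response = client.retrieve_and_generate(
        **build_rag_request(query, knowledge_base_id, model_arn)
    )
    return extract_answer_and_sources(response)
```

Keeping the request builder separate from the network call makes it easy to inspect or log the exact payload before sending it.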
Multimodal content limitations
RetrieveAndGenerate has limited support for multimodal content. When using Nova Multimodal Embeddings, RAG functionality is restricted to text content only. For full multimodal support, including audio and video processing, use Amazon Bedrock Data Automation (BDA) with text embedding models. For details, see Build a knowledge base for multimodal content.
Note
Images returned from the Retrieve response during the RetrieveAndGenerate flow are included in the prompt for response generation. The RetrieveAndGenerate response can't include images, but it can cite the sources that contain the images.
Note
If you receive an error that the prompt exceeds the character limit while generating responses, you can shorten the prompt in the following ways:
- Reduce the maximum number of retrieved results (this shortens what is filled in for the $search_results$ placeholder in the Knowledge base prompt templates: orchestration & generation).
- Recreate the data source with a chunking strategy that uses smaller chunks (this shortens what is filled in for the $search_results$ placeholder in the Knowledge base prompt templates: orchestration & generation).
- Shorten the prompt template.
- Shorten the user query (this shortens what is filled in for the $query$ placeholder in the Knowledge base prompt templates: orchestration & generation).
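To see why these options shorten the prompt, the following Python sketch fills the $search_results$ and $query$ placeholders in a template and drops retrieved chunks until the filled prompt fits under a limit. It is only a local approximation: the function names, the newline joining of chunks, and the max_chars value are assumptions, and the service's actual formatting and character limit may differ.

```python
from typing import List


def fill_template(template: str, query: str, chunks: List[str]) -> str:
    """Substitute the two placeholders (chunk joining is an assumption)."""
    search_results = "\n".join(chunks)
    return (template
            .replace("$search_results$", search_results)
            .replace("$query$", query))


def fit_chunks(template: str, query: str, chunks: List[str],
               max_chars: int) -> List[str]:
    """Drop chunks from the end until the filled prompt fits max_chars.

    Assumes chunks are ordered from most to least relevant, so the least
    relevant chunks are removed first. max_chars is a hypothetical limit.
    """
    kept = list(chunks)
    while kept and len(fill_template(template, query, kept)) > max_chars:
        kept.pop()
    return kept


if __name__ == "__main__":
    template = "Context:\n$search_results$\nQuestion: $query$"
    chunks = ["a" * 50, "b" * 50, "c" * 50]
    kept = fit_chunks(template, "What?", chunks, max_chars=100)
    print(len(kept), len(fill_template(template, "What?", kept)))
```

Fewer or smaller chunks shrink the $search_results$ substitution, a shorter query shrinks the $query$ substitution, and a shorter template shrinks the fixed text around both.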