Retrieving passages
You can use the Retrieve API as a retriever for retrieval augmented generation (RAG) systems.
RAG systems use generative artificial intelligence to build question-answering applications. A RAG system consists of a retriever and a large language model (LLM). Given a query, the retriever identifies the most relevant chunks of text from a corpus of documents and feeds them to the LLM. The LLM then analyzes these text chunks, or passages, and generates a comprehensive response to the query.
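The hand-off from retriever to LLM described above can be sketched as a small prompt-assembly step. This is a minimal illustration, not a prescribed method: the function name `build_rag_prompt`, the `max_passages` parameter, and the prompt wording are all assumptions made for the example. The `DocumentTitle` and `Content` keys mirror fields returned in Retrieve API result items.

```python
from typing import Dict, List


def build_rag_prompt(query: str, passages: List[Dict[str, str]], max_passages: int = 5) -> str:
    """Assemble a prompt that asks the LLM to answer from retrieved passages.

    Hypothetical helper: the prompt wording is illustrative only.
    """
    lines = ["Answer the question using only the passages below.", ""]
    # Number each passage so the LLM (and the reader) can refer back to it.
    for number, passage in enumerate(passages[:max_passages], start=1):
        title = passage.get("DocumentTitle", "untitled")
        lines.append(f"[{number}] {title}: {passage['Content']}")
    lines += ["", f"Question: {query}", "Answer:"]
    return "\n".join(lines)
```

The resulting string would then be sent to whichever LLM the application uses. Capping the number of passages is one simple way to keep the prompt within a model's context window.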
The Retrieve API looks at chunks of text or excerpts that are referred to as passages and returns the top passages that are most relevant to the query.
Like the Query API, the Retrieve API also searches for relevant information. The Retrieve API's information retrieval takes into account the query's context and all the available information from the indexed documents. However, by default, the Query API only returns excerpt passages of up to 100 token words. With the Retrieve API, you can retrieve longer passages of up to 200 token words and up to 100 semantically relevant passages. This doesn't include question-answer or FAQ type responses from your index. The passages, also called chunks, are text excerpts that can be semantically extracted from multiple documents and multiple parts of the same document. Kendra's GenAI Enterprise Edition index offers high-accuracy results for the Retrieve API, using a hybrid search over vector and keyword indices along with ranking by deep learning models.
You can also do the following with the Retrieve API:
- Override boosting at the index level
- Filter based on document fields or attributes
- Filter based on the user or their group access to documents
- View the confidence score bucket for a retrieved passage result. The confidence bucket provides a relative ranking that indicates how confident Amazon Kendra is that the response is relevant to the query.
Note
Confidence score buckets are currently available only for English.
You can also include certain fields in the response that might provide useful additional information.
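The options listed above map onto optional request parameters of the Retrieve API. The following sketch builds the keyword arguments for a Retrieve call. The helper name `build_retrieve_request` and all of its parameters are hypothetical. The request keys (`AttributeFilter`, `UserContext`, `DocumentRelevanceOverrideConfigurations`, `RequestedDocumentAttributes`) follow the Retrieve API request shape as understood here, and filtering on the reserved `_language_code` field is just one possible filter. Check the API reference before relying on any of them.

```python
from typing import Any, Dict, List, Optional


def build_retrieve_request(
    index_id: str,
    query_text: str,
    language_code: Optional[str] = None,
    user_id: Optional[str] = None,
    groups: Optional[List[str]] = None,
    boost_field: Optional[str] = None,
    boost_importance: int = 5,
    extra_fields: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a Retrieve call (hypothetical helper)."""
    request: Dict[str, Any] = {"IndexId": index_id, "QueryText": query_text}

    # Filter based on a document field or attribute.
    if language_code:
        request["AttributeFilter"] = {
            "EqualsTo": {"Key": "_language_code", "Value": {"StringValue": language_code}}
        }

    # Filter based on the user or their group access to documents.
    if user_id or groups:
        user_context: Dict[str, Any] = {}
        if user_id:
            user_context["UserId"] = user_id
        if groups:
            user_context["Groups"] = list(groups)
        request["UserContext"] = user_context

    # Override boosting that was configured at the index level.
    if boost_field:
        request["DocumentRelevanceOverrideConfigurations"] = [
            {"Name": boost_field, "Relevance": {"Importance": boost_importance}}
        ]

    # Ask for additional document fields in the response.
    if extra_fields:
        request["RequestedDocumentAttributes"] = list(extra_fields)

    return request
```

The returned dictionary could be unpacked into a client call, for example `client.retrieve(**request)`. Building the request separately keeps each option easy to inspect before any call is made.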
The Retrieve API currently doesn't support the following features: querying using advanced query syntax, suggested spell corrections for queries, faceting, query suggestions to autocomplete search queries, and incremental learning. Retrieve API queries don't appear in the analytics dashboard.
The Retrieve API shares the number of query capacity units that you set for your index. For more information on what's included in a single capacity unit and the default base capacity for an index, see Adjusting capacity.
Note
You can't add capacity if you are using the Amazon Kendra Developer Edition; you can only add capacity when using Amazon Kendra Enterprise Edition. For more information on what's included in the Developer and Enterprise Editions, see Amazon Kendra Editions.
The following is an example of using the Retrieve API to retrieve the top 100 most relevant passages from documents in an index for the query "how does amazon kendra work?"