RetrieveAndGenerate
Queries a knowledge base and generates responses based on the retrieved results. The response only cites sources that are relevant to the query.
Request Syntax
POST /retrieveAndGenerate HTTP/1.1
Content-type: application/json
{
   "input": {
      "text": "string"
   },
   "retrieveAndGenerateConfiguration": {
      "externalSourcesConfiguration": {
         "generationConfiguration": {
            "additionalModelRequestFields": {
               "string" : JSON value
            },
            "guardrailConfiguration": {
               "guardrailId": "string",
               "guardrailVersion": "string"
            },
            "inferenceConfig": {
               "textInferenceConfig": {
                  "maxTokens": number,
                  "stopSequences": [ "string" ],
                  "temperature": number,
                  "topP": number
               }
            },
            "promptTemplate": {
               "textPromptTemplate": "string"
            }
         },
         "modelArn": "string",
         "sources": [
            {
               "byteContent": {
                  "contentType": "string",
                  "data": blob,
                  "identifier": "string"
               },
               "s3Location": {
                  "uri": "string"
               },
               "sourceType": "string"
            }
         ]
      },
      "knowledgeBaseConfiguration": {
         "generationConfiguration": {
            "additionalModelRequestFields": {
               "string" : JSON value
            },
            "guardrailConfiguration": {
               "guardrailId": "string",
               "guardrailVersion": "string"
            },
            "inferenceConfig": {
               "textInferenceConfig": {
                  "maxTokens": number,
                  "stopSequences": [ "string" ],
                  "temperature": number,
                  "topP": number
               }
            },
            "promptTemplate": {
               "textPromptTemplate": "string"
            }
         },
         "knowledgeBaseId": "string",
         "modelArn": "string",
         "retrievalConfiguration": {
            "vectorSearchConfiguration": {
               "filter": { ... },
               "numberOfResults": number,
               "overrideSearchType": "string"
            }
         }
      },
      "type": "string"
   },
   "sessionConfiguration": {
      "kmsKeyArn": "string"
   },
   "sessionId": "string"
}
URI Request Parameters
The request does not use any URI parameters.
Request Body
The request accepts the following data in JSON format.
- input
Contains the query to be made to the knowledge base.
Type: RetrieveAndGenerateInput object
Required: Yes
- retrieveAndGenerateConfiguration
Contains configurations for the knowledge base query and retrieval process. For more information, see Query configurations.
Type: RetrieveAndGenerateConfiguration object
Required: No
- sessionConfiguration
Contains details about the session with the knowledge base.
Type: RetrieveAndGenerateSessionConfiguration object
Required: No
- sessionId
The unique identifier of the session. Reuse the same value to continue the same session with the knowledge base.
Type: String
Length Constraints: Minimum length of 2. Maximum length of 100.
Pattern: ^[0-9a-zA-Z._:-]+$
Required: No
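As a rough illustration of the request body described above, the following Python sketch assembles the JSON payload and checks sessionId against the documented length limits and pattern before it is sent. The helper names build_request and is_valid_session_id are hypothetical and are not part of any AWS SDK; only the field names and constraints come from this page.

```python
import re
from typing import Optional

# Pattern copied from the sessionId description above.
SESSION_ID_PATTERN = re.compile(r"^[0-9a-zA-Z._:-]+$")


def is_valid_session_id(session_id: str) -> bool:
    """Return True if session_id meets the documented length and pattern limits."""
    if not 2 <= len(session_id) <= 100:
        return False
    # fullmatch avoids accepting a trailing newline, which "$" alone would allow.
    return SESSION_ID_PATTERN.fullmatch(session_id) is not None


def build_request(text: str, knowledge_base_id: str, model_arn: str,
                  session_id: Optional[str] = None) -> dict:
    """Assemble a RetrieveAndGenerate request body (hypothetical helper)."""
    body = {
        "input": {"text": text},  # the only required top-level field
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }
    if session_id is not None:
        if not is_valid_session_id(session_id):
            raise ValueError(f"invalid sessionId: {session_id!r}")
        body["sessionId"] = session_id  # reuse to continue an earlier session
    return body
```

Validating on the client side is optional; the service reports invalid input itself, but a local check can surface a malformed identifier before a network call is made.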
Response Syntax
HTTP/1.1 200
Content-type: application/json
{
   "citations": [
      {
         "generatedResponsePart": {
            "textResponsePart": {
               "span": {
                  "end": number,
                  "start": number
               },
               "text": "string"
            }
         },
         "retrievedReferences": [
            {
               "content": {
                  "text": "string"
               },
               "location": {
                  "s3Location": {
                     "uri": "string"
                  },
                  "type": "string"
               },
               "metadata": {
                  "string" : JSON value
               }
            }
         ]
      }
   ],
   "guardrailAction": "string",
   "output": {
      "text": "string"
   },
   "sessionId": "string"
}
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- citations
A list of segments of the generated response that are based on sources in the knowledge base, alongside information about the sources.
Type: Array of Citation objects
- guardrailAction
Specifies if there is a guardrail intervention in the response.
Type: String
Valid Values: INTERVENED | NONE
- output
Contains the response generated from querying the knowledge base.
Type: RetrieveAndGenerateOutput object
- sessionId
The unique identifier of the session. Reuse the same value to continue the same session with the knowledge base.
Type: String
Length Constraints: Minimum length of 2. Maximum length of 100.
Pattern: ^[0-9a-zA-Z._:-]+$
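To show how the response elements above fit together, the following Python sketch walks a parsed response and pairs each cited segment of the generated text with the S3 URIs of its sources. The function names and the sample response are made up for illustration; the sample values are not real service output.

```python
from typing import Dict, List


def summarize_citations(response: dict) -> List[Dict[str, object]]:
    """Pair each cited span of the generated text with its source URIs."""
    summary = []
    for citation in response.get("citations", []):
        part = citation.get("generatedResponsePart", {}).get("textResponsePart", {})
        span = part.get("span", {})
        uris = [
            ref.get("location", {}).get("s3Location", {}).get("uri")
            for ref in citation.get("retrievedReferences", [])
        ]
        summary.append({
            "text": part.get("text"),
            "span": (span.get("start"), span.get("end")),
            "sources": [uri for uri in uris if uri is not None],
        })
    return summary


def guardrail_intervened(response: dict) -> bool:
    """Return True if the response reports a guardrail intervention."""
    return response.get("guardrailAction") == "INTERVENED"


# Made-up response used only to exercise the helpers above.
sample_response = {
    "citations": [
        {
            "generatedResponsePart": {
                "textResponsePart": {
                    "span": {"start": 0, "end": 23},
                    "text": "AWS is a cloud platform.",
                }
            },
            "retrievedReferences": [
                {
                    "content": {"text": "Amazon Web Services (AWS) is a cloud platform."},
                    "location": {
                        "s3Location": {"uri": "s3://example-bucket/overview.pdf"},
                        "type": "S3",
                    },
                    "metadata": {},
                }
            ],
        }
    ],
    "guardrailAction": "NONE",
    "output": {"text": "AWS is a cloud platform."},
    "sessionId": "session-01",
}
```

The sessionId returned here is the value to pass back in the next request when continuing the same session.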
Errors
For information about the errors that are common to all actions, see Common Errors.
- AccessDeniedException
The request is denied because of missing access permissions. Check your permissions and retry your request.
HTTP Status Code: 403
- BadGatewayException
There was an issue with a dependency due to a server issue. Retry your request.
HTTP Status Code: 502
- ConflictException
There was a conflict performing an operation. Resolve the conflict and retry your request.
HTTP Status Code: 409
- DependencyFailedException
There was an issue with a dependency. Check the resource configurations and retry the request.
HTTP Status Code: 424
- InternalServerException
An internal server error occurred. Retry your request.
HTTP Status Code: 500
- ResourceNotFoundException
The resource specified by the Amazon Resource Name (ARN) was not found. Check the ARN and retry your request.
HTTP Status Code: 404
- ServiceQuotaExceededException
The number of requests exceeds the service quota. Resubmit your request later.
HTTP Status Code: 400
- ThrottlingException
The number of requests exceeds the limit. Resubmit your request later.
HTTP Status Code: 429
- ValidationException
Input validation failed. Check your request parameters and retry the request.
HTTP Status Code: 400
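One possible reading of the error descriptions above is that some errors can be retried unchanged after a delay, while others require fixing permissions, configuration, or input first. The Python sketch below encodes that reading; the grouping, the function names, and the backoff parameters are assumptions made for illustration and are not prescribed by this page.

```python
import random

# Errors whose descriptions above say only "Retry" or "Resubmit ... later".
# Treating them as retryable without changes is an interpretation, not a rule.
RETRYABLE_ERRORS = {
    "BadGatewayException",
    "InternalServerException",
    "ServiceQuotaExceededException",
    "ThrottlingException",
}


def should_retry(error_code: str) -> bool:
    """Return True if the error may succeed on a later, unchanged attempt."""
    return error_code in RETRYABLE_ERRORS


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 20.0) -> float:
    """Exponential backoff with full jitter: a random delay in seconds
    between 0 and min(cap, base * 2**attempt). Parameter values are arbitrary."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

Errors outside this set, such as ValidationException or AccessDeniedException, would be surfaced to the caller, because repeating the same request is unlikely to change the outcome.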
Examples
Send a basic query
The following example uses the minimally required fields to generate a response after querying a knowledge base.
POST /retrieveAndGenerate HTTP/1.1
Content-type: application/json
{
   "input": {
      "text": "What is AWS?"
   },
   "retrieveAndGenerateConfiguration": {
      "knowledgeBaseConfiguration": {
         "knowledgeBaseId": "KB12345678",
         "modelArn": "anthropic.claude-v2:1"
      },
      "type": "KNOWLEDGE_BASE"
   }
}
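The same basic query could be sent from Python roughly as follows. This is a minimal sketch that assumes the boto3 SDK is installed and that credentials and a Region are already configured; the bedrock-agent-runtime client and its retrieve_and_generate method come from boto3 rather than from this page, and ask_knowledge_base and make_client are hypothetical helper names.

```python
def ask_knowledge_base(client, question: str, knowledge_base_id: str, model_arn: str) -> str:
    """Send a minimal RetrieveAndGenerate request and return the generated text.

    `client` is expected to expose retrieve_and_generate(**kwargs), as the
    boto3 bedrock-agent-runtime client does.
    """
    response = client.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    )
    return response["output"]["text"]


def make_client():
    """Create the runtime client (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so ask_knowledge_base can be tested without the SDK
    return boto3.client("bedrock-agent-runtime")
```

A call would then look like ask_knowledge_base(make_client(), "What is AWS?", "KB12345678", "anthropic.claude-v2:1"), using the placeholder identifiers from the example above.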
Send a query and include filters
To include filters in a knowledge base query, at least one of the data source files must be accompanied by a .metadata.json file. For example, if you had a data source of articles called articles.pdf, accompanied by a metadata file called articles.metadata.json, you could tag it for genre, year, and author. In the RetrieveAndGenerate request, you could apply the following filter to return all entertainment articles written after 2018, in addition to cooking or sports articles written by authors whose names start with C.
POST /retrieveAndGenerate HTTP/1.1
Content-type: application/json
{
   "input": {
      "text": "What is AWS?"
   },
   "retrieveAndGenerateConfiguration": {
      "knowledgeBaseConfiguration": {
         "knowledgeBaseId": "KB12345678",
         "modelArn": "anthropic.claude-v2:1",
         "retrievalConfiguration": {
            "vectorSearchConfiguration": {
               "numberOfResults": 5,
               "filter": {
                  "orAll": [
                     {
                        "andAll": [
                           { "equals": { "key": "genre", "value": "entertainment" } },
                           { "greaterThan": { "key": "year", "value": 2018 } }
                        ]
                     },
                     {
                        "andAll": [
                           { "in": { "key": "genre", "value": ["cooking", "sports"] } },
                           { "startsWith": { "key": "author", "value": "C" } }
                        ]
                     }
                  ]
               }
            }
         }
      },
      "type": "KNOWLEDGE_BASE"
   }
}
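To make the boolean structure of the filter above easier to follow, the Python sketch below evaluates the same filter locally against sample metadata. It is only an illustration of how orAll, andAll, and the four comparison operators in the example combine; the service applies filters during retrieval, and its handling of missing keys or mismatched types may differ. The matches function and the sample metadata are hypothetical.

```python
from typing import Any, Dict

# The filter from the example request above, as a Python dict.
EXAMPLE_FILTER = {
    "orAll": [
        {"andAll": [
            {"equals": {"key": "genre", "value": "entertainment"}},
            {"greaterThan": {"key": "year", "value": 2018}},
        ]},
        {"andAll": [
            {"in": {"key": "genre", "value": ["cooking", "sports"]}},
            {"startsWith": {"key": "author", "value": "C"}},
        ]},
    ]
}


def matches(filter_expr: Dict[str, Any], metadata: Dict[str, Any]) -> bool:
    """Evaluate a filter expression against one document's metadata (local sketch)."""
    op, arg = next(iter(filter_expr.items()))  # each expression has a single operator
    if op == "andAll":
        return all(matches(sub, metadata) for sub in arg)
    if op == "orAll":
        return any(matches(sub, metadata) for sub in arg)
    value = metadata.get(arg["key"])
    if value is None:  # assumption: a missing key never matches
        return False
    if op == "equals":
        return value == arg["value"]
    if op == "greaterThan":
        return value > arg["value"]
    if op == "in":
        return value in arg["value"]
    if op == "startsWith":
        return str(value).startswith(arg["value"])
    raise ValueError(f"operator not covered by this sketch: {op}")
```

For instance, metadata of {"genre": "sports", "year": 2001, "author": "Carl"} satisfies the second andAll branch, while an entertainment article from exactly 2018 satisfies neither branch because greaterThan is strict.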
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: