BatchPutDocument - Amazon Kendra

BatchPutDocument

Adds one or more documents to an index.

The BatchPutDocument operation enables you to ingest inline documents or a set of documents stored in an Amazon S3 bucket. Use this operation to ingest your text and unstructured text into an index, add custom attributes to the documents, and attach an access control list to the documents added to the index.

The documents are indexed asynchronously. You can monitor the progress of the batch using Amazon CloudWatch. Any error messages related to processing the batch are sent to your Amazon CloudWatch log.
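As a rough illustration of the call shape, the sketch below submits one inline document and one S3-backed document. It assumes a client object exposing a `batch_put_document` method whose keyword arguments are named after the request fields, as the boto3 Kendra client does. The index ID, role ARN, bucket name, and the `StubClient` class are hypothetical placeholders so that the example runs without AWS credentials.

```python
from typing import Any, Dict, List


def put_documents(client: Any, index_id: str, role_arn: str,
                  documents: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Submit one batch and return the FailedDocuments list (possibly empty)."""
    response = client.batch_put_document(
        IndexId=index_id,
        RoleArn=role_arn,
        Documents=documents,
    )
    # Only validation failures are reported here; indexing runs asynchronously.
    return response.get("FailedDocuments", [])


class StubClient:
    """Hypothetical stand-in for a real Kendra client; it records each request."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    def batch_put_document(self, **kwargs: Any) -> Dict[str, Any]:
        self.calls.append(kwargs)
        return {"FailedDocuments": []}


documents = [
    # Inline document: the content travels in the request as a blob.
    {"Id": "doc-inline-1", "Title": "Inline note", "Blob": b"Hello, Kendra."},
    # S3 document: the service reads the object from the bucket.
    {"Id": "doc-s3-1", "S3Path": {"Bucket": "example-bucket", "Key": "docs/a.pdf"}},
]

stub = StubClient()
failed = put_documents(
    stub,
    "12345678-1234-1234-1234-123456789012",        # placeholder index ID
    "arn:aws:iam::123456789012:role/ExampleRole",  # placeholder role ARN
    documents,
)
```

With a real client in place of `stub`, an empty `failed` list would only mean that every document passed validation, not that indexing has finished.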

Request Syntax

{
   "Documents": [
      {
         "AccessControlList": [
            {
               "Access": "string",
               "Name": "string",
               "Type": "string"
            }
         ],
         "Attributes": [
            {
               "Key": "string",
               "Value": {
                  "DateValue": number,
                  "LongValue": number,
                  "StringListValue": [ "string" ],
                  "StringValue": "string"
               }
            }
         ],
         "Blob": blob,
         "ContentType": "string",
         "Id": "string",
         "S3Path": {
            "Bucket": "string",
            "Key": "string"
         },
         "Title": "string"
      }
   ],
   "IndexId": "string",
   "RoleArn": "string"
}
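To make the nesting of the request syntax concrete, here is a minimal sketch that builds one inline document entry with custom attributes and an access control list. The helper names `attribute` and `inline_document` are hypothetical. The literal values `"PLAIN_TEXT"`, `"ALLOW"`, and `"GROUP"` are assumed enum values that the syntax above only lists as "string", and `DateValue` is left out for brevity.

```python
from typing import Any, Dict, List, Optional, Union

AttributeValue = Union[str, int, List[str]]


def attribute(key: str, value: AttributeValue) -> Dict[str, Any]:
    """Wrap a Python value in the matching *Value field of the request syntax."""
    if isinstance(value, bool):
        raise TypeError("bool is not a supported attribute value in this sketch")
    if isinstance(value, str):
        return {"Key": key, "Value": {"StringValue": value}}
    if isinstance(value, int):
        return {"Key": key, "Value": {"LongValue": value}}
    if isinstance(value, list) and all(isinstance(v, str) for v in value):
        return {"Key": key, "Value": {"StringListValue": value}}
    raise TypeError(f"unsupported attribute value: {value!r}")


def inline_document(doc_id: str, text: str, title: Optional[str] = None,
                    attributes: Optional[Dict[str, AttributeValue]] = None,
                    acl: Optional[List[Dict[str, str]]] = None) -> Dict[str, Any]:
    """Build one entry of the Documents array for an inline text document."""
    doc: Dict[str, Any] = {
        "Id": doc_id,
        "Blob": text.encode("utf-8"),
        "ContentType": "PLAIN_TEXT",  # assumed enum value
    }
    if title is not None:
        doc["Title"] = title
    if attributes:
        doc["Attributes"] = [attribute(k, v) for k, v in attributes.items()]
    if acl:
        doc["AccessControlList"] = acl
    return doc


doc = inline_document(
    "doc-1",
    "Quarterly results summary.",
    title="Q3 summary",
    attributes={"department": "finance", "year": 2020, "tags": ["report", "q3"]},
    # Assumed values for Access and Type; Name is a placeholder group name.
    acl=[{"Access": "ALLOW", "Name": "finance-team", "Type": "GROUP"}],
)
```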

Request Parameters

For information about the parameters that are common to all actions, see Common Parameters.

The request accepts the following data in JSON format.

Documents

One or more documents to add to the index.

Documents have the following file size limits.

  • 5 MB total size for inline documents

  • 50 MB total size for files from an S3 bucket

  • 5 MB extracted text for any file

For more information about file size and transaction per second quotas, see Quotas.

Type: Array of Document objects

Array Members: Minimum number of 1 item. Maximum number of 10 items.

Required: Yes
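Because a single request accepts at most 10 documents, a larger collection has to be split across several calls. The sketch below shows one way to do that and to flag oversized inline documents before sending. The helper names are hypothetical, and the byte limit assumes that "5 MB" means 5 × 1024 × 1024 bytes, which the text above does not specify.

```python
from typing import Any, Dict, Iterator, List

MAX_DOCUMENTS_PER_REQUEST = 10
MAX_INLINE_BYTES = 5 * 1024 * 1024  # assumption: "5 MB" read as 5 MiB


def batches(documents: List[Dict[str, Any]],
            size: int = MAX_DOCUMENTS_PER_REQUEST) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of at most `size` documents."""
    if not 1 <= size <= MAX_DOCUMENTS_PER_REQUEST:
        raise ValueError("size must be between 1 and 10")
    for start in range(0, len(documents), size):
        yield documents[start:start + size]


def oversized_inline_ids(documents: List[Dict[str, Any]]) -> List[str]:
    """Return the Ids of inline documents whose Blob exceeds the assumed limit."""
    return [
        doc["Id"]
        for doc in documents
        if len(doc.get("Blob", b"")) > MAX_INLINE_BYTES
    ]


docs = [{"Id": f"doc-{n}", "Blob": b"x"} for n in range(23)]
batch_sizes = [len(batch) for batch in batches(docs)]
```

Here 23 documents are split into requests of 10, 10, and 3 documents.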

IndexId

The identifier of the index to add the documents to. You need to create the index first using the CreateIndex operation.

Type: String

Length Constraints: Fixed length of 36.

Pattern: [a-zA-Z0-9][a-zA-Z0-9-]*

Required: Yes
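The length and pattern constraints above can be checked on the client side before a request is sent. This is a minimal sketch; the function name `is_valid_index_id` and the sample identifiers are made up for illustration.

```python
import re

# Pattern copied from the IndexId constraints above.
INDEX_ID_PATTERN = re.compile(r"[a-zA-Z0-9][a-zA-Z0-9-]*")


def is_valid_index_id(index_id: str) -> bool:
    """Check the fixed length of 36 and the documented character pattern."""
    return len(index_id) == 36 and INDEX_ID_PATTERN.fullmatch(index_id) is not None


ok = is_valid_index_id("12345678-1234-1234-1234-123456789012")
too_short = is_valid_index_id("12345678-1234")
leading_hyphen = is_valid_index_id("-2345678-1234-1234-1234-123456789012")
```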

RoleArn

The Amazon Resource Name (ARN) of a role that is allowed to run the BatchPutDocument operation. For more information, see IAM Roles for Amazon Kendra.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 1284.

Pattern: arn:[a-z0-9-\.]{1,63}:[a-z0-9-\.]{0,63}:[a-z0-9-\.]{0,63}:[a-z0-9-\.]{0,63}:[^/].{0,1023}

Required: No
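The same kind of client-side check can be sketched for RoleArn, using the pattern and length limits listed above verbatim. The function name and the sample ARNs are hypothetical, and passing this check says nothing about whether the role exists or carries the required permissions.

```python
import re

# Pattern copied from the RoleArn constraints above, split across two lines.
ROLE_ARN_PATTERN = re.compile(
    r"arn:[a-z0-9-\.]{1,63}:[a-z0-9-\.]{0,63}:[a-z0-9-\.]{0,63}:"
    r"[a-z0-9-\.]{0,63}:[^/].{0,1023}"
)


def is_valid_role_arn(role_arn: str) -> bool:
    """Check the documented length range (1 to 1284) and the ARN pattern."""
    if not 1 <= len(role_arn) <= 1284:
        return False
    return ROLE_ARN_PATTERN.fullmatch(role_arn) is not None


good = is_valid_role_arn("arn:aws:iam::123456789012:role/ExampleRole")
missing_prefix = is_valid_role_arn("role/ExampleRole")
slash_first = is_valid_role_arn("arn:aws:iam::123456789012:/ExampleRole")
```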

Response Syntax

{
   "FailedDocuments": [
      {
         "ErrorCode": "string",
         "ErrorMessage": "string",
         "Id": "string"
      }
   ]
}

Response Elements

If the action is successful, the service sends back an HTTP 200 response.

The following data is returned in JSON format by the service.

FailedDocuments

A list of documents that were not added to the index because the document failed a validation check. Each document contains an error message that indicates why the document couldn't be added to the index.

If there was an error adding a document to an index, the error is reported in your Amazon CloudWatch log. For more information, see Monitoring Amazon Kendra with Amazon CloudWatch Logs.

Type: Array of BatchPutDocumentResponseFailedDocument objects
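One way to act on this response is to separate the submitted document IDs into those that passed validation and those that were rejected. The sketch below assumes the response has already been parsed into a Python dictionary. The function name `split_results` and the sample error values are illustrative, not taken from a real response.

```python
from typing import Any, Dict, List, Tuple


def split_results(
    submitted_ids: List[str], response: Dict[str, Any]
) -> Tuple[List[str], Dict[str, Tuple[str, str]]]:
    """Return (accepted Ids, {failed Id: (ErrorCode, ErrorMessage)})."""
    failed = {
        item["Id"]: (item.get("ErrorCode", ""), item.get("ErrorMessage", ""))
        for item in response.get("FailedDocuments", [])
    }
    accepted = [doc_id for doc_id in submitted_ids if doc_id not in failed]
    return accepted, failed


# Illustrative response shaped like the Response Syntax above.
sample_response = {
    "FailedDocuments": [
        {"Id": "doc-2", "ErrorCode": "InvalidRequest", "ErrorMessage": "Document too large"}
    ]
}
accepted, failed = split_results(["doc-1", "doc-2", "doc-3"], sample_response)
```

Documents in `accepted` have only passed the validation check; errors that occur later during asynchronous indexing appear in the CloudWatch log rather than in this response.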

Errors

For information about the errors that are common to all actions, see Common Errors.

AccessDeniedException

HTTP Status Code: 400

ConflictException

HTTP Status Code: 400

InternalServerException

HTTP Status Code: 500

ResourceNotFoundException

HTTP Status Code: 400

ServiceQuotaExceededException

HTTP Status Code: 400

ThrottlingException

HTTP Status Code: 400

ValidationException

HTTP Status Code: 400
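A caller may want to treat some of these errors as transient. The sketch below records the status codes listed above and pairs them with an exponential backoff schedule. Treating ThrottlingException and InternalServerException as retryable is an assumption made for this example, not something stated above, and the helper names and delay values are hypothetical.

```python
from typing import List

# Status codes as listed in the Errors section above.
HTTP_STATUS = {
    "AccessDeniedException": 400,
    "ConflictException": 400,
    "InternalServerException": 500,
    "ResourceNotFoundException": 400,
    "ServiceQuotaExceededException": 400,
    "ThrottlingException": 400,
    "ValidationException": 400,
}

# Assumption for this sketch: only these two are worth retrying.
RETRYABLE = {"ThrottlingException", "InternalServerException"}


def is_retryable(error_code: str) -> bool:
    """Return True if the sketch's retry policy applies to this error name."""
    return error_code in RETRYABLE


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 8.0) -> List[float]:
    """Exponential backoff delays in seconds, doubled each attempt and capped."""
    return [min(cap, base * (2 ** n)) for n in range(attempts)]


delays = backoff_delays(6)
```

A retry loop would sleep for each delay in turn and give up once the list is exhausted or a non-retryable error is raised.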
