S3DataSourceConfiguration - Amazon Kendra

S3DataSourceConfiguration

Provides configuration information for a data source to index documents in an Amazon S3 bucket.

Contents

AccessControlListConfiguration

Provides the path to the S3 bucket that contains the user context filtering files for the data source. For the format of the file, see Access control for S3 data sources.

Type: AccessControlListConfiguration object

Required: No

BucketName

The name of the bucket that contains the documents.

Type: String

Length Constraints: Minimum length of 3. Maximum length of 63.

Pattern: [a-z0-9][\.\-a-z0-9]{1,61}[a-z0-9]

Required: Yes
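
The length constraints and pattern above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name is_valid_bucket_name is hypothetical, and anchoring the pattern to the whole string with re.fullmatch is an assumption about how the service applies it. Amazon S3 may enforce further naming rules that this check does not cover.

```python
import re

# Pattern copied from the BucketName constraints on this page.
BUCKET_NAME_PATTERN = re.compile(r"[a-z0-9][\.\-a-z0-9]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if name satisfies the documented length and pattern.

    Hypothetical helper; assumes the pattern must match the entire string.
    """
    return 3 <= len(name) <= 63 and BUCKET_NAME_PATTERN.fullmatch(name) is not None
```

For example, is_valid_bucket_name("my-docs.bucket") passes, while names that contain uppercase letters or underscores, or that start with a hyphen, do not.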

DocumentsMetadataConfiguration

Specifies the document metadata files that contain information such as document access control information, source URI, document author, and custom attributes. Each metadata file contains metadata about a single document.

Type: DocumentsMetadataConfiguration object

Required: No

ExclusionPatterns

A list of glob patterns for documents that should not be indexed. If a document that matches an inclusion prefix or inclusion pattern also matches an exclusion pattern, the document is not indexed.

Some examples are:

  • *.png, *.jpg will exclude all PNG and JPEG image files in a directory (files with the extensions .png and .jpg).

  • *internal* will exclude all files in a directory that contain 'internal' in the file name, such as 'internal', 'internal_only', 'company_internal'.

  • **/*internal* will exclude all internal-related files in a directory and its subdirectories.

Type: Array of strings

Array Members: Minimum number of 0 items. Maximum number of 100 items.

Length Constraints: Minimum length of 1. Maximum length of 150.

Required: No

InclusionPatterns

A list of glob patterns for documents that should be indexed. If a document that matches an inclusion pattern also matches an exclusion pattern, the document is not indexed.

Some examples are:

  • *.txt will include all text files in a directory (files with the extension .txt).

  • **/*.txt will include all text files in a directory and its subdirectories.

  • *tax* will include all files in a directory that contain 'tax' in the file name, such as 'tax', 'taxes', 'income_tax'.

Type: Array of strings

Array Members: Minimum number of 0 items. Maximum number of 100 items.

Length Constraints: Minimum length of 1. Maximum length of 150.

Required: No
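
The precedence rule stated for InclusionPatterns and ExclusionPatterns (a document that matches both is not indexed) can be illustrated locally. The following is a rough sketch under stated assumptions, not the service's actual matcher: glob_to_regex and should_index are hypothetical helpers, * is assumed to match within a single path segment, **/ is assumed to match any number of leading directories, and an empty inclusion list is assumed to include every document.

```python
import re


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a simple glob into a regex (assumed semantics, see lead-in)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")   # zero or more leading directories
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")      # any characters within one segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def should_index(key: str, inclusion: list, exclusion: list) -> bool:
    """Apply inclusion first, then let any exclusion match win."""
    # Assumption: no inclusion patterns means every document is a candidate.
    included = not inclusion or any(glob_to_regex(p).fullmatch(key) for p in inclusion)
    excluded = any(glob_to_regex(p).fullmatch(key) for p in exclusion)
    return included and not excluded
```

With inclusion ["**/*.txt"] and exclusion ["**/*internal*"], the hypothetical key reports/tax_2020.txt is indexed under these assumptions, while reports/internal_only.txt is not, because the exclusion match takes precedence.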

InclusionPrefixes

A list of S3 prefixes for the documents that should be included in the index.

Type: Array of strings

Array Members: Minimum number of 0 items. Maximum number of 100 items.

Length Constraints: Minimum length of 1. Maximum length of 150.

Required: No
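
The fields described above can be assembled into a single configuration structure. The following is a minimal sketch that builds a plain dictionary and enforces the documented array and length constraints (at most 100 items, each 1 to 150 characters). The function build_s3_configuration, its parameter names, and the example bucket name are hypothetical; how the resulting dictionary is passed to an SDK call is outside the scope of this sketch.

```python
def _check_string_list(field: str, values) -> list:
    """Enforce the documented limits: at most 100 items, each 1 to 150 characters."""
    values = list(values)
    if len(values) > 100:
        raise ValueError(f"{field}: at most 100 items allowed, got {len(values)}")
    for value in values:
        if not 1 <= len(value) <= 150:
            raise ValueError(f"{field}: item length must be 1 to 150, got {len(value)}")
    return values


def build_s3_configuration(bucket_name, inclusion_prefixes=(),
                           inclusion_patterns=(), exclusion_patterns=()):
    """Build a dict using the field names from this page (hypothetical helper)."""
    config = {"BucketName": bucket_name}  # the only required field
    optional = (
        ("InclusionPrefixes", inclusion_prefixes),
        ("InclusionPatterns", inclusion_patterns),
        ("ExclusionPatterns", exclusion_patterns),
    )
    for field, values in optional:
        if values:  # optional fields are omitted when empty
            config[field] = _check_string_list(field, values)
    return config


# Example with a hypothetical bucket name and prefix.
example = build_s3_configuration(
    "my-docs-bucket",
    inclusion_prefixes=["reports/"],
    exclusion_patterns=["*.png", "*.jpg"],
)
```

Omitting empty optional fields keeps the structure small; because the documented minimum is 0 items, sending empty lists instead should be equally valid.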
