Connecting Amazon Q Business to Amazon S3 using APIs

You use the CreateDataSource action to connect a data source to your Amazon Q application.

Then, you use the configuration parameter to provide a JSON document, conforming to the schema for your data source connector, with all other configuration information specific to that connector.

For an example of the API request, see CreateDataSource in the Amazon Q API Reference.
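As a rough illustration of how the pieces fit together, the sketch below assembles the arguments for a CreateDataSource call in Python. The parameter names (applicationId, indexId, displayName, configuration) follow our understanding of the AWS SDK for Python and should be checked against the Amazon Q API Reference. The helper name and all identifier values are hypothetical placeholders, and the configuration shown is abbreviated.

```python
import json


def build_create_data_source_request(application_id, index_id, display_name, configuration):
    """Assemble keyword arguments for a CreateDataSource call (illustrative helper)."""
    # Fail early if the configuration cannot be serialized as JSON.
    json.dumps(configuration)
    return {
        "applicationId": application_id,
        "indexId": index_id,
        "displayName": display_name,
        "configuration": configuration,
    }


request = build_create_data_source_request(
    application_id="my-application-id",  # hypothetical placeholder
    index_id="my-index-id",              # hypothetical placeholder
    display_name="my-s3-data-source",    # hypothetical placeholder
    # Abbreviated: a real configuration must satisfy the Amazon S3 JSON schema.
    configuration={"type": "S3", "syncMode": "FULL_CRAWL"},
)

# Assumption: with boto3 installed and credentials configured, the request
# could then be sent along these lines:
#   boto3.client("qbusiness").create_data_source(**request)
print(json.dumps(request, indent=2))
```

The helper only builds and sanity-checks the arguments. It makes no network call, so it can be run without AWS credentials.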

Amazon S3 JSON schema

The following is the Amazon S3 JSON schema:

{
  "$schema": "",
  "type": "object",
  "properties": {
    "connectionConfiguration": {
      "type": "object",
      "properties": {
        "repositoryEndpointMetadata": {
          "type": "object",
          "properties": {
            "BucketName": { "type": "string" }
          },
          "required": [ "BucketName" ]
        }
      },
      "required": [ "repositoryEndpointMetadata" ]
    },
    "repositoryConfigurations": {
      "type": "object",
      "properties": {
        "document": {
          "type": "object",
          "properties": {
            "fieldMappings": {
              "type": "array",
              "items": [
                {
                  "type": "object",
                  "properties": {
                    "indexFieldName": { "type": "string" },
                    "indexFieldType": { "type": "string", "enum": [ "STRING" ] },
                    "dataSourceFieldName": { "type": "string" }
                  },
                  "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ]
                }
              ]
            }
          },
          "required": [ "fieldMappings" ]
        }
      },
      "required": [ "document" ]
    },
    "additionalProperties": {
      "type": "object",
      "properties": {
        "inclusionPatterns": { "type": "array" },
        "exclusionPatterns": { "type": "array" },
        "inclusionPrefixes": { "type": "array" },
        "exclusionPrefixes": { "type": "array" },
        "aclConfigurationFilePath": { "type": "string" },
        "metadataFilesPrefix": { "type": "string" },
        "maxFileSizeInMegaBytes": { "type": "string" }
      }
    },
    "syncMode": {
      "type": "string",
      "enum": [ "FULL_CRAWL", "FORCED_FULL_CRAWL" ]
    },
    "type": {
      "type": "string",
      "pattern": "S3"
    },
    "version": {
      "type": "string",
      "anyOf": [ { "pattern": "1.0.0" } ]
    }
  },
  "required": [ "connectionConfiguration", "type", "syncMode", "repositoryConfigurations" ]
}
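To make the schema more concrete, here is a minimal sketch that builds one configuration document with the shape the schema describes. The function name, the bucket name, the field mapping names, and the chosen additionalProperties values are illustrative assumptions. Only the key names and enumerated values come from the schema.

```python
import json


def build_s3_configuration(bucket_name, field_mappings, sync_mode="FULL_CRAWL"):
    """Build a configuration dict shaped like the Amazon S3 JSON schema (illustrative)."""
    if sync_mode not in ("FULL_CRAWL", "FORCED_FULL_CRAWL"):
        raise ValueError(f"unsupported syncMode: {sync_mode}")
    return {
        "type": "S3",
        "version": "1.0.0",
        "syncMode": sync_mode,
        "connectionConfiguration": {
            "repositoryEndpointMetadata": {"BucketName": bucket_name}
        },
        "repositoryConfigurations": {
            "document": {"fieldMappings": field_mappings}
        },
        # Optional settings; the values below are examples only.
        "additionalProperties": {
            "inclusionPrefixes": ["docs/"],
            # The schema declares this value as a string, not a number.
            "maxFileSizeInMegaBytes": "50",
        },
    }


config = build_s3_configuration(
    bucket_name="amzn-s3-demo-bucket",  # placeholder bucket name
    field_mappings=[
        {
            "indexFieldName": "document_title",  # hypothetical index field
            "indexFieldType": "STRING",          # the only type the schema lists
            "dataSourceFieldName": "title",      # hypothetical source field
        }
    ],
)
print(json.dumps(config, indent=2))
```

The four keys the schema marks as required (connectionConfiguration, type, syncMode, and repositoryConfigurations) are always present. Everything under additionalProperties can be dropped or adjusted.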

The following table describes the important JSON keys to configure.

Configuration Description
connectionConfiguration Configuration information for the endpoint for the data source.
repositoryEndpointMetadata The endpoint information for the data source.
BucketName The name of your Amazon S3 bucket.
repositoryConfigurations Configuration information for the content of the data source. For example, configuring specific types of content and field mappings.
additionalProperties Additional configuration options for your content in your data source.
  • inclusionPatterns
  • exclusionPatterns
  • inclusionPrefixes
  • exclusionPrefixes
A list of regular expression patterns to include or exclude specific files in your Amazon S3 data source. Files that match the inclusion patterns are included in the index. Files that don't match the inclusion patterns are excluded from the index. If a file matches both an inclusion and an exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
aclConfigurationFilePath The path to the file that specifies access control information for your documents in an Amazon Q index.
metadataFilesPrefix The location, in your Amazon S3 bucket, of your document metadata files.
maxFileSizeInMegaBytes The maximum size, in MB, of a single file that Amazon Q crawls. Amazon Q crawls only files within the size limit you define. The default maximum file size is 50 MB. The value must be greater than 0 MB and less than or equal to 50 MB.
syncMode Specify whether Amazon Q should update your index by syncing all documents or only new, modified, and deleted documents. You can choose from the following options:
  • Use FORCED_FULL_CRAWL to freshly re-crawl all content and replace existing content each time your data source syncs with your index.
  • Use FULL_CRAWL to incrementally crawl only new, modified, and deleted content each time your data source syncs with your index.
type The type of data source. Specify S3 as your data source type.
version The version of the template that's supported.
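
The constraints described in the table can be checked locally before a configuration is submitted. The following is a minimal sketch under stated assumptions: the function names are hypothetical, patterns are matched with Python's re.search (the service's exact matching semantics are not specified here), and an empty inclusion list is assumed to include every file.

```python
import re

MAX_FILE_SIZE_MB = 50
SYNC_MODES = {"FULL_CRAWL", "FORCED_FULL_CRAWL"}
REQUIRED_KEYS = ("connectionConfiguration", "type", "syncMode", "repositoryConfigurations")


def check_configuration(config):
    """Return a list of problems found in config; an empty list means none were found."""
    problems = [f"missing required key: {key}" for key in REQUIRED_KEYS if key not in config]
    if "type" in config and config["type"] != "S3":
        problems.append("type must be S3")
    if "syncMode" in config and config["syncMode"] not in SYNC_MODES:
        problems.append("syncMode must be FULL_CRAWL or FORCED_FULL_CRAWL")
    size = config.get("additionalProperties", {}).get("maxFileSizeInMegaBytes")
    if size is not None:
        try:
            value = float(size)
        except (TypeError, ValueError):
            problems.append("maxFileSizeInMegaBytes must be numeric")
        else:
            # Greater than 0 MB and at most 50 MB, as the table states.
            if not 0 < value <= MAX_FILE_SIZE_MB:
                problems.append("maxFileSizeInMegaBytes must be > 0 and <= 50")
    return problems


def is_indexed(key, inclusion_patterns, exclusion_patterns):
    """Apply the precedence rule: an exclusion match always wins over an inclusion match."""
    if any(re.search(pattern, key) for pattern in exclusion_patterns):
        return False
    if not inclusion_patterns:
        return True  # assumption: no inclusion patterns means include everything
    return any(re.search(pattern, key) for pattern in inclusion_patterns)


good = {
    "type": "S3",
    "syncMode": "FULL_CRAWL",
    "connectionConfiguration": {"repositoryEndpointMetadata": {"BucketName": "amzn-s3-demo-bucket"}},
    "repositoryConfigurations": {"document": {"fieldMappings": []}},
    "additionalProperties": {"maxFileSizeInMegaBytes": "25"},
}
print(check_configuration(good))
print(is_indexed("docs/private/plan.pdf", [r"\.pdf$"], [r"^docs/private/"]))
```

In this sketch, a key such as docs/private/plan.pdf matches both the inclusion pattern and the exclusion pattern, so it is treated as not indexed, which mirrors the precedence rule in the table.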