Data source template schemas - Amazon Kendra

Data source template schemas

The following are template schemas for data sources where templates are supported.

Dropbox template schema

You include a JSON that contains the data source schema as part of the TemplateConfiguration object. You provide the Dropbox app key as part of your secret that stores your authentication credentials. Also specify the type of data source as DROPBOX, the type of access token you want to use (temporary or permanent), and a secret for your authentication credentials as part of the repository configuration details. You then specify TEMPLATE as the Type when you call CreateDataSource.

You can use the template provided in this developer guide. See Dropbox JSON schema.

The following provides information on important JSON keys to configure.

Configuration Description
connectionConfiguration Configuration information for the endpoint for the data source.
repositoryEndpointMetadata The endpoint information for the data source. This data source does not specify an endpoint in repositoryEndpointMetadata. Rather, the connection information is included in an AWS Secrets Manager secret that you provide the secretArn.
repositoryConfigurations Configuration information for the content of the data source. For example, configuring specific types of content and field mappings. Specify the type of data source and the secret ARN.
  • file

  • paper

  • papert

  • shortcut

A list of objects that map the attributes or field names of your Dropbox files, Dropbox Paper, and shortcuts to Amazon Kendra index field names. For more information, see Mapping data source fields. The Dropbox data source field names must exist in your Dropbox custom metadata.
secretARN The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your Dropbox. The secret must contain a JSON structure with the following keys:
{ "appKey": "Dropbox app key", "appSecret": "Dropbox app secret", "accesstoken": "temporary access token or refresh access token", }
additionalProperties Additional configuration options for your content in your data source
inclusionPatterns A list of regular expression patterns to include certain files in your Dropbox data source. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
exclusionPatterns A list of regular expression patterns to exclude certain files in your Dropbox data source. Files that match the patterns are excluded from the index. Files that don't match the patterns are included in the index. If a file matches both an exclusion and inclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
  • crawlFile

  • crawlPaper

  • crawlPapert

  • crawlShortcut

true to index files in your Dropbox, Dropbox Paper documents, Dropbox Paper templates, and webpage shortcuts stored in your Dropbox.
type The type of data source. Specify DROPBOX as your data source type.
useChangeLog true to use the Dropbox change log to determine which documents require adding updating, or deleting in the index. Depending on the change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents in your Dropbox.
tokenType Specify your access token type: permanent or temporary access token. It's recommended that you create a refresh access token that never expires in Dropbox rather that relying on a one-time access token that expires after 4 hours. You create an app and a refresh access token in the Dropbox developer console and provide the access token in your secret.
version The version of this template that is currently supported.
{ "$schema": "http://json-schema.org/draft-04/schema#", "type": "object", "properties": { "connectionConfiguration": { "type": "object", "properties": { "repositoryEndpointMetadata": { "type": "object", "properties": { } } }, "required": [ "repositoryEndpointMetadata" ] }, "repositoryConfigurations": { "type": "object", "properties": { "file": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": { "anyOf": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": [ "STRING", "STRING_LIST", "LONG", "DATE" ] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "dd-MM-yyyy HH:mm:ss" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } } }, "required": [ "fieldMappings" ] }, "paper": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": { "anyOf": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": [ "STRING", "STRING_LIST", "LONG", "DATE" ] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "dd-MM-yyyy HH:mm:ss" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } } }, "required": [ "fieldMappings" ] }, "papert": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": { "anyOf": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": [ "STRING", "STRING_LIST", "LONG", "DATE" ] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "dd-MM-yyyy HH:mm:ss" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } } }, "required": [ "fieldMappings" ] }, "shortcut": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": { "anyOf": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": [ "STRING", "STRING_LIST", "LONG", "DATE" ] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "dd-MM-yyyy HH:mm:ss" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } } }, "required": [ "fieldMappings" ] } } }, "secretArn": { "type": "string" }, "additionalProperties": { "type": "object", "properties": { "inclusionPatterns": { "type": "array" }, "exclusionPatterns": { "type": "array" }, "crawlFile": { "type": "boolean" }, "crawlPaper": { "type": "boolean" }, "crawlPapert": { "type": "boolean" }, "crawlShortcut": { "type": "boolean" } } }, "type": { "type": "string", "pattern": "DROPBOX" }, "useChangeLog": { "type": "string", "enum": [ "true", "false" ] }, "tokenType": { "type": "string", "enum": [ "PERMANENT", "TEMPORARY" ] }, "version": { "type": "string", "anyOf": [ { "pattern": "1.0.0" } ] } }, "additionalProperties": false, "required": [ "connectionConfiguration", "repositoryConfigurations", "additionalProperties", "useChangeLog", "secretArn", "type", "tokenType" ] }

Zendesk template schema

You include a JSON that contains the data source schema as part of TemplateConfiguration object. You provide the host URL as a part of the connection configuration or repository endpoint details. You must also specify the type of data source as ZENDESK and a secret for your authentication credentials as part of the repository configuration details. You then specify TEMPLATE as the Type when you call CreateDataSource.

You can use the template provided in this developer guide. See Zendesk JSON schema.

The following provides information on important JSON keys to configure.

Configuration Description
connectionConfiguration Configuration information for the endpoint for the data source
repositoryEndpointMetadata The endpoint information for the data source
hostURL The Zendesk host URL. For example, https://yoursubdomain.zendesk.com
repositoryConfigurations Configuration information for the content of the data source. For example, configuring specific types of content and field mappings. You must specify the type of data source and the secret ARN.
  • ticket

  • ticketComment

  • ticketCommentAttachment

  • article

  • articleComment

  • articleAttachment

  • communityTopic

  • communityPost

  • communityPostComment

A list of objects that map attributes or field names of Zendesk tickets to Amazon Kendra index field names. For more information, see Mapping data source fields. The Zendesk data source field names must exist in your Zendesk custom metadata.
secretARN The Amazon Resource Name (ARN) of a Secrets Managersecret that contains the key-value pairs required to connect to your Zendesk. The secret must contain a JSON structure with the following keys: host URL, client ID, Zendesk client secret, Zendesk user name, and Zendesk password.
additionalProperties Additional configuration options for your content in your data source
organizationFilter If desired, you can choose to index tickets that exist within a specific Organization
sinceDate If desired, you can configure a sinceDate parameter so the Zendesk connector will crawl based on the sinceDate.
inclusionPatterns A list of regular expression patterns to include certain files in your Zendesk data source. Files that match the patterns are included in the index. File that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
exclusionPatterns A list of regular expression patterns to exclude certain files in your Zendesk data source. Files that match the patterns are excluded from the index. X that don't match the patterns are included in the index. If a file matches both an exclusion and inclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
  • isCrawlTicket

  • isCrawlTicketComment

  • isCrawlTicketCommentAttachment

  • isCrawlArticle

  • isCrawlArticleComment

  • isCrawlArticleAttachment

  • isCrawlCommunityTopic

  • isCrawlCommunityPost

  • isCrawlCommunityPostComment

Input TRUE to index
type Specify ZENDESK as your data source type
useChangeLog Input TRUEto use the Zendesk change log to determine which documents require updating in the index. Depending on the change log's size, it might be faster to scan the documents in Zendesk. If you are syncing your Zendesk data source with your index for the first time, all documents are scanned.
{ "$schema": "http://json-schema.org/draft-04/schema#", "type": "object", "properties": { "connectionConfiguration": { "type": "object", "properties": { "repositoryEndpointMetadata": { "type": "object", "properties": { "hostUrl": { "type": "string", "pattern": "https:.*" } }, "required": [ "hostUrl" ] } }, "required": [ "repositoryEndpointMetadata" ] }, "repositoryConfigurations": { "type": "object", "properties": { "ticket": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": { "anyOf": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": ["STRING", "STRING_LIST", "LONG", "DATE"] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "dd-MM-yyyy HH:mm:ss" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } } }, "required": [ "fieldMappings" ] }, "ticketComment": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": { "anyOf": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": ["STRING", "STRING_LIST", "LONG", "DATE"] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "dd-MM-yyyy HH:mm:ss" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } } }, "required": [ "fieldMappings" ] }, "ticketCommentAttachment": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": { "anyOf": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": ["STRING", "STRING_LIST", "LONG", "DATE"] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "dd-MM-yyyy HH:mm:ss" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } } }, "required": [ "fieldMappings" ] }, "article": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": { "anyOf": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": ["STRING", "STRING_LIST", "LONG", "DATE"] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "dd-MM-yyyy HH:mm:ss" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } } }, "required": [ "fieldMappings" ] }, "communityPostComment": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": { "anyOf": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": ["STRING", "STRING_LIST", "LONG", "DATE"] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "dd-MM-yyyy HH:mm:ss" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } } }, "required": [ "fieldMappings" ] }, "articleComment": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": { "anyOf": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": ["STRING", "STRING_LIST", "LONG", "DATE"] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "dd-MM-yyyy HH:mm:ss" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } } }, "required": [ "fieldMappings" ] }, "articleAttachment": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": { "anyOf": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": ["STRING", "STRING_LIST", "LONG", "DATE"] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "dd-MM-yyyy HH:mm:ss" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } } }, "required": [ "fieldMappings" ] }, "communityTopic": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": { "anyOf": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": ["STRING", "STRING_LIST", "LONG", "DATE"] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "dd-MM-yyyy HH:mm:ss" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } } }, "required": [ "fieldMappings" ] } } }, "secretArn": { "type": "string", "minLength": 20, "maxLength": 2048 }, "additionalProperties": { "type": "object", "properties": { "organizationNameFilter": { "type": "array" }, "sinceDate": { "type": "string", "pattern": "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}$" }, "inclusionPatterns": { "type": "array" }, "exclusionPatterns": { "type": "array" }, "isCrawTicket": { "type": "string" }, "isCrawTicketComment": { "type": "string" }, "isCrawTicketCommentAttachment": { "type": "string" }, "isCrawlArticle": { "type": "string" }, "isCrawlArticleAttachment": { "type": "string" }, "isCrawlArticleComment": { "type": "string" }, "isCrawlCommunityTopic": { "type": "string" }, "isCrawlCommunityPost": { "type": "string" }, "isCrawlCommunityPostComment": { "type": "string" } } }, "type": { "type": "string", "pattern": "ZENDESK" }, "useChangeLog": { "type": "string", "enum": ["true", "false"] } }, "version": { "type": "string", "anyOf": [ { "pattern": "1.0.0" } ] }, "additionalProperties": false, "required": [ "connectionConfiguration", "repositoryConfigurations", "additionalProperties", "useChangeLog", "secretArn", "type" ] }