Connecting Amazon Q Business to GitHub (Server) using APIs - Amazon Q Business

Connecting Amazon Q Business to GitHub (Server) using APIs

You use the CreateDataSource action to connect a data source to your Amazon Q application.

Then, you use the configuration parameter to provide a JSON schema with all other configuration information specific to your data source connector.

For an example of the API request, see CreateDataSource in the Amazon Q API Reference.

You use the CreateDataSource action to connect a data source to your Amazon Q application.

Then, you use the configuration parameter to provide a JSON schema with all other configuration information specific to your data source connector.

For an example of the API request, see CreateDataSource in the Amazon Q API Reference.

GitHub JSON schema

The following is the GitHub JSON schema:

{ "$schema": "http://json-schema.org/draft-04/schema#", "type": "object", "properties": { "connectionConfiguration": { "type": "object", "properties": { "repositoryEndpointMetadata": { "type": "object", "properties": { "type": { "type": "string" }, "hostUrl": { "type": "string", "pattern": "https://.*" }, "organizationName": { "type": "string" } }, "required": [ "type", "hostUrl", "organizationName" ] } }, "required": [ "repositoryEndpointMetadata" ] }, "repositoryConfigurations": { "type": "object", "properties": { "ghRepository": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": [ "STRING", "STRING_LIST", "DATE" ] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "yyyy-MM-dd'T'HH:mm:ss'Z'" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } }, "required": [ "fieldMappings" ] }, "ghCommit": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": [ "STRING", "STRING_LIST", "DATE" ] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "yyyy-MM-dd'T'HH:mm:ss'Z'" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } }, "required": [ "fieldMappings" ] }, "ghIssueDocument": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": [ "STRING", "STRING_LIST", "DATE" ] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "yyyy-MM-dd'T'HH:mm:ss'Z'" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } }, "required": [ "fieldMappings" ] }, "ghIssueComment": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": [ "STRING", "STRING_LIST", "DATE" ] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "yyyy-MM-dd'T'HH:mm:ss'Z'" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } }, "required": [ "fieldMappings" ] }, "ghIssueAttachment": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": [ "STRING", "STRING_LIST", "DATE" ] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "yyyy-MM-dd'T'HH:mm:ss'Z'" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } }, "required": [ "fieldMappings" ] }, "ghPRDocument": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": [ "STRING", "STRING_LIST", "DATE" ] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "yyyy-MM-dd'T'HH:mm:ss'Z'" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } }, "required": [ "fieldMappings" ] }, "ghPRComment": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": [ "STRING", "STRING_LIST", "DATE" ] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "yyyy-MM-dd'T'HH:mm:ss'Z'" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } }, "required": [ "fieldMappings" ] }, "ghPRAttachment": { "type": "object", "properties": { "fieldMappings": { "type": "array", "items": [ { "type": "object", "properties": { "indexFieldName": { "type": "string" }, "indexFieldType": { "type": "string", "enum": [ "STRING", "STRING_LIST", "DATE" ] }, "dataSourceFieldName": { "type": "string" }, "dateFieldFormat": { "type": "string", "pattern": "yyyy-MM-dd'T'HH:mm:ss'Z'" } }, "required": [ "indexFieldName", "indexFieldType", "dataSourceFieldName" ] } ] } }, "required": [ "fieldMappings" ] } } }, "additionalProperties": { "type": "object", "properties": { "isCrawlAcl": { "type": "boolean" }, "maxFileSizeInMegaBytes": { "type": "string" }, "fieldForUserId": { "type": "string" }, "crawlRepository": { "type": "boolean" }, "crawlRepositoryDocuments": { "type": "boolean" }, "crawlIssue": { "type": "boolean" }, "crawlIssueComment": { "type": "boolean" }, "crawlIssueCommentAttachment": { "type": "boolean" }, "crawlPullRequest": { "type": "boolean" }, "crawlPullRequestComment": { "type": "boolean" }, "crawlPullRequestCommentAttachment": { "type": "boolean" }, "repositoryFilter": { "type": "array", "items": [ { "type": "object", "properties": { "repositoryName": { "type": "string" }, "branchNameList": { "type": "array", "items": { "type": "string" } } } } ] }, "inclusionFolderNamePatterns": { "type": "array", "items": { "type": "string" } }, "inclusionFileTypePatterns": { "type": "array", "items": { "type": "string" } }, "inclusionFileNamePatterns": { "type": "array", "items": { "type": "string" } }, "exclusionFolderNamePatterns": { "type": "array", "items": { "type": "string" } }, "exclusionFileTypePatterns": { "type": "array", "items": { "type": "string" } }, "exclusionFileNamePatterns": { "type": "array", "items": { "type": "string" } } }, "required": [] }, "type": { "type": "string", "pattern": "GITHUB" }, "syncMode": { "type": "string", "enum": [ "FULL_CRAWL", "FORCED_FULL_CRAWL", "CHANGE_LOG" ] }, "enableIdentityCrawler": { "type": "boolean" }, "secretArn": { "type": "string", "minLength": 20, "maxLength": 2048 } }, "version": { "type": "string", "anyOf": [ { "pattern": "1.0.0" } ] }, "required": [ "connectionConfiguration", "repositoryConfigurations", "syncMode", "additionalProperties", "enableIdentityCrawler" ] }

The following table provides information about important JSON keys to configure.

Configuration Description
connectionConfiguration Configuration information for the endpoint for the data source.
repositoryEndpointMetadata The endpoint information for the data source.
hostUrl The GitHub (Server) host URL. For example, if you use GitHub (Server) Enterprise Cloud: https://api.github.com. Or, if you use GitHub (Server) Enterprise Server: https://on-prem-host-url/api/v3/.
organizationName You can find your organization name when you log in to GitHub (Server) desktop and go to Your organizations under your profile picture dropdown.
repositoryConfigurations Configuration information for the content of the data source. For example, configuring specific types of content and field mappings.
  • ghRepository

  • ghCommit

  • ghIssueDocument

  • ghIssueComment

  • ghIssueAttachment

  • ghPRDocument

  • ghPRComment

  • ghPRAttachment

A list of objects that map the attributes or field names of your GitHub pages and assets to Amazon Q index field names.
additionalProperties Additional configuration options for your content in your data source.
isCrawlAcl Specify true to crawl access control information from documents.
maxFileSizeInMegaBytes Specify the maximum single file size limit in MBs that Amazon Q will crawl. Amazon Q will crawl only the files within the size limit you define. The default file size is 50MB. The maximum file size should be greater than 0MB and less than or equal to 50MB.
fieldForUserId Specify field to use for UserId for ACL crawling.
repositoryFilter A list of names of the specific repositories and branch names you want to index.
crawlRepository Specify true to crawl repositories.
crawlRepositoryDocuments Specify true to crawl repository documents.
crawlIssue Specify true to crawl issues.
crawlIssueComment Specify true to crawl issue comments.
crawlIssueCommentAttachment Specify true to crawl issue comment attachments.
crawlPullRequest Specify true to crawl pull requests.
crawlPullRequestComment Specify true to crawl pull request comments.
crawlPullRequestCommentAttachment Specify true to crawl pull request comment attachments.
  • inclusionFolderNamePatterns

  • inclusionFileTypePatterns

  • inclusionFileNamePatterns

A list of regular expression patterns to include specific content in your GitHub data source. Content that matches the patterns are included in the index. Content that doesn't match the patterns are excluded from the index. If any content matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence, and the content isn't included in the index.
  • exclusionFolderNamePatterns

  • exclusionFileTypePatterns

  • exclusionFileNamePatterns

A list of regular expression patterns to exclude specific content in your GitHub data source. Content that matches the patterns are included in the index. Content that doesn't match the patterns are excluded from the index. If any content matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence, and the content isn't included in the index.
type The type of data source. Specify GITHUB as your data source type.
enableIdentityCrawler Specify true to use the Amazon Q identity crawler to sync identity/principal information on users and groups with access to specific documents.
syncMode

Specify whether Amazon Q should update your index by syncing all documents or only new, modified, and deleted documents. You can choose between the following options:

  • Use FORCED_FULL_CRAWL to freshly re-crawl all content and replace existing content each time your data source syncs with your index.

  • Use FULL_CRAWL to incrementally crawl only new, modified, and deleted content each time your data source syncs with your index.

  • Use CHANGE_LOG to incrementally crawl only new and modified content each time your data source syncs with your index.

secretArn

The Amazon Resource Name (ARN) of an AWS Secrets Manager secret that contains the key-value pairs required to connect to your GitHub (Server). The secret must contain a JSON structure with the following keys:

{ "personalToken": "token" }
version The version of this template that's currently supported.