Understanding document attributes in Amazon Q Business

Every document has structural attributes, or metadata, attached to it. Document attributes can include information such as document title, document author, time created, time updated, and document type.

You can map document attributes to fields in your Amazon Q Business index. Once mapped to document attributes, these index fields can be used by admins to boost results from specific sources, or by end users to filter and scope their chat results to specific data.

Note

Filtering using document attributes in chat is only supported through the API. Boosting search results using document attributes is supported on both the console and the API.

You can use document attributes to prepare your data for end user chat, and to customize and control that chat experience. To learn more, see Filtering using metadata, Document enrichment in Amazon Q Business, and Relevance tuning.

Types of document attributes

Amazon Q Business supports two types of document attributes: reserved and custom.

Reserved or default document attributes are provided by Amazon Q Business to map commonly occurring document attributes to index fields. Custom attributes, on the other hand, can be used to map document attributes unique to your content to index fields.

Both reserved and custom document attributes can be used to customize end user chat experience.

The following section outlines the available document attributes.

Reserved document attributes

Amazon Q Business offers the following reserved document attributes or index fields that you can map your metadata to:

  • _authors – A list of one or more authors responsible for the content of the document.

  • _category – A category that places a document in a specific group.

  • _created_at – The date and time in ISO 8601 format that the document was created. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25, 2012 at 12:30 PM (plus 10 seconds) in Central European Time.

  • _data_source_id – The identifier of the data source that contains the document.

  • _document_body – The content of the document.

  • _document_id – A unique identifier for the document.

  • _document_title – The title of the document.

  • _file_type – The file type of the document, such as .pdf or .docx.

  • _last_updated_at – The date and time in ISO 8601 format that the document was last updated. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25, 2012 at 12:30 PM (plus 10 seconds) in Central European Time.

  • _source_uri – The URI where the document is available. For example, the URI of the document on a company website.

  • _version – An identifier for the specific version of a document.

  • _view_count – The number of times that the document has been viewed.

  • _language_code (String) – The code for a language that applies to the document. This defaults to English if you don't specify a language.
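The date-time format described for _created_at and _last_updated_at can be checked locally before documents are ingested. The following is a minimal sketch using only the Python standard library; the helper name to_iso8601 is hypothetical and is not part of any Amazon Q Business SDK.

```python
from datetime import datetime, timedelta, timezone


def to_iso8601(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 string with a UTC offset."""
    if moment.tzinfo is None:
        # A naive datetime has no offset, so the result would be ambiguous.
        raise ValueError("expected a timezone-aware datetime")
    return moment.isoformat(timespec="seconds")


# Parse the example timestamp from the _created_at description.
created_at = datetime.fromisoformat("2012-03-25T12:30:10+01:00")

# The +01:00 suffix is the offset from UTC.
offset = created_at.utcoffset()

# The same instant expressed in UTC.
created_at_utc = created_at.astimezone(timezone.utc)
```

Keeping the offset in the stored string, rather than dropping it, avoids ambiguity about which time zone a timestamp refers to.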

Custom document attributes

You can also create custom attributes based on your own enterprise data. Then, you can map the custom attributes to custom index fields that you create for a more tailored end user chat experience.

For example, you can create a custom field or attribute called "Department" with the values of "HR", "Sales", and "Manufacturing". Then, you can use these fields or attributes to allow your end users to filter their chat results to documents in the "HR" department, or restrict response generation to specific data stores.

You can create up to 50 custom fields or attributes.
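As an illustration of the "Department" example, the sketch below builds a filter that scopes chat results to one department and checks a planned set of custom attributes against the 50-attribute limit. The function names are hypothetical. The dictionary keys (equalsTo, name, value, stringValue) are an assumption about how the chat API expresses an attribute filter, so verify them against the API reference before use.

```python
MAX_CUSTOM_ATTRIBUTES = 50  # limit stated in this section


def department_filter(department: str) -> dict:
    """Build a filter matching documents whose Department equals the given value.

    The key names are assumed, not confirmed by this page.
    """
    return {
        "equalsTo": {
            "name": "Department",
            "value": {"stringValue": department},
        }
    }


def check_custom_attribute_count(names: list) -> int:
    """Return the number of distinct custom attribute names, or raise if over the limit."""
    unique = set(names)
    if len(unique) > MAX_CUSTOM_ATTRIBUTES:
        raise ValueError(
            f"{len(unique)} custom attributes exceeds the limit of {MAX_CUSTOM_ATTRIBUTES}"
        )
    return len(unique)


hr_only = department_filter("HR")
```

Because attributes can't be deleted or renamed after creation, checking names and counts before creating them is cheaper than correcting mistakes afterward.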

Important

Once an attribute is created, you can't delete or rename it.

Mapped document attributes

When a document attribute—reserved or custom—is mapped to an index field, you can choose how the field will be used during chat. You can currently configure index fields to perform the following action:

  • Search – Lets end users search data by the specified attributes.
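A mapped index field with the Search action might be described by a small configuration entry like the one sketched below. The helper build_field_config is hypothetical, and the key names and enum strings (name, type, search, ENABLED, DISABLED, and the four type names) are assumptions about the index configuration API, so check the API reference before relying on them.

```python
# Assumed type names, one per data type listed in this documentation.
VALID_TYPES = {"STRING", "STRING_LIST", "NUMBER", "DATE"}


def build_field_config(name: str, attr_type: str, searchable: bool) -> dict:
    """Build one index field configuration entry (assumed shape)."""
    if not name:
        raise ValueError("attribute name must not be empty")
    if attr_type not in VALID_TYPES:
        raise ValueError(f"unsupported attribute type: {attr_type!r}")
    return {
        "name": name,
        "type": attr_type,
        "search": "ENABLED" if searchable else "DISABLED",
    }


department_field = build_field_config("Department", "STRING", searchable=True)
```

An entry like this would typically be passed, together with entries for other fields, in a single index update request.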

Document attribute data types

Document attributes, whether reserved or custom, must be one of the data types shown in the following table. The table also shows which operations (search, filter, and boost) each data type supports.

Data type     Searchable   Filterable   Boostable
Date          No           Yes          Yes
Number        No           Yes          Yes
String        Yes          Yes          Yes
String list   Yes          Yes          Yes
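The table can be encoded as a small lookup so that tooling can reject an unsupported combination before calling the service. This is a sketch only: the names CAPABILITIES and supports are hypothetical, and the values simply transcribe the table.

```python
# Operations each data type supports, transcribed from the table above.
CAPABILITIES = {
    "Date": {"filter", "boost"},
    "Number": {"filter", "boost"},
    "String": {"search", "filter", "boost"},
    "String list": {"search", "filter", "boost"},
}


def supports(data_type: str, operation: str) -> bool:
    """Return True if the given data type supports the given operation."""
    if data_type not in CAPABILITIES:
        raise KeyError(f"unknown data type: {data_type!r}")
    return operation in CAPABILITIES[data_type]


# Data types that can be searched, in sorted order.
searchable_types = sorted(t for t in CAPABILITIES if supports(t, "search"))
```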

For more information on filtering and boosting using document attributes, see Filtering using document-attributes and Boosting using document attributes.

Note

You can't change an index field type after it has been created.