Class: Aws::Kendra::Types::SeedUrlConfiguration

Inherits:
Struct
  • Object
Defined in:
gems/aws-sdk-kendra/lib/aws-sdk-kendra/types.rb

Overview

Provides the configuration information for the seed or starting point URLs to crawl.

When selecting websites to index, you must adhere to the Amazon Acceptable Use Policy and all other Amazon terms. Remember that you must only use Amazon Kendra Web Crawler to index your own web pages, or web pages that you have authorization to index.

Constant Summary

SENSITIVE = []

Instance Attribute Summary

Instance Attribute Details

#seed_urls ⇒ Array<String>

The list of seed or starting point URLs of the websites you want to crawl.

The list can include a maximum of 100 seed URLs.
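
As a minimal sketch of the shape described above, the structure can be written as a plain Hash with `seed_urls` and `web_crawler_mode` keys. The helper `build_seed_url_configuration` and the constants `MAX_SEED_URLS` and `WEB_CRAWLER_MODES` are hypothetical names introduced for illustration; they are not part of aws-sdk-kendra. The helper only checks the documented 100-URL limit and the three documented mode names locally, and the service may apply further validation of its own.

```ruby
# Hypothetical helper, not part of aws-sdk-kendra. It builds a plain Hash with
# the same members as SeedUrlConfiguration and checks the documented limits.
MAX_SEED_URLS = 100
WEB_CRAWLER_MODES = %w[HOST_ONLY SUBDOMAINS EVERYTHING].freeze

def build_seed_url_configuration(seed_urls, web_crawler_mode: "HOST_ONLY")
  urls = Array(seed_urls)

  # The documentation states a maximum of 100 seed URLs.
  if urls.size > MAX_SEED_URLS
    raise ArgumentError,
          "got #{urls.size} seed URLs; the documented maximum is #{MAX_SEED_URLS}"
  end

  # Only the three documented modes are accepted by this sketch.
  unless WEB_CRAWLER_MODES.include?(web_crawler_mode)
    raise ArgumentError, "unknown web_crawler_mode: #{web_crawler_mode.inspect}"
  end

  { seed_urls: urls, web_crawler_mode: web_crawler_mode }
end

config = build_seed_url_configuration(["https://abc.example.com"])
```

Checking the limit before sending a request is a design choice of this sketch: it surfaces an oversized list as a local `ArgumentError` instead of a service-side validation failure.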

Returns:

  • (Array<String>)


# File 'gems/aws-sdk-kendra/lib/aws-sdk-kendra/types.rb', line 9201

class SeedUrlConfiguration < Struct.new(
  :seed_urls,
  :web_crawler_mode)
  SENSITIVE = []
  include Aws::Structure
end

#web_crawler_mode ⇒ String

You can choose one of the following modes:

  • HOST_ONLY: crawl only the website host names. For example, if the seed URL is "abc.example.com", then only URLs with host name "abc.example.com" are crawled.

  • SUBDOMAINS: crawl the website host names with subdomains. For example, if the seed URL is "abc.example.com", then "a.abc.example.com" and "b.abc.example.com" are also crawled.

  • EVERYTHING: crawl the website host names with subdomains and other domains that the web pages link to.

The default mode is set to HOST_ONLY.
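
The three modes above can be illustrated with a small local predicate. This is only a sketch of the documented semantics, not how Amazon Kendra decides what to crawl; `host_in_scope?` is a hypothetical name, and the sketch assumes bare host names (no scheme or path) and that any candidate host was reached by following a link.

```ruby
# Hypothetical illustration of the documented mode semantics.
# seed_host and candidate_host are bare host names such as "abc.example.com".
def host_in_scope?(seed_host, candidate_host, mode)
  case mode
  when "HOST_ONLY"
    # Only the seed host itself is crawled.
    candidate_host == seed_host
  when "SUBDOMAINS"
    # The seed host and any host name nested under it.
    candidate_host == seed_host || candidate_host.end_with?(".#{seed_host}")
  when "EVERYTHING"
    # Any host that the crawled pages link to (assumed for every candidate here).
    true
  else
    raise ArgumentError, "unknown web_crawler_mode: #{mode.inspect}"
  end
end

seed = "abc.example.com"
host_only  = host_in_scope?(seed, "a.abc.example.com", "HOST_ONLY")
subdomains = host_in_scope?(seed, "a.abc.example.com", "SUBDOMAINS")
everything = host_in_scope?(seed, "other.example.org", "EVERYTHING")
```

With the seed "abc.example.com", the candidate "a.abc.example.com" is out of scope under HOST_ONLY and in scope under SUBDOMAINS, matching the examples in the list.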

Returns:

  • (String)

