public static interface CfnDataSource.WebCrawlerSeedUrlConfigurationProperty
When selecting websites to index, you must adhere to the Amazon Acceptable Use Policy and all other Amazon terms. Remember that you must only use the Amazon Kendra web crawler to index your own webpages, or webpages that you have authorization to index.
Example:
// The code below shows an example of how to instantiate this type.
// The values are placeholders you should change.
import java.util.List;
import software.amazon.awscdk.services.kendra.*;

// The type is nested in CfnDataSource, so it is qualified by the outer class here.
CfnDataSource.WebCrawlerSeedUrlConfigurationProperty webCrawlerSeedUrlConfigurationProperty =
        CfnDataSource.WebCrawlerSeedUrlConfigurationProperty.builder()
                .seedUrls(List.of("seedUrls"))
                // the properties below are optional
                .webCrawlerMode("webCrawlerMode")
                .build();
Modifier and Type | Interface and Description |
---|---|
static class | CfnDataSource.WebCrawlerSeedUrlConfigurationProperty.Builder: A builder for CfnDataSource.WebCrawlerSeedUrlConfigurationProperty |
static class | CfnDataSource.WebCrawlerSeedUrlConfigurationProperty.Jsii$Proxy: An implementation for CfnDataSource.WebCrawlerSeedUrlConfigurationProperty |
Modifier and Type | Method and Description |
---|---|
static CfnDataSource.WebCrawlerSeedUrlConfigurationProperty.Builder | builder() |
java.util.List<java.lang.String> | getSeedUrls(): The list of seed or starting point URLs of the websites you want to crawl. |
default java.lang.String | getWebCrawlerMode(): The web crawler mode, one of HOST_ONLY, SUBDOMAINS, or EVERYTHING. |
java.util.List<java.lang.String> getSeedUrls()
The list can include a maximum of 100 seed URLs.
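The 100-URL limit can be checked before the list is handed to the builder. The following is a minimal sketch of such a check. `SeedUrlCheck` and `requireWithinLimit` are hypothetical names introduced for illustration and are not part of the CDK API. Only the maximum of 100 comes from the description above.

```java
import java.util.List;

final class SeedUrlCheck {
    // Limit taken from the getSeedUrls() description above.
    static final int MAX_SEED_URLS = 100;

    // Hypothetical helper (not part of the CDK): returns the list unchanged
    // if it is non-null and holds at most MAX_SEED_URLS entries.
    static List<String> requireWithinLimit(List<String> seedUrls) {
        if (seedUrls == null) {
            throw new IllegalArgumentException("seedUrls is required");
        }
        if (seedUrls.size() > MAX_SEED_URLS) {
            throw new IllegalArgumentException(
                    "seedUrls has " + seedUrls.size()
                            + " entries; the maximum is " + MAX_SEED_URLS);
        }
        return seedUrls;
    }

    public static void main(String[] args) {
        // Placeholder URL; replace with a site you are authorized to index.
        List<String> urls = List.of("https://abc.example.com");
        System.out.println("Accepted " + requireWithinLimit(urls).size() + " seed URL(s)");
    }
}
```

The returned list could then be passed to `seedUrls(...)` on the builder, so an oversized list fails early in your own code rather than at deployment time.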
default java.lang.String getWebCrawlerMode()
You can choose one of the following modes:
HOST_ONLY: crawl only the website host names. For example, if the seed URL is "abc.example.com", then only URLs with host name "abc.example.com" are crawled.
SUBDOMAINS: crawl the website host names with subdomains. For example, if the seed URL is "abc.example.com", then "a.abc.example.com" and "b.abc.example.com" are also crawled.
EVERYTHING: crawl the website host names with subdomains and other domains that the webpages link to.
The default mode is HOST_ONLY.
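To make the difference between the three modes concrete, the sketch below models which host names fall in scope for a given seed host. It only illustrates the descriptions above and is not how Amazon Kendra implements crawling. `CrawlScope` and `inScope` are hypothetical names, and the case-insensitive host comparison is an assumption.

```java
import java.util.Locale;

final class CrawlScope {
    // Hypothetical helper: reports whether candidateHost would be in scope
    // for seedHost under the given mode, following the descriptions above.
    static boolean inScope(String mode, String seedHost, String candidateHost) {
        // Assumption: host names are compared case-insensitively.
        String seed = seedHost.toLowerCase(Locale.ROOT);
        String candidate = candidateHost.toLowerCase(Locale.ROOT);
        switch (mode) {
            case "HOST_ONLY":
                // Only the exact seed host.
                return candidate.equals(seed);
            case "SUBDOMAINS":
                // The seed host plus any host that ends with ".<seed host>".
                return candidate.equals(seed) || candidate.endsWith("." + seed);
            case "EVERYTHING":
                // Linked hosts on other domains are followed as well.
                return true;
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        String seed = "abc.example.com";
        for (String mode : new String[] {"HOST_ONLY", "SUBDOMAINS", "EVERYTHING"}) {
            System.out.println(mode + " a.abc.example.com -> "
                    + inScope(mode, seed, "a.abc.example.com"));
        }
    }
}
```

Under this model, "a.abc.example.com" is out of scope for HOST_ONLY but in scope for SUBDOMAINS and EVERYTHING, which matches the examples given for each mode.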