Crawler
Specifies a crawler program that examines a data source and uses classifiers to try to determine its schema. If successful, the crawler records metadata concerning the data source in the AWS Glue Data Catalog.
Contents
- Classifiers
-
A list of UTF-8 strings that specify the custom classifiers that are associated with the crawler.
Type: Array of strings
Length Constraints: Minimum length of 1. Maximum length of 255.
Pattern:
[\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]*
Required: No
- Configuration
-
Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior. For more information, see Setting crawler configuration options.
Type: String
Required: No
- CrawlElapsedTime
-
If the crawler is running, contains the total time elapsed since the last crawl began.
Type: Long
Required: No
- CrawlerSecurityConfiguration
-
The name of the SecurityConfiguration structure to be used by this crawler.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 128.
Required: No
- CreationTime
-
The time that the crawler was created.
Type: Timestamp
Required: No
- DatabaseName
-
The name of the database in which the crawler's output is stored.
Type: String
Required: No
- Description
-
A description of the crawler.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 2048.
Pattern:
[\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\r\n\t]*
Required: No
- LakeFormationConfiguration
-
Specifies whether the crawler should use AWS Lake Formation credentials instead of the IAM role credentials.
Type: LakeFormationConfiguration object
Required: No
- LastCrawl
-
The status of the last crawl, and potentially error information if an error occurred.
Type: LastCrawlInfo object
Required: No
- LastUpdated
-
The time that the crawler was last updated.
Type: Timestamp
Required: No
- LineageConfiguration
-
A configuration that specifies whether data lineage is enabled for the crawler.
Type: LineageConfiguration object
Required: No
- Name
-
The name of the crawler.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 255.
Pattern:
[\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]*
Required: No
- RecrawlPolicy
-
A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
Type: RecrawlPolicy object
Required: No
- Role
-
The Amazon Resource Name (ARN) of an IAM role that's used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
Type: String
Required: No
- Schedule
-
For scheduled crawlers, the schedule when the crawler runs.
Type: Schedule object
Required: No
- SchemaChangePolicy
-
The policy that specifies update and delete behaviors for the crawler.
Type: SchemaChangePolicy object
Required: No
- State
-
Indicates the current state of the crawler: whether it is ready, running, or stopping.
Type: String
Valid Values:
READY | RUNNING | STOPPING
Required: No
- TablePrefix
-
The prefix added to the names of tables that are created.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 128.
Required: No
- Targets
-
A collection of targets to crawl.
Type: CrawlerTargets object
Required: No
- Version
-
The version of the crawler.
Type: Long
Required: No
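Because every field in this structure is optional, code that consumes a Crawler should tolerate missing keys. The following is a minimal sketch, assuming the structure has already been deserialized into a plain Python dictionary keyed by the field names above. The helper names (parse_configuration, is_idle, summarize) and the sample values are hypothetical and are not part of any AWS SDK. Treating READY as "not currently running" is also an assumption based on the valid values listed for State.

```python
import json
from typing import Any, Dict, Optional


def parse_configuration(crawler: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Decode the versioned JSON string held in Configuration, if present."""
    raw = crawler.get("Configuration")
    if not raw:  # the field is optional and may be absent or empty
        return None
    return json.loads(raw)


def is_idle(crawler: Dict[str, Any]) -> bool:
    """Return True when State is READY (assumed to mean no run in progress)."""
    return crawler.get("State") == "READY"


def summarize(crawler: Dict[str, Any]) -> str:
    """Build a one-line summary, substituting placeholders for missing fields."""
    name = crawler.get("Name", "<unnamed>")
    state = crawler.get("State", "<unknown>")
    database = crawler.get("DatabaseName", "<none>")
    prefix = crawler.get("TablePrefix", "")
    return f"{name}: state={state}, database={database}, table_prefix={prefix!r}"


# Hypothetical sample data, for illustration only.
sample_crawler = {
    "Name": "example-crawler",
    "State": "RUNNING",
    "DatabaseName": "example_db",
    "TablePrefix": "raw_",
    "Configuration": json.dumps({"Version": 1.0}),
    "Version": 3,
}

print(summarize(sample_crawler))
print(parse_configuration(sample_crawler))
print(is_idle(sample_crawler))
```

Using dict.get with a default keeps the helpers safe for partially populated structures, which is the expected case given that no field is marked as required.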
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: