StartDataQualityRuleRecommendationRun - AWS Glue

StartDataQualityRuleRecommendationRun

Starts a recommendation run that generates rules when you don't know what rules to write. AWS Glue Data Quality analyzes the data and recommends a potential ruleset. You can then triage the generated ruleset and modify it as needed.

Recommendation runs are automatically deleted after 90 days.

Request Syntax

{
   "ClientToken": "string",
   "CreatedRulesetName": "string",
   "DataSource": {
      "GlueTable": {
         "AdditionalOptions": {
            "string" : "string"
         },
         "CatalogId": "string",
         "ConnectionName": "string",
         "DatabaseName": "string",
         "TableName": "string"
      }
   },
   "NumberOfWorkers": number,
   "Role": "string",
   "Timeout": number
}

Request Parameters

For information about the parameters that are common to all actions, see Common Parameters.

The request accepts the following data in JSON format.

ClientToken

Used for idempotency. Set it to a random ID (such as a UUID) to avoid creating or starting multiple instances of the same resource.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 255.

Pattern: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]*

Required: No
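As a minimal sketch of the idempotency guidance above, the following uses Python's standard `uuid` module to produce a token. The helper name `new_client_token` is hypothetical and not part of the API. A UUID string is 36 characters long, which fits within the 1 to 255 length constraint.

```python
import uuid


def new_client_token() -> str:
    """Return a random UUID string for use as ClientToken (hypothetical helper)."""
    return str(uuid.uuid4())


# Generate the token once per logical request and reuse the same value
# if that request has to be retried, so that a retry does not start a second run.
token = new_client_token()
```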

CreatedRulesetName

A name for the ruleset.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 255.

Pattern: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]*

Required: No

DataSource

The data source (AWS Glue table) associated with this run.

Type: DataSource object

Required: Yes

NumberOfWorkers

The number of G.1X workers to be used in the run. The default is 5.

Type: Integer

Required: No

Role

An IAM role supplied to encrypt the results of the run.

Type: String

Required: Yes

Timeout

The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

Type: Integer

Valid Range: Minimum value of 1.

Required: No
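The request parameters above could be assembled and sent as in the following sketch, which assumes the AWS SDK for Python (boto3) and its `start_data_quality_rule_recommendation_run` client method. The helper names `build_recommendation_request` and `start_run` are hypothetical, and any database, table, or role values passed in are placeholders you would supply yourself. Optional parameters are left out of the request when not set, so the service defaults apply.

```python
import uuid


def build_recommendation_request(database, table, role, ruleset_name=None,
                                 workers=None, timeout_minutes=None):
    """Assemble the request parameters; optional fields are omitted when unset."""
    request = {
        "DataSource": {
            "GlueTable": {"DatabaseName": database, "TableName": table}
        },
        "Role": role,
        # Random token for idempotency; reuse the same request dict on retries.
        "ClientToken": str(uuid.uuid4()),
    }
    if ruleset_name is not None:
        request["CreatedRulesetName"] = ruleset_name
    if workers is not None:
        request["NumberOfWorkers"] = workers
    if timeout_minutes is not None:
        # The documented valid range for Timeout has a minimum value of 1.
        if timeout_minutes < 1:
            raise ValueError("Timeout must be at least 1 minute")
        request["Timeout"] = timeout_minutes
    return request


def start_run(request, client=None):
    """Send the request and return the RunId from the response."""
    if client is None:
        # Assumes boto3 is installed and AWS credentials are configured.
        import boto3
        client = boto3.client("glue")
    response = client.start_data_quality_rule_recommendation_run(**request)
    return response["RunId"]
```

Passing a `client` argument makes it possible to exercise `start_run` with a stub object instead of a live AWS connection.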

Response Syntax

{ "RunId": "string" }

Response Elements

If the action is successful, the service sends back an HTTP 200 response.

The following data is returned in JSON format by the service.

RunId

The unique run identifier associated with this run.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 255.

Pattern: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]*
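The returned `RunId` is typically used to check on the run afterwards. The sketch below assumes the boto3 Glue client method `get_data_quality_rule_recommendation_run` and assumes that its response carries a `Status` field whose terminal values are `SUCCEEDED`, `FAILED`, `STOPPED`, and `TIMEOUT`. Only `TIMEOUT` is mentioned on this page, so treat the rest of the set as an assumption to verify. The function name `wait_for_run` and the polling intervals are hypothetical.

```python
import time

# Assumed terminal statuses; only TIMEOUT is named in this reference page.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"}


def wait_for_run(client, run_id, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll a recommendation run until it reaches a terminal status.

    Raises TimeoutError if the run is still in progress after max_polls checks.
    """
    for _ in range(max_polls):
        run = client.get_data_quality_rule_recommendation_run(RunId=run_id)
        status = run["Status"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"Run {run_id} did not finish after {max_polls} polls")
```

The `sleep` parameter is injectable so the loop can be tested without real delays.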

Errors

For information about the errors that are common to all actions, see Common Errors.

ConflictException

The CreatePartitions API was called on a table that has indexes enabled.

HTTP Status Code: 400

InternalServiceException

An internal service error occurred.

HTTP Status Code: 500

InvalidInputException

The input provided was not valid.

HTTP Status Code: 400

OperationTimeoutException

The operation timed out.

HTTP Status Code: 400
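One way a caller might react to the errors listed above is sketched below. The split into retryable and non-retryable codes is an assumption made for illustration, not guidance from this page. The helper names are hypothetical. The sketch relies on the error code being available at `response["Error"]["Code"]` on the raised exception, as it is on botocore's `ClientError`.

```python
# Assumed policy: server-side and timeout errors may succeed on retry,
# while conflict and input errors need the request to be changed first.
RETRYABLE_CODES = {"InternalServiceException", "OperationTimeoutException"}
NON_RETRYABLE_CODES = {"ConflictException", "InvalidInputException"}


def error_code(exc) -> str:
    """Extract the service error code from a ClientError-like exception."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code", "")


def should_retry(exc) -> bool:
    """Return True when the error code is in the assumed retryable set."""
    return error_code(exc) in RETRYABLE_CODES
```

When retrying, sending the same `ClientToken` as the original request keeps the retry from starting a second run.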

See Also

For more information about using this API in one of the language-specific AWS SDKs, see the following: