SDK for PHP 3.x

Client: Aws\Glue\GlueClient
Service ID: glue
Version: 2017-03-31

This page describes the parameters and results for the operations of AWS Glue (2017-03-31), and shows how to use the Aws\Glue\GlueClient object to call the described operations. This documentation is specific to the 2017-03-31 API version of the service.

Operation Summary

Each of the following operations can be created from a client using $client->getCommand('CommandName'), where "CommandName" is the name of one of the following operations. Note: a command is a value that encapsulates an operation and the parameters used to create an HTTP request.

You can also create and send a command immediately using the magic methods available on a client object: $client->commandName(/* parameters */). You can send the command asynchronously (returning a promise) by appending the word "Async" to the operation name: $client->commandNameAsync(/* parameters */).
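
As a rough illustration of the three calling styles described above, the sketch below invokes GetDatabases through a command object, a magic method, and an asynchronous magic method. It assumes the SDK is installed with Composer and autoloaded before the client function runs. The helper magicMethodNames() and the function callGetDatabasesThreeWays() are hypothetical names introduced only for this example.

```php
<?php
// Sketch only: assumes the AWS SDK for PHP is installed with Composer and
// that its autoloader is loaded before callGetDatabasesThreeWays() is called.
use Aws\Glue\GlueClient;

/**
 * Hypothetical helper: derives the magic-method names for an operation
 * by lower-casing the first letter and appending "Async".
 */
function magicMethodNames(string $operation): array
{
    $sync = lcfirst($operation);

    return ['sync' => $sync, 'async' => $sync . 'Async'];
}

/**
 * Calls GetDatabases in three ways; each entry of the returned array
 * holds the result of one style.
 */
function callGetDatabasesThreeWays(GlueClient $client): array
{
    // 1. Build a command object first, then execute it.
    $command = $client->getCommand('GetDatabases', []);
    $viaCommand = $client->execute($command);

    // 2. Create and send the command in one step with a magic method.
    $viaMagic = $client->getDatabases([]);

    // 3. Send asynchronously; wait() blocks until the promise settles.
    $promise = $client->getDatabasesAsync([]);
    $viaPromise = $promise->wait();

    return [$viaCommand, $viaMagic, $viaPromise];
}
```

The explicit command form is mostly useful when a command needs to be inspected or modified before it is sent; for one-off calls the magic methods are shorter.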

BatchCreatePartition ( array $params = [] )
Creates one or more partitions in a batch operation.
BatchDeleteConnection ( array $params = [] )
Deletes a list of connection definitions from the Data Catalog.
BatchDeletePartition ( array $params = [] )
Deletes one or more partitions in a batch operation.
BatchDeleteTable ( array $params = [] )
Deletes multiple tables at once.
BatchDeleteTableVersion ( array $params = [] )
Deletes a specified batch of versions of a table.
BatchGetBlueprints ( array $params = [] )
Retrieves information about a list of blueprints.
BatchGetCrawlers ( array $params = [] )
Returns a list of resource metadata for a given list of crawler names.
BatchGetCustomEntityTypes ( array $params = [] )
Retrieves the details for the custom patterns specified by a list of names.
BatchGetDataQualityResult ( array $params = [] )
Retrieves a list of data quality results for the specified result IDs.
BatchGetDevEndpoints ( array $params = [] )
Returns a list of resource metadata for a given list of development endpoint names.
BatchGetJobs ( array $params = [] )
Returns a list of resource metadata for a given list of job names.
BatchGetPartition ( array $params = [] )
Retrieves partitions in a batch request.
BatchGetTableOptimizer ( array $params = [] )
Returns the configuration for the specified table optimizers.
BatchGetTriggers ( array $params = [] )
Returns a list of resource metadata for a given list of trigger names.
BatchGetWorkflows ( array $params = [] )
Returns a list of resource metadata for a given list of workflow names.
BatchPutDataQualityStatisticAnnotation ( array $params = [] )
Annotate datapoints over time for a specific data quality statistic.
BatchStopJobRun ( array $params = [] )
Stops one or more job runs for a specified job definition.
BatchUpdatePartition ( array $params = [] )
Updates one or more partitions in a batch operation.
CancelDataQualityRuleRecommendationRun ( array $params = [] )
Cancels the specified recommendation run that was being used to generate rules.
CancelDataQualityRulesetEvaluationRun ( array $params = [] )
Cancels a run where a ruleset is being evaluated against a data source.
CancelMLTaskRun ( array $params = [] )
Cancels (stops) a task run.
CancelStatement ( array $params = [] )
Cancels the statement.
CheckSchemaVersionValidity ( array $params = [] )
Validates the supplied schema.
CreateBlueprint ( array $params = [] )
Registers a blueprint with Glue.
CreateClassifier ( array $params = [] )
Creates a classifier in the user's account.
CreateColumnStatisticsTaskSettings ( array $params = [] )
Creates settings for a column statistics task.
CreateConnection ( array $params = [] )
Creates a connection definition in the Data Catalog.
CreateCrawler ( array $params = [] )
Creates a new crawler with specified targets, role, configuration, and optional schedule.
CreateCustomEntityType ( array $params = [] )
Creates a custom pattern that is used to detect sensitive data across the columns and rows of your structured data.
CreateDataQualityRuleset ( array $params = [] )
Creates a data quality ruleset with DQDL rules applied to a specified Glue table.
CreateDatabase ( array $params = [] )
Creates a new database in a Data Catalog.
CreateDevEndpoint ( array $params = [] )
Creates a new development endpoint.
CreateJob ( array $params = [] )
Creates a new job definition.
CreateMLTransform ( array $params = [] )
Creates a Glue machine learning transform.
CreatePartition ( array $params = [] )
Creates a new partition.
CreatePartitionIndex ( array $params = [] )
Creates a specified partition index in an existing table.
CreateRegistry ( array $params = [] )
Creates a new registry which may be used to hold a collection of schemas.
CreateSchema ( array $params = [] )
Creates a new schema set and registers the schema definition.
CreateScript ( array $params = [] )
Transforms a directed acyclic graph (DAG) into code.
CreateSecurityConfiguration ( array $params = [] )
Creates a new security configuration.
CreateSession ( array $params = [] )
Creates a new session.
CreateTable ( array $params = [] )
Creates a new table definition in the Data Catalog.
CreateTableOptimizer ( array $params = [] )
Creates a new table optimizer for a specific function.
CreateTrigger ( array $params = [] )
Creates a new trigger.
CreateUsageProfile ( array $params = [] )
Creates a Glue usage profile.
CreateUserDefinedFunction ( array $params = [] )
Creates a new function definition in the Data Catalog.
CreateWorkflow ( array $params = [] )
Creates a new workflow.
DeleteBlueprint ( array $params = [] )
Deletes an existing blueprint.
DeleteClassifier ( array $params = [] )
Removes a classifier from the Data Catalog.
DeleteColumnStatisticsForPartition ( array $params = [] )
Delete the partition column statistics of a column.
DeleteColumnStatisticsForTable ( array $params = [] )
Deletes table statistics of columns.
DeleteColumnStatisticsTaskSettings ( array $params = [] )
Deletes settings for a column statistics task.
DeleteConnection ( array $params = [] )
Deletes a connection from the Data Catalog.
DeleteCrawler ( array $params = [] )
Removes a specified crawler from the Glue Data Catalog, unless the crawler state is RUNNING.
DeleteCustomEntityType ( array $params = [] )
Deletes a custom pattern by specifying its name.
DeleteDataQualityRuleset ( array $params = [] )
Deletes a data quality ruleset.
DeleteDatabase ( array $params = [] )
Removes a specified database from a Data Catalog.
DeleteDevEndpoint ( array $params = [] )
Deletes a specified development endpoint.
DeleteJob ( array $params = [] )
Deletes a specified job definition.
DeleteMLTransform ( array $params = [] )
Deletes a Glue machine learning transform.
DeletePartition ( array $params = [] )
Deletes a specified partition.
DeletePartitionIndex ( array $params = [] )
Deletes a specified partition index from an existing table.
DeleteRegistry ( array $params = [] )
Delete the entire registry including schema and all of its versions.
DeleteResourcePolicy ( array $params = [] )
Deletes a specified policy.
DeleteSchema ( array $params = [] )
Deletes the entire schema set, including the schema set and all of its versions.
DeleteSchemaVersions ( array $params = [] )
Remove versions from the specified schema.
DeleteSecurityConfiguration ( array $params = [] )
Deletes a specified security configuration.
DeleteSession ( array $params = [] )
Deletes the session.
DeleteTable ( array $params = [] )
Removes a table definition from the Data Catalog.
DeleteTableOptimizer ( array $params = [] )
Deletes an optimizer and all associated metadata for a table.
DeleteTableVersion ( array $params = [] )
Deletes a specified version of a table.
DeleteTrigger ( array $params = [] )
Deletes a specified trigger.
DeleteUsageProfile ( array $params = [] )
Deletes the specified Glue usage profile.
DeleteUserDefinedFunction ( array $params = [] )
Deletes an existing function definition from the Data Catalog.
DeleteWorkflow ( array $params = [] )
Deletes a workflow.
GetBlueprint ( array $params = [] )
Retrieves the details of a blueprint.
GetBlueprintRun ( array $params = [] )
Retrieves the details of a blueprint run.
GetBlueprintRuns ( array $params = [] )
Retrieves the details of blueprint runs for a specified blueprint.
GetCatalogImportStatus ( array $params = [] )
Retrieves the status of a migration operation.
GetClassifier ( array $params = [] )
Retrieve a classifier by name.
GetClassifiers ( array $params = [] )
Lists all classifier objects in the Data Catalog.
GetColumnStatisticsForPartition ( array $params = [] )
Retrieves partition statistics of columns.
GetColumnStatisticsForTable ( array $params = [] )
Retrieves table statistics of columns.
GetColumnStatisticsTaskRun ( array $params = [] )
Get the associated metadata/information for a task run, given a task run ID.
GetColumnStatisticsTaskRuns ( array $params = [] )
Retrieves information about all runs associated with the specified table.
GetColumnStatisticsTaskSettings ( array $params = [] )
Gets settings for a column statistics task.
GetConnection ( array $params = [] )
Retrieves a connection definition from the Data Catalog.
GetConnections ( array $params = [] )
Retrieves a list of connection definitions from the Data Catalog.
GetCrawler ( array $params = [] )
Retrieves metadata for a specified crawler.
GetCrawlerMetrics ( array $params = [] )
Retrieves metrics about specified crawlers.
GetCrawlers ( array $params = [] )
Retrieves metadata for all crawlers defined in the customer account.
GetCustomEntityType ( array $params = [] )
Retrieves the details of a custom pattern by specifying its name.
GetDataCatalogEncryptionSettings ( array $params = [] )
Retrieves the security configuration for a specified catalog.
GetDataQualityModel ( array $params = [] )
Retrieve the training status of the model along with more information (CompletedOn, StartedOn, FailureReason).
GetDataQualityModelResult ( array $params = [] )
Retrieve a statistic's predictions for a given Profile ID.
GetDataQualityResult ( array $params = [] )
Retrieves the result of a data quality rule evaluation.
GetDataQualityRuleRecommendationRun ( array $params = [] )
Gets the specified recommendation run that was used to generate rules.
GetDataQualityRuleset ( array $params = [] )
Returns an existing ruleset by identifier or name.
GetDataQualityRulesetEvaluationRun ( array $params = [] )
Retrieves a specific run where a ruleset is evaluated against a data source.
GetDatabase ( array $params = [] )
Retrieves the definition of a specified database.
GetDatabases ( array $params = [] )
Retrieves all databases defined in a given Data Catalog.
GetDataflowGraph ( array $params = [] )
Transforms a Python script into a directed acyclic graph (DAG).
GetDevEndpoint ( array $params = [] )
Retrieves information about a specified development endpoint.
GetDevEndpoints ( array $params = [] )
Retrieves all the development endpoints in this Amazon Web Services account.
GetJob ( array $params = [] )
Retrieves an existing job definition.
GetJobBookmark ( array $params = [] )
Returns information on a job bookmark entry.
GetJobRun ( array $params = [] )
Retrieves the metadata for a given job run.
GetJobRuns ( array $params = [] )
Retrieves metadata for all runs of a given job definition.
GetJobs ( array $params = [] )
Retrieves all current job definitions.
GetMLTaskRun ( array $params = [] )
Gets details for a specific task run on a machine learning transform.
GetMLTaskRuns ( array $params = [] )
Gets a list of runs for a machine learning transform.
GetMLTransform ( array $params = [] )
Gets a Glue machine learning transform artifact and all its corresponding metadata.
GetMLTransforms ( array $params = [] )
Gets a sortable, filterable list of existing Glue machine learning transforms.
GetMapping ( array $params = [] )
Creates mappings.
GetPartition ( array $params = [] )
Retrieves information about a specified partition.
GetPartitionIndexes ( array $params = [] )
Retrieves the partition indexes associated with a table.
GetPartitions ( array $params = [] )
Retrieves information about the partitions in a table.
GetPlan ( array $params = [] )
Gets code to perform a specified mapping.
GetRegistry ( array $params = [] )
Describes the specified registry in detail.
GetResourcePolicies ( array $params = [] )
Retrieves the resource policies set on individual resources by Resource Access Manager during cross-account permission grants.
GetResourcePolicy ( array $params = [] )
Retrieves a specified resource policy.
GetSchema ( array $params = [] )
Describes the specified schema in detail.
GetSchemaByDefinition ( array $params = [] )
Retrieves a schema by the SchemaDefinition.
GetSchemaVersion ( array $params = [] )
Get the specified schema by its unique ID assigned when a version of the schema is created or registered.
GetSchemaVersionsDiff ( array $params = [] )
Fetches the schema version difference in the specified difference type between two stored schema versions in the Schema Registry.
GetSecurityConfiguration ( array $params = [] )
Retrieves a specified security configuration.
GetSecurityConfigurations ( array $params = [] )
Retrieves a list of all security configurations.
GetSession ( array $params = [] )
Retrieves the session.
GetStatement ( array $params = [] )
Retrieves the statement.
GetTable ( array $params = [] )
Retrieves the Table definition in a Data Catalog for a specified table.
GetTableOptimizer ( array $params = [] )
Returns the configuration of all optimizers associated with a specified table.
GetTableVersion ( array $params = [] )
Retrieves a specified version of a table.
GetTableVersions ( array $params = [] )
Retrieves a list of strings that identify available versions of a specified table.
GetTables ( array $params = [] )
Retrieves the definitions of some or all of the tables in a given Database.
GetTags ( array $params = [] )
Retrieves a list of tags associated with a resource.
GetTrigger ( array $params = [] )
Retrieves the definition of a trigger.
GetTriggers ( array $params = [] )
Gets all the triggers associated with a job.
GetUnfilteredPartitionMetadata ( array $params = [] )
Retrieves partition metadata from the Data Catalog that contains unfiltered metadata.
GetUnfilteredPartitionsMetadata ( array $params = [] )
Retrieves partition metadata from the Data Catalog that contains unfiltered metadata.
GetUnfilteredTableMetadata ( array $params = [] )
Allows a third-party analytical engine to retrieve unfiltered table metadata from the Data Catalog.
GetUsageProfile ( array $params = [] )
Retrieves information about the specified Glue usage profile.
GetUserDefinedFunction ( array $params = [] )
Retrieves a specified function definition from the Data Catalog.
GetUserDefinedFunctions ( array $params = [] )
Retrieves multiple function definitions from the Data Catalog.
GetWorkflow ( array $params = [] )
Retrieves resource metadata for a workflow.
GetWorkflowRun ( array $params = [] )
Retrieves the metadata for a given workflow run.
GetWorkflowRunProperties ( array $params = [] )
Retrieves the workflow run properties which were set during the run.
GetWorkflowRuns ( array $params = [] )
Retrieves metadata for all runs of a given workflow.
ImportCatalogToGlue ( array $params = [] )
Imports an existing Amazon Athena Data Catalog to Glue.
ListBlueprints ( array $params = [] )
Lists all the blueprint names in an account.
ListColumnStatisticsTaskRuns ( array $params = [] )
List all task runs for a particular account.
ListCrawlers ( array $params = [] )
Retrieves the names of all crawler resources in this Amazon Web Services account, or the resources with the specified tag.
ListCrawls ( array $params = [] )
Returns all the crawls of a specified crawler.
ListCustomEntityTypes ( array $params = [] )
Lists all the custom patterns that have been created.
ListDataQualityResults ( array $params = [] )
Returns all data quality execution results for your account.
ListDataQualityRuleRecommendationRuns ( array $params = [] )
Lists the recommendation runs meeting the filter criteria.
ListDataQualityRulesetEvaluationRuns ( array $params = [] )
Lists all the runs meeting the filter criteria, where a ruleset is evaluated against a data source.
ListDataQualityRulesets ( array $params = [] )
Returns a paginated list of rulesets for the specified list of Glue tables.
ListDataQualityStatisticAnnotations ( array $params = [] )
Retrieve annotations for a data quality statistic.
ListDataQualityStatistics ( array $params = [] )
Retrieves a list of data quality statistics.
ListDevEndpoints ( array $params = [] )
Retrieves the names of all DevEndpoint resources in this Amazon Web Services account, or the resources with the specified tag.
ListJobs ( array $params = [] )
Retrieves the names of all job resources in this Amazon Web Services account, or the resources with the specified tag.
ListMLTransforms ( array $params = [] )
Retrieves a sortable, filterable list of existing Glue machine learning transforms in this Amazon Web Services account, or the resources with the specified tag.
ListRegistries ( array $params = [] )
Returns a list of registries that you have created, with minimal registry information.
ListSchemaVersions ( array $params = [] )
Returns a list of schema versions that you have created, with minimal information.
ListSchemas ( array $params = [] )
Returns a list of schemas with minimal details.
ListSessions ( array $params = [] )
Retrieve a list of sessions.
ListStatements ( array $params = [] )
Lists statements for the session.
ListTableOptimizerRuns ( array $params = [] )
Lists the history of previous optimizer runs for a specific table.
ListTriggers ( array $params = [] )
Retrieves the names of all trigger resources in this Amazon Web Services account, or the resources with the specified tag.
ListUsageProfiles ( array $params = [] )
List all the Glue usage profiles.
ListWorkflows ( array $params = [] )
Lists names of workflows created in the account.
PutDataCatalogEncryptionSettings ( array $params = [] )
Sets the security configuration for a specified catalog.
PutDataQualityProfileAnnotation ( array $params = [] )
Annotate all datapoints for a Profile.
PutResourcePolicy ( array $params = [] )
Sets the Data Catalog resource policy for access control.
PutSchemaVersionMetadata ( array $params = [] )
Puts the metadata key value pair for a specified schema version ID.
PutWorkflowRunProperties ( array $params = [] )
Puts the specified workflow run properties for the given workflow run.
QuerySchemaVersionMetadata ( array $params = [] )
Queries for the schema version metadata information.
RegisterSchemaVersion ( array $params = [] )
Adds a new version to the existing schema.
RemoveSchemaVersionMetadata ( array $params = [] )
Removes a key value pair from the schema version metadata for the specified schema version ID.
ResetJobBookmark ( array $params = [] )
Resets a bookmark entry.
ResumeWorkflowRun ( array $params = [] )
Restarts selected nodes of a previous partially completed workflow run and resumes the workflow run.
RunStatement ( array $params = [] )
Executes the statement.
SearchTables ( array $params = [] )
Searches a set of tables based on properties in the table metadata as well as on the parent database.
StartBlueprintRun ( array $params = [] )
Starts a new run of the specified blueprint.
StartColumnStatisticsTaskRun ( array $params = [] )
Starts a column statistics task run, for a specified table and columns.
StartColumnStatisticsTaskRunSchedule ( array $params = [] )
Starts a column statistics task run schedule.
StartCrawler ( array $params = [] )
Starts a crawl using the specified crawler, regardless of what is scheduled.
StartCrawlerSchedule ( array $params = [] )
Changes the schedule state of the specified crawler to SCHEDULED, unless the crawler is already running or the schedule state is already SCHEDULED.
StartDataQualityRuleRecommendationRun ( array $params = [] )
Starts a recommendation run that is used to generate rules when you don't know what rules to write.
StartDataQualityRulesetEvaluationRun ( array $params = [] )
Once you have a ruleset definition (either recommended or your own), you call this operation to evaluate the ruleset against a data source (Glue table).
StartExportLabelsTaskRun ( array $params = [] )
Begins an asynchronous task to export all labeled data for a particular transform.
StartImportLabelsTaskRun ( array $params = [] )
Enables you to provide additional labels (examples of truth) to be used to teach the machine learning transform and improve its quality.
StartJobRun ( array $params = [] )
Starts a job run using a job definition.
StartMLEvaluationTaskRun ( array $params = [] )
Starts a task to estimate the quality of the transform.
StartMLLabelingSetGenerationTaskRun ( array $params = [] )
Starts the active learning workflow for your machine learning transform to improve the transform's quality by generating label sets and adding labels.
StartTrigger ( array $params = [] )
Starts an existing trigger.
StartWorkflowRun ( array $params = [] )
Starts a new run of the specified workflow.
StopColumnStatisticsTaskRun ( array $params = [] )
Stops a task run for the specified table.
StopColumnStatisticsTaskRunSchedule ( array $params = [] )
Stops a column statistics task run schedule.
StopCrawler ( array $params = [] )
If the specified crawler is running, stops the crawl.
StopCrawlerSchedule ( array $params = [] )
Sets the schedule state of the specified crawler to NOT_SCHEDULED, but does not stop the crawler if it is already running.
StopSession ( array $params = [] )
Stops the session.
StopTrigger ( array $params = [] )
Stops a specified trigger.
StopWorkflowRun ( array $params = [] )
Stops the execution of the specified workflow run.
TagResource ( array $params = [] )
Adds tags to a resource.
TestConnection ( array $params = [] )
Tests a connection to a service to validate the service credentials that you provide.
UntagResource ( array $params = [] )
Removes tags from a resource.
UpdateBlueprint ( array $params = [] )
Updates a registered blueprint.
UpdateClassifier ( array $params = [] )
Modifies an existing classifier (a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field is present).
UpdateColumnStatisticsForPartition ( array $params = [] )
Creates or updates partition statistics of columns.
UpdateColumnStatisticsForTable ( array $params = [] )
Creates or updates table statistics of columns.
UpdateColumnStatisticsTaskSettings ( array $params = [] )
Updates settings for a column statistics task.
UpdateConnection ( array $params = [] )
Updates a connection definition in the Data Catalog.
UpdateCrawler ( array $params = [] )
Updates a crawler.
UpdateCrawlerSchedule ( array $params = [] )
Updates the schedule of a crawler using a cron expression.
UpdateDataQualityRuleset ( array $params = [] )
Updates the specified data quality ruleset.
UpdateDatabase ( array $params = [] )
Updates an existing database definition in a Data Catalog.
UpdateDevEndpoint ( array $params = [] )
Updates a specified development endpoint.
UpdateJob ( array $params = [] )
Updates an existing job definition.
UpdateJobFromSourceControl ( array $params = [] )
Synchronizes a job from the source control repository.
UpdateMLTransform ( array $params = [] )
Updates an existing machine learning transform.
UpdatePartition ( array $params = [] )
Updates a partition.
UpdateRegistry ( array $params = [] )
Updates an existing registry which is used to hold a collection of schemas.
UpdateSchema ( array $params = [] )
Updates the description, compatibility setting, or version checkpoint for a schema set.
UpdateSourceControlFromJob ( array $params = [] )
Synchronizes a job to the source control repository.
UpdateTable ( array $params = [] )
Updates a metadata table in the Data Catalog.
UpdateTableOptimizer ( array $params = [] )
Updates the configuration for an existing table optimizer.
UpdateTrigger ( array $params = [] )
Updates a trigger definition.
UpdateUsageProfile ( array $params = [] )
Updates a Glue usage profile.
UpdateUserDefinedFunction ( array $params = [] )
Updates an existing function definition in the Data Catalog.
UpdateWorkflow ( array $params = [] )
Updates an existing workflow.

Paginators

Paginators handle automatically iterating over paginated API results. Paginators are associated with specific API operations, and they accept the parameters that the corresponding API operation accepts. You can get a paginator from a client class using getPaginator($paginatorName, $operationParameters). This client supports the following paginators:

GetBlueprintRuns
GetClassifiers
GetColumnStatisticsTaskRuns
GetConnections
GetCrawlerMetrics
GetCrawlers
GetDatabases
GetDevEndpoints
GetJobRuns
GetJobs
GetMLTaskRuns
GetMLTransforms
GetPartitionIndexes
GetPartitions
GetResourcePolicies
GetSecurityConfigurations
GetTableVersions
GetTables
GetTriggers
GetUnfilteredPartitionsMetadata
GetUserDefinedFunctions
GetWorkflowRuns
ListBlueprints
ListColumnStatisticsTaskRuns
ListCrawlers
ListCustomEntityTypes
ListDataQualityResults
ListDataQualityRuleRecommendationRuns
ListDataQualityRulesetEvaluationRuns
ListDataQualityRulesets
ListDevEndpoints
ListJobs
ListMLTransforms
ListRegistries
ListSchemaVersions
ListSchemas
ListSessions
ListTableOptimizerRuns
ListTriggers
ListUsageProfiles
ListWorkflows
SearchTables
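
A minimal sketch of using one of these paginators, GetPartitions, follows. It assumes that each result page exposes a 'Partitions' list whose entries carry a 'Values' list; that response shape is not shown in this section, so treat it as an assumption. The function names collectPartitionValues() and listPartitionValues() are hypothetical.

```php
<?php
// Sketch only: assumes the AWS SDK for PHP is autoloaded before
// listPartitionValues() is called.
use Aws\Glue\GlueClient;

/**
 * Flattens result pages into a list of partition value lists.
 * Accepts any iterable of pages, so it also works with plain arrays.
 */
function collectPartitionValues(iterable $pages): array
{
    $values = [];
    foreach ($pages as $page) {
        // Assumed page shape: ['Partitions' => [['Values' => [...]], ...]]
        foreach ($page['Partitions'] ?? [] as $partition) {
            $values[] = $partition['Values'] ?? [];
        }
    }

    return $values;
}

/**
 * Iterates every page of GetPartitions for one table. The paginator
 * requests the next page automatically as the loop advances.
 */
function listPartitionValues(GlueClient $client, string $database, string $table): array
{
    $paginator = $client->getPaginator('GetPartitions', [
        'DatabaseName' => $database,
        'TableName' => $table,
    ]);

    return collectPartitionValues($paginator);
}
```

Keeping the page-flattening logic separate from the client call makes it possible to exercise it with plain arrays, without network access.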

Operations

BatchCreatePartition

$result = $client->batchCreatePartition([/* ... */]);
$promise = $client->batchCreatePartitionAsync([/* ... */]);

Creates one or more partitions in a batch operation.

Parameter Syntax

$result = $client->batchCreatePartition([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionInputList' => [ // REQUIRED
        [
            'LastAccessTime' => <integer || string || DateTime>,
            'LastAnalyzedTime' => <integer || string || DateTime>,
            'Parameters' => ['<string>', ...],
            'StorageDescriptor' => [
                'AdditionalLocations' => ['<string>', ...],
                'BucketColumns' => ['<string>', ...],
                'Columns' => [
                    [
                        'Comment' => '<string>',
                        'Name' => '<string>', // REQUIRED
                        'Parameters' => ['<string>', ...],
                        'Type' => '<string>',
                    ],
                    // ...
                ],
                'Compressed' => true || false,
                'InputFormat' => '<string>',
                'Location' => '<string>',
                'NumberOfBuckets' => <integer>,
                'OutputFormat' => '<string>',
                'Parameters' => ['<string>', ...],
                'SchemaReference' => [
                    'SchemaId' => [
                        'RegistryName' => '<string>',
                        'SchemaArn' => '<string>',
                        'SchemaName' => '<string>',
                    ],
                    'SchemaVersionId' => '<string>',
                    'SchemaVersionNumber' => <integer>,
                ],
                'SerdeInfo' => [
                    'Name' => '<string>',
                    'Parameters' => ['<string>', ...],
                    'SerializationLibrary' => '<string>',
                ],
                'SkewedInfo' => [
                    'SkewedColumnNames' => ['<string>', ...],
                    'SkewedColumnValueLocationMaps' => ['<string>', ...],
                    'SkewedColumnValues' => ['<string>', ...],
                ],
                'SortColumns' => [
                    [
                        'Column' => '<string>', // REQUIRED
                        'SortOrder' => <integer>, // REQUIRED
                    ],
                    // ...
                ],
                'StoredAsSubDirectories' => true || false,
            ],
            'Values' => ['<string>', ...],
        ],
        // ...
    ],
    'TableName' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the catalog in which the partition is to be created. Currently, this should be the Amazon Web Services account ID.

DatabaseName
Required: Yes
Type: string

The name of the metadata database in which the partition is to be created.

PartitionInputList
Required: Yes
Type: Array of PartitionInput structures

A list of PartitionInput structures that define the partitions to be created.

TableName
Required: Yes
Type: string

The name of the metadata table in which the partition is to be created.

Result Syntax

[
    'Errors' => [
        [
            'ErrorDetail' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
            'PartitionValues' => ['<string>', ...],
        ],
        // ...
    ],
]

Result Details

Members
Errors
Type: Array of PartitionError structures

The errors encountered when trying to create the requested partitions.

Errors

InvalidInputException:

The input provided was not valid.

AlreadyExistsException:

A resource to be created or added already exists.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

InternalServiceException:

An internal service error occurred.

EntityNotFoundException:

A specified entity does not exist.

OperationTimeoutException:

The operation timed out.

GlueEncryptionException:

An encryption operation failed.
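
The following sketch shows one way to assemble a small PartitionInputList and to read the per-partition Errors list from the result. It fills in only 'Values' and 'StorageDescriptor' => 'Location'; a real partition will usually need more of the fields listed under Parameter Syntax. The helper names and the S3 location in the usage note are hypothetical.

```php
<?php
// Sketch only: assumes the AWS SDK for PHP is autoloaded before
// createPartitions() is called.
use Aws\Glue\GlueClient;

/** Hypothetical helper: builds a minimal PartitionInput structure. */
function makePartitionInput(array $values, string $location): array
{
    return [
        'Values' => $values,
        'StorageDescriptor' => ['Location' => $location],
    ];
}

/**
 * Maps each failed partition (values joined with "/") to
 * "ErrorCode: ErrorMessage". An empty array means no per-partition errors.
 */
function summarizePartitionErrors($result): array
{
    $summary = [];
    foreach ($result['Errors'] ?? [] as $error) {
        $key = implode('/', $error['PartitionValues'] ?? []);
        $detail = $error['ErrorDetail'] ?? [];
        $summary[$key] = sprintf(
            '%s: %s',
            $detail['ErrorCode'] ?? 'Unknown',
            $detail['ErrorMessage'] ?? ''
        );
    }

    return $summary;
}

/** Sends the batch and returns the per-partition error summary. */
function createPartitions(GlueClient $client, string $database, string $table, array $inputs): array
{
    $result = $client->batchCreatePartition([
        'DatabaseName' => $database,
        'TableName' => $table,
        'PartitionInputList' => $inputs,
    ]);

    return summarizePartitionErrors($result);
}
```

A call might look like createPartitions($client, 'sales', 'orders', [makePartitionInput(['2024', '01'], 's3://example-bucket/orders/2024/01/')]). Failures for individual partitions are reported in the Errors member of the result and need to be checked explicitly, as summarizePartitionErrors() does.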

BatchDeleteConnection

$result = $client->batchDeleteConnection([/* ... */]);
$promise = $client->batchDeleteConnectionAsync([/* ... */]);

Deletes a list of connection definitions from the Data Catalog.

Parameter Syntax

$result = $client->batchDeleteConnection([
    'CatalogId' => '<string>',
    'ConnectionNameList' => ['<string>', ...], // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog in which the connections reside. If none is provided, the Amazon Web Services account ID is used by default.

ConnectionNameList
Required: Yes
Type: Array of strings

A list of names of the connections to delete.

Result Syntax

[
    'Errors' => [
        '<NameString>' => [
            'ErrorCode' => '<string>',
            'ErrorMessage' => '<string>',
        ],
        // ...
    ],
    'Succeeded' => ['<string>', ...],
]

Result Details

Members
Errors
Type: Associative array of custom string keys (NameString) to ErrorDetail structures

A map of the names of connections that were not successfully deleted to error details.

Succeeded
Type: Array of strings

A list of names of the connection definitions that were successfully deleted.

Errors

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.
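
Unlike the partition operations, BatchDeleteConnection returns Errors as a map keyed by connection name. The sketch below reads both result members and also catches a request-level failure. The function names are hypothetical, and the handling is deliberately simple.

```php
<?php
// Sketch only: assumes the AWS SDK for PHP is autoloaded before
// deleteConnections() is called.
use Aws\Exception\AwsException;
use Aws\Glue\GlueClient;

/** Turns a BatchDeleteConnection result into one status line per connection. */
function describeConnectionDeletion($result): array
{
    $lines = [];
    foreach ($result['Succeeded'] ?? [] as $name) {
        $lines[] = sprintf('deleted %s', $name);
    }
    // Errors is keyed by connection name, so the key is the name itself.
    foreach ($result['Errors'] ?? [] as $name => $detail) {
        $lines[] = sprintf('failed %s (%s)', $name, $detail['ErrorCode'] ?? 'Unknown');
    }

    return $lines;
}

/** Deletes the named connections and reports the outcome of each. */
function deleteConnections(GlueClient $client, array $names): array
{
    try {
        $result = $client->batchDeleteConnection([
            'ConnectionNameList' => $names,
        ]);
    } catch (AwsException $e) {
        // Request-level failure, for example OperationTimeoutException.
        return [sprintf('request failed (%s)', $e->getAwsErrorCode())];
    }

    return describeConnectionDeletion($result);
}
```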

BatchDeletePartition

$result = $client->batchDeletePartition([/* ... */]);
$promise = $client->batchDeletePartitionAsync([/* ... */]);

Deletes one or more partitions in a batch operation.

Parameter Syntax

$result = $client->batchDeletePartition([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionsToDelete' => [ // REQUIRED
        [
            'Values' => ['<string>', ...], // REQUIRED
        ],
        // ...
    ],
    'TableName' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog where the partition to be deleted resides. If none is provided, the Amazon Web Services account ID is used by default.

DatabaseName
Required: Yes
Type: string

The name of the catalog database in which the table in question resides.

PartitionsToDelete
Required: Yes
Type: Array of PartitionValueList structures

A list of PartitionValueList structures that define the partitions to be deleted.

TableName
Required: Yes
Type: string

The name of the table that contains the partitions to be deleted.

Result Syntax

[
    'Errors' => [
        [
            'ErrorDetail' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
            'PartitionValues' => ['<string>', ...],
        ],
        // ...
    ],
]

Result Details

Members
Errors
Type: Array of PartitionError structures

The errors encountered when trying to delete the requested partitions.

Errors

InvalidInputException:

The input provided was not valid.

EntityNotFoundException:

A specified entity does not exist.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.
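
As a sketch of how the PartitionsToDelete parameter might be assembled, the hypothetical helper below turns plain lists of partition values into request arrays, split into chunks. The helper name and the default chunk size of 25 are assumptions made for illustration; check the service limits for the actual per-call maximum.

```php
<?php
// Hypothetical helper, not part of the SDK.
// $partitionValues is a list of value lists, e.g. [['2024', '01'], ['2024', '02']].
// $chunkSize is an assumed limit; verify it against the service documentation.
function buildBatchDeletePartitionRequests(
    string $database,
    string $table,
    array $partitionValues,
    int $chunkSize = 25
): array {
    $requests = [];

    foreach (array_chunk($partitionValues, $chunkSize) as $chunk) {
        $requests[] = [
            'DatabaseName' => $database,
            'TableName' => $table,
            // Each entry is a PartitionValueList: ['Values' => [string, ...]].
            'PartitionsToDelete' => array_map(
                fn(array $values) => ['Values' => array_values(array_map('strval', $values))],
                $chunk
            ),
        ];
    }

    return $requests;
}
```

Each returned array can be passed to $client->batchDeletePartition(). The Errors member of each result lists the partitions that were not deleted.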

BatchDeleteTable

$result = $client->batchDeleteTable([/* ... */]);
$promise = $client->batchDeleteTableAsync([/* ... */]);

Deletes multiple tables at once.

After completing this operation, you no longer have access to the table versions and partitions that belong to the deleted table. Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.

To ensure the immediate deletion of all related resources, before calling BatchDeleteTable, use DeleteTableVersion or BatchDeleteTableVersion, and DeletePartition or BatchDeletePartition, to delete any resources that belong to the table.

Parameter Syntax

$result = $client->batchDeleteTable([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'TablesToDelete' => ['<string>', ...], // REQUIRED
    'TransactionId' => '<string>',
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog where the table resides. If none is provided, the Amazon Web Services account ID is used by default.

DatabaseName
Required: Yes
Type: string

The name of the catalog database in which the tables to delete reside. For Hive compatibility, this name is entirely lowercase.

TablesToDelete
Required: Yes
Type: Array of strings

A list of the tables to delete.

TransactionId
Type: string

The transaction ID at which to delete the table contents.

Result Syntax

[
    'Errors' => [
        [
            'ErrorDetail' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
            'TableName' => '<string>',
        ],
        // ...
    ],
]

Result Details

Members
Errors
Type: Array of TableError structures

A list of errors encountered in attempting to delete the specified tables.

Errors

InvalidInputException:

The input provided was not valid.

EntityNotFoundException:

A specified entity does not exist.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

GlueEncryptionException:

An encryption operation failed.

ResourceNotReadyException:

A resource was not ready for a transaction.
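
The ordering recommended in the description of this operation (delete table versions and partitions first, then the table) could be sketched as follows. The function name deleteTableWithDependents is hypothetical. The sketch assumes the caller already knows which version IDs and partition values to remove, and it ignores per-call batch limits.

```php
<?php
// Sketch only. $client is expected to be an Aws\Glue\GlueClient, or any object
// exposing the same three methods. The function name is hypothetical.
function deleteTableWithDependents(
    $client,
    string $database,
    string $table,
    array $versionIds,
    array $partitionValues
): array {
    $errors = [];

    // 1. Remove the table versions the caller listed.
    if ($versionIds !== []) {
        $result = $client->batchDeleteTableVersion([
            'DatabaseName' => $database,
            'TableName' => $table,
            'VersionIds' => array_values(array_map('strval', $versionIds)),
        ]);
        $errors = array_merge($errors, $result['Errors'] ?? []);
    }

    // 2. Remove the partitions the caller listed (each entry is a list of values).
    if ($partitionValues !== []) {
        $result = $client->batchDeletePartition([
            'DatabaseName' => $database,
            'TableName' => $table,
            'PartitionsToDelete' => array_map(
                fn(array $values) => ['Values' => array_values($values)],
                array_values($partitionValues)
            ),
        ]);
        $errors = array_merge($errors, $result['Errors'] ?? []);
    }

    // 3. Delete the table itself last.
    $result = $client->batchDeleteTable([
        'DatabaseName' => $database,
        'TablesToDelete' => [$table],
    ]);

    // Return every per-item error reported by the three calls.
    return array_merge($errors, $result['Errors'] ?? []);
}
```

The function returns the combined Errors entries rather than throwing, since each batch operation reports per-item failures in its result.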

BatchDeleteTableVersion

$result = $client->batchDeleteTableVersion([/* ... */]);
$promise = $client->batchDeleteTableVersionAsync([/* ... */]);

Deletes a specified batch of versions of a table.

Parameter Syntax

$result = $client->batchDeleteTableVersion([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'TableName' => '<string>', // REQUIRED
    'VersionIds' => ['<string>', ...], // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account ID is used by default.

DatabaseName
Required: Yes
Type: string

The database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.

TableName
Required: Yes
Type: string

The name of the table. For Hive compatibility, this name is entirely lowercase.

VersionIds
Required: Yes
Type: Array of strings

A list of the IDs of versions to be deleted. A VersionId is a string representation of an integer. Each version is incremented by 1.

Result Syntax

[
    'Errors' => [
        [
            'ErrorDetail' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
            'TableName' => '<string>',
            'VersionId' => '<string>',
        ],
        // ...
    ],
]

Result Details

Members
Errors
Type: Array of TableVersionError structures

A list of errors encountered while trying to delete the specified table versions.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.
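
Because each VersionId is a string representation of an integer, a request for a consecutive range of versions might be built as in this sketch. The helper name buildDeleteTableVersionRequest is hypothetical, and the version range is supplied by the caller.

```php
<?php
// Hypothetical helper, not part of the SDK.
// Builds the parameter array for batchDeleteTableVersion from an integer range.
function buildDeleteTableVersionRequest(
    string $database,
    string $table,
    int $firstVersion,
    int $lastVersion
): array {
    // An inverted range yields no IDs instead of a descending list.
    $ids = $firstVersion <= $lastVersion
        ? array_map('strval', range($firstVersion, $lastVersion))
        : [];

    return [
        'DatabaseName' => $database,
        'TableName' => $table,
        'VersionIds' => $ids, // strings such as '3', '4', '5'
    ];
}
```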

BatchGetBlueprints

$result = $client->batchGetBlueprints([/* ... */]);
$promise = $client->batchGetBlueprintsAsync([/* ... */]);

Retrieves information about a list of blueprints.

Parameter Syntax

$result = $client->batchGetBlueprints([
    'IncludeBlueprint' => true || false,
    'IncludeParameterSpec' => true || false,
    'Names' => ['<string>', ...], // REQUIRED
]);

Parameter Details

Members
IncludeBlueprint
Type: boolean

Specifies whether or not to include the blueprint in the response.

IncludeParameterSpec
Type: boolean

Specifies whether or not to include the parameters, as a JSON string, for the blueprint in the response.

Names
Required: Yes
Type: Array of strings

A list of blueprint names.

Result Syntax

[
    'Blueprints' => [
        [
            'BlueprintLocation' => '<string>',
            'BlueprintServiceLocation' => '<string>',
            'CreatedOn' => <DateTime>,
            'Description' => '<string>',
            'ErrorMessage' => '<string>',
            'LastActiveDefinition' => [
                'BlueprintLocation' => '<string>',
                'BlueprintServiceLocation' => '<string>',
                'Description' => '<string>',
                'LastModifiedOn' => <DateTime>,
                'ParameterSpec' => '<string>',
            ],
            'LastModifiedOn' => <DateTime>,
            'Name' => '<string>',
            'ParameterSpec' => '<string>',
            'Status' => 'CREATING|ACTIVE|UPDATING|FAILED',
        ],
        // ...
    ],
    'MissingBlueprints' => ['<string>', ...],
]

Result Details

Members
Blueprints
Type: Array of Blueprint structures

Returns a list of blueprints as a Blueprints object.

MissingBlueprints
Type: Array of strings

Returns a list of BlueprintNames that were not found.

Errors

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.
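
When IncludeParameterSpec is true, ParameterSpec is returned as a JSON string. The following sketch decodes it per blueprint. The helper name blueprintParameterSpecs is hypothetical, and the structure of the decoded JSON is not assumed.

```php
<?php
// Hypothetical helper, not part of the SDK.
// Maps each blueprint name to its decoded ParameterSpec, or null if it is not valid JSON.
function blueprintParameterSpecs($result): array
{
    $specs = [];

    foreach ($result['Blueprints'] ?? [] as $blueprint) {
        // Skip entries that do not carry both a name and a parameter spec.
        if (!isset($blueprint['Name'], $blueprint['ParameterSpec'])) {
            continue;
        }
        $decoded = json_decode($blueprint['ParameterSpec'], true);
        $specs[$blueprint['Name']] = json_last_error() === JSON_ERROR_NONE ? $decoded : null;
    }

    return $specs;
}
```

Names that the service could not find are reported separately in MissingBlueprints and do not appear in Blueprints.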

BatchGetCrawlers

$result = $client->batchGetCrawlers([/* ... */]);
$promise = $client->batchGetCrawlersAsync([/* ... */]);

Returns a list of resource metadata for a given list of crawler names. After calling the ListCrawlers operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.

Parameter Syntax

$result = $client->batchGetCrawlers([
    'CrawlerNames' => ['<string>', ...], // REQUIRED
]);

Parameter Details

Members
CrawlerNames
Required: Yes
Type: Array of strings

A list of crawler names, which might be the names returned from the ListCrawlers operation.

Result Syntax

[
    'Crawlers' => [
        [
            'Classifiers' => ['<string>', ...],
            'Configuration' => '<string>',
            'CrawlElapsedTime' => <integer>,
            'CrawlerSecurityConfiguration' => '<string>',
            'CreationTime' => <DateTime>,
            'DatabaseName' => '<string>',
            'Description' => '<string>',
            'LakeFormationConfiguration' => [
                'AccountId' => '<string>',
                'UseLakeFormationCredentials' => true || false,
            ],
            'LastCrawl' => [
                'ErrorMessage' => '<string>',
                'LogGroup' => '<string>',
                'LogStream' => '<string>',
                'MessagePrefix' => '<string>',
                'StartTime' => <DateTime>,
                'Status' => 'SUCCEEDED|CANCELLED|FAILED',
            ],
            'LastUpdated' => <DateTime>,
            'LineageConfiguration' => [
                'CrawlerLineageSettings' => 'ENABLE|DISABLE',
            ],
            'Name' => '<string>',
            'RecrawlPolicy' => [
                'RecrawlBehavior' => 'CRAWL_EVERYTHING|CRAWL_NEW_FOLDERS_ONLY|CRAWL_EVENT_MODE',
            ],
            'Role' => '<string>',
            'Schedule' => [
                'ScheduleExpression' => '<string>',
                'State' => 'SCHEDULED|NOT_SCHEDULED|TRANSITIONING',
            ],
            'SchemaChangePolicy' => [
                'DeleteBehavior' => 'LOG|DELETE_FROM_DATABASE|DEPRECATE_IN_DATABASE',
                'UpdateBehavior' => 'LOG|UPDATE_IN_DATABASE',
            ],
            'State' => 'READY|RUNNING|STOPPING',
            'TablePrefix' => '<string>',
            'Targets' => [
                'CatalogTargets' => [
                    [
                        'ConnectionName' => '<string>',
                        'DatabaseName' => '<string>',
                        'DlqEventQueueArn' => '<string>',
                        'EventQueueArn' => '<string>',
                        'Tables' => ['<string>', ...],
                    ],
                    // ...
                ],
                'DeltaTargets' => [
                    [
                        'ConnectionName' => '<string>',
                        'CreateNativeDeltaTable' => true || false,
                        'DeltaTables' => ['<string>', ...],
                        'WriteManifest' => true || false,
                    ],
                    // ...
                ],
                'DynamoDBTargets' => [
                    [
                        'Path' => '<string>',
                        'scanAll' => true || false,
                        'scanRate' => <float>,
                    ],
                    // ...
                ],
                'HudiTargets' => [
                    [
                        'ConnectionName' => '<string>',
                        'Exclusions' => ['<string>', ...],
                        'MaximumTraversalDepth' => <integer>,
                        'Paths' => ['<string>', ...],
                    ],
                    // ...
                ],
                'IcebergTargets' => [
                    [
                        'ConnectionName' => '<string>',
                        'Exclusions' => ['<string>', ...],
                        'MaximumTraversalDepth' => <integer>,
                        'Paths' => ['<string>', ...],
                    ],
                    // ...
                ],
                'JdbcTargets' => [
                    [
                        'ConnectionName' => '<string>',
                        'EnableAdditionalMetadata' => ['<string>', ...],
                        'Exclusions' => ['<string>', ...],
                        'Path' => '<string>',
                    ],
                    // ...
                ],
                'MongoDBTargets' => [
                    [
                        'ConnectionName' => '<string>',
                        'Path' => '<string>',
                        'ScanAll' => true || false,
                    ],
                    // ...
                ],
                'S3Targets' => [
                    [
                        'ConnectionName' => '<string>',
                        'DlqEventQueueArn' => '<string>',
                        'EventQueueArn' => '<string>',
                        'Exclusions' => ['<string>', ...],
                        'Path' => '<string>',
                        'SampleSize' => <integer>,
                    ],
                    // ...
                ],
            ],
            'Version' => <integer>,
        ],
        // ...
    ],
    'CrawlersNotFound' => ['<string>', ...],
]

Result Details

Members
Crawlers
Type: Array of Crawler structures

A list of crawler definitions.

CrawlersNotFound
Type: Array of strings

A list of names of crawlers that were not found.

Errors

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.
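
A short sketch of combining the Crawlers and CrawlersNotFound members into a single lookup by name. The helper name crawlerStates is hypothetical, and the 'NOT_FOUND' and 'UNKNOWN' markers are local placeholders rather than values returned by the service.

```php
<?php
// Hypothetical helper, not part of the SDK.
// Returns [crawlerName => state] for every requested name the result mentions.
function crawlerStates($result): array
{
    $states = [];

    foreach ($result['Crawlers'] ?? [] as $crawler) {
        // State is one of READY, RUNNING, STOPPING per the result syntax.
        $states[$crawler['Name']] = $crawler['State'] ?? 'UNKNOWN';
    }

    foreach ($result['CrawlersNotFound'] ?? [] as $name) {
        $states[$name] = 'NOT_FOUND'; // local marker, not a service value
    }

    return $states;
}
```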

BatchGetCustomEntityTypes

$result = $client->batchGetCustomEntityTypes([/* ... */]);
$promise = $client->batchGetCustomEntityTypesAsync([/* ... */]);

Retrieves the details for the custom patterns specified by a list of names.

Parameter Syntax

$result = $client->batchGetCustomEntityTypes([
    'Names' => ['<string>', ...], // REQUIRED
]);

Parameter Details

Members
Names
Required: Yes
Type: Array of strings

A list of names of the custom patterns that you want to retrieve.

Result Syntax

[
    'CustomEntityTypes' => [
        [
            'ContextWords' => ['<string>', ...],
            'Name' => '<string>',
            'RegexString' => '<string>',
        ],
        // ...
    ],
    'CustomEntityTypesNotFound' => ['<string>', ...],
]

Result Details

Members
CustomEntityTypes
Type: Array of CustomEntityType structures

A list of CustomEntityType objects representing the custom patterns that have been created.

CustomEntityTypesNotFound
Type: Array of strings

A list of the names of custom patterns that were not found.

Errors

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

BatchGetDataQualityResult

$result = $client->batchGetDataQualityResult([/* ... */]);
$promise = $client->batchGetDataQualityResultAsync([/* ... */]);

Retrieves a list of data quality results for the specified result IDs.

Parameter Syntax

$result = $client->batchGetDataQualityResult([
    'ResultIds' => ['<string>', ...], // REQUIRED
]);

Parameter Details

Members
ResultIds
Required: Yes
Type: Array of strings

A list of unique result IDs for the data quality results.

Result Syntax

[
    'Results' => [
        [
            'AnalyzerResults' => [
                [
                    'Description' => '<string>',
                    'EvaluatedMetrics' => [<float>, ...],
                    'EvaluationMessage' => '<string>',
                    'Name' => '<string>',
                ],
                // ...
            ],
            'CompletedOn' => <DateTime>,
            'DataSource' => [
                'GlueTable' => [
                    'AdditionalOptions' => ['<string>', ...],
                    'CatalogId' => '<string>',
                    'ConnectionName' => '<string>',
                    'DatabaseName' => '<string>',
                    'TableName' => '<string>',
                ],
            ],
            'EvaluationContext' => '<string>',
            'JobName' => '<string>',
            'JobRunId' => '<string>',
            'Observations' => [
                [
                    'Description' => '<string>',
                    'MetricBasedObservation' => [
                        'MetricName' => '<string>',
                        'MetricValues' => [
                            'ActualValue' => <float>,
                            'ExpectedValue' => <float>,
                            'LowerLimit' => <float>,
                            'UpperLimit' => <float>,
                        ],
                        'NewRules' => ['<string>', ...],
                        'StatisticId' => '<string>',
                    ],
                ],
                // ...
            ],
            'ProfileId' => '<string>',
            'ResultId' => '<string>',
            'RuleResults' => [
                [
                    'Description' => '<string>',
                    'EvaluatedMetrics' => [<float>, ...],
                    'EvaluatedRule' => '<string>',
                    'EvaluationMessage' => '<string>',
                    'Name' => '<string>',
                    'Result' => 'PASS|FAIL|ERROR',
                ],
                // ...
            ],
            'RulesetEvaluationRunId' => '<string>',
            'RulesetName' => '<string>',
            'Score' => <float>,
            'StartedOn' => <DateTime>,
        ],
        // ...
    ],
    'ResultsNotFound' => ['<string>', ...],
]

Result Details

Members
Results
Required: Yes
Type: Array of DataQualityResult structures

A list of DataQualityResult objects representing the data quality results.

ResultsNotFound
Type: Array of strings

A list of result IDs for which results were not found.

Errors

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.
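
The RuleResults entries in each result carry a Result of PASS, FAIL, or ERROR. The sketch below tallies them per ResultId. The helper name tallyRuleResults is hypothetical, and values outside those three are ignored.

```php
<?php
// Hypothetical helper, not part of the SDK.
// Returns [resultId => ['PASS' => n, 'FAIL' => n, 'ERROR' => n]].
function tallyRuleResults($result): array
{
    $tally = [];

    foreach ($result['Results'] ?? [] as $dataQualityResult) {
        $counts = ['PASS' => 0, 'FAIL' => 0, 'ERROR' => 0];

        foreach ($dataQualityResult['RuleResults'] ?? [] as $rule) {
            $outcome = $rule['Result'] ?? '';
            if (isset($counts[$outcome])) {
                $counts[$outcome]++;
            }
        }

        $tally[$dataQualityResult['ResultId'] ?? ''] = $counts;
    }

    return $tally;
}
```

IDs listed in ResultsNotFound have no entry in Results, so they do not appear in the tally.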

BatchGetDevEndpoints

$result = $client->batchGetDevEndpoints([/* ... */]);
$promise = $client->batchGetDevEndpointsAsync([/* ... */]);

Returns a list of resource metadata for a given list of development endpoint names. After calling the ListDevEndpoints operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.

Parameter Syntax

$result = $client->batchGetDevEndpoints([
    'DevEndpointNames' => ['<string>', ...], // REQUIRED
]);

Parameter Details

Members
DevEndpointNames
Required: Yes
Type: Array of strings

The list of DevEndpoint names, which might be the names returned from the ListDevEndpoints operation.

Result Syntax

[
    'DevEndpoints' => [
        [
            'Arguments' => ['<string>', ...],
            'AvailabilityZone' => '<string>',
            'CreatedTimestamp' => <DateTime>,
            'EndpointName' => '<string>',
            'ExtraJarsS3Path' => '<string>',
            'ExtraPythonLibsS3Path' => '<string>',
            'FailureReason' => '<string>',
            'GlueVersion' => '<string>',
            'LastModifiedTimestamp' => <DateTime>,
            'LastUpdateStatus' => '<string>',
            'NumberOfNodes' => <integer>,
            'NumberOfWorkers' => <integer>,
            'PrivateAddress' => '<string>',
            'PublicAddress' => '<string>',
            'PublicKey' => '<string>',
            'PublicKeys' => ['<string>', ...],
            'RoleArn' => '<string>',
            'SecurityConfiguration' => '<string>',
            'SecurityGroupIds' => ['<string>', ...],
            'Status' => '<string>',
            'SubnetId' => '<string>',
            'VpcId' => '<string>',
            'WorkerType' => 'Standard|G.1X|G.2X|G.025X|G.4X|G.8X|Z.2X',
            'YarnEndpointAddress' => '<string>',
            'ZeppelinRemoteSparkInterpreterPort' => <integer>,
        ],
        // ...
    ],
    'DevEndpointsNotFound' => ['<string>', ...],
]

Result Details

Members
DevEndpoints
Type: Array of DevEndpoint structures

A list of DevEndpoint definitions.

DevEndpointsNotFound
Type: Array of strings

A list of names of DevEndpoints that were not found.

Errors

AccessDeniedException:

Access to a resource was denied.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.

BatchGetJobs

$result = $client->batchGetJobs([/* ... */]);
$promise = $client->batchGetJobsAsync([/* ... */]);

Returns a list of resource metadata for a given list of job names. After calling the ListJobs operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.

Parameter Syntax

$result = $client->batchGetJobs([
    'JobNames' => ['<string>', ...], // REQUIRED
]);

Parameter Details

Members
JobNames
Required: Yes
Type: Array of strings

A list of job names, which might be the names returned from the ListJobs operation.

Result Syntax

[
    'Jobs' => [
        [
            'AllocatedCapacity' => <integer>,
            'CodeGenConfigurationNodes' => [
                '<NodeId>' => [
                    'Aggregate' => [
                        'Aggs' => [
                            [
                                'AggFunc' => 'avg|countDistinct|count|first|last|kurtosis|max|min|skewness|stddev_samp|stddev_pop|sum|sumDistinct|var_samp|var_pop',
                                'Column' => ['<string>', ...],
                            ],
                            // ...
                        ],
                        'Groups' => [
                            ['<string>', ...],
                            // ...
                        ],
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                    ],
                    'AmazonRedshiftSource' => [
                        'Data' => [
                            'AccessType' => '<string>',
                            'Action' => '<string>',
                            'AdvancedOptions' => [
                                [
                                    'Key' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                            'CatalogDatabase' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'CatalogRedshiftSchema' => '<string>',
                            'CatalogRedshiftTable' => '<string>',
                            'CatalogTable' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'Connection' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'CrawlerConnection' => '<string>',
                            'IamRole' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'MergeAction' => '<string>',
                            'MergeClause' => '<string>',
                            'MergeWhenMatched' => '<string>',
                            'MergeWhenNotMatched' => '<string>',
                            'PostAction' => '<string>',
                            'PreAction' => '<string>',
                            'SampleQuery' => '<string>',
                            'Schema' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'SelectedColumns' => [
                                [
                                    'Description' => '<string>',
                                    'Label' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                            'SourceType' => '<string>',
                            'StagingTable' => '<string>',
                            'Table' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'TablePrefix' => '<string>',
                            'TableSchema' => [
                                [
                                    'Description' => '<string>',
                                    'Label' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                            'TempDir' => '<string>',
                            'Upsert' => true || false,
                        ],
                        'Name' => '<string>',
                    ],
                    'AmazonRedshiftTarget' => [
                        'Data' => [
                            'AccessType' => '<string>',
                            'Action' => '<string>',
                            'AdvancedOptions' => [
                                [
                                    'Key' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                            'CatalogDatabase' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'CatalogRedshiftSchema' => '<string>',
                            'CatalogRedshiftTable' => '<string>',
                            'CatalogTable' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'Connection' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'CrawlerConnection' => '<string>',
                            'IamRole' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'MergeAction' => '<string>',
                            'MergeClause' => '<string>',
                            'MergeWhenMatched' => '<string>',
                            'MergeWhenNotMatched' => '<string>',
                            'PostAction' => '<string>',
                            'PreAction' => '<string>',
                            'SampleQuery' => '<string>',
                            'Schema' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'SelectedColumns' => [
                                [
                                    'Description' => '<string>',
                                    'Label' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                            'SourceType' => '<string>',
                            'StagingTable' => '<string>',
                            'Table' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'TablePrefix' => '<string>',
                            'TableSchema' => [
                                [
                                    'Description' => '<string>',
                                    'Label' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                            'TempDir' => '<string>',
                            'Upsert' => true || false,
                        ],
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                    ],
                    'ApplyMapping' => [
                        'Inputs' => ['<string>', ...],
                        'Mapping' => [
                            [
                                'Children' => [...], // RECURSIVE
                                'Dropped' => true || false,
                                'FromPath' => ['<string>', ...],
                                'FromType' => '<string>',
                                'ToKey' => '<string>',
                                'ToType' => '<string>',
                            ],
                            // ...
                        ],
                        'Name' => '<string>',
                    ],
                    'AthenaConnectorSource' => [
                        'ConnectionName' => '<string>',
                        'ConnectionTable' => '<string>',
                        'ConnectionType' => '<string>',
                        'ConnectorName' => '<string>',
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                        'SchemaName' => '<string>',
                    ],
                    'CatalogDeltaSource' => [
                        'AdditionalDeltaOptions' => ['<string>', ...],
                        'Database' => '<string>',
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                        'Table' => '<string>',
                    ],
                    'CatalogHudiSource' => [
                        'AdditionalHudiOptions' => ['<string>', ...],
                        'Database' => '<string>',
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                        'Table' => '<string>',
                    ],
                    'CatalogKafkaSource' => [
                        'DataPreviewOptions' => [
                            'PollingTime' => <integer>,
                            'RecordPollingLimit' => <integer>,
                        ],
                        'Database' => '<string>',
                        'DetectSchema' => true || false,
                        'Name' => '<string>',
                        'StreamingOptions' => [
                            'AddRecordTimestamp' => '<string>',
                            'Assign' => '<string>',
                            'BootstrapServers' => '<string>',
                            'Classification' => '<string>',
                            'ConnectionName' => '<string>',
                            'Delimiter' => '<string>',
                            'EmitConsumerLagMetrics' => '<string>',
                            'EndingOffsets' => '<string>',
                            'IncludeHeaders' => true || false,
                            'MaxOffsetsPerTrigger' => <integer>,
                            'MinPartitions' => <integer>,
                            'NumRetries' => <integer>,
                            'PollTimeoutMs' => <integer>,
                            'RetryIntervalMs' => <integer>,
                            'SecurityProtocol' => '<string>',
                            'StartingOffsets' => '<string>',
                            'StartingTimestamp' => <DateTime>,
                            'SubscribePattern' => '<string>',
                            'TopicName' => '<string>',
                        ],
                        'Table' => '<string>',
                        'WindowSize' => <integer>,
                    ],
                    'CatalogKinesisSource' => [
                        'DataPreviewOptions' => [
                            'PollingTime' => <integer>,
                            'RecordPollingLimit' => <integer>,
                        ],
                        'Database' => '<string>',
                        'DetectSchema' => true || false,
                        'Name' => '<string>',
                        'StreamingOptions' => [
                            'AddIdleTimeBetweenReads' => true || false,
                            'AddRecordTimestamp' => '<string>',
                            'AvoidEmptyBatches' => true || false,
                            'Classification' => '<string>',
                            'Delimiter' => '<string>',
                            'DescribeShardInterval' => <integer>,
                            'EmitConsumerLagMetrics' => '<string>',
                            'EndpointUrl' => '<string>',
                            'IdleTimeBetweenReadsInMs' => <integer>,
                            'MaxFetchRecordsPerShard' => <integer>,
                            'MaxFetchTimeInMs' => <integer>,
                            'MaxRecordPerRead' => <integer>,
                            'MaxRetryIntervalMs' => <integer>,
                            'NumRetries' => <integer>,
                            'RetryIntervalMs' => <integer>,
                            'RoleArn' => '<string>',
                            'RoleSessionName' => '<string>',
                            'StartingPosition' => 'latest|trim_horizon|earliest|timestamp',
                            'StartingTimestamp' => <DateTime>,
                            'StreamArn' => '<string>',
                            'StreamName' => '<string>',
                        ],
                        'Table' => '<string>',
                        'WindowSize' => <integer>,
                    ],
                    'CatalogSource' => [
                        'Database' => '<string>',
                        'Name' => '<string>',
                        'Table' => '<string>',
                    ],
                    'CatalogTarget' => [
                        'Database' => '<string>',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'PartitionKeys' => [
                            ['<string>', ...],
                            // ...
                        ],
                        'Table' => '<string>',
                    ],
                    'ConnectorDataSource' => [
                        'ConnectionType' => '<string>',
                        'Data' => ['<string>', ...],
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                    ],
                    'ConnectorDataTarget' => [
                        'ConnectionType' => '<string>',
                        'Data' => ['<string>', ...],
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                    ],
                    'CustomCode' => [
                        'ClassName' => '<string>',
                        'Code' => '<string>',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                    ],
                    'DirectJDBCSource' => [
                        'ConnectionName' => '<string>',
                        'ConnectionType' => 'sqlserver|mysql|oracle|postgresql|redshift',
                        'Database' => '<string>',
                        'Name' => '<string>',
                        'RedshiftTmpDir' => '<string>',
                        'Table' => '<string>',
                    ],
                    'DirectKafkaSource' => [
                        'DataPreviewOptions' => [
                            'PollingTime' => <integer>,
                            'RecordPollingLimit' => <integer>,
                        ],
                        'DetectSchema' => true || false,
                        'Name' => '<string>',
                        'StreamingOptions' => [
                            'AddRecordTimestamp' => '<string>',
                            'Assign' => '<string>',
                            'BootstrapServers' => '<string>',
                            'Classification' => '<string>',
                            'ConnectionName' => '<string>',
                            'Delimiter' => '<string>',
                            'EmitConsumerLagMetrics' => '<string>',
                            'EndingOffsets' => '<string>',
                            'IncludeHeaders' => true || false,
                            'MaxOffsetsPerTrigger' => <integer>,
                            'MinPartitions' => <integer>,
                            'NumRetries' => <integer>,
                            'PollTimeoutMs' => <integer>,
                            'RetryIntervalMs' => <integer>,
                            'SecurityProtocol' => '<string>',
                            'StartingOffsets' => '<string>',
                            'StartingTimestamp' => <DateTime>,
                            'SubscribePattern' => '<string>',
                            'TopicName' => '<string>',
                        ],
                        'WindowSize' => <integer>,
                    ],
                    'DirectKinesisSource' => [
                        'DataPreviewOptions' => [
                            'PollingTime' => <integer>,
                            'RecordPollingLimit' => <integer>,
                        ],
                        'DetectSchema' => true || false,
                        'Name' => '<string>',
                        'StreamingOptions' => [
                            'AddIdleTimeBetweenReads' => true || false,
                            'AddRecordTimestamp' => '<string>',
                            'AvoidEmptyBatches' => true || false,
                            'Classification' => '<string>',
                            'Delimiter' => '<string>',
                            'DescribeShardInterval' => <integer>,
                            'EmitConsumerLagMetrics' => '<string>',
                            'EndpointUrl' => '<string>',
                            'IdleTimeBetweenReadsInMs' => <integer>,
                            'MaxFetchRecordsPerShard' => <integer>,
                            'MaxFetchTimeInMs' => <integer>,
                            'MaxRecordPerRead' => <integer>,
                            'MaxRetryIntervalMs' => <integer>,
                            'NumRetries' => <integer>,
                            'RetryIntervalMs' => <integer>,
                            'RoleArn' => '<string>',
                            'RoleSessionName' => '<string>',
                            'StartingPosition' => 'latest|trim_horizon|earliest|timestamp',
                            'StartingTimestamp' => <DateTime>,
                            'StreamArn' => '<string>',
                            'StreamName' => '<string>',
                        ],
                        'WindowSize' => <integer>,
                    ],
                    'DropDuplicates' => [
                        'Columns' => [
                            ['<string>', ...],
                            // ...
                        ],
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                    ],
                    'DropFields' => [
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'Paths' => [
                            ['<string>', ...],
                            // ...
                        ],
                    ],
                    'DropNullFields' => [
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'NullCheckBoxList' => [
                            'IsEmpty' => true || false,
                            'IsNegOne' => true || false,
                            'IsNullString' => true || false,
                        ],
                        'NullTextList' => [
                            [
                                'Datatype' => [
                                    'Id' => '<string>',
                                    'Label' => '<string>',
                                ],
                                'Value' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    'DynamicTransform' => [
                        'FunctionName' => '<string>',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                        'Parameters' => [
                            [
                                'IsOptional' => true || false,
                                'ListType' => 'str|int|float|complex|bool|list|null',
                                'Name' => '<string>',
                                'Type' => 'str|int|float|complex|bool|list|null',
                                'ValidationMessage' => '<string>',
                                'ValidationRule' => '<string>',
                                'Value' => ['<string>', ...],
                            ],
                            // ...
                        ],
                        'Path' => '<string>',
                        'TransformName' => '<string>',
                        'Version' => '<string>',
                    ],
                    'DynamoDBCatalogSource' => [
                        'Database' => '<string>',
                        'Name' => '<string>',
                        'Table' => '<string>',
                    ],
                    'EvaluateDataQuality' => [
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'Output' => 'PrimaryInput|EvaluationResults',
                        'PublishingOptions' => [
                            'CloudWatchMetricsEnabled' => true || false,
                            'EvaluationContext' => '<string>',
                            'ResultsPublishingEnabled' => true || false,
                            'ResultsS3Prefix' => '<string>',
                        ],
                        'Ruleset' => '<string>',
                        'StopJobOnFailureOptions' => [
                            'StopJobOnFailureTiming' => 'Immediate|AfterDataLoad',
                        ],
                    ],
                    'EvaluateDataQualityMultiFrame' => [
                        'AdditionalDataSources' => ['<string>', ...],
                        'AdditionalOptions' => ['<string>', ...],
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'PublishingOptions' => [
                            'CloudWatchMetricsEnabled' => true || false,
                            'EvaluationContext' => '<string>',
                            'ResultsPublishingEnabled' => true || false,
                            'ResultsS3Prefix' => '<string>',
                        ],
                        'Ruleset' => '<string>',
                        'StopJobOnFailureOptions' => [
                            'StopJobOnFailureTiming' => 'Immediate|AfterDataLoad',
                        ],
                    ],
                    'FillMissingValues' => [
                        'FilledPath' => '<string>',
                        'ImputedPath' => '<string>',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                    ],
                    'Filter' => [
                        'Filters' => [
                            [
                                'Negated' => true || false,
                                'Operation' => 'EQ|LT|GT|LTE|GTE|REGEX|ISNULL',
                                'Values' => [
                                    [
                                        'Type' => 'COLUMNEXTRACTED|CONSTANT',
                                        'Value' => ['<string>', ...],
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                        'Inputs' => ['<string>', ...],
                        'LogicalOperator' => 'AND|OR',
                        'Name' => '<string>',
                    ],
                    'GovernedCatalogSource' => [
                        'AdditionalOptions' => [
                            'BoundedFiles' => <integer>,
                            'BoundedSize' => <integer>,
                        ],
                        'Database' => '<string>',
                        'Name' => '<string>',
                        'PartitionPredicate' => '<string>',
                        'Table' => '<string>',
                    ],
                    'GovernedCatalogTarget' => [
                        'Database' => '<string>',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'PartitionKeys' => [
                            ['<string>', ...],
                            // ...
                        ],
                        'SchemaChangePolicy' => [
                            'EnableUpdateCatalog' => true || false,
                            'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                        ],
                        'Table' => '<string>',
                    ],
                    'JDBCConnectorSource' => [
                        'AdditionalOptions' => [
                            'DataTypeMapping' => ['<string>', ...],
                            'FilterPredicate' => '<string>',
                            'JobBookmarkKeys' => ['<string>', ...],
                            'JobBookmarkKeysSortOrder' => '<string>',
                            'LowerBound' => <integer>,
                            'NumPartitions' => <integer>,
                            'PartitionColumn' => '<string>',
                            'UpperBound' => <integer>,
                        ],
                        'ConnectionName' => '<string>',
                        'ConnectionTable' => '<string>',
                        'ConnectionType' => '<string>',
                        'ConnectorName' => '<string>',
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                        'Query' => '<string>',
                    ],
                    'JDBCConnectorTarget' => [
                        'AdditionalOptions' => ['<string>', ...],
                        'ConnectionName' => '<string>',
                        'ConnectionTable' => '<string>',
                        'ConnectionType' => '<string>',
                        'ConnectorName' => '<string>',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                    ],
                    'Join' => [
                        'Columns' => [
                            [
                                'From' => '<string>',
                                'Keys' => [
                                    ['<string>', ...],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                        'Inputs' => ['<string>', ...],
                        'JoinType' => 'equijoin|left|right|outer|leftsemi|leftanti',
                        'Name' => '<string>',
                    ],
                    'Merge' => [
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'PrimaryKeys' => [
                            ['<string>', ...],
                            // ...
                        ],
                        'Source' => '<string>',
                    ],
                    'MicrosoftSQLServerCatalogSource' => [
                        'Database' => '<string>',
                        'Name' => '<string>',
                        'Table' => '<string>',
                    ],
                    'MicrosoftSQLServerCatalogTarget' => [
                        'Database' => '<string>',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'Table' => '<string>',
                    ],
                    'MySQLCatalogSource' => [
                        'Database' => '<string>',
                        'Name' => '<string>',
                        'Table' => '<string>',
                    ],
                    'MySQLCatalogTarget' => [
                        'Database' => '<string>',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'Table' => '<string>',
                    ],
                    'OracleSQLCatalogSource' => [
                        'Database' => '<string>',
                        'Name' => '<string>',
                        'Table' => '<string>',
                    ],
                    'OracleSQLCatalogTarget' => [
                        'Database' => '<string>',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'Table' => '<string>',
                    ],
                    'PIIDetection' => [
                        'EntityTypesToDetect' => ['<string>', ...],
                        'Inputs' => ['<string>', ...],
                        'MaskValue' => '<string>',
                        'Name' => '<string>',
                        'OutputColumnName' => '<string>',
                        'PiiType' => 'RowAudit|RowMasking|ColumnAudit|ColumnMasking',
                        'SampleFraction' => <float>,
                        'ThresholdFraction' => <float>,
                    ],
                    'PostgreSQLCatalogSource' => [
                        'Database' => '<string>',
                        'Name' => '<string>',
                        'Table' => '<string>',
                    ],
                    'PostgreSQLCatalogTarget' => [
                        'Database' => '<string>',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'Table' => '<string>',
                    ],
                    'Recipe' => [
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'RecipeReference' => [
                            'RecipeArn' => '<string>',
                            'RecipeVersion' => '<string>',
                        ],
                        'RecipeSteps' => [
                            [
                                'Action' => [
                                    'Operation' => '<string>',
                                    'Parameters' => ['<string>', ...],
                                ],
                                'ConditionExpressions' => [
                                    [
                                        'Condition' => '<string>',
                                        'TargetColumn' => '<string>',
                                        'Value' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                    ],
                    'RedshiftSource' => [
                        'Database' => '<string>',
                        'Name' => '<string>',
                        'RedshiftTmpDir' => '<string>',
                        'Table' => '<string>',
                        'TmpDirIAMRole' => '<string>',
                    ],
                    'RedshiftTarget' => [
                        'Database' => '<string>',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'RedshiftTmpDir' => '<string>',
                        'Table' => '<string>',
                        'TmpDirIAMRole' => '<string>',
                        'UpsertRedshiftOptions' => [
                            'ConnectionName' => '<string>',
                            'TableLocation' => '<string>',
                            'UpsertKeys' => ['<string>', ...],
                        ],
                    ],
                    'RelationalCatalogSource' => [
                        'Database' => '<string>',
                        'Name' => '<string>',
                        'Table' => '<string>',
                    ],
                    'RenameField' => [
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'SourcePath' => ['<string>', ...],
                        'TargetPath' => ['<string>', ...],
                    ],
                    'S3CatalogDeltaSource' => [
                        'AdditionalDeltaOptions' => ['<string>', ...],
                        'Database' => '<string>',
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                        'Table' => '<string>',
                    ],
                    'S3CatalogHudiSource' => [
                        'AdditionalHudiOptions' => ['<string>', ...],
                        'Database' => '<string>',
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                        'Table' => '<string>',
                    ],
                    'S3CatalogSource' => [
                        'AdditionalOptions' => [
                            'BoundedFiles' => <integer>,
                            'BoundedSize' => <integer>,
                        ],
                        'Database' => '<string>',
                        'Name' => '<string>',
                        'PartitionPredicate' => '<string>',
                        'Table' => '<string>',
                    ],
                    'S3CatalogTarget' => [
                        'Database' => '<string>',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'PartitionKeys' => [
                            ['<string>', ...],
                            // ...
                        ],
                        'SchemaChangePolicy' => [
                            'EnableUpdateCatalog' => true || false,
                            'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                        ],
                        'Table' => '<string>',
                    ],
                    'S3CsvSource' => [
                        'AdditionalOptions' => [
                            'BoundedFiles' => <integer>,
                            'BoundedSize' => <integer>,
                            'EnableSamplePath' => true || false,
                            'SamplePath' => '<string>',
                        ],
                        'CompressionType' => 'gzip|bzip2',
                        'Escaper' => '<string>',
                        'Exclusions' => ['<string>', ...],
                        'GroupFiles' => '<string>',
                        'GroupSize' => '<string>',
                        'MaxBand' => <integer>,
                        'MaxFilesInBand' => <integer>,
                        'Multiline' => true || false,
                        'Name' => '<string>',
                        'OptimizePerformance' => true || false,
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                        'Paths' => ['<string>', ...],
                        'QuoteChar' => 'quote|quillemet|single_quote|disabled',
                        'Recurse' => true || false,
                        'Separator' => 'comma|ctrla|pipe|semicolon|tab',
                        'SkipFirst' => true || false,
                        'WithHeader' => true || false,
                        'WriteHeader' => true || false,
                    ],
                    'S3DeltaCatalogTarget' => [
                        'AdditionalOptions' => ['<string>', ...],
                        'Database' => '<string>',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'PartitionKeys' => [
                            ['<string>', ...],
                            // ...
                        ],
                        'SchemaChangePolicy' => [
                            'EnableUpdateCatalog' => true || false,
                            'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                        ],
                        'Table' => '<string>',
                    ],
                    'S3DeltaDirectTarget' => [
                        'AdditionalOptions' => ['<string>', ...],
                        'Compression' => 'uncompressed|snappy',
                        'Format' => 'json|csv|avro|orc|parquet|hudi|delta',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'PartitionKeys' => [
                            ['<string>', ...],
                            // ...
                        ],
                        'Path' => '<string>',
                        'SchemaChangePolicy' => [
                            'Database' => '<string>',
                            'EnableUpdateCatalog' => true || false,
                            'Table' => '<string>',
                            'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                        ],
                    ],
                    'S3DeltaSource' => [
                        'AdditionalDeltaOptions' => ['<string>', ...],
                        'AdditionalOptions' => [
                            'BoundedFiles' => <integer>,
                            'BoundedSize' => <integer>,
                            'EnableSamplePath' => true || false,
                            'SamplePath' => '<string>',
                        ],
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                        'Paths' => ['<string>', ...],
                    ],
                    'S3DirectTarget' => [
                        'Compression' => '<string>',
                        'Format' => 'json|csv|avro|orc|parquet|hudi|delta',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'PartitionKeys' => [
                            ['<string>', ...],
                            // ...
                        ],
                        'Path' => '<string>',
                        'SchemaChangePolicy' => [
                            'Database' => '<string>',
                            'EnableUpdateCatalog' => true || false,
                            'Table' => '<string>',
                            'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                        ],
                    ],
                    'S3GlueParquetTarget' => [
                        'Compression' => 'snappy|lzo|gzip|uncompressed|none',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'PartitionKeys' => [
                            ['<string>', ...],
                            // ...
                        ],
                        'Path' => '<string>',
                        'SchemaChangePolicy' => [
                            'Database' => '<string>',
                            'EnableUpdateCatalog' => true || false,
                            'Table' => '<string>',
                            'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                        ],
                    ],
                    'S3HudiCatalogTarget' => [
                        'AdditionalOptions' => ['<string>', ...],
                        'Database' => '<string>',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'PartitionKeys' => [
                            ['<string>', ...],
                            // ...
                        ],
                        'SchemaChangePolicy' => [
                            'EnableUpdateCatalog' => true || false,
                            'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                        ],
                        'Table' => '<string>',
                    ],
                    'S3HudiDirectTarget' => [
                        'AdditionalOptions' => ['<string>', ...],
                        'Compression' => 'gzip|lzo|uncompressed|snappy',
                        'Format' => 'json|csv|avro|orc|parquet|hudi|delta',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'PartitionKeys' => [
                            ['<string>', ...],
                            // ...
                        ],
                        'Path' => '<string>',
                        'SchemaChangePolicy' => [
                            'Database' => '<string>',
                            'EnableUpdateCatalog' => true || false,
                            'Table' => '<string>',
                            'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                        ],
                    ],
                    'S3HudiSource' => [
                        'AdditionalHudiOptions' => ['<string>', ...],
                        'AdditionalOptions' => [
                            'BoundedFiles' => <integer>,
                            'BoundedSize' => <integer>,
                            'EnableSamplePath' => true || false,
                            'SamplePath' => '<string>',
                        ],
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                        'Paths' => ['<string>', ...],
                    ],
                    'S3JsonSource' => [
                        'AdditionalOptions' => [
                            'BoundedFiles' => <integer>,
                            'BoundedSize' => <integer>,
                            'EnableSamplePath' => true || false,
                            'SamplePath' => '<string>',
                        ],
                        'CompressionType' => 'gzip|bzip2',
                        'Exclusions' => ['<string>', ...],
                        'GroupFiles' => '<string>',
                        'GroupSize' => '<string>',
                        'JsonPath' => '<string>',
                        'MaxBand' => <integer>,
                        'MaxFilesInBand' => <integer>,
                        'Multiline' => true || false,
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                        'Paths' => ['<string>', ...],
                        'Recurse' => true || false,
                    ],
                    'S3ParquetSource' => [
                        'AdditionalOptions' => [
                            'BoundedFiles' => <integer>,
                            'BoundedSize' => <integer>,
                            'EnableSamplePath' => true || false,
                            'SamplePath' => '<string>',
                        ],
                        'CompressionType' => 'snappy|lzo|gzip|uncompressed|none',
                        'Exclusions' => ['<string>', ...],
                        'GroupFiles' => '<string>',
                        'GroupSize' => '<string>',
                        'MaxBand' => <integer>,
                        'MaxFilesInBand' => <integer>,
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                        'Paths' => ['<string>', ...],
                        'Recurse' => true || false,
                    ],
                    'SelectFields' => [
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'Paths' => [
                            ['<string>', ...],
                            // ...
                        ],
                    ],
                    'SelectFromCollection' => [
                        'Index' => <integer>,
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                    ],
                    'SnowflakeSource' => [
                        'Data' => [
                            'Action' => '<string>',
                            'AdditionalOptions' => ['<string>', ...],
                            'AutoPushdown' => true || false,
                            'Connection' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'Database' => '<string>',
                            'IamRole' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'MergeAction' => '<string>',
                            'MergeClause' => '<string>',
                            'MergeWhenMatched' => '<string>',
                            'MergeWhenNotMatched' => '<string>',
                            'PostAction' => '<string>',
                            'PreAction' => '<string>',
                            'SampleQuery' => '<string>',
                            'Schema' => '<string>',
                            'SelectedColumns' => [
                                [
                                    'Description' => '<string>',
                                    'Label' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                            'SourceType' => '<string>',
                            'StagingTable' => '<string>',
                            'Table' => '<string>',
                            'TableSchema' => [
                                [
                                    'Description' => '<string>',
                                    'Label' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                            'TempDir' => '<string>',
                            'Upsert' => true || false,
                        ],
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                    ],
                    'SnowflakeTarget' => [
                        'Data' => [
                            'Action' => '<string>',
                            'AdditionalOptions' => ['<string>', ...],
                            'AutoPushdown' => true || false,
                            'Connection' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'Database' => '<string>',
                            'IamRole' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'MergeAction' => '<string>',
                            'MergeClause' => '<string>',
                            'MergeWhenMatched' => '<string>',
                            'MergeWhenNotMatched' => '<string>',
                            'PostAction' => '<string>',
                            'PreAction' => '<string>',
                            'SampleQuery' => '<string>',
                            'Schema' => '<string>',
                            'SelectedColumns' => [
                                [
                                    'Description' => '<string>',
                                    'Label' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                            'SourceType' => '<string>',
                            'StagingTable' => '<string>',
                            'Table' => '<string>',
                            'TableSchema' => [
                                [
                                    'Description' => '<string>',
                                    'Label' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                            'TempDir' => '<string>',
                            'Upsert' => true || false,
                        ],
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                    ],
                    'SparkConnectorSource' => [
                        'AdditionalOptions' => ['<string>', ...],
                        'ConnectionName' => '<string>',
                        'ConnectionType' => '<string>',
                        'ConnectorName' => '<string>',
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                    ],
                    'SparkConnectorTarget' => [
                        'AdditionalOptions' => ['<string>', ...],
                        'ConnectionName' => '<string>',
                        'ConnectionType' => '<string>',
                        'ConnectorName' => '<string>',
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                    ],
                    'SparkSQL' => [
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [
                                        'Name' => '<string>',
                                        'Type' => '<string>',
                                    ],
                                    // ...
                                ],
                            ],
                            // ...
                        ],
                        'SqlAliases' => [
                            [
                                'Alias' => '<string>',
                                'From' => '<string>',
                            ],
                            // ...
                        ],
                        'SqlQuery' => '<string>',
                    ],
                    'Spigot' => [
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'Path' => '<string>',
                        'Prob' => <float>,
                        'Topk' => <integer>,
                    ],
                    'SplitFields' => [
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'Paths' => [
                            ['<string>', ...],
                            // ...
                        ],
                    ],
                    'Union' => [
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                        'UnionType' => 'ALL|DISTINCT',
                    ],
                ],
                // ...
            ],
            'Command' => [
                'Name' => '<string>',
                'PythonVersion' => '<string>',
                'Runtime' => '<string>',
                'ScriptLocation' => '<string>',
            ],
            'Connections' => [
                'Connections' => ['<string>', ...],
            ],
            'CreatedOn' => <DateTime>,
            'DefaultArguments' => ['<string>', ...],
            'Description' => '<string>',
            'ExecutionClass' => 'FLEX|STANDARD',
            'ExecutionProperty' => [
                'MaxConcurrentRuns' => <integer>,
            ],
            'GlueVersion' => '<string>',
            'JobMode' => 'SCRIPT|VISUAL|NOTEBOOK',
            'JobRunQueuingEnabled' => true || false,
            'LastModifiedOn' => <DateTime>,
            'LogUri' => '<string>',
            'MaintenanceWindow' => '<string>',
            'MaxCapacity' => <float>,
            'MaxRetries' => <integer>,
            'Name' => '<string>',
            'NonOverridableArguments' => ['<string>', ...],
            'NotificationProperty' => [
                'NotifyDelayAfter' => <integer>,
            ],
            'NumberOfWorkers' => <integer>,
            'ProfileName' => '<string>',
            'Role' => '<string>',
            'SecurityConfiguration' => '<string>',
            'SourceControlDetails' => [
                'AuthStrategy' => 'PERSONAL_ACCESS_TOKEN|AWS_SECRETS_MANAGER',
                'AuthToken' => '<string>',
                'Branch' => '<string>',
                'Folder' => '<string>',
                'LastCommitId' => '<string>',
                'Owner' => '<string>',
                'Provider' => 'GITHUB|GITLAB|BITBUCKET|AWS_CODE_COMMIT',
                'Repository' => '<string>',
            ],
            'Timeout' => <integer>,
            'WorkerType' => 'Standard|G.1X|G.2X|G.025X|G.4X|G.8X|Z.2X',
        ],
        // ...
    ],
    'JobsNotFound' => ['<string>', ...],
]

Result Details

Members
Jobs
Type: Array of Job structures

A list of job definitions.

JobsNotFound
Type: Array of strings

A list of names of jobs not found.
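
As a minimal sketch of how the two result members might be consumed together, the helper below indexes the returned job definitions by name and keeps the list of names that were not found. The function name `indexJobsByName` and the sample data are hypothetical; only the `Jobs`, `Name`, and `JobsNotFound` keys come from the result syntax above.

```php
<?php
// Hypothetical helper: splits a BatchGetJobs result array into
// found jobs (keyed by job name) and the names that were not found.
function indexJobsByName(array $result): array
{
    $found = [];
    foreach ($result['Jobs'] ?? [] as $job) {
        // Each Job structure carries its own 'Name' member.
        $found[$job['Name']] = $job;
    }
    return [
        'found' => $found,
        'missing' => $result['JobsNotFound'] ?? [],
    ];
}

// Sample data shaped like the result syntax (values are made up).
$sample = [
    'Jobs' => [
        ['Name' => 'nightly-etl', 'GlueVersion' => '4.0'],
    ],
    'JobsNotFound' => ['old-job'],
];
$split = indexJobsByName($sample);
```

With a real client, the result object can be converted first, for example `indexJobsByName($result->toArray())`.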

Errors

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.

BatchGetPartition

$result = $client->batchGetPartition([/* ... */]);
$promise = $client->batchGetPartitionAsync([/* ... */]);

Retrieves partitions in a batch request.

Parameter Syntax

$result = $client->batchGetPartition([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionsToGet' => [ // REQUIRED
        [
            'Values' => ['<string>', ...], // REQUIRED
        ],
        // ...
    ],
    'TableName' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.

DatabaseName
Required: Yes
Type: string

The name of the catalog database where the partitions reside.

PartitionsToGet
Required: Yes
Type: Array of PartitionValueList structures

A list of partition values identifying the partitions to retrieve.

TableName
Required: Yes
Type: string

The name of the table that contains the partitions.
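
The parameters above could be assembled as in the following sketch. The helper name `buildBatchGetPartitionParams` and the example database, table, and partition values are hypothetical; the array keys follow the parameter syntax shown earlier.

```php
<?php
// Hypothetical helper: builds the parameter array for batchGetPartition.
// $valueLists is a list of partition value lists, e.g. [['2024', '01'], ...].
function buildBatchGetPartitionParams(
    string $databaseName,
    string $tableName,
    array $valueLists,
    ?string $catalogId = null
): array {
    $params = [
        'DatabaseName' => $databaseName,
        'TableName' => $tableName,
        // Each PartitionValueList wraps one list of values under 'Values'.
        'PartitionsToGet' => array_map(
            fn(array $values): array => ['Values' => array_values($values)],
            $valueLists
        ),
    ];
    // CatalogId is optional; leave it out to use the account default.
    if ($catalogId !== null) {
        $params['CatalogId'] = $catalogId;
    }
    return $params;
}

// Example values are made up for illustration.
$params = buildBatchGetPartitionParams('sales_db', 'orders', [['2024', '01'], ['2024', '02']]);
// $result = $client->batchGetPartition($params);
```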

Result Syntax

[
    'Partitions' => [
        [
            'CatalogId' => '<string>',
            'CreationTime' => <DateTime>,
            'DatabaseName' => '<string>',
            'LastAccessTime' => <DateTime>,
            'LastAnalyzedTime' => <DateTime>,
            'Parameters' => ['<string>', ...],
            'StorageDescriptor' => [
                'AdditionalLocations' => ['<string>', ...],
                'BucketColumns' => ['<string>', ...],
                'Columns' => [
                    [
                        'Comment' => '<string>',
                        'Name' => '<string>',
                        'Parameters' => ['<string>', ...],
                        'Type' => '<string>',
                    ],
                    // ...
                ],
                'Compressed' => true || false,
                'InputFormat' => '<string>',
                'Location' => '<string>',
                'NumberOfBuckets' => <integer>,
                'OutputFormat' => '<string>',
                'Parameters' => ['<string>', ...],
                'SchemaReference' => [
                    'SchemaId' => [
                        'RegistryName' => '<string>',
                        'SchemaArn' => '<string>',
                        'SchemaName' => '<string>',
                    ],
                    'SchemaVersionId' => '<string>',
                    'SchemaVersionNumber' => <integer>,
                ],
                'SerdeInfo' => [
                    'Name' => '<string>',
                    'Parameters' => ['<string>', ...],
                    'SerializationLibrary' => '<string>',
                ],
                'SkewedInfo' => [
                    'SkewedColumnNames' => ['<string>', ...],
                    'SkewedColumnValueLocationMaps' => ['<string>', ...],
                    'SkewedColumnValues' => ['<string>', ...],
                ],
                'SortColumns' => [
                    [
                        'Column' => '<string>',
                        'SortOrder' => <integer>,
                    ],
                    // ...
                ],
                'StoredAsSubDirectories' => true || false,
            ],
            'TableName' => '<string>',
            'Values' => ['<string>', ...],
        ],
        // ...
    ],
    'UnprocessedKeys' => [
        [
            'Values' => ['<string>', ...],
        ],
        // ...
    ],
]

Result Details

Members
Partitions
Type: Array of Partition structures

A list of the requested partitions.

UnprocessedKeys
Type: Array of PartitionValueList structures

A list of the partition values in the request for which partitions were not returned.
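
One way to handle `UnprocessedKeys` is to resubmit them in a follow-up request. The sketch below assumes that approach; the description above does not say why a key went unprocessed (it may simply not exist), so the loop is bounded by a hypothetical `$maxAttempts` parameter. The `$fetch` callable stands in for a call to `batchGetPartition` and is replaced here by a fake for illustration.

```php
<?php
// Hypothetical helper: calls $fetch repeatedly, resubmitting UnprocessedKeys
// until every partition is returned or $maxAttempts is reached.
// $fetch receives a 'PartitionsToGet' list and returns a result-shaped array.
function fetchAllPartitions(callable $fetch, array $partitionsToGet, int $maxAttempts = 3): array
{
    $partitions = [];
    $pending = $partitionsToGet;
    for ($attempt = 0; $attempt < $maxAttempts && $pending !== []; $attempt++) {
        $result = $fetch($pending);
        $partitions = array_merge($partitions, $result['Partitions'] ?? []);
        // Whatever was not returned becomes the next request.
        $pending = $result['UnprocessedKeys'] ?? [];
    }
    return ['Partitions' => $partitions, 'UnprocessedKeys' => $pending];
}

// Fake fetcher for illustration: returns only the first requested partition
// and reports the rest as unprocessed.
$fakeFetch = function (array $keys): array {
    $first = array_shift($keys);
    return [
        'Partitions' => [['Values' => $first['Values']]],
        'UnprocessedKeys' => $keys,
    ];
};

$all = fetchAllPartitions($fakeFetch, [
    ['Values' => ['2024', '01']],
    ['Values' => ['2024', '02']],
    ['Values' => ['2024', '03']],
]);
```

With a real client, `$fetch` could wrap `$client->batchGetPartition()` and merge the pending keys into the full parameter array before each call.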

Errors

InvalidInputException:

The input provided was not valid.

EntityNotFoundException:

A specified entity does not exist.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.

GlueEncryptionException:

An encryption operation failed.

InvalidStateException:

An error that indicates your data is in an invalid state.

FederationSourceException:

A federation source failed.

FederationSourceRetryableException:

A federation source failed, but the operation may be retried.
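
The error list above marks only `FederationSourceRetryableException` as retryable. A caller might encode that distinction as in this sketch; the helper name `isRetryableGlueError` is hypothetical, and treating `OperationTimeoutException` as retryable is an assumption that is off by default.

```php
<?php
// Hypothetical helper: decides whether an error code from the list above
// is worth retrying. Only FederationSourceRetryableException is described
// as retryable; retrying OperationTimeoutException is an opt-in assumption.
function isRetryableGlueError(string $errorCode, bool $retryTimeouts = false): bool
{
    if ($errorCode === 'FederationSourceRetryableException') {
        return true;
    }
    return $retryTimeouts && $errorCode === 'OperationTimeoutException';
}

$shouldRetry = isRetryableGlueError('FederationSourceRetryableException');
```

When using the SDK, the error code would typically come from `Aws\Exception\AwsException::getAwsErrorCode()` inside a `catch` block.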

BatchGetTableOptimizer

$result = $client->batchGetTableOptimizer([/* ... */]);
$promise = $client->batchGetTableOptimizerAsync([/* ... */]);

Returns the configuration for the specified table optimizers.

Parameter Syntax

$result = $client->batchGetTableOptimizer([
    'Entries' => [ // REQUIRED
        [
            'catalogId' => '<string>',
            'databaseName' => '<string>',
            'tableName' => '<string>',
            'type' => 'compaction|retention|orphan_file_deletion',
        ],
        // ...
    ],
]);

Parameter Details

Members
Entries
Required: Yes
Type: Array of BatchGetTableOptimizerEntry structures

A list of BatchGetTableOptimizerEntry objects specifying the table optimizers to retrieve.
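
As an illustration, the `Entries` list for a single table could be built as follows. The helper name `buildOptimizerEntries` and the example identifiers are hypothetical; the allowed `type` values and the lower camel case keys are taken from the parameter syntax above.

```php
<?php
// Hypothetical helper: builds one BatchGetTableOptimizerEntry per optimizer
// type for a single table. Allowed types come from the parameter syntax.
function buildOptimizerEntries(
    string $catalogId,
    string $databaseName,
    string $tableName,
    array $types
): array {
    $allowed = ['compaction', 'retention', 'orphan_file_deletion'];
    $entries = [];
    foreach ($types as $type) {
        if (!in_array($type, $allowed, true)) {
            throw new InvalidArgumentException("Unknown optimizer type: $type");
        }
        // Note the lower camel case keys used by this operation.
        $entries[] = [
            'catalogId' => $catalogId,
            'databaseName' => $databaseName,
            'tableName' => $tableName,
            'type' => $type,
        ];
    }
    return ['Entries' => $entries];
}

// Example identifiers are made up for illustration.
$params = buildOptimizerEntries('123456789012', 'sales_db', 'orders', ['compaction', 'retention']);
// $result = $client->batchGetTableOptimizer($params);
```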

Result Syntax

[
    'Failures' => [
        [
            'catalogId' => '<string>',
            'databaseName' => '<string>',
            'error' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
            'tableName' => '<string>',
            'type' => 'compaction|retention|orphan_file_deletion',
        ],
        // ...
    ],
    'TableOptimizers' => [
        [
            'catalogId' => '<string>',
            'databaseName' => '<string>',
            'tableName' => '<string>',
            'tableOptimizer' => [
                'configuration' => [
                    'enabled' => true || false,
                    'orphanFileDeletionConfiguration' => [
                        'icebergConfiguration' => [
                            'location' => '<string>',
                            'orphanFileRetentionPeriodInDays' => <integer>,
                        ],
                    ],
                    'retentionConfiguration' => [
                        'icebergConfiguration' => [
                            'cleanExpiredFiles' => true || false,
                            'numberOfSnapshotsToRetain' => <integer>,
                            'snapshotRetentionPeriodInDays' => <integer>,
                        ],
                    ],
                    'roleArn' => '<string>',
                    'vpcConfiguration' => [
                        'glueConnectionName' => '<string>',
                    ],
                ],
                'lastRun' => [
                    'compactionMetrics' => [
                        'IcebergMetrics' => [
                            'JobDurationInHour' => <float>,
                            'NumberOfBytesCompacted' => <integer>,
                            'NumberOfDpus' => <integer>,
                            'NumberOfFilesCompacted' => <integer>,
                        ],
                    ],
                    'endTimestamp' => <DateTime>,
                    'error' => '<string>',
                    'eventType' => 'starting|completed|failed|in_progress',
                    'metrics' => [
                        'JobDurationInHour' => '<string>',
                        'NumberOfBytesCompacted' => '<string>',
                        'NumberOfDpus' => '<string>',
                        'NumberOfFilesCompacted' => '<string>',
                    ],
                    'orphanFileDeletionMetrics' => [
                        'IcebergMetrics' => [
                            'JobDurationInHour' => <float>,
                            'NumberOfDpus' => <integer>,
                            'NumberOfOrphanFilesDeleted' => <integer>,
                        ],
                    ],
                    'retentionMetrics' => [
                        'IcebergMetrics' => [
                            'JobDurationInHour' => <float>,
                            'NumberOfDataFilesDeleted' => <integer>,
                            'NumberOfDpus' => <integer>,
                            'NumberOfManifestFilesDeleted' => <integer>,
                            'NumberOfManifestListsDeleted' => <integer>,
                        ],
                    ],
                    'startTimestamp' => <DateTime>,
                ],
                'type' => 'compaction|retention|orphan_file_deletion',
            ],
        ],
        // ...
    ],
]

Result Details

Members
Failures
Type: Array of BatchGetTableOptimizerError structures

A list of errors from the operation.

TableOptimizers
Type: Array of BatchTableOptimizer structures

A list of BatchTableOptimizer objects.
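
Because one response can carry both successes and failures, a caller will usually want to inspect both members. The following sketch shows one possible way to do that; the helper name `summarizeOptimizerResult` and the sample data are hypothetical, and the keys follow the result syntax above.

```php
<?php
// Hypothetical helper: reduces a BatchGetTableOptimizer result array to
// (a) the enabled flag per table and optimizer type, and
// (b) one readable line per failure.
function summarizeOptimizerResult(array $result): array
{
    $configured = [];
    foreach ($result['TableOptimizers'] ?? [] as $item) {
        $table = $item['databaseName'] . '.' . $item['tableName'];
        $type = $item['tableOptimizer']['type'] ?? 'unknown';
        $configured[$table][$type] = $item['tableOptimizer']['configuration']['enabled'] ?? null;
    }

    $failures = [];
    foreach ($result['Failures'] ?? [] as $failure) {
        $failures[] = sprintf(
            '%s.%s (%s): %s - %s',
            $failure['databaseName'],
            $failure['tableName'],
            $failure['type'],
            $failure['error']['ErrorCode'] ?? 'UnknownError',
            $failure['error']['ErrorMessage'] ?? ''
        );
    }
    return ['configured' => $configured, 'failures' => $failures];
}

// Sample data shaped like the result syntax (values are made up).
$summary = summarizeOptimizerResult([
    'TableOptimizers' => [[
        'catalogId' => '123456789012',
        'databaseName' => 'sales_db',
        'tableName' => 'orders',
        'tableOptimizer' => ['type' => 'compaction', 'configuration' => ['enabled' => true]],
    ]],
    'Failures' => [[
        'catalogId' => '123456789012',
        'databaseName' => 'sales_db',
        'tableName' => 'returns',
        'type' => 'retention',
        'error' => ['ErrorCode' => 'EntityNotFoundException', 'ErrorMessage' => 'Not found'],
    ]],
]);
```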

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

AccessDeniedException:

Access to a resource was denied.

InternalServiceException:

An internal service error occurred.

ThrottlingException:

The throttling threshold was exceeded.

BatchGetTriggers

$result = $client->batchGetTriggers([/* ... */]);
$promise = $client->batchGetTriggersAsync([/* ... */]);

Returns a list of resource metadata for a given list of trigger names. After calling the ListTriggers operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.

Parameter Syntax

$result = $client->batchGetTriggers([
    'TriggerNames' => ['<string>', ...], // REQUIRED
]);

Parameter Details

Members
TriggerNames
Required: Yes
Type: Array of strings

A list of trigger names, which may be the names returned from the ListTriggers operation.

Result Syntax

[
    'Triggers' => [
        [
            'Actions' => [
                [
                    'Arguments' => ['<string>', ...],
                    'CrawlerName' => '<string>',
                    'JobName' => '<string>',
                    'NotificationProperty' => [
                        'NotifyDelayAfter' => <integer>,
                    ],
                    'SecurityConfiguration' => '<string>',
                    'Timeout' => <integer>,
                ],
                // ...
            ],
            'Description' => '<string>',
            'EventBatchingCondition' => [
                'BatchSize' => <integer>,
                'BatchWindow' => <integer>,
            ],
            'Id' => '<string>',
            'Name' => '<string>',
            'Predicate' => [
                'Conditions' => [
                    [
                        'CrawlState' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED|ERROR',
                        'CrawlerName' => '<string>',
                        'JobName' => '<string>',
                        'LogicalOperator' => 'EQUALS',
                        'State' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT|ERROR|WAITING|EXPIRED',
                    ],
                    // ...
                ],
                'Logical' => 'AND|ANY',
            ],
            'Schedule' => '<string>',
            'State' => 'CREATING|CREATED|ACTIVATING|ACTIVATED|DEACTIVATING|DEACTIVATED|DELETING|UPDATING',
            'Type' => 'SCHEDULED|CONDITIONAL|ON_DEMAND|EVENT',
            'WorkflowName' => '<string>',
        ],
        // ...
    ],
    'TriggersNotFound' => ['<string>', ...],
]

Result Details

Members
Triggers
Type: Array of Trigger structures

A list of trigger definitions.

TriggersNotFound
Type: Array of strings

A list of names of triggers not found.
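
As a small sketch of working with this result, the helper below groups the names of the returned triggers by their `Type` member. The function name `groupTriggerNamesByType` and the sample trigger names are hypothetical; the `Triggers`, `Name`, and `Type` keys come from the result syntax above.

```php
<?php
// Hypothetical helper: groups returned trigger names by trigger type
// (SCHEDULED, CONDITIONAL, ON_DEMAND, or EVENT).
function groupTriggerNamesByType(array $result): array
{
    $groups = [];
    foreach ($result['Triggers'] ?? [] as $trigger) {
        // Fall back to 'UNKNOWN' if a trigger has no Type member.
        $groups[$trigger['Type'] ?? 'UNKNOWN'][] = $trigger['Name'];
    }
    return $groups;
}

// Sample data shaped like the result syntax (values are made up).
$groups = groupTriggerNamesByType([
    'Triggers' => [
        ['Name' => 'nightly', 'Type' => 'SCHEDULED'],
        ['Name' => 'after-crawl', 'Type' => 'CONDITIONAL'],
        ['Name' => 'weekly', 'Type' => 'SCHEDULED'],
    ],
    'TriggersNotFound' => ['retired-trigger'],
]);
```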

Errors

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.

BatchGetWorkflows

$result = $client->batchGetWorkflows([/* ... */]);
$promise = $client->batchGetWorkflowsAsync([/* ... */]);

Returns a list of resource metadata for a given list of workflow names. After calling the ListWorkflows operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.

Parameter Syntax

$result = $client->batchGetWorkflows([
    'IncludeGraph' => true || false,
    'Names' => ['<string>', ...], // REQUIRED
]);

Parameter Details

Members
IncludeGraph
Type: boolean

Specifies whether to include a graph when returning the workflow resource metadata.

Names
Required: Yes
Type: Array of strings

A list of workflow names, which may be the names returned from the ListWorkflows operation.
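
When the list of workflow names is long, it may be convenient to split it across several requests. The sketch below assumes the caller knows a suitable batch size; this section does not state a per-request limit, so `$batchSize` is a required, hypothetical parameter, as is the helper name `buildWorkflowBatches`.

```php
<?php
// Hypothetical helper: de-duplicates workflow names and splits them into
// one batchGetWorkflows parameter array per batch.
function buildWorkflowBatches(array $names, int $batchSize, bool $includeGraph = false): array
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }
    $unique = array_values(array_unique($names));
    $requests = [];
    foreach (array_chunk($unique, $batchSize) as $chunk) {
        $requests[] = [
            'Names' => $chunk,
            'IncludeGraph' => $includeGraph,
        ];
    }
    return $requests;
}

// Example names are made up for illustration.
$requests = buildWorkflowBatches(['ingest', 'transform', 'ingest', 'publish'], 2, true);
// foreach ($requests as $params) { $result = $client->batchGetWorkflows($params); }
```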

Result Syntax

[
    'MissingWorkflows' => ['<string>', ...],
    'Workflows' => [
        [
            'BlueprintDetails' => [
                'BlueprintName' => '<string>',
                'RunId' => '<string>',
            ],
            'CreatedOn' => <DateTime>,
            'DefaultRunProperties' => ['<string>', ...],
            'Description' => '<string>',
            'Graph' => [
                'Edges' => [
                    [
                        'DestinationId' => '<string>',
                        'SourceId' => '<string>',
                    ],
                    // ...
                ],
                'Nodes' => [
                    [
                        'CrawlerDetails' => [
                            'Crawls' => [
                                [
                                    'CompletedOn' => <DateTime>,
                                    'ErrorMessage' => '<string>',
                                    'LogGroup' => '<string>',
                                    'LogStream' => '<string>',
                                    'StartedOn' => <DateTime>,
                                    'State' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED|ERROR',
                                ],
                                // ...
                            ],
                        ],
                        'JobDetails' => [
                            'JobRuns' => [
                                [
                                    'AllocatedCapacity' => <integer>,
                                    'Arguments' => ['<string>', ...],
                                    'Attempt' => <integer>,
                                    'CompletedOn' => <DateTime>,
                                    'DPUSeconds' => <float>,
                                    'ErrorMessage' => '<string>',
                                    'ExecutionClass' => 'FLEX|STANDARD',
                                    'ExecutionTime' => <integer>,
                                    'GlueVersion' => '<string>',
                                    'Id' => '<string>',
                                    'JobMode' => 'SCRIPT|VISUAL|NOTEBOOK',
                                    'JobName' => '<string>',
                                    'JobRunQueuingEnabled' => true || false,
                                    'JobRunState' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT|ERROR|WAITING|EXPIRED',
                                    'LastModifiedOn' => <DateTime>,
                                    'LogGroupName' => '<string>',
                                    'MaintenanceWindow' => '<string>',
                                    'MaxCapacity' => <float>,
                                    'NotificationProperty' => [
                                        'NotifyDelayAfter' => <integer>,
                                    ],
                                    'NumberOfWorkers' => <integer>,
                                    'PredecessorRuns' => [
                                        [
                                            'JobName' => '<string>',
                                            'RunId' => '<string>',
                                        ],
                                        // ...
                                    ],
                                    'PreviousRunId' => '<string>',
                                    'ProfileName' => '<string>',
                                    'SecurityConfiguration' => '<string>',
                                    'StartedOn' => <DateTime>,
                                    'StateDetail' => '<string>',
                                    'Timeout' => <integer>,
                                    'TriggerName' => '<string>',
                                    'WorkerType' => 'Standard|G.1X|G.2X|G.025X|G.4X|G.8X|Z.2X',
                                ],
                                // ...
                            ],
                        ],
                        'Name' => '<string>',
                        'TriggerDetails' => [
                            'Trigger' => [
                                'Actions' => [
                                    [
                                        'Arguments' => ['<string>', ...],
                                        'CrawlerName' => '<string>',
                                        'JobName' => '<string>',
                                        'NotificationProperty' => [
                                            'NotifyDelayAfter' => <integer>,
                                        ],
                                        'SecurityConfiguration' => '<string>',
                                        'Timeout' => <integer>,
                                    ],
                                    // ...
                                ],
                                'Description' => '<string>',
                                'EventBatchingCondition' => [
                                    'BatchSize' => <integer>,
                                    'BatchWindow' => <integer>,
                                ],
                                'Id' => '<string>',
                                'Name' => '<string>',
                                'Predicate' => [
                                    'Conditions' => [
                                        [
                                            'CrawlState' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED|ERROR',
                                            'CrawlerName' => '<string>',
                                            'JobName' => '<string>',
                                            'LogicalOperator' => 'EQUALS',
                                            'State' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT|ERROR|WAITING|EXPIRED',
                                        ],
                                        // ...
                                    ],
                                    'Logical' => 'AND|ANY',
                                ],
                                'Schedule' => '<string>',
                                'State' => 'CREATING|CREATED|ACTIVATING|ACTIVATED|DEACTIVATING|DEACTIVATED|DELETING|UPDATING',
                                'Type' => 'SCHEDULED|CONDITIONAL|ON_DEMAND|EVENT',
                                'WorkflowName' => '<string>',
                            ],
                        ],
                        'Type' => 'CRAWLER|JOB|TRIGGER',
                        'UniqueId' => '<string>',
                    ],
                    // ...
                ],
            ],
            'LastModifiedOn' => <DateTime>,
            'LastRun' => [
                'CompletedOn' => <DateTime>,
                'ErrorMessage' => '<string>',
                'Graph' => [
                    'Edges' => [
                        [
                            'DestinationId' => '<string>',
                            'SourceId' => '<string>',
                        ],
                        // ...
                    ],
                    'Nodes' => [
                        [
                            'CrawlerDetails' => [
                                'Crawls' => [
                                    [
                                        'CompletedOn' => <DateTime>,
                                        'ErrorMessage' => '<string>',
                                        'LogGroup' => '<string>',
                                        'LogStream' => '<string>',
                                        'StartedOn' => <DateTime>,
                                        'State' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED|ERROR',
                                    ],
                                    // ...
                                ],
                            ],
                            'JobDetails' => [
                                'JobRuns' => [
                                    [
                                        'AllocatedCapacity' => <integer>,
                                        'Arguments' => ['<string>', ...],
                                        'Attempt' => <integer>,
                                        'CompletedOn' => <DateTime>,
                                        'DPUSeconds' => <float>,
                                        'ErrorMessage' => '<string>',
                                        'ExecutionClass' => 'FLEX|STANDARD',
                                        'ExecutionTime' => <integer>,
                                        'GlueVersion' => '<string>',
                                        'Id' => '<string>',
                                        'JobMode' => 'SCRIPT|VISUAL|NOTEBOOK',
                                        'JobName' => '<string>',
                                        'JobRunQueuingEnabled' => true || false,
                                        'JobRunState' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT|ERROR|WAITING|EXPIRED',
                                        'LastModifiedOn' => <DateTime>,
                                        'LogGroupName' => '<string>',
                                        'MaintenanceWindow' => '<string>',
                                        'MaxCapacity' => <float>,
                                        'NotificationProperty' => [
                                            'NotifyDelayAfter' => <integer>,
                                        ],
                                        'NumberOfWorkers' => <integer>,
                                        'PredecessorRuns' => [
                                            [
                                                'JobName' => '<string>',
                                                'RunId' => '<string>',
                                            ],
                                            // ...
                                        ],
                                        'PreviousRunId' => '<string>',
                                        'ProfileName' => '<string>',
                                        'SecurityConfiguration' => '<string>',
                                        'StartedOn' => <DateTime>,
                                        'StateDetail' => '<string>',
                                        'Timeout' => <integer>,
                                        'TriggerName' => '<string>',
                                        'WorkerType' => 'Standard|G.1X|G.2X|G.025X|G.4X|G.8X|Z.2X',
                                    ],
                                    // ...
                                ],
                            ],
                            'Name' => '<string>',
                            'TriggerDetails' => [
                                'Trigger' => [
                                    'Actions' => [
                                        [
                                            'Arguments' => ['<string>', ...],
                                            'CrawlerName' => '<string>',
                                            'JobName' => '<string>',
                                            'NotificationProperty' => [
                                                'NotifyDelayAfter' => <integer>,
                                            ],
                                            'SecurityConfiguration' => '<string>',
                                            'Timeout' => <integer>,
                                        ],
                                        // ...
                                    ],
                                    'Description' => '<string>',
                                    'EventBatchingCondition' => [
                                        'BatchSize' => <integer>,
                                        'BatchWindow' => <integer>,
                                    ],
                                    'Id' => '<string>',
                                    'Name' => '<string>',
                                    'Predicate' => [
                                        'Conditions' => [
                                            [
                                                'CrawlState' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED|ERROR',
                                                'CrawlerName' => '<string>',
                                                'JobName' => '<string>',
                                                'LogicalOperator' => 'EQUALS',
                                                'State' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT|ERROR|WAITING|EXPIRED',
                                            ],
                                            // ...
                                        ],
                                        'Logical' => 'AND|ANY',
                                    ],
                                    'Schedule' => '<string>',
                                    'State' => 'CREATING|CREATED|ACTIVATING|ACTIVATED|DEACTIVATING|DEACTIVATED|DELETING|UPDATING',
                                    'Type' => 'SCHEDULED|CONDITIONAL|ON_DEMAND|EVENT',
                                    'WorkflowName' => '<string>',
                                ],
                            ],
                            'Type' => 'CRAWLER|JOB|TRIGGER',
                            'UniqueId' => '<string>',
                        ],
                        // ...
                    ],
                ],
                'Name' => '<string>',
                'PreviousRunId' => '<string>',
                'StartedOn' => <DateTime>,
                'StartingEventBatchCondition' => [
                    'BatchSize' => <integer>,
                    'BatchWindow' => <integer>,
                ],
                'Statistics' => [
                    'ErroredActions' => <integer>,
                    'FailedActions' => <integer>,
                    'RunningActions' => <integer>,
                    'StoppedActions' => <integer>,
                    'SucceededActions' => <integer>,
                    'TimeoutActions' => <integer>,
                    'TotalActions' => <integer>,
                    'WaitingActions' => <integer>,
                ],
                'Status' => 'RUNNING|COMPLETED|STOPPING|STOPPED|ERROR',
                'WorkflowRunId' => '<string>',
                'WorkflowRunProperties' => ['<string>', ...],
            ],
            'MaxConcurrentRuns' => <integer>,
            'Name' => '<string>',
        ],
        // ...
    ],
]

Result Details

Members
MissingWorkflows
Type: Array of strings

A list of names of workflows not found.

Workflows
Type: Array of Workflow structures

A list of workflow resource metadata.
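
As a minimal sketch of consuming this result, the helper below indexes the returned workflows by name and passes through the names that were not found. The function name fetchWorkflows is hypothetical and not part of the SDK. $client is assumed to be an Aws\Glue\GlueClient, although any object exposing a batchGetWorkflows method would work here.

```php
<?php
// Hypothetical helper (not part of the SDK). $client is assumed to be an
// Aws\Glue\GlueClient or any object exposing batchGetWorkflows().
function fetchWorkflows($client, array $names, bool $includeGraph = false): array
{
    $result = $client->batchGetWorkflows([
        'Names' => array_values($names),
        'IncludeGraph' => $includeGraph,
    ]);

    // Index the returned workflow metadata by workflow name.
    $found = [];
    foreach ($result['Workflows'] ?? [] as $workflow) {
        $found[$workflow['Name']] = $workflow;
    }

    return [
        'found' => $found,
        'missing' => $result['MissingWorkflows'] ?? [],
    ];
}
```

Names that do not resolve to a workflow are reported in MissingWorkflows, so the caller can handle partial results without inspecting each entry.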

Errors

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.

BatchPutDataQualityStatisticAnnotation

$result = $client->batchPutDataQualityStatisticAnnotation([/* ... */]);
$promise = $client->batchPutDataQualityStatisticAnnotationAsync([/* ... */]);

Annotates datapoints over time for a specific data quality statistic.

Parameter Syntax

$result = $client->batchPutDataQualityStatisticAnnotation([
    'ClientToken' => '<string>',
    'InclusionAnnotations' => [ // REQUIRED
        [
            'InclusionAnnotation' => 'INCLUDE|EXCLUDE',
            'ProfileId' => '<string>',
            'StatisticId' => '<string>',
        ],
        // ...
    ],
]);

Parameter Details

Members
ClientToken
Type: string

Client Token.

InclusionAnnotations
Required: Yes
Type: Array of DatapointInclusionAnnotation structures

A list of DatapointInclusionAnnotation structures.

Result Syntax

[
    'FailedInclusionAnnotations' => [
        [
            'FailureReason' => '<string>',
            'ProfileId' => '<string>',
            'StatisticId' => '<string>',
        ],
        // ...
    ],
]

Result Details

Members
FailedInclusionAnnotations
Type: Array of AnnotationError structures

A list of AnnotationError structures.
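
The following sketch shows one way to assemble the InclusionAnnotations parameter for a single profile. The helper buildInclusionAnnotations and the IDs profile-1, stat-a, and stat-b are hypothetical placeholders, not values defined by the service.

```php
<?php
// Hypothetical helper (not part of the SDK): builds the InclusionAnnotations
// list for one profile from a list of statistic IDs.
function buildInclusionAnnotations(string $profileId, array $statisticIds, string $annotation): array
{
    if (!in_array($annotation, ['INCLUDE', 'EXCLUDE'], true)) {
        throw new InvalidArgumentException("Annotation must be INCLUDE or EXCLUDE, got: $annotation");
    }

    $annotations = [];
    foreach ($statisticIds as $statisticId) {
        $annotations[] = [
            'ProfileId' => $profileId,
            'StatisticId' => $statisticId,
            'InclusionAnnotation' => $annotation,
        ];
    }
    return $annotations;
}

// Placeholder IDs, for illustration only.
$params = [
    'InclusionAnnotations' => buildInclusionAnnotations('profile-1', ['stat-a', 'stat-b'], 'EXCLUDE'),
];
// With a configured client:
// $result = $client->batchPutDataQualityStatisticAnnotation($params);
// Entries that could not be annotated are listed in $result['FailedInclusionAnnotations'].
```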

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

BatchStopJobRun

$result = $client->batchStopJobRun([/* ... */]);
$promise = $client->batchStopJobRunAsync([/* ... */]);

Stops one or more job runs for a specified job definition.

Parameter Syntax

$result = $client->batchStopJobRun([
    'JobName' => '<string>', // REQUIRED
    'JobRunIds' => ['<string>', ...], // REQUIRED
]);

Parameter Details

Members
JobName
Required: Yes
Type: string

The name of the job definition for which to stop job runs.

JobRunIds
Required: Yes
Type: Array of strings

A list of the JobRunIds that should be stopped for that job definition.

Result Syntax

[
    'Errors' => [
        [
            'ErrorDetail' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
            'JobName' => '<string>',
            'JobRunId' => '<string>',
        ],
        // ...
    ],
    'SuccessfulSubmissions' => [
        [
            'JobName' => '<string>',
            'JobRunId' => '<string>',
        ],
        // ...
    ],
]

Result Details

Members
Errors
Type: Array of BatchStopJobRunError structures

A list of the errors that were encountered in trying to stop JobRuns, including the JobRunId for which each error was encountered and details about the error.

SuccessfulSubmissions
Type: Array of BatchStopJobRunSuccessfulSubmission structures

A list of the JobRuns that were successfully submitted for stopping.
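
Because a single call can succeed for some job runs and fail for others, it can help to separate the two lists before acting on them. The sketch below is one possible way to do that. summarizeBatchStop is a hypothetical helper, and $result is assumed to be the value returned by batchStopJobRun (an array-accessible result).

```php
<?php
// Hypothetical helper (not part of the SDK): splits a batchStopJobRun result
// into run IDs submitted for stopping and a map of failures by run ID.
function summarizeBatchStop($result): array
{
    $stopped = [];
    foreach ($result['SuccessfulSubmissions'] ?? [] as $submission) {
        $stopped[] = $submission['JobRunId'];
    }

    $failed = [];
    foreach ($result['Errors'] ?? [] as $error) {
        $code = $error['ErrorDetail']['ErrorCode'] ?? 'Unknown';
        $message = $error['ErrorDetail']['ErrorMessage'] ?? '';
        $failed[$error['JobRunId']] = "$code: $message";
    }

    return ['stopped' => $stopped, 'failed' => $failed];
}
```

A caller might log the failed entries and retry only those run IDs, rather than resubmitting the whole batch.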

Errors

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

BatchUpdatePartition

$result = $client->batchUpdatePartition([/* ... */]);
$promise = $client->batchUpdatePartitionAsync([/* ... */]);

Updates one or more partitions in a batch operation.

Parameter Syntax

$result = $client->batchUpdatePartition([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'Entries' => [ // REQUIRED
        [
            'PartitionInput' => [ // REQUIRED
                'LastAccessTime' => <integer || string || DateTime>,
                'LastAnalyzedTime' => <integer || string || DateTime>,
                'Parameters' => ['<string>', ...],
                'StorageDescriptor' => [
                    'AdditionalLocations' => ['<string>', ...],
                    'BucketColumns' => ['<string>', ...],
                    'Columns' => [
                        [
                            'Comment' => '<string>',
                            'Name' => '<string>', // REQUIRED
                            'Parameters' => ['<string>', ...],
                            'Type' => '<string>',
                        ],
                        // ...
                    ],
                    'Compressed' => true || false,
                    'InputFormat' => '<string>',
                    'Location' => '<string>',
                    'NumberOfBuckets' => <integer>,
                    'OutputFormat' => '<string>',
                    'Parameters' => ['<string>', ...],
                    'SchemaReference' => [
                        'SchemaId' => [
                            'RegistryName' => '<string>',
                            'SchemaArn' => '<string>',
                            'SchemaName' => '<string>',
                        ],
                        'SchemaVersionId' => '<string>',
                        'SchemaVersionNumber' => <integer>,
                    ],
                    'SerdeInfo' => [
                        'Name' => '<string>',
                        'Parameters' => ['<string>', ...],
                        'SerializationLibrary' => '<string>',
                    ],
                    'SkewedInfo' => [
                        'SkewedColumnNames' => ['<string>', ...],
                        'SkewedColumnValueLocationMaps' => ['<string>', ...],
                        'SkewedColumnValues' => ['<string>', ...],
                    ],
                    'SortColumns' => [
                        [
                            'Column' => '<string>', // REQUIRED
                            'SortOrder' => <integer>, // REQUIRED
                        ],
                        // ...
                    ],
                    'StoredAsSubDirectories' => true || false,
                ],
                'Values' => ['<string>', ...],
            ],
            'PartitionValueList' => ['<string>', ...], // REQUIRED
        ],
        // ...
    ],
    'TableName' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the catalog in which the partition is to be updated. Currently, this should be the Amazon Web Services account ID.

DatabaseName
Required: Yes
Type: string

The name of the metadata database in which the partition is to be updated.

Entries
Required: Yes
Type: Array of BatchUpdatePartitionRequestEntry structures

A list of up to 100 BatchUpdatePartitionRequestEntry objects to update.

TableName
Required: Yes
Type: string

The name of the metadata table in which the partition is to be updated.

Result Syntax

[
    'Errors' => [
        [
            'ErrorDetail' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
            'PartitionValueList' => ['<string>', ...],
        ],
        // ...
    ],
]

Result Details

Members
Errors
Type: Array of BatchUpdatePartitionFailureEntry structures

The errors encountered when trying to update the requested partitions. A list of BatchUpdatePartitionFailureEntry objects.
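
Since Entries accepts up to 100 items per request, larger updates need to be split across several calls. The following is a minimal sketch of that approach. updatePartitionsInBatches is a hypothetical helper, and $client is assumed to be an Aws\Glue\GlueClient or any object exposing batchUpdatePartition.

```php
<?php
// Hypothetical helper (not part of the SDK): sends entries in chunks of at
// most 100 and collects the per-partition failures from every response.
function updatePartitionsInBatches($client, string $database, string $table, array $entries): array
{
    $errors = [];
    foreach (array_chunk($entries, 100) as $chunk) {
        $result = $client->batchUpdatePartition([
            'DatabaseName' => $database,
            'TableName' => $table,
            'Entries' => $chunk,
        ]);

        // Each failure entry carries the PartitionValueList it refers to.
        foreach ($result['Errors'] ?? [] as $error) {
            $errors[] = $error;
        }
    }
    return $errors;
}
```

The returned list can be matched back to the original entries through PartitionValueList. This sketch does not retry failed entries or set CatalogId.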

Errors

InvalidInputException:

The input provided was not valid.

EntityNotFoundException:

A specified entity does not exist.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.

GlueEncryptionException:

An encryption operation failed.

CancelDataQualityRuleRecommendationRun

$result = $client->cancelDataQualityRuleRecommendationRun([/* ... */]);
$promise = $client->cancelDataQualityRuleRecommendationRunAsync([/* ... */]);

Cancels the specified recommendation run that was being used to generate rules.

Parameter Syntax

$result = $client->cancelDataQualityRuleRecommendationRun([
    'RunId' => '<string>', // REQUIRED
]);

Parameter Details

Members
RunId
Required: Yes
Type: string

The unique run identifier associated with this run.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.

CancelDataQualityRulesetEvaluationRun

$result = $client->cancelDataQualityRulesetEvaluationRun([/* ... */]);
$promise = $client->cancelDataQualityRulesetEvaluationRunAsync([/* ... */]);

Cancels a run where a ruleset is being evaluated against a data source.

Parameter Syntax

$result = $client->cancelDataQualityRulesetEvaluationRun([
    'RunId' => '<string>', // REQUIRED
]);

Parameter Details

Members
RunId
Required: Yes
Type: string

The unique run identifier associated with this run.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.

CancelMLTaskRun

$result = $client->cancelMLTaskRun([/* ... */]);
$promise = $client->cancelMLTaskRunAsync([/* ... */]);

Cancels (stops) a task run. Machine learning task runs are asynchronous tasks that Glue runs on your behalf as part of various machine learning workflows. You can cancel a machine learning task run at any time by calling CancelMLTaskRun with the TransformId of the task run's parent transform and the task run's TaskRunId.

Parameter Syntax

$result = $client->cancelMLTaskRun([
    'TaskRunId' => '<string>', // REQUIRED
    'TransformId' => '<string>', // REQUIRED
]);

Parameter Details

Members
TaskRunId
Required: Yes
Type: string

A unique identifier for the task run.

TransformId
Required: Yes
Type: string

The unique identifier of the machine learning transform.

Result Syntax

[
    'Status' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT',
    'TaskRunId' => '<string>',
    'TransformId' => '<string>',
]

Result Details

Members
Status
Type: string

The status for this run.

TaskRunId
Type: string

The unique identifier for the task run.

TransformId
Type: string

The unique identifier of the machine learning transform.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.

CancelStatement

$result = $client->cancelStatement([/* ... */]);
$promise = $client->cancelStatementAsync([/* ... */]);

Cancels the statement.

Parameter Syntax

$result = $client->cancelStatement([
    'Id' => <integer>, // REQUIRED
    'RequestOrigin' => '<string>',
    'SessionId' => '<string>', // REQUIRED
]);

Parameter Details

Members
Id
Required: Yes
Type: int

The ID of the statement to be cancelled.

RequestOrigin
Type: string

The origin of the request to cancel the statement.

SessionId
Required: Yes
Type: string

The Session ID of the statement to be cancelled.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

AccessDeniedException:

Access to a resource was denied.

EntityNotFoundException:

A specified entity does not exist.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.

IllegalSessionStateException:

The session is in an invalid state to perform a requested operation.

CheckSchemaVersionValidity

$result = $client->checkSchemaVersionValidity([/* ... */]);
$promise = $client->checkSchemaVersionValidityAsync([/* ... */]);

Validates the supplied schema. This call has no side effects; it simply validates the supplied schema using DataFormat as the format. Because it does not take a schema set name, no compatibility checks are performed.

Parameter Syntax

$result = $client->checkSchemaVersionValidity([
    'DataFormat' => 'AVRO|JSON|PROTOBUF', // REQUIRED
    'SchemaDefinition' => '<string>', // REQUIRED
]);

Parameter Details

Members
DataFormat
Required: Yes
Type: string

The data format of the schema definition. Currently AVRO, JSON and PROTOBUF are supported.

SchemaDefinition
Required: Yes
Type: string

The definition of the schema that has to be validated.

Result Syntax

[
    'Error' => '<string>',
    'Valid' => true || false,
]

Result Details

Members
Error
Type: string

A validation failure error message.

Valid
Type: boolean

Returns true if the schema is valid and false otherwise.
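
As a sketch of how the Valid and Error members might be used together, the helper below returns null for a valid schema and the failure message otherwise. validateSchema is a hypothetical name, the fallback message is an assumption of this sketch, and $client is assumed to be an Aws\Glue\GlueClient or any object exposing checkSchemaVersionValidity.

```php
<?php
// Hypothetical helper (not part of the SDK): returns null when the schema is
// valid, or the validation failure message when it is not.
function validateSchema($client, string $dataFormat, string $schemaDefinition): ?string
{
    $result = $client->checkSchemaVersionValidity([
        'DataFormat' => $dataFormat,            // AVRO, JSON, or PROTOBUF
        'SchemaDefinition' => $schemaDefinition,
    ]);

    if ($result['Valid'] ?? false) {
        return null;
    }
    // The fallback text is this sketch's own, not a service message.
    return $result['Error'] ?? 'Schema is not valid (no error message returned)';
}
```

Because the operation has no side effects, a check like this can run before a schema is registered without changing any state.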

Errors

InvalidInputException:

The input provided was not valid.

AccessDeniedException:

Access to a resource was denied.

InternalServiceException:

An internal service error occurred.

CreateBlueprint

$result = $client->createBlueprint([/* ... */]);
$promise = $client->createBlueprintAsync([/* ... */]);

Registers a blueprint with Glue.

Parameter Syntax

$result = $client->createBlueprint([
    'BlueprintLocation' => '<string>', // REQUIRED
    'Description' => '<string>',
    'Name' => '<string>', // REQUIRED
    'Tags' => ['<string>', ...],
]);

Parameter Details

Members
BlueprintLocation
Required: Yes
Type: string

Specifies a path in Amazon S3 where the blueprint is published.

Description
Type: string

A description of the blueprint.

Name
Required: Yes
Type: string

The name of the blueprint.

Tags
Type: Associative array of custom string keys (TagKey) to strings

The tags to be applied to this blueprint.

Result Syntax

[
    'Name' => '<string>',
]

Result Details

Members
Name
Type: string

Returns the name of the blueprint that was registered.
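
The sketch below illustrates the shape of the request, in particular that Tags is an associative array of tag keys to tag values. createBlueprintParams is a hypothetical helper, and the blueprint name, S3 path, and tag are placeholders.

```php
<?php
// Hypothetical helper (not part of the SDK): builds createBlueprint
// parameters, adding the optional members only when they are provided.
function createBlueprintParams(string $name, string $location, array $tags = [], string $description = ''): array
{
    $params = [
        'Name' => $name,
        'BlueprintLocation' => $location,
    ];
    if ($description !== '') {
        $params['Description'] = $description;
    }
    if ($tags !== []) {
        $params['Tags'] = $tags; // associative: tag key => tag value
    }
    return $params;
}

// Placeholder values, for illustration only.
$params = createBlueprintParams(
    'nightly-etl',
    's3://example-bucket/blueprints/nightly-etl/',
    ['team' => 'data']
);
// With a configured client:
// $result = $client->createBlueprint($params);
// $result['Name'] then holds the name of the registered blueprint.
```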

Errors

AlreadyExistsException:

A resource to be created or added already exists.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

CreateClassifier

$result = $client->createClassifier([/* ... */]);
$promise = $client->createClassifierAsync([/* ... */]);

Creates a classifier in the user's account. This can be a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field of the request is present.

Parameter Syntax

$result = $client->createClassifier([
    'CsvClassifier' => [
        'AllowSingleColumn' => true || false,
        'ContainsHeader' => 'UNKNOWN|PRESENT|ABSENT',
        'CustomDatatypeConfigured' => true || false,
        'CustomDatatypes' => ['<string>', ...],
        'Delimiter' => '<string>',
        'DisableValueTrimming' => true || false,
        'Header' => ['<string>', ...],
        'Name' => '<string>', // REQUIRED
        'QuoteSymbol' => '<string>',
        'Serde' => 'OpenCSVSerDe|LazySimpleSerDe|None',
    ],
    'GrokClassifier' => [
        'Classification' => '<string>', // REQUIRED
        'CustomPatterns' => '<string>',
        'GrokPattern' => '<string>', // REQUIRED
        'Name' => '<string>', // REQUIRED
    ],
    'JsonClassifier' => [
        'JsonPath' => '<string>', // REQUIRED
        'Name' => '<string>', // REQUIRED
    ],
    'XMLClassifier' => [
        'Classification' => '<string>', // REQUIRED
        'Name' => '<string>', // REQUIRED
        'RowTag' => '<string>',
    ],
]);

Parameter Details

Members
CsvClassifier
Type: CreateCsvClassifierRequest structure

A CsvClassifier object specifying the classifier to create.

GrokClassifier
Type: CreateGrokClassifierRequest structure

A GrokClassifier object specifying the classifier to create.

JsonClassifier
Type: CreateJsonClassifierRequest structure

A JsonClassifier object specifying the classifier to create.

XMLClassifier
Type: CreateXMLClassifierRequest structure

An XMLClassifier object specifying the classifier to create.

Result Syntax

[]

Result Details

The results for this operation are always empty.
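
Since the kind of classifier created depends on which field of the request is present, a request would normally carry a single classifier field. The following sketch enforces that on the client side. buildClassifierRequest is a hypothetical helper, and the classifier name and column names are placeholders.

```php
<?php
// Hypothetical helper (not part of the SDK): wraps one classifier definition
// under exactly one of the four request fields.
function buildClassifierRequest(string $field, array $definition): array
{
    $allowed = ['CsvClassifier', 'GrokClassifier', 'JsonClassifier', 'XMLClassifier'];
    if (!in_array($field, $allowed, true)) {
        throw new InvalidArgumentException("Unsupported classifier field: $field");
    }
    // Name is marked REQUIRED for every classifier type in the parameter syntax.
    if (!isset($definition['Name'])) {
        throw new InvalidArgumentException('Name is required for every classifier type');
    }
    return [$field => $definition];
}

// Placeholder values, for illustration only.
$params = buildClassifierRequest('CsvClassifier', [
    'Name' => 'orders-csv',
    'Delimiter' => ',',
    'ContainsHeader' => 'PRESENT',
    'Header' => ['order_id', 'amount'],
]);
// With a configured client:
// $client->createClassifier($params);
```

This sketch checks only the field name and the presence of Name. The other required members (for example GrokPattern or JsonPath) are left to the service to validate.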

Errors

AlreadyExistsException:

A resource to be created or added already exists.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

CreateColumnStatisticsTaskSettings

$result = $client->createColumnStatisticsTaskSettings([/* ... */]);
$promise = $client->createColumnStatisticsTaskSettingsAsync([/* ... */]);

Creates settings for a column statistics task.

Parameter Syntax

$result = $client->createColumnStatisticsTaskSettings([
    'CatalogID' => '<string>',
    'ColumnNameList' => ['<string>', ...],
    'DatabaseName' => '<string>', // REQUIRED
    'Role' => '<string>', // REQUIRED
    'SampleSize' => <float>,
    'Schedule' => '<string>',
    'SecurityConfiguration' => '<string>',
    'TableName' => '<string>', // REQUIRED
    'Tags' => ['<string>', ...],
]);

Parameter Details

Members
CatalogID
Type: string

The ID of the Data Catalog in which the database resides.

ColumnNameList
Type: Array of strings

A list of column names for which to run statistics.

DatabaseName
Required: Yes
Type: string

The name of the database where the table resides.

Role
Required: Yes
Type: string

The role used for running the column statistics.

SampleSize
Type: double

The percentage of data to sample.

Schedule
Type: string

A schedule for running the column statistics, specified in CRON syntax.

SecurityConfiguration
Type: string

Name of the security configuration that is used to encrypt CloudWatch logs.

TableName
Required: Yes
Type: string

The name of the table for which to generate column statistics.

Tags
Type: Associative array of custom string keys (TagKey) to strings

A map of tags.

Result Syntax

[]

Result Details

The results for this operation are always empty.
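
As a sketch of assembling this request, the helper below combines the three required members with any optional ones and rejects keys that are not listed in the parameter syntax. columnStatisticsSettingsParams is a hypothetical helper. The database, table, role ARN, and column names are placeholders.

```php
<?php
// Hypothetical helper (not part of the SDK): merges required and optional
// members and rejects parameter names that the operation does not define.
function columnStatisticsSettingsParams(string $database, string $table, string $role, array $optional = []): array
{
    $allowedOptional = ['CatalogID', 'ColumnNameList', 'SampleSize', 'Schedule', 'SecurityConfiguration', 'Tags'];
    $unknown = array_diff(array_keys($optional), $allowedOptional);
    if ($unknown !== []) {
        throw new InvalidArgumentException('Unknown parameter(s): ' . implode(', ', $unknown));
    }

    return [
        'DatabaseName' => $database,
        'TableName' => $table,
        'Role' => $role,
    ] + $optional;
}

// Placeholder values, for illustration only.
$params = columnStatisticsSettingsParams(
    'sales_db',
    'orders',
    'arn:aws:iam::123456789012:role/ExampleGlueStatsRole',
    ['ColumnNameList' => ['order_id', 'amount'], 'SampleSize' => 10.0]
);
// With a configured client:
// $client->createColumnStatisticsTaskSettings($params);
```

Note that this operation spells the catalog member CatalogID, unlike the CatalogId member used by BatchUpdatePartition, so a client-side check of key names can catch that mismatch early.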

Errors

AlreadyExistsException:

A resource to be created or added already exists.

AccessDeniedException:

Access to a resource was denied.

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

ColumnStatisticsTaskRunningException:

An exception thrown when you try to start another job while running a column stats generation job.
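
As a rough illustration of the parameters described for CreateColumnStatisticsTaskSettings, the sketch below assembles the request array from the three required members plus whichever optional members are supplied. The helper name buildColumnStatisticsTaskSettings, the database, table, and role values, and the cron expression format are all assumptions made for the example; only the member names come from the Parameter Syntax.

```php
<?php
// Sketch only: the helper name and every sample value are hypothetical.

/**
 * Build the request array: the three required members plus any optional
 * members that were actually supplied.
 */
function buildColumnStatisticsTaskSettings(
    string $databaseName,
    string $tableName,
    string $role,
    array $optional = []
): array {
    // Optional member names taken from the Parameter Syntax.
    $allowed = ['CatalogID', 'ColumnNameList', 'SampleSize', 'Schedule', 'SecurityConfiguration', 'Tags'];

    // Reject misspelled or unsupported member names before sending anything.
    $unknown = array_diff(array_keys($optional), $allowed);
    if ($unknown !== []) {
        throw new InvalidArgumentException('Unknown member(s): ' . implode(', ', $unknown));
    }

    return [
        'DatabaseName' => $databaseName,
        'Role' => $role,
        'TableName' => $tableName,
    ] + $optional;
}

$params = buildColumnStatisticsTaskSettings(
    'sales_db',
    'orders',
    'arn:aws:iam::123456789012:role/ExampleGlueStatsRole', // placeholder ARN
    [
        'ColumnNameList' => ['order_id', 'amount'],
        'SampleSize' => 25.0,                // percentage of data to sample
        'Schedule' => 'cron(0 2 * * ? *)',   // assumed cron format
        'Tags' => ['team' => 'analytics'],   // tag key => tag value
    ]
);

// With a configured Aws\Glue\GlueClient in $client, the call would be:
// $client->createColumnStatisticsTaskSettings($params);
```

Checking member names locally is a convenience only; the service still validates the request and may return InvalidInputException.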

CreateConnection

$result = $client->createConnection([/* ... */]);
$promise = $client->createConnectionAsync([/* ... */]);

Creates a connection definition in the Data Catalog.

Connections used for creating federated resources require the IAM glue:PassConnection permission.

Parameter Syntax

$result = $client->createConnection([
    'CatalogId' => '<string>',
    'ConnectionInput' => [ // REQUIRED
        'AthenaProperties' => ['<string>', ...],
        'AuthenticationConfiguration' => [
            'AuthenticationType' => 'BASIC|OAUTH2|CUSTOM',
            'OAuth2Properties' => [
                'AuthorizationCodeProperties' => [
                    'AuthorizationCode' => '<string>',
                    'RedirectUri' => '<string>',
                ],
                'OAuth2ClientApplication' => [
                    'AWSManagedClientApplicationReference' => '<string>',
                    'UserManagedClientApplicationClientId' => '<string>',
                ],
                'OAuth2GrantType' => 'AUTHORIZATION_CODE|CLIENT_CREDENTIALS|JWT_BEARER',
                'TokenUrl' => '<string>',
                'TokenUrlParametersMap' => ['<string>', ...],
            ],
            'SecretArn' => '<string>',
        ],
        'ConnectionProperties' => ['<string>', ...], // REQUIRED
        'ConnectionType' => 'JDBC|SFTP|MONGODB|KAFKA|NETWORK|MARKETPLACE|CUSTOM|SALESFORCE|VIEW_VALIDATION_REDSHIFT|VIEW_VALIDATION_ATHENA', // REQUIRED
        'Description' => '<string>',
        'MatchCriteria' => ['<string>', ...],
        'Name' => '<string>', // REQUIRED
        'PhysicalConnectionRequirements' => [
            'AvailabilityZone' => '<string>',
            'SecurityGroupIdList' => ['<string>', ...],
            'SubnetId' => '<string>',
        ],
        'ValidateCredentials' => true || false,
    ],
    'Tags' => ['<string>', ...],
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog in which to create the connection. If none is provided, the Amazon Web Services account ID is used by default.

ConnectionInput
Required: Yes
Type: ConnectionInput structure

A ConnectionInput object defining the connection to create.

Tags
Type: Associative array of custom strings keys (TagKey) to strings

The tags you assign to the connection.

Result Syntax

[
    'CreateConnectionStatus' => 'READY|IN_PROGRESS|FAILED',
]

Result Details

Members
CreateConnectionStatus
Type: string

The status of the connection creation request. The request can take some time for certain authentication types, for example when creating an OAuth connection with token exchange over VPC.

Errors

AlreadyExistsException:

A resource to be created or added already exists.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

GlueEncryptionException:

An encryption operation failed.
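
Because CreateConnection can return before the connection is usable, a caller will usually branch on CreateConnectionStatus. The following is a minimal sketch of that branching. The function names, the suggested next steps, and the sample ConnectionInput values are assumptions for illustration; what the service requires for each connection type is not covered here.

```php
<?php
// Sketch only: helper names and sample values are hypothetical.

/**
 * Map a CreateConnectionStatus value to a suggested next step.
 * The three statuses come from the Result Syntax; the advice is an assumption.
 */
function nextStepForConnectionStatus(string $status): string
{
    $steps = [
        'READY' => 'use',
        'IN_PROGRESS' => 'wait',
        'FAILED' => 'investigate',
    ];
    if (!array_key_exists($status, $steps)) {
        throw new UnexpectedValueException("Unrecognized status: $status");
    }
    return $steps[$status];
}

/**
 * Create a connection and report what to do next.
 * $client is expected to behave like Aws\Glue\GlueClient.
 */
function createConnectionAndAdvise(object $client, array $connectionInput): string
{
    $result = $client->createConnection(['ConnectionInput' => $connectionInput]);

    // Be defensive in case the member is absent from the result.
    $status = $result['CreateConnectionStatus'] ?? null;
    return $status === null ? 'unknown' : nextStepForConnectionStatus($status);
}

// The three required ConnectionInput members, with placeholder values.
$connectionInput = [
    'Name' => 'example-connection',
    'ConnectionType' => 'NETWORK',
    'ConnectionProperties' => [], // key/value map; contents depend on the connection type
];

// With a configured Aws\Glue\GlueClient in $client:
// echo createConnectionAndAdvise($client, $connectionInput);
```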

CreateCrawler

$result = $client->createCrawler([/* ... */]);
$promise = $client->createCrawlerAsync([/* ... */]);

Creates a new crawler with specified targets, role, configuration, and optional schedule. At least one crawl target must be specified, in the S3Targets field, the JdbcTargets field, or the DynamoDBTargets field.

Parameter Syntax

$result = $client->createCrawler([
    'Classifiers' => ['<string>', ...],
    'Configuration' => '<string>',
    'CrawlerSecurityConfiguration' => '<string>',
    'DatabaseName' => '<string>',
    'Description' => '<string>',
    'LakeFormationConfiguration' => [
        'AccountId' => '<string>',
        'UseLakeFormationCredentials' => true || false,
    ],
    'LineageConfiguration' => [
        'CrawlerLineageSettings' => 'ENABLE|DISABLE',
    ],
    'Name' => '<string>', // REQUIRED
    'RecrawlPolicy' => [
        'RecrawlBehavior' => 'CRAWL_EVERYTHING|CRAWL_NEW_FOLDERS_ONLY|CRAWL_EVENT_MODE',
    ],
    'Role' => '<string>', // REQUIRED
    'Schedule' => '<string>',
    'SchemaChangePolicy' => [
        'DeleteBehavior' => 'LOG|DELETE_FROM_DATABASE|DEPRECATE_IN_DATABASE',
        'UpdateBehavior' => 'LOG|UPDATE_IN_DATABASE',
    ],
    'TablePrefix' => '<string>',
    'Tags' => ['<string>', ...],
    'Targets' => [ // REQUIRED
        'CatalogTargets' => [
            [
                'ConnectionName' => '<string>',
                'DatabaseName' => '<string>', // REQUIRED
                'DlqEventQueueArn' => '<string>',
                'EventQueueArn' => '<string>',
                'Tables' => ['<string>', ...], // REQUIRED
            ],
            // ...
        ],
        'DeltaTargets' => [
            [
                'ConnectionName' => '<string>',
                'CreateNativeDeltaTable' => true || false,
                'DeltaTables' => ['<string>', ...],
                'WriteManifest' => true || false,
            ],
            // ...
        ],
        'DynamoDBTargets' => [
            [
                'Path' => '<string>',
                'scanAll' => true || false,
                'scanRate' => <float>,
            ],
            // ...
        ],
        'HudiTargets' => [
            [
                'ConnectionName' => '<string>',
                'Exclusions' => ['<string>', ...],
                'MaximumTraversalDepth' => <integer>,
                'Paths' => ['<string>', ...],
            ],
            // ...
        ],
        'IcebergTargets' => [
            [
                'ConnectionName' => '<string>',
                'Exclusions' => ['<string>', ...],
                'MaximumTraversalDepth' => <integer>,
                'Paths' => ['<string>', ...],
            ],
            // ...
        ],
        'JdbcTargets' => [
            [
                'ConnectionName' => '<string>',
                'EnableAdditionalMetadata' => ['<string>', ...],
                'Exclusions' => ['<string>', ...],
                'Path' => '<string>',
            ],
            // ...
        ],
        'MongoDBTargets' => [
            [
                'ConnectionName' => '<string>',
                'Path' => '<string>',
                'ScanAll' => true || false,
            ],
            // ...
        ],
        'S3Targets' => [
            [
                'ConnectionName' => '<string>',
                'DlqEventQueueArn' => '<string>',
                'EventQueueArn' => '<string>',
                'Exclusions' => ['<string>', ...],
                'Path' => '<string>',
                'SampleSize' => <integer>,
            ],
            // ...
        ],
    ],
]);

Parameter Details

Members
Classifiers
Type: Array of strings

A list of custom classifiers that the user has registered. By default, all built-in classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification.

Configuration
Type: string

Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior. For more information, see Setting crawler configuration options.

CrawlerSecurityConfiguration
Type: string

The name of the SecurityConfiguration structure to be used by this crawler.

DatabaseName
Type: string

The Glue database where results are written, such as: arn:aws:daylight:us-east-1::database/sometable/*.

Description
Type: string

A description of the new crawler.

LakeFormationConfiguration
Type: LakeFormationConfiguration structure

Specifies Lake Formation configuration settings for the crawler.

LineageConfiguration
Type: LineageConfiguration structure

Specifies data lineage configuration settings for the crawler.

Name
Required: Yes
Type: string

Name of the new crawler.

RecrawlPolicy
Type: RecrawlPolicy structure

A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.

Role
Required: Yes
Type: string

The IAM role or Amazon Resource Name (ARN) of an IAM role used by the new crawler to access customer resources.

Schedule
Type: string

A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).

SchemaChangePolicy
Type: SchemaChangePolicy structure

The policy for the crawler's update and deletion behavior.

TablePrefix
Type: string

The table prefix used for catalog tables that are created.

Tags
Type: Associative array of custom strings keys (TagKey) to strings

The tags to use with this crawler request. You may use tags to limit access to the crawler. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.

Targets
Required: Yes
Type: CrawlerTargets structure

A collection of targets to crawl.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InvalidInputException:

The input provided was not valid.

AlreadyExistsException:

A resource to be created or added already exists.

OperationTimeoutException:

The operation timed out.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.
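
The CreateCrawler description requires at least one crawl target, which can be checked locally before the request is sent. The sketch below builds a small parameter array with one S3 target and the daily schedule from the Schedule description, then runs such a pre-check. The helper name hasCrawlTarget, the crawler name, role, database, and bucket path are hypothetical. The check accepts any non-empty target list, which is looser than the three fields named in the description, so the service's own validation remains authoritative.

```php
<?php
// Sketch only: helper name, crawler name, role, and bucket path are hypothetical.

/**
 * Client-side pre-check: does the Targets structure contain at least one
 * non-empty target list? The service performs its own validation regardless.
 */
function hasCrawlTarget(array $targets): bool
{
    foreach ($targets as $list) {
        if (is_array($list) && count($list) > 0) {
            return true;
        }
    }
    return false;
}

$params = [
    'Name' => 'example-orders-crawler',
    'Role' => 'ExampleGlueCrawlerRole',
    'DatabaseName' => 'sales_db',
    'Schedule' => 'cron(15 12 * * ? *)', // every day at 12:15 UTC
    'SchemaChangePolicy' => [
        'UpdateBehavior' => 'UPDATE_IN_DATABASE',
        'DeleteBehavior' => 'LOG',
    ],
    'Targets' => [
        'S3Targets' => [
            ['Path' => 's3://example-bucket/orders/'],
        ],
    ],
];

if (!hasCrawlTarget($params['Targets'])) {
    throw new InvalidArgumentException('At least one crawl target is required.');
}

// With a configured Aws\Glue\GlueClient in $client:
// $client->createCrawler($params);
```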

CreateCustomEntityType

$result = $client->createCustomEntityType([/* ... */]);
$promise = $client->createCustomEntityTypeAsync([/* ... */]);

Creates a custom pattern that is used to detect sensitive data across the columns and rows of your structured data.

Each custom pattern you create specifies a regular expression and an optional list of context words. If no context words are passed, only the regular expression is checked.

Parameter Syntax

$result = $client->createCustomEntityType([
    'ContextWords' => ['<string>', ...],
    'Name' => '<string>', // REQUIRED
    'RegexString' => '<string>', // REQUIRED
    'Tags' => ['<string>', ...],
]);

Parameter Details

Members
ContextWords
Type: Array of strings

A list of context words. If none of these context words are found within the vicinity of the regular expression, the data will not be detected as sensitive data.

If no context words are passed, only the regular expression is checked.

Name
Required: Yes
Type: string

A name for the custom pattern that allows it to be retrieved or deleted later. This name must be unique per Amazon Web Services account.

RegexString
Required: Yes
Type: string

A regular expression string that is used for detecting sensitive data in a custom pattern.

Tags
Type: Associative array of custom strings keys (TagKey) to strings

A list of tags applied to the custom entity type.

Result Syntax

[
    'Name' => '<string>',
]

Result Details

Members
Name
Type: string

The name of the custom pattern you created.

Errors

AccessDeniedException:

Access to a resource was denied.

AlreadyExistsException:

A resource to be created or added already exists.

IdempotentParameterMismatchException:

The same unique identifier was associated with two different records.

InternalServiceException:

An internal service error occurred.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.
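
A custom pattern for CreateCustomEntityType is a regular expression plus optional context words, so it can be useful to try the expression locally before registering it. The sketch below does that with PHP's PCRE functions. The pattern, the entity name, and the context words are invented for the example, and Glue's regular expression dialect may differ from PCRE, so the local check only catches obvious mistakes.

```php
<?php
// Sketch only: the pattern, name, and context words are hypothetical.

$regex = 'EMP-[0-9]{6}';

// Local sanity check with PHP's PCRE engine. The pattern contains no "/",
// so it can be wrapped in "/" delimiters without escaping.
$matchesSample = preg_match('/' . $regex . '/', 'badge EMP-123456 issued') === 1;
$rejectsOther = preg_match('/' . $regex . '/', 'badge EMP-12 issued') === 0;

$params = [
    'Name' => 'example-employee-id',
    'RegexString' => $regex,
    'ContextWords' => ['badge', 'employee'],
];

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->createCustomEntityType($params);
// echo $result['Name'];
```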

CreateDataQualityRuleset

$result = $client->createDataQualityRuleset([/* ... */]);
$promise = $client->createDataQualityRulesetAsync([/* ... */]);

Creates a data quality ruleset with DQDL rules applied to a specified Glue table.

You create the ruleset using the Data Quality Definition Language (DQDL). For more information, see the Glue developer guide.

Parameter Syntax

$result = $client->createDataQualityRuleset([
    'ClientToken' => '<string>',
    'DataQualitySecurityConfiguration' => '<string>',
    'Description' => '<string>',
    'Name' => '<string>', // REQUIRED
    'Ruleset' => '<string>', // REQUIRED
    'Tags' => ['<string>', ...],
    'TargetTable' => [
        'CatalogId' => '<string>',
        'DatabaseName' => '<string>', // REQUIRED
        'TableName' => '<string>', // REQUIRED
    ],
]);

Parameter Details

Members
ClientToken
Type: string

Used for idempotency. Setting it to a random ID (such as a UUID) is recommended, to avoid creating or starting multiple instances of the same resource.

DataQualitySecurityConfiguration
Type: string

The name of the security configuration created with the data quality encryption option.

Description
Type: string

A description of the data quality ruleset.

Name
Required: Yes
Type: string

A unique name for the data quality ruleset.

Ruleset
Required: Yes
Type: string

A Data Quality Definition Language (DQDL) ruleset. For more information, see the Glue developer guide.

Tags
Type: Associative array of custom strings keys (TagKey) to strings

A list of tags applied to the data quality ruleset.

TargetTable
Type: DataQualityTargetTable structure

A target table associated with the data quality ruleset.

Result Syntax

[
    'Name' => '<string>',
]

Result Details

Members
Name
Type: string

A unique name for the data quality ruleset.

Errors

InvalidInputException:

The input provided was not valid.

AlreadyExistsException:

A resource to be created or added already exists.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.
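
The ClientToken recommendation for CreateDataQualityRuleset can be followed by generating one random UUID per logical request and reusing it on retries. The sketch below shows one way to do that. The helper randomUuidV4, the ruleset name, the table names, and the two DQDL rules are illustrative assumptions; consult the Glue developer guide for the DQDL rules that fit your data.

```php
<?php
// Sketch only: helper name, ruleset contents, and table names are hypothetical.

/** Generate a random version 4 UUID string to use as ClientToken. */
function randomUuidV4(): string
{
    $bytes = random_bytes(16);
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40); // set version to 4
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80); // set the RFC 4122 variant
    // 32 hex characters split into 8 groups of 4, joined as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

// Generate the token once and keep it, so a retry sends the same value.
$clientToken = randomUuidV4();

$params = [
    'ClientToken' => $clientToken,
    'Name' => 'example-orders-ruleset',
    'Ruleset' => 'Rules = [ IsComplete "order_id", IsUnique "order_id" ]',
    'TargetTable' => [
        'DatabaseName' => 'sales_db',
        'TableName' => 'orders',
    ],
];

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->createDataQualityRuleset($params);
// echo $result['Name'];
```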

CreateDatabase

$result = $client->createDatabase([/* ... */]);
$promise = $client->createDatabaseAsync([/* ... */]);

Creates a new database in a Data Catalog.

Parameter Syntax

$result = $client->createDatabase([
    'CatalogId' => '<string>',
    'DatabaseInput' => [ // REQUIRED
        'CreateTableDefaultPermissions' => [
            [
                'Permissions' => ['<string>', ...],
                'Principal' => [
                    'DataLakePrincipalIdentifier' => '<string>',
                ],
            ],
            // ...
        ],
        'Description' => '<string>',
        'FederatedDatabase' => [
            'ConnectionName' => '<string>',
            'Identifier' => '<string>',
        ],
        'LocationUri' => '<string>',
        'Name' => '<string>', // REQUIRED
        'Parameters' => ['<string>', ...],
        'TargetDatabase' => [
            'CatalogId' => '<string>',
            'DatabaseName' => '<string>',
            'Region' => '<string>',
        ],
    ],
    'Tags' => ['<string>', ...],
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog in which to create the database. If none is provided, the Amazon Web Services account ID is used by default.

DatabaseInput
Required: Yes
Type: DatabaseInput structure

The metadata for the database.

Tags
Type: Associative array of custom strings keys (TagKey) to strings

The tags you assign to the database.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InvalidInputException:

The input provided was not valid.

AlreadyExistsException:

A resource to be created or added already exists.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

GlueEncryptionException:

An encryption operation failed.

ConcurrentModificationException:

Two processes are trying to modify a resource simultaneously.

FederatedResourceAlreadyExistsException:

A federated resource already exists.
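
The error list for CreateDatabase suggests a common pattern: treat AlreadyExistsException as a non-fatal outcome and let every other error propagate. The sketch below is one possible shape for that. The function name createDatabaseIfMissing and the sample values are hypothetical. It inspects getAwsErrorCode() on the caught exception; with the SDK loaded you would normally catch Aws\Exception\AwsException directly, and the method_exists() check is used here only to keep the sketch independent of the SDK.

```php
<?php
// Sketch only: helper name and database values are hypothetical.

/**
 * Create a database, treating AlreadyExistsException as a non-fatal outcome.
 * Returns true if the database was created, false if it already existed.
 * $client is expected to behave like Aws\Glue\GlueClient.
 */
function createDatabaseIfMissing(object $client, array $databaseInput): bool
{
    try {
        $client->createDatabase(['DatabaseInput' => $databaseInput]);
        return true;
    } catch (Exception $e) {
        if (method_exists($e, 'getAwsErrorCode')
            && $e->getAwsErrorCode() === 'AlreadyExistsException') {
            return false;
        }
        throw $e; // any other error is a real failure
    }
}

$databaseInput = [
    'Name' => 'sales_db',
    'Description' => 'Example database for order data',
];

// With a configured Aws\Glue\GlueClient in $client:
// $created = createDatabaseIfMissing($client, $databaseInput);
```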

CreateDevEndpoint

$result = $client->createDevEndpoint([/* ... */]);
$promise = $client->createDevEndpointAsync([/* ... */]);

Creates a new development endpoint.

Parameter Syntax

$result = $client->createDevEndpoint([
    'Arguments' => ['<string>', ...],
    'EndpointName' => '<string>', // REQUIRED
    'ExtraJarsS3Path' => '<string>',
    'ExtraPythonLibsS3Path' => '<string>',
    'GlueVersion' => '<string>',
    'NumberOfNodes' => <integer>,
    'NumberOfWorkers' => <integer>,
    'PublicKey' => '<string>',
    'PublicKeys' => ['<string>', ...],
    'RoleArn' => '<string>', // REQUIRED
    'SecurityConfiguration' => '<string>',
    'SecurityGroupIds' => ['<string>', ...],
    'SubnetId' => '<string>',
    'Tags' => ['<string>', ...],
    'WorkerType' => 'Standard|G.1X|G.2X|G.025X|G.4X|G.8X|Z.2X',
]);

Parameter Details

Members
Arguments
Type: Associative array of custom strings keys (GenericString) to strings

A map of arguments used to configure the DevEndpoint.

EndpointName
Required: Yes
Type: string

The name to be assigned to the new DevEndpoint.

ExtraJarsS3Path
Type: string

The path to one or more Java .jar files in an S3 bucket that should be loaded in your DevEndpoint.

ExtraPythonLibsS3Path
Type: string

The paths to one or more Python libraries in an Amazon S3 bucket that should be loaded in your DevEndpoint. Multiple values must be complete paths separated by a comma.

You can only use pure Python libraries with a DevEndpoint. Libraries that rely on C extensions, such as the pandas Python data analysis library, are not yet supported.

GlueVersion
Type: string

Glue version determines the versions of Apache Spark and Python that Glue supports. The Python version indicates the version supported for running your ETL scripts on development endpoints.

For more information about the available Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide.

Development endpoints that are created without specifying a Glue version default to Glue 0.9.

You can specify a version of Python support for development endpoints by using the Arguments parameter in the CreateDevEndpoint or UpdateDevEndpoint APIs. If no arguments are provided, the version defaults to Python 2.

NumberOfNodes
Type: int

The number of Glue Data Processing Units (DPUs) to allocate to this DevEndpoint.

NumberOfWorkers
Type: int

The number of workers of a defined workerType that are allocated to the development endpoint.

The maximum number of workers you can define is 299 for G.1X and 149 for G.2X.

PublicKey
Type: string

The public key to be used by this DevEndpoint for authentication. This attribute is provided for backward compatibility; the recommended attribute is PublicKeys.

PublicKeys
Type: Array of strings

A list of public keys to be used by the development endpoints for authentication. The use of this attribute is preferred over a single public key because the public keys allow you to have a different private key per client.

If you previously created an endpoint with a public key, you must remove that key to be able to set a list of public keys. Call the UpdateDevEndpoint API with the public key content in the deletePublicKeys attribute, and the list of new keys in the addPublicKeys attribute.

RoleArn
Required: Yes
Type: string

The IAM role for the DevEndpoint.

SecurityConfiguration
Type: string

The name of the SecurityConfiguration structure to be used with this DevEndpoint.

SecurityGroupIds
Type: Array of strings

Security group IDs for the security groups to be used by the new DevEndpoint.

SubnetId
Type: string

The subnet ID for the new DevEndpoint to use.

Tags
Type: Associative array of custom strings keys (TagKey) to strings

The tags to use with this DevEndpoint. You may use tags to limit access to the DevEndpoint. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.

WorkerType
Type: string

The type of predefined worker that is allocated to the development endpoint. Accepts a value of Standard, G.1X, or G.2X.

  • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory, and a 50 GB disk, with 2 executors per worker.

  • For the G.1X worker type, each worker maps to 1 DPU (4 vCPU, 16 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.

  • For the G.2X worker type, each worker maps to 2 DPU (8 vCPU, 32 GB of memory, 128 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.

Known issue: when a development endpoint is created with the G.2X WorkerType configuration, the Spark drivers for the development endpoint will run on 4 vCPU, 16 GB of memory, and a 64 GB disk.
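
The per-worker figures and worker limits given for WorkerType and NumberOfWorkers can be turned into a quick capacity estimate before calling CreateDevEndpoint. The sketch below is only an illustration: the function name estimateEndpointCapacity and the constant names are hypothetical, the figures are copied from the descriptions above, and the estimate ignores the known issue about driver sizing for G.2X.

```php
<?php
// Sketch only: names are hypothetical; figures are copied from the
// WorkerType and NumberOfWorkers descriptions.

const WORKER_SPECS = [
    'Standard' => ['vcpu' => 4, 'memoryGb' => 16, 'diskGb' => 50,  'executors' => 2],
    'G.1X'     => ['vcpu' => 4, 'memoryGb' => 16, 'diskGb' => 64,  'executors' => 1],
    'G.2X'     => ['vcpu' => 8, 'memoryGb' => 32, 'diskGb' => 128, 'executors' => 1],
];

// Documented maximum worker counts; no limit is stated for Standard.
const MAX_WORKERS = ['G.1X' => 299, 'G.2X' => 149];

/** Estimate total resources for a worker type and worker count. */
function estimateEndpointCapacity(string $workerType, int $numberOfWorkers): array
{
    if (!array_key_exists($workerType, WORKER_SPECS)) {
        throw new InvalidArgumentException("No figures documented for $workerType");
    }
    if ($numberOfWorkers < 1) {
        throw new InvalidArgumentException('NumberOfWorkers must be at least 1');
    }
    if (array_key_exists($workerType, MAX_WORKERS)
        && $numberOfWorkers > MAX_WORKERS[$workerType]) {
        throw new InvalidArgumentException(
            "At most " . MAX_WORKERS[$workerType] . " workers are allowed for $workerType"
        );
    }

    $spec = WORKER_SPECS[$workerType];
    return [
        'vcpu' => $spec['vcpu'] * $numberOfWorkers,
        'memoryGb' => $spec['memoryGb'] * $numberOfWorkers,
        'diskGb' => $spec['diskGb'] * $numberOfWorkers,
        'executors' => $spec['executors'] * $numberOfWorkers,
    ];
}

// Ten G.2X workers: 80 vCPU, 320 GB memory, 1280 GB disk, 10 executors.
$capacity = estimateEndpointCapacity('G.2X', 10);
```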

Result Syntax

[
    'Arguments' => ['<string>', ...],
    'AvailabilityZone' => '<string>',
    'CreatedTimestamp' => <DateTime>,
    'EndpointName' => '<string>',
    'ExtraJarsS3Path' => '<string>',
    'ExtraPythonLibsS3Path' => '<string>',
    'FailureReason' => '<string>',
    'GlueVersion' => '<string>',
    'NumberOfNodes' => <integer>,
    'NumberOfWorkers' => <integer>,
    'RoleArn' => '<string>',
    'SecurityConfiguration' => '<string>',
    'SecurityGroupIds' => ['<string>', ...],
    'Status' => '<string>',
    'SubnetId' => '<string>',
    'VpcId' => '<string>',
    'WorkerType' => 'Standard|G.1X|G.2X|G.025X|G.4X|G.8X|Z.2X',
    'YarnEndpointAddress' => '<string>',
    'ZeppelinRemoteSparkInterpreterPort' => <integer>,
]

Result Details

Members
Arguments
Type: Associative array of custom strings keys (GenericString) to strings

The map of arguments used to configure this DevEndpoint.

Valid arguments are:

  • "--enable-glue-datacatalog": ""

You can specify a version of Python support for development endpoints by using the Arguments parameter in the CreateDevEndpoint or UpdateDevEndpoint APIs. If no arguments are provided, the version defaults to Python 2.

AvailabilityZone
Type: string

The Amazon Web Services Availability Zone where this DevEndpoint is located.

CreatedTimestamp
Type: timestamp (string|DateTime or anything parsable by strtotime)

The point in time at which this DevEndpoint was created.

EndpointName
Type: string

The name assigned to the new DevEndpoint.

ExtraJarsS3Path
Type: string

Path to one or more Java .jar files in an S3 bucket that will be loaded in your DevEndpoint.

ExtraPythonLibsS3Path
Type: string

The paths to one or more Python libraries in an S3 bucket that will be loaded in your DevEndpoint.

FailureReason
Type: string

The reason for a current failure in this DevEndpoint.

GlueVersion
Type: string

Glue version determines the versions of Apache Spark and Python that Glue supports. The Python version indicates the version supported for running your ETL scripts on development endpoints.

For more information about the available Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide.

NumberOfNodes
Type: int

The number of Glue Data Processing Units (DPUs) allocated to this DevEndpoint.

NumberOfWorkers
Type: int

The number of workers of a defined workerType that are allocated to the development endpoint.

RoleArn
Type: string

The Amazon Resource Name (ARN) of the role assigned to the new DevEndpoint.

SecurityConfiguration
Type: string

The name of the SecurityConfiguration structure being used with this DevEndpoint.

SecurityGroupIds
Type: Array of strings

The security groups assigned to the new DevEndpoint.

Status
Type: string

The current status of the new DevEndpoint.

SubnetId
Type: string

The subnet ID assigned to the new DevEndpoint.

VpcId
Type: string

The ID of the virtual private cloud (VPC) used by this DevEndpoint.

WorkerType
Type: string

The type of predefined worker that is allocated to the development endpoint. May be a value of Standard, G.1X, or G.2X.

YarnEndpointAddress
Type: string

The address of the YARN endpoint used by this DevEndpoint.

ZeppelinRemoteSparkInterpreterPort
Type: int

The Apache Zeppelin port for the remote Apache Spark interpreter.

Errors

AccessDeniedException:

Access to a resource was denied.

AlreadyExistsException:

A resource to be created or added already exists.

IdempotentParameterMismatchException:

The same unique identifier was associated with two different records.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.

ValidationException:

A value could not be validated.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

CreateJob

$result = $client->createJob([/* ... */]);
$promise = $client->createJobAsync([/* ... */]);

Creates a new job definition.

Parameter Syntax

$result = $client->createJob([
    'AllocatedCapacity' => <integer>,
    'CodeGenConfigurationNodes' => [
        '<NodeId>' => [
            'Aggregate' => [
                'Aggs' => [ // REQUIRED
                    [
                        'AggFunc' => 'avg|countDistinct|count|first|last|kurtosis|max|min|skewness|stddev_samp|stddev_pop|sum|sumDistinct|var_samp|var_pop', // REQUIRED
                        'Column' => ['<string>', ...], // REQUIRED
                    ],
                    // ...
                ],
                'Groups' => [ // REQUIRED
                    ['<string>', ...],
                    // ...
                ],
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
            ],
            'AmazonRedshiftSource' => [
                'Data' => [
                    'AccessType' => '<string>',
                    'Action' => '<string>',
                    'AdvancedOptions' => [
                        [
                            'Key' => '<string>',
                            'Value' => '<string>',
                        ],
                        // ...
                    ],
                    'CatalogDatabase' => [
                        'Description' => '<string>',
                        'Label' => '<string>',
                        'Value' => '<string>',
                    ],
                    'CatalogRedshiftSchema' => '<string>',
                    'CatalogRedshiftTable' => '<string>',
                    'CatalogTable' => [
                        'Description' => '<string>',
                        'Label' => '<string>',
                        'Value' => '<string>',
                    ],
                    'Connection' => [
                        'Description' => '<string>',
                        'Label' => '<string>',
                        'Value' => '<string>',
                    ],
                    'CrawlerConnection' => '<string>',
                    'IamRole' => [
                        'Description' => '<string>',
                        'Label' => '<string>',
                        'Value' => '<string>',
                    ],
                    'MergeAction' => '<string>',
                    'MergeClause' => '<string>',
                    'MergeWhenMatched' => '<string>',
                    'MergeWhenNotMatched' => '<string>',
                    'PostAction' => '<string>',
                    'PreAction' => '<string>',
                    'SampleQuery' => '<string>',
                    'Schema' => [
                        'Description' => '<string>',
                        'Label' => '<string>',
                        'Value' => '<string>',
                    ],
                    'SelectedColumns' => [
                        [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        // ...
                    ],
                    'SourceType' => '<string>',
                    'StagingTable' => '<string>',
                    'Table' => [
                        'Description' => '<string>',
                        'Label' => '<string>',
                        'Value' => '<string>',
                    ],
                    'TablePrefix' => '<string>',
                    'TableSchema' => [
                        [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        // ...
                    ],
                    'TempDir' => '<string>',
                    'Upsert' => true || false,
                ],
                'Name' => '<string>',
            ],
            'AmazonRedshiftTarget' => [
                'Data' => [
                    'AccessType' => '<string>',
                    'Action' => '<string>',
                    'AdvancedOptions' => [
                        [
                            'Key' => '<string>',
                            'Value' => '<string>',
                        ],
                        // ...
                    ],
                    'CatalogDatabase' => [
                        'Description' => '<string>',
                        'Label' => '<string>',
                        'Value' => '<string>',
                    ],
                    'CatalogRedshiftSchema' => '<string>',
                    'CatalogRedshiftTable' => '<string>',
                    'CatalogTable' => [
                        'Description' => '<string>',
                        'Label' => '<string>',
                        'Value' => '<string>',
                    ],
                    'Connection' => [
                        'Description' => '<string>',
                        'Label' => '<string>',
                        'Value' => '<string>',
                    ],
                    'CrawlerConnection' => '<string>',
                    'IamRole' => [
                        'Description' => '<string>',
                        'Label' => '<string>',
                        'Value' => '<string>',
                    ],
                    'MergeAction' => '<string>',
                    'MergeClause' => '<string>',
                    'MergeWhenMatched' => '<string>',
                    'MergeWhenNotMatched' => '<string>',
                    'PostAction' => '<string>',
                    'PreAction' => '<string>',
                    'SampleQuery' => '<string>',
                    'Schema' => [
                        'Description' => '<string>',
                        'Label' => '<string>',
                        'Value' => '<string>',
                    ],
                    'SelectedColumns' => [
                        [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        // ...
                    ],
                    'SourceType' => '<string>',
                    'StagingTable' => '<string>',
                    'Table' => [
                        'Description' => '<string>',
                        'Label' => '<string>',
                        'Value' => '<string>',
                    ],
                    'TablePrefix' => '<string>',
                    'TableSchema' => [
                        [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        // ...
                    ],
                    'TempDir' => '<string>',
                    'Upsert' => true || false,
                ],
                'Inputs' => ['<string>', ...],
                'Name' => '<string>',
            ],
            'ApplyMapping' => [
                'Inputs' => ['<string>', ...], // REQUIRED
                'Mapping' => [ // REQUIRED
                    [
                        'Children' => [...], // RECURSIVE
                        'Dropped' => true || false,
                        'FromPath' => ['<string>', ...],
                        'FromType' => '<string>',
                        'ToKey' => '<string>',
                        'ToType' => '<string>',
                    ],
                    // ...
                ],
                'Name' => '<string>', // REQUIRED
            ],
            'AthenaConnectorSource' => [
                'ConnectionName' => '<string>', // REQUIRED
                'ConnectionTable' => '<string>',
                'ConnectionType' => '<string>', // REQUIRED
                'ConnectorName' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
                'SchemaName' => '<string>', // REQUIRED
            ],
            'CatalogDeltaSource' => [
                'AdditionalDeltaOptions' => ['<string>', ...],
                'Database' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
                'Table' => '<string>', // REQUIRED
            ],
            'CatalogHudiSource' => [
                'AdditionalHudiOptions' => ['<string>', ...],
                'Database' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
                'Table' => '<string>', // REQUIRED
            ],
            'CatalogKafkaSource' => [
                'DataPreviewOptions' => [
                    'PollingTime' => <integer>,
                    'RecordPollingLimit' => <integer>,
                ],
                'Database' => '<string>', // REQUIRED
                'DetectSchema' => true || false,
                'Name' => '<string>', // REQUIRED
                'StreamingOptions' => [
                    'AddRecordTimestamp' => '<string>',
                    'Assign' => '<string>',
                    'BootstrapServers' => '<string>',
                    'Classification' => '<string>',
                    'ConnectionName' => '<string>',
                    'Delimiter' => '<string>',
                    'EmitConsumerLagMetrics' => '<string>',
                    'EndingOffsets' => '<string>',
                    'IncludeHeaders' => true || false,
                    'MaxOffsetsPerTrigger' => <integer>,
                    'MinPartitions' => <integer>,
                    'NumRetries' => <integer>,
                    'PollTimeoutMs' => <integer>,
                    'RetryIntervalMs' => <integer>,
                    'SecurityProtocol' => '<string>',
                    'StartingOffsets' => '<string>',
                    'StartingTimestamp' => <integer || string || DateTime>,
                    'SubscribePattern' => '<string>',
                    'TopicName' => '<string>',
                ],
                'Table' => '<string>', // REQUIRED
                'WindowSize' => <integer>,
            ],
            'CatalogKinesisSource' => [
                'DataPreviewOptions' => [
                    'PollingTime' => <integer>,
                    'RecordPollingLimit' => <integer>,
                ],
                'Database' => '<string>', // REQUIRED
                'DetectSchema' => true || false,
                'Name' => '<string>', // REQUIRED
                'StreamingOptions' => [
                    'AddIdleTimeBetweenReads' => true || false,
                    'AddRecordTimestamp' => '<string>',
                    'AvoidEmptyBatches' => true || false,
                    'Classification' => '<string>',
                    'Delimiter' => '<string>',
                    'DescribeShardInterval' => <integer>,
                    'EmitConsumerLagMetrics' => '<string>',
                    'EndpointUrl' => '<string>',
                    'IdleTimeBetweenReadsInMs' => <integer>,
                    'MaxFetchRecordsPerShard' => <integer>,
                    'MaxFetchTimeInMs' => <integer>,
                    'MaxRecordPerRead' => <integer>,
                    'MaxRetryIntervalMs' => <integer>,
                    'NumRetries' => <integer>,
                    'RetryIntervalMs' => <integer>,
                    'RoleArn' => '<string>',
                    'RoleSessionName' => '<string>',
                    'StartingPosition' => 'latest|trim_horizon|earliest|timestamp',
                    'StartingTimestamp' => <integer || string || DateTime>,
                    'StreamArn' => '<string>',
                    'StreamName' => '<string>',
                ],
                'Table' => '<string>', // REQUIRED
                'WindowSize' => <integer>,
            ],
            'CatalogSource' => [
                'Database' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'Table' => '<string>', // REQUIRED
            ],
            'CatalogTarget' => [
                'Database' => '<string>', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'PartitionKeys' => [
                    ['<string>', ...],
                    // ...
                ],
                'Table' => '<string>', // REQUIRED
            ],
            'ConnectorDataSource' => [
                'ConnectionType' => '<string>', // REQUIRED
                'Data' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
            ],
            'ConnectorDataTarget' => [
                'ConnectionType' => '<string>', // REQUIRED
                'Data' => ['<string>', ...], // REQUIRED
                'Inputs' => ['<string>', ...],
                'Name' => '<string>', // REQUIRED
            ],
            'CustomCode' => [
                'ClassName' => '<string>', // REQUIRED
                'Code' => '<string>', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
            ],
            'DirectJDBCSource' => [
                'ConnectionName' => '<string>', // REQUIRED
                'ConnectionType' => 'sqlserver|mysql|oracle|postgresql|redshift', // REQUIRED
                'Database' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'RedshiftTmpDir' => '<string>',
                'Table' => '<string>', // REQUIRED
            ],
            'DirectKafkaSource' => [
                'DataPreviewOptions' => [
                    'PollingTime' => <integer>,
                    'RecordPollingLimit' => <integer>,
                ],
                'DetectSchema' => true || false,
                'Name' => '<string>', // REQUIRED
                'StreamingOptions' => [
                    'AddRecordTimestamp' => '<string>',
                    'Assign' => '<string>',
                    'BootstrapServers' => '<string>',
                    'Classification' => '<string>',
                    'ConnectionName' => '<string>',
                    'Delimiter' => '<string>',
                    'EmitConsumerLagMetrics' => '<string>',
                    'EndingOffsets' => '<string>',
                    'IncludeHeaders' => true || false,
                    'MaxOffsetsPerTrigger' => <integer>,
                    'MinPartitions' => <integer>,
                    'NumRetries' => <integer>,
                    'PollTimeoutMs' => <integer>,
                    'RetryIntervalMs' => <integer>,
                    'SecurityProtocol' => '<string>',
                    'StartingOffsets' => '<string>',
                    'StartingTimestamp' => <integer || string || DateTime>,
                    'SubscribePattern' => '<string>',
                    'TopicName' => '<string>',
                ],
                'WindowSize' => <integer>,
            ],
            'DirectKinesisSource' => [
                'DataPreviewOptions' => [
                    'PollingTime' => <integer>,
                    'RecordPollingLimit' => <integer>,
                ],
                'DetectSchema' => true || false,
                'Name' => '<string>', // REQUIRED
                'StreamingOptions' => [
                    'AddIdleTimeBetweenReads' => true || false,
                    'AddRecordTimestamp' => '<string>',
                    'AvoidEmptyBatches' => true || false,
                    'Classification' => '<string>',
                    'Delimiter' => '<string>',
                    'DescribeShardInterval' => <integer>,
                    'EmitConsumerLagMetrics' => '<string>',
                    'EndpointUrl' => '<string>',
                    'IdleTimeBetweenReadsInMs' => <integer>,
                    'MaxFetchRecordsPerShard' => <integer>,
                    'MaxFetchTimeInMs' => <integer>,
                    'MaxRecordPerRead' => <integer>,
                    'MaxRetryIntervalMs' => <integer>,
                    'NumRetries' => <integer>,
                    'RetryIntervalMs' => <integer>,
                    'RoleArn' => '<string>',
                    'RoleSessionName' => '<string>',
                    'StartingPosition' => 'latest|trim_horizon|earliest|timestamp',
                    'StartingTimestamp' => <integer || string || DateTime>,
                    'StreamArn' => '<string>',
                    'StreamName' => '<string>',
                ],
                'WindowSize' => <integer>,
            ],
            'DropDuplicates' => [
                'Columns' => [
                    ['<string>', ...],
                    // ...
                ],
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
            ],
            'DropFields' => [
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'Paths' => [ // REQUIRED
                    ['<string>', ...],
                    // ...
                ],
            ],
            'DropNullFields' => [
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'NullCheckBoxList' => [
                    'IsEmpty' => true || false,
                    'IsNegOne' => true || false,
                    'IsNullString' => true || false,
                ],
                'NullTextList' => [
                    [
                        'Datatype' => [ // REQUIRED
                            'Id' => '<string>', // REQUIRED
                            'Label' => '<string>', // REQUIRED
                        ],
                        'Value' => '<string>', // REQUIRED
                    ],
                    // ...
                ],
            ],
            'DynamicTransform' => [
                'FunctionName' => '<string>', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
                'Parameters' => [
                    [
                        'IsOptional' => true || false,
                        'ListType' => 'str|int|float|complex|bool|list|null',
                        'Name' => '<string>', // REQUIRED
                        'Type' => 'str|int|float|complex|bool|list|null', // REQUIRED
                        'ValidationMessage' => '<string>',
                        'ValidationRule' => '<string>',
                        'Value' => ['<string>', ...],
                    ],
                    // ...
                ],
                'Path' => '<string>', // REQUIRED
                'TransformName' => '<string>', // REQUIRED
                'Version' => '<string>',
            ],
            'DynamoDBCatalogSource' => [
                'Database' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'Table' => '<string>', // REQUIRED
            ],
            'EvaluateDataQuality' => [
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'Output' => 'PrimaryInput|EvaluationResults',
                'PublishingOptions' => [
                    'CloudWatchMetricsEnabled' => true || false,
                    'EvaluationContext' => '<string>',
                    'ResultsPublishingEnabled' => true || false,
                    'ResultsS3Prefix' => '<string>',
                ],
                'Ruleset' => '<string>', // REQUIRED
                'StopJobOnFailureOptions' => [
                    'StopJobOnFailureTiming' => 'Immediate|AfterDataLoad',
                ],
            ],
            'EvaluateDataQualityMultiFrame' => [
                'AdditionalDataSources' => ['<string>', ...],
                'AdditionalOptions' => ['<string>', ...],
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'PublishingOptions' => [
                    'CloudWatchMetricsEnabled' => true || false,
                    'EvaluationContext' => '<string>',
                    'ResultsPublishingEnabled' => true || false,
                    'ResultsS3Prefix' => '<string>',
                ],
                'Ruleset' => '<string>', // REQUIRED
                'StopJobOnFailureOptions' => [
                    'StopJobOnFailureTiming' => 'Immediate|AfterDataLoad',
                ],
            ],
            'FillMissingValues' => [
                'FilledPath' => '<string>',
                'ImputedPath' => '<string>', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
            ],
            'Filter' => [
                'Filters' => [ // REQUIRED
                    [
                        'Negated' => true || false,
                        'Operation' => 'EQ|LT|GT|LTE|GTE|REGEX|ISNULL', // REQUIRED
                        'Values' => [ // REQUIRED
                            [
                                'Type' => 'COLUMNEXTRACTED|CONSTANT', // REQUIRED
                                'Value' => ['<string>', ...], // REQUIRED
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
                'Inputs' => ['<string>', ...], // REQUIRED
                'LogicalOperator' => 'AND|OR', // REQUIRED
                'Name' => '<string>', // REQUIRED
            ],
            'GovernedCatalogSource' => [
                'AdditionalOptions' => [
                    'BoundedFiles' => <integer>,
                    'BoundedSize' => <integer>,
                ],
                'Database' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'PartitionPredicate' => '<string>',
                'Table' => '<string>', // REQUIRED
            ],
            'GovernedCatalogTarget' => [
                'Database' => '<string>', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'PartitionKeys' => [
                    ['<string>', ...],
                    // ...
                ],
                'SchemaChangePolicy' => [
                    'EnableUpdateCatalog' => true || false,
                    'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                ],
                'Table' => '<string>', // REQUIRED
            ],
            'JDBCConnectorSource' => [
                'AdditionalOptions' => [
                    'DataTypeMapping' => ['<string>', ...],
                    'FilterPredicate' => '<string>',
                    'JobBookmarkKeys' => ['<string>', ...],
                    'JobBookmarkKeysSortOrder' => '<string>',
                    'LowerBound' => <integer>,
                    'NumPartitions' => <integer>,
                    'PartitionColumn' => '<string>',
                    'UpperBound' => <integer>,
                ],
                'ConnectionName' => '<string>', // REQUIRED
                'ConnectionTable' => '<string>',
                'ConnectionType' => '<string>', // REQUIRED
                'ConnectorName' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
                'Query' => '<string>',
            ],
            'JDBCConnectorTarget' => [
                'AdditionalOptions' => ['<string>', ...],
                'ConnectionName' => '<string>', // REQUIRED
                'ConnectionTable' => '<string>', // REQUIRED
                'ConnectionType' => '<string>', // REQUIRED
                'ConnectorName' => '<string>', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
            ],
            'Join' => [
                'Columns' => [ // REQUIRED
                    [
                        'From' => '<string>', // REQUIRED
                        'Keys' => [ // REQUIRED
                            ['<string>', ...],
                            // ...
                        ],
                    ],
                    // ...
                ],
                'Inputs' => ['<string>', ...], // REQUIRED
                'JoinType' => 'equijoin|left|right|outer|leftsemi|leftanti', // REQUIRED
                'Name' => '<string>', // REQUIRED
            ],
            'Merge' => [
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'PrimaryKeys' => [ // REQUIRED
                    ['<string>', ...],
                    // ...
                ],
                'Source' => '<string>', // REQUIRED
            ],
            'MicrosoftSQLServerCatalogSource' => [
                'Database' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'Table' => '<string>', // REQUIRED
            ],
            'MicrosoftSQLServerCatalogTarget' => [
                'Database' => '<string>', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'Table' => '<string>', // REQUIRED
            ],
            'MySQLCatalogSource' => [
                'Database' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'Table' => '<string>', // REQUIRED
            ],
            'MySQLCatalogTarget' => [
                'Database' => '<string>', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'Table' => '<string>', // REQUIRED
            ],
            'OracleSQLCatalogSource' => [
                'Database' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'Table' => '<string>', // REQUIRED
            ],
            'OracleSQLCatalogTarget' => [
                'Database' => '<string>', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'Table' => '<string>', // REQUIRED
            ],
            'PIIDetection' => [
                'EntityTypesToDetect' => ['<string>', ...], // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'MaskValue' => '<string>',
                'Name' => '<string>', // REQUIRED
                'OutputColumnName' => '<string>',
                'PiiType' => 'RowAudit|RowMasking|ColumnAudit|ColumnMasking', // REQUIRED
                'SampleFraction' => <float>,
                'ThresholdFraction' => <float>,
            ],
            'PostgreSQLCatalogSource' => [
                'Database' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'Table' => '<string>', // REQUIRED
            ],
            'PostgreSQLCatalogTarget' => [
                'Database' => '<string>', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'Table' => '<string>', // REQUIRED
            ],
            'Recipe' => [
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'RecipeReference' => [
                    'RecipeArn' => '<string>', // REQUIRED
                    'RecipeVersion' => '<string>', // REQUIRED
                ],
                'RecipeSteps' => [
                    [
                        'Action' => [ // REQUIRED
                            'Operation' => '<string>', // REQUIRED
                            'Parameters' => ['<string>', ...],
                        ],
                        'ConditionExpressions' => [
                            [
                                'Condition' => '<string>', // REQUIRED
                                'TargetColumn' => '<string>', // REQUIRED
                                'Value' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
            ],
            'RedshiftSource' => [
                'Database' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'RedshiftTmpDir' => '<string>',
                'Table' => '<string>', // REQUIRED
                'TmpDirIAMRole' => '<string>',
            ],
            'RedshiftTarget' => [
                'Database' => '<string>', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'RedshiftTmpDir' => '<string>',
                'Table' => '<string>', // REQUIRED
                'TmpDirIAMRole' => '<string>',
                'UpsertRedshiftOptions' => [
                    'ConnectionName' => '<string>',
                    'TableLocation' => '<string>',
                    'UpsertKeys' => ['<string>', ...],
                ],
            ],
            'RelationalCatalogSource' => [
                'Database' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'Table' => '<string>', // REQUIRED
            ],
            'RenameField' => [
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'SourcePath' => ['<string>', ...], // REQUIRED
                'TargetPath' => ['<string>', ...], // REQUIRED
            ],
            'S3CatalogDeltaSource' => [
                'AdditionalDeltaOptions' => ['<string>', ...],
                'Database' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
                'Table' => '<string>', // REQUIRED
            ],
            'S3CatalogHudiSource' => [
                'AdditionalHudiOptions' => ['<string>', ...],
                'Database' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
                'Table' => '<string>', // REQUIRED
            ],
            'S3CatalogSource' => [
                'AdditionalOptions' => [
                    'BoundedFiles' => <integer>,
                    'BoundedSize' => <integer>,
                ],
                'Database' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'PartitionPredicate' => '<string>',
                'Table' => '<string>', // REQUIRED
            ],
            'S3CatalogTarget' => [
                'Database' => '<string>', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'PartitionKeys' => [
                    ['<string>', ...],
                    // ...
                ],
                'SchemaChangePolicy' => [
                    'EnableUpdateCatalog' => true || false,
                    'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                ],
                'Table' => '<string>', // REQUIRED
            ],
            'S3CsvSource' => [
                'AdditionalOptions' => [
                    'BoundedFiles' => <integer>,
                    'BoundedSize' => <integer>,
                    'EnableSamplePath' => true || false,
                    'SamplePath' => '<string>',
                ],
                'CompressionType' => 'gzip|bzip2',
                'Escaper' => '<string>',
                'Exclusions' => ['<string>', ...],
                'GroupFiles' => '<string>',
                'GroupSize' => '<string>',
                'MaxBand' => <integer>,
                'MaxFilesInBand' => <integer>,
                'Multiline' => true || false,
                'Name' => '<string>', // REQUIRED
                'OptimizePerformance' => true || false,
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
                'Paths' => ['<string>', ...], // REQUIRED
                'QuoteChar' => 'quote|quillemet|single_quote|disabled', // REQUIRED
                'Recurse' => true || false,
                'Separator' => 'comma|ctrla|pipe|semicolon|tab', // REQUIRED
                'SkipFirst' => true || false,
                'WithHeader' => true || false,
                'WriteHeader' => true || false,
            ],
            'S3DeltaCatalogTarget' => [
                'AdditionalOptions' => ['<string>', ...],
                'Database' => '<string>', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'PartitionKeys' => [
                    ['<string>', ...],
                    // ...
                ],
                'SchemaChangePolicy' => [
                    'EnableUpdateCatalog' => true || false,
                    'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                ],
                'Table' => '<string>', // REQUIRED
            ],
            'S3DeltaDirectTarget' => [
                'AdditionalOptions' => ['<string>', ...],
                'Compression' => 'uncompressed|snappy', // REQUIRED
                'Format' => 'json|csv|avro|orc|parquet|hudi|delta', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'PartitionKeys' => [
                    ['<string>', ...],
                    // ...
                ],
                'Path' => '<string>', // REQUIRED
                'SchemaChangePolicy' => [
                    'Database' => '<string>',
                    'EnableUpdateCatalog' => true || false,
                    'Table' => '<string>',
                    'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                ],
            ],
            'S3DeltaSource' => [
                'AdditionalDeltaOptions' => ['<string>', ...],
                'AdditionalOptions' => [
                    'BoundedFiles' => <integer>,
                    'BoundedSize' => <integer>,
                    'EnableSamplePath' => true || false,
                    'SamplePath' => '<string>',
                ],
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
                'Paths' => ['<string>', ...], // REQUIRED
            ],
            'S3DirectTarget' => [
                'Compression' => '<string>',
                'Format' => 'json|csv|avro|orc|parquet|hudi|delta', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'PartitionKeys' => [
                    ['<string>', ...],
                    // ...
                ],
                'Path' => '<string>', // REQUIRED
                'SchemaChangePolicy' => [
                    'Database' => '<string>',
                    'EnableUpdateCatalog' => true || false,
                    'Table' => '<string>',
                    'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                ],
            ],
            'S3GlueParquetTarget' => [
                'Compression' => 'snappy|lzo|gzip|uncompressed|none',
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'PartitionKeys' => [
                    ['<string>', ...],
                    // ...
                ],
                'Path' => '<string>', // REQUIRED
                'SchemaChangePolicy' => [
                    'Database' => '<string>',
                    'EnableUpdateCatalog' => true || false,
                    'Table' => '<string>',
                    'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                ],
            ],
            'S3HudiCatalogTarget' => [
                'AdditionalOptions' => ['<string>', ...], // REQUIRED
                'Database' => '<string>', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'PartitionKeys' => [
                    ['<string>', ...],
                    // ...
                ],
                'SchemaChangePolicy' => [
                    'EnableUpdateCatalog' => true || false,
                    'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                ],
                'Table' => '<string>', // REQUIRED
            ],
            'S3HudiDirectTarget' => [
                'AdditionalOptions' => ['<string>', ...], // REQUIRED
                'Compression' => 'gzip|lzo|uncompressed|snappy', // REQUIRED
                'Format' => 'json|csv|avro|orc|parquet|hudi|delta', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'PartitionKeys' => [
                    ['<string>', ...],
                    // ...
                ],
                'Path' => '<string>', // REQUIRED
                'SchemaChangePolicy' => [
                    'Database' => '<string>',
                    'EnableUpdateCatalog' => true || false,
                    'Table' => '<string>',
                    'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                ],
            ],
            'S3HudiSource' => [
                'AdditionalHudiOptions' => ['<string>', ...],
                'AdditionalOptions' => [
                    'BoundedFiles' => <integer>,
                    'BoundedSize' => <integer>,
                    'EnableSamplePath' => true || false,
                    'SamplePath' => '<string>',
                ],
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
                'Paths' => ['<string>', ...], // REQUIRED
            ],
            'S3JsonSource' => [
                'AdditionalOptions' => [
                    'BoundedFiles' => <integer>,
                    'BoundedSize' => <integer>,
                    'EnableSamplePath' => true || false,
                    'SamplePath' => '<string>',
                ],
                'CompressionType' => 'gzip|bzip2',
                'Exclusions' => ['<string>', ...],
                'GroupFiles' => '<string>',
                'GroupSize' => '<string>',
                'JsonPath' => '<string>',
                'MaxBand' => <integer>,
                'MaxFilesInBand' => <integer>,
                'Multiline' => true || false,
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
                'Paths' => ['<string>', ...], // REQUIRED
                'Recurse' => true || false,
            ],
            'S3ParquetSource' => [
                'AdditionalOptions' => [
                    'BoundedFiles' => <integer>,
                    'BoundedSize' => <integer>,
                    'EnableSamplePath' => true || false,
                    'SamplePath' => '<string>',
                ],
                'CompressionType' => 'snappy|lzo|gzip|uncompressed|none',
                'Exclusions' => ['<string>', ...],
                'GroupFiles' => '<string>',
                'GroupSize' => '<string>',
                'MaxBand' => <integer>,
                'MaxFilesInBand' => <integer>,
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
                'Paths' => ['<string>', ...], // REQUIRED
                'Recurse' => true || false,
            ],
            'SelectFields' => [
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'Paths' => [ // REQUIRED
                    ['<string>', ...],
                    // ...
                ],
            ],
            'SelectFromCollection' => [
                'Index' => <integer>, // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
            ],
            'SnowflakeSource' => [
                'Data' => [ // REQUIRED
                    'Action' => '<string>',
                    'AdditionalOptions' => ['<string>', ...],
                    'AutoPushdown' => true || false,
                    'Connection' => [
                        'Description' => '<string>',
                        'Label' => '<string>',
                        'Value' => '<string>',
                    ],
                    'Database' => '<string>',
                    'IamRole' => [
                        'Description' => '<string>',
                        'Label' => '<string>',
                        'Value' => '<string>',
                    ],
                    'MergeAction' => '<string>',
                    'MergeClause' => '<string>',
                    'MergeWhenMatched' => '<string>',
                    'MergeWhenNotMatched' => '<string>',
                    'PostAction' => '<string>',
                    'PreAction' => '<string>',
                    'SampleQuery' => '<string>',
                    'Schema' => '<string>',
                    'SelectedColumns' => [
                        [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        // ...
                    ],
                    'SourceType' => '<string>',
                    'StagingTable' => '<string>',
                    'Table' => '<string>',
                    'TableSchema' => [
                        [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        // ...
                    ],
                    'TempDir' => '<string>',
                    'Upsert' => true || false,
                ],
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
            ],
            'SnowflakeTarget' => [
                'Data' => [ // REQUIRED
                    'Action' => '<string>',
                    'AdditionalOptions' => ['<string>', ...],
                    'AutoPushdown' => true || false,
                    'Connection' => [
                        'Description' => '<string>',
                        'Label' => '<string>',
                        'Value' => '<string>',
                    ],
                    'Database' => '<string>',
                    'IamRole' => [
                        'Description' => '<string>',
                        'Label' => '<string>',
                        'Value' => '<string>',
                    ],
                    'MergeAction' => '<string>',
                    'MergeClause' => '<string>',
                    'MergeWhenMatched' => '<string>',
                    'MergeWhenNotMatched' => '<string>',
                    'PostAction' => '<string>',
                    'PreAction' => '<string>',
                    'SampleQuery' => '<string>',
                    'Schema' => '<string>',
                    'SelectedColumns' => [
                        [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        // ...
                    ],
                    'SourceType' => '<string>',
                    'StagingTable' => '<string>',
                    'Table' => '<string>',
                    'TableSchema' => [
                        [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        // ...
                    ],
                    'TempDir' => '<string>',
                    'Upsert' => true || false,
                ],
                'Inputs' => ['<string>', ...],
                'Name' => '<string>', // REQUIRED
            ],
            'SparkConnectorSource' => [
                'AdditionalOptions' => ['<string>', ...],
                'ConnectionName' => '<string>', // REQUIRED
                'ConnectionType' => '<string>', // REQUIRED
                'ConnectorName' => '<string>', // REQUIRED
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
            ],
            'SparkConnectorTarget' => [
                'AdditionalOptions' => ['<string>', ...],
                'ConnectionName' => '<string>', // REQUIRED
                'ConnectionType' => '<string>', // REQUIRED
                'ConnectorName' => '<string>', // REQUIRED
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
            ],
            'SparkSQL' => [
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'OutputSchemas' => [
                    [
                        'Columns' => [
                            [
                                'Name' => '<string>', // REQUIRED
                                'Type' => '<string>',
                            ],
                            // ...
                        ],
                    ],
                    // ...
                ],
                'SqlAliases' => [ // REQUIRED
                    [
                        'Alias' => '<string>', // REQUIRED
                        'From' => '<string>', // REQUIRED
                    ],
                    // ...
                ],
                'SqlQuery' => '<string>', // REQUIRED
            ],
            'Spigot' => [
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'Path' => '<string>', // REQUIRED
                'Prob' => <float>,
                'Topk' => <integer>,
            ],
            'SplitFields' => [
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'Paths' => [ // REQUIRED
                    ['<string>', ...],
                    // ...
                ],
            ],
            'Union' => [
                'Inputs' => ['<string>', ...], // REQUIRED
                'Name' => '<string>', // REQUIRED
                'UnionType' => 'ALL|DISTINCT', // REQUIRED
            ],
        ],
        // ...
    ],
    'Command' => [ // REQUIRED
        'Name' => '<string>',
        'PythonVersion' => '<string>',
        'Runtime' => '<string>',
        'ScriptLocation' => '<string>',
    ],
    'Connections' => [
        'Connections' => ['<string>', ...],
    ],
    'DefaultArguments' => ['<string>', ...],
    'Description' => '<string>',
    'ExecutionClass' => 'FLEX|STANDARD',
    'ExecutionProperty' => [
        'MaxConcurrentRuns' => <integer>,
    ],
    'GlueVersion' => '<string>',
    'JobMode' => 'SCRIPT|VISUAL|NOTEBOOK',
    'JobRunQueuingEnabled' => true || false,
    'LogUri' => '<string>',
    'MaintenanceWindow' => '<string>',
    'MaxCapacity' => <float>,
    'MaxRetries' => <integer>,
    'Name' => '<string>', // REQUIRED
    'NonOverridableArguments' => ['<string>', ...],
    'NotificationProperty' => [
        'NotifyDelayAfter' => <integer>,
    ],
    'NumberOfWorkers' => <integer>,
    'Role' => '<string>', // REQUIRED
    'SecurityConfiguration' => '<string>',
    'SourceControlDetails' => [
        'AuthStrategy' => 'PERSONAL_ACCESS_TOKEN|AWS_SECRETS_MANAGER',
        'AuthToken' => '<string>',
        'Branch' => '<string>',
        'Folder' => '<string>',
        'LastCommitId' => '<string>',
        'Owner' => '<string>',
        'Provider' => 'GITHUB|GITLAB|BITBUCKET|AWS_CODE_COMMIT',
        'Repository' => '<string>',
    ],
    'Tags' => ['<string>', ...],
    'Timeout' => <integer>,
    'WorkerType' => 'Standard|G.1X|G.2X|G.025X|G.4X|G.8X|Z.2X',
]);

Parameter Details

Members
AllocatedCapacity
Type: int

This parameter is deprecated. Use MaxCapacity instead.

The number of Glue data processing units (DPUs) to allocate to this Job. You can allocate a minimum of 2 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

CodeGenConfigurationNodes
Type: Associative array of custom strings keys (NodeId) to CodeGenConfigurationNode structures

The representation of a directed acyclic graph on which both the Glue Studio visual component and Glue Studio code generation are based.
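As a rough illustration of the graph shape, the sketch below builds a three-node CodeGenConfigurationNodes array from the S3JsonSource, SparkSQL, and S3DirectTarget members shown in the parameter syntax, using only their required fields. The node IDs, names, bucket paths, and query are hypothetical placeholders, and the helper findDanglingInputs is not part of the SDK. It assumes that Inputs and SqlAliases.From refer to node IDs (the array keys).

```php
<?php
// Hypothetical graph: JSON source -> SQL transform -> S3 target.
// All IDs, names, paths, and the query are placeholders for illustration.
$nodes = [
    'node-1' => [
        'S3JsonSource' => [
            'Name'  => 'RawEvents',
            'Paths' => ['s3://example-bucket/raw/'],
        ],
    ],
    'node-2' => [
        'SparkSQL' => [
            'Name'       => 'FilterEvents',
            'Inputs'     => ['node-1'],
            'SqlAliases' => [['Alias' => 'events', 'From' => 'node-1']],
            'SqlQuery'   => 'SELECT * FROM events WHERE id IS NOT NULL',
        ],
    ],
    'node-3' => [
        'S3DirectTarget' => [
            'Name'   => 'CleanEvents',
            'Inputs' => ['node-2'],
            'Format' => 'parquet',
            'Path'   => 's3://example-bucket/clean/',
        ],
    ],
];

/**
 * Returns "nodeId -> input" strings for every input that does not
 * match a node ID in the graph (assumption: Inputs hold node IDs).
 */
function findDanglingInputs(array $nodes): array
{
    $dangling = [];
    foreach ($nodes as $nodeId => $node) {
        foreach ($node as $definition) {
            foreach ($definition['Inputs'] ?? [] as $input) {
                if (!array_key_exists($input, $nodes)) {
                    $dangling[] = "$nodeId -> $input";
                }
            }
        }
    }
    return $dangling;
}
```

Checking references locally before sending the request can catch a mistyped node ID early; the service performs its own validation regardless.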

Command
Required: Yes
Type: JobCommand structure

The JobCommand that runs this job.

Connections
Type: ConnectionsList structure

The connections used for this job.

DefaultArguments
Type: Associative array of custom strings keys (GenericString) to strings

The default arguments for every run of this job, specified as name-value pairs.

You can specify arguments here that your own job-execution script consumes, as well as arguments that Glue itself consumes.

Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets from a Glue Connection, Secrets Manager or other secret management mechanism if you intend to keep them within the Job.

For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide.

For information about the arguments you can provide to this field when configuring Spark jobs, see the Special Parameters Used by Glue topic in the developer guide.

For information about the arguments you can provide to this field when configuring Ray jobs, see Using job parameters in Ray jobs in the developer guide.
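Because job arguments may be logged, it can help to scan argument names before submitting them. The following is a minimal sketch, not an SDK feature: the argument names are hypothetical, and the keyword pattern is only a heuristic assumption about what a secret-bearing name looks like.

```php
<?php
// DefaultArguments is an associative array of argument name => value.
// These names and values are hypothetical examples.
$defaultArguments = [
    '--target_table' => 'orders',
    '--db_password'  => 'do-not-do-this',
];

/**
 * Returns argument names that look like they carry plaintext secrets.
 * The keyword list is a heuristic assumption, not a Glue rule.
 */
function findSuspiciousArgumentNames(array $arguments): array
{
    $suspicious = [];
    foreach (array_keys($arguments) as $name) {
        if (preg_match('/password|secret|token/i', (string) $name) === 1) {
            $suspicious[] = $name;
        }
    }
    return $suspicious;
}
```

A non-empty result suggests moving that value to a Glue Connection or Secrets Manager instead.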

Description
Type: string

Description of the job being defined.

ExecutionClass
Type: string

Indicates whether the job is run with a standard or flexible execution class. The standard execution class is ideal for time-sensitive workloads that require fast job startup and dedicated resources.

The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary.

Only jobs with Glue version 3.0 and above and command type glueetl will be allowed to set ExecutionClass to FLEX. The flexible execution class is available for Spark jobs.
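The FLEX eligibility rule above can be checked before building the request. This is a minimal sketch; canUseFlex is a hypothetical helper, not an SDK function.

```php
<?php
/**
 * Returns true when a job may set ExecutionClass to FLEX:
 * Glue version 3.0 or above and command type glueetl.
 */
function canUseFlex(string $glueVersion, string $commandName): bool
{
    return $commandName === 'glueetl'
        && version_compare($glueVersion, '3.0', '>=');
}
```

For example, canUseFlex('4.0', 'glueetl') is true, while canUseFlex('2.0', 'glueetl') and canUseFlex('4.0', 'pythonshell') are false.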

ExecutionProperty
Type: ExecutionProperty structure

An ExecutionProperty specifying the maximum number of concurrent runs allowed for this job.

GlueVersion
Type: string

In Spark jobs, GlueVersion determines the versions of Apache Spark and Python that Glue makes available in a job. The Python version indicates the version supported for jobs of type Spark.

Ray jobs should set GlueVersion to 4.0 or greater. However, the versions of Ray, Python and additional libraries available in your Ray job are determined by the Runtime parameter of the Job command.

For more information about the available Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide.

Jobs that are created without specifying a Glue version default to Glue 0.9.

JobMode
Type: string

A mode that describes how a job was created. Valid values are:

  • SCRIPT - The job was created using the Glue Studio script editor.

  • VISUAL - The job was created using the Glue Studio visual editor.

  • NOTEBOOK - The job was created using an interactive sessions notebook.

When the JobMode field is missing or null, SCRIPT is assigned as the default value.

JobRunQueuingEnabled
Type: boolean

Specifies whether job run queuing is enabled for the job runs for this job.

A value of true means job run queuing is enabled for the job runs. If false or not populated, the job runs will not be considered for queuing.

If this field does not match the value set in the job run, then the value from the job run field will be used.

LogUri
Type: string

This field is reserved for future use.

MaintenanceWindow
Type: string

This field specifies a day of the week and hour for a maintenance window for streaming jobs. Glue periodically performs maintenance activities. During these maintenance windows, Glue will need to restart your streaming jobs.

Glue will restart the job within 3 hours of the specified maintenance window. For instance, if you set up the maintenance window for Monday at 10:00 AM GMT, your jobs will be restarted between 10:00 AM GMT and 1:00 PM GMT.
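The 3-hour restart range can be computed from the window start. The sketch below assumes the MaintenanceWindow string has the form Day:Hour (for example Mon:10); that format is not stated in the text above, and restartWindow is a hypothetical helper.

```php
<?php
/**
 * Returns the start and end of the 3-hour restart range for a window.
 * Assumption: the value looks like "Mon:10" (three-letter day, 0-23 hour).
 */
function restartWindow(string $maintenanceWindow): array
{
    $pattern = '/^(Sun|Mon|Tue|Wed|Thu|Fri|Sat):([01]?[0-9]|2[0-3])$/';
    if (preg_match($pattern, $maintenanceWindow, $m) !== 1) {
        throw new InvalidArgumentException("Unexpected maintenance window: $maintenanceWindow");
    }
    $days      = ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat'];
    $startDay  = array_search($m[1], $days, true);
    $startHour = (int) $m[2];
    $endHour   = $startHour + 3;
    // Roll over to the next day when the range crosses midnight.
    $endDay    = ($startDay + intdiv($endHour, 24)) % 7;

    return [
        'start' => sprintf('%s %02d:00 GMT', $days[$startDay], $startHour),
        'end'   => sprintf('%s %02d:00 GMT', $days[$endDay], $endHour % 24),
    ];
}
```

With the Monday 10:00 AM example, the helper yields a range from Mon 10:00 GMT to Mon 13:00 GMT.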

MaxCapacity
Type: double

For Glue version 1.0 or earlier jobs, using the standard worker type, the number of Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

For Glue version 2.0+ jobs, you cannot specify a Maximum capacity. Instead, you should specify a Worker type and the Number of workers.

Do not set MaxCapacity if using WorkerType and NumberOfWorkers.

The value that can be allocated for MaxCapacity depends on whether you are running a Python shell job, an Apache Spark ETL job, or an Apache Spark streaming ETL job:

  • When you specify a Python shell job (JobCommand.Name="pythonshell"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.

  • When you specify an Apache Spark ETL job (JobCommand.Name="glueetl") or Apache Spark streaming ETL job (JobCommand.Name="gluestreaming"), you can allocate from 2 to 100 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.
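The allocation rules in the list above can be expressed as a small pre-flight check. This is a sketch under the stated rules only; isValidMaxCapacity is a hypothetical helper, and command names not covered by the list are rejected rather than guessed at.

```php
<?php
/**
 * Checks a MaxCapacity value against the documented ranges:
 * pythonshell accepts 0.0625 or 1 DPU; glueetl and gluestreaming
 * accept a whole number of DPUs from 2 to 100.
 */
function isValidMaxCapacity(string $commandName, float $maxCapacity): bool
{
    switch ($commandName) {
        case 'pythonshell':
            return in_array($maxCapacity, [0.0625, 1.0], true);
        case 'glueetl':
        case 'gluestreaming':
            return $maxCapacity >= 2
                && $maxCapacity <= 100
                && floor($maxCapacity) === $maxCapacity; // no fractional DPUs
        default:
            throw new InvalidArgumentException("Command name not covered here: $commandName");
    }
}
```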

MaxRetries
Type: int

The maximum number of times to retry this job if it fails.

Name
Required: Yes
Type: string

The name you assign to this job definition. It must be unique in your account.

NonOverridableArguments
Type: Associative array of custom strings keys (GenericString) to strings

Arguments for this job that are not overridden when providing job arguments in a job run, specified as name-value pairs.

NotificationProperty
Type: NotificationProperty structure

Specifies configuration properties of a job notification.

NumberOfWorkers
Type: int

The number of workers of a defined WorkerType that are allocated when a job runs.

Role
Required: Yes
Type: string

The name or Amazon Resource Name (ARN) of the IAM role associated with this job.

SecurityConfiguration
Type: string

The name of the SecurityConfiguration structure to be used with this job.

SourceControlDetails
Type: SourceControlDetails structure

The details for a source control configuration for a job, allowing synchronization of job artifacts to or from a remote repository.

Tags
Type: Associative array of custom strings keys (TagKey) to strings

The tags to use with this job. You may use tags to limit access to the job. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.

Timeout
Type: int

The job timeout in minutes. This is the maximum time that a job run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours) for batch jobs.

Streaming jobs must have timeout values less than 7 days (10,080 minutes). When the value is left blank, the job is restarted after 7 days if you have not set up a maintenance window. If you have set up a maintenance window, the job is restarted during the maintenance window after 7 days.
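The timeout rules can be summarized in code. The sketch below is a hypothetical helper, not SDK behavior: it returns the documented batch default when no value is given, rejects streaming values of 10080 minutes or more, and returns null for a blank streaming timeout, where the job is restarted after 7 days instead.

```php
<?php
const DEFAULT_BATCH_TIMEOUT_MINUTES   = 2880;  // 48 hours
const STREAMING_TIMEOUT_LIMIT_MINUTES = 10080; // 7 days

/**
 * Resolves the timeout that applies to a job, in minutes.
 * A null result for a streaming job means "no explicit timeout".
 */
function resolveTimeoutMinutes(?int $timeout, bool $isStreaming): ?int
{
    if (!$isStreaming) {
        return $timeout ?? DEFAULT_BATCH_TIMEOUT_MINUTES;
    }
    if ($timeout !== null && $timeout >= STREAMING_TIMEOUT_LIMIT_MINUTES) {
        throw new InvalidArgumentException('Streaming timeout must be less than 10080 minutes.');
    }
    return $timeout;
}
```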

WorkerType
Type: string

The type of predefined worker that is allocated when a job runs. Accepts a value of G.1X, G.2X, G.4X, G.8X or G.025X for Spark jobs. Accepts the value Z.2X for Ray jobs.

  • For the G.1X worker type, each worker maps to 1 DPU (4 vCPUs, 16 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, as it offers a scalable and cost-effective way to run most jobs.

  • For the G.2X worker type, each worker maps to 2 DPU (8 vCPUs, 32 GB of memory) with 128GB disk (approximately 77GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, as it offers a scalable and cost-effective way to run most jobs.

  • For the G.4X worker type, each worker maps to 4 DPU (16 vCPUs, 64 GB of memory) with 256GB disk (approximately 235GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs in the following Amazon Web Services Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm).

  • For the G.8X worker type, each worker maps to 8 DPU (32 vCPUs, 128 GB of memory) with 512GB disk (approximately 487GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs, in the same Amazon Web Services Regions as supported for the G.4X worker type.

  • For the G.025X worker type, each worker maps to 0.25 DPU (2 vCPUs, 4 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.

  • For the Z.2X worker type, each worker maps to 2 M-DPU (8 vCPUs, 64 GB of memory) with 128 GB disk (approximately 120GB free), and provides up to 8 Ray workers based on the autoscaler.
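The per-worker figures listed above make it straightforward to estimate the total capacity of a Spark job from WorkerType and NumberOfWorkers. The lookup table and the totalCapacity helper below are a hypothetical sketch; Z.2X is left out because it is measured in M-DPU rather than DPU.

```php
<?php
// Per-worker resources for the Spark worker types, taken from the list above.
const WORKER_SPECS = [
    'G.025X' => ['dpu' => 0.25, 'vcpus' => 2,  'memoryGb' => 4],
    'G.1X'   => ['dpu' => 1.0,  'vcpus' => 4,  'memoryGb' => 16],
    'G.2X'   => ['dpu' => 2.0,  'vcpus' => 8,  'memoryGb' => 32],
    'G.4X'   => ['dpu' => 4.0,  'vcpus' => 16, 'memoryGb' => 64],
    'G.8X'   => ['dpu' => 8.0,  'vcpus' => 32, 'memoryGb' => 128],
];

/**
 * Multiplies the per-worker figures by the number of workers.
 */
function totalCapacity(string $workerType, int $numberOfWorkers): array
{
    if (!isset(WORKER_SPECS[$workerType])) {
        throw new InvalidArgumentException("Worker type not covered here: $workerType");
    }
    $spec = WORKER_SPECS[$workerType];
    return [
        'dpu'      => $spec['dpu'] * $numberOfWorkers,
        'vcpus'    => $spec['vcpus'] * $numberOfWorkers,
        'memoryGb' => $spec['memoryGb'] * $numberOfWorkers,
    ];
}
```

For instance, ten G.2X workers come to 20 DPU, 80 vCPUs, and 320 GB of memory.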

Result Syntax

[
    'Name' => '<string>',
]

Result Details

Members
Name
Type: string

The unique name that was provided for this job definition.

Errors

InvalidInputException:

The input provided was not valid.

IdempotentParameterMismatchException:

The same unique identifier was associated with two different records.

AlreadyExistsException:

A resource to be created or added already exists.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

ConcurrentModificationException:

Two processes are trying to modify a resource simultaneously.

CreateMLTransform

$result = $client->createMLTransform([/* ... */]);
$promise = $client->createMLTransformAsync([/* ... */]);

Creates a Glue machine learning transform. This operation creates the transform and all the necessary parameters to train it.

Call this operation as the first step in the process of using a machine learning transform (such as the FindMatches transform) for deduplicating data. You can provide an optional Description, in addition to the parameters that you want to use for your algorithm.

You must also specify certain parameters for the tasks that Glue runs on your behalf as part of learning from your data and creating a high-quality machine learning transform. These parameters include Role, and optionally, AllocatedCapacity, Timeout, and MaxRetries. For more information, see Jobs.

Parameter Syntax

$result = $client->createMLTransform([
    'Description' => '<string>',
    'GlueVersion' => '<string>',
    'InputRecordTables' => [ // REQUIRED
        [
            'AdditionalOptions' => ['<string>', ...],
            'CatalogId' => '<string>',
            'ConnectionName' => '<string>',
            'DatabaseName' => '<string>', // REQUIRED
            'TableName' => '<string>', // REQUIRED
        ],
        // ...
    ],
    'MaxCapacity' => <float>,
    'MaxRetries' => <integer>,
    'Name' => '<string>', // REQUIRED
    'NumberOfWorkers' => <integer>,
    'Parameters' => [ // REQUIRED
        'FindMatchesParameters' => [
            'AccuracyCostTradeoff' => <float>,
            'EnforceProvidedLabels' => true || false,
            'PrecisionRecallTradeoff' => <float>,
            'PrimaryKeyColumnName' => '<string>',
        ],
        'TransformType' => 'FIND_MATCHES', // REQUIRED
    ],
    'Role' => '<string>', // REQUIRED
    'Tags' => ['<string>', ...],
    'Timeout' => <integer>,
    'TransformEncryption' => [
        'MlUserDataEncryption' => [
            'KmsKeyId' => '<string>',
            'MlUserDataEncryptionMode' => 'DISABLED|SSE-KMS', // REQUIRED
        ],
        'TaskRunSecurityConfigurationName' => '<string>',
    ],
    'WorkerType' => 'Standard|G.1X|G.2X|G.025X|G.4X|G.8X|Z.2X',
]);
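A request that sets only the members marked REQUIRED above might be assembled as follows. This is a sketch: buildFindMatchesTransformParams is a hypothetical helper, the argument values are placeholders, and the call to the client is shown only as a comment because it needs a configured Aws\Glue\GlueClient.

```php
<?php
/**
 * Builds the smallest createMLTransform parameter array that the
 * syntax above allows: Name, Role, InputRecordTables, and Parameters.
 */
function buildFindMatchesTransformParams(
    string $name,
    string $role,
    string $databaseName,
    string $tableName
): array {
    return [
        'Name' => $name,
        'Role' => $role,
        'InputRecordTables' => [
            ['DatabaseName' => $databaseName, 'TableName' => $tableName],
        ],
        'Parameters' => [
            'TransformType' => 'FIND_MATCHES',
        ],
    ];
}

// Placeholder values for illustration only.
$params = buildFindMatchesTransformParams(
    'dedupe-customers',
    'ExampleGlueRole',
    'example_db',
    'customers'
);

// With a configured client (not executed here):
// $result = $client->createMLTransform($params);
// $transformId = $result['TransformId'];
```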

Parameter Details

Members
Description
Type: string

A description of the machine learning transform that is being defined. The default is an empty string.

GlueVersion
Type: string

This value determines which version of Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see Glue Versions in the developer guide.

InputRecordTables
Required: Yes
Type: Array of GlueTable structures

A list of Glue table definitions used by the transform.

MaxCapacity
Type: double

The number of Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType.

  • If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set.

  • If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set.

  • If WorkerType is set, then NumberOfWorkers is required (and vice versa).

  • MaxCapacity and NumberOfWorkers must both be at least 1.

When the WorkerType field is set to a value other than Standard, the MaxCapacity field is set automatically and becomes read-only.

MaxRetries
Type: int

The maximum number of times to retry a task for this transform after a task run fails.

Name
Required: Yes
Type: string

The unique name that you give the transform when you create it.

NumberOfWorkers
Type: int

The number of workers of a defined WorkerType that are allocated when this task runs.

If WorkerType is set, then NumberOfWorkers is required (and vice versa).

Parameters
Required: Yes
Type: TransformParameters structure

The algorithmic parameters that are specific to the transform type used. Conditionally dependent on the transform type.

Role
Required: Yes
Type: string

The name or Amazon Resource Name (ARN) of the IAM role with the required permissions. The required permissions include both Glue service role permissions to Glue resources, and Amazon S3 permissions required by the transform.

  • This role needs Glue service role permissions to allow access to resources in Glue. See Attach a Policy to IAM Users That Access Glue.

  • This role needs permission to your Amazon Simple Storage Service (Amazon S3) sources, targets, temporary directory, scripts, and any libraries used by the task run for this transform.

Tags
Type: Associative array of custom strings keys (TagKey) to strings

The tags to use with this machine learning transform. You may use tags to limit access to the machine learning transform. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.

Timeout
Type: int

The timeout of the task run for this transform in minutes. This is the maximum time that a task run for this transform can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

TransformEncryption
Type: TransformEncryption structure

The encryption-at-rest settings of the transform that apply to accessing user data. Machine learning transforms can access user data encrypted in Amazon S3 using KMS.

WorkerType
Type: string

The type of predefined worker that is allocated when this task runs. Accepts a value of Standard, G.1X, or G.2X.

  • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50GB disk, and 2 executors per worker.

  • For the G.1X worker type, each worker provides 4 vCPU, 16 GB of memory and a 64GB disk, and 1 executor per worker.

  • For the G.2X worker type, each worker provides 8 vCPU, 32 GB of memory and a 128GB disk, and 1 executor per worker.

MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType.

  • If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set.

  • If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set.

  • If WorkerType is set, then NumberOfWorkers is required (and vice versa).

  • MaxCapacity and NumberOfWorkers must both be at least 1.
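The capacity rules above can be checked on the client side before a request is sent. The following is a minimal sketch, assuming a hypothetical helper named validateCapacityOptions (it is not part of the SDK) that receives the same parameter array you would pass to the client:

```php
<?php
// Hypothetical helper (not part of the SDK): returns a list of rule violations
// for the MaxCapacity / NumberOfWorkers / WorkerType combination.
function validateCapacityOptions(array $params): array
{
    $errors = [];
    $hasMax = isset($params['MaxCapacity']);
    $hasWorkers = isset($params['NumberOfWorkers']);
    $hasType = isset($params['WorkerType']);

    // MaxCapacity is mutually exclusive with NumberOfWorkers and WorkerType.
    if ($hasMax && ($hasWorkers || $hasType)) {
        $errors[] = 'MaxCapacity cannot be combined with NumberOfWorkers or WorkerType.';
    }
    // WorkerType and NumberOfWorkers must be set together.
    if ($hasWorkers !== $hasType) {
        $errors[] = 'WorkerType and NumberOfWorkers must be set together.';
    }
    // Both numeric values must be at least 1.
    if ($hasMax && $params['MaxCapacity'] < 1) {
        $errors[] = 'MaxCapacity must be at least 1.';
    }
    if ($hasWorkers && $params['NumberOfWorkers'] < 1) {
        $errors[] = 'NumberOfWorkers must be at least 1.';
    }
    return $errors;
}

// A valid combination produces no errors.
$ok = validateCapacityOptions(['WorkerType' => 'G.1X', 'NumberOfWorkers' => 10]);
// Mixing MaxCapacity with WorkerType violates two of the rules.
$bad = validateCapacityOptions(['MaxCapacity' => 5.0, 'WorkerType' => 'G.1X']);
```

The service performs its own validation, so a check like this only gives earlier feedback.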

Result Syntax

[
    'TransformId' => '<string>',
]

Result Details

Members
TransformId
Type: string

A unique identifier that is generated for the transform.

Errors

AlreadyExistsException:

A resource to be created or added already exists.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.

AccessDeniedException:

Access to a resource was denied.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

IdempotentParameterMismatchException:

The same unique identifier was associated with two different records.

CreatePartition

$result = $client->createPartition([/* ... */]);
$promise = $client->createPartitionAsync([/* ... */]);

Creates a new partition.

Parameter Syntax

$result = $client->createPartition([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionInput' => [ // REQUIRED
        'LastAccessTime' => <integer || string || DateTime>,
        'LastAnalyzedTime' => <integer || string || DateTime>,
        'Parameters' => ['<string>', ...],
        'StorageDescriptor' => [
            'AdditionalLocations' => ['<string>', ...],
            'BucketColumns' => ['<string>', ...],
            'Columns' => [
                [
                    'Comment' => '<string>',
                    'Name' => '<string>', // REQUIRED
                    'Parameters' => ['<string>', ...],
                    'Type' => '<string>',
                ],
                // ...
            ],
            'Compressed' => true || false,
            'InputFormat' => '<string>',
            'Location' => '<string>',
            'NumberOfBuckets' => <integer>,
            'OutputFormat' => '<string>',
            'Parameters' => ['<string>', ...],
            'SchemaReference' => [
                'SchemaId' => [
                    'RegistryName' => '<string>',
                    'SchemaArn' => '<string>',
                    'SchemaName' => '<string>',
                ],
                'SchemaVersionId' => '<string>',
                'SchemaVersionNumber' => <integer>,
            ],
            'SerdeInfo' => [
                'Name' => '<string>',
                'Parameters' => ['<string>', ...],
                'SerializationLibrary' => '<string>',
            ],
            'SkewedInfo' => [
                'SkewedColumnNames' => ['<string>', ...],
                'SkewedColumnValueLocationMaps' => ['<string>', ...],
                'SkewedColumnValues' => ['<string>', ...],
            ],
            'SortColumns' => [
                [
                    'Column' => '<string>', // REQUIRED
                    'SortOrder' => <integer>, // REQUIRED
                ],
                // ...
            ],
            'StoredAsSubDirectories' => true || false,
        ],
        'Values' => ['<string>', ...],
    ],
    'TableName' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The Amazon Web Services account ID of the catalog in which the partition is to be created.

DatabaseName
Required: Yes
Type: string

The name of the metadata database in which the partition is to be created.

PartitionInput
Required: Yes
Type: PartitionInput structure

A PartitionInput structure defining the partition to be created.

TableName
Required: Yes
Type: string

The name of the metadata table in which the partition is to be created.
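As an illustration of the members above, the following sketch assembles a request array for createPartition. The helper name buildCreatePartitionParams, the database and table names, and the S3 path are hypothetical. The partition values are assumed to be listed in the same order as the table's partition keys.

```php
<?php
// Hypothetical helper: assembles the request array for createPartition.
function buildCreatePartitionParams(string $database, string $table, array $values, string $location): array
{
    return [
        'DatabaseName' => $database,
        'TableName' => $table,
        'PartitionInput' => [
            // Values is a list of strings, so every value is cast to a string.
            'Values' => array_map('strval', $values),
            'StorageDescriptor' => ['Location' => $location],
        ],
    ];
}

$params = buildCreatePartitionParams(
    'sales_db',
    'orders',
    [2024, 1],
    's3://example-bucket/orders/year=2024/month=1/'
);

// With a configured Aws\Glue\GlueClient in $client:
// $client->createPartition($params);
```

CatalogId is omitted here because it is not marked as required.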

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InvalidInputException:

The input provided was not valid.

AlreadyExistsException:

A resource to be created or added already exists.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

InternalServiceException:

An internal service error occurred.

EntityNotFoundException:

A specified entity does not exist.

OperationTimeoutException:

The operation timed out.

GlueEncryptionException:

An encryption operation failed.

CreatePartitionIndex

$result = $client->createPartitionIndex([/* ... */]);
$promise = $client->createPartitionIndexAsync([/* ... */]);

Creates a specified partition index in an existing table.

Parameter Syntax

$result = $client->createPartitionIndex([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionIndex' => [ // REQUIRED
        'IndexName' => '<string>', // REQUIRED
        'Keys' => ['<string>', ...], // REQUIRED
    ],
    'TableName' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The catalog ID where the table resides.

DatabaseName
Required: Yes
Type: string

Specifies the name of a database in which you want to create a partition index.

PartitionIndex
Required: Yes
Type: PartitionIndex structure

Specifies a PartitionIndex structure to create a partition index in an existing table.

TableName
Required: Yes
Type: string

Specifies the name of a table in which you want to create a partition index.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

AlreadyExistsException:

A resource to be created or added already exists.

InvalidInputException:

The input provided was not valid.

EntityNotFoundException:

A specified entity does not exist.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

GlueEncryptionException:

An encryption operation failed.
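One way to handle the errors listed above is to treat AlreadyExistsException as a non-fatal outcome and rethrow everything else. The sketch below assumes a hypothetical helper, createIndexIfAbsent, and a stand-in exception class so that it runs without the SDK installed. With the SDK, the thrown exception would normally be an Aws\Exception\AwsException, which provides getAwsErrorCode().

```php
<?php
// Hypothetical helper: runs $send($params) and reports whether the index was created.
// AlreadyExistsException is treated as "already present"; other errors are rethrown.
function createIndexIfAbsent(callable $send, array $params): bool
{
    try {
        $send($params);
        return true;
    } catch (\Throwable $e) {
        // The method_exists check lets this sketch work with any exception type.
        if (method_exists($e, 'getAwsErrorCode')
            && $e->getAwsErrorCode() === 'AlreadyExistsException') {
            return false;
        }
        throw $e;
    }
}

// Stand-in exception used only to exercise the helper without the SDK.
class StubGlueError extends \RuntimeException
{
    private $awsCode;

    public function __construct(string $awsCode)
    {
        parent::__construct($awsCode);
        $this->awsCode = $awsCode;
    }

    public function getAwsErrorCode(): string
    {
        return $this->awsCode;
    }
}

// Index and key names are placeholders.
$params = [
    'DatabaseName' => 'sales_db',
    'TableName' => 'orders',
    'PartitionIndex' => ['IndexName' => 'by_year_month', 'Keys' => ['year', 'month']],
];

$created = createIndexIfAbsent(function (array $p) {
    // Simulates a successful call.
}, $params);

$skipped = createIndexIfAbsent(function (array $p) {
    throw new StubGlueError('AlreadyExistsException');
}, $params);

// With a configured client:
// createIndexIfAbsent(function (array $p) use ($client) {
//     return $client->createPartitionIndex($p);
// }, $params);
```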

CreateRegistry

$result = $client->createRegistry([/* ... */]);
$promise = $client->createRegistryAsync([/* ... */]);

Creates a new registry which may be used to hold a collection of schemas.

Parameter Syntax

$result = $client->createRegistry([
    'Description' => '<string>',
    'RegistryName' => '<string>', // REQUIRED
    'Tags' => ['<string>', ...],
]);

Parameter Details

Members
Description
Type: string

A description of the registry. If description is not provided, there will not be any default value for this.

RegistryName
Required: Yes
Type: string

The name of the registry to create. It has a maximum length of 255 and may contain only letters, numbers, hyphens, underscores, dollar signs, or hash marks. No whitespace is allowed.

Tags
Type: Associative array of custom strings keys (TagKey) to strings

Amazon Web Services tags that contain a key value pair and may be searched by console, command line, or API.
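The naming rule for RegistryName can be approximated with a regular expression. This is a sketch only: the function name isValidRegistryName is hypothetical, and "letters" is assumed to mean ASCII letters, which the description above does not state explicitly.

```php
<?php
// Hypothetical helper: checks a name against the rule described for RegistryName
// (1 to 255 characters; letters, numbers, hyphen, underscore, dollar sign, hash mark).
function isValidRegistryName(string $name): bool
{
    // \A and \z anchor the whole string, so a trailing newline is rejected too.
    return preg_match('/\A[A-Za-z0-9_$#-]{1,255}\z/', $name) === 1;
}

$params = [
    'RegistryName' => 'example-registry_1',
    'Description' => 'Schemas for the orders pipeline',
    'Tags' => ['team' => 'data-platform'],
];
$valid = isValidRegistryName($params['RegistryName']);

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->createRegistry($params);
```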

Result Syntax

[
    'Description' => '<string>',
    'RegistryArn' => '<string>',
    'RegistryName' => '<string>',
    'Tags' => ['<string>', ...],
]

Result Details

Members
Description
Type: string

A description of the registry.

RegistryArn
Type: string

The Amazon Resource Name (ARN) of the newly created registry.

RegistryName
Type: string

The name of the registry.

Tags
Type: Associative array of custom strings keys (TagKey) to strings

The tags for the registry.

Errors

InvalidInputException:

The input provided was not valid.

AccessDeniedException:

Access to a resource was denied.

AlreadyExistsException:

A resource to be created or added already exists.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

ConcurrentModificationException:

Two processes are trying to modify a resource simultaneously.

InternalServiceException:

An internal service error occurred.

CreateSchema

$result = $client->createSchema([/* ... */]);
$promise = $client->createSchemaAsync([/* ... */]);

Creates a new schema set and registers the schema definition. Returns an error if the schema set already exists without actually registering the version.

When the schema set is created, a version checkpoint will be set to the first version. Compatibility mode "DISABLED" restricts any additional schema versions from being added after the first schema version. For all other compatibility modes, validation of compatibility settings will be applied only from the second version onwards when the RegisterSchemaVersion API is used.

When this API is called without a RegistryId, this will create an entry for a "default-registry" in the registry database tables, if it is not already present.

Parameter Syntax

$result = $client->createSchema([
    'Compatibility' => 'NONE|DISABLED|BACKWARD|BACKWARD_ALL|FORWARD|FORWARD_ALL|FULL|FULL_ALL',
    'DataFormat' => 'AVRO|JSON|PROTOBUF', // REQUIRED
    'Description' => '<string>',
    'RegistryId' => [
        'RegistryArn' => '<string>',
        'RegistryName' => '<string>',
    ],
    'SchemaDefinition' => '<string>',
    'SchemaName' => '<string>', // REQUIRED
    'Tags' => ['<string>', ...],
]);

Parameter Details

Members
Compatibility
Type: string

The compatibility mode of the schema. The possible values are:

  • NONE: No compatibility mode applies. You can use this choice in development scenarios or if you do not know the compatibility mode that you want to apply to schemas. Any new version added will be accepted without undergoing a compatibility check.

  • DISABLED: This compatibility choice prevents versioning for a particular schema. You can use this choice to prevent future versioning of a schema.

  • BACKWARD: This compatibility choice is recommended as it allows data receivers to read both the current and one previous schema version. This means that, for instance, a new schema version cannot drop data fields or change the type of these fields so that they can't be read by readers using the previous version.

  • BACKWARD_ALL: This compatibility choice allows data receivers to read both the current and all previous schema versions. You can use this choice when you need to delete fields or add optional fields, and check compatibility against all previous schema versions.

  • FORWARD: This compatibility choice allows data receivers to read both the current and one next schema version, but not necessarily later versions. You can use this choice when you need to add fields or delete optional fields, but only check compatibility against the last schema version.

  • FORWARD_ALL: This compatibility choice allows data receivers to read data written by producers of any new registered schema. You can use this choice when you need to add fields or delete optional fields, and check compatibility against all previous schema versions.

  • FULL: This compatibility choice allows data receivers to read data written by producers using the previous or next version of the schema, but not necessarily earlier or later versions. You can use this choice when you need to add or remove optional fields, but only check compatibility against the last schema version.

  • FULL_ALL: This compatibility choice allows data receivers to read data written by producers using all previous schema versions. You can use this choice when you need to add or remove optional fields, and check compatibility against all previous schema versions.

DataFormat
Required: Yes
Type: string

The data format of the schema definition. Currently AVRO, JSON and PROTOBUF are supported.

Description
Type: string

An optional description of the schema. If description is not provided, there will not be any automatic default value for this.

RegistryId
Type: RegistryId structure

This is a wrapper shape to contain the registry identity fields. If this is not provided, the default registry will be used. The ARN format for the same will be: arn:aws:glue:us-east-2:<customer id>:registry/default-registry:random-5-letter-id.

SchemaDefinition
Type: string

The schema definition using the DataFormat setting for SchemaName.

SchemaName
Required: Yes
Type: string

The name of the schema to create. It has a maximum length of 255 and may contain only letters, numbers, hyphens, underscores, dollar signs, or hash marks. No whitespace is allowed.

Tags
Type: Associative array of custom strings keys (TagKey) to strings

Amazon Web Services tags that contain a key value pair and may be searched by console, command line, or API. If specified, follows the Amazon Web Services tags-on-create pattern.
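The following sketch shows how the members above might fit together for an Avro schema. The registry name, schema name, and record fields are placeholders. SchemaDefinition is a string, so the Avro record is encoded with json_encode before it is added to the request.

```php
<?php
// Placeholder Avro record definition, encoded to a JSON string for SchemaDefinition.
$avroDefinition = json_encode([
    'type' => 'record',
    'name' => 'Order',
    'fields' => [
        ['name' => 'id', 'type' => 'string'],
        ['name' => 'amount', 'type' => 'double'],
    ],
]);

$params = [
    // Omitting RegistryId would select the default registry, as described above.
    'RegistryId' => ['RegistryName' => 'example-registry'],
    'SchemaName' => 'orders',
    'DataFormat' => 'AVRO',
    'Compatibility' => 'BACKWARD',
    'SchemaDefinition' => $avroDefinition,
];

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->createSchema($params);
// $schemaArn = $result['SchemaArn'];
```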

Result Syntax

[
    'Compatibility' => 'NONE|DISABLED|BACKWARD|BACKWARD_ALL|FORWARD|FORWARD_ALL|FULL|FULL_ALL',
    'DataFormat' => 'AVRO|JSON|PROTOBUF',
    'Description' => '<string>',
    'LatestSchemaVersion' => <integer>,
    'NextSchemaVersion' => <integer>,
    'RegistryArn' => '<string>',
    'RegistryName' => '<string>',
    'SchemaArn' => '<string>',
    'SchemaCheckpoint' => <integer>,
    'SchemaName' => '<string>',
    'SchemaStatus' => 'AVAILABLE|PENDING|DELETING',
    'SchemaVersionId' => '<string>',
    'SchemaVersionStatus' => 'AVAILABLE|PENDING|FAILURE|DELETING',
    'Tags' => ['<string>', ...],
]

Result Details

Members
Compatibility
Type: string

The schema compatibility mode.

DataFormat
Type: string

The data format of the schema definition. Currently AVRO, JSON and PROTOBUF are supported.

Description
Type: string

A description of the schema if specified when created.

LatestSchemaVersion
Type: long (int|float)

The latest version of the schema associated with the returned schema definition.

NextSchemaVersion
Type: long (int|float)

The next version of the schema associated with the returned schema definition.

RegistryArn
Type: string

The Amazon Resource Name (ARN) of the registry.

RegistryName
Type: string

The name of the registry.

SchemaArn
Type: string

The Amazon Resource Name (ARN) of the schema.

SchemaCheckpoint
Type: long (int|float)

The version number of the checkpoint (the last time the compatibility mode was changed).

SchemaName
Type: string

The name of the schema.

SchemaStatus
Type: string

The status of the schema.

SchemaVersionId
Type: string

The unique identifier of the first schema version.

SchemaVersionStatus
Type: string

The status of the first schema version created.

Tags
Type: Associative array of custom strings keys (TagKey) to strings

The tags for the schema.

Errors

InvalidInputException:

The input provided was not valid.

AccessDeniedException:

Access to a resource was denied.

EntityNotFoundException:

A specified entity does not exist.

AlreadyExistsException:

A resource to be created or added already exists.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

ConcurrentModificationException:

Two processes are trying to modify a resource simultaneously.

InternalServiceException:

An internal service error occurred.

CreateScript

$result = $client->createScript([/* ... */]);
$promise = $client->createScriptAsync([/* ... */]);

Transforms a directed acyclic graph (DAG) into code.

Parameter Syntax

$result = $client->createScript([
    'DagEdges' => [
        [
            'Source' => '<string>', // REQUIRED
            'Target' => '<string>', // REQUIRED
            'TargetParameter' => '<string>',
        ],
        // ...
    ],
    'DagNodes' => [
        [
            'Args' => [ // REQUIRED
                [
                    'Name' => '<string>', // REQUIRED
                    'Param' => true || false,
                    'Value' => '<string>', // REQUIRED
                ],
                // ...
            ],
            'Id' => '<string>', // REQUIRED
            'LineNumber' => <integer>,
            'NodeType' => '<string>', // REQUIRED
        ],
        // ...
    ],
    'Language' => 'PYTHON|SCALA',
]);

Parameter Details

Members
DagEdges
Type: Array of CodeGenEdge structures

A list of the edges in the DAG.

DagNodes
Type: Array of CodeGenNode structures

A list of the nodes in the DAG.

Language
Type: string

The programming language of the resulting code from the DAG.
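The sketch below builds a small DagNodes and DagEdges pair and checks it before sending. It assumes that each edge's Source and Target refer to the Id of a node, which the descriptions above do not spell out. The helper findDanglingEdges is hypothetical, and the node types and argument names are illustrative placeholders rather than a list of accepted values.

```php
<?php
// Hypothetical helper: returns the edges whose Source or Target is not the Id of any node.
function findDanglingEdges(array $nodes, array $edges): array
{
    $ids = array_column($nodes, 'Id');
    $dangling = [];
    foreach ($edges as $edge) {
        if (!in_array($edge['Source'], $ids, true) || !in_array($edge['Target'], $ids, true)) {
            $dangling[] = $edge;
        }
    }
    return $dangling;
}

// Node types and argument names here are illustrative placeholders.
$nodes = [
    ['Id' => 'source0', 'NodeType' => 'DataSource', 'Args' => [
        ['Name' => 'database', 'Value' => '"sales_db"'],
        ['Name' => 'table_name', 'Value' => '"orders"'],
    ]],
    ['Id' => 'sink0', 'NodeType' => 'DataSink', 'Args' => [
        ['Name' => 'path', 'Value' => '"s3://example-bucket/out/"'],
    ]],
];
$edges = [['Source' => 'source0', 'Target' => 'sink0']];

$params = ['DagNodes' => $nodes, 'DagEdges' => $edges, 'Language' => 'PYTHON'];
$dangling = findDanglingEdges($nodes, $edges);

// With a configured Aws\Glue\GlueClient in $client, and only if $dangling is empty:
// $result = $client->createScript($params);
```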

Result Syntax

[
    'PythonScript' => '<string>',
    'ScalaCode' => '<string>',
]

Result Details

Members
PythonScript
Type: string

The Python script generated from the DAG.

ScalaCode
Type: string

The Scala code generated from the DAG.

Errors

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

CreateSecurityConfiguration

$result = $client->createSecurityConfiguration([/* ... */]);
$promise = $client->createSecurityConfigurationAsync([/* ... */]);

Creates a new security configuration. A security configuration is a set of security properties that can be used by Glue. You can use a security configuration to encrypt data at rest. For information about using security configurations in Glue, see Encrypting Data Written by Crawlers, Jobs, and Development Endpoints.

Parameter Syntax

$result = $client->createSecurityConfiguration([
    'EncryptionConfiguration' => [ // REQUIRED
        'CloudWatchEncryption' => [
            'CloudWatchEncryptionMode' => 'DISABLED|SSE-KMS',
            'KmsKeyArn' => '<string>',
        ],
        'JobBookmarksEncryption' => [
            'JobBookmarksEncryptionMode' => 'DISABLED|CSE-KMS',
            'KmsKeyArn' => '<string>',
        ],
        'S3Encryption' => [
            [
                'KmsKeyArn' => '<string>',
                'S3EncryptionMode' => 'DISABLED|SSE-KMS|SSE-S3',
            ],
            // ...
        ],
    ],
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
EncryptionConfiguration
Required: Yes
Type: EncryptionConfiguration structure

The encryption configuration for the new security configuration.

Name
Required: Yes
Type: string

The name for the new security configuration.
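As a sketch of how the EncryptionConfiguration members combine, the hypothetical helper below enables KMS-based encryption for all three targets with a single key. The helper name, the configuration name, and the key ARN are placeholders.

```php
<?php
// Hypothetical helper: builds a createSecurityConfiguration request that uses one
// KMS key for Amazon S3, CloudWatch, and job bookmark encryption.
function buildSecurityConfigurationParams(string $name, string $kmsKeyArn): array
{
    return [
        'Name' => $name,
        'EncryptionConfiguration' => [
            // S3Encryption is a list of structures, as shown in the parameter syntax.
            'S3Encryption' => [
                ['S3EncryptionMode' => 'SSE-KMS', 'KmsKeyArn' => $kmsKeyArn],
            ],
            'CloudWatchEncryption' => [
                'CloudWatchEncryptionMode' => 'SSE-KMS',
                'KmsKeyArn' => $kmsKeyArn,
            ],
            'JobBookmarksEncryption' => [
                'JobBookmarksEncryptionMode' => 'CSE-KMS',
                'KmsKeyArn' => $kmsKeyArn,
            ],
        ],
    ];
}

$params = buildSecurityConfigurationParams(
    'example-security-config',
    'arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID'
);

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->createSecurityConfiguration($params);
```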

Result Syntax

[
    'CreatedTimestamp' => <DateTime>,
    'Name' => '<string>',
]

Result Details

Members
CreatedTimestamp
Type: timestamp (string|DateTime or anything parsable by strtotime)

The time at which the new security configuration was created.

Name
Type: string

The name assigned to the new security configuration.

Errors

AlreadyExistsException:

A resource to be created or added already exists.

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

CreateSession

$result = $client->createSession([/* ... */]);
$promise = $client->createSessionAsync([/* ... */]);

Creates a new session.

Parameter Syntax

$result = $client->createSession([
    'Command' => [ // REQUIRED
        'Name' => '<string>',
        'PythonVersion' => '<string>',
    ],
    'Connections' => [
        'Connections' => ['<string>', ...],
    ],
    'DefaultArguments' => ['<string>', ...],
    'Description' => '<string>',
    'GlueVersion' => '<string>',
    'Id' => '<string>', // REQUIRED
    'IdleTimeout' => <integer>,
    'MaxCapacity' => <float>,
    'NumberOfWorkers' => <integer>,
    'RequestOrigin' => '<string>',
    'Role' => '<string>', // REQUIRED
    'SecurityConfiguration' => '<string>',
    'Tags' => ['<string>', ...],
    'Timeout' => <integer>,
    'WorkerType' => 'Standard|G.1X|G.2X|G.025X|G.4X|G.8X|Z.2X',
]);

Parameter Details

Members
Command
Required: Yes
Type: SessionCommand structure

The SessionCommand that runs the job.

Connections
Type: ConnectionsList structure

The number of connections to use for the session.

DefaultArguments
Type: Associative array of custom strings keys (OrchestrationNameString) to strings

A map of key-value pairs. The maximum is 75 pairs.

Description
Type: string

The description of the session.

GlueVersion
Type: string

The Glue version determines the versions of Apache Spark and Python that Glue supports. The GlueVersion must be greater than 2.0.

Id
Required: Yes
Type: string

The ID of the session request.

IdleTimeout
Type: int

The number of minutes of idle time before the session times out. The default for Spark ETL jobs is the value of Timeout. Consult the documentation for other job types.

MaxCapacity
Type: double

The number of Glue data processing units (DPUs) that can be allocated when the job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB memory.

NumberOfWorkers
Type: int

The number of workers of a defined WorkerType to use for the session.

RequestOrigin
Type: string

The origin of the request.

Role
Required: Yes
Type: string

The IAM role ARN.

SecurityConfiguration
Type: string

The name of the SecurityConfiguration structure to be used with the session.

Tags
Type: Associative array of custom strings keys (TagKey) to strings

The map of key value pairs (tags) belonging to the session.

Timeout
Type: int

The number of minutes before the session times out. The default for Spark ETL jobs is 48 hours (2880 minutes), the maximum session lifetime for this job type. Consult the documentation for other job types.

WorkerType
Type: string

The type of predefined worker that is allocated when a job runs. Accepts a value of G.1X, G.2X, G.4X, or G.8X for Spark jobs. Accepts the value Z.2X for Ray notebooks.

  • For the G.1X worker type, each worker maps to 1 DPU (4 vCPUs, 16 GB of memory) with 84GB disk (approximately 34GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, as it offers a scalable and cost-effective way to run most jobs.

  • For the G.2X worker type, each worker maps to 2 DPU (8 vCPUs, 32 GB of memory) with 128GB disk (approximately 77GB free), and provides 1 executor per worker. We recommend this worker type for workloads such as data transforms, joins, and queries, as it offers a scalable and cost-effective way to run most jobs.

  • For the G.4X worker type, each worker maps to 4 DPU (16 vCPUs, 64 GB of memory) with 256GB disk (approximately 235GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs in the following Amazon Web Services Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm).

  • For the G.8X worker type, each worker maps to 8 DPU (32 vCPUs, 128 GB of memory) with 512GB disk (approximately 487GB free), and provides 1 executor per worker. We recommend this worker type for jobs whose workloads contain your most demanding transforms, aggregations, joins, and queries. This worker type is available only for Glue version 3.0 or later Spark ETL jobs, in the same Amazon Web Services Regions as supported for the G.4X worker type.

  • For the Z.2X worker type, each worker maps to 2 M-DPU (8 vCPUs, 64 GB of memory) with 128 GB disk (approximately 120GB free), and provides up to 8 Ray workers based on the autoscaler.
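The session members above can be assembled as in the following sketch. The helper buildCreateSessionParams is hypothetical, and so are the session ID and role ARN. The command name 'glueetl' and the GlueVersion value '4.0' are assumptions chosen for illustration. The helper also enforces the 75-pair limit stated for DefaultArguments.

```php
<?php
// Hypothetical helper: builds a createSession request for a small Spark session.
function buildCreateSessionParams(string $id, string $roleArn, array $defaultArguments = []): array
{
    // DefaultArguments is documented as holding at most 75 pairs.
    if (count($defaultArguments) > 75) {
        throw new \InvalidArgumentException('DefaultArguments accepts at most 75 pairs.');
    }
    $params = [
        'Id' => $id,
        'Role' => $roleArn,
        // Assumed command name and Python version, for illustration only.
        'Command' => ['Name' => 'glueetl', 'PythonVersion' => '3'],
        'GlueVersion' => '4.0',
        // WorkerType and NumberOfWorkers are given together; MaxCapacity is left out.
        'WorkerType' => 'G.1X',
        'NumberOfWorkers' => 2,
        'IdleTimeout' => 30,
    ];
    if ($defaultArguments !== []) {
        $params['DefaultArguments'] = $defaultArguments;
    }
    return $params;
}

$params = buildCreateSessionParams(
    'example-session-1',
    'arn:aws:iam::111122223333:role/ExampleGlueSessionRole'
);

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->createSession($params);
// $status = $result['Session']['Status'];
```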

Result Syntax

[
    'Session' => [
        'Command' => [
            'Name' => '<string>',
            'PythonVersion' => '<string>',
        ],
        'CompletedOn' => <DateTime>,
        'Connections' => [
            'Connections' => ['<string>', ...],
        ],
        'CreatedOn' => <DateTime>,
        'DPUSeconds' => <float>,
        'DefaultArguments' => ['<string>', ...],
        'Description' => '<string>',
        'ErrorMessage' => '<string>',
        'ExecutionTime' => <float>,
        'GlueVersion' => '<string>',
        'Id' => '<string>',
        'IdleTimeout' => <integer>,
        'MaxCapacity' => <float>,
        'NumberOfWorkers' => <integer>,
        'ProfileName' => '<string>',
        'Progress' => <float>,
        'Role' => '<string>',
        'SecurityConfiguration' => '<string>',
        'Status' => 'PROVISIONING|READY|FAILED|TIMEOUT|STOPPING|STOPPED',
        'WorkerType' => 'Standard|G.1X|G.2X|G.025X|G.4X|G.8X|Z.2X',
    ],
]

Result Details

Members
Session
Type: Session structure

Returns the session object in the response.

Errors

AccessDeniedException:

Access to a resource was denied.

IdempotentParameterMismatchException:

The same unique identifier was associated with two different records.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.

ValidationException:

A value could not be validated.

AlreadyExistsException:

A resource to be created or added already exists.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

CreateTable

$result = $client->createTable([/* ... */]);
$promise = $client->createTableAsync([/* ... */]);

Creates a new table definition in the Data Catalog.

Parameter Syntax

$result = $client->createTable([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'OpenTableFormatInput' => [
        'IcebergInput' => [
            'MetadataOperation' => 'CREATE', // REQUIRED
            'Version' => '<string>',
        ],
    ],
    'PartitionIndexes' => [
        [
            'IndexName' => '<string>', // REQUIRED
            'Keys' => ['<string>', ...], // REQUIRED
        ],
        // ...
    ],
    'TableInput' => [ // REQUIRED
        'Description' => '<string>',
        'LastAccessTime' => <integer || string || DateTime>,
        'LastAnalyzedTime' => <integer || string || DateTime>,
        'Name' => '<string>', // REQUIRED
        'Owner' => '<string>',
        'Parameters' => ['<string>', ...],
        'PartitionKeys' => [
            [
                'Comment' => '<string>',
                'Name' => '<string>', // REQUIRED
                'Parameters' => ['<string>', ...],
                'Type' => '<string>',
            ],
            // ...
        ],
        'Retention' => <integer>,
        'StorageDescriptor' => [
            'AdditionalLocations' => ['<string>', ...],
            'BucketColumns' => ['<string>', ...],
            'Columns' => [
                [
                    'Comment' => '<string>',
                    'Name' => '<string>', // REQUIRED
                    'Parameters' => ['<string>', ...],
                    'Type' => '<string>',
                ],
                // ...
            ],
            'Compressed' => true || false,
            'InputFormat' => '<string>',
            'Location' => '<string>',
            'NumberOfBuckets' => <integer>,
            'OutputFormat' => '<string>',
            'Parameters' => ['<string>', ...],
            'SchemaReference' => [
                'SchemaId' => [
                    'RegistryName' => '<string>',
                    'SchemaArn' => '<string>',
                    'SchemaName' => '<string>',
                ],
                'SchemaVersionId' => '<string>',
                'SchemaVersionNumber' => <integer>,
            ],
            'SerdeInfo' => [
                'Name' => '<string>',
                'Parameters' => ['<string>', ...],
                'SerializationLibrary' => '<string>',
            ],
            'SkewedInfo' => [
                'SkewedColumnNames' => ['<string>', ...],
                'SkewedColumnValueLocationMaps' => ['<string>', ...],
                'SkewedColumnValues' => ['<string>', ...],
            ],
            'SortColumns' => [
                [
                    'Column' => '<string>', // REQUIRED
                    'SortOrder' => <integer>, // REQUIRED
                ],
                // ...
            ],
            'StoredAsSubDirectories' => true || false,
        ],
        'TableType' => '<string>',
        'TargetTable' => [
            'CatalogId' => '<string>',
            'DatabaseName' => '<string>',
            'Name' => '<string>',
            'Region' => '<string>',
        ],
        'ViewDefinition' => [
            'Definer' => '<string>',
            'IsProtected' => true || false,
            'Representations' => [
                [
                    'Dialect' => 'REDSHIFT|ATHENA|SPARK',
                    'DialectVersion' => '<string>',
                    'ValidationConnection' => '<string>',
                    'ViewExpandedText' => '<string>',
                    'ViewOriginalText' => '<string>',
                ],
                // ...
            ],
            'SubObjects' => ['<string>', ...],
        ],
        'ViewExpandedText' => '<string>',
        'ViewOriginalText' => '<string>',
    ],
    'TransactionId' => '<string>',
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog in which to create the Table. If none is supplied, the Amazon Web Services account ID is used by default.

DatabaseName
Required: Yes
Type: string

The catalog database in which to create the new table. For Hive compatibility, this name is entirely lowercase.

OpenTableFormatInput
Type: OpenTableFormatInput structure

Specifies an OpenTableFormatInput structure when creating an open format table.

PartitionIndexes
Type: Array of PartitionIndex structures

A list of partition indexes, PartitionIndex structures, to create in the table.

TableInput
Required: Yes
Type: TableInput structure

The TableInput object that defines the metadata table to create in the catalog.

TransactionId
Type: string

The ID of the transaction.
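The sketch below assembles a createTable request for an external Parquet table. The helper name, the column and partition key names, and the S3 location are placeholders. The Hive class names for Parquet are supplied as an assumption about a typical setup. The database name is lowercased for Hive compatibility, as noted above, and lowercasing the table name as well is an additional assumption.

```php
<?php
// Hypothetical helper: builds a createTable request for an external Parquet table.
function buildCreateTableParams(string $database, string $table, string $location): array
{
    return [
        // Names are lowercased for Hive compatibility.
        'DatabaseName' => strtolower($database),
        'TableInput' => [
            'Name' => strtolower($table),
            'TableType' => 'EXTERNAL_TABLE',
            'PartitionKeys' => [
                ['Name' => 'year', 'Type' => 'int'],
            ],
            'StorageDescriptor' => [
                'Columns' => [
                    ['Name' => 'order_id', 'Type' => 'string'],
                    ['Name' => 'amount', 'Type' => 'double'],
                ],
                'Location' => $location,
                // Assumed Hive classes for Parquet data.
                'InputFormat' => 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat',
                'OutputFormat' => 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat',
                'SerdeInfo' => [
                    'SerializationLibrary' => 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe',
                ],
            ],
        ],
    ];
}

$params = buildCreateTableParams('Sales_DB', 'Orders', 's3://example-bucket/orders/');

// With a configured Aws\Glue\GlueClient in $client:
// $client->createTable($params);
```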

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

AlreadyExistsException:

A resource to be created or added already exists.

InvalidInputException:

The input provided was not valid.

EntityNotFoundException:

A specified entity does not exist.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

GlueEncryptionException:

An encryption operation failed.

ConcurrentModificationException:

Two processes are trying to modify a resource simultaneously.

ResourceNotReadyException:

A resource was not ready for a transaction.

CreateTableOptimizer

$result = $client->createTableOptimizer([/* ... */]);
$promise = $client->createTableOptimizerAsync([/* ... */]);

Creates a new table optimizer for a specific function.

Parameter Syntax

$result = $client->createTableOptimizer([
    'CatalogId' => '<string>', // REQUIRED
    'DatabaseName' => '<string>', // REQUIRED
    'TableName' => '<string>', // REQUIRED
    'TableOptimizerConfiguration' => [ // REQUIRED
        'enabled' => true || false,
        'orphanFileDeletionConfiguration' => [
            'icebergConfiguration' => [
                'location' => '<string>',
                'orphanFileRetentionPeriodInDays' => <integer>,
            ],
        ],
        'retentionConfiguration' => [
            'icebergConfiguration' => [
                'cleanExpiredFiles' => true || false,
                'numberOfSnapshotsToRetain' => <integer>,
                'snapshotRetentionPeriodInDays' => <integer>,
            ],
        ],
        'roleArn' => '<string>',
        'vpcConfiguration' => [
            'glueConnectionName' => '<string>',
        ],
    ],
    'Type' => 'compaction|retention|orphan_file_deletion', // REQUIRED
]);

Parameter Details

Members
CatalogId
Required: Yes
Type: string

The Catalog ID of the table.

DatabaseName
Required: Yes
Type: string

The name of the database in the catalog in which the table resides.

TableName
Required: Yes
Type: string

The name of the table.

TableOptimizerConfiguration
Required: Yes
Type: TableOptimizerConfiguration structure

A TableOptimizerConfiguration object representing the configuration of a table optimizer.

Type
Required: Yes
Type: string

The type of table optimizer.
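
As a rough illustration of how the members above fit together, the following sketch assembles a request for a snapshot-retention optimizer. The helper name buildRetentionOptimizerParams, the account ID, the role ARN, the database and table names, and the retention values are all placeholders chosen for this example; only the array keys are taken from the parameter syntax above.

```php
<?php
declare(strict_types=1);

/**
 * Builds the parameter array for createTableOptimizer() with Type "retention".
 * Hypothetical helper: the function name and default values are illustrative only.
 */
function buildRetentionOptimizerParams(
    string $catalogId,
    string $databaseName,
    string $tableName,
    string $roleArn,
    int $snapshotRetentionDays = 5,
    int $snapshotsToRetain = 1
): array {
    return [
        'CatalogId' => $catalogId,
        'DatabaseName' => $databaseName,
        'TableName' => $tableName,
        'Type' => 'retention',
        'TableOptimizerConfiguration' => [
            'enabled' => true,
            'roleArn' => $roleArn,
            'retentionConfiguration' => [
                'icebergConfiguration' => [
                    'cleanExpiredFiles' => true,
                    'numberOfSnapshotsToRetain' => $snapshotsToRetain,
                    'snapshotRetentionPeriodInDays' => $snapshotRetentionDays,
                ],
            ],
        ],
    ];
}

$params = buildRetentionOptimizerParams(
    '123456789012',                                   // placeholder account ID
    'sales_db',                                       // placeholder database name
    'orders',                                         // placeholder table name
    'arn:aws:iam::123456789012:role/ExampleGlueRole'  // placeholder role ARN
);

// With a configured Aws\Glue\GlueClient in $client:
// $client->createTableOptimizer($params);
```

Keeping the array construction in a plain function makes the request shape easy to inspect before it is sent.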

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

ValidationException:

A value could not be validated.

InvalidInputException:

The input provided was not valid.

AccessDeniedException:

Access to a resource was denied.

AlreadyExistsException:

A resource to be created or added already exists.

InternalServiceException:

An internal service error occurred.

ThrottlingException:

The throttling threshold was exceeded.
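
One possible way to react to the errors listed above is to treat AlreadyExistsException as "nothing to do" and rethrow everything else. The sketch below assumes that approach. createIfAbsent is a hypothetical helper, not part of the SDK. The error code is read through Aws\Exception\AwsException::getAwsErrorCode(), and the extraction is passed in as a callable so that the helper itself has no SDK dependency.

```php
<?php
declare(strict_types=1);

/**
 * Runs $create and reports whether the resource was newly created.
 * Returns false when the call fails with AlreadyExistsException; any other
 * failure is rethrown. $errorCodeOf maps a Throwable to an error code string.
 * Hypothetical helper, not part of the SDK.
 */
function createIfAbsent(callable $create, callable $errorCodeOf): bool
{
    try {
        $create();
        return true;
    } catch (\Throwable $e) {
        if ($errorCodeOf($e) === 'AlreadyExistsException') {
            return false;
        }
        throw $e;
    }
}

// Extracts the service error code from SDK exceptions; null for anything else.
$errorCodeOf = static function (\Throwable $e): ?string {
    return $e instanceof \Aws\Exception\AwsException ? $e->getAwsErrorCode() : null;
};

// With a configured Aws\Glue\GlueClient in $client and a parameter array in $params:
// $created = createIfAbsent(
//     static function () use ($client, $params) { $client->createTableOptimizer($params); },
//     $errorCodeOf
// );
```

Whether an existing optimizer should count as success depends on the caller; if the existing configuration may differ from the requested one, rethrowing is the safer choice.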

CreateTrigger

$result = $client->createTrigger([/* ... */]);
$promise = $client->createTriggerAsync([/* ... */]);

Creates a new trigger.

Parameter Syntax

$result = $client->createTrigger([
    'Actions' => [ // REQUIRED
        [
            'Arguments' => ['<string>', ...],
            'CrawlerName' => '<string>',
            'JobName' => '<string>',
            'NotificationProperty' => [
                'NotifyDelayAfter' => <integer>,
            ],
            'SecurityConfiguration' => '<string>',
            'Timeout' => <integer>,
        ],
        // ...
    ],
    'Description' => '<string>',
    'EventBatchingCondition' => [
        'BatchSize' => <integer>, // REQUIRED
        'BatchWindow' => <integer>,
    ],
    'Name' => '<string>', // REQUIRED
    'Predicate' => [
        'Conditions' => [
            [
                'CrawlState' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED|ERROR',
                'CrawlerName' => '<string>',
                'JobName' => '<string>',
                'LogicalOperator' => 'EQUALS',
                'State' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT|ERROR|WAITING|EXPIRED',
            ],
            // ...
        ],
        'Logical' => 'AND|ANY',
    ],
    'Schedule' => '<string>',
    'StartOnCreation' => true || false,
    'Tags' => ['<string>', ...],
    'Type' => 'SCHEDULED|CONDITIONAL|ON_DEMAND|EVENT', // REQUIRED
    'WorkflowName' => '<string>',
]);

Parameter Details

Members
Actions
Required: Yes
Type: Array of Action structures

The actions initiated by this trigger when it fires.

Description
Type: string

A description of the new trigger.

EventBatchingCondition
Type: EventBatchingCondition structure

Batch condition that must be met (specified number of events received or batch time window expired) before the EventBridge event trigger fires.

Name
Required: Yes
Type: string

The name of the trigger.

Predicate
Type: Predicate structure

A predicate to specify when the new trigger should fire.

This field is required when the trigger type is CONDITIONAL.

Schedule
Type: string

A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).

This field is required when the trigger type is SCHEDULED.

StartOnCreation
Type: boolean

Set to true to start SCHEDULED and CONDITIONAL triggers when created. True is not supported for ON_DEMAND triggers.

Tags
Type: Associative array of custom strings keys (TagKey) to strings

The tags to use with this trigger. You may use tags to limit access to the trigger. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.

Type
Required: Yes
Type: string

The type of the new trigger.

WorkflowName
Type: string

The name of the workflow associated with the trigger.
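
The type-dependent rules described above (Schedule for SCHEDULED, Predicate for CONDITIONAL, no StartOnCreation for ON_DEMAND) can be checked locally before the request is sent. The following is a minimal sketch of such a check. checkTriggerParams is a hypothetical helper, the trigger and job names are placeholders, and the service still performs its own validation.

```php
<?php
declare(strict_types=1);

/**
 * Returns a list of problems found in a createTrigger() parameter array,
 * based only on the type-dependent rules documented above.
 * Hypothetical helper; the service performs its own validation as well.
 */
function checkTriggerParams(array $params): array
{
    $problems = [];
    $type = $params['Type'] ?? null;

    if ($type === 'SCHEDULED' && empty($params['Schedule'])) {
        $problems[] = 'Schedule is required when Type is SCHEDULED.';
    }
    if ($type === 'CONDITIONAL' && empty($params['Predicate'])) {
        $problems[] = 'Predicate is required when Type is CONDITIONAL.';
    }
    if ($type === 'ON_DEMAND' && ($params['StartOnCreation'] ?? false) === true) {
        $problems[] = 'StartOnCreation = true is not supported for ON_DEMAND triggers.';
    }

    return $problems;
}

$params = [
    'Name' => 'daily-orders-trigger',       // placeholder trigger name
    'Type' => 'SCHEDULED',
    'Schedule' => 'cron(15 12 * * ? *)',    // every day at 12:15 UTC
    'StartOnCreation' => true,
    'Actions' => [
        ['JobName' => 'orders-etl'],        // placeholder job name
    ],
];

$problems = checkTriggerParams($params);

// If $problems is empty, with a configured Aws\Glue\GlueClient in $client:
// $result = $client->createTrigger($params);
// echo $result['Name'];
```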

Result Syntax

[
    'Name' => '<string>',
]

Result Details

Members
Name
Type: string

The name of the trigger.

Errors

AlreadyExistsException:

A resource to be created or added already exists.

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

IdempotentParameterMismatchException:

The same unique identifier was associated with two different records.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

ConcurrentModificationException:

Two processes are trying to modify a resource simultaneously.

CreateUsageProfile

$result = $client->createUsageProfile([/* ... */]);
$promise = $client->createUsageProfileAsync([/* ... */]);

Creates a Glue usage profile.

Parameter Syntax

$result = $client->createUsageProfile([
    'Configuration' => [ // REQUIRED
        'JobConfiguration' => [
            '<NameString>' => [
                'AllowedValues' => ['<string>', ...],
                'DefaultValue' => '<string>',
                'MaxValue' => '<string>',
                'MinValue' => '<string>',
            ],
            // ...
        ],
        'SessionConfiguration' => [
            '<NameString>' => [
                'AllowedValues' => ['<string>', ...],
                'DefaultValue' => '<string>',
                'MaxValue' => '<string>',
                'MinValue' => '<string>',
            ],
            // ...
        ],
    ],
    'Description' => '<string>',
    'Name' => '<string>', // REQUIRED
    'Tags' => ['<string>', ...],
]);

Parameter Details

Members
Configuration
Required: Yes
Type: ProfileConfiguration structure

A ProfileConfiguration object specifying the job and session values for the profile.

Description
Type: string

A description of the usage profile.

Name
Required: Yes
Type: string

The name of the usage profile.

Tags
Type: Associative array of custom strings keys (TagKey) to strings

A list of tags applied to the usage profile.

Result Syntax

[
    'Name' => '<string>',
]

Result Details

Members
Name
Type: string

The name of the usage profile that was created.

Errors

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

AlreadyExistsException:

A resource to be created or added already exists.

OperationTimeoutException:

The operation timed out.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

OperationNotSupportedException:

The operation is not available in the region.

CreateUserDefinedFunction

$result = $client->createUserDefinedFunction([/* ... */]);
$promise = $client->createUserDefinedFunctionAsync([/* ... */]);

Creates a new function definition in the Data Catalog.

Parameter Syntax

$result = $client->createUserDefinedFunction([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'FunctionInput' => [ // REQUIRED
        'ClassName' => '<string>',
        'FunctionName' => '<string>',
        'OwnerName' => '<string>',
        'OwnerType' => 'USER|ROLE|GROUP',
        'ResourceUris' => [
            [
                'ResourceType' => 'JAR|FILE|ARCHIVE',
                'Uri' => '<string>',
            ],
            // ...
        ],
    ],
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog in which to create the function. If none is provided, the Amazon Web Services account ID is used by default.

DatabaseName
Required: Yes
Type: string

The name of the catalog database in which to create the function.

FunctionInput
Required: Yes
Type: UserDefinedFunctionInput structure

A FunctionInput object that defines the function to create in the Data Catalog.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

AlreadyExistsException:

A resource to be created or added already exists.

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

EntityNotFoundException:

A specified entity does not exist.

OperationTimeoutException:

The operation timed out.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

GlueEncryptionException:

An encryption operation failed.

CreateWorkflow

$result = $client->createWorkflow([/* ... */]);
$promise = $client->createWorkflowAsync([/* ... */]);

Creates a new workflow.

Parameter Syntax

$result = $client->createWorkflow([
    'DefaultRunProperties' => ['<string>', ...],
    'Description' => '<string>',
    'MaxConcurrentRuns' => <integer>,
    'Name' => '<string>', // REQUIRED
    'Tags' => ['<string>', ...],
]);

Parameter Details

Members
DefaultRunProperties
Type: Associative array of custom strings keys (IdString) to strings

A collection of properties to be used as part of each execution of the workflow.

Description
Type: string

A description of the workflow.

MaxConcurrentRuns
Type: int

You can use this parameter to prevent unwanted multiple updates to data, to control costs, or in some cases, to prevent exceeding the maximum number of concurrent runs of any of the component jobs. If you leave this parameter blank, there is no limit to the number of concurrent workflow runs.

Name
Required: Yes
Type: string

The name to be assigned to the workflow. It should be unique within your account.

Tags
Type: Associative array of custom strings keys (TagKey) to strings

The tags to be used with this workflow.
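
DefaultRunProperties and Tags are both associative arrays of string keys to string values, which the parameter syntax above abbreviates as ['<string>', ...]. A small sketch of a complete request follows; every name and value in it is a placeholder invented for this example.

```php
<?php
declare(strict_types=1);

// All names and values below are placeholders.
$params = [
    'Name' => 'nightly-orders-workflow',
    'Description' => 'Loads and validates order data.',
    'MaxConcurrentRuns' => 1,        // omit this key to leave concurrent runs unlimited
    'DefaultRunProperties' => [
        'target_env' => 'staging',   // string key => string value
        'run_mode' => 'full',
    ],
    'Tags' => [
        'team' => 'data-platform',   // tag key => tag value
    ],
];

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->createWorkflow($params);
// echo $result['Name'];
```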

Result Syntax

[
    'Name' => '<string>',
]

Result Details

Members
Name
Type: string

The name of the workflow which was provided as part of the request.

Errors

AlreadyExistsException:

A resource to be created or added already exists.

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

ResourceNumberLimitExceededException:

A resource numerical limit was exceeded.

ConcurrentModificationException:

Two processes are trying to modify a resource simultaneously.

DeleteBlueprint

$result = $client->deleteBlueprint([/* ... */]);
$promise = $client->deleteBlueprintAsync([/* ... */]);

Deletes an existing blueprint.

Parameter Syntax

$result = $client->deleteBlueprint([
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
Name
Required: Yes
Type: string

The name of the blueprint to delete.

Result Syntax

[
    'Name' => '<string>',
]

Result Details

Members
Name
Type: string

Returns the name of the blueprint that was deleted.

Errors

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.

DeleteClassifier

$result = $client->deleteClassifier([/* ... */]);
$promise = $client->deleteClassifierAsync([/* ... */]);

Removes a classifier from the Data Catalog.

Parameter Syntax

$result = $client->deleteClassifier([
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
Name
Required: Yes
Type: string

Name of the classifier to remove.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

OperationTimeoutException:

The operation timed out.

DeleteColumnStatisticsForPartition

$result = $client->deleteColumnStatisticsForPartition([/* ... */]);
$promise = $client->deleteColumnStatisticsForPartitionAsync([/* ... */]);

Deletes the partition column statistics of a column.

The Identity and Access Management (IAM) permission required for this operation is DeletePartition.

Parameter Syntax

$result = $client->deleteColumnStatisticsForPartition([
    'CatalogId' => '<string>',
    'ColumnName' => '<string>', // REQUIRED
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionValues' => ['<string>', ...], // REQUIRED
    'TableName' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.

ColumnName
Required: Yes
Type: string

Name of the column.

DatabaseName
Required: Yes
Type: string

The name of the catalog database where the partitions reside.

PartitionValues
Required: Yes
Type: Array of strings

A list of partition values identifying the partition.

TableName
Required: Yes
Type: string

The name of the partitions' table.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

GlueEncryptionException:

An encryption operation failed.

DeleteColumnStatisticsForTable

$result = $client->deleteColumnStatisticsForTable([/* ... */]);
$promise = $client->deleteColumnStatisticsForTableAsync([/* ... */]);

Deletes table statistics of columns.

The Identity and Access Management (IAM) permission required for this operation is DeleteTable.

Parameter Syntax

$result = $client->deleteColumnStatisticsForTable([
    'CatalogId' => '<string>',
    'ColumnName' => '<string>', // REQUIRED
    'DatabaseName' => '<string>', // REQUIRED
    'TableName' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.

ColumnName
Required: Yes
Type: string

The name of the column.

DatabaseName
Required: Yes
Type: string

The name of the catalog database where the partitions reside.

TableName
Required: Yes
Type: string

The name of the partitions' table.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

GlueEncryptionException:

An encryption operation failed.

DeleteColumnStatisticsTaskSettings

$result = $client->deleteColumnStatisticsTaskSettings([/* ... */]);
$promise = $client->deleteColumnStatisticsTaskSettingsAsync([/* ... */]);

Deletes settings for a column statistics task.

Parameter Syntax

$result = $client->deleteColumnStatisticsTaskSettings([
    'DatabaseName' => '<string>', // REQUIRED
    'TableName' => '<string>', // REQUIRED
]);

Parameter Details

Members
DatabaseName
Required: Yes
Type: string

The name of the database where the table resides.

TableName
Required: Yes
Type: string

The name of the table for which to delete column statistics.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

DeleteConnection

$result = $client->deleteConnection([/* ... */]);
$promise = $client->deleteConnectionAsync([/* ... */]);

Deletes a connection from the Data Catalog.

Parameter Syntax

$result = $client->deleteConnection([
    'CatalogId' => '<string>',
    'ConnectionName' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog in which the connection resides. If none is provided, the Amazon Web Services account ID is used by default.

ConnectionName
Required: Yes
Type: string

The name of the connection to delete.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

OperationTimeoutException:

The operation timed out.

DeleteCrawler

$result = $client->deleteCrawler([/* ... */]);
$promise = $client->deleteCrawlerAsync([/* ... */]);

Removes a specified crawler from the Glue Data Catalog, unless the crawler state is RUNNING.

Parameter Syntax

$result = $client->deleteCrawler([
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
Name
Required: Yes
Type: string

The name of the crawler to remove.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

CrawlerRunningException:

The operation cannot be performed because the crawler is already running.

SchedulerTransitioningException:

The specified scheduler is transitioning.

OperationTimeoutException:

The operation timed out.

DeleteCustomEntityType

$result = $client->deleteCustomEntityType([/* ... */]);
$promise = $client->deleteCustomEntityTypeAsync([/* ... */]);

Deletes a custom pattern by specifying its name.

Parameter Syntax

$result = $client->deleteCustomEntityType([
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
Name
Required: Yes
Type: string

The name of the custom pattern that you want to delete.

Result Syntax

[
    'Name' => '<string>',
]

Result Details

Members
Name
Type: string

The name of the custom pattern you deleted.

Errors

EntityNotFoundException:

A specified entity does not exist.

AccessDeniedException:

Access to a resource was denied.

InternalServiceException:

An internal service error occurred.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

DeleteDataQualityRuleset

$result = $client->deleteDataQualityRuleset([/* ... */]);
$promise = $client->deleteDataQualityRulesetAsync([/* ... */]);

Deletes a data quality ruleset.

Parameter Syntax

$result = $client->deleteDataQualityRuleset([
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
Name
Required: Yes
Type: string

A name for the data quality ruleset.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.

DeleteDatabase

$result = $client->deleteDatabase([/* ... */]);
$promise = $client->deleteDatabaseAsync([/* ... */]);

Removes a specified database from a Data Catalog.

After completing this operation, you no longer have access to the tables (and all table versions and partitions that might belong to the tables) and the user-defined functions in the deleted database. Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.

To ensure the immediate deletion of all related resources, before calling DeleteDatabase, use DeleteTableVersion or BatchDeleteTableVersion, DeletePartition or BatchDeletePartition, DeleteUserDefinedFunction, and DeleteTable or BatchDeleteTable, to delete any resources that belong to the database.

Parameter Syntax

$result = $client->deleteDatabase([
    'CatalogId' => '<string>',
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog in which the database resides. If none is provided, the Amazon Web Services account ID is used by default.

Name
Required: Yes
Type: string

The name of the database to delete. For Hive compatibility, this must be all lowercase.
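
The cleanup order recommended in the description above can be sketched as a small helper that removes dependent resources first and the database last. This is only a partial illustration. deleteDatabaseWithContents is a hypothetical function, the caller supplies the table and function names, the batch size of 100 is an assumption about the BatchDeleteTable limit, and partitions and table versions are not handled here.

```php
<?php
declare(strict_types=1);

/**
 * Deletes the given tables and user-defined functions, then the database.
 * $client is expected to behave like Aws\Glue\GlueClient.
 * Hypothetical helper: partitions and table versions are NOT handled here.
 */
function deleteDatabaseWithContents(
    $client,
    string $database,
    array $tableNames,
    array $functionNames
): void {
    // Assumption: BatchDeleteTable accepts at most 100 table names per call.
    foreach (array_chunk($tableNames, 100) as $chunk) {
        $result = $client->batchDeleteTable([
            'DatabaseName' => $database,
            'TablesToDelete' => $chunk,
        ]);
        // Stop before deleting the database if any table could not be removed.
        if (!empty($result['Errors'])) {
            throw new \RuntimeException('Some tables could not be deleted.');
        }
    }

    foreach ($functionNames as $functionName) {
        $client->deleteUserDefinedFunction([
            'DatabaseName' => $database,
            'FunctionName' => $functionName,
        ]);
    }

    // The database goes last, once its known contents are gone.
    $client->deleteDatabase(['Name' => $database]);
}

// With a configured Aws\Glue\GlueClient in $client (names are placeholders):
// deleteDatabaseWithContents($client, 'sales_db', ['orders', 'customers'], ['normalize_sku']);
```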

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

ConcurrentModificationException:

Two processes are trying to modify a resource simultaneously.

DeleteDevEndpoint

$result = $client->deleteDevEndpoint([/* ... */]);
$promise = $client->deleteDevEndpointAsync([/* ... */]);

Deletes a specified development endpoint.

Parameter Syntax

$result = $client->deleteDevEndpoint([
    'EndpointName' => '<string>', // REQUIRED
]);

Parameter Details

Members
EndpointName
Required: Yes
Type: string

The name of the DevEndpoint.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.

DeleteJob

$result = $client->deleteJob([/* ... */]);
$promise = $client->deleteJobAsync([/* ... */]);

Deletes a specified job definition. If the job definition is not found, no exception is thrown.

Parameter Syntax

$result = $client->deleteJob([
    'JobName' => '<string>', // REQUIRED
]);

Parameter Details

Members
JobName
Required: Yes
Type: string

The name of the job definition to delete.

Result Syntax

[
    'JobName' => '<string>',
]

Result Details

Members
JobName
Type: string

The name of the job definition that was deleted.

Errors

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

DeleteMLTransform

$result = $client->deleteMLTransform([/* ... */]);
$promise = $client->deleteMLTransformAsync([/* ... */]);

Deletes a Glue machine learning transform. Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed by learning from examples provided by humans. These transformations are then saved by Glue. If you no longer need a transform, you can delete it by calling DeleteMLTransform. However, any Glue jobs that still reference the deleted transform will no longer succeed.

Parameter Syntax

$result = $client->deleteMLTransform([
    'TransformId' => '<string>', // REQUIRED
]);

Parameter Details

Members
TransformId
Required: Yes
Type: string

The unique identifier of the transform to delete.

Result Syntax

[
    'TransformId' => '<string>',
]

Result Details

Members
TransformId
Type: string

The unique identifier of the transform that was deleted.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.

DeletePartition

$result = $client->deletePartition([/* ... */]);
$promise = $client->deletePartitionAsync([/* ... */]);

Deletes a specified partition.

Parameter Syntax

$result = $client->deletePartition([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionValues' => ['<string>', ...], // REQUIRED
    'TableName' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog where the partition to be deleted resides. If none is provided, the Amazon Web Services account ID is used by default.

DatabaseName
Required: Yes
Type: string

The name of the catalog database in which the table in question resides.

PartitionValues
Required: Yes
Type: Array of strings

The values that define the partition.

TableName
Required: Yes
Type: string

The name of the table that contains the partition to be deleted.
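
PartitionValues is a plain list, so the order of its entries matters. The sketch below assumes the values must appear in the same order as the table's partition keys and builds the list from a key-to-value map. partitionValuesFor is a hypothetical helper, and the database, table, and key names are placeholders.

```php
<?php
declare(strict_types=1);

/**
 * Returns partition values as strings, ordered like $partitionKeys.
 * Assumption: the service expects values in partition-key order.
 * Hypothetical helper, not part of the SDK.
 */
function partitionValuesFor(array $partitionKeys, array $valuesByKey): array
{
    $values = [];
    foreach ($partitionKeys as $key) {
        if (!array_key_exists($key, $valuesByKey)) {
            throw new \InvalidArgumentException("Missing value for partition key '$key'.");
        }
        $values[] = (string) $valuesByKey[$key];
    }
    return $values;
}

$params = [
    'DatabaseName' => 'sales_db',   // placeholder database name
    'TableName' => 'orders',        // placeholder table name
    'PartitionValues' => partitionValuesFor(
        ['year', 'month'],                  // partition keys in table order
        ['month' => 7, 'year' => 2024]      // values supplied in any order
    ),
];

// With a configured Aws\Glue\GlueClient in $client:
// $client->deletePartition($params);
```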

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

DeletePartitionIndex

$result = $client->deletePartitionIndex([/* ... */]);
$promise = $client->deletePartitionIndexAsync([/* ... */]);

Deletes a specified partition index from an existing table.

Parameter Syntax

$result = $client->deletePartitionIndex([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'IndexName' => '<string>', // REQUIRED
    'TableName' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The catalog ID where the table resides.

DatabaseName
Required: Yes
Type: string

Specifies the name of a database from which you want to delete a partition index.

IndexName
Required: Yes
Type: string

The name of the partition index to be deleted.

TableName
Required: Yes
Type: string

Specifies the name of a table from which you want to delete a partition index.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.

EntityNotFoundException:

A specified entity does not exist.

ConflictException:

The CreatePartitions API was called on a table that has indexes enabled.

GlueEncryptionException:

An encryption operation failed.

DeleteRegistry

$result = $client->deleteRegistry([/* ... */]);
$promise = $client->deleteRegistryAsync([/* ... */]);

Deletes the entire registry, including its schemas and all of their versions. To get the status of the delete operation, you can call the GetRegistry API after the asynchronous call. Deleting a registry will deactivate all online operations for the registry such as the UpdateRegistry, CreateSchema, UpdateSchema, and RegisterSchemaVersion APIs.

Parameter Syntax

$result = $client->deleteRegistry([
    'RegistryId' => [ // REQUIRED
        'RegistryArn' => '<string>',
        'RegistryName' => '<string>',
    ],
]);

Parameter Details

Members
RegistryId
Required: Yes
Type: RegistryId structure

This is a wrapper structure that may contain the registry name and Amazon Resource Name (ARN).

Result Syntax

[
    'RegistryArn' => '<string>',
    'RegistryName' => '<string>',
    'Status' => 'AVAILABLE|DELETING',
]

Result Details

Members
RegistryArn
Type: string

The Amazon Resource Name (ARN) of the registry being deleted.

RegistryName
Type: string

The name of the registry being deleted.

Status
Type: string

The status of the registry. A successful operation will return the DELETING status.
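
Because the deletion completes asynchronously, a caller may want to poll until the registry is gone. The following is a minimal polling sketch. waitForRegistryDeletion is a hypothetical helper, the poll count and delay are arbitrary, and it assumes that GetRegistry fails with EntityNotFoundException once the registry no longer exists.

```php
<?php
declare(strict_types=1);

/**
 * Polls $fetchStatus until it returns null (registry gone) or polls run out.
 * $fetchStatus returns the current status string, or null when not found.
 * Hypothetical helper; poll count and delay are arbitrary example values.
 */
function waitForRegistryDeletion(callable $fetchStatus, int $maxPolls = 10, int $delaySeconds = 2): bool
{
    for ($poll = 0; $poll < $maxPolls; $poll++) {
        if ($fetchStatus() === null) {
            return true;
        }
        if ($poll < $maxPolls - 1) {
            sleep($delaySeconds);
        }
    }
    return false;
}

// With a configured Aws\Glue\GlueClient in $client (registry name is a placeholder):
// $fetchStatus = static function () use ($client): ?string {
//     try {
//         $result = $client->getRegistry(['RegistryId' => ['RegistryName' => 'example-registry']]);
//         return $result['Status'];
//     } catch (\Aws\Glue\Exception\GlueException $e) {
//         // Assumption: a deleted registry surfaces as EntityNotFoundException.
//         if ($e->getAwsErrorCode() === 'EntityNotFoundException') {
//             return null;
//         }
//         throw $e;
//     }
// };
// $deleted = waitForRegistryDeletion($fetchStatus);
```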

Errors

InvalidInputException:

The input provided was not valid.

EntityNotFoundException:

A specified entity does not exist.

AccessDeniedException:

Access to a resource was denied.

ConcurrentModificationException:

Two processes are trying to modify a resource simultaneously.

DeleteResourcePolicy

$result = $client->deleteResourcePolicy([/* ... */]);
$promise = $client->deleteResourcePolicyAsync([/* ... */]);

Deletes a specified policy.

Parameter Syntax

$result = $client->deleteResourcePolicy([
    'PolicyHashCondition' => '<string>',
    'ResourceArn' => '<string>',
]);

Parameter Details

Members
PolicyHashCondition
Type: string

The hash value returned when this policy was set.

ResourceArn
Type: string

The ARN of the Glue resource for the resource policy to be deleted.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.

ConditionCheckFailureException:

A specified condition was not satisfied.

DeleteSchema

$result = $client->deleteSchema([/* ... */]);
$promise = $client->deleteSchemaAsync([/* ... */]);

Deletes the entire schema set, including all of its versions. To get the status of the delete operation, you can call the GetSchema API after the asynchronous call. Deleting a schema will deactivate all online operations for the schema, such as the GetSchemaByDefinition and RegisterSchemaVersion APIs.

Parameter Syntax

$result = $client->deleteSchema([
    'SchemaId' => [ // REQUIRED
        'RegistryName' => '<string>',
        'SchemaArn' => '<string>',
        'SchemaName' => '<string>',
    ],
]);

Parameter Details

Members
SchemaId
Required: Yes
Type: SchemaId structure

This is a wrapper structure that may contain the schema name and Amazon Resource Name (ARN).

Result Syntax

[
    'SchemaArn' => '<string>',
    'SchemaName' => '<string>',
    'Status' => 'AVAILABLE|PENDING|DELETING',
]

Result Details

Members
SchemaArn
Type: string

The Amazon Resource Name (ARN) of the schema being deleted.

SchemaName
Type: string

The name of the schema being deleted.

Status
Type: string

The status of the schema.

Errors

InvalidInputException:

The input provided was not valid.

EntityNotFoundException:

A specified entity does not exist.

AccessDeniedException:

Access to a resource was denied.

ConcurrentModificationException:

Two processes are trying to modify a resource simultaneously.

DeleteSchemaVersions

$result = $client->deleteSchemaVersions([/* ... */]);
$promise = $client->deleteSchemaVersionsAsync([/* ... */]);

Removes versions from the specified schema. A version number or range may be supplied. If the compatibility mode forbids deleting a version that is necessary, such as BACKWARDS_FULL, an error is returned. Calling the GetSchemaVersions API after this call will list the status of the deleted versions.

When the range of version numbers contains a checkpointed version, the API returns a 409 conflict and does not proceed with the deletion. You must first remove the checkpoint using the DeleteSchemaCheckpoint API before using this API.

You cannot use the DeleteSchemaVersions API to delete the first schema version in the schema set. The first schema version can only be deleted by the DeleteSchema API. This operation will also delete the attached SchemaVersionMetadata under the schema versions. Hard deletes will be enforced on the database.

Parameter Syntax

$result = $client->deleteSchemaVersions([
    'SchemaId' => [ // REQUIRED
        'RegistryName' => '<string>',
        'SchemaArn' => '<string>',
        'SchemaName' => '<string>',
    ],
    'Versions' => '<string>', // REQUIRED
]);

Parameter Details

Members
SchemaId
Required: Yes
Type: SchemaId structure

This is a wrapper structure that may contain the schema name and Amazon Resource Name (ARN).

Versions
Required: Yes
Type: string

A version range may be supplied which may be of the format:

  • a single version number, 5

  • a range, 5-8 : deletes versions 5, 6, 7, 8
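
The two accepted formats can be produced by a small formatter. This is a sketch only. formatVersionRange is a hypothetical helper, the schema and registry names are placeholders, and the check that version numbers are positive is an assumption made for this example.

```php
<?php
declare(strict_types=1);

/**
 * Formats the 'Versions' argument: "5" for a single version, "5-8" for a range.
 * Assumption: version numbers are positive integers.
 * Hypothetical helper, not part of the SDK.
 */
function formatVersionRange(int $first, ?int $last = null): string
{
    if ($first < 1 || ($last !== null && $last < $first)) {
        throw new \InvalidArgumentException('Expected 1 <= first <= last.');
    }
    if ($last === null || $last === $first) {
        return (string) $first;
    }
    return $first . '-' . $last;
}

$params = [
    'SchemaId' => [
        'RegistryName' => 'example-registry',   // placeholder registry name
        'SchemaName' => 'example-schema',       // placeholder schema name
    ],
    'Versions' => formatVersionRange(5, 8),     // deletes versions 5, 6, 7, 8
];

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->deleteSchemaVersions($params);
```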

Result Syntax

[
    'SchemaVersionErrors' => [
        [
            'ErrorDetails' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
            'VersionNumber' => <integer>,
        ],
        // ...
    ],
]

Result Details

Members
SchemaVersionErrors
Type: Array of SchemaVersionErrorItem structures

A list of SchemaVersionErrorItem objects, each containing an error and schema version.
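
A caller will usually want to know which versions failed and why. The sketch below flattens the list into a map from version number to a short message. summarizeSchemaVersionErrors is a hypothetical helper, and the sample data is invented purely to match the result syntax above; it is not real service output.

```php
<?php
declare(strict_types=1);

/**
 * Maps each failed version number to "ErrorCode: ErrorMessage".
 * Hypothetical helper, not part of the SDK.
 */
function summarizeSchemaVersionErrors(array $schemaVersionErrors): array
{
    $summary = [];
    foreach ($schemaVersionErrors as $item) {
        $details = $item['ErrorDetails'] ?? [];
        $summary[$item['VersionNumber']] = sprintf(
            '%s: %s',
            $details['ErrorCode'] ?? 'UnknownError',
            $details['ErrorMessage'] ?? ''
        );
    }
    return $summary;
}

// Made-up sample in the shape of the result syntax above.
$sample = [
    [
        'VersionNumber' => 6,
        'ErrorDetails' => [
            'ErrorCode' => 'ExampleErrorCode',
            'ErrorMessage' => 'Example message.',
        ],
    ],
];
$summary = summarizeSchemaVersionErrors($sample);

// With a real result from $client->deleteSchemaVersions($params):
// $summary = summarizeSchemaVersionErrors($result['SchemaVersionErrors'] ?? []);
```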

Errors

InvalidInputException:

The input provided was not valid.

EntityNotFoundException:

A specified entity does not exist.

AccessDeniedException:

Access to a resource was denied.

ConcurrentModificationException:

Two processes are trying to modify a resource simultaneously.

DeleteSecurityConfiguration

$result = $client->deleteSecurityConfiguration([/* ... */]);
$promise = $client->deleteSecurityConfigurationAsync([/* ... */]);

Deletes a specified security configuration.

Parameter Syntax

$result = $client->deleteSecurityConfiguration([
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
Name
Required: Yes
Type: string

The name of the security configuration to delete.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

DeleteSession

$result = $client->deleteSession([/* ... */]);
$promise = $client->deleteSessionAsync([/* ... */]);

Deletes the session.

Parameter Syntax

$result = $client->deleteSession([
    'Id' => '<string>', // REQUIRED
    'RequestOrigin' => '<string>',
]);

Parameter Details

Members
Id
Required: Yes
Type: string

The ID of the session to be deleted.

RequestOrigin
Type: string

The name of the origin of the delete session request.

Result Syntax

[
    'Id' => '<string>',
]

Result Details

Members
Id
Type: string

Returns the ID of the deleted session.

Errors

AccessDeniedException:

Access to a resource was denied.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.

IllegalSessionStateException:

The session is in an invalid state to perform a requested operation.

ConcurrentModificationException:

Two processes are trying to modify a resource simultaneously.

DeleteTable

$result = $client->deleteTable([/* ... */]);
$promise = $client->deleteTableAsync([/* ... */]);

Removes a table definition from the Data Catalog.

After completing this operation, you no longer have access to the table versions and partitions that belong to the deleted table. Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.

To ensure the immediate deletion of all related resources, before calling DeleteTable, use DeleteTableVersion or BatchDeleteTableVersion, and DeletePartition or BatchDeletePartition, to delete any resources that belong to the table.
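The ordering described above could be sketched as follows. This is an assumption-laden illustration, not the service's prescribed procedure: deleteTableAndDependents is a hypothetical function, the VersionIds and PartitionsToDelete parameter names are taken from the batch operations' own sections of this reference, and the sketch ignores batch size limits and any per-item errors the batch calls return.

```php
<?php
// Sketch only: $client is any object exposing the GlueClient magic methods.
// The caller decides which version IDs and partitions to remove first.
function deleteTableAndDependents(
    $client,
    string $database,
    string $table,
    array $versionIds,          // e.g. ['1', '2']
    array $partitionValueLists  // e.g. [['2024', '01'], ['2024', '02']]
): void {
    $scope = ['DatabaseName' => $database, 'TableName' => $table];

    if ($versionIds !== []) {
        $client->batchDeleteTableVersion($scope + ['VersionIds' => $versionIds]);
    }
    if ($partitionValueLists !== []) {
        // Each partition is identified by its list of partition values.
        $partitions = array_map(
            static function (array $values): array {
                return ['Values' => $values];
            },
            $partitionValueLists
        );
        $client->batchDeletePartition($scope + ['PartitionsToDelete' => $partitions]);
    }
    // Only now remove the table definition itself.
    $client->deleteTable(['DatabaseName' => $database, 'Name' => $table]);
}
```

Deleting the dependents explicitly trades extra calls for not having to wait on the asynchronous cleanup.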

Parameter Syntax

$result = $client->deleteTable([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'Name' => '<string>', // REQUIRED
    'TransactionId' => '<string>',
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog where the table resides. If none is provided, the Amazon Web Services account ID is used by default.

DatabaseName
Required: Yes
Type: string

The name of the catalog database in which the table resides. For Hive compatibility, this name is entirely lowercase.

Name
Required: Yes
Type: string

The name of the table to be deleted. For Hive compatibility, this name is entirely lowercase.

TransactionId
Type: string

The transaction ID at which to delete the table contents.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

ConcurrentModificationException:

Two processes are trying to modify a resource simultaneously.

ResourceNotReadyException:

A resource was not ready for a transaction.

DeleteTableOptimizer

$result = $client->deleteTableOptimizer([/* ... */]);
$promise = $client->deleteTableOptimizerAsync([/* ... */]);

Deletes an optimizer and all associated metadata for a table. The optimization will no longer be performed on the table.

Parameter Syntax

$result = $client->deleteTableOptimizer([
    'CatalogId' => '<string>', // REQUIRED
    'DatabaseName' => '<string>', // REQUIRED
    'TableName' => '<string>', // REQUIRED
    'Type' => 'compaction|retention|orphan_file_deletion', // REQUIRED
]);

Parameter Details

Members
CatalogId
Required: Yes
Type: string

The Catalog ID of the table.

DatabaseName
Required: Yes
Type: string

The name of the database in the catalog in which the table resides.

TableName
Required: Yes
Type: string

The name of the table.

Type
Required: Yes
Type: string

The type of table optimizer.
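Since Type accepts only the values shown in the Parameter Syntax, a request builder can reject anything else before the call is made. The helper below is a hypothetical sketch, not an SDK function.

```php
<?php
// Hypothetical helper: validates 'Type' against the values listed in the
// Parameter Syntax before building the request array.
function tableOptimizerRequest(string $catalogId, string $database, string $table, string $type): array
{
    $allowed = ['compaction', 'retention', 'orphan_file_deletion'];
    if (!in_array($type, $allowed, true)) {
        throw new InvalidArgumentException(
            "Unsupported optimizer type '$type'; expected one of: " . implode(', ', $allowed)
        );
    }
    return [
        'CatalogId' => $catalogId,
        'DatabaseName' => $database,
        'TableName' => $table,
        'Type' => $type,
    ];
}
```

The returned array can be passed directly to deleteTableOptimizer.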

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

AccessDeniedException:

Access to a resource was denied.

InternalServiceException:

An internal service error occurred.

ThrottlingException:

The throttling threshold was exceeded.
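Some of the errors listed for these operations, such as throttling or timeouts, may succeed when the call is repeated. The sketch below shows one possible retry wrapper. Both function names are hypothetical, the list of retryable codes is an assumption (an application decision, not something this reference prescribes), and the SDK client's own retry configuration is not taken into account. With the SDK, $errorCodeOf would typically call getAwsErrorCode() on the caught AwsException.

```php
<?php
// Assumption: this list of codes is illustrative, not prescribed by the SDK.
function isLikelyTransientGlueError(string $errorCode): bool
{
    return in_array($errorCode, [
        'ThrottlingException',
        'OperationTimeoutException',
        'InternalServiceException',
        'ConcurrentModificationException',
    ], true);
}

// Runs $operation, retrying up to $maxAttempts times in total when
// $errorCodeOf (caller-supplied) maps the thrown exception to a transient code.
function callWithRetry(
    callable $operation,
    callable $errorCodeOf,
    int $maxAttempts = 3,
    int $baseDelayMicros = 100000
) {
    for ($attempt = 1; ; $attempt++) {
        try {
            return $operation();
        } catch (Exception $e) {
            $code = (string) $errorCodeOf($e);
            if ($attempt >= $maxAttempts || !isLikelyTransientGlueError($code)) {
                throw $e; // give up: out of attempts or not retryable
            }
            usleep($baseDelayMicros * $attempt); // simple linear backoff
        }
    }
}
```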

DeleteTableVersion

$result = $client->deleteTableVersion([/* ... */]);
$promise = $client->deleteTableVersionAsync([/* ... */]);

Deletes a specified version of a table.

Parameter Syntax

$result = $client->deleteTableVersion([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'TableName' => '<string>', // REQUIRED
    'VersionId' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account ID is used by default.

DatabaseName
Required: Yes
Type: string

The database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.

TableName
Required: Yes
Type: string

The name of the table. For Hive compatibility, this name is entirely lowercase.

VersionId
Required: Yes
Type: string

The ID of the table version to be deleted. A VersionId is a string representation of an integer. Each version is incremented by 1.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

DeleteTrigger

$result = $client->deleteTrigger([/* ... */]);
$promise = $client->deleteTriggerAsync([/* ... */]);

Deletes a specified trigger. If the trigger is not found, no exception is thrown.

Parameter Syntax

$result = $client->deleteTrigger([
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
Name
Required: Yes
Type: string

The name of the trigger to delete.

Result Syntax

[
    'Name' => '<string>',
]

Result Details

Members
Name
Type: string

The name of the trigger that was deleted.

Errors

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

ConcurrentModificationException:

Two processes are trying to modify a resource simultaneously.

DeleteUsageProfile

$result = $client->deleteUsageProfile([/* ... */]);
$promise = $client->deleteUsageProfileAsync([/* ... */]);

Deletes the specified Glue usage profile.

Parameter Syntax

$result = $client->deleteUsageProfile([
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
Name
Required: Yes
Type: string

The name of the usage profile to delete.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

OperationNotSupportedException:

The operation is not available in the region.

DeleteUserDefinedFunction

$result = $client->deleteUserDefinedFunction([/* ... */]);
$promise = $client->deleteUserDefinedFunctionAsync([/* ... */]);

Deletes an existing function definition from the Data Catalog.

Parameter Syntax

$result = $client->deleteUserDefinedFunction([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'FunctionName' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog where the function to be deleted is located. If none is supplied, the Amazon Web Services account ID is used by default.

DatabaseName
Required: Yes
Type: string

The name of the catalog database where the function is located.

FunctionName
Required: Yes
Type: string

The name of the function definition to be deleted.

Result Syntax

[]

Result Details

The results for this operation are always empty.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

DeleteWorkflow

$result = $client->deleteWorkflow([/* ... */]);
$promise = $client->deleteWorkflowAsync([/* ... */]);

Deletes a workflow.

Parameter Syntax

$result = $client->deleteWorkflow([
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
Name
Required: Yes
Type: string

Name of the workflow to be deleted.

Result Syntax

[
    'Name' => '<string>',
]

Result Details

Members
Name
Type: string

Name of the workflow specified in input.

Errors

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

ConcurrentModificationException:

Two processes are trying to modify a resource simultaneously.

GetBlueprint

$result = $client->getBlueprint([/* ... */]);
$promise = $client->getBlueprintAsync([/* ... */]);

Retrieves the details of a blueprint.

Parameter Syntax

$result = $client->getBlueprint([
    'IncludeBlueprint' => true || false,
    'IncludeParameterSpec' => true || false,
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
IncludeBlueprint
Type: boolean

Specifies whether or not to include the blueprint in the response.

IncludeParameterSpec
Type: boolean

Specifies whether or not to include the parameter specification.

Name
Required: Yes
Type: string

The name of the blueprint.

Result Syntax

[
    'Blueprint' => [
        'BlueprintLocation' => '<string>',
        'BlueprintServiceLocation' => '<string>',
        'CreatedOn' => <DateTime>,
        'Description' => '<string>',
        'ErrorMessage' => '<string>',
        'LastActiveDefinition' => [
            'BlueprintLocation' => '<string>',
            'BlueprintServiceLocation' => '<string>',
            'Description' => '<string>',
            'LastModifiedOn' => <DateTime>,
            'ParameterSpec' => '<string>',
        ],
        'LastModifiedOn' => <DateTime>,
        'Name' => '<string>',
        'ParameterSpec' => '<string>',
        'Status' => 'CREATING|ACTIVE|UPDATING|FAILED',
    ],
]

Result Details

Members
Blueprint
Type: Blueprint structure

Returns a Blueprint object.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.

GetBlueprintRun

$result = $client->getBlueprintRun([/* ... */]);
$promise = $client->getBlueprintRunAsync([/* ... */]);

Retrieves the details of a blueprint run.

Parameter Syntax

$result = $client->getBlueprintRun([
    'BlueprintName' => '<string>', // REQUIRED
    'RunId' => '<string>', // REQUIRED
]);

Parameter Details

Members
BlueprintName
Required: Yes
Type: string

The name of the blueprint.

RunId
Required: Yes
Type: string

The run ID for the blueprint run you want to retrieve.

Result Syntax

[
    'BlueprintRun' => [
        'BlueprintName' => '<string>',
        'CompletedOn' => <DateTime>,
        'ErrorMessage' => '<string>',
        'Parameters' => '<string>',
        'RoleArn' => '<string>',
        'RollbackErrorMessage' => '<string>',
        'RunId' => '<string>',
        'StartedOn' => <DateTime>,
        'State' => 'RUNNING|SUCCEEDED|FAILED|ROLLING_BACK',
        'WorkflowName' => '<string>',
    ],
]

Result Details

Members
BlueprintRun
Type: BlueprintRun structure

Returns a BlueprintRun object.
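When polling a blueprint run, a caller usually needs to know whether the run has reached a final state. The following sketch is a hypothetical helper. Treating SUCCEEDED and FAILED as final and RUNNING and ROLLING_BACK as in progress is an assumption based on the state names in the Result Syntax. It expects a plain array (with the SDK, pass $result->toArray()).

```php
<?php
// Hypothetical helper: reports whether the run has reached a final state.
// Assumption: SUCCEEDED and FAILED are final; other states are in progress.
function isBlueprintRunFinished(array $result): bool
{
    $state = $result['BlueprintRun']['State'] ?? null;
    return in_array($state, ['SUCCEEDED', 'FAILED'], true);
}
```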

Errors

EntityNotFoundException:

A specified entity does not exist.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

GetBlueprintRuns

$result = $client->getBlueprintRuns([/* ... */]);
$promise = $client->getBlueprintRunsAsync([/* ... */]);

Retrieves the details of blueprint runs for a specified blueprint.

Parameter Syntax

$result = $client->getBlueprintRuns([
    'BlueprintName' => '<string>', // REQUIRED
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
]);

Parameter Details

Members
BlueprintName
Required: Yes
Type: string

The name of the blueprint.

MaxResults
Type: int

The maximum size of a list to return.

NextToken
Type: string

A continuation token, if this is a continuation request.

Result Syntax

[
    'BlueprintRuns' => [
        [
            'BlueprintName' => '<string>',
            'CompletedOn' => <DateTime>,
            'ErrorMessage' => '<string>',
            'Parameters' => '<string>',
            'RoleArn' => '<string>',
            'RollbackErrorMessage' => '<string>',
            'RunId' => '<string>',
            'StartedOn' => <DateTime>,
            'State' => 'RUNNING|SUCCEEDED|FAILED|ROLLING_BACK',
            'WorkflowName' => '<string>',
        ],
        // ...
    ],
    'NextToken' => '<string>',
]

Result Details

Members
BlueprintRuns
Type: Array of BlueprintRun structures

Returns a list of BlueprintRun objects.

NextToken
Type: string

A continuation token, if not all blueprint runs have been returned.
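The NextToken request and result members work together as a cursor: each response's token is sent with the next request until no token comes back. A minimal sketch of that loop follows. fetchAllPages is a hypothetical helper, and $fetchPage is a caller-supplied callable that returns one page as a plain array.

```php
<?php
// Generic NextToken loop: collects the items under $listKey from every page.
function fetchAllPages(callable $fetchPage, array $params, string $listKey): array
{
    $items = [];
    do {
        $page = $fetchPage($params);
        foreach ($page[$listKey] ?? [] as $item) {
            $items[] = $item;
        }
        // Carry the continuation token into the next request, if any.
        $token = $page['NextToken'] ?? null;
        $params['NextToken'] = $token;
    } while ($token !== null && $token !== '');
    return $items;
}
```

With a client, the callable could wrap the operation, for example function (array $p) use ($client) { return $client->getBlueprintRuns($p)->toArray(); }, with 'BlueprintRuns' as the list key.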

Errors

EntityNotFoundException:

A specified entity does not exist.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.

GetCatalogImportStatus

$result = $client->getCatalogImportStatus([/* ... */]);
$promise = $client->getCatalogImportStatusAsync([/* ... */]);

Retrieves the status of a migration operation.

Parameter Syntax

$result = $client->getCatalogImportStatus([
    'CatalogId' => '<string>',
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the catalog to migrate. Currently, this should be the Amazon Web Services account ID.

Result Syntax

[
    'ImportStatus' => [
        'ImportCompleted' => true || false,
        'ImportTime' => <DateTime>,
        'ImportedBy' => '<string>',
    ],
]

Result Details

Members
ImportStatus
Type: CatalogImportStatus structure

The status of the specified catalog migration.

Errors

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

GetClassifier

$result = $client->getClassifier([/* ... */]);
$promise = $client->getClassifierAsync([/* ... */]);

Retrieves a classifier by name.

Parameter Syntax

$result = $client->getClassifier([
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
Name
Required: Yes
Type: string

Name of the classifier to retrieve.

Result Syntax

[
    'Classifier' => [
        'CsvClassifier' => [
            'AllowSingleColumn' => true || false,
            'ContainsHeader' => 'UNKNOWN|PRESENT|ABSENT',
            'CreationTime' => <DateTime>,
            'CustomDatatypeConfigured' => true || false,
            'CustomDatatypes' => ['<string>', ...],
            'Delimiter' => '<string>',
            'DisableValueTrimming' => true || false,
            'Header' => ['<string>', ...],
            'LastUpdated' => <DateTime>,
            'Name' => '<string>',
            'QuoteSymbol' => '<string>',
            'Serde' => 'OpenCSVSerDe|LazySimpleSerDe|None',
            'Version' => <integer>,
        ],
        'GrokClassifier' => [
            'Classification' => '<string>',
            'CreationTime' => <DateTime>,
            'CustomPatterns' => '<string>',
            'GrokPattern' => '<string>',
            'LastUpdated' => <DateTime>,
            'Name' => '<string>',
            'Version' => <integer>,
        ],
        'JsonClassifier' => [
            'CreationTime' => <DateTime>,
            'JsonPath' => '<string>',
            'LastUpdated' => <DateTime>,
            'Name' => '<string>',
            'Version' => <integer>,
        ],
        'XMLClassifier' => [
            'Classification' => '<string>',
            'CreationTime' => <DateTime>,
            'LastUpdated' => <DateTime>,
            'Name' => '<string>',
            'RowTag' => '<string>',
            'Version' => <integer>,
        ],
    ],
]

Result Details

Members
Classifier
Type: Classifier structure

The requested classifier.
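The Classifier structure has one member per classifier kind, so callers often need to find out which one is populated. The sketch below is a hypothetical helper. It assumes that at most one of the four members is set for any given classifier, and it takes the inner 'Classifier' array rather than the whole result.

```php
<?php
// Hypothetical helper: returns the name of the populated classifier member,
// or null if none of the four documented members is present.
function classifierKind(array $classifier): ?string
{
    foreach (['CsvClassifier', 'GrokClassifier', 'JsonClassifier', 'XMLClassifier'] as $kind) {
        if (!empty($classifier[$kind])) {
            return $kind;
        }
    }
    return null;
}
```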

Errors

EntityNotFoundException:

A specified entity does not exist.

OperationTimeoutException:

The operation timed out.

GetClassifiers

$result = $client->getClassifiers([/* ... */]);
$promise = $client->getClassifiersAsync([/* ... */]);

Lists all classifier objects in the Data Catalog.

Parameter Syntax

$result = $client->getClassifiers([
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
]);

Parameter Details

Members
MaxResults
Type: int

The size of the list to return (optional).

NextToken
Type: string

An optional continuation token.

Result Syntax

[
    'Classifiers' => [
        [
            'CsvClassifier' => [
                'AllowSingleColumn' => true || false,
                'ContainsHeader' => 'UNKNOWN|PRESENT|ABSENT',
                'CreationTime' => <DateTime>,
                'CustomDatatypeConfigured' => true || false,
                'CustomDatatypes' => ['<string>', ...],
                'Delimiter' => '<string>',
                'DisableValueTrimming' => true || false,
                'Header' => ['<string>', ...],
                'LastUpdated' => <DateTime>,
                'Name' => '<string>',
                'QuoteSymbol' => '<string>',
                'Serde' => 'OpenCSVSerDe|LazySimpleSerDe|None',
                'Version' => <integer>,
            ],
            'GrokClassifier' => [
                'Classification' => '<string>',
                'CreationTime' => <DateTime>,
                'CustomPatterns' => '<string>',
                'GrokPattern' => '<string>',
                'LastUpdated' => <DateTime>,
                'Name' => '<string>',
                'Version' => <integer>,
            ],
            'JsonClassifier' => [
                'CreationTime' => <DateTime>,
                'JsonPath' => '<string>',
                'LastUpdated' => <DateTime>,
                'Name' => '<string>',
                'Version' => <integer>,
            ],
            'XMLClassifier' => [
                'Classification' => '<string>',
                'CreationTime' => <DateTime>,
                'LastUpdated' => <DateTime>,
                'Name' => '<string>',
                'RowTag' => '<string>',
                'Version' => <integer>,
            ],
        ],
        // ...
    ],
    'NextToken' => '<string>',
]

Result Details

Members
Classifiers
Type: Array of Classifier structures

The requested list of classifier objects.

NextToken
Type: string

A continuation token.

Errors

OperationTimeoutException:

The operation timed out.

GetColumnStatisticsForPartition

$result = $client->getColumnStatisticsForPartition([/* ... */]);
$promise = $client->getColumnStatisticsForPartitionAsync([/* ... */]);

Retrieves partition statistics of columns.

The Identity and Access Management (IAM) permission required for this operation is GetPartition.

Parameter Syntax

$result = $client->getColumnStatisticsForPartition([
    'CatalogId' => '<string>',
    'ColumnNames' => ['<string>', ...], // REQUIRED
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionValues' => ['<string>', ...], // REQUIRED
    'TableName' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.

ColumnNames
Required: Yes
Type: Array of strings

A list of the column names.

DatabaseName
Required: Yes
Type: string

The name of the catalog database where the partitions reside.

PartitionValues
Required: Yes
Type: Array of strings

A list of partition values identifying the partition.

TableName
Required: Yes
Type: string

The name of the partitions' table.

Result Syntax

[
    'ColumnStatisticsList' => [
        [
            'AnalyzedTime' => <DateTime>,
            'ColumnName' => '<string>',
            'ColumnType' => '<string>',
            'StatisticsData' => [
                'BinaryColumnStatisticsData' => [
                    'AverageLength' => <float>,
                    'MaximumLength' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'BooleanColumnStatisticsData' => [
                    'NumberOfFalses' => <integer>,
                    'NumberOfNulls' => <integer>,
                    'NumberOfTrues' => <integer>,
                ],
                'DateColumnStatisticsData' => [
                    'MaximumValue' => <DateTime>,
                    'MinimumValue' => <DateTime>,
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'DecimalColumnStatisticsData' => [
                    'MaximumValue' => [
                        'Scale' => <integer>,
                        'UnscaledValue' => <string || resource || Psr\Http\Message\StreamInterface>,
                    ],
                    'MinimumValue' => [
                        'Scale' => <integer>,
                        'UnscaledValue' => <string || resource || Psr\Http\Message\StreamInterface>,
                    ],
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'DoubleColumnStatisticsData' => [
                    'MaximumValue' => <float>,
                    'MinimumValue' => <float>,
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'LongColumnStatisticsData' => [
                    'MaximumValue' => <integer>,
                    'MinimumValue' => <integer>,
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'StringColumnStatisticsData' => [
                    'AverageLength' => <float>,
                    'MaximumLength' => <integer>,
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'Type' => 'BOOLEAN|DATE|DECIMAL|DOUBLE|LONG|STRING|BINARY',
            ],
        ],
        // ...
    ],
    'Errors' => [
        [
            'ColumnName' => '<string>',
            'Error' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
        ],
        // ...
    ],
]

Result Details

Members
ColumnStatisticsList
Type: Array of ColumnStatistics structures

List of ColumnStatistics that were retrieved.

Errors
Type: Array of ColumnError structures

Errors that occurred while retrieving column statistics data.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

GlueEncryptionException:

An encryption operation failed.

GetColumnStatisticsForTable

$result = $client->getColumnStatisticsForTable([/* ... */]);
$promise = $client->getColumnStatisticsForTableAsync([/* ... */]);

Retrieves table statistics of columns.

The Identity and Access Management (IAM) permission required for this operation is GetTable.

Parameter Syntax

$result = $client->getColumnStatisticsForTable([
    'CatalogId' => '<string>',
    'ColumnNames' => ['<string>', ...], // REQUIRED
    'DatabaseName' => '<string>', // REQUIRED
    'TableName' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.

ColumnNames
Required: Yes
Type: Array of strings

A list of the column names.

DatabaseName
Required: Yes
Type: string

The name of the catalog database where the partitions reside.

TableName
Required: Yes
Type: string

The name of the partitions' table.

Result Syntax

[
    'ColumnStatisticsList' => [
        [
            'AnalyzedTime' => <DateTime>,
            'ColumnName' => '<string>',
            'ColumnType' => '<string>',
            'StatisticsData' => [
                'BinaryColumnStatisticsData' => [
                    'AverageLength' => <float>,
                    'MaximumLength' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'BooleanColumnStatisticsData' => [
                    'NumberOfFalses' => <integer>,
                    'NumberOfNulls' => <integer>,
                    'NumberOfTrues' => <integer>,
                ],
                'DateColumnStatisticsData' => [
                    'MaximumValue' => <DateTime>,
                    'MinimumValue' => <DateTime>,
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'DecimalColumnStatisticsData' => [
                    'MaximumValue' => [
                        'Scale' => <integer>,
                        'UnscaledValue' => <string || resource || Psr\Http\Message\StreamInterface>,
                    ],
                    'MinimumValue' => [
                        'Scale' => <integer>,
                        'UnscaledValue' => <string || resource || Psr\Http\Message\StreamInterface>,
                    ],
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'DoubleColumnStatisticsData' => [
                    'MaximumValue' => <float>,
                    'MinimumValue' => <float>,
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'LongColumnStatisticsData' => [
                    'MaximumValue' => <integer>,
                    'MinimumValue' => <integer>,
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'StringColumnStatisticsData' => [
                    'AverageLength' => <float>,
                    'MaximumLength' => <integer>,
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'Type' => 'BOOLEAN|DATE|DECIMAL|DOUBLE|LONG|STRING|BINARY',
            ],
        ],
        // ...
    ],
    'Errors' => [
        [
            'ColumnName' => '<string>',
            'Error' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
        ],
        // ...
    ],
]

Result Details

Members
ColumnStatisticsList
Type: Array of ColumnStatistics structures

List of ColumnStatistics.

Errors
Type: Array of ColumnError structures

List of ColumnStatistics that failed to be retrieved.
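Because a single response can contain both retrieved statistics and per-column failures, it can be convenient to index both by column name. The following is a sketch with a hypothetical helper name. It expects a plain array (with the SDK, pass $result->toArray()).

```php
<?php
// Hypothetical helper: splits the result into statistics and error messages,
// each keyed by column name.
function splitColumnStatistics(array $result): array
{
    $stats = [];
    foreach ($result['ColumnStatisticsList'] ?? [] as $entry) {
        $stats[$entry['ColumnName']] = $entry['StatisticsData'] ?? [];
    }
    $errors = [];
    foreach ($result['Errors'] ?? [] as $entry) {
        $errors[$entry['ColumnName']] = $entry['Error']['ErrorMessage'] ?? '';
    }
    return ['stats' => $stats, 'errors' => $errors];
}
```

A column requested in ColumnNames should then appear under either 'stats' or 'errors'.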

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

GlueEncryptionException:

An encryption operation failed.

GetColumnStatisticsTaskRun

$result = $client->getColumnStatisticsTaskRun([/* ... */]);
$promise = $client->getColumnStatisticsTaskRunAsync([/* ... */]);

Gets the metadata associated with a column statistics task run, given its task run ID.

Parameter Syntax

$result = $client->getColumnStatisticsTaskRun([
    'ColumnStatisticsTaskRunId' => '<string>', // REQUIRED
]);

Parameter Details

Members
ColumnStatisticsTaskRunId
Required: Yes
Type: string

The identifier for the particular column statistics task run.

Result Syntax

[
    'ColumnStatisticsTaskRun' => [
        'CatalogID' => '<string>',
        'ColumnNameList' => ['<string>', ...],
        'ColumnStatisticsTaskRunId' => '<string>',
        'ComputationType' => 'FULL|INCREMENTAL',
        'CreationTime' => <DateTime>,
        'CustomerId' => '<string>',
        'DPUSeconds' => <float>,
        'DatabaseName' => '<string>',
        'EndTime' => <DateTime>,
        'ErrorMessage' => '<string>',
        'LastUpdated' => <DateTime>,
        'NumberOfWorkers' => <integer>,
        'Role' => '<string>',
        'SampleSize' => <float>,
        'SecurityConfiguration' => '<string>',
        'StartTime' => <DateTime>,
        'Status' => 'STARTING|RUNNING|SUCCEEDED|FAILED|STOPPED',
        'TableName' => '<string>',
        'WorkerType' => '<string>',
    ],
]

Result Details

Members
ColumnStatisticsTaskRun
Type: ColumnStatisticsTaskRun structure

A ColumnStatisticsTaskRun object representing the details of the column statistics task run.

Errors

EntityNotFoundException:

A specified entity does not exist.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.

GetColumnStatisticsTaskRuns

$result = $client->getColumnStatisticsTaskRuns([/* ... */]);
$promise = $client->getColumnStatisticsTaskRunsAsync([/* ... */]);

Retrieves information about all runs associated with the specified table.

Parameter Syntax

$result = $client->getColumnStatisticsTaskRuns([
    'DatabaseName' => '<string>', // REQUIRED
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
    'TableName' => '<string>', // REQUIRED
]);

Parameter Details

Members
DatabaseName
Required: Yes
Type: string

The name of the database where the table resides.

MaxResults
Type: int

The maximum size of the response.

NextToken
Type: string

A continuation token, if this is a continuation call.

TableName
Required: Yes
Type: string

The name of the table.

Result Syntax

[
    'ColumnStatisticsTaskRuns' => [
        [
            'CatalogID' => '<string>',
            'ColumnNameList' => ['<string>', ...],
            'ColumnStatisticsTaskRunId' => '<string>',
            'ComputationType' => 'FULL|INCREMENTAL',
            'CreationTime' => <DateTime>,
            'CustomerId' => '<string>',
            'DPUSeconds' => <float>,
            'DatabaseName' => '<string>',
            'EndTime' => <DateTime>,
            'ErrorMessage' => '<string>',
            'LastUpdated' => <DateTime>,
            'NumberOfWorkers' => <integer>,
            'Role' => '<string>',
            'SampleSize' => <float>,
            'SecurityConfiguration' => '<string>',
            'StartTime' => <DateTime>,
            'Status' => 'STARTING|RUNNING|SUCCEEDED|FAILED|STOPPED',
            'TableName' => '<string>',
            'WorkerType' => '<string>',
        ],
        // ...
    ],
    'NextToken' => '<string>',
]

Result Details

Members
ColumnStatisticsTaskRuns
Type: Array of ColumnStatisticsTaskRun structures

A list of column statistics task runs.

NextToken
Type: string

A continuation token, if not all task runs have yet been returned.

Errors

OperationTimeoutException:

The operation timed out.

GetColumnStatisticsTaskSettings

$result = $client->getColumnStatisticsTaskSettings([/* ... */]);
$promise = $client->getColumnStatisticsTaskSettingsAsync([/* ... */]);

Gets settings for a column statistics task.

Parameter Syntax

$result = $client->getColumnStatisticsTaskSettings([
    'DatabaseName' => '<string>', // REQUIRED
    'TableName' => '<string>', // REQUIRED
]);

Parameter Details

Members
DatabaseName
Required: Yes
Type: string

The name of the database where the table resides.

TableName
Required: Yes
Type: string

The name of the table for which to retrieve column statistics.

Result Syntax

[
    'ColumnStatisticsTaskSettings' => [
        'CatalogID' => '<string>',
        'ColumnNameList' => ['<string>', ...],
        'DatabaseName' => '<string>',
        'Role' => '<string>',
        'SampleSize' => <float>,
        'Schedule' => [
            'ScheduleExpression' => '<string>',
            'State' => 'SCHEDULED|NOT_SCHEDULED|TRANSITIONING',
        ],
        'SecurityConfiguration' => '<string>',
        'TableName' => '<string>',
    ],
]

Result Details

Members
ColumnStatisticsTaskSettings
Type: ColumnStatisticsTaskSettings structure

A ColumnStatisticsTaskSettings object representing the settings for the column statistics task.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

GetConnection

$result = $client->getConnection([/* ... */]);
$promise = $client->getConnectionAsync([/* ... */]);

Retrieves a connection definition from the Data Catalog.

Parameter Syntax

$result = $client->getConnection([
    'CatalogId' => '<string>',
    'HidePassword' => true || false,
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog in which the connection resides. If none is provided, the Amazon Web Services account ID is used by default.

HidePassword
Type: boolean

Allows you to retrieve the connection metadata without returning the password. For instance, the Glue console uses this flag to retrieve the connection, and does not display the password. Set this parameter when the caller might not have permission to use the KMS key to decrypt the password, but it does have permission to access the rest of the connection properties.

Name
Required: Yes
Type: string

The name of the connection definition to retrieve.
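The parameters above can be assembled as in the following minimal sketch. The helper name buildGetConnectionParams is hypothetical (it is not part of the SDK), and defaulting HidePassword to true is an assumption made here for callers that may lack permission to use the KMS key.

```php
<?php
// Hypothetical helper (not part of the SDK): builds the parameter array
// for GetConnection from the members documented above.
function buildGetConnectionParams(string $name, bool $hidePassword = true, ?string $catalogId = null): array
{
    $params = [
        'Name' => $name,                 // REQUIRED
        'HidePassword' => $hidePassword, // assumption: hide the password unless asked otherwise
    ];
    // Leave CatalogId out so the Amazon Web Services account ID is used by default.
    if ($catalogId !== null) {
        $params['CatalogId'] = $catalogId;
    }
    return $params;
}

// Usage, assuming $client is a configured Aws\Glue\GlueClient and
// 'my-connection' is a placeholder name:
// $result = $client->getConnection(buildGetConnectionParams('my-connection'));
// $connection = $result['Connection'];
```

Keeping CatalogId optional mirrors the documented default rather than hard-coding an account ID.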

Result Syntax

[
    'Connection' => [
        'AthenaProperties' => ['<string>', ...],
        'AuthenticationConfiguration' => [
            'AuthenticationType' => 'BASIC|OAUTH2|CUSTOM',
            'OAuth2Properties' => [
                'OAuth2ClientApplication' => [
                    'AWSManagedClientApplicationReference' => '<string>',
                    'UserManagedClientApplicationClientId' => '<string>',
                ],
                'OAuth2GrantType' => 'AUTHORIZATION_CODE|CLIENT_CREDENTIALS|JWT_BEARER',
                'TokenUrl' => '<string>',
                'TokenUrlParametersMap' => ['<string>', ...],
            ],
            'SecretArn' => '<string>',
        ],
        'ConnectionProperties' => ['<string>', ...],
        'ConnectionType' => 'JDBC|SFTP|MONGODB|KAFKA|NETWORK|MARKETPLACE|CUSTOM|SALESFORCE|VIEW_VALIDATION_REDSHIFT|VIEW_VALIDATION_ATHENA',
        'CreationTime' => <DateTime>,
        'Description' => '<string>',
        'LastConnectionValidationTime' => <DateTime>,
        'LastUpdatedBy' => '<string>',
        'LastUpdatedTime' => <DateTime>,
        'MatchCriteria' => ['<string>', ...],
        'Name' => '<string>',
        'PhysicalConnectionRequirements' => [
            'AvailabilityZone' => '<string>',
            'SecurityGroupIdList' => ['<string>', ...],
            'SubnetId' => '<string>',
        ],
        'Status' => 'READY|IN_PROGRESS|FAILED',
        'StatusReason' => '<string>',
    ],
]

Result Details

Members
Connection
Type: Connection structure

The requested connection definition.

Errors

EntityNotFoundException:

A specified entity does not exist.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.

GlueEncryptionException:

An encryption operation failed.

GetConnections

$result = $client->getConnections([/* ... */]);
$promise = $client->getConnectionsAsync([/* ... */]);

Retrieves a list of connection definitions from the Data Catalog.

Parameter Syntax

$result = $client->getConnections([
    'CatalogId' => '<string>',
    'Filter' => [
        'ConnectionType' => 'JDBC|SFTP|MONGODB|KAFKA|NETWORK|MARKETPLACE|CUSTOM|SALESFORCE|VIEW_VALIDATION_REDSHIFT|VIEW_VALIDATION_ATHENA',
        'MatchCriteria' => ['<string>', ...],
    ],
    'HidePassword' => true || false,
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog in which the connections reside. If none is provided, the Amazon Web Services account ID is used by default.

Filter
Type: GetConnectionsFilter structure

A filter that controls which connections are returned.

HidePassword
Type: boolean

Allows you to retrieve the connection metadata without returning the password. For instance, the Glue console uses this flag to retrieve the connection, and does not display the password. Set this parameter when the caller might not have permission to use the KMS key to decrypt the password, but it does have permission to access the rest of the connection properties.

MaxResults
Type: int

The maximum number of connections to return in one response.

NextToken
Type: string

A continuation token, if this is a continuation call.

Result Syntax

[
    'ConnectionList' => [
        [
            'AthenaProperties' => ['<string>', ...],
            'AuthenticationConfiguration' => [
                'AuthenticationType' => 'BASIC|OAUTH2|CUSTOM',
                'OAuth2Properties' => [
                    'OAuth2ClientApplication' => [
                        'AWSManagedClientApplicationReference' => '<string>',
                        'UserManagedClientApplicationClientId' => '<string>',
                    ],
                    'OAuth2GrantType' => 'AUTHORIZATION_CODE|CLIENT_CREDENTIALS|JWT_BEARER',
                    'TokenUrl' => '<string>',
                    'TokenUrlParametersMap' => ['<string>', ...],
                ],
                'SecretArn' => '<string>',
            ],
            'ConnectionProperties' => ['<string>', ...],
            'ConnectionType' => 'JDBC|SFTP|MONGODB|KAFKA|NETWORK|MARKETPLACE|CUSTOM|SALESFORCE|VIEW_VALIDATION_REDSHIFT|VIEW_VALIDATION_ATHENA',
            'CreationTime' => <DateTime>,
            'Description' => '<string>',
            'LastConnectionValidationTime' => <DateTime>,
            'LastUpdatedBy' => '<string>',
            'LastUpdatedTime' => <DateTime>,
            'MatchCriteria' => ['<string>', ...],
            'Name' => '<string>',
            'PhysicalConnectionRequirements' => [
                'AvailabilityZone' => '<string>',
                'SecurityGroupIdList' => ['<string>', ...],
                'SubnetId' => '<string>',
            ],
            'Status' => 'READY|IN_PROGRESS|FAILED',
            'StatusReason' => '<string>',
        ],
        // ...
    ],
    'NextToken' => '<string>',
]

Result Details

Members
ConnectionList
Type: Array of Connection structures

A list of requested connection definitions.

NextToken
Type: string

A continuation token, if the list of connections returned does not include the last of the filtered connections.
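Because NextToken is returned whenever more connections remain, a caller typically loops until the token is absent. The following is a sketch of that loop, not the SDK's own pagination mechanism. The helper name fetchAllConnections is hypothetical, and the callable argument is an assumption that lets the same code run against a real client or a stub.

```php
<?php
// Hypothetical helper: $getConnections is any callable that accepts the
// GetConnections parameter array and returns an array-accessible result,
// for example [$client, 'getConnections'] on an Aws\Glue\GlueClient.
function fetchAllConnections(callable $getConnections, array $params = []): array
{
    $connections = [];
    unset($params['NextToken']); // always start from the first page

    do {
        $result = $getConnections($params);
        foreach ($result['ConnectionList'] ?? [] as $connection) {
            $connections[] = $connection;
        }
        // A missing or empty NextToken is treated as the end of the list.
        $token = $result['NextToken'] ?? null;
        $params['NextToken'] = $token;
    } while ($token !== null && $token !== '');

    return $connections;
}

// Usage, assuming $client is a configured Aws\Glue\GlueClient:
// $all = fetchAllConnections([$client, 'getConnections'], [
//     'HidePassword' => true,
//     'Filter' => ['ConnectionType' => 'JDBC'],
// ]);
```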

Errors

EntityNotFoundException:

A specified entity does not exist.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.

GlueEncryptionException:

An encryption operation failed.

GetCrawler

$result = $client->getCrawler([/* ... */]);
$promise = $client->getCrawlerAsync([/* ... */]);

Retrieves metadata for a specified crawler.

Parameter Syntax

$result = $client->getCrawler([
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
Name
Required: Yes
Type: string

The name of the crawler to retrieve metadata for.

Result Syntax

[
    'Crawler' => [
        'Classifiers' => ['<string>', ...],
        'Configuration' => '<string>',
        'CrawlElapsedTime' => <integer>,
        'CrawlerSecurityConfiguration' => '<string>',
        'CreationTime' => <DateTime>,
        'DatabaseName' => '<string>',
        'Description' => '<string>',
        'LakeFormationConfiguration' => [
            'AccountId' => '<string>',
            'UseLakeFormationCredentials' => true || false,
        ],
        'LastCrawl' => [
            'ErrorMessage' => '<string>',
            'LogGroup' => '<string>',
            'LogStream' => '<string>',
            'MessagePrefix' => '<string>',
            'StartTime' => <DateTime>,
            'Status' => 'SUCCEEDED|CANCELLED|FAILED',
        ],
        'LastUpdated' => <DateTime>,
        'LineageConfiguration' => [
            'CrawlerLineageSettings' => 'ENABLE|DISABLE',
        ],
        'Name' => '<string>',
        'RecrawlPolicy' => [
            'RecrawlBehavior' => 'CRAWL_EVERYTHING|CRAWL_NEW_FOLDERS_ONLY|CRAWL_EVENT_MODE',
        ],
        'Role' => '<string>',
        'Schedule' => [
            'ScheduleExpression' => '<string>',
            'State' => 'SCHEDULED|NOT_SCHEDULED|TRANSITIONING',
        ],
        'SchemaChangePolicy' => [
            'DeleteBehavior' => 'LOG|DELETE_FROM_DATABASE|DEPRECATE_IN_DATABASE',
            'UpdateBehavior' => 'LOG|UPDATE_IN_DATABASE',
        ],
        'State' => 'READY|RUNNING|STOPPING',
        'TablePrefix' => '<string>',
        'Targets' => [
            'CatalogTargets' => [
                [
                    'ConnectionName' => '<string>',
                    'DatabaseName' => '<string>',
                    'DlqEventQueueArn' => '<string>',
                    'EventQueueArn' => '<string>',
                    'Tables' => ['<string>', ...],
                ],
                // ...
            ],
            'DeltaTargets' => [
                [
                    'ConnectionName' => '<string>',
                    'CreateNativeDeltaTable' => true || false,
                    'DeltaTables' => ['<string>', ...],
                    'WriteManifest' => true || false,
                ],
                // ...
            ],
            'DynamoDBTargets' => [
                [
                    'Path' => '<string>',
                    'scanAll' => true || false,
                    'scanRate' => <float>,
                ],
                // ...
            ],
            'HudiTargets' => [
                [
                    'ConnectionName' => '<string>',
                    'Exclusions' => ['<string>', ...],
                    'MaximumTraversalDepth' => <integer>,
                    'Paths' => ['<string>', ...],
                ],
                // ...
            ],
            'IcebergTargets' => [
                [
                    'ConnectionName' => '<string>',
                    'Exclusions' => ['<string>', ...],
                    'MaximumTraversalDepth' => <integer>,
                    'Paths' => ['<string>', ...],
                ],
                // ...
            ],
            'JdbcTargets' => [
                [
                    'ConnectionName' => '<string>',
                    'EnableAdditionalMetadata' => ['<string>', ...],
                    'Exclusions' => ['<string>', ...],
                    'Path' => '<string>',
                ],
                // ...
            ],
            'MongoDBTargets' => [
                [
                    'ConnectionName' => '<string>',
                    'Path' => '<string>',
                    'ScanAll' => true || false,
                ],
                // ...
            ],
            'S3Targets' => [
                [
                    'ConnectionName' => '<string>',
                    'DlqEventQueueArn' => '<string>',
                    'EventQueueArn' => '<string>',
                    'Exclusions' => ['<string>', ...],
                    'Path' => '<string>',
                    'SampleSize' => <integer>,
                ],
                // ...
            ],
        ],
        'Version' => <integer>,
    ],
]

Result Details

Members
Crawler
Type: Crawler structure

The metadata for the specified crawler.
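The Crawler structure is deeply nested, so reading a single field usually means guarding against absent keys. As an illustration only, the sketch below collects the Path of each entry under Targets.S3Targets. The helper name crawlerS3Paths is hypothetical.

```php
<?php
// Hypothetical helper: collects the 'Path' of every S3 target from a
// GetCrawler result (a plain array or any array-accessible result object).
function crawlerS3Paths($result): array
{
    $paths = [];
    // Each level may be absent, so fall back to an empty list.
    foreach ($result['Crawler']['Targets']['S3Targets'] ?? [] as $target) {
        if (isset($target['Path'])) {
            $paths[] = $target['Path'];
        }
    }
    return $paths;
}

// Usage, assuming $client is a configured Aws\Glue\GlueClient and
// 'my-crawler' is a placeholder name:
// $paths = crawlerS3Paths($client->getCrawler(['Name' => 'my-crawler']));
```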

Errors

EntityNotFoundException:

A specified entity does not exist.

OperationTimeoutException:

The operation timed out.

GetCrawlerMetrics

$result = $client->getCrawlerMetrics([/* ... */]);
$promise = $client->getCrawlerMetricsAsync([/* ... */]);

Retrieves metrics about specified crawlers.

Parameter Syntax

$result = $client->getCrawlerMetrics([
    'CrawlerNameList' => ['<string>', ...],
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
]);

Parameter Details

Members
CrawlerNameList
Type: Array of strings

A list of the names of crawlers about which to retrieve metrics.

MaxResults
Type: int

The maximum size of a list to return.

NextToken
Type: string

A continuation token, if this is a continuation call.

Result Syntax

[
    'CrawlerMetricsList' => [
        [
            'CrawlerName' => '<string>',
            'LastRuntimeSeconds' => <float>,
            'MedianRuntimeSeconds' => <float>,
            'StillEstimating' => true || false,
            'TablesCreated' => <integer>,
            'TablesDeleted' => <integer>,
            'TablesUpdated' => <integer>,
            'TimeLeftSeconds' => <float>,
        ],
        // ...
    ],
    'NextToken' => '<string>',
]

Result Details

Members
CrawlerMetricsList
Type: Array of CrawlerMetrics structures

A list of metrics for the specified crawler.

NextToken
Type: string

A continuation token, if the returned list does not contain the last metric available.
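One way to consume CrawlerMetricsList is to total the table counts and note which crawlers are still estimating. The sketch below does that for one page of results. The helper name summarizeCrawlerMetrics and the shape of its return value are assumptions made for illustration.

```php
<?php
// Hypothetical helper: aggregates one page of CrawlerMetrics entries.
function summarizeCrawlerMetrics(iterable $metricsList): array
{
    $summary = [
        'TablesCreated' => 0,
        'TablesUpdated' => 0,
        'TablesDeleted' => 0,
        'StillEstimating' => [], // names of crawlers whose estimates are not final
    ];
    foreach ($metricsList as $metrics) {
        $summary['TablesCreated'] += $metrics['TablesCreated'] ?? 0;
        $summary['TablesUpdated'] += $metrics['TablesUpdated'] ?? 0;
        $summary['TablesDeleted'] += $metrics['TablesDeleted'] ?? 0;
        if (!empty($metrics['StillEstimating'])) {
            $summary['StillEstimating'][] = $metrics['CrawlerName'] ?? '(unnamed)';
        }
    }
    return $summary;
}

// Usage, assuming $client is a configured Aws\Glue\GlueClient:
// $result = $client->getCrawlerMetrics(['CrawlerNameList' => ['crawler-a', 'crawler-b']]);
// $summary = summarizeCrawlerMetrics($result['CrawlerMetricsList'] ?? []);
```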

Errors

OperationTimeoutException:

The operation timed out.

GetCrawlers

$result = $client->getCrawlers([/* ... */]);
$promise = $client->getCrawlersAsync([/* ... */]);

Retrieves metadata for all crawlers defined in the customer account.

Parameter Syntax

$result = $client->getCrawlers([
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
]);

Parameter Details

Members
MaxResults
Type: int

The number of crawlers to return on each call.

NextToken
Type: string

A continuation token, if this is a continuation request.

Result Syntax

[
    'Crawlers' => [
        [
            'Classifiers' => ['<string>', ...],
            'Configuration' => '<string>',
            'CrawlElapsedTime' => <integer>,
            'CrawlerSecurityConfiguration' => '<string>',
            'CreationTime' => <DateTime>,
            'DatabaseName' => '<string>',
            'Description' => '<string>',
            'LakeFormationConfiguration' => [
                'AccountId' => '<string>',
                'UseLakeFormationCredentials' => true || false,
            ],
            'LastCrawl' => [
                'ErrorMessage' => '<string>',
                'LogGroup' => '<string>',
                'LogStream' => '<string>',
                'MessagePrefix' => '<string>',
                'StartTime' => <DateTime>,
                'Status' => 'SUCCEEDED|CANCELLED|FAILED',
            ],
            'LastUpdated' => <DateTime>,
            'LineageConfiguration' => [
                'CrawlerLineageSettings' => 'ENABLE|DISABLE',
            ],
            'Name' => '<string>',
            'RecrawlPolicy' => [
                'RecrawlBehavior' => 'CRAWL_EVERYTHING|CRAWL_NEW_FOLDERS_ONLY|CRAWL_EVENT_MODE',
            ],
            'Role' => '<string>',
            'Schedule' => [
                'ScheduleExpression' => '<string>',
                'State' => 'SCHEDULED|NOT_SCHEDULED|TRANSITIONING',
            ],
            'SchemaChangePolicy' => [
                'DeleteBehavior' => 'LOG|DELETE_FROM_DATABASE|DEPRECATE_IN_DATABASE',
                'UpdateBehavior' => 'LOG|UPDATE_IN_DATABASE',
            ],
            'State' => 'READY|RUNNING|STOPPING',
            'TablePrefix' => '<string>',
            'Targets' => [
                'CatalogTargets' => [
                    [
                        'ConnectionName' => '<string>',
                        'DatabaseName' => '<string>',
                        'DlqEventQueueArn' => '<string>',
                        'EventQueueArn' => '<string>',
                        'Tables' => ['<string>', ...],
                    ],
                    // ...
                ],
                'DeltaTargets' => [
                    [
                        'ConnectionName' => '<string>',
                        'CreateNativeDeltaTable' => true || false,
                        'DeltaTables' => ['<string>', ...],
                        'WriteManifest' => true || false,
                    ],
                    // ...
                ],
                'DynamoDBTargets' => [
                    [
                        'Path' => '<string>',
                        'scanAll' => true || false,
                        'scanRate' => <float>,
                    ],
                    // ...
                ],
                'HudiTargets' => [
                    [
                        'ConnectionName' => '<string>',
                        'Exclusions' => ['<string>', ...],
                        'MaximumTraversalDepth' => <integer>,
                        'Paths' => ['<string>', ...],
                    ],
                    // ...
                ],
                'IcebergTargets' => [
                    [
                        'ConnectionName' => '<string>',
                        'Exclusions' => ['<string>', ...],
                        'MaximumTraversalDepth' => <integer>,
                        'Paths' => ['<string>', ...],
                    ],
                    // ...
                ],
                'JdbcTargets' => [
                    [
                        'ConnectionName' => '<string>',
                        'EnableAdditionalMetadata' => ['<string>', ...],
                        'Exclusions' => ['<string>', ...],
                        'Path' => '<string>',
                    ],
                    // ...
                ],
                'MongoDBTargets' => [
                    [
                        'ConnectionName' => '<string>',
                        'Path' => '<string>',
                        'ScanAll' => true || false,
                    ],
                    // ...
                ],
                'S3Targets' => [
                    [
                        'ConnectionName' => '<string>',
                        'DlqEventQueueArn' => '<string>',
                        'EventQueueArn' => '<string>',
                        'Exclusions' => ['<string>', ...],
                        'Path' => '<string>',
                        'SampleSize' => <integer>,
                    ],
                    // ...
                ],
            ],
            'Version' => <integer>,
        ],
        // ...
    ],
    'NextToken' => '<string>',
]

Result Details

Members
Crawlers
Type: Array of Crawler structures

A list of crawler metadata.

NextToken
Type: string

A continuation token, if the returned list has not reached the end of those defined in this customer account.

Errors

OperationTimeoutException:

The operation timed out.

GetCustomEntityType

$result = $client->getCustomEntityType([/* ... */]);
$promise = $client->getCustomEntityTypeAsync([/* ... */]);

Retrieves the details of a custom pattern by specifying its name.

Parameter Syntax

$result = $client->getCustomEntityType([
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
Name
Required: Yes
Type: string

The name of the custom pattern that you want to retrieve.

Result Syntax

[
    'ContextWords' => ['<string>', ...],
    'Name' => '<string>',
    'RegexString' => '<string>',
]

Result Details

Members
ContextWords
Type: Array of strings

A list of context words, if any were specified when you created the custom pattern. If none of these context words is found within the vicinity of the regular expression, the data is not detected as sensitive data.

Name
Type: string

The name of the custom pattern that you retrieved.

RegexString
Type: string

A regular expression string that is used for detecting sensitive data in a custom pattern.

Errors

EntityNotFoundException:

A specified entity does not exist.

AccessDeniedException:

Access to a resource was denied.

InternalServiceException:

An internal service error occurred.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

GetDataCatalogEncryptionSettings

$result = $client->getDataCatalogEncryptionSettings([/* ... */]);
$promise = $client->getDataCatalogEncryptionSettingsAsync([/* ... */]);

Retrieves the security configuration for a specified catalog.

Parameter Syntax

$result = $client->getDataCatalogEncryptionSettings([
    'CatalogId' => '<string>',
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog to retrieve the security configuration for. If none is provided, the Amazon Web Services account ID is used by default.

Result Syntax

[
    'DataCatalogEncryptionSettings' => [
        'ConnectionPasswordEncryption' => [
            'AwsKmsKeyId' => '<string>',
            'ReturnConnectionPasswordEncrypted' => true || false,
        ],
        'EncryptionAtRest' => [
            'CatalogEncryptionMode' => 'DISABLED|SSE-KMS|SSE-KMS-WITH-SERVICE-ROLE',
            'CatalogEncryptionServiceRole' => '<string>',
            'SseAwsKmsKeyId' => '<string>',
        ],
    ],
]

Result Details

Members
DataCatalogEncryptionSettings
Type: DataCatalogEncryptionSettings structure

The requested security configuration.
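A caller often only needs to know whether encryption is switched on. The sketch below reduces the result to a few flags. The helper name catalogEncryptionSummary is hypothetical, and treating a missing CatalogEncryptionMode as DISABLED is an assumption rather than documented behavior.

```php
<?php
// Hypothetical helper: condenses a GetDataCatalogEncryptionSettings result.
function catalogEncryptionSummary($result): array
{
    $settings = $result['DataCatalogEncryptionSettings'] ?? [];
    // Assumption: an absent mode is treated the same as 'DISABLED'.
    $mode = $settings['EncryptionAtRest']['CatalogEncryptionMode'] ?? 'DISABLED';
    $passwords = $settings['ConnectionPasswordEncryption']['ReturnConnectionPasswordEncrypted'] ?? false;

    return [
        'mode' => $mode,
        'atRestEnabled' => $mode !== 'DISABLED',
        'passwordsEncrypted' => (bool) $passwords,
    ];
}

// Usage, assuming $client is a configured Aws\Glue\GlueClient:
// $summary = catalogEncryptionSummary($client->getDataCatalogEncryptionSettings([]));
```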

Errors

InternalServiceException:

An internal service error occurred.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

GetDataQualityModel

$result = $client->getDataQualityModel([/* ... */]);
$promise = $client->getDataQualityModelAsync([/* ... */]);

Retrieves the training status of the model, along with additional information (CompletedOn, StartedOn, FailureReason).

Parameter Syntax

$result = $client->getDataQualityModel([
    'ProfileId' => '<string>', // REQUIRED
    'StatisticId' => '<string>',
]);

Parameter Details

Members
ProfileId
Required: Yes
Type: string

The Profile ID.

StatisticId
Type: string

The Statistic ID.

Result Syntax

[
    'CompletedOn' => <DateTime>,
    'FailureReason' => '<string>',
    'StartedOn' => <DateTime>,
    'Status' => 'RUNNING|SUCCEEDED|FAILED',
]

Result Details

Members
CompletedOn
Type: timestamp (string|DateTime or anything parsable by strtotime)

The timestamp when the data quality model training completed.

FailureReason
Type: string

The training failure reason.

StartedOn
Type: timestamp (string|DateTime or anything parsable by strtotime)

The timestamp when the data quality model training started.

Status
Type: string

The training status of the data quality model.
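Since Status can be RUNNING, a caller that needs the final outcome has to poll. The following is a minimal polling sketch. The helper name waitForModelTraining, the attempt limit, and the delay are all assumptions chosen for illustration, not SDK defaults.

```php
<?php
// Hypothetical helper: calls $getModel (for example
// [$client, 'getDataQualityModel']) until Status is no longer 'RUNNING'.
function waitForModelTraining(callable $getModel, array $params, int $maxAttempts = 30, int $delaySeconds = 10)
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $result = $getModel($params);
        if (($result['Status'] ?? null) !== 'RUNNING') {
            return $result; // SUCCEEDED or FAILED (FailureReason explains a failure)
        }
        // Pause between attempts, but not after the final one.
        if ($attempt < $maxAttempts && $delaySeconds > 0) {
            sleep($delaySeconds);
        }
    }
    throw new RuntimeException("Model training still RUNNING after {$maxAttempts} attempts");
}

// Usage, assuming $client is a configured Aws\Glue\GlueClient and the ID
// is a placeholder:
// $model = waitForModelTraining([$client, 'getDataQualityModel'], ['ProfileId' => 'my-profile-id']);
```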

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.

GetDataQualityModelResult

$result = $client->getDataQualityModelResult([/* ... */]);
$promise = $client->getDataQualityModelResultAsync([/* ... */]);

Retrieves a statistic's predictions for a given Profile ID.

Parameter Syntax

$result = $client->getDataQualityModelResult([
    'ProfileId' => '<string>', // REQUIRED
    'StatisticId' => '<string>', // REQUIRED
]);

Parameter Details

Members
ProfileId
Required: Yes
Type: string

The Profile ID.

StatisticId
Required: Yes
Type: string

The Statistic ID.

Result Syntax

[
    'CompletedOn' => <DateTime>,
    'Model' => [
        [
            'ActualValue' => <float>,
            'Date' => <DateTime>,
            'InclusionAnnotation' => 'INCLUDE|EXCLUDE',
            'LowerBound' => <float>,
            'PredictedValue' => <float>,
            'UpperBound' => <float>,
        ],
        // ...
    ],
]

Result Details

Members
CompletedOn
Type: timestamp (string|DateTime or anything parsable by strtotime)

The timestamp when the data quality model training completed.

Model
Type: Array of StatisticModelResult structures

A list of StatisticModelResult objects.
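Each entry in Model carries an ActualValue alongside LowerBound and UpperBound, so one plausible use is to pick out the points that fall outside the predicted range. The sketch below does this. The helper name pointsOutsideBounds is hypothetical, and reading an out-of-range value as noteworthy is an interpretation made here, not something this reference states.

```php
<?php
// Hypothetical helper: returns the Model entries whose ActualValue lies
// outside the [LowerBound, UpperBound] interval.
function pointsOutsideBounds(iterable $model): array
{
    $outside = [];
    foreach ($model as $point) {
        // Skip entries that lack any of the three values needed to compare.
        if (!isset($point['ActualValue'], $point['LowerBound'], $point['UpperBound'])) {
            continue;
        }
        if ($point['ActualValue'] < $point['LowerBound'] || $point['ActualValue'] > $point['UpperBound']) {
            $outside[] = $point;
        }
    }
    return $outside;
}

// Usage, assuming $client is a configured Aws\Glue\GlueClient and the IDs
// are placeholders:
// $result = $client->getDataQualityModelResult(['ProfileId' => 'p', 'StatisticId' => 's']);
// $flagged = pointsOutsideBounds($result['Model'] ?? []);
```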

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.

GetDataQualityResult

$result = $client->getDataQualityResult([/* ... */]);
$promise = $client->getDataQualityResultAsync([/* ... */]);

Retrieves the result of a data quality rule evaluation.

Parameter Syntax

$result = $client->getDataQualityResult([
    'ResultId' => '<string>', // REQUIRED
]);

Parameter Details

Members
ResultId
Required: Yes
Type: string

A unique result ID for the data quality result.

Result Syntax

[
    'AnalyzerResults' => [
        [
            'Description' => '<string>',
            'EvaluatedMetrics' => [<float>, ...],
            'EvaluationMessage' => '<string>',
            'Name' => '<string>',
        ],
        // ...
    ],
    'CompletedOn' => <DateTime>,
    'DataSource' => [
        'GlueTable' => [
            'AdditionalOptions' => ['<string>', ...],
            'CatalogId' => '<string>',
            'ConnectionName' => '<string>',
            'DatabaseName' => '<string>',
            'TableName' => '<string>',
        ],
    ],
    'EvaluationContext' => '<string>',
    'JobName' => '<string>',
    'JobRunId' => '<string>',
    'Observations' => [
        [
            'Description' => '<string>',
            'MetricBasedObservation' => [
                'MetricName' => '<string>',
                'MetricValues' => [
                    'ActualValue' => <float>,
                    'ExpectedValue' => <float>,
                    'LowerLimit' => <float>,
                    'UpperLimit' => <float>,
                ],
                'NewRules' => ['<string>', ...],
                'StatisticId' => '<string>',
            ],
        ],
        // ...
    ],
    'ProfileId' => '<string>',
    'ResultId' => '<string>',
    'RuleResults' => [
        [
            'Description' => '<string>',
            'EvaluatedMetrics' => [<float>, ...],
            'EvaluatedRule' => '<string>',
            'EvaluationMessage' => '<string>',
            'Name' => '<string>',
            'Result' => 'PASS|FAIL|ERROR',
        ],
        // ...
    ],
    'RulesetEvaluationRunId' => '<string>',
    'RulesetName' => '<string>',
    'Score' => <float>,
    'StartedOn' => <DateTime>,
]

Result Details

Members
AnalyzerResults
Type: Array of DataQualityAnalyzerResult structures

A list of DataQualityAnalyzerResult objects representing the results for each analyzer.

CompletedOn
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time when the run for this data quality result was completed.

DataSource
Type: DataSource structure

The table associated with the data quality result, if any.

EvaluationContext
Type: string

In a Glue Studio job, each node in the canvas is typically assigned a name, and this includes data quality nodes. When a job contains multiple data quality nodes, the evaluationContext differentiates between them.

JobName
Type: string

The job name associated with the data quality result, if any.

JobRunId
Type: string

The job run ID associated with the data quality result, if any.

Observations
Type: Array of DataQualityObservation structures

A list of DataQualityObservation objects representing the observations generated after evaluating the rules and analyzers.

ProfileId
Type: string

The Profile ID for the data quality result.

ResultId
Type: string

A unique result ID for the data quality result.

RuleResults
Type: Array of DataQualityRuleResult structures

A list of DataQualityRuleResult objects representing the results for each rule.

RulesetEvaluationRunId
Type: string

The unique run ID associated with the ruleset evaluation.

RulesetName
Type: string

The name of the ruleset associated with the data quality result.

Score
Type: double

An aggregate data quality score. Represents the ratio of rules that passed to the total number of rules.

StartedOn
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time when the run for this data quality result started.
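Score is described above as the ratio of rules that passed to the total number of rules. As a worked illustration of that arithmetic, the sketch below recomputes the ratio from RuleResults. The helper name rulePassRatio is hypothetical, and counting FAIL and ERROR results in the denominator is an assumption, so the value may not match Score exactly.

```php
<?php
// Hypothetical helper: the share of RuleResults entries whose Result is 'PASS'.
// Returns null when there are no rules, to avoid dividing by zero.
function rulePassRatio(iterable $ruleResults): ?float
{
    $total = 0;
    $passed = 0;
    foreach ($ruleResults as $rule) {
        $total++; // assumption: FAIL and ERROR both count toward the total
        if (($rule['Result'] ?? null) === 'PASS') {
            $passed++;
        }
    }
    return $total === 0 ? null : $passed / $total;
}

// Usage, assuming $client is a configured Aws\Glue\GlueClient and the ID
// is a placeholder:
// $result = $client->getDataQualityResult(['ResultId' => 'my-result-id']);
// $ratio = rulePassRatio($result['RuleResults'] ?? []);
```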

Errors

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.

EntityNotFoundException:

A specified entity does not exist.

GetDataQualityRuleRecommendationRun

$result = $client->getDataQualityRuleRecommendationRun([/* ... */]);
$promise = $client->getDataQualityRuleRecommendationRunAsync([/* ... */]);

Gets the specified recommendation run that was used to generate rules.

Parameter Syntax

$result = $client->getDataQualityRuleRecommendationRun([
    'RunId' => '<string>', // REQUIRED
]);

Parameter Details

Members
RunId
Required: Yes
Type: string

The unique run identifier associated with this run.

Result Syntax

[
    'CompletedOn' => <DateTime>,
    'CreatedRulesetName' => '<string>',
    'DataQualitySecurityConfiguration' => '<string>',
    'DataSource' => [
        'GlueTable' => [
            'AdditionalOptions' => ['<string>', ...],
            'CatalogId' => '<string>',
            'ConnectionName' => '<string>',
            'DatabaseName' => '<string>',
            'TableName' => '<string>',
        ],
    ],
    'ErrorString' => '<string>',
    'ExecutionTime' => <integer>,
    'LastModifiedOn' => <DateTime>,
    'NumberOfWorkers' => <integer>,
    'RecommendedRuleset' => '<string>',
    'Role' => '<string>',
    'RunId' => '<string>',
    'StartedOn' => <DateTime>,
    'Status' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT',
    'Timeout' => <integer>,
]

Result Details

Members
CompletedOn
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time when this run was completed.

CreatedRulesetName
Type: string

The name of the ruleset that was created by the run.

DataQualitySecurityConfiguration
Type: string

The name of the security configuration created with the data quality encryption option.

DataSource
Type: DataSource structure

The data source (a Glue table) associated with this run.

ErrorString
Type: string

The error strings that are associated with the run.

ExecutionTime
Type: int

The amount of time (in seconds) that the run consumed resources.

LastModifiedOn
Type: timestamp (string|DateTime or anything parsable by strtotime)

A timestamp. The last point in time when this data quality rule recommendation run was modified.

NumberOfWorkers
Type: int

The number of G.1X workers to be used in the run. The default is 5.

RecommendedRuleset
Type: string

When a start rule recommendation run completes, it creates a recommended ruleset (a set of rules). This member has those rules in Data Quality Definition Language (DQDL) format.

Role
Type: string

An IAM role supplied to encrypt the results of the run.

RunId
Type: string

The unique run identifier associated with this run.

StartedOn
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time when this run started.

Status
Type: string

The status for this run.

Timeout
Type: int

The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).
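Note that Timeout is expressed in minutes while ExecutionTime is expressed in seconds, so the two must be converted to a common unit before comparing them. The sketch below estimates the time a run has left before it would enter TIMEOUT status. The helper name remainingRunSeconds is hypothetical, and falling back to 2,880 minutes when Timeout is absent follows the default quoted above.

```php
<?php
// Hypothetical helper: seconds left before the run would reach its timeout.
function remainingRunSeconds($run, int $defaultTimeoutMinutes = 2880): int
{
    // Timeout is in minutes; convert it to seconds.
    $timeoutSeconds = ((int) ($run['Timeout'] ?? $defaultTimeoutMinutes)) * 60;
    // ExecutionTime is already in seconds.
    $usedSeconds = (int) ($run['ExecutionTime'] ?? 0);

    return max(0, $timeoutSeconds - $usedSeconds);
}

// Usage, assuming $client is a configured Aws\Glue\GlueClient and the ID
// is a placeholder:
// $run = $client->getDataQualityRuleRecommendationRun(['RunId' => 'my-run-id']);
// $secondsLeft = remainingRunSeconds($run);
```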

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.

GetDataQualityRuleset

$result = $client->getDataQualityRuleset([/* ... */]);
$promise = $client->getDataQualityRulesetAsync([/* ... */]);

Returns an existing ruleset by identifier or name.

Parameter Syntax

$result = $client->getDataQualityRuleset([
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
Name
Required: Yes
Type: string

The name of the ruleset.

Result Syntax

[
    'CreatedOn' => <DateTime>,
    'DataQualitySecurityConfiguration' => '<string>',
    'Description' => '<string>',
    'LastModifiedOn' => <DateTime>,
    'Name' => '<string>',
    'RecommendationRunId' => '<string>',
    'Ruleset' => '<string>',
    'TargetTable' => [
        'CatalogId' => '<string>',
        'DatabaseName' => '<string>',
        'TableName' => '<string>',
    ],
]

Result Details

Members
CreatedOn
Type: timestamp (string|DateTime or anything parsable by strtotime)

A timestamp. The time and date that this data quality ruleset was created.

DataQualitySecurityConfiguration
Type: string

The name of the security configuration created with the data quality encryption option.

Description
Type: string

A description of the ruleset.

LastModifiedOn
Type: timestamp (string|DateTime or anything parsable by strtotime)

A timestamp. The last point in time when this data quality ruleset was modified.

Name
Type: string

The name of the ruleset.

RecommendationRunId
Type: string

When a ruleset was created from a recommendation run, this run ID is generated to link the two together.

Ruleset
Type: string

A Data Quality Definition Language (DQDL) ruleset. For more information, see the Glue developer guide.

TargetTable
Type: DataQualityTargetTable structure

The name and database name of the target table.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.
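
The following is a minimal sketch of reading the members described above from a GetDataQualityRuleset result. The helper name summarizeRuleset and the shape of its return value are hypothetical and not part of the SDK. It is assumed that the result can be read with array syntax, so a plain array stands in for the result here.

```php
<?php
// Hypothetical helper (not part of the SDK): condenses a GetDataQualityRuleset
// result into a few fields. Optional members are read defensively because the
// service may omit them.
function summarizeRuleset($result): array
{
    $target = $result['TargetTable'] ?? null;

    return [
        'name' => $result['Name'] ?? null,
        // The ruleset body, in Data Quality Definition Language (DQDL).
        'dqdl' => $result['Ruleset'] ?? '',
        // "database.table" when a target table is attached, otherwise null.
        'table' => $target ? $target['DatabaseName'] . '.' . $target['TableName'] : null,
        // True when the ruleset is linked to a recommendation run.
        'fromRecommendation' => isset($result['RecommendationRunId']),
    ];
}

// Usage with a configured client (assumed to exist elsewhere):
// $summary = summarizeRuleset($client->getDataQualityRuleset(['Name' => '<string>']));
```

Reading optional members with ?? avoids notices when a member such as TargetTable or RecommendationRunId is absent.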

GetDataQualityRulesetEvaluationRun

$result = $client->getDataQualityRulesetEvaluationRun([/* ... */]);
$promise = $client->getDataQualityRulesetEvaluationRunAsync([/* ... */]);

Retrieves a specific run where a ruleset is evaluated against a data source.

Parameter Syntax

$result = $client->getDataQualityRulesetEvaluationRun([
    'RunId' => '<string>', // REQUIRED
]);

Parameter Details

Members
RunId
Required: Yes
Type: string

The unique run identifier associated with this run.

Result Syntax

[
    'AdditionalDataSources' => [
        '<NameString>' => [
            'GlueTable' => [
                'AdditionalOptions' => ['<string>', ...],
                'CatalogId' => '<string>',
                'ConnectionName' => '<string>',
                'DatabaseName' => '<string>',
                'TableName' => '<string>',
            ],
        ],
        // ...
    ],
    'AdditionalRunOptions' => [
        'CloudWatchMetricsEnabled' => true || false,
        'CompositeRuleEvaluationMethod' => 'COLUMN|ROW',
        'ResultsS3Prefix' => '<string>',
    ],
    'CompletedOn' => <DateTime>,
    'DataSource' => [
        'GlueTable' => [
            'AdditionalOptions' => ['<string>', ...],
            'CatalogId' => '<string>',
            'ConnectionName' => '<string>',
            'DatabaseName' => '<string>',
            'TableName' => '<string>',
        ],
    ],
    'ErrorString' => '<string>',
    'ExecutionTime' => <integer>,
    'LastModifiedOn' => <DateTime>,
    'NumberOfWorkers' => <integer>,
    'ResultIds' => ['<string>', ...],
    'Role' => '<string>',
    'RulesetNames' => ['<string>', ...],
    'RunId' => '<string>',
    'StartedOn' => <DateTime>,
    'Status' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT',
    'Timeout' => <integer>,
]

Result Details

Members
AdditionalDataSources
Type: Associative array of custom string keys (NameString) to DataSource structures

A map of reference strings to additional data sources you can specify for an evaluation run.

AdditionalRunOptions
Type: DataQualityEvaluationRunAdditionalRunOptions structure

Additional run options you can specify for an evaluation run.

CompletedOn
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time when this run was completed.

DataSource
Type: DataSource structure

The data source (a Glue table) associated with this evaluation run.

ErrorString
Type: string

The error strings that are associated with the run.

ExecutionTime
Type: int

The amount of time (in seconds) that the run consumed resources.

LastModifiedOn
Type: timestamp (string|DateTime or anything parsable by strtotime)

A timestamp. The last point in time when this data quality ruleset evaluation run was modified.

NumberOfWorkers
Type: int

The number of G.1X workers to be used in the run. The default is 5.

ResultIds
Type: Array of strings

A list of result IDs for the data quality results for the run.

Role
Type: string

An IAM role supplied to encrypt the results of the run.

RulesetNames
Type: Array of strings

A list of ruleset names for the run. Currently, this parameter takes only one ruleset name.

RunId
Type: string

The unique run identifier associated with this run.

StartedOn
Type: timestamp (string|DateTime or anything parsable by strtotime)

The date and time when this run started.

Status
Type: string

The status for this run.

Timeout
Type: int

The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

OperationTimeoutException:

The operation timed out.

InternalServiceException:

An internal service error occurred.
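
An evaluation run passes through the Status values listed in the result syntax, so a caller typically polls until the run stops changing. The sketch below treats STOPPED, SUCCEEDED, FAILED, and TIMEOUT as terminal, which is an assumption based on the enum names. The function name waitForEvaluationRun, the attempt limit, and the delay are hypothetical. The operation is passed in as a callable so the loop can be exercised without a live client.

```php
<?php
// Statuses assumed to be terminal (taken from the Status enum in the result syntax).
const TERMINAL_RUN_STATUSES = ['STOPPED', 'SUCCEEDED', 'FAILED', 'TIMEOUT'];

/**
 * Polls until the run reaches a terminal status or $maxAttempts is used up.
 *
 * $getRun is any callable that accepts ['RunId' => ...] and returns the result,
 * for example:
 *   function (array $p) use ($client) {
 *       return $client->getDataQualityRulesetEvaluationRun($p);
 *   }
 */
function waitForEvaluationRun(callable $getRun, string $runId, int $maxAttempts = 60, int $delaySeconds = 10)
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $run = $getRun(['RunId' => $runId]);

        if (in_array($run['Status'], TERMINAL_RUN_STATUSES, true)) {
            return $run;
        }

        // Do not sleep after the final attempt.
        if ($attempt < $maxAttempts) {
            sleep($delaySeconds);
        }
    }

    throw new RuntimeException("Run $runId did not finish after $maxAttempts attempts");
}
```

On return, the caller can inspect Status, ErrorString, and ResultIds to decide what to do next.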

GetDatabase

$result = $client->getDatabase([/* ... */]);
$promise = $client->getDatabaseAsync([/* ... */]);

Retrieves the definition of a specified database.

Parameter Syntax

$result = $client->getDatabase([
    'CatalogId' => '<string>',
    'Name' => '<string>', // REQUIRED
]);

Parameter Details

Members
CatalogId
Type: string

The ID of the Data Catalog in which the database resides. If none is provided, the Amazon Web Services account ID is used by default.

Name
Required: Yes
Type: string

The name of the database to retrieve. For Hive compatibility, this should be all lowercase.

Result Syntax

[
    'Database' => [
        'CatalogId' => '<string>',
        'CreateTableDefaultPermissions' => [
            [
                'Permissions' => ['<string>', ...],
                'Principal' => [
                    'DataLakePrincipalIdentifier' => '<string>',
                ],
            ],
            // ...
        ],
        'CreateTime' => <DateTime>,
        'Description' => '<string>',
        'FederatedDatabase' => [
            'ConnectionName' => '<string>',
            'Identifier' => '<string>',
        ],
        'LocationUri' => '<string>',
        'Name' => '<string>',
        'Parameters' => ['<string>', ...],
        'TargetDatabase' => [
            'CatalogId' => '<string>',
            'DatabaseName' => '<string>',
            'Region' => '<string>',
        ],
    ],
]

Result Details

Members
Database
Type: Database structure

The definition of the specified database in the Data Catalog.

Errors

InvalidInputException:

The input provided was not valid.

EntityNotFoundException:

A specified entity does not exist.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

GlueEncryptionException:

An encryption operation failed.

FederationSourceException:

A federation source failed.
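
The parameter notes above (an optional CatalogId that defaults to the account ID, and a lowercase Name for Hive compatibility) can be captured in a small builder. This is a sketch only. The function name buildGetDatabaseParams is hypothetical, and lowercasing the name is assumed to be acceptable for your catalog.

```php
<?php
// Hypothetical helper (not part of the SDK): builds the parameter array for getDatabase().
function buildGetDatabaseParams(string $name, ?string $catalogId = null): array
{
    // Lowercase the name for Hive compatibility, as the Name description recommends.
    $params = ['Name' => strtolower($name)];

    // Leave CatalogId out entirely so the service falls back to the account ID.
    if ($catalogId !== null) {
        $params['CatalogId'] = $catalogId;
    }

    return $params;
}

// Usage with a configured client (assumed to exist elsewhere):
// $result = $client->getDatabase(buildGetDatabaseParams('Sales_DB'));
```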

GetDatabases

$result = $client->getDatabases([/* ... */]);
$promise = $client->getDatabasesAsync([/* ... */]);

Retrieves all databases defined in a given Data Catalog.

Parameter Syntax

$result = $client->getDatabases([
    'AttributesToGet' => ['<string>', ...],
    'CatalogId' => '<string>',
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
    'ResourceShareType' => 'FOREIGN|ALL|FEDERATED',
]);

Parameter Details

Members
AttributesToGet
Type: Array of strings

Specifies the database fields returned by the GetDatabases call. This parameter doesn’t accept an empty list; the request must include the NAME attribute.

CatalogId
Type: string

The ID of the Data Catalog from which to retrieve databases. If none is provided, the Amazon Web Services account ID is used by default.

MaxResults
Type: int

The maximum number of databases to return in one response.

NextToken
Type: string

A continuation token, if this is a continuation call.

ResourceShareType
Type: string

Specifies which databases shared with your account to list. The allowable values are FEDERATED, FOREIGN, or ALL.

  • If set to FEDERATED, lists the federated databases (referencing an external entity) shared with your account.

  • If set to FOREIGN, lists the databases shared with your account.

  • If set to ALL, lists the databases shared with your account, as well as the databases in your local account.

Result Syntax

[
    'DatabaseList' => [
        [
            'CatalogId' => '<string>',
            'CreateTableDefaultPermissions' => [
                [
                    'Permissions' => ['<string>', ...],
                    'Principal' => [
                        'DataLakePrincipalIdentifier' => '<string>',
                    ],
                ],
                // ...
            ],
            'CreateTime' => <DateTime>,
            'Description' => '<string>',
            'FederatedDatabase' => [
                'ConnectionName' => '<string>',
                'Identifier' => '<string>',
            ],
            'LocationUri' => '<string>',
            'Name' => '<string>',
            'Parameters' => ['<string>', ...],
            'TargetDatabase' => [
                'CatalogId' => '<string>',
                'DatabaseName' => '<string>',
                'Region' => '<string>',
            ],
        ],
        // ...
    ],
    'NextToken' => '<string>',
]

Result Details

Members
DatabaseList
Required: Yes
Type: Array of Database structures

A list of Database objects from the specified catalog.

NextToken
Type: string

A continuation token for paginating the returned list of databases, returned if the current segment of the list is not the last.

Errors

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

GlueEncryptionException:

An encryption operation failed.
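
Because GetDatabases returns results in segments, a caller usually feeds each response's NextToken back into the next request until no token is returned. The sketch below shows that loop as a generator. The function name iterateDatabases is hypothetical, and the operation is passed in as a callable so the loop can be run without a live client.

```php
<?php
/**
 * Yields every entry of DatabaseList across all pages.
 *
 * $getDatabases is any callable that accepts the request parameters and returns
 * one page, for example:
 *   function (array $p) use ($client) { return $client->getDatabases($p); }
 */
function iterateDatabases(callable $getDatabases, array $params = []): Generator
{
    do {
        $page = $getDatabases($params);

        foreach ($page['DatabaseList'] as $database) {
            yield $database;
        }

        // A missing or empty token means this was the last segment.
        $token = $page['NextToken'] ?? null;
        $params['NextToken'] = $token;
    } while ($token);
}

// Usage with a configured client (assumed to exist elsewhere):
// foreach (iterateDatabases($fetch, ['MaxResults' => 100]) as $db) {
//     echo $db['Name'], PHP_EOL;
// }
```

Where the SDK defines a paginator for the operation, $client->getPaginator('GetDatabases') may be a more convenient alternative to a hand-written loop.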

GetDataflowGraph

$result = $client->getDataflowGraph([/* ... */]);
$promise = $client->getDataflowGraphAsync([/* ... */]);

Transforms a Python script into a directed acyclic graph (DAG).

Parameter Syntax

$result = $client->getDataflowGraph([
    'PythonScript' => '<string>',
]);

Parameter Details

Members
PythonScript
Type: string

The Python script to transform.

Result Syntax

[
    'DagEdges' => [
        [
            'Source' => '<string>',
            'Target' => '<string>',
            'TargetParameter' => '<string>',
        ],
        // ...
    ],
    'DagNodes' => [
        [
            'Args' => [
                [
                    'Name' => '<string>',
                    'Param' => true || false,
                    'Value' => '<string>',
                ],
                // ...
            ],
            'Id' => '<string>',
            'LineNumber' => <integer>,
            'NodeType' => '<string>',
        ],
        // ...
    ],
]

Result Details

Members
DagEdges
Type: Array of CodeGenEdge structures

A list of the edges in the resulting DAG.

DagNodes
Type: Array of CodeGenNode structures

A list of the nodes in the resulting DAG.

Errors

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.
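
The DagNodes and DagEdges lists can be combined into an adjacency map for traversal. The sketch below assumes that each edge's Source and Target hold the Id of a node in DagNodes. The function name buildAdjacency is hypothetical.

```php
<?php
/**
 * Builds a map of node Id => list of downstream node Ids from a
 * GetDataflowGraph result. Nodes without outgoing edges map to an empty list.
 */
function buildAdjacency($result): array
{
    $adjacency = [];

    // Register every node first so isolated nodes are not lost.
    foreach ($result['DagNodes'] ?? [] as $node) {
        $adjacency[$node['Id']] = [];
    }

    // Append each edge's target to its source's list (assumes Source/Target are node Ids).
    foreach ($result['DagEdges'] ?? [] as $edge) {
        $adjacency[$edge['Source']][] = $edge['Target'];
    }

    return $adjacency;
}

// Usage with a configured client (assumed to exist elsewhere):
// $graph = buildAdjacency($client->getDataflowGraph(['PythonScript' => $script]));
```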

GetDevEndpoint

$result = $client->getDevEndpoint([/* ... */]);
$promise = $client->getDevEndpointAsync([/* ... */]);

Retrieves information about a specified development endpoint.

When you create a development endpoint in a virtual private cloud (VPC), Glue returns only a private IP address, and the public IP address field is not populated. When you create a non-VPC development endpoint, Glue returns only a public IP address.

Parameter Syntax

$result = $client->getDevEndpoint([
    'EndpointName' => '<string>', // REQUIRED
]);

Parameter Details

Members
EndpointName
Required: Yes
Type: string

The name of the DevEndpoint to retrieve information for.

Result Syntax

[
    'DevEndpoint' => [
        'Arguments' => ['<string>', ...],
        'AvailabilityZone' => '<string>',
        'CreatedTimestamp' => <DateTime>,
        'EndpointName' => '<string>',
        'ExtraJarsS3Path' => '<string>',
        'ExtraPythonLibsS3Path' => '<string>',
        'FailureReason' => '<string>',
        'GlueVersion' => '<string>',
        'LastModifiedTimestamp' => <DateTime>,
        'LastUpdateStatus' => '<string>',
        'NumberOfNodes' => <integer>,
        'NumberOfWorkers' => <integer>,
        'PrivateAddress' => '<string>',
        'PublicAddress' => '<string>',
        'PublicKey' => '<string>',
        'PublicKeys' => ['<string>', ...],
        'RoleArn' => '<string>',
        'SecurityConfiguration' => '<string>',
        'SecurityGroupIds' => ['<string>', ...],
        'Status' => '<string>',
        'SubnetId' => '<string>',
        'VpcId' => '<string>',
        'WorkerType' => 'Standard|G.1X|G.2X|G.025X|G.4X|G.8X|Z.2X',
        'YarnEndpointAddress' => '<string>',
        'ZeppelinRemoteSparkInterpreterPort' => <integer>,
    ],
]

Result Details

Members
DevEndpoint
Type: DevEndpoint structure

A DevEndpoint definition.

Errors

EntityNotFoundException:

A specified entity does not exist.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.
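
As described for this operation, a VPC development endpoint has only a private address populated and a non-VPC endpoint only a public one. The sketch below picks whichever address is present. The function name devEndpointAddress is hypothetical, and it expects the DevEndpoint member of the result rather than the whole result.

```php
<?php
// Hypothetical helper (not part of the SDK): returns the populated address of a
// DevEndpoint structure, or null when neither address field is set.
function devEndpointAddress($devEndpoint): ?string
{
    // VPC endpoints populate PrivateAddress; non-VPC endpoints populate PublicAddress.
    foreach (['PrivateAddress', 'PublicAddress'] as $key) {
        if (!empty($devEndpoint[$key])) {
            return $devEndpoint[$key];
        }
    }

    return null;
}

// Usage with a configured client (assumed to exist elsewhere):
// $result = $client->getDevEndpoint(['EndpointName' => '<string>']);
// $address = devEndpointAddress($result['DevEndpoint']);
```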

GetDevEndpoints

$result = $client->getDevEndpoints([/* ... */]);
$promise = $client->getDevEndpointsAsync([/* ... */]);

Retrieves all the development endpoints in this Amazon Web Services account.

When you create a development endpoint in a virtual private cloud (VPC), Glue returns only a private IP address, and the public IP address field is not populated. When you create a non-VPC development endpoint, Glue returns only a public IP address.

Parameter Syntax

$result = $client->getDevEndpoints([
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
]);

Parameter Details

Members
MaxResults
Type: int

The maximum size of information to return.

NextToken
Type: string

A continuation token, if this is a continuation call.

Result Syntax

[
    'DevEndpoints' => [
        [
            'Arguments' => ['<string>', ...],
            'AvailabilityZone' => '<string>',
            'CreatedTimestamp' => <DateTime>,
            'EndpointName' => '<string>',
            'ExtraJarsS3Path' => '<string>',
            'ExtraPythonLibsS3Path' => '<string>',
            'FailureReason' => '<string>',
            'GlueVersion' => '<string>',
            'LastModifiedTimestamp' => <DateTime>,
            'LastUpdateStatus' => '<string>',
            'NumberOfNodes' => <integer>,
            'NumberOfWorkers' => <integer>,
            'PrivateAddress' => '<string>',
            'PublicAddress' => '<string>',
            'PublicKey' => '<string>',
            'PublicKeys' => ['<string>', ...],
            'RoleArn' => '<string>',
            'SecurityConfiguration' => '<string>',
            'SecurityGroupIds' => ['<string>', ...],
            'Status' => '<string>',
            'SubnetId' => '<string>',
            'VpcId' => '<string>',
            'WorkerType' => 'Standard|G.1X|G.2X|G.025X|G.4X|G.8X|Z.2X',
            'YarnEndpointAddress' => '<string>',
            'ZeppelinRemoteSparkInterpreterPort' => <integer>,
        ],
        // ...
    ],
    'NextToken' => '<string>',
]

Result Details

Members
DevEndpoints
Type: Array of DevEndpoint structures

A list of DevEndpoint definitions.

NextToken
Type: string

A continuation token, if not all DevEndpoint definitions have yet been returned.

Errors

EntityNotFoundException:

A specified entity does not exist.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

InvalidInputException:

The input provided was not valid.

GetJob

$result = $client->getJob([/* ... */]);
$promise = $client->getJobAsync([/* ... */]);

Retrieves an existing job definition.

Parameter Syntax

$result = $client->getJob([
    'JobName' => '<string>', // REQUIRED
]);

Parameter Details

Members
JobName
Required: Yes
Type: string

The name of the job definition to retrieve.

Result Syntax

[
    'Job' => [
        'AllocatedCapacity' => <integer>,
        'CodeGenConfigurationNodes' => [
            '<NodeId>' => [
                'Aggregate' => [
                    'Aggs' => [
                        [
                            'AggFunc' => 'avg|countDistinct|count|first|last|kurtosis|max|min|skewness|stddev_samp|stddev_pop|sum|sumDistinct|var_samp|var_pop',
                            'Column' => ['<string>', ...],
                        ],
                        // ...
                    ],
                    'Groups' => [
                        ['<string>', ...],
                        // ...
                    ],
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                ],
                'AmazonRedshiftSource' => [
                    'Data' => [
                        'AccessType' => '<string>',
                        'Action' => '<string>',
                        'AdvancedOptions' => [
                            [
                                'Key' => '<string>',
                                'Value' => '<string>',
                            ],
                            // ...
                        ],
                        'CatalogDatabase' => [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        'CatalogRedshiftSchema' => '<string>',
                        'CatalogRedshiftTable' => '<string>',
                        'CatalogTable' => [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        'Connection' => [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        'CrawlerConnection' => '<string>',
                        'IamRole' => [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        'MergeAction' => '<string>',
                        'MergeClause' => '<string>',
                        'MergeWhenMatched' => '<string>',
                        'MergeWhenNotMatched' => '<string>',
                        'PostAction' => '<string>',
                        'PreAction' => '<string>',
                        'SampleQuery' => '<string>',
                        'Schema' => [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        'SelectedColumns' => [
                            [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            // ...
                        ],
                        'SourceType' => '<string>',
                        'StagingTable' => '<string>',
                        'Table' => [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        'TablePrefix' => '<string>',
                        'TableSchema' => [
                            [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            // ...
                        ],
                        'TempDir' => '<string>',
                        'Upsert' => true || false,
                    ],
                    'Name' => '<string>',
                ],
                'AmazonRedshiftTarget' => [
                    'Data' => [
                        'AccessType' => '<string>',
                        'Action' => '<string>',
                        'AdvancedOptions' => [
                            [
                                'Key' => '<string>',
                                'Value' => '<string>',
                            ],
                            // ...
                        ],
                        'CatalogDatabase' => [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        'CatalogRedshiftSchema' => '<string>',
                        'CatalogRedshiftTable' => '<string>',
                        'CatalogTable' => [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        'Connection' => [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        'CrawlerConnection' => '<string>',
                        'IamRole' => [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        'MergeAction' => '<string>',
                        'MergeClause' => '<string>',
                        'MergeWhenMatched' => '<string>',
                        'MergeWhenNotMatched' => '<string>',
                        'PostAction' => '<string>',
                        'PreAction' => '<string>',
                        'SampleQuery' => '<string>',
                        'Schema' => [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        'SelectedColumns' => [
                            [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            // ...
                        ],
                        'SourceType' => '<string>',
                        'StagingTable' => '<string>',
                        'Table' => [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        'TablePrefix' => '<string>',
                        'TableSchema' => [
                            [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            // ...
                        ],
                        'TempDir' => '<string>',
                        'Upsert' => true || false,
                    ],
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                ],
                'ApplyMapping' => [
                    'Inputs' => ['<string>', ...],
                    'Mapping' => [
                        [
                            'Children' => [...], // RECURSIVE
                            'Dropped' => true || false,
                            'FromPath' => ['<string>', ...],
                            'FromType' => '<string>',
                            'ToKey' => '<string>',
                            'ToType' => '<string>',
                        ],
                        // ...
                    ],
                    'Name' => '<string>',
                ],
                'AthenaConnectorSource' => [
                    'ConnectionName' => '<string>',
                    'ConnectionTable' => '<string>',
                    'ConnectionType' => '<string>',
                    'ConnectorName' => '<string>',
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                    'SchemaName' => '<string>',
                ],
                'CatalogDeltaSource' => [
                    'AdditionalDeltaOptions' => ['<string>', ...],
                    'Database' => '<string>',
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                    'Table' => '<string>',
                ],
                'CatalogHudiSource' => [
                    'AdditionalHudiOptions' => ['<string>', ...],
                    'Database' => '<string>',
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                    'Table' => '<string>',
                ],
                'CatalogKafkaSource' => [
                    'DataPreviewOptions' => [
                        'PollingTime' => <integer>,
                        'RecordPollingLimit' => <integer>,
                    ],
                    'Database' => '<string>',
                    'DetectSchema' => true || false,
                    'Name' => '<string>',
                    'StreamingOptions' => [
                        'AddRecordTimestamp' => '<string>',
                        'Assign' => '<string>',
                        'BootstrapServers' => '<string>',
                        'Classification' => '<string>',
                        'ConnectionName' => '<string>',
                        'Delimiter' => '<string>',
                        'EmitConsumerLagMetrics' => '<string>',
                        'EndingOffsets' => '<string>',
                        'IncludeHeaders' => true || false,
                        'MaxOffsetsPerTrigger' => <integer>,
                        'MinPartitions' => <integer>,
                        'NumRetries' => <integer>,
                        'PollTimeoutMs' => <integer>,
                        'RetryIntervalMs' => <integer>,
                        'SecurityProtocol' => '<string>',
                        'StartingOffsets' => '<string>',
                        'StartingTimestamp' => <DateTime>,
                        'SubscribePattern' => '<string>',
                        'TopicName' => '<string>',
                    ],
                    'Table' => '<string>',
                    'WindowSize' => <integer>,
                ],
                'CatalogKinesisSource' => [
                    'DataPreviewOptions' => [
                        'PollingTime' => <integer>,
                        'RecordPollingLimit' => <integer>,
                    ],
                    'Database' => '<string>',
                    'DetectSchema' => true || false,
                    'Name' => '<string>',
                    'StreamingOptions' => [
                        'AddIdleTimeBetweenReads' => true || false,
                        'AddRecordTimestamp' => '<string>',
                        'AvoidEmptyBatches' => true || false,
                        'Classification' => '<string>',
                        'Delimiter' => '<string>',
                        'DescribeShardInterval' => <integer>,
                        'EmitConsumerLagMetrics' => '<string>',
                        'EndpointUrl' => '<string>',
                        'IdleTimeBetweenReadsInMs' => <integer>,
                        'MaxFetchRecordsPerShard' => <integer>,
                        'MaxFetchTimeInMs' => <integer>,
                        'MaxRecordPerRead' => <integer>,
                        'MaxRetryIntervalMs' => <integer>,
                        'NumRetries' => <integer>,
                        'RetryIntervalMs' => <integer>,
                        'RoleArn' => '<string>',
                        'RoleSessionName' => '<string>',
                        'StartingPosition' => 'latest|trim_horizon|earliest|timestamp',
                        'StartingTimestamp' => <DateTime>,
                        'StreamArn' => '<string>',
                        'StreamName' => '<string>',
                    ],
                    'Table' => '<string>',
                    'WindowSize' => <integer>,
                ],
                'CatalogSource' => [
                    'Database' => '<string>',
                    'Name' => '<string>',
                    'Table' => '<string>',
                ],
                'CatalogTarget' => [
                    'Database' => '<string>',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'PartitionKeys' => [
                        ['<string>', ...],
                        // ...
                    ],
                    'Table' => '<string>',
                ],
                'ConnectorDataSource' => [
                    'ConnectionType' => '<string>',
                    'Data' => ['<string>', ...],
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                ],
                'ConnectorDataTarget' => [
                    'ConnectionType' => '<string>',
                    'Data' => ['<string>', ...],
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                ],
                'CustomCode' => [
                    'ClassName' => '<string>',
                    'Code' => '<string>',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                ],
                'DirectJDBCSource' => [
                    'ConnectionName' => '<string>',
                    'ConnectionType' => 'sqlserver|mysql|oracle|postgresql|redshift',
                    'Database' => '<string>',
                    'Name' => '<string>',
                    'RedshiftTmpDir' => '<string>',
                    'Table' => '<string>',
                ],
                'DirectKafkaSource' => [
                    'DataPreviewOptions' => [
                        'PollingTime' => <integer>,
                        'RecordPollingLimit' => <integer>,
                    ],
                    'DetectSchema' => true || false,
                    'Name' => '<string>',
                    'StreamingOptions' => [
                        'AddRecordTimestamp' => '<string>',
                        'Assign' => '<string>',
                        'BootstrapServers' => '<string>',
                        'Classification' => '<string>',
                        'ConnectionName' => '<string>',
                        'Delimiter' => '<string>',
                        'EmitConsumerLagMetrics' => '<string>',
                        'EndingOffsets' => '<string>',
                        'IncludeHeaders' => true || false,
                        'MaxOffsetsPerTrigger' => <integer>,
                        'MinPartitions' => <integer>,
                        'NumRetries' => <integer>,
                        'PollTimeoutMs' => <integer>,
                        'RetryIntervalMs' => <integer>,
                        'SecurityProtocol' => '<string>',
                        'StartingOffsets' => '<string>',
                        'StartingTimestamp' => <DateTime>,
                        'SubscribePattern' => '<string>',
                        'TopicName' => '<string>',
                    ],
                    'WindowSize' => <integer>,
                ],
                'DirectKinesisSource' => [
                    'DataPreviewOptions' => [
                        'PollingTime' => <integer>,
                        'RecordPollingLimit' => <integer>,
                    ],
                    'DetectSchema' => true || false,
                    'Name' => '<string>',
                    'StreamingOptions' => [
                        'AddIdleTimeBetweenReads' => true || false,
                        'AddRecordTimestamp' => '<string>',
                        'AvoidEmptyBatches' => true || false,
                        'Classification' => '<string>',
                        'Delimiter' => '<string>',
                        'DescribeShardInterval' => <integer>,
                        'EmitConsumerLagMetrics' => '<string>',
                        'EndpointUrl' => '<string>',
                        'IdleTimeBetweenReadsInMs' => <integer>,
                        'MaxFetchRecordsPerShard' => <integer>,
                        'MaxFetchTimeInMs' => <integer>,
                        'MaxRecordPerRead' => <integer>,
                        'MaxRetryIntervalMs' => <integer>,
                        'NumRetries' => <integer>,
                        'RetryIntervalMs' => <integer>,
                        'RoleArn' => '<string>',
                        'RoleSessionName' => '<string>',
                        'StartingPosition' => 'latest|trim_horizon|earliest|timestamp',
                        'StartingTimestamp' => <DateTime>,
                        'StreamArn' => '<string>',
                        'StreamName' => '<string>',
                    ],
                    'WindowSize' => <integer>,
                ],
                'DropDuplicates' => [
                    'Columns' => [
                        ['<string>', ...],
                        // ...
                    ],
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                ],
                'DropFields' => [
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'Paths' => [
                        ['<string>', ...],
                        // ...
                    ],
                ],
                'DropNullFields' => [
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'NullCheckBoxList' => [
                        'IsEmpty' => true || false,
                        'IsNegOne' => true || false,
                        'IsNullString' => true || false,
                    ],
                    'NullTextList' => [
                        [
                            'Datatype' => [
                                'Id' => '<string>',
                                'Label' => '<string>',
                            ],
                            'Value' => '<string>',
                        ],
                        // ...
                    ],
                ],
                'DynamicTransform' => [
                    'FunctionName' => '<string>',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                    'Parameters' => [
                        [
                            'IsOptional' => true || false,
                            'ListType' => 'str|int|float|complex|bool|list|null',
                            'Name' => '<string>',
                            'Type' => 'str|int|float|complex|bool|list|null',
                            'ValidationMessage' => '<string>',
                            'ValidationRule' => '<string>',
                            'Value' => ['<string>', ...],
                        ],
                        // ...
                    ],
                    'Path' => '<string>',
                    'TransformName' => '<string>',
                    'Version' => '<string>',
                ],
                'DynamoDBCatalogSource' => [
                    'Database' => '<string>',
                    'Name' => '<string>',
                    'Table' => '<string>',
                ],
                'EvaluateDataQuality' => [
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'Output' => 'PrimaryInput|EvaluationResults',
                    'PublishingOptions' => [
                        'CloudWatchMetricsEnabled' => true || false,
                        'EvaluationContext' => '<string>',
                        'ResultsPublishingEnabled' => true || false,
                        'ResultsS3Prefix' => '<string>',
                    ],
                    'Ruleset' => '<string>',
                    'StopJobOnFailureOptions' => [
                        'StopJobOnFailureTiming' => 'Immediate|AfterDataLoad',
                    ],
                ],
                'EvaluateDataQualityMultiFrame' => [
                    'AdditionalDataSources' => ['<string>', ...],
                    'AdditionalOptions' => ['<string>', ...],
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'PublishingOptions' => [
                        'CloudWatchMetricsEnabled' => true || false,
                        'EvaluationContext' => '<string>',
                        'ResultsPublishingEnabled' => true || false,
                        'ResultsS3Prefix' => '<string>',
                    ],
                    'Ruleset' => '<string>',
                    'StopJobOnFailureOptions' => [
                        'StopJobOnFailureTiming' => 'Immediate|AfterDataLoad',
                    ],
                ],
                'FillMissingValues' => [
                    'FilledPath' => '<string>',
                    'ImputedPath' => '<string>',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                ],
                'Filter' => [
                    'Filters' => [
                        [
                            'Negated' => true || false,
                            'Operation' => 'EQ|LT|GT|LTE|GTE|REGEX|ISNULL',
                            'Values' => [
                                [
                                    'Type' => 'COLUMNEXTRACTED|CONSTANT',
                                    'Value' => ['<string>', ...],
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                    'Inputs' => ['<string>', ...],
                    'LogicalOperator' => 'AND|OR',
                    'Name' => '<string>',
                ],
                'GovernedCatalogSource' => [
                    'AdditionalOptions' => [
                        'BoundedFiles' => <integer>,
                        'BoundedSize' => <integer>,
                    ],
                    'Database' => '<string>',
                    'Name' => '<string>',
                    'PartitionPredicate' => '<string>',
                    'Table' => '<string>',
                ],
                'GovernedCatalogTarget' => [
                    'Database' => '<string>',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'PartitionKeys' => [
                        ['<string>', ...],
                        // ...
                    ],
                    'SchemaChangePolicy' => [
                        'EnableUpdateCatalog' => true || false,
                        'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                    ],
                    'Table' => '<string>',
                ],
                'JDBCConnectorSource' => [
                    'AdditionalOptions' => [
                        'DataTypeMapping' => ['<string>', ...],
                        'FilterPredicate' => '<string>',
                        'JobBookmarkKeys' => ['<string>', ...],
                        'JobBookmarkKeysSortOrder' => '<string>',
                        'LowerBound' => <integer>,
                        'NumPartitions' => <integer>,
                        'PartitionColumn' => '<string>',
                        'UpperBound' => <integer>,
                    ],
                    'ConnectionName' => '<string>',
                    'ConnectionTable' => '<string>',
                    'ConnectionType' => '<string>',
                    'ConnectorName' => '<string>',
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                    'Query' => '<string>',
                ],
                'JDBCConnectorTarget' => [
                    'AdditionalOptions' => ['<string>', ...],
                    'ConnectionName' => '<string>',
                    'ConnectionTable' => '<string>',
                    'ConnectionType' => '<string>',
                    'ConnectorName' => '<string>',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                ],
                'Join' => [
                    'Columns' => [
                        [
                            'From' => '<string>',
                            'Keys' => [
                                ['<string>', ...],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                    'Inputs' => ['<string>', ...],
                    'JoinType' => 'equijoin|left|right|outer|leftsemi|leftanti',
                    'Name' => '<string>',
                ],
                'Merge' => [
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'PrimaryKeys' => [
                        ['<string>', ...],
                        // ...
                    ],
                    'Source' => '<string>',
                ],
                'MicrosoftSQLServerCatalogSource' => [
                    'Database' => '<string>',
                    'Name' => '<string>',
                    'Table' => '<string>',
                ],
                'MicrosoftSQLServerCatalogTarget' => [
                    'Database' => '<string>',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'Table' => '<string>',
                ],
                'MySQLCatalogSource' => [
                    'Database' => '<string>',
                    'Name' => '<string>',
                    'Table' => '<string>',
                ],
                'MySQLCatalogTarget' => [
                    'Database' => '<string>',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'Table' => '<string>',
                ],
                'OracleSQLCatalogSource' => [
                    'Database' => '<string>',
                    'Name' => '<string>',
                    'Table' => '<string>',
                ],
                'OracleSQLCatalogTarget' => [
                    'Database' => '<string>',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'Table' => '<string>',
                ],
                'PIIDetection' => [
                    'EntityTypesToDetect' => ['<string>', ...],
                    'Inputs' => ['<string>', ...],
                    'MaskValue' => '<string>',
                    'Name' => '<string>',
                    'OutputColumnName' => '<string>',
                    'PiiType' => 'RowAudit|RowMasking|ColumnAudit|ColumnMasking',
                    'SampleFraction' => <float>,
                    'ThresholdFraction' => <float>,
                ],
                'PostgreSQLCatalogSource' => [
                    'Database' => '<string>',
                    'Name' => '<string>',
                    'Table' => '<string>',
                ],
                'PostgreSQLCatalogTarget' => [
                    'Database' => '<string>',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'Table' => '<string>',
                ],
                'Recipe' => [
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'RecipeReference' => [
                        'RecipeArn' => '<string>',
                        'RecipeVersion' => '<string>',
                    ],
                    'RecipeSteps' => [
                        [
                            'Action' => [
                                'Operation' => '<string>',
                                'Parameters' => ['<string>', ...],
                            ],
                            'ConditionExpressions' => [
                                [
                                    'Condition' => '<string>',
                                    'TargetColumn' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                ],
                'RedshiftSource' => [
                    'Database' => '<string>',
                    'Name' => '<string>',
                    'RedshiftTmpDir' => '<string>',
                    'Table' => '<string>',
                    'TmpDirIAMRole' => '<string>',
                ],
                'RedshiftTarget' => [
                    'Database' => '<string>',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'RedshiftTmpDir' => '<string>',
                    'Table' => '<string>',
                    'TmpDirIAMRole' => '<string>',
                    'UpsertRedshiftOptions' => [
                        'ConnectionName' => '<string>',
                        'TableLocation' => '<string>',
                        'UpsertKeys' => ['<string>', ...],
                    ],
                ],
                'RelationalCatalogSource' => [
                    'Database' => '<string>',
                    'Name' => '<string>',
                    'Table' => '<string>',
                ],
                'RenameField' => [
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'SourcePath' => ['<string>', ...],
                    'TargetPath' => ['<string>', ...],
                ],
                'S3CatalogDeltaSource' => [
                    'AdditionalDeltaOptions' => ['<string>', ...],
                    'Database' => '<string>',
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                    'Table' => '<string>',
                ],
                'S3CatalogHudiSource' => [
                    'AdditionalHudiOptions' => ['<string>', ...],
                    'Database' => '<string>',
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                    'Table' => '<string>',
                ],
                'S3CatalogSource' => [
                    'AdditionalOptions' => [
                        'BoundedFiles' => <integer>,
                        'BoundedSize' => <integer>,
                    ],
                    'Database' => '<string>',
                    'Name' => '<string>',
                    'PartitionPredicate' => '<string>',
                    'Table' => '<string>',
                ],
                'S3CatalogTarget' => [
                    'Database' => '<string>',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'PartitionKeys' => [
                        ['<string>', ...],
                        // ...
                    ],
                    'SchemaChangePolicy' => [
                        'EnableUpdateCatalog' => true || false,
                        'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                    ],
                    'Table' => '<string>',
                ],
                'S3CsvSource' => [
                    'AdditionalOptions' => [
                        'BoundedFiles' => <integer>,
                        'BoundedSize' => <integer>,
                        'EnableSamplePath' => true || false,
                        'SamplePath' => '<string>',
                    ],
                    'CompressionType' => 'gzip|bzip2',
                    'Escaper' => '<string>',
                    'Exclusions' => ['<string>', ...],
                    'GroupFiles' => '<string>',
                    'GroupSize' => '<string>',
                    'MaxBand' => <integer>,
                    'MaxFilesInBand' => <integer>,
                    'Multiline' => true || false,
                    'Name' => '<string>',
                    'OptimizePerformance' => true || false,
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                    'Paths' => ['<string>', ...],
                    'QuoteChar' => 'quote|quillemet|single_quote|disabled',
                    'Recurse' => true || false,
                    'Separator' => 'comma|ctrla|pipe|semicolon|tab',
                    'SkipFirst' => true || false,
                    'WithHeader' => true || false,
                    'WriteHeader' => true || false,
                ],
                'S3DeltaCatalogTarget' => [
                    'AdditionalOptions' => ['<string>', ...],
                    'Database' => '<string>',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'PartitionKeys' => [
                        ['<string>', ...],
                        // ...
                    ],
                    'SchemaChangePolicy' => [
                        'EnableUpdateCatalog' => true || false,
                        'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                    ],
                    'Table' => '<string>',
                ],
                'S3DeltaDirectTarget' => [
                    'AdditionalOptions' => ['<string>', ...],
                    'Compression' => 'uncompressed|snappy',
                    'Format' => 'json|csv|avro|orc|parquet|hudi|delta',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'PartitionKeys' => [
                        ['<string>', ...],
                        // ...
                    ],
                    'Path' => '<string>',
                    'SchemaChangePolicy' => [
                        'Database' => '<string>',
                        'EnableUpdateCatalog' => true || false,
                        'Table' => '<string>',
                        'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                    ],
                ],
                'S3DeltaSource' => [
                    'AdditionalDeltaOptions' => ['<string>', ...],
                    'AdditionalOptions' => [
                        'BoundedFiles' => <integer>,
                        'BoundedSize' => <integer>,
                        'EnableSamplePath' => true || false,
                        'SamplePath' => '<string>',
                    ],
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                    'Paths' => ['<string>', ...],
                ],
                'S3DirectTarget' => [
                    'Compression' => '<string>',
                    'Format' => 'json|csv|avro|orc|parquet|hudi|delta',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'PartitionKeys' => [
                        ['<string>', ...],
                        // ...
                    ],
                    'Path' => '<string>',
                    'SchemaChangePolicy' => [
                        'Database' => '<string>',
                        'EnableUpdateCatalog' => true || false,
                        'Table' => '<string>',
                        'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                    ],
                ],
                'S3GlueParquetTarget' => [
                    'Compression' => 'snappy|lzo|gzip|uncompressed|none',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'PartitionKeys' => [
                        ['<string>', ...],
                        // ...
                    ],
                    'Path' => '<string>',
                    'SchemaChangePolicy' => [
                        'Database' => '<string>',
                        'EnableUpdateCatalog' => true || false,
                        'Table' => '<string>',
                        'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                    ],
                ],
                'S3HudiCatalogTarget' => [
                    'AdditionalOptions' => ['<string>', ...],
                    'Database' => '<string>',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'PartitionKeys' => [
                        ['<string>', ...],
                        // ...
                    ],
                    'SchemaChangePolicy' => [
                        'EnableUpdateCatalog' => true || false,
                        'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                    ],
                    'Table' => '<string>',
                ],
                'S3HudiDirectTarget' => [
                    'AdditionalOptions' => ['<string>', ...],
                    'Compression' => 'gzip|lzo|uncompressed|snappy',
                    'Format' => 'json|csv|avro|orc|parquet|hudi|delta',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'PartitionKeys' => [
                        ['<string>', ...],
                        // ...
                    ],
                    'Path' => '<string>',
                    'SchemaChangePolicy' => [
                        'Database' => '<string>',
                        'EnableUpdateCatalog' => true || false,
                        'Table' => '<string>',
                        'UpdateBehavior' => 'UPDATE_IN_DATABASE|LOG',
                    ],
                ],
                'S3HudiSource' => [
                    'AdditionalHudiOptions' => ['<string>', ...],
                    'AdditionalOptions' => [
                        'BoundedFiles' => <integer>,
                        'BoundedSize' => <integer>,
                        'EnableSamplePath' => true || false,
                        'SamplePath' => '<string>',
                    ],
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                    'Paths' => ['<string>', ...],
                ],
                'S3JsonSource' => [
                    'AdditionalOptions' => [
                        'BoundedFiles' => <integer>,
                        'BoundedSize' => <integer>,
                        'EnableSamplePath' => true || false,
                        'SamplePath' => '<string>',
                    ],
                    'CompressionType' => 'gzip|bzip2',
                    'Exclusions' => ['<string>', ...],
                    'GroupFiles' => '<string>',
                    'GroupSize' => '<string>',
                    'JsonPath' => '<string>',
                    'MaxBand' => <integer>,
                    'MaxFilesInBand' => <integer>,
                    'Multiline' => true || false,
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                    'Paths' => ['<string>', ...],
                    'Recurse' => true || false,
                ],
                'S3ParquetSource' => [
                    'AdditionalOptions' => [
                        'BoundedFiles' => <integer>,
                        'BoundedSize' => <integer>,
                        'EnableSamplePath' => true || false,
                        'SamplePath' => '<string>',
                    ],
                    'CompressionType' => 'snappy|lzo|gzip|uncompressed|none',
                    'Exclusions' => ['<string>', ...],
                    'GroupFiles' => '<string>',
                    'GroupSize' => '<string>',
                    'MaxBand' => <integer>,
                    'MaxFilesInBand' => <integer>,
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                    'Paths' => ['<string>', ...],
                    'Recurse' => true || false,
                ],
                'SelectFields' => [
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'Paths' => [
                        ['<string>', ...],
                        // ...
                    ],
                ],
                'SelectFromCollection' => [
                    'Index' => <integer>,
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                ],
                'SnowflakeSource' => [
                    'Data' => [
                        'Action' => '<string>',
                        'AdditionalOptions' => ['<string>', ...],
                        'AutoPushdown' => true || false,
                        'Connection' => [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        'Database' => '<string>',
                        'IamRole' => [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        'MergeAction' => '<string>',
                        'MergeClause' => '<string>',
                        'MergeWhenMatched' => '<string>',
                        'MergeWhenNotMatched' => '<string>',
                        'PostAction' => '<string>',
                        'PreAction' => '<string>',
                        'SampleQuery' => '<string>',
                        'Schema' => '<string>',
                        'SelectedColumns' => [
                            [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            // ...
                        ],
                        'SourceType' => '<string>',
                        'StagingTable' => '<string>',
                        'Table' => '<string>',
                        'TableSchema' => [
                            [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            // ...
                        ],
                        'TempDir' => '<string>',
                        'Upsert' => true || false,
                    ],
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                ],
                'SnowflakeTarget' => [
                    'Data' => [
                        'Action' => '<string>',
                        'AdditionalOptions' => ['<string>', ...],
                        'AutoPushdown' => true || false,
                        'Connection' => [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        'Database' => '<string>',
                        'IamRole' => [
                            'Description' => '<string>',
                            'Label' => '<string>',
                            'Value' => '<string>',
                        ],
                        'MergeAction' => '<string>',
                        'MergeClause' => '<string>',
                        'MergeWhenMatched' => '<string>',
                        'MergeWhenNotMatched' => '<string>',
                        'PostAction' => '<string>',
                        'PreAction' => '<string>',
                        'SampleQuery' => '<string>',
                        'Schema' => '<string>',
                        'SelectedColumns' => [
                            [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            // ...
                        ],
                        'SourceType' => '<string>',
                        'StagingTable' => '<string>',
                        'Table' => '<string>',
                        'TableSchema' => [
                            [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            // ...
                        ],
                        'TempDir' => '<string>',
                        'Upsert' => true || false,
                    ],
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                ],
                'SparkConnectorSource' => [
                    'AdditionalOptions' => ['<string>', ...],
                    'ConnectionName' => '<string>',
                    'ConnectionType' => '<string>',
                    'ConnectorName' => '<string>',
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                ],
                'SparkConnectorTarget' => [
                    'AdditionalOptions' => ['<string>', ...],
                    'ConnectionName' => '<string>',
                    'ConnectionType' => '<string>',
                    'ConnectorName' => '<string>',
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                ],
                'SparkSQL' => [
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'OutputSchemas' => [
                        [
                            'Columns' => [
                                [
                                    'Name' => '<string>',
                                    'Type' => '<string>',
                                ],
                                // ...
                            ],
                        ],
                        // ...
                    ],
                    'SqlAliases' => [
                        [
                            'Alias' => '<string>',
                            'From' => '<string>',
                        ],
                        // ...
                    ],
                    'SqlQuery' => '<string>',
                ],
                'Spigot' => [
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'Path' => '<string>',
                    'Prob' => <float>,
                    'Topk' => <integer>,
                ],
                'SplitFields' => [
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'Paths' => [
                        ['<string>', ...],
                        // ...
                    ],
                ],
                'Union' => [
                    'Inputs' => ['<string>', ...],
                    'Name' => '<string>',
                    'UnionType' => 'ALL|DISTINCT',
                ],
            ],
            // ...
        ],
        'Command' => [
            'Name' => '<string>',
            'PythonVersion' => '<string>',
            'Runtime' => '<string>',
            'ScriptLocation' => '<string>',
        ],
        'Connections' => [
            'Connections' => ['<string>', ...],
        ],
        'CreatedOn' => <DateTime>,
        'DefaultArguments' => ['<string>', ...],
        'Description' => '<string>',
        'ExecutionClass' => 'FLEX|STANDARD',
        'ExecutionProperty' => [
            'MaxConcurrentRuns' => <integer>,
        ],
        'GlueVersion' => '<string>',
        'JobMode' => 'SCRIPT|VISUAL|NOTEBOOK',
        'JobRunQueuingEnabled' => true || false,
        'LastModifiedOn' => <DateTime>,
        'LogUri' => '<string>',
        'MaintenanceWindow' => '<string>',
        'MaxCapacity' => <float>,
        'MaxRetries' => <integer>,
        'Name' => '<string>',
        'NonOverridableArguments' => ['<string>', ...],
        'NotificationProperty' => [
            'NotifyDelayAfter' => <integer>,
        ],
        'NumberOfWorkers' => <integer>,
        'ProfileName' => '<string>',
        'Role' => '<string>',
        'SecurityConfiguration' => '<string>',
        'SourceControlDetails' => [
            'AuthStrategy' => 'PERSONAL_ACCESS_TOKEN|AWS_SECRETS_MANAGER',
            'AuthToken' => '<string>',
            'Branch' => '<string>',
            'Folder' => '<string>',
            'LastCommitId' => '<string>',
            'Owner' => '<string>',
            'Provider' => 'GITHUB|GITLAB|BITBUCKET|AWS_CODE_COMMIT',
            'Repository' => '<string>',
        ],
        'Timeout' => <integer>,
        'WorkerType' => 'Standard|G.1X|G.2X|G.025X|G.4X|G.8X|Z.2X',
    ],
]

Result Details

Members
Job
Type: Job structure

The requested job definition.

Errors

InvalidInputException:

The input provided was not valid.

EntityNotFoundException:

A specified entity does not exist.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.
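
The errors listed above surface in the SDK as exceptions derived from Aws\Exception\AwsException, whose getAwsErrorCode() method returns the error name as a string. The following is a minimal sketch of how a caller might branch on those names. The helper classifyGlueError() and its category labels are hypothetical, and treating InternalServiceException and OperationTimeoutException as retryable is an assumption of this sketch rather than documented behavior.

```php
<?php
// Hypothetical helper: maps the error names listed above to a coarse category.
// The "retryable" grouping is an assumption of this sketch.
function classifyGlueError(?string $awsErrorCode): string
{
    switch ($awsErrorCode) {
        case 'EntityNotFoundException':
            return 'not_found';
        case 'InvalidInputException':
            return 'invalid_input';
        case 'InternalServiceException':
        case 'OperationTimeoutException':
            return 'retryable';
        default:
            // Null or any error name not listed for this operation.
            return 'unknown';
    }
}

// Usage with the SDK (requires aws/aws-sdk-php; 'my-job' is a placeholder):
//
// try {
//     $result = $client->getJob(['JobName' => 'my-job']);
// } catch (\Aws\Exception\AwsException $e) {
//     $category = classifyGlueError($e->getAwsErrorCode());
// }
```

Keeping the classification in a plain function means it can be checked without network access or credentials.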

GetJobBookmark

$result = $client->getJobBookmark([/* ... */]);
$promise = $client->getJobBookmarkAsync([/* ... */]);

Returns information on a job bookmark entry.

For more information about enabling and using job bookmarks, see the job bookmark topics in the AWS Glue Developer Guide.

Parameter Syntax

$result = $client->getJobBookmark([
    'JobName' => '<string>', // REQUIRED
    'RunId' => '<string>',
]);

Parameter Details

Members
JobName
Required: Yes
Type: string

The name of the job in question.

RunId
Type: string

The unique run identifier associated with this job run.

Result Syntax

[
    'JobBookmarkEntry' => [
        'Attempt' => <integer>,
        'JobBookmark' => '<string>',
        'JobName' => '<string>',
        'PreviousRunId' => '<string>',
        'Run' => <integer>,
        'RunId' => '<string>',
        'Version' => <integer>,
    ],
]

Result Details

Members
JobBookmarkEntry
Type: JobBookmarkEntry structure

A structure that defines a point from which a job can resume processing.

Errors

EntityNotFoundException:

A specified entity does not exist.

InvalidInputException:

The input provided was not valid.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.

ValidationException:

A value could not be validated.
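
As an illustration of reading the result shape shown above, the sketch below pulls a few fields out of JobBookmarkEntry. The helper summarizeBookmarkEntry() and the keys of the array it returns are hypothetical. It assumes any field may be absent from the response, so every lookup falls back to a default.

```php
<?php
// Hypothetical helper: condenses a getJobBookmark result into a small array.
// Accepts a plain array; an Aws\Result can be passed as $result->toArray().
function summarizeBookmarkEntry(array $result): ?array
{
    $entry = $result['JobBookmarkEntry'] ?? null;
    if (!is_array($entry)) {
        return null; // No bookmark entry in the response.
    }

    return [
        'job'         => $entry['JobName'] ?? null,
        'run'         => $entry['Run'] ?? null,
        'attempt'     => $entry['Attempt'] ?? null,
        'version'     => $entry['Version'] ?? null,
        // The bookmark payload itself is an opaque string in this sketch.
        'hasBookmark' => ($entry['JobBookmark'] ?? '') !== '',
    ];
}

// Usage with the SDK ('my-job' is a placeholder):
//
// $result  = $client->getJobBookmark(['JobName' => 'my-job']);
// $summary = summarizeBookmarkEntry($result->toArray());
```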

GetJobRun

$result = $client->getJobRun([/* ... */]);
$promise = $client->getJobRunAsync([/* ... */]);

Retrieves the metadata for a given job run. Job run history is accessible for 90 days for your workflows and job runs.

Parameter Syntax

$result = $client->getJobRun([
    'JobName' => '<string>', // REQUIRED
    'PredecessorsIncluded' => true || false,
    'RunId' => '<string>', // REQUIRED
]);

Parameter Details

Members
JobName
Required: Yes
Type: string

Name of the job definition being run.

PredecessorsIncluded
Type: boolean

True if a list of predecessor runs should be returned.

RunId
Required: Yes
Type: string

The ID of the job run.

Result Syntax

[
    'JobRun' => [
        'AllocatedCapacity' => <integer>,
        'Arguments' => ['<string>', ...],
        'Attempt' => <integer>,
        'CompletedOn' => <DateTime>,
        'DPUSeconds' => <float>,
        'ErrorMessage' => '<string>',
        'ExecutionClass' => 'FLEX|STANDARD',
        'ExecutionTime' => <integer>,
        'GlueVersion' => '<string>',
        'Id' => '<string>',
        'JobMode' => 'SCRIPT|VISUAL|NOTEBOOK',
        'JobName' => '<string>',
        'JobRunQueuingEnabled' => true || false,
        'JobRunState' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT|ERROR|WAITING|EXPIRED',
        'LastModifiedOn' => <DateTime>,
        'LogGroupName' => '<string>',
        'MaintenanceWindow' => '<string>',
        'MaxCapacity' => <float>,
        'NotificationProperty' => [
            'NotifyDelayAfter' => <integer>,
        ],
        'NumberOfWorkers' => <integer>,
        'PredecessorRuns' => [
            [
                'JobName' => '<string>',
                'RunId' => '<string>',
            ],
            // ...
        ],
        'PreviousRunId' => '<string>',
        'ProfileName' => '<string>',
        'SecurityConfiguration' => '<string>',
        'StartedOn' => <DateTime>,
        'StateDetail' => '<string>',
        'Timeout' => <integer>,
        'TriggerName' => '<string>',
        'WorkerType' => 'Standard|G.1X|G.2X|G.025X|G.4X|G.8X|Z.2X',
    ],
]

Result Details

Members
JobRun
Type: JobRun structure

The requested job-run metadata.

Errors

InvalidInputException:

The input provided was not valid.

EntityNotFoundException:

A specified entity does not exist.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.
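
A common use of GetJobRun is polling until a run finishes. The sketch below shows one way to do that against the JobRunState values listed in the result syntax. The helpers isTerminalJobRunState() and waitForJobRun() are hypothetical, and the choice of which states count as terminal (STOPPED, SUCCEEDED, FAILED, TIMEOUT, ERROR, EXPIRED) is an assumption of this sketch. The fetch step is passed in as a callable so the loop can be exercised without the SDK.

```php
<?php
// Assumption: these JobRunState values mean the run will not change further.
function isTerminalJobRunState(string $state): bool
{
    return in_array(
        $state,
        ['STOPPED', 'SUCCEEDED', 'FAILED', 'TIMEOUT', 'ERROR', 'EXPIRED'],
        true
    );
}

// Calls $fetchJobRun() up to $maxPolls times, sleeping between attempts.
// Returns the JobRun array once it is terminal, or null if polls run out.
function waitForJobRun(callable $fetchJobRun, int $maxPolls = 60, int $sleepSeconds = 30): ?array
{
    for ($i = 0; $i < $maxPolls; $i++) {
        $jobRun = $fetchJobRun();
        if (isTerminalJobRunState($jobRun['JobRunState'] ?? '')) {
            return $jobRun;
        }
        // Skip the sleep after the final attempt.
        if ($sleepSeconds > 0 && $i < $maxPolls - 1) {
            sleep($sleepSeconds);
        }
    }
    return null;
}

// Usage with the SDK ('my-job' and 'jr_123' are placeholders):
//
// $jobRun = waitForJobRun(function () use ($client) {
//     return $client->getJobRun([
//         'JobName' => 'my-job',
//         'RunId'   => 'jr_123',
//     ])['JobRun'];
// });
```

Bounding the number of polls keeps a stuck run from blocking the caller indefinitely. The default interval is arbitrary and should be tuned to the expected job duration.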

GetJobRuns

$result = $client->getJobRuns([/* ... */]);
$promise = $client->getJobRunsAsync([/* ... */]);

Retrieves metadata for all runs of a given job definition.

Parameter Syntax

$result = $client->getJobRuns([
    'JobName' => '<string>', // REQUIRED
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
]);

Parameter Details

Members
JobName
Required: Yes
Type: string

The name of the job definition for which to retrieve all job runs.

MaxResults
Type: int

The maximum size of the response.

NextToken
Type: string

A continuation token, if this is a continuation call.

Result Syntax

[
    'JobRuns' => [
        [
            'AllocatedCapacity' => <integer>,
            'Arguments' => ['<string>', ...],
            'Attempt' => <integer>,
            'CompletedOn' => <DateTime>,
            'DPUSeconds' => <float>,
            'ErrorMessage' => '<string>',
            'ExecutionClass' => 'FLEX|STANDARD',
            'ExecutionTime' => <integer>,
            'GlueVersion' => '<string>',
            'Id' => '<string>',
            'JobMode' => 'SCRIPT|VISUAL|NOTEBOOK',
            'JobName' => '<string>',
            'JobRunQueuingEnabled' => true || false,
            'JobRunState' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT|ERROR|WAITING|EXPIRED',
            'LastModifiedOn' => <DateTime>,
            'LogGroupName' => '<string>',
            'MaintenanceWindow' => '<string>',
            'MaxCapacity' => <float>,
            'NotificationProperty' => [
                'NotifyDelayAfter' => <integer>,
            ],
            'NumberOfWorkers' => <integer>,
            'PredecessorRuns' => [
                [
                    'JobName' => '<string>',
                    'RunId' => '<string>',
                ],
                // ...
            ],
            'PreviousRunId' => '<string>',
            'ProfileName' => '<string>',
            'SecurityConfiguration' => '<string>',
            'StartedOn' => <DateTime>,
            'StateDetail' => '<string>',
            'Timeout' => <integer>,
            'TriggerName' => '<string>',
            'WorkerType' => 'Standard|G.1X|G.2X|G.025X|G.4X|G.8X|Z.2X',
        ],
        // ...
    ],
    'NextToken' => '<string>',
]

Result Details

Members
JobRuns
Type: Array of JobRun structures

A list of job-run metadata objects.

NextToken
Type: string

A continuation token, if not all requested job runs have been returned.

Errors

InvalidInputException:

The input provided was not valid.

EntityNotFoundException:

A specified entity does not exist.

InternalServiceException:

An internal service error occurred.

OperationTimeoutException:

The operation timed out.
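
Because GetJobRuns returns a NextToken when more results remain, retrieving every run means repeating the call with that token until none is returned. The following is a minimal sketch of that loop. The helper collectJobRuns() is hypothetical, and the page fetch is injected as a callable so the logic does not depend on a live client.

```php
<?php
// Hypothetical helper: follows NextToken until the service stops returning one.
// $fetchPage receives the request parameters and returns one page of results
// (an array or any ArrayAccess object shaped like the result syntax above).
function collectJobRuns(callable $fetchPage, string $jobName, int $pageSize = 50): array
{
    $runs = [];
    $nextToken = null;

    do {
        $params = ['JobName' => $jobName, 'MaxResults' => $pageSize];
        if ($nextToken !== null) {
            $params['NextToken'] = $nextToken;
        }

        $page = $fetchPage($params);
        foreach ($page['JobRuns'] ?? [] as $run) {
            $runs[] = $run;
        }

        $nextToken = $page['NextToken'] ?? null;
    } while ($nextToken !== null && $nextToken !== '');

    return $runs;
}

// Usage with the SDK ('my-job' is a placeholder):
//
// $runs = collectJobRuns(function (array $params) use ($client) {
//     return $client->getJobRuns($params);
// }, 'my-job');
```

The same loop shape applies to any operation in this reference that pairs a NextToken parameter with a NextToken result member.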

GetJobs

$result = $client->getJobs([/* ... */]);
$promise = $client->getJobsAsync([/* ... */]);

Retrieves all current job definitions.

Parameter Syntax

$result = $client->getJobs([
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
]);

Parameter Details

Members
MaxResults
Type: int

The maximum size of the response.

NextToken
Type: string

A continuation token, if this is a continuation call.

Result Syntax

[
    'Jobs' => [
        [
            'AllocatedCapacity' => <integer>,
            'CodeGenConfigurationNodes' => [
                '<NodeId>' => [
                    'Aggregate' => [
                        'Aggs' => [
                            [
                                'AggFunc' => 'avg|countDistinct|count|first|last|kurtosis|max|min|skewness|stddev_samp|stddev_pop|sum|sumDistinct|var_samp|var_pop',
                                'Column' => ['<string>', ...],
                            ],
                            // ...
                        ],
                        'Groups' => [
                            ['<string>', ...],
                            // ...
                        ],
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                    ],
                    'AmazonRedshiftSource' => [
                        'Data' => [
                            'AccessType' => '<string>',
                            'Action' => '<string>',
                            'AdvancedOptions' => [
                                [
                                    'Key' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                            'CatalogDatabase' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'CatalogRedshiftSchema' => '<string>',
                            'CatalogRedshiftTable' => '<string>',
                            'CatalogTable' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'Connection' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'CrawlerConnection' => '<string>',
                            'IamRole' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'MergeAction' => '<string>',
                            'MergeClause' => '<string>',
                            'MergeWhenMatched' => '<string>',
                            'MergeWhenNotMatched' => '<string>',
                            'PostAction' => '<string>',
                            'PreAction' => '<string>',
                            'SampleQuery' => '<string>',
                            'Schema' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'SelectedColumns' => [
                                [
                                    'Description' => '<string>',
                                    'Label' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                            'SourceType' => '<string>',
                            'StagingTable' => '<string>',
                            'Table' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'TablePrefix' => '<string>',
                            'TableSchema' => [
                                [
                                    'Description' => '<string>',
                                    'Label' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                            'TempDir' => '<string>',
                            'Upsert' => true || false,
                        ],
                        'Name' => '<string>',
                    ],
                    'AmazonRedshiftTarget' => [
                        'Data' => [
                            'AccessType' => '<string>',
                            'Action' => '<string>',
                            'AdvancedOptions' => [
                                [
                                    'Key' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                            'CatalogDatabase' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'CatalogRedshiftSchema' => '<string>',
                            'CatalogRedshiftTable' => '<string>',
                            'CatalogTable' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'Connection' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'CrawlerConnection' => '<string>',
                            'IamRole' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'MergeAction' => '<string>',
                            'MergeClause' => '<string>',
                            'MergeWhenMatched' => '<string>',
                            'MergeWhenNotMatched' => '<string>',
                            'PostAction' => '<string>',
                            'PreAction' => '<string>',
                            'SampleQuery' => '<string>',
                            'Schema' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'SelectedColumns' => [
                                [
                                    'Description' => '<string>',
                                    'Label' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                            'SourceType' => '<string>',
                            'StagingTable' => '<string>',
                            'Table' => [
                                'Description' => '<string>',
                                'Label' => '<string>',
                                'Value' => '<string>',
                            ],
                            'TablePrefix' => '<string>',
                            'TableSchema' => [
                                [
                                    'Description' => '<string>',
                                    'Label' => '<string>',
                                    'Value' => '<string>',
                                ],
                                // ...
                            ],
                            'TempDir' => '<string>',
                            'Upsert' => true || false,
                        ],
                        'Inputs' => ['<string>', ...],
                        'Name' => '<string>',
                    ],
                    'ApplyMapping' => [
                        'Inputs' => ['<string>', ...],
                        'Mapping' => [
                            [
                                'Children' => [...], // RECURSIVE
                                'Dropped' => true || false,
                                'FromPath' => ['<string>', ...],
                                'FromType' => '<string>',
                                'ToKey' => '<string>',
                                'ToType' => '<string>',
                            ],
                            // ...
                        ],
                        'Name' => '<string>',
                    ],
                    'AthenaConnectorSource' => [
                        'ConnectionName' => '<string>',
                        'ConnectionTable' => '<string>',
                        'ConnectionType' => '<string>',
                        'ConnectorName' => '<string>',
                        'Name' => '<string>',
                        'OutputSchemas' => [
                            [
                                'Columns' => [
                                    [