AWS Glue 2017-03-31
- Client: Aws\Glue\GlueClient
- Service ID: glue
- Version: 2017-03-31
This page describes the parameters and results for the operations of AWS Glue (2017-03-31) and shows how to use the Aws\Glue\GlueClient object to call them. This documentation is specific to the 2017-03-31 API version of the service.
Operation Summary
Each of the following operations can be created from a client using $client->getCommand('CommandName'), where "CommandName" is the name of one of the following operations. Note: a command is a value that encapsulates an operation and the parameters used to create an HTTP request.
You can also create and send a command immediately using the magic methods available on a client object: $client->commandName(/* parameters */).
You can send the command asynchronously (returning a promise) by appending the word "Async" to the operation name: $client->commandNameAsync(/* parameters */).
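The three calling styles can be compared side by side. The following is a minimal sketch, not a definitive implementation: the helper name databaseNamesThreeWays is hypothetical, GetDatabases is used only as a convenient operation from the list below, and the 'DatabaseList' and 'Name' result keys are assumed from that operation's result shape, which this section does not show.

```php
<?php
// Sketch only. In a real script you would load the SDK and build the client:
//   require 'vendor/autoload.php';
//   $client = new Aws\Glue\GlueClient(['region' => 'us-east-1', 'version' => '2017-03-31']);
// 'us-east-1' is a placeholder region.

/**
 * Calls GetDatabases in the three styles described above and returns the
 * database names seen by each style. (Hypothetical helper.)
 */
function databaseNamesThreeWays($client, array $params = []): array
{
    // 1. Build a command object first, then execute it.
    $command = $client->getCommand('GetDatabases', $params);
    $viaCommand = $client->execute($command);

    // 2. Magic method: create and send the command in one call.
    $viaMagic = $client->getDatabases($params);

    // 3. Async variant: returns a promise; wait() blocks until it resolves.
    $viaAsync = $client->getDatabasesAsync($params)->wait();

    // 'DatabaseList' and 'Name' are assumed result keys for GetDatabases.
    $names = function ($result): array {
        return array_column($result['DatabaseList'] ?? [], 'Name');
    };

    return [
        'command' => $names($viaCommand),
        'magic' => $names($viaMagic),
        'async' => $names($viaAsync),
    ];
}
```

All three styles send the same request, so in practice you would pick one. The async form is mainly useful when several requests should be in flight at once.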
- BatchCreatePartition ( array $params = [] )
Creates one or more partitions in a batch operation.
- BatchDeleteConnection ( array $params = [] )
Deletes a list of connection definitions from the Data Catalog.
- BatchDeletePartition ( array $params = [] )
Deletes one or more partitions in a batch operation.
- BatchDeleteTable ( array $params = [] )
Deletes multiple tables at once.
- BatchDeleteTableVersion ( array $params = [] )
Deletes a specified batch of versions of a table.
- BatchGetCrawlers ( array $params = [] )
Returns a list of resource metadata for a given list of crawler names.
- BatchGetDevEndpoints ( array $params = [] )
Returns a list of resource metadata for a given list of development endpoint names.
- BatchGetJobs ( array $params = [] )
Returns a list of resource metadata for a given list of job names.
- BatchGetPartition ( array $params = [] )
Retrieves partitions in a batch request.
- BatchGetTriggers ( array $params = [] )
Returns a list of resource metadata for a given list of trigger names.
- BatchGetWorkflows ( array $params = [] )
Returns a list of resource metadata for a given list of workflow names.
- BatchStopJobRun ( array $params = [] )
Stops one or more job runs for a specified job definition.
- BatchUpdatePartition ( array $params = [] )
Updates one or more partitions in a batch operation.
- CancelMLTaskRun ( array $params = [] )
Cancels (stops) a task run.
- CheckSchemaVersionValidity ( array $params = [] )
Validates the supplied schema.
- CreateClassifier ( array $params = [] )
Creates a classifier in the user's account.
- CreateConnection ( array $params = [] )
Creates a connection definition in the Data Catalog.
- CreateCrawler ( array $params = [] )
Creates a new crawler with specified targets, role, configuration, and optional schedule.
- CreateDatabase ( array $params = [] )
Creates a new database in a Data Catalog.
- CreateDevEndpoint ( array $params = [] )
Creates a new development endpoint.
- CreateJob ( array $params = [] )
Creates a new job definition.
- CreateMLTransform ( array $params = [] )
Creates an AWS Glue machine learning transform.
- CreatePartition ( array $params = [] )
Creates a new partition.
- CreatePartitionIndex ( array $params = [] )
Creates a specified partition index in an existing table.
- CreateRegistry ( array $params = [] )
Creates a new registry which may be used to hold a collection of schemas.
- CreateSchema ( array $params = [] )
Creates a new schema set and registers the schema definition.
- CreateScript ( array $params = [] )
Transforms a directed acyclic graph (DAG) into code.
- CreateSecurityConfiguration ( array $params = [] )
Creates a new security configuration.
- CreateTable ( array $params = [] )
Creates a new table definition in the Data Catalog.
- CreateTrigger ( array $params = [] )
Creates a new trigger.
- CreateUserDefinedFunction ( array $params = [] )
Creates a new function definition in the Data Catalog.
- CreateWorkflow ( array $params = [] )
Creates a new workflow.
- DeleteClassifier ( array $params = [] )
Removes a classifier from the Data Catalog.
- DeleteColumnStatisticsForPartition ( array $params = [] )
Deletes the partition column statistics of a column.
- DeleteColumnStatisticsForTable ( array $params = [] )
Deletes table statistics of columns.
- DeleteConnection ( array $params = [] )
Deletes a connection from the Data Catalog.
- DeleteCrawler ( array $params = [] )
Removes a specified crawler from the AWS Glue Data Catalog, unless the crawler state is RUNNING.
- DeleteDatabase ( array $params = [] )
Removes a specified database from a Data Catalog.
- DeleteDevEndpoint ( array $params = [] )
Deletes a specified development endpoint.
- DeleteJob ( array $params = [] )
Deletes a specified job definition.
- DeleteMLTransform ( array $params = [] )
Deletes an AWS Glue machine learning transform.
- DeletePartition ( array $params = [] )
Deletes a specified partition.
- DeletePartitionIndex ( array $params = [] )
Deletes a specified partition index from an existing table.
- DeleteRegistry ( array $params = [] )
Deletes the entire registry, including its schemas and all of their versions.
- DeleteResourcePolicy ( array $params = [] )
Deletes a specified policy.
- DeleteSchema ( array $params = [] )
Deletes the entire schema set, including all of its versions.
- DeleteSchemaVersions ( array $params = [] )
Removes versions from the specified schema.
- DeleteSecurityConfiguration ( array $params = [] )
Deletes a specified security configuration.
- DeleteTable ( array $params = [] )
Removes a table definition from the Data Catalog.
- DeleteTableVersion ( array $params = [] )
Deletes a specified version of a table.
- DeleteTrigger ( array $params = [] )
Deletes a specified trigger.
- DeleteUserDefinedFunction ( array $params = [] )
Deletes an existing function definition from the Data Catalog.
- DeleteWorkflow ( array $params = [] )
Deletes a workflow.
- GetCatalogImportStatus ( array $params = [] )
Retrieves the status of a migration operation.
- GetClassifier ( array $params = [] )
Retrieves a classifier by name.
- GetClassifiers ( array $params = [] )
Lists all classifier objects in the Data Catalog.
- GetColumnStatisticsForPartition ( array $params = [] )
Retrieves partition statistics of columns.
- GetColumnStatisticsForTable ( array $params = [] )
Retrieves table statistics of columns.
- GetConnection ( array $params = [] )
Retrieves a connection definition from the Data Catalog.
- GetConnections ( array $params = [] )
Retrieves a list of connection definitions from the Data Catalog.
- GetCrawler ( array $params = [] )
Retrieves metadata for a specified crawler.
- GetCrawlerMetrics ( array $params = [] )
Retrieves metrics about specified crawlers.
- GetCrawlers ( array $params = [] )
Retrieves metadata for all crawlers defined in the customer account.
- GetDataCatalogEncryptionSettings ( array $params = [] )
Retrieves the security configuration for a specified catalog.
- GetDatabase ( array $params = [] )
Retrieves the definition of a specified database.
- GetDatabases ( array $params = [] )
Retrieves all databases defined in a given Data Catalog.
- GetDataflowGraph ( array $params = [] )
Transforms a Python script into a directed acyclic graph (DAG).
- GetDevEndpoint ( array $params = [] )
Retrieves information about a specified development endpoint.
- GetDevEndpoints ( array $params = [] )
Retrieves all the development endpoints in this AWS account.
- GetJob ( array $params = [] )
Retrieves an existing job definition.
- GetJobBookmark ( array $params = [] )
Returns information on a job bookmark entry.
- GetJobRun ( array $params = [] )
Retrieves the metadata for a given job run.
- GetJobRuns ( array $params = [] )
Retrieves metadata for all runs of a given job definition.
- GetJobs ( array $params = [] )
Retrieves all current job definitions.
- GetMLTaskRun ( array $params = [] )
Gets details for a specific task run on a machine learning transform.
- GetMLTaskRuns ( array $params = [] )
Gets a list of runs for a machine learning transform.
- GetMLTransform ( array $params = [] )
Gets an AWS Glue machine learning transform artifact and all its corresponding metadata.
- GetMLTransforms ( array $params = [] )
Gets a sortable, filterable list of existing AWS Glue machine learning transforms.
- GetMapping ( array $params = [] )
Creates mappings.
- GetPartition ( array $params = [] )
Retrieves information about a specified partition.
- GetPartitionIndexes ( array $params = [] )
Retrieves the partition indexes associated with a table.
- GetPartitions ( array $params = [] )
Retrieves information about the partitions in a table.
- GetPlan ( array $params = [] )
Gets code to perform a specified mapping.
- GetRegistry ( array $params = [] )
Describes the specified registry in detail.
- GetResourcePolicies ( array $params = [] )
Retrieves the resource policies set on individual resources by AWS Resource Access Manager during cross-account permission grants.
- GetResourcePolicy ( array $params = [] )
Retrieves a specified resource policy.
- GetSchema ( array $params = [] )
Describes the specified schema in detail.
- GetSchemaByDefinition ( array $params = [] )
Retrieves a schema by the SchemaDefinition.
- GetSchemaVersion ( array $params = [] )
Gets the specified schema by the unique ID assigned when a version of the schema is created or registered.
- GetSchemaVersionsDiff ( array $params = [] )
Fetches the schema version difference in the specified difference type between two stored schema versions in the Schema Registry.
- GetSecurityConfiguration ( array $params = [] )
Retrieves a specified security configuration.
- GetSecurityConfigurations ( array $params = [] )
Retrieves a list of all security configurations.
- GetTable ( array $params = [] )
Retrieves the Table definition in a Data Catalog for a specified table.
- GetTableVersion ( array $params = [] )
Retrieves a specified version of a table.
- GetTableVersions ( array $params = [] )
Retrieves a list of strings that identify available versions of a specified table.
- GetTables ( array $params = [] )
Retrieves the definitions of some or all of the tables in a given Database.
- GetTags ( array $params = [] )
Retrieves a list of tags associated with a resource.
- GetTrigger ( array $params = [] )
Retrieves the definition of a trigger.
- GetTriggers ( array $params = [] )
Gets all the triggers associated with a job.
- GetUserDefinedFunction ( array $params = [] )
Retrieves a specified function definition from the Data Catalog.
- GetUserDefinedFunctions ( array $params = [] )
Retrieves multiple function definitions from the Data Catalog.
- GetWorkflow ( array $params = [] )
Retrieves resource metadata for a workflow.
- GetWorkflowRun ( array $params = [] )
Retrieves the metadata for a given workflow run.
- GetWorkflowRunProperties ( array $params = [] )
Retrieves the workflow run properties which were set during the run.
- GetWorkflowRuns ( array $params = [] )
Retrieves metadata for all runs of a given workflow.
- ImportCatalogToGlue ( array $params = [] )
Imports an existing Amazon Athena Data Catalog to AWS Glue.
- ListCrawlers ( array $params = [] )
Retrieves the names of all crawler resources in this AWS account, or the resources with the specified tag.
- ListDevEndpoints ( array $params = [] )
Retrieves the names of all DevEndpoint resources in this AWS account, or the resources with the specified tag.
- ListJobs ( array $params = [] )
Retrieves the names of all job resources in this AWS account, or the resources with the specified tag.
- ListMLTransforms ( array $params = [] )
Retrieves a sortable, filterable list of existing AWS Glue machine learning transforms in this AWS account, or the resources with the specified tag.
- ListRegistries ( array $params = [] )
Returns a list of registries that you have created, with minimal registry information.
- ListSchemaVersions ( array $params = [] )
Returns a list of schema versions that you have created, with minimal information.
- ListSchemas ( array $params = [] )
Returns a list of schemas with minimal details.
- ListTriggers ( array $params = [] )
Retrieves the names of all trigger resources in this AWS account, or the resources with the specified tag.
- ListWorkflows ( array $params = [] )
Lists names of workflows created in the account.
- PutDataCatalogEncryptionSettings ( array $params = [] )
Sets the security configuration for a specified catalog.
- PutResourcePolicy ( array $params = [] )
Sets the Data Catalog resource policy for access control.
- PutSchemaVersionMetadata ( array $params = [] )
Puts the metadata key value pair for a specified schema version ID.
- PutWorkflowRunProperties ( array $params = [] )
Puts the specified workflow run properties for the given workflow run.
- QuerySchemaVersionMetadata ( array $params = [] )
Queries for the schema version metadata information.
- RegisterSchemaVersion ( array $params = [] )
Adds a new version to the existing schema.
- RemoveSchemaVersionMetadata ( array $params = [] )
Removes a key value pair from the schema version metadata for the specified schema version ID.
- ResetJobBookmark ( array $params = [] )
Resets a bookmark entry.
- ResumeWorkflowRun ( array $params = [] )
Restarts selected nodes of a previous partially completed workflow run and resumes the workflow run.
- SearchTables ( array $params = [] )
Searches a set of tables based on properties in the table metadata as well as on the parent database.
- StartCrawler ( array $params = [] )
Starts a crawl using the specified crawler, regardless of what is scheduled.
- StartCrawlerSchedule ( array $params = [] )
Changes the schedule state of the specified crawler to SCHEDULED, unless the crawler is already running or the schedule state is already SCHEDULED.
- StartExportLabelsTaskRun ( array $params = [] )
Begins an asynchronous task to export all labeled data for a particular transform.
- StartImportLabelsTaskRun ( array $params = [] )
Enables you to provide additional labels (examples of truth) to be used to teach the machine learning transform and improve its quality.
- StartJobRun ( array $params = [] )
Starts a job run using a job definition.
- StartMLEvaluationTaskRun ( array $params = [] )
Starts a task to estimate the quality of the transform.
- StartMLLabelingSetGenerationTaskRun ( array $params = [] )
Starts the active learning workflow for your machine learning transform to improve the transform's quality by generating label sets and adding labels.
- StartTrigger ( array $params = [] )
Starts an existing trigger.
- StartWorkflowRun ( array $params = [] )
Starts a new run of the specified workflow.
- StopCrawler ( array $params = [] )
If the specified crawler is running, stops the crawl.
- StopCrawlerSchedule ( array $params = [] )
Sets the schedule state of the specified crawler to NOT_SCHEDULED, but does not stop the crawler if it is already running.
- StopTrigger ( array $params = [] )
Stops a specified trigger.
- StopWorkflowRun ( array $params = [] )
Stops the execution of the specified workflow run.
- TagResource ( array $params = [] )
Adds tags to a resource.
- UntagResource ( array $params = [] )
Removes tags from a resource.
- UpdateClassifier ( array $params = [] )
Modifies an existing classifier (a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field is present).
- UpdateColumnStatisticsForPartition ( array $params = [] )
Creates or updates partition statistics of columns.
- UpdateColumnStatisticsForTable ( array $params = [] )
Creates or updates table statistics of columns.
- UpdateConnection ( array $params = [] )
Updates a connection definition in the Data Catalog.
- UpdateCrawler ( array $params = [] )
Updates a crawler.
- UpdateCrawlerSchedule ( array $params = [] )
Updates the schedule of a crawler using a cron expression.
- UpdateDatabase ( array $params = [] )
Updates an existing database definition in a Data Catalog.
- UpdateDevEndpoint ( array $params = [] )
Updates a specified development endpoint.
- UpdateJob ( array $params = [] )
Updates an existing job definition.
- UpdateMLTransform ( array $params = [] )
Updates an existing machine learning transform.
- UpdatePartition ( array $params = [] )
Updates a partition.
- UpdateRegistry ( array $params = [] )
Updates an existing registry which is used to hold a collection of schemas.
- UpdateSchema ( array $params = [] )
Updates the description, compatibility setting, or version checkpoint for a schema set.
- UpdateTable ( array $params = [] )
Updates a metadata table in the Data Catalog.
- UpdateTrigger ( array $params = [] )
Updates a trigger definition.
- UpdateUserDefinedFunction ( array $params = [] )
Updates an existing function definition in the Data Catalog.
- UpdateWorkflow ( array $params = [] )
Updates an existing workflow.
Paginators
Paginators automatically iterate over paginated API results. Each paginator is associated with a specific API operation and accepts the same parameters as that operation. You can get a paginator from a client using getPaginator($paginatorName, $operationParameters). This client supports the following paginators:
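As an illustration of getPaginator, the sketch below walks every page of GetTables for one database and collects table names. The helper name allTableNames is hypothetical, and the 'TableList' and 'Name' result keys are assumed from the GetTables result shape, which this section does not show.

```php
<?php
/**
 * Collects the names of all tables in one database by walking every page.
 * $client is expected to be an Aws\Glue\GlueClient. (Hypothetical helper.)
 */
function allTableNames($client, string $databaseName): array
{
    $names = [];

    // getPaginator() takes the operation name and that operation's parameters.
    $pages = $client->getPaginator('GetTables', ['DatabaseName' => $databaseName]);

    // Each iteration yields one page of results; the paginator requests the
    // next page as needed.
    foreach ($pages as $page) {
        // 'TableList' and 'Name' are assumed from the GetTables result shape.
        foreach ($page['TableList'] ?? [] as $table) {
            $names[] = $table['Name'];
        }
    }

    return $names;
}
```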
- GetClassifiers
- GetConnections
- GetCrawlerMetrics
- GetCrawlers
- GetDatabases
- GetDevEndpoints
- GetJobRuns
- GetJobs
- GetMLTaskRuns
- GetMLTransforms
- GetPartitionIndexes
- GetPartitions
- GetResourcePolicies
- GetSecurityConfigurations
- GetTableVersions
- GetTables
- GetTriggers
- GetUserDefinedFunctions
- GetWorkflowRuns
- ListCrawlers
- ListDevEndpoints
- ListJobs
- ListMLTransforms
- ListRegistries
- ListSchemaVersions
- ListSchemas
- ListTriggers
- ListWorkflows
- SearchTables
Operations
BatchCreatePartition
$result = $client->batchCreatePartition([/* ... */]);
$promise = $client->batchCreatePartitionAsync([/* ... */]);
Creates one or more partitions in a batch operation.
Parameter Syntax
$result = $client->batchCreatePartition([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionInputList' => [ // REQUIRED
        [
            'LastAccessTime' => <integer || string || DateTime>,
            'LastAnalyzedTime' => <integer || string || DateTime>,
            'Parameters' => ['<string>', ...],
            'StorageDescriptor' => [
                'BucketColumns' => ['<string>', ...],
                'Columns' => [
                    [
                        'Comment' => '<string>',
                        'Name' => '<string>', // REQUIRED
                        'Parameters' => ['<string>', ...],
                        'Type' => '<string>',
                    ],
                    // ...
                ],
                'Compressed' => true || false,
                'InputFormat' => '<string>',
                'Location' => '<string>',
                'NumberOfBuckets' => <integer>,
                'OutputFormat' => '<string>',
                'Parameters' => ['<string>', ...],
                'SchemaReference' => [
                    'SchemaId' => [
                        'RegistryName' => '<string>',
                        'SchemaArn' => '<string>',
                        'SchemaName' => '<string>',
                    ],
                    'SchemaVersionId' => '<string>',
                    'SchemaVersionNumber' => <integer>,
                ],
                'SerdeInfo' => [
                    'Name' => '<string>',
                    'Parameters' => ['<string>', ...],
                    'SerializationLibrary' => '<string>',
                ],
                'SkewedInfo' => [
                    'SkewedColumnNames' => ['<string>', ...],
                    'SkewedColumnValueLocationMaps' => ['<string>', ...],
                    'SkewedColumnValues' => ['<string>', ...],
                ],
                'SortColumns' => [
                    [
                        'Column' => '<string>', // REQUIRED
                        'SortOrder' => <integer>, // REQUIRED
                    ],
                    // ...
                ],
                'StoredAsSubDirectories' => true || false,
            ],
            'Values' => ['<string>', ...],
        ],
        // ...
    ],
    'TableName' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the catalog in which the partition is to be created. Currently, this should be the AWS account ID.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the metadata database in which the partition is to be created.
- PartitionInputList
-
- Required: Yes
- Type: Array of PartitionInput structures
A list of PartitionInput structures that define the partitions to be created.
- TableName
-
- Required: Yes
- Type: string
The name of the metadata table in which the partition is to be created.
Result Syntax
[
    'Errors' => [
        [
            'ErrorDetail' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
            'PartitionValues' => ['<string>', ...],
        ],
        // ...
    ],
]
Result Details
Members
- Errors
-
- Type: Array of PartitionError structures
The errors encountered when trying to create the requested partitions.
Errors
-
The input provided was not valid.
-
A resource to be created or added already exists.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
An internal service error occurred.
-
A specified entity does not exist.
-
The operation timed out.
-
An encryption operation failed.
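A request for many partitions can be assembled from simpler data and the 'Errors' member can be flattened for logging. The sketch below assumes hypothetical helper names (buildBatchCreatePartitionRequests, describePartitionErrors) and a hypothetical input shape of ['values' => ..., 'location' => ...]. The chunk size is a caller-supplied guess, since this section does not state a per-request limit.

```php
<?php
/**
 * Builds BatchCreatePartition request arrays, split into chunks.
 * $partitions is a list of ['values' => array, 'location' => string]
 * (a hypothetical input shape chosen for this sketch).
 * $chunkSize is an assumption, not a limit documented in this section.
 */
function buildBatchCreatePartitionRequests(
    string $database,
    string $table,
    array $partitions,
    int $chunkSize = 100
): array {
    $requests = [];
    foreach (array_chunk($partitions, max(1, $chunkSize)) as $chunk) {
        $inputs = [];
        foreach ($chunk as $p) {
            $inputs[] = [
                // 'Values' is a list of strings, so cast each value.
                'Values' => array_map('strval', $p['values']),
                'StorageDescriptor' => ['Location' => $p['location']],
            ];
        }
        $requests[] = [
            'DatabaseName' => $database,
            'TableName' => $table,
            'PartitionInputList' => $inputs,
        ];
    }
    return $requests;
}

/** Flattens the 'Errors' member of a result into readable strings. */
function describePartitionErrors($result): array
{
    $lines = [];
    foreach ($result['Errors'] ?? [] as $error) {
        $lines[] = sprintf(
            '%s: %s (%s)',
            implode('/', $error['PartitionValues'] ?? []),
            $error['ErrorDetail']['ErrorMessage'] ?? 'unknown error',
            $error['ErrorDetail']['ErrorCode'] ?? 'n/a'
        );
    }
    return $lines;
}
```

Each returned request array can be passed to $client->batchCreatePartition(), and each result to describePartitionErrors(). An empty list means every partition in that chunk was created.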
BatchDeleteConnection
$result = $client->batchDeleteConnection([/* ... */]);
$promise = $client->batchDeleteConnectionAsync([/* ... */]);
Deletes a list of connection definitions from the Data Catalog.
Parameter Syntax
$result = $client->batchDeleteConnection([
    'CatalogId' => '<string>',
    'ConnectionNameList' => ['<string>', ...], // REQUIRED
]);
Parameter Details
Members
Result Syntax
[
    'Errors' => [
        '<NameString>' => [
            'ErrorCode' => '<string>',
            'ErrorMessage' => '<string>',
        ],
        // ...
    ],
    'Succeeded' => ['<string>', ...],
]
Result Details
Members
- Errors
-
- Type: Associative array of custom strings keys (NameString) to ErrorDetail structures
A map of the names of connections that were not successfully deleted to error details.
- Succeeded
-
- Type: Array of strings
A list of names of the connection definitions that were successfully deleted.
Errors
-
An internal service error occurred.
-
The operation timed out.
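Because 'Errors' here is a map keyed by connection name rather than a list, it is read slightly differently from the other batch results. A minimal sketch, using a hypothetical helper name (summarizeConnectionDeletion):

```php
<?php
/**
 * Splits a BatchDeleteConnection result into deleted names and a
 * name => reason map for failures. (Hypothetical helper.)
 */
function summarizeConnectionDeletion($result): array
{
    $failed = [];
    // 'Errors' maps each connection name to an ErrorDetail structure.
    foreach ($result['Errors'] ?? [] as $name => $detail) {
        $failed[$name] = ($detail['ErrorCode'] ?? 'n/a') . ': ' . ($detail['ErrorMessage'] ?? '');
    }

    return [
        'deleted' => $result['Succeeded'] ?? [],
        'failed' => $failed,
    ];
}
```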
BatchDeletePartition
$result = $client->batchDeletePartition([/* ... */]);
$promise = $client->batchDeletePartitionAsync([/* ... */]);
Deletes one or more partitions in a batch operation.
Parameter Syntax
$result = $client->batchDeletePartition([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionsToDelete' => [ // REQUIRED
        [
            'Values' => ['<string>', ...], // REQUIRED
        ],
        // ...
    ],
    'TableName' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the partition to be deleted resides. If none is provided, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database in which the table in question resides.
- PartitionsToDelete
-
- Required: Yes
- Type: Array of PartitionValueList structures
A list of PartitionValueList structures that identify the partitions to be deleted.
- TableName
-
- Required: Yes
- Type: string
The name of the table that contains the partitions to be deleted.
Result Syntax
[
    'Errors' => [
        [
            'ErrorDetail' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
            'PartitionValues' => ['<string>', ...],
        ],
        // ...
    ],
]
Result Details
Members
- Errors
-
- Type: Array of PartitionError structures
The errors encountered when trying to delete the requested partitions.
Errors
-
The input provided was not valid.
-
A specified entity does not exist.
-
An internal service error occurred.
-
The operation timed out.
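Each entry of PartitionsToDelete wraps one partition's values in a 'Values' key. The sketch below builds the request array from plain lists of values. The helper name buildBatchDeletePartitionRequest is hypothetical, and CatalogId is included only when supplied, since the service defaults it to the AWS account ID.

```php
<?php
/**
 * Builds a BatchDeletePartition request from plain value lists, for example
 * [['2024', '01'], ['2024', '02']]. (Hypothetical helper.)
 */
function buildBatchDeletePartitionRequest(
    string $database,
    string $table,
    array $valueLists,
    ?string $catalogId = null
): array {
    $request = [
        'DatabaseName' => $database,
        'TableName' => $table,
        'PartitionsToDelete' => array_map(
            function (array $values): array {
                // Each partition is identified by its list of string values.
                return ['Values' => array_values(array_map('strval', $values))];
            },
            array_values($valueLists)
        ),
    ];

    // Omit CatalogId unless given; the account ID is then used by default.
    if ($catalogId !== null) {
        $request['CatalogId'] = $catalogId;
    }

    return $request;
}
```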
BatchDeleteTable
$result = $client->batchDeleteTable([/* ... */]);
$promise = $client->batchDeleteTableAsync([/* ... */]);
Deletes multiple tables at once.
After completing this operation, you no longer have access to the table versions and partitions that belong to the deleted table. AWS Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.
To ensure the immediate deletion of all related resources, before calling BatchDeleteTable, use DeleteTableVersion or BatchDeleteTableVersion, and DeletePartition or BatchDeletePartition, to delete any resources that belong to the table.
Parameter Syntax
$result = $client->batchDeleteTable([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'TablesToDelete' => ['<string>', ...], // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the table resides. If none is provided, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database in which the tables to delete reside. For Hive compatibility, this name is entirely lowercase.
- TablesToDelete
-
- Required: Yes
- Type: Array of strings
A list of the tables to delete.
Result Syntax
[
    'Errors' => [
        [
            'ErrorDetail' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
            'TableName' => '<string>',
        ],
        // ...
    ],
]
Result Details
Members
- Errors
-
- Type: Array of TableError structures
A list of errors encountered in attempting to delete the specified tables.
Errors
-
The input provided was not valid.
-
A specified entity does not exist.
-
An internal service error occurred.
-
The operation timed out.
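The recommended ordering, removing dependent resources before the table itself, might be sketched as follows. This is an illustration under assumptions: the helper name deleteTableWithPartitions is hypothetical, the caller already knows the partition value lists, and table versions are left out for brevity (they could be removed the same way with BatchDeleteTableVersion).

```php
<?php
/**
 * Deletes the given partitions first, then the table.
 * $client is expected to be an Aws\Glue\GlueClient. (Hypothetical helper.)
 * Returns the 'Errors' reported by each step.
 */
function deleteTableWithPartitions(
    $client,
    string $database,
    string $table,
    array $partitionValueLists
): array {
    $errors = ['partitions' => [], 'tables' => []];

    if ($partitionValueLists !== []) {
        $partitionsToDelete = [];
        foreach ($partitionValueLists as $values) {
            $partitionsToDelete[] = ['Values' => array_values($values)];
        }
        $result = $client->batchDeletePartition([
            'DatabaseName' => $database,
            'TableName' => $table,
            'PartitionsToDelete' => $partitionsToDelete,
        ]);
        $errors['partitions'] = $result['Errors'] ?? [];
    }

    // Design choice for this sketch: keep the table if any partition failed,
    // so the failure can be inspected before anything is orphaned.
    if ($errors['partitions'] !== []) {
        return $errors;
    }

    $result = $client->batchDeleteTable([
        'DatabaseName' => $database,
        'TablesToDelete' => [$table],
    ]);
    $errors['tables'] = $result['Errors'] ?? [];

    return $errors;
}
```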
BatchDeleteTableVersion
$result = $client->batchDeleteTableVersion([/* ... */]);
$promise = $client->batchDeleteTableVersionAsync([/* ... */]);
Deletes a specified batch of versions of a table.
Parameter Syntax
$result = $client->batchDeleteTableVersion([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'TableName' => '<string>', // REQUIRED
    'VersionIds' => ['<string>', ...], // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the tables reside. If none is provided, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.
- TableName
-
- Required: Yes
- Type: string
The name of the table. For Hive compatibility, this name is entirely lowercase.
- VersionIds
-
- Required: Yes
- Type: Array of strings
A list of the IDs of versions to be deleted. A VersionId is a string representation of an integer. Each version is incremented by 1.
Result Syntax
[
    'Errors' => [
        [
            'ErrorDetail' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
            'TableName' => '<string>',
            'VersionId' => '<string>',
        ],
        // ...
    ],
]
Result Details
Members
- Errors
-
- Type: Array of TableVersionError structures
A list of errors encountered while trying to delete the specified table versions.
Errors
-
A specified entity does not exist.
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
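Since each VersionId is a string representation of an integer, callers that track versions as integers need to convert them. A minimal sketch, with a hypothetical helper name (buildBatchDeleteTableVersionRequest). Lowercasing the names is a convenience chosen here to match the Hive-compatibility note above, not something the API is documented to require of the caller.

```php
<?php
/**
 * Builds a BatchDeleteTableVersion request from integer version numbers.
 * (Hypothetical helper.)
 */
function buildBatchDeleteTableVersionRequest(
    string $database,
    string $table,
    array $versionNumbers
): array {
    $ids = [];
    foreach ($versionNumbers as $n) {
        if (!is_int($n) || $n < 0) {
            throw new InvalidArgumentException('Version numbers must be non-negative integers.');
        }
        // VersionIds are sent as strings, e.g. 3 becomes '3'.
        $ids[] = (string) $n;
    }

    return [
        // Assumption: normalise names to lowercase for Hive compatibility.
        'DatabaseName' => strtolower($database),
        'TableName' => strtolower($table),
        // Drop duplicates and reindex so the list stays a plain array.
        'VersionIds' => array_values(array_unique($ids)),
    ];
}
```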
BatchGetCrawlers
$result = $client->batchGetCrawlers([/* ... */]);
$promise = $client->batchGetCrawlersAsync([/* ... */]);
Returns a list of resource metadata for a given list of crawler names. After calling the ListCrawlers operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
Parameter Syntax
$result = $client->batchGetCrawlers([
    'CrawlerNames' => ['<string>', ...], // REQUIRED
]);
Parameter Details
Members
Result Syntax
[
    'Crawlers' => [
        [
            'Classifiers' => ['<string>', ...],
            'Configuration' => '<string>',
            'CrawlElapsedTime' => <integer>,
            'CrawlerSecurityConfiguration' => '<string>',
            'CreationTime' => <DateTime>,
            'DatabaseName' => '<string>',
            'Description' => '<string>',
            'LastCrawl' => [
                'ErrorMessage' => '<string>',
                'LogGroup' => '<string>',
                'LogStream' => '<string>',
                'MessagePrefix' => '<string>',
                'StartTime' => <DateTime>,
                'Status' => 'SUCCEEDED|CANCELLED|FAILED',
            ],
            'LastUpdated' => <DateTime>,
            'LineageConfiguration' => [
                'CrawlerLineageSettings' => 'ENABLE|DISABLE',
            ],
            'Name' => '<string>',
            'RecrawlPolicy' => [
                'RecrawlBehavior' => 'CRAWL_EVERYTHING|CRAWL_NEW_FOLDERS_ONLY',
            ],
            'Role' => '<string>',
            'Schedule' => [
                'ScheduleExpression' => '<string>',
                'State' => 'SCHEDULED|NOT_SCHEDULED|TRANSITIONING',
            ],
            'SchemaChangePolicy' => [
                'DeleteBehavior' => 'LOG|DELETE_FROM_DATABASE|DEPRECATE_IN_DATABASE',
                'UpdateBehavior' => 'LOG|UPDATE_IN_DATABASE',
            ],
            'State' => 'READY|RUNNING|STOPPING',
            'TablePrefix' => '<string>',
            'Targets' => [
                'CatalogTargets' => [
                    [
                        'DatabaseName' => '<string>',
                        'Tables' => ['<string>', ...],
                    ],
                    // ...
                ],
                'DynamoDBTargets' => [
                    [
                        'Path' => '<string>',
                        'scanAll' => true || false,
                        'scanRate' => <float>,
                    ],
                    // ...
                ],
                'JdbcTargets' => [
                    [
                        'ConnectionName' => '<string>',
                        'Exclusions' => ['<string>', ...],
                        'Path' => '<string>',
                    ],
                    // ...
                ],
                'MongoDBTargets' => [
                    [
                        'ConnectionName' => '<string>',
                        'Path' => '<string>',
                        'ScanAll' => true || false,
                    ],
                    // ...
                ],
                'S3Targets' => [
                    [
                        'ConnectionName' => '<string>',
                        'Exclusions' => ['<string>', ...],
                        'Path' => '<string>',
                    ],
                    // ...
                ],
            ],
            'Version' => <integer>,
        ],
        // ...
    ],
    'CrawlersNotFound' => ['<string>', ...],
]
Result Details
Members
- Crawlers
-
- Type: Array of Crawler structures
A list of crawler definitions.
- CrawlersNotFound
-
- Type: Array of strings
A list of names of crawlers that were not found.
Errors
-
The input provided was not valid.
-
The operation timed out.
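The "list names, then fetch metadata in bulk" pattern described for this operation could look like the sketch below. The helper name crawlerStates is hypothetical, the 'CrawlerNames' key of the ListCrawlers result is assumed (that result shape is not shown in this section), and the sketch further assumes that one page of names fits in a single BatchGetCrawlers request.

```php
<?php
/**
 * Maps every crawler name to its current state, or to null when
 * BatchGetCrawlers reports the name as not found.
 * $client is expected to be an Aws\Glue\GlueClient. (Hypothetical helper.)
 */
function crawlerStates($client, array $listParams = []): array
{
    $states = [];

    foreach ($client->getPaginator('ListCrawlers', $listParams) as $page) {
        // 'CrawlerNames' is the assumed result key of ListCrawlers.
        $names = $page['CrawlerNames'] ?? [];
        if ($names === []) {
            continue;
        }

        // Assumption: one page of names fits in one batch request.
        $batch = $client->batchGetCrawlers(['CrawlerNames' => $names]);

        foreach ($batch['Crawlers'] ?? [] as $crawler) {
            $states[$crawler['Name']] = $crawler['State'];
        }
        foreach ($batch['CrawlersNotFound'] ?? [] as $missing) {
            $states[$missing] = null;
        }
    }

    return $states;
}
```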
BatchGetDevEndpoints
$result = $client->batchGetDevEndpoints([/* ... */]);
$promise = $client->batchGetDevEndpointsAsync([/* ... */]);
Returns a list of resource metadata for a given list of development endpoint names. After calling the ListDevEndpoints operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
Parameter Syntax
$result = $client->batchGetDevEndpoints([
    'DevEndpointNames' => ['<string>', ...], // REQUIRED
]);
Parameter Details
Members
Result Syntax
[
    'DevEndpoints' => [
        [
            'Arguments' => ['<string>', ...],
            'AvailabilityZone' => '<string>',
            'CreatedTimestamp' => <DateTime>,
            'EndpointName' => '<string>',
            'ExtraJarsS3Path' => '<string>',
            'ExtraPythonLibsS3Path' => '<string>',
            'FailureReason' => '<string>',
            'GlueVersion' => '<string>',
            'LastModifiedTimestamp' => <DateTime>,
            'LastUpdateStatus' => '<string>',
            'NumberOfNodes' => <integer>,
            'NumberOfWorkers' => <integer>,
            'PrivateAddress' => '<string>',
            'PublicAddress' => '<string>',
            'PublicKey' => '<string>',
            'PublicKeys' => ['<string>', ...],
            'RoleArn' => '<string>',
            'SecurityConfiguration' => '<string>',
            'SecurityGroupIds' => ['<string>', ...],
            'Status' => '<string>',
            'SubnetId' => '<string>',
            'VpcId' => '<string>',
            'WorkerType' => 'Standard|G.1X|G.2X',
            'YarnEndpointAddress' => '<string>',
            'ZeppelinRemoteSparkInterpreterPort' => <integer>,
        ],
        // ...
    ],
    'DevEndpointsNotFound' => ['<string>', ...],
]
Result Details
Members
- DevEndpoints
-
- Type: Array of DevEndpoint structures
A list of DevEndpoint definitions.
- DevEndpointsNotFound
-
- Type: Array of strings
A list of DevEndpoints not found.
Errors
-
Access to a resource was denied.
-
An internal service error occurred.
-
The operation timed out.
-
The input provided was not valid.
BatchGetJobs
$result = $client->batchGetJobs([/* ... */]);
$promise = $client->batchGetJobsAsync([/* ... */]);
Returns a list of resource metadata for a given list of job names. After calling the ListJobs operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
Parameter Syntax
$result = $client->batchGetJobs([
    'JobNames' => ['<string>', ...], // REQUIRED
]);
Parameter Details
Members
Result Syntax
[
    'Jobs' => [
        [
            'AllocatedCapacity' => <integer>,
            'Command' => [
                'Name' => '<string>',
                'PythonVersion' => '<string>',
                'ScriptLocation' => '<string>',
            ],
            'Connections' => [
                'Connections' => ['<string>', ...],
            ],
            'CreatedOn' => <DateTime>,
            'DefaultArguments' => ['<string>', ...],
            'Description' => '<string>',
            'ExecutionProperty' => [
                'MaxConcurrentRuns' => <integer>,
            ],
            'GlueVersion' => '<string>',
            'LastModifiedOn' => <DateTime>,
            'LogUri' => '<string>',
            'MaxCapacity' => <float>,
            'MaxRetries' => <integer>,
            'Name' => '<string>',
            'NonOverridableArguments' => ['<string>', ...],
            'NotificationProperty' => [
                'NotifyDelayAfter' => <integer>,
            ],
            'NumberOfWorkers' => <integer>,
            'Role' => '<string>',
            'SecurityConfiguration' => '<string>',
            'Timeout' => <integer>,
            'WorkerType' => 'Standard|G.1X|G.2X',
        ],
        // ...
    ],
    'JobsNotFound' => ['<string>', ...],
]
Result Details
Members
- Jobs
-
- Type: Array of Job structures
A list of job definitions.
- JobsNotFound
-
- Type: Array of strings
A list of names of jobs not found.
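Because the result separates returned definitions (Jobs) from names that could not be resolved (JobsNotFound), callers usually want both views. The following is a minimal sketch, not part of the SDK: the helper name splitBatchGetJobsResult and the sample data are hypothetical, and the input is assumed to be the plain-array form of the result (for example, what toArray() returns on the SDK result object).

```php
<?php
declare(strict_types=1);

/**
 * Index the 'Jobs' entries of a BatchGetJobs result by job name and
 * collect the names reported under 'JobsNotFound'.
 *
 * Hypothetical helper: $result is assumed to be the plain-array form
 * of the SDK result.
 */
function splitBatchGetJobsResult(array $result): array
{
    $found = [];
    foreach ($result['Jobs'] ?? [] as $job) {
        $found[$job['Name']] = $job;
    }

    return [
        'found'   => $found,
        'missing' => $result['JobsNotFound'] ?? [],
    ];
}

// Hypothetical sample data shaped like the Result Syntax above.
$sample = [
    'Jobs' => [
        ['Name' => 'nightly-etl', 'Role' => 'example-role', 'MaxRetries' => 1],
    ],
    'JobsNotFound' => ['old-job'],
];

$split = splitBatchGetJobsResult($sample);
echo 'found: ', implode(', ', array_keys($split['found'])), PHP_EOL;
echo 'missing: ', implode(', ', $split['missing']), PHP_EOL;
```

The same pattern should carry over to BatchGetTriggers (Triggers, TriggersNotFound), since its result has the same two-part shape.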
Errors
-
An internal service error occurred.
-
The operation timed out.
-
The input provided was not valid.
BatchGetPartition
$result = $client->batchGetPartition([/* ... */]);
$promise = $client->batchGetPartitionAsync([/* ... */]);
Retrieves partitions in a batch request.
Parameter Syntax
$result = $client->batchGetPartition([ 'CatalogId' => '<string>', 'DatabaseName' => '<string>', // REQUIRED 'PartitionsToGet' => [ // REQUIRED [ 'Values' => ['<string>', ...], // REQUIRED ], // ... ], 'TableName' => '<string>', // REQUIRED ]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database where the partitions reside.
- PartitionsToGet
-
- Required: Yes
- Type: Array of PartitionValueList structures
A list of partition values identifying the partitions to retrieve.
- TableName
-
- Required: Yes
- Type: string
The name of the partitions' table.
Result Syntax
[ 'Partitions' => [ [ 'CatalogId' => '<string>', 'CreationTime' => <DateTime>, 'DatabaseName' => '<string>', 'LastAccessTime' => <DateTime>, 'LastAnalyzedTime' => <DateTime>, 'Parameters' => ['<string>', ...], 'StorageDescriptor' => [ 'BucketColumns' => ['<string>', ...], 'Columns' => [ [ 'Comment' => '<string>', 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'Type' => '<string>', ], // ... ], 'Compressed' => true || false, 'InputFormat' => '<string>', 'Location' => '<string>', 'NumberOfBuckets' => <integer>, 'OutputFormat' => '<string>', 'Parameters' => ['<string>', ...], 'SchemaReference' => [ 'SchemaId' => [ 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], 'SchemaVersionId' => '<string>', 'SchemaVersionNumber' => <integer>, ], 'SerdeInfo' => [ 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'SerializationLibrary' => '<string>', ], 'SkewedInfo' => [ 'SkewedColumnNames' => ['<string>', ...], 'SkewedColumnValueLocationMaps' => ['<string>', ...], 'SkewedColumnValues' => ['<string>', ...], ], 'SortColumns' => [ [ 'Column' => '<string>', 'SortOrder' => <integer>, ], // ... ], 'StoredAsSubDirectories' => true || false, ], 'TableName' => '<string>', 'Values' => ['<string>', ...], ], // ... ], 'UnprocessedKeys' => [ [ 'Values' => ['<string>', ...], ], // ... ], ]
Result Details
Members
- Partitions
-
- Type: Array of Partition structures
A list of the requested partitions.
- UnprocessedKeys
-
- Type: Array of PartitionValueList structures
A list of the partition values in the request for which partitions were not returned.
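UnprocessedKeys has the same shape as PartitionsToGet, so a follow-up request can be built directly from it. The sketch below assumes that re-requesting unprocessed keys is appropriate for your case; the source only says these are values for which partitions were not returned, not why. The helper name and the sample database, table, and partition values are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Build a follow-up BatchGetPartition request from the 'UnprocessedKeys'
 * of a previous result, or return null when nothing is left to fetch.
 * Hypothetical helper operating on plain arrays.
 */
function nextBatchGetPartitionRequest(array $request, array $result): ?array
{
    $unprocessed = $result['UnprocessedKeys'] ?? [];
    if ($unprocessed === []) {
        return null;
    }

    // Reuse CatalogId, DatabaseName and TableName; swap in the leftover keys.
    $next = $request;
    $next['PartitionsToGet'] = array_map(
        static fn (array $key): array => ['Values' => $key['Values']],
        $unprocessed
    );

    return $next;
}

// Hypothetical request and result, shaped like the syntax blocks above.
$request = [
    'DatabaseName'    => 'sales_db',
    'TableName'       => 'orders',
    'PartitionsToGet' => [
        ['Values' => ['2020', '01']],
        ['Values' => ['2020', '02']],
    ],
];
$result = [
    'Partitions'      => [['Values' => ['2020', '01'], 'TableName' => 'orders']],
    'UnprocessedKeys' => [['Values' => ['2020', '02']]],
];

$retry = nextBatchGetPartitionRequest($request, $result);
echo json_encode($retry['PartitionsToGet']), PHP_EOL;
```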
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
The operation timed out.
-
An internal service error occurred.
-
An encryption operation failed.
BatchGetTriggers
$result = $client->batchGetTriggers([/* ... */]);
$promise = $client->batchGetTriggersAsync([/* ... */]);
Returns a list of resource metadata for a given list of trigger names. After calling the ListTriggers operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
Parameter Syntax
$result = $client->batchGetTriggers([ 'TriggerNames' => ['<string>', ...], // REQUIRED ]);
Parameter Details
Members
Result Syntax
[ 'Triggers' => [ [ 'Actions' => [ [ 'Arguments' => ['<string>', ...], 'CrawlerName' => '<string>', 'JobName' => '<string>', 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'SecurityConfiguration' => '<string>', 'Timeout' => <integer>, ], // ... ], 'Description' => '<string>', 'Id' => '<string>', 'Name' => '<string>', 'Predicate' => [ 'Conditions' => [ [ 'CrawlState' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED', 'CrawlerName' => '<string>', 'JobName' => '<string>', 'LogicalOperator' => 'EQUALS', 'State' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', ], // ... ], 'Logical' => 'AND|ANY', ], 'Schedule' => '<string>', 'State' => 'CREATING|CREATED|ACTIVATING|ACTIVATED|DEACTIVATING|DEACTIVATED|DELETING|UPDATING', 'Type' => 'SCHEDULED|CONDITIONAL|ON_DEMAND', 'WorkflowName' => '<string>', ], // ... ], 'TriggersNotFound' => ['<string>', ...], ]
Result Details
Members
- Triggers
-
- Type: Array of Trigger structures
A list of trigger definitions.
- TriggersNotFound
-
- Type: Array of strings
A list of names of triggers not found.
Errors
-
An internal service error occurred.
-
The operation timed out.
-
The input provided was not valid.
BatchGetWorkflows
$result = $client->batchGetWorkflows([/* ... */]);
$promise = $client->batchGetWorkflowsAsync([/* ... */]);
Returns a list of resource metadata for a given list of workflow names. After calling the ListWorkflows operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
Parameter Syntax
$result = $client->batchGetWorkflows([ 'IncludeGraph' => true || false, 'Names' => ['<string>', ...], // REQUIRED ]);
Parameter Details
Members
Result Syntax
[ 'MissingWorkflows' => ['<string>', ...], 'Workflows' => [ [ 'CreatedOn' => <DateTime>, 'DefaultRunProperties' => ['<string>', ...], 'Description' => '<string>', 'Graph' => [ 'Edges' => [ [ 'DestinationId' => '<string>', 'SourceId' => '<string>', ], // ... ], 'Nodes' => [ [ 'CrawlerDetails' => [ 'Crawls' => [ [ 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'LogGroup' => '<string>', 'LogStream' => '<string>', 'StartedOn' => <DateTime>, 'State' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED', ], // ... ], ], 'JobDetails' => [ 'JobRuns' => [ [ 'AllocatedCapacity' => <integer>, 'Arguments' => ['<string>', ...], 'Attempt' => <integer>, 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'ExecutionTime' => <integer>, 'GlueVersion' => '<string>', 'Id' => '<string>', 'JobName' => '<string>', 'JobRunState' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', 'LastModifiedOn' => <DateTime>, 'LogGroupName' => '<string>', 'MaxCapacity' => <float>, 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'NumberOfWorkers' => <integer>, 'PredecessorRuns' => [ [ 'JobName' => '<string>', 'RunId' => '<string>', ], // ... ], 'PreviousRunId' => '<string>', 'SecurityConfiguration' => '<string>', 'StartedOn' => <DateTime>, 'Timeout' => <integer>, 'TriggerName' => '<string>', 'WorkerType' => 'Standard|G.1X|G.2X', ], // ... ], ], 'Name' => '<string>', 'TriggerDetails' => [ 'Trigger' => [ 'Actions' => [ [ 'Arguments' => ['<string>', ...], 'CrawlerName' => '<string>', 'JobName' => '<string>', 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'SecurityConfiguration' => '<string>', 'Timeout' => <integer>, ], // ... 
], 'Description' => '<string>', 'Id' => '<string>', 'Name' => '<string>', 'Predicate' => [ 'Conditions' => [ [ 'CrawlState' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED', 'CrawlerName' => '<string>', 'JobName' => '<string>', 'LogicalOperator' => 'EQUALS', 'State' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', ], // ... ], 'Logical' => 'AND|ANY', ], 'Schedule' => '<string>', 'State' => 'CREATING|CREATED|ACTIVATING|ACTIVATED|DEACTIVATING|DEACTIVATED|DELETING|UPDATING', 'Type' => 'SCHEDULED|CONDITIONAL|ON_DEMAND', 'WorkflowName' => '<string>', ], ], 'Type' => 'CRAWLER|JOB|TRIGGER', 'UniqueId' => '<string>', ], // ... ], ], 'LastModifiedOn' => <DateTime>, 'LastRun' => [ 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'Graph' => [ 'Edges' => [ [ 'DestinationId' => '<string>', 'SourceId' => '<string>', ], // ... ], 'Nodes' => [ [ 'CrawlerDetails' => [ 'Crawls' => [ [ 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'LogGroup' => '<string>', 'LogStream' => '<string>', 'StartedOn' => <DateTime>, 'State' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED', ], // ... ], ], 'JobDetails' => [ 'JobRuns' => [ [ 'AllocatedCapacity' => <integer>, 'Arguments' => ['<string>', ...], 'Attempt' => <integer>, 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'ExecutionTime' => <integer>, 'GlueVersion' => '<string>', 'Id' => '<string>', 'JobName' => '<string>', 'JobRunState' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', 'LastModifiedOn' => <DateTime>, 'LogGroupName' => '<string>', 'MaxCapacity' => <float>, 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'NumberOfWorkers' => <integer>, 'PredecessorRuns' => [ [ 'JobName' => '<string>', 'RunId' => '<string>', ], // ... ], 'PreviousRunId' => '<string>', 'SecurityConfiguration' => '<string>', 'StartedOn' => <DateTime>, 'Timeout' => <integer>, 'TriggerName' => '<string>', 'WorkerType' => 'Standard|G.1X|G.2X', ], // ... 
], ], 'Name' => '<string>', 'TriggerDetails' => [ 'Trigger' => [ 'Actions' => [ [ 'Arguments' => ['<string>', ...], 'CrawlerName' => '<string>', 'JobName' => '<string>', 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'SecurityConfiguration' => '<string>', 'Timeout' => <integer>, ], // ... ], 'Description' => '<string>', 'Id' => '<string>', 'Name' => '<string>', 'Predicate' => [ 'Conditions' => [ [ 'CrawlState' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED', 'CrawlerName' => '<string>', 'JobName' => '<string>', 'LogicalOperator' => 'EQUALS', 'State' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', ], // ... ], 'Logical' => 'AND|ANY', ], 'Schedule' => '<string>', 'State' => 'CREATING|CREATED|ACTIVATING|ACTIVATED|DEACTIVATING|DEACTIVATED|DELETING|UPDATING', 'Type' => 'SCHEDULED|CONDITIONAL|ON_DEMAND', 'WorkflowName' => '<string>', ], ], 'Type' => 'CRAWLER|JOB|TRIGGER', 'UniqueId' => '<string>', ], // ... ], ], 'Name' => '<string>', 'PreviousRunId' => '<string>', 'StartedOn' => <DateTime>, 'Statistics' => [ 'FailedActions' => <integer>, 'RunningActions' => <integer>, 'StoppedActions' => <integer>, 'SucceededActions' => <integer>, 'TimeoutActions' => <integer>, 'TotalActions' => <integer>, ], 'Status' => 'RUNNING|COMPLETED|STOPPING|STOPPED|ERROR', 'WorkflowRunId' => '<string>', 'WorkflowRunProperties' => ['<string>', ...], ], 'MaxConcurrentRuns' => <integer>, 'Name' => '<string>', ], // ... ], ]
Result Details
Members
- MissingWorkflows
-
- Type: Array of strings
A list of names of workflows not found.
- Workflows
-
- Type: Array of Workflow structures
A list of workflow resource metadata.
Errors
-
An internal service error occurred.
-
The operation timed out.
-
The input provided was not valid.
BatchStopJobRun
$result = $client->batchStopJobRun([/* ... */]);
$promise = $client->batchStopJobRunAsync([/* ... */]);
Stops one or more job runs for a specified job definition.
Parameter Syntax
$result = $client->batchStopJobRun([ 'JobName' => '<string>', // REQUIRED 'JobRunIds' => ['<string>', ...], // REQUIRED ]);
Parameter Details
Members
Result Syntax
[ 'Errors' => [ [ 'ErrorDetail' => [ 'ErrorCode' => '<string>', 'ErrorMessage' => '<string>', ], 'JobName' => '<string>', 'JobRunId' => '<string>', ], // ... ], 'SuccessfulSubmissions' => [ [ 'JobName' => '<string>', 'JobRunId' => '<string>', ], // ... ], ]
Result Details
Members
- Errors
-
- Type: Array of BatchStopJobRunError structures
A list of the errors that were encountered in trying to stop JobRuns, including the JobRunId for which each error was encountered and details about the error.
- SuccessfulSubmissions
-
- Type: Array of BatchStopJobRunSuccessfulSubmission structures
A list of the JobRuns that were successfully submitted for stopping.
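A call can partially succeed, so both Errors and SuccessfulSubmissions need to be inspected. This is a rough sketch under the assumption that the result is available as a plain array; the helper name and the sample run IDs and error text are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Reduce a BatchStopJobRun result to the run IDs that were submitted
 * for stopping and a map of failed run IDs to a short error summary.
 * Hypothetical helper operating on plain arrays.
 */
function summarizeBatchStopJobRun(array $result): array
{
    $submitted = array_column($result['SuccessfulSubmissions'] ?? [], 'JobRunId');

    $failed = [];
    foreach ($result['Errors'] ?? [] as $error) {
        $detail = $error['ErrorDetail'] ?? [];
        $failed[$error['JobRunId']] = sprintf(
            '%s: %s',
            $detail['ErrorCode'] ?? 'Unknown',
            $detail['ErrorMessage'] ?? ''
        );
    }

    return ['submitted' => $submitted, 'failed' => $failed];
}

// Hypothetical sample data shaped like the Result Syntax above.
$sample = [
    'Errors' => [
        [
            'JobName'     => 'nightly-etl',
            'JobRunId'    => 'jr_example_2',
            'ErrorDetail' => ['ErrorCode' => 'ExampleCode', 'ErrorMessage' => 'example message'],
        ],
    ],
    'SuccessfulSubmissions' => [
        ['JobName' => 'nightly-etl', 'JobRunId' => 'jr_example_1'],
    ],
];

$summary = summarizeBatchStopJobRun($sample);
echo 'submitted: ', implode(', ', $summary['submitted']), PHP_EOL;
foreach ($summary['failed'] as $runId => $reason) {
    echo 'failed ', $runId, ' (', $reason, ')', PHP_EOL;
}
```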
Errors
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
BatchUpdatePartition
$result = $client->batchUpdatePartition([/* ... */]);
$promise = $client->batchUpdatePartitionAsync([/* ... */]);
Updates one or more partitions in a batch operation.
Parameter Syntax
$result = $client->batchUpdatePartition([ 'CatalogId' => '<string>', 'DatabaseName' => '<string>', // REQUIRED 'Entries' => [ // REQUIRED [ 'PartitionInput' => [ // REQUIRED 'LastAccessTime' => <integer || string || DateTime>, 'LastAnalyzedTime' => <integer || string || DateTime>, 'Parameters' => ['<string>', ...], 'StorageDescriptor' => [ 'BucketColumns' => ['<string>', ...], 'Columns' => [ [ 'Comment' => '<string>', 'Name' => '<string>', // REQUIRED 'Parameters' => ['<string>', ...], 'Type' => '<string>', ], // ... ], 'Compressed' => true || false, 'InputFormat' => '<string>', 'Location' => '<string>', 'NumberOfBuckets' => <integer>, 'OutputFormat' => '<string>', 'Parameters' => ['<string>', ...], 'SchemaReference' => [ 'SchemaId' => [ 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], 'SchemaVersionId' => '<string>', 'SchemaVersionNumber' => <integer>, ], 'SerdeInfo' => [ 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'SerializationLibrary' => '<string>', ], 'SkewedInfo' => [ 'SkewedColumnNames' => ['<string>', ...], 'SkewedColumnValueLocationMaps' => ['<string>', ...], 'SkewedColumnValues' => ['<string>', ...], ], 'SortColumns' => [ [ 'Column' => '<string>', // REQUIRED 'SortOrder' => <integer>, // REQUIRED ], // ... ], 'StoredAsSubDirectories' => true || false, ], 'Values' => ['<string>', ...], ], 'PartitionValueList' => ['<string>', ...], // REQUIRED ], // ... ], 'TableName' => '<string>', // REQUIRED ]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the catalog in which the partition is to be updated. Currently, this should be the AWS account ID.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the metadata database in which the partition is to be updated.
- Entries
-
- Required: Yes
- Type: Array of BatchUpdatePartitionRequestEntry structures
A list of up to 100 BatchUpdatePartitionRequestEntry objects to update.
- TableName
-
- Required: Yes
- Type: string
The name of the metadata table in which the partition is to be updated.
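Since Entries accepts at most 100 objects per call, larger updates have to be split across several requests. The sketch below only builds the request arrays; the helper name, the constant name, and the sample database, table, and partition values are hypothetical, and the PartitionInput is reduced to its Values field for brevity.

```php
<?php
declare(strict_types=1);

// Limit stated in the Entries description above.
const MAX_ENTRIES_PER_CALL = 100;

/**
 * Split a list of BatchUpdatePartitionRequestEntry arrays into
 * request arrays that each carry at most MAX_ENTRIES_PER_CALL entries.
 * Hypothetical helper; each returned array is meant to be passed to
 * batchUpdatePartition() as-is.
 */
function buildBatchUpdatePartitionRequests(string $database, string $table, array $entries): array
{
    $requests = [];
    foreach (array_chunk($entries, MAX_ENTRIES_PER_CALL) as $chunk) {
        $requests[] = [
            'DatabaseName' => $database,
            'TableName'    => $table,
            'Entries'      => $chunk,
        ];
    }

    return $requests;
}

// 250 hypothetical entries, each identified by a single partition value.
$entries = [];
for ($i = 0; $i < 250; $i++) {
    $value = sprintf('p%03d', $i);
    $entries[] = [
        'PartitionValueList' => [$value],
        'PartitionInput'     => ['Values' => [$value]],
    ];
}

$requests = buildBatchUpdatePartitionRequests('sales_db', 'orders', $entries);
echo count($requests), PHP_EOL; // 3 requests: 100 + 100 + 50 entries
```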
Result Syntax
[ 'Errors' => [ [ 'ErrorDetail' => [ 'ErrorCode' => '<string>', 'ErrorMessage' => '<string>', ], 'PartitionValueList' => ['<string>', ...], ], // ... ], ]
Result Details
Members
- Errors
-
- Type: Array of BatchUpdatePartitionFailureEntry structures
The errors encountered when trying to update the requested partitions. A list of BatchUpdatePartitionFailureEntry objects.
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
The operation timed out.
-
An internal service error occurred.
-
An encryption operation failed.
CancelMLTaskRun
$result = $client->cancelMLTaskRun([/* ... */]);
$promise = $client->cancelMLTaskRunAsync([/* ... */]);
Cancels (stops) a task run. Machine learning task runs are asynchronous tasks that AWS Glue runs on your behalf as part of various machine learning workflows. You can cancel a machine learning task run at any time by calling CancelMLTaskRun with the TransformId of the task run's parent transform and the task run's TaskRunId.
Parameter Syntax
$result = $client->cancelMLTaskRun([ 'TaskRunId' => '<string>', // REQUIRED 'TransformId' => '<string>', // REQUIRED ]);
Parameter Details
Members
Result Syntax
[ 'Status' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', 'TaskRunId' => '<string>', 'TransformId' => '<string>', ]
Result Details
Members
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
The operation timed out.
-
An internal service error occurred.
CheckSchemaVersionValidity
$result = $client->checkSchemaVersionValidity([/* ... */]);
$promise = $client->checkSchemaVersionValidityAsync([/* ... */]);
Validates the supplied schema. This call has no side effects; it simply validates the supplied schema using DataFormat as the format. Because it does not take a schema set name, no compatibility checks are performed.
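SchemaDefinition is a string, so a schema held as a PHP array has to be serialized before the call. The following sketch assumes an Avro schema expressed as an array and encoded with json_encode; the helper names and the sample schema are hypothetical, and the result is assumed to be available as a plain array.

```php
<?php
declare(strict_types=1);

/**
 * Build the parameter array for checkSchemaVersionValidity() from an
 * Avro schema held as a PHP array. Hypothetical helper.
 */
function buildCheckSchemaRequest(array $avroSchema): array
{
    return [
        'DataFormat'       => 'AVRO',
        'SchemaDefinition' => json_encode($avroSchema, JSON_THROW_ON_ERROR),
    ];
}

/**
 * Turn the 'Valid' and 'Error' members of the result into one message.
 */
function describeValidity(array $result): string
{
    if ($result['Valid'] ?? false) {
        return 'schema is valid';
    }

    return 'schema is invalid: ' . ($result['Error'] ?? 'no error text returned');
}

// Hypothetical Avro record schema.
$schema = [
    'type'   => 'record',
    'name'   => 'User',
    'fields' => [
        ['name' => 'id', 'type' => 'long'],
        ['name' => 'email', 'type' => 'string'],
    ],
];

$request = buildCheckSchemaRequest($schema);
echo $request['SchemaDefinition'], PHP_EOL;

// Hypothetical result, shaped like the Result Syntax of this operation.
echo describeValidity(['Valid' => false, 'Error' => 'example error text']), PHP_EOL;
```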
Parameter Syntax
$result = $client->checkSchemaVersionValidity([ 'DataFormat' => 'AVRO', // REQUIRED 'SchemaDefinition' => '<string>', // REQUIRED ]);
Parameter Details
Members
Result Syntax
[ 'Error' => '<string>', 'Valid' => true || false, ]
Result Details
Members
Errors
-
The input provided was not valid.
-
Access to a resource was denied.
-
An internal service error occurred.
CreateClassifier
$result = $client->createClassifier([/* ... */]);
$promise = $client->createClassifierAsync([/* ... */]);
Creates a classifier in the user's account. This can be a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field of the request is present.
Parameter Syntax
$result = $client->createClassifier([ 'CsvClassifier' => [ 'AllowSingleColumn' => true || false, 'ContainsHeader' => 'UNKNOWN|PRESENT|ABSENT', 'Delimiter' => '<string>', 'DisableValueTrimming' => true || false, 'Header' => ['<string>', ...], 'Name' => '<string>', // REQUIRED 'QuoteSymbol' => '<string>', ], 'GrokClassifier' => [ 'Classification' => '<string>', // REQUIRED 'CustomPatterns' => '<string>', 'GrokPattern' => '<string>', // REQUIRED 'Name' => '<string>', // REQUIRED ], 'JsonClassifier' => [ 'JsonPath' => '<string>', // REQUIRED 'Name' => '<string>', // REQUIRED ], 'XMLClassifier' => [ 'Classification' => '<string>', // REQUIRED 'Name' => '<string>', // REQUIRED 'RowTag' => '<string>', ], ]);
Parameter Details
Members
- CsvClassifier
-
- Type: CreateCsvClassifierRequest structure
A CsvClassifier object specifying the classifier to create.
- GrokClassifier
-
- Type: CreateGrokClassifierRequest structure
A GrokClassifier object specifying the classifier to create.
- JsonClassifier
-
- Type: CreateJsonClassifierRequest structure
A JsonClassifier object specifying the classifier to create.
- XMLClassifier
-
- Type: CreateXMLClassifierRequest structure
An XMLClassifier object specifying the classifier to create.
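The classifier type is selected by which of the four fields is present, so a small builder can keep requests to a single field. This sketch assumes that exactly one field per request is the intended usage, which the description implies but does not state outright; the helper name, constant name, and sample values are hypothetical.

```php
<?php
declare(strict_types=1);

// The four top-level fields listed in Parameter Details above.
const CLASSIFIER_FIELDS = ['CsvClassifier', 'GrokClassifier', 'JsonClassifier', 'XMLClassifier'];

/**
 * Build a createClassifier() parameter array that contains exactly one
 * classifier field. Hypothetical helper; it checks only the field name
 * and the 'Name' member, which is marked REQUIRED for all four types.
 */
function buildCreateClassifierRequest(string $field, array $definition): array
{
    if (!in_array($field, CLASSIFIER_FIELDS, true)) {
        throw new InvalidArgumentException("Unknown classifier field: {$field}");
    }
    if (!isset($definition['Name'])) {
        throw new InvalidArgumentException("'Name' is required for {$field}");
    }

    return [$field => $definition];
}

// Hypothetical JSON classifier definition.
$request = buildCreateClassifierRequest('JsonClassifier', [
    'Name'     => 'example-json-classifier',
    'JsonPath' => '$[*]',
]);

echo implode(', ', array_keys($request)), PHP_EOL;
```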
Result Syntax
[]
Result Details
Errors
-
A resource to be created or added already exists.
-
The input provided was not valid.
-
The operation timed out.
CreateConnection
$result = $client->createConnection([/* ... */]);
$promise = $client->createConnectionAsync([/* ... */]);
Creates a connection definition in the Data Catalog.
Parameter Syntax
$result = $client->createConnection([ 'CatalogId' => '<string>', 'ConnectionInput' => [ // REQUIRED 'ConnectionProperties' => ['<string>', ...], // REQUIRED 'ConnectionType' => 'JDBC|SFTP|MONGODB|KAFKA|NETWORK|MARKETPLACE|CUSTOM', // REQUIRED 'Description' => '<string>', 'MatchCriteria' => ['<string>', ...], 'Name' => '<string>', // REQUIRED 'PhysicalConnectionRequirements' => [ 'AvailabilityZone' => '<string>', 'SecurityGroupIdList' => ['<string>', ...], 'SubnetId' => '<string>', ], ], ]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog in which to create the connection. If none is provided, the AWS account ID is used by default.
- ConnectionInput
-
- Required: Yes
- Type: ConnectionInput structure
A ConnectionInput object defining the connection to create.
Result Syntax
[]
Result Details
Errors
-
A resource to be created or added already exists.
-
The input provided was not valid.
-
The operation timed out.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
An encryption operation failed.
CreateCrawler
$result = $client->createCrawler([/* ... */]);
$promise = $client->createCrawlerAsync([/* ... */]);
Creates a new crawler with specified targets, role, configuration, and optional schedule. At least one crawl target must be specified, in the s3Targets field, the jdbcTargets field, or the DynamoDBTargets field.
Parameter Syntax
$result = $client->createCrawler([ 'Classifiers' => ['<string>', ...], 'Configuration' => '<string>', 'CrawlerSecurityConfiguration' => '<string>', 'DatabaseName' => '<string>', 'Description' => '<string>', 'LineageConfiguration' => [ 'CrawlerLineageSettings' => 'ENABLE|DISABLE', ], 'Name' => '<string>', // REQUIRED 'RecrawlPolicy' => [ 'RecrawlBehavior' => 'CRAWL_EVERYTHING|CRAWL_NEW_FOLDERS_ONLY', ], 'Role' => '<string>', // REQUIRED 'Schedule' => '<string>', 'SchemaChangePolicy' => [ 'DeleteBehavior' => 'LOG|DELETE_FROM_DATABASE|DEPRECATE_IN_DATABASE', 'UpdateBehavior' => 'LOG|UPDATE_IN_DATABASE', ], 'TablePrefix' => '<string>', 'Tags' => ['<string>', ...], 'Targets' => [ // REQUIRED 'CatalogTargets' => [ [ 'DatabaseName' => '<string>', // REQUIRED 'Tables' => ['<string>', ...], // REQUIRED ], // ... ], 'DynamoDBTargets' => [ [ 'Path' => '<string>', 'scanAll' => true || false, 'scanRate' => <float>, ], // ... ], 'JdbcTargets' => [ [ 'ConnectionName' => '<string>', 'Exclusions' => ['<string>', ...], 'Path' => '<string>', ], // ... ], 'MongoDBTargets' => [ [ 'ConnectionName' => '<string>', 'Path' => '<string>', 'ScanAll' => true || false, ], // ... ], 'S3Targets' => [ [ 'ConnectionName' => '<string>', 'Exclusions' => ['<string>', ...], 'Path' => '<string>', ], // ... ], ], ]);
Parameter Details
Members
- Classifiers
-
- Type: Array of strings
A list of custom classifiers that the user has registered. By default, all built-in classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification.
- Configuration
-
- Type: string
Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior. For more information, see Configuring a Crawler.
- CrawlerSecurityConfiguration
-
- Type: string
The name of the SecurityConfiguration structure to be used by this crawler.
- DatabaseName
-
- Type: string
The AWS Glue database where results are written, such as: arn:aws:daylight:us-east-1::database/sometable/*.
- Description
-
- Type: string
A description of the new crawler.
- LineageConfiguration
-
- Type: LineageConfiguration structure
Specifies data lineage configuration settings for the crawler.
- Name
-
- Required: Yes
- Type: string
Name of the new crawler.
- RecrawlPolicy
-
- Type: RecrawlPolicy structure
A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
- Role
-
- Required: Yes
- Type: string
The IAM role or Amazon Resource Name (ARN) of an IAM role used by the new crawler to access customer resources.
- Schedule
-
- Type: string
A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
- SchemaChangePolicy
-
- Type: SchemaChangePolicy structure
The policy for the crawler's update and deletion behavior.
- TablePrefix
-
- Type: string
The table prefix used for catalog tables that are created.
- Tags
-
- Type: Associative array of custom strings keys (TagKey) to strings
The tags to use with this crawler request. You may use tags to limit access to the crawler. For more information about tags in AWS Glue, see AWS Tags in AWS Glue in the developer guide.
- Targets
-
- Required: Yes
- Type: CrawlerTargets structure
A collection of targets to crawl.
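The required members (Name, Role, Targets) and the optional Schedule can be assembled as follows. This is a minimal sketch limited to a single S3 target and a daily schedule in the cron format shown in the Schedule description; the helper names, crawler name, role, and bucket path are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Format a daily schedule in the cron(...) form shown above, where the
 * first field is the minute and the second is the hour (UTC).
 * Hypothetical helper.
 */
function dailyCron(int $hour, int $minute): string
{
    if ($hour < 0 || $hour > 23 || $minute < 0 || $minute > 59) {
        throw new InvalidArgumentException('hour must be 0-23 and minute 0-59');
    }

    return sprintf('cron(%d %d * * ? *)', $minute, $hour);
}

/**
 * Build a createCrawler() parameter array with one S3 target.
 * Hypothetical helper; Schedule is added only when one is given.
 */
function buildCreateCrawlerRequest(string $name, string $role, string $s3Path, ?string $schedule = null): array
{
    $request = [
        'Name'    => $name,
        'Role'    => $role,
        'Targets' => [
            'S3Targets' => [
                ['Path' => $s3Path],
            ],
        ],
    ];
    if ($schedule !== null) {
        $request['Schedule'] = $schedule;
    }

    return $request;
}

echo dailyCron(12, 15), PHP_EOL; // cron(15 12 * * ? *)

$request = buildCreateCrawlerRequest(
    'example-crawler',
    'example-crawler-role',
    's3://example-bucket/raw/',
    dailyCron(12, 15)
);
echo json_encode($request, JSON_UNESCAPED_SLASHES), PHP_EOL;
```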
Result Syntax
[]
Result Details
Errors
-
The input provided was not valid.
-
A resource to be created or added already exists.
-
The operation timed out.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
CreateDatabase
$result = $client->createDatabase([/* ... */]);
$promise = $client->createDatabaseAsync([/* ... */]);
Creates a new database in a Data Catalog.
Parameter Syntax
$result = $client->createDatabase([ 'CatalogId' => '<string>', 'DatabaseInput' => [ // REQUIRED 'CreateTableDefaultPermissions' => [ [ 'Permissions' => ['<string>', ...], 'Principal' => [ 'DataLakePrincipalIdentifier' => '<string>', ], ], // ... ], 'Description' => '<string>', 'LocationUri' => '<string>', 'Name' => '<string>', // REQUIRED 'Parameters' => ['<string>', ...], 'TargetDatabase' => [ 'CatalogId' => '<string>', 'DatabaseName' => '<string>', ], ], ]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog in which to create the database. If none is provided, the AWS account ID is used by default.
- DatabaseInput
-
- Required: Yes
- Type: DatabaseInput structure
The metadata for the database.
Result Syntax
[]
Result Details
Errors
-
The input provided was not valid.
-
A resource to be created or added already exists.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
An internal service error occurred.
-
The operation timed out.
-
An encryption operation failed.
CreateDevEndpoint
$result = $client->createDevEndpoint([/* ... */]);
$promise = $client->createDevEndpointAsync([/* ... */]);
Creates a new development endpoint.
Parameter Syntax
$result = $client->createDevEndpoint([ 'Arguments' => ['<string>', ...], 'EndpointName' => '<string>', // REQUIRED 'ExtraJarsS3Path' => '<string>', 'ExtraPythonLibsS3Path' => '<string>', 'GlueVersion' => '<string>', 'NumberOfNodes' => <integer>, 'NumberOfWorkers' => <integer>, 'PublicKey' => '<string>', 'PublicKeys' => ['<string>', ...], 'RoleArn' => '<string>', // REQUIRED 'SecurityConfiguration' => '<string>', 'SecurityGroupIds' => ['<string>', ...], 'SubnetId' => '<string>', 'Tags' => ['<string>', ...], 'WorkerType' => 'Standard|G.1X|G.2X', ]);
Parameter Details
Members
- Arguments
-
- Type: Associative array of custom strings keys (GenericString) to strings
A map of arguments used to configure the DevEndpoint.
- EndpointName
-
- Required: Yes
- Type: string
The name to be assigned to the new DevEndpoint.
- ExtraJarsS3Path
-
- Type: string
The path to one or more Java .jar files in an S3 bucket that should be loaded in your DevEndpoint.
- ExtraPythonLibsS3Path
-
- Type: string
The paths to one or more Python libraries in an Amazon S3 bucket that should be loaded in your DevEndpoint. Multiple values must be complete paths separated by a comma.
You can only use pure Python libraries with a DevEndpoint. Libraries that rely on C extensions, such as the pandas Python data analysis library, are not yet supported.
- GlueVersion
-
- Type: string
Glue version determines the versions of Apache Spark and Python that AWS Glue supports. The Python version indicates the version supported for running your ETL scripts on development endpoints.
For more information about the available AWS Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide.
Development endpoints that are created without specifying a Glue version default to Glue 0.9.
You can specify a version of Python support for development endpoints by using the Arguments parameter in the CreateDevEndpoint or UpdateDevEndpoint APIs. If no arguments are provided, the version defaults to Python 2.
- NumberOfNodes
-
- Type: int
The number of AWS Glue Data Processing Units (DPUs) to allocate to this DevEndpoint.
- NumberOfWorkers
-
- Type: int
The number of workers of a defined workerType that are allocated to the development endpoint.
The maximum number of workers you can define is 299 for G.1X and 149 for G.2X.
- PublicKey
-
- Type: string
The public key to be used by this DevEndpoint for authentication. This attribute is provided for backward compatibility because the recommended attribute to use is public keys.
- PublicKeys
-
- Type: Array of strings
A list of public keys to be used by the development endpoints for authentication. The use of this attribute is preferred over a single public key because the public keys allow you to have a different private key per client.
If you previously created an endpoint with a public key, you must remove that key to be able to set a list of public keys. Call the UpdateDevEndpoint API with the public key content in the deletePublicKeys attribute, and the list of new keys in the addPublicKeys attribute.
- RoleArn
-
- Required: Yes
- Type: string
The IAM role for the DevEndpoint.
- SecurityConfiguration
-
- Type: string
The name of the SecurityConfiguration structure to be used with this DevEndpoint.
- SecurityGroupIds
-
- Type: Array of strings
Security group IDs for the security groups to be used by the new DevEndpoint.
- SubnetId
-
- Type: string
The subnet ID for the new DevEndpoint to use.
- Tags
-
- Type: Associative array of custom strings keys (TagKey) to strings
The tags to use with this DevEndpoint. You may use tags to limit access to the DevEndpoint. For more information about tags in AWS Glue, see AWS Tags in AWS Glue in the developer guide.
- WorkerType
-
- Type: string
The type of predefined worker that is allocated to the development endpoint. Accepts a value of Standard, G.1X, or G.2X.
- For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50 GB disk, and 2 executors per worker.
- For the G.1X worker type, each worker maps to 1 DPU (4 vCPU, 16 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.
- For the G.2X worker type, each worker maps to 2 DPU (8 vCPU, 32 GB of memory, 128 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.
Known issue: when a development endpoint is created with the G.2X WorkerType configuration, the Spark drivers for the development endpoint will run on 4 vCPU, 16 GB of memory, and a 64 GB disk.
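The worker limits from NumberOfWorkers and the DPU mapping from WorkerType can be checked on the client side before sending a request. This is a sketch built only from the figures quoted above (299 and 149 workers; 1 and 2 DPU per worker); the constant and function names are hypothetical, and the Standard type is left unchecked because no limit or DPU mapping is given for it here.

```php
<?php
declare(strict_types=1);

// Figures taken from the NumberOfWorkers and WorkerType descriptions above.
const MAX_WORKERS = ['G.1X' => 299, 'G.2X' => 149];
const DPU_PER_WORKER = ['G.1X' => 1, 'G.2X' => 2];

/**
 * Throw if the worker count exceeds the documented maximum for the
 * given worker type. Types without a documented limit pass through.
 * Hypothetical helper.
 */
function checkWorkerCount(string $workerType, int $numberOfWorkers): void
{
    if ($numberOfWorkers < 1) {
        throw new InvalidArgumentException('NumberOfWorkers must be positive');
    }
    $max = MAX_WORKERS[$workerType] ?? null;
    if ($max !== null && $numberOfWorkers > $max) {
        throw new InvalidArgumentException(
            "{$workerType} allows at most {$max} workers, got {$numberOfWorkers}"
        );
    }
}

/**
 * Return the total DPUs for a worker type and count, or null when the
 * mapping for that type is not documented above.
 */
function totalDpus(string $workerType, int $numberOfWorkers): ?int
{
    $perWorker = DPU_PER_WORKER[$workerType] ?? null;

    return $perWorker === null ? null : $perWorker * $numberOfWorkers;
}

checkWorkerCount('G.2X', 10);          // within the limit, no exception
echo totalDpus('G.2X', 10), PHP_EOL;   // 20

try {
    checkWorkerCount('G.2X', 150);
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```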
Result Syntax
[ 'Arguments' => ['<string>', ...], 'AvailabilityZone' => '<string>', 'CreatedTimestamp' => <DateTime>, 'EndpointName' => '<string>', 'ExtraJarsS3Path' => '<string>', 'ExtraPythonLibsS3Path' => '<string>', 'FailureReason' => '<string>', 'GlueVersion' => '<string>', 'NumberOfNodes' => <integer>, 'NumberOfWorkers' => <integer>, 'RoleArn' => '<string>', 'SecurityConfiguration' => '<string>', 'SecurityGroupIds' => ['<string>', ...], 'Status' => '<string>', 'SubnetId' => '<string>', 'VpcId' => '<string>', 'WorkerType' => 'Standard|G.1X|G.2X', 'YarnEndpointAddress' => '<string>', 'ZeppelinRemoteSparkInterpreterPort' => <integer>, ]
Result Details
Members
- Arguments
-
- Type: Associative array of custom strings keys (GenericString) to strings
The map of arguments used to configure this DevEndpoint.
Valid arguments are:
- "--enable-glue-datacatalog": ""
- "GLUE_PYTHON_VERSION": "3"
- "GLUE_PYTHON_VERSION": "2"
You can specify a version of Python support for development endpoints by using the Arguments parameter in the CreateDevEndpoint or UpdateDevEndpoint APIs. If no arguments are provided, the version defaults to Python 2.
- AvailabilityZone
-
- Type: string
The AWS Availability Zone where this DevEndpoint is located.
- CreatedTimestamp
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The point in time at which this DevEndpoint was created.
- EndpointName
-
- Type: string
The name assigned to the new DevEndpoint.
- ExtraJarsS3Path
-
- Type: string
Path to one or more Java .jar files in an S3 bucket that will be loaded in your DevEndpoint.
- ExtraPythonLibsS3Path
-
- Type: string
The paths to one or more Python libraries in an S3 bucket that will be loaded in your DevEndpoint.
- FailureReason
-
- Type: string
The reason for a current failure in this DevEndpoint.
- GlueVersion
-
- Type: string
Glue version determines the versions of Apache Spark and Python that AWS Glue supports. The Python version indicates the version supported for running your ETL scripts on development endpoints.
- NumberOfNodes
-
- Type: int
The number of AWS Glue Data Processing Units (DPUs) allocated to this DevEndpoint.
- NumberOfWorkers
-
- Type: int
The number of workers of a defined workerType that are allocated to the development endpoint.
- RoleArn
-
- Type: string
The Amazon Resource Name (ARN) of the role assigned to the new DevEndpoint.
- SecurityConfiguration
-
- Type: string
The name of the SecurityConfiguration structure being used with this DevEndpoint.
- SecurityGroupIds
-
- Type: Array of strings
The security groups assigned to the new DevEndpoint.
- Status
-
- Type: string
The current status of the new DevEndpoint.
- SubnetId
-
- Type: string
The subnet ID assigned to the new DevEndpoint.
- VpcId
-
- Type: string
The ID of the virtual private cloud (VPC) used by this DevEndpoint.
- WorkerType
-
- Type: string
The type of predefined worker that is allocated to the development endpoint. May be a value of Standard, G.1X, or G.2X.
- YarnEndpointAddress
-
- Type: string
The address of the YARN endpoint used by this DevEndpoint.
- ZeppelinRemoteSparkInterpreterPort
-
- Type: int
The Apache Zeppelin port for the remote Apache Spark interpreter.
Errors
-
AccessDeniedException:
Access to a resource was denied.
-
AlreadyExistsException:
A resource to be created or added already exists.
-
IdempotentParameterMismatchException:
The same unique identifier was associated with two different records.
-
InternalServiceException:
An internal service error occurred.
-
OperationTimeoutException:
The operation timed out.
-
InvalidInputException:
The input provided was not valid.
-
ValidationException:
A value could not be validated.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
CreateJob
$result = $client->createJob([/* ... */]);
$promise = $client->createJobAsync([/* ... */]);
Creates a new job definition.
Parameter Syntax
$result = $client->createJob([
    'AllocatedCapacity' => <integer>,
    'Command' => [ // REQUIRED
        'Name' => '<string>',
        'PythonVersion' => '<string>',
        'ScriptLocation' => '<string>',
    ],
    'Connections' => [
        'Connections' => ['<string>', ...],
    ],
    'DefaultArguments' => ['<string>', ...],
    'Description' => '<string>',
    'ExecutionProperty' => [
        'MaxConcurrentRuns' => <integer>,
    ],
    'GlueVersion' => '<string>',
    'LogUri' => '<string>',
    'MaxCapacity' => <float>,
    'MaxRetries' => <integer>,
    'Name' => '<string>', // REQUIRED
    'NonOverridableArguments' => ['<string>', ...],
    'NotificationProperty' => [
        'NotifyDelayAfter' => <integer>,
    ],
    'NumberOfWorkers' => <integer>,
    'Role' => '<string>', // REQUIRED
    'SecurityConfiguration' => '<string>',
    'Tags' => ['<string>', ...],
    'Timeout' => <integer>,
    'WorkerType' => 'Standard|G.1X|G.2X',
]);
Parameter Details
Members
- AllocatedCapacity
-
- Type: int
This parameter is deprecated. Use MaxCapacity instead.
The number of AWS Glue data processing units (DPUs) to allocate to this Job. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.
- Command
-
- Required: Yes
- Type: JobCommand structure
The JobCommand that executes this job.
- Connections
-
- Type: ConnectionsList structure
The connections used for this job.
- DefaultArguments
-
- Type: Associative array of custom strings keys (GenericString) to strings
The default arguments for this job.
You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes.
For information about how to specify and consume your own Job arguments, see the Calling AWS Glue APIs in Python topic in the developer guide.
For information about the key-value pairs that AWS Glue consumes to set up your job, see the Special Parameters Used by AWS Glue topic in the developer guide.
- Description
-
- Type: string
Description of the job being defined.
- ExecutionProperty
-
- Type: ExecutionProperty structure
An ExecutionProperty specifying the maximum number of concurrent runs allowed for this job.
- GlueVersion
-
- Type: string
Glue version determines the versions of Apache Spark and Python that AWS Glue supports. The Python version indicates the version supported for jobs of type Spark.
For more information about the available AWS Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide.
Jobs that are created without specifying a Glue version default to Glue 0.9.
- LogUri
-
- Type: string
This field is reserved for future use.
- MaxCapacity
-
- Type: double
The number of AWS Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.
Do not set MaxCapacity if using WorkerType and NumberOfWorkers.
The value that can be allocated for MaxCapacity depends on whether you are running a Python shell job or an Apache Spark ETL job:
-
When you specify a Python shell job (JobCommand.Name="pythonshell"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.
-
When you specify an Apache Spark ETL job (JobCommand.Name="glueetl") or Apache Spark streaming ETL job (JobCommand.Name="gluestreaming"), you can allocate from 2 to 100 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.
- MaxRetries
-
- Type: int
The maximum number of times to retry this job if it fails.
- Name
-
- Required: Yes
- Type: string
The name you assign to this job definition. It must be unique in your account.
- NonOverridableArguments
-
- Type: Associative array of custom strings keys (GenericString) to strings
Non-overridable arguments for this job, specified as name-value pairs.
- NotificationProperty
-
- Type: NotificationProperty structure
Specifies configuration properties of a job notification.
- NumberOfWorkers
-
- Type: int
The number of workers of a defined workerType that are allocated when a job runs.
The maximum number of workers you can define is 299 for G.1X, and 149 for G.2X.
- Role
-
- Required: Yes
- Type: string
The name or Amazon Resource Name (ARN) of the IAM role associated with this job.
- SecurityConfiguration
-
- Type: string
The name of the SecurityConfiguration structure to be used with this job.
- Tags
-
- Type: Associative array of custom strings keys (TagKey) to strings
The tags to use with this job. You may use tags to limit access to the job. For more information about tags in AWS Glue, see AWS Tags in AWS Glue in the developer guide.
- Timeout
-
- Type: int
The job timeout in minutes. This is the maximum time that a job run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).
- WorkerType
-
- Type: string
The type of predefined worker that is allocated when a job runs. Accepts a value of Standard, G.1X, or G.2X.
-
For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50GB disk, and 2 executors per worker.
-
For the G.1X worker type, each worker maps to 1 DPU (4 vCPU, 16 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.
-
For the G.2X worker type, each worker maps to 2 DPU (8 vCPU, 32 GB of memory, 128 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.
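The rule that MaxCapacity should not be combined with WorkerType and NumberOfWorkers can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the SDK: buildCreateJobParams is a hypothetical helper, and the job name, role ARN, and script location are placeholder values.

```php
<?php
// Hypothetical helper (not an SDK function): assembles a CreateJob parameter
// array and rejects MaxCapacity combined with WorkerType/NumberOfWorkers.
function buildCreateJobParams(
    string $name,
    string $role,
    string $scriptLocation,
    array $capacity
): array {
    $usesWorkers = isset($capacity['WorkerType']) || isset($capacity['NumberOfWorkers']);
    if (isset($capacity['MaxCapacity']) && $usesWorkers) {
        throw new InvalidArgumentException(
            'Set either MaxCapacity or WorkerType/NumberOfWorkers, not both.'
        );
    }

    // The three REQUIRED members come first; capacity settings are merged in.
    return array_merge([
        'Name' => $name,
        'Role' => $role,
        'Command' => [
            'Name' => 'glueetl', // Apache Spark ETL job
            'ScriptLocation' => $scriptLocation,
        ],
    ], $capacity);
}

// Placeholder values, for illustration only.
$params = buildCreateJobParams(
    'example-etl-job',
    'arn:aws:iam::123456789012:role/ExampleGlueRole',
    's3://example-bucket/scripts/job.py',
    ['WorkerType' => 'G.1X', 'NumberOfWorkers' => 10]
);

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->createJob($params);
// echo $result['Name'];
```

Keeping the check in a plain function means it can be exercised without AWS credentials; the service remains the authority on which combinations are accepted.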
Result Syntax
[ 'Name' => '<string>', ]
Result Details
Errors
-
InvalidInputException:
The input provided was not valid.
-
IdempotentParameterMismatchException:
The same unique identifier was associated with two different records.
-
AlreadyExistsException:
A resource to be created or added already exists.
-
InternalServiceException:
An internal service error occurred.
-
OperationTimeoutException:
The operation timed out.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
ConcurrentModificationException:
Two processes are trying to modify a resource simultaneously.
CreateMLTransform
$result = $client->createMLTransform([/* ... */]);
$promise = $client->createMLTransformAsync([/* ... */]);
Creates an AWS Glue machine learning transform. This operation creates the transform and all the necessary parameters to train it.
Call this operation as the first step in the process of using a machine learning transform (such as the FindMatches transform) for deduplicating data. You can provide an optional Description, in addition to the parameters that you want to use for your algorithm.
You must also specify certain parameters for the tasks that AWS Glue runs on your behalf as part of learning from your data and creating a high-quality machine learning transform. These parameters include Role, and optionally, AllocatedCapacity, Timeout, and MaxRetries. For more information, see Jobs.
Parameter Syntax
$result = $client->createMLTransform([
    'Description' => '<string>',
    'GlueVersion' => '<string>',
    'InputRecordTables' => [ // REQUIRED
        [
            'CatalogId' => '<string>',
            'ConnectionName' => '<string>',
            'DatabaseName' => '<string>', // REQUIRED
            'TableName' => '<string>', // REQUIRED
        ],
        // ...
    ],
    'MaxCapacity' => <float>,
    'MaxRetries' => <integer>,
    'Name' => '<string>', // REQUIRED
    'NumberOfWorkers' => <integer>,
    'Parameters' => [ // REQUIRED
        'FindMatchesParameters' => [
            'AccuracyCostTradeoff' => <float>,
            'EnforceProvidedLabels' => true || false,
            'PrecisionRecallTradeoff' => <float>,
            'PrimaryKeyColumnName' => '<string>',
        ],
        'TransformType' => 'FIND_MATCHES', // REQUIRED
    ],
    'Role' => '<string>', // REQUIRED
    'Tags' => ['<string>', ...],
    'Timeout' => <integer>,
    'TransformEncryption' => [
        'MlUserDataEncryption' => [
            'KmsKeyId' => '<string>',
            'MlUserDataEncryptionMode' => 'DISABLED|SSE-KMS', // REQUIRED
        ],
        'TaskRunSecurityConfigurationName' => '<string>',
    ],
    'WorkerType' => 'Standard|G.1X|G.2X',
]);
Parameter Details
Members
- Description
-
- Type: string
A description of the machine learning transform that is being defined. The default is an empty string.
- GlueVersion
-
- Type: string
This value determines which version of AWS Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see AWS Glue Versions in the developer guide.
- InputRecordTables
-
- Required: Yes
- Type: Array of GlueTable structures
A list of AWS Glue table definitions used by the transform.
- MaxCapacity
-
- Type: double
The number of AWS Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.
MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType.
-
If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set.
-
If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set.
-
If WorkerType is set, then NumberOfWorkers is required (and vice versa).
-
MaxCapacity and NumberOfWorkers must both be at least 1.
When the WorkerType field is set to a value other than Standard, the MaxCapacity field is set automatically and becomes read-only.
- MaxRetries
-
- Type: int
The maximum number of times to retry a task for this transform after a task run fails.
- Name
-
- Required: Yes
- Type: string
The unique name that you give the transform when you create it.
- NumberOfWorkers
-
- Type: int
The number of workers of a defined workerType that are allocated when this task runs.
If WorkerType is set, then NumberOfWorkers is required (and vice versa).
- Parameters
-
- Required: Yes
- Type: TransformParameters structure
The algorithmic parameters that are specific to the transform type used. Conditionally dependent on the transform type.
- Role
-
- Required: Yes
- Type: string
The name or Amazon Resource Name (ARN) of the IAM role with the required permissions. The required permissions include both AWS Glue service role permissions to AWS Glue resources, and Amazon S3 permissions required by the transform.
-
This role needs AWS Glue service role permissions to allow access to resources in AWS Glue. See Attach a Policy to IAM Users That Access AWS Glue.
-
This role needs permission to your Amazon Simple Storage Service (Amazon S3) sources, targets, temporary directory, scripts, and any libraries used by the task run for this transform.
- Tags
-
- Type: Associative array of custom strings keys (TagKey) to strings
The tags to use with this machine learning transform. You may use tags to limit access to the machine learning transform. For more information about tags in AWS Glue, see AWS Tags in AWS Glue in the developer guide.
- Timeout
-
- Type: int
The timeout of the task run for this transform in minutes. This is the maximum time that a task run for this transform can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).
- TransformEncryption
-
- Type: TransformEncryption structure
The encryption-at-rest settings of the transform that apply to accessing user data. Machine learning transforms can access user data encrypted in Amazon S3 using KMS.
- WorkerType
-
- Type: string
The type of predefined worker that is allocated when this task runs. Accepts a value of Standard, G.1X, or G.2X.
-
For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50GB disk, and 2 executors per worker.
-
For the G.1X worker type, each worker provides 4 vCPU, 16 GB of memory and a 64GB disk, and 1 executor per worker.
-
For the G.2X worker type, each worker provides 8 vCPU, 32 GB of memory and a 128GB disk, and 1 executor per worker.
MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType.
-
If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set.
-
If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set.
-
If WorkerType is set, then NumberOfWorkers is required (and vice versa).
-
MaxCapacity and NumberOfWorkers must both be at least 1.
Result Syntax
[ 'TransformId' => '<string>', ]
Result Details
Errors
-
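AlreadyExistsException:
A resource to be created or added already exists.
-
InvalidInputException:
The input provided was not valid.
-
OperationTimeoutException:
The operation timed out.
-
InternalServiceException:
An internal service error occurred.
-
AccessDeniedException:
Access to a resource was denied.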
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
IdempotentParameterMismatchException:
The same unique identifier was associated with two different records.
CreatePartition
$result = $client->createPartition([/* ... */]);
$promise = $client->createPartitionAsync([/* ... */]);
Creates a new partition.
Parameter Syntax
$result = $client->createPartition([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionInput' => [ // REQUIRED
        'LastAccessTime' => <integer || string || DateTime>,
        'LastAnalyzedTime' => <integer || string || DateTime>,
        'Parameters' => ['<string>', ...],
        'StorageDescriptor' => [
            'BucketColumns' => ['<string>', ...],
            'Columns' => [
                [
                    'Comment' => '<string>',
                    'Name' => '<string>', // REQUIRED
                    'Parameters' => ['<string>', ...],
                    'Type' => '<string>',
                ],
                // ...
            ],
            'Compressed' => true || false,
            'InputFormat' => '<string>',
            'Location' => '<string>',
            'NumberOfBuckets' => <integer>,
            'OutputFormat' => '<string>',
            'Parameters' => ['<string>', ...],
            'SchemaReference' => [
                'SchemaId' => [
                    'RegistryName' => '<string>',
                    'SchemaArn' => '<string>',
                    'SchemaName' => '<string>',
                ],
                'SchemaVersionId' => '<string>',
                'SchemaVersionNumber' => <integer>,
            ],
            'SerdeInfo' => [
                'Name' => '<string>',
                'Parameters' => ['<string>', ...],
                'SerializationLibrary' => '<string>',
            ],
            'SkewedInfo' => [
                'SkewedColumnNames' => ['<string>', ...],
                'SkewedColumnValueLocationMaps' => ['<string>', ...],
                'SkewedColumnValues' => ['<string>', ...],
            ],
            'SortColumns' => [
                [
                    'Column' => '<string>', // REQUIRED
                    'SortOrder' => <integer>, // REQUIRED
                ],
                // ...
            ],
            'StoredAsSubDirectories' => true || false,
        ],
        'Values' => ['<string>', ...],
    ],
    'TableName' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The AWS account ID of the catalog in which the partition is to be created.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the metadata database in which the partition is to be created.
- PartitionInput
-
- Required: Yes
- Type: PartitionInput structure
A PartitionInput structure defining the partition to be created.
- TableName
-
- Required: Yes
- Type: string
The name of the metadata table in which the partition is to be created.
Result Syntax
[]
Result Details
Errors
-
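InvalidInputException:
The input provided was not valid.
-
AlreadyExistsException:
A resource to be created or added already exists.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
InternalServiceException:
An internal service error occurred.
-
EntityNotFoundException:
A specified entity does not exist.
-
OperationTimeoutException:
The operation timed out.
-
GlueEncryptionException:
An encryption operation failed.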
CreatePartitionIndex
$result = $client->createPartitionIndex([/* ... */]);
$promise = $client->createPartitionIndexAsync([/* ... */]);
Creates a specified partition index in an existing table.
Parameter Syntax
$result = $client->createPartitionIndex([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionIndex' => [ // REQUIRED
        'IndexName' => '<string>', // REQUIRED
        'Keys' => ['<string>', ...], // REQUIRED
    ],
    'TableName' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The catalog ID where the table resides.
- DatabaseName
-
- Required: Yes
- Type: string
Specifies the name of a database in which you want to create a partition index.
- PartitionIndex
-
- Required: Yes
- Type: PartitionIndex structure
Specifies a PartitionIndex structure to create a partition index in an existing table.
- TableName
-
- Required: Yes
- Type: string
Specifies the name of a table in which you want to create a partition index.
Result Syntax
[]
Result Details
Errors
-
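AlreadyExistsException:
A resource to be created or added already exists.
-
InvalidInputException:
The input provided was not valid.
-
EntityNotFoundException:
A specified entity does not exist.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
InternalServiceException:
An internal service error occurred.
-
OperationTimeoutException:
The operation timed out.
-
GlueEncryptionException:
An encryption operation failed.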
CreateRegistry
$result = $client->createRegistry([/* ... */]);
$promise = $client->createRegistryAsync([/* ... */]);
Creates a new registry which may be used to hold a collection of schemas.
Parameter Syntax
$result = $client->createRegistry([
    'Description' => '<string>',
    'RegistryName' => '<string>', // REQUIRED
    'Tags' => ['<string>', ...],
]);
Parameter Details
Members
- Description
-
- Type: string
A description of the registry. If description is not provided, there will not be any default value for this.
- RegistryName
-
- Required: Yes
- Type: string
The name of the registry to be created. It has a maximum length of 255 and may contain only letters, numbers, hyphens, underscores, dollar signs, or hash marks. Whitespace is not allowed.
- Tags
-
- Type: Associative array of custom strings keys (TagKey) to strings
AWS tags that contain a key value pair and may be searched by console, command line, or API.
Result Syntax
[
    'Description' => '<string>',
    'RegistryArn' => '<string>',
    'RegistryName' => '<string>',
    'Tags' => ['<string>', ...],
]
Result Details
Members
Errors
-
InvalidInputException:
The input provided was not valid.
-
AccessDeniedException:
Access to a resource was denied.
-
AlreadyExistsException:
A resource to be created or added already exists.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
InternalServiceException:
An internal service error occurred.
CreateSchema
$result = $client->createSchema([/* ... */]);
$promise = $client->createSchemaAsync([/* ... */]);
Creates a new schema set and registers the schema definition. Returns an error if the schema set already exists without actually registering the version.
When the schema set is created, a version checkpoint will be set to the first version. Compatibility mode "DISABLED" restricts any additional schema versions from being added after the first schema version. For all other compatibility modes, validation of compatibility settings will be applied only from the second version onwards when the RegisterSchemaVersion API is used.
When this API is called without a RegistryId, this will create an entry for a "default-registry" in the registry database tables, if it is not already present.
Parameter Syntax
$result = $client->createSchema([
    'Compatibility' => 'NONE|DISABLED|BACKWARD|BACKWARD_ALL|FORWARD|FORWARD_ALL|FULL|FULL_ALL',
    'DataFormat' => 'AVRO', // REQUIRED
    'Description' => '<string>',
    'RegistryId' => [
        'RegistryArn' => '<string>',
        'RegistryName' => '<string>',
    ],
    'SchemaDefinition' => '<string>',
    'SchemaName' => '<string>', // REQUIRED
    'Tags' => ['<string>', ...],
]);
Parameter Details
Members
- Compatibility
-
- Type: string
The compatibility mode of the schema. The possible values are:
-
NONE: No compatibility mode applies. You can use this choice in development scenarios or if you do not know the compatibility mode that you want to apply to schemas. Any new version added will be accepted without undergoing a compatibility check.
-
DISABLED: This compatibility choice prevents versioning for a particular schema. You can use this choice to prevent future versioning of a schema.
-
BACKWARD: This compatibility choice is recommended as it allows data receivers to read both the current and one previous schema version. This means that for instance, a new schema version cannot drop data fields or change the type of these fields, so they can't be read by readers using the previous version.
-
BACKWARD_ALL: This compatibility choice allows data receivers to read both the current and all previous schema versions. You can use this choice when you need to delete fields or add optional fields, and check compatibility against all previous schema versions.
-
FORWARD: This compatibility choice allows data receivers to read both the current and one next schema version, but not necessarily later versions. You can use this choice when you need to add fields or delete optional fields, but only check compatibility against the last schema version.
-
FORWARD_ALL: This compatibility choice allows data receivers to read data written by producers of any new registered schema. You can use this choice when you need to add fields or delete optional fields, and check compatibility against all previous schema versions.
-
FULL: This compatibility choice allows data receivers to read data written by producers using the previous or next version of the schema, but not necessarily earlier or later versions. You can use this choice when you need to add or remove optional fields, but only check compatibility against the last schema version.
-
FULL_ALL: This compatibility choice allows data receivers to read data written by producers using all previous schema versions. You can use this choice when you need to add or remove optional fields, and check compatibility against all previous schema versions.
- DataFormat
-
- Required: Yes
- Type: string
The data format of the schema definition. Currently only AVRO is supported.
- Description
-
- Type: string
An optional description of the schema. If description is not provided, there will not be any automatic default value for this.
- RegistryId
-
- Type: RegistryId structure
This is a wrapper shape to contain the registry identity fields. If this is not provided, the default registry will be used. The ARN format for the default registry is: arn:aws:glue:us-east-2:<customer id>:registry/default-registry:random-5-letter-id.
- SchemaDefinition
-
- Type: string
The schema definition using the DataFormat setting for SchemaName.
- SchemaName
-
- Required: Yes
- Type: string
The name of the schema to be created. It has a maximum length of 255 and may contain only letters, numbers, hyphens, underscores, dollar signs, or hash marks. Whitespace is not allowed.
- Tags
-
- Type: Associative array of custom strings keys (TagKey) to strings
AWS tags that contain a key value pair and may be searched by console, command line, or API. If specified, follows the AWS tags-on-create pattern.
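As an illustration of how these members fit together, the sketch below builds a CreateSchema parameter array with an Avro record definition serialized to a JSON string. The schema, registry name, and field names are invented placeholders, and isValidSchemaName is a hypothetical helper that assumes "letters" and "numbers" mean ASCII characters; the page itself does not say so.

```php
<?php
// Hypothetical helper (not an SDK function): checks a name against the
// documented constraints, assuming ASCII letters and digits.
function isValidSchemaName(string $name): bool
{
    return preg_match('/\A[A-Za-z0-9_$#-]{1,255}\z/', $name) === 1;
}

// Placeholder Avro record; SchemaDefinition is passed as a string.
$avroSchema = [
    'type' => 'record',
    'name' => 'ExampleUser',
    'fields' => [
        ['name' => 'id', 'type' => 'long'],
        ['name' => 'email', 'type' => ['null', 'string'], 'default' => null],
    ],
];

$schemaName = 'example-user-schema';
if (!isValidSchemaName($schemaName)) {
    throw new InvalidArgumentException("Invalid schema name: $schemaName");
}

$params = [
    'SchemaName' => $schemaName,      // REQUIRED
    'DataFormat' => 'AVRO',           // REQUIRED
    'Compatibility' => 'BACKWARD',
    'RegistryId' => ['RegistryName' => 'example-registry'],
    'SchemaDefinition' => json_encode($avroSchema),
];

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->createSchema($params);
// echo $result['SchemaArn'];
```

If RegistryId is omitted, the description above indicates that the default registry is used.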
Result Syntax
[
    'Compatibility' => 'NONE|DISABLED|BACKWARD|BACKWARD_ALL|FORWARD|FORWARD_ALL|FULL|FULL_ALL',
    'DataFormat' => 'AVRO',
    'Description' => '<string>',
    'LatestSchemaVersion' => <integer>,
    'NextSchemaVersion' => <integer>,
    'RegistryArn' => '<string>',
    'RegistryName' => '<string>',
    'SchemaArn' => '<string>',
    'SchemaCheckpoint' => <integer>,
    'SchemaName' => '<string>',
    'SchemaStatus' => 'AVAILABLE|PENDING|DELETING',
    'SchemaVersionId' => '<string>',
    'SchemaVersionStatus' => 'AVAILABLE|PENDING|FAILURE|DELETING',
    'Tags' => ['<string>', ...],
]
Result Details
Members
- Compatibility
-
- Type: string
The schema compatibility mode.
- DataFormat
-
- Type: string
The data format of the schema definition. Currently only AVRO is supported.
- Description
-
- Type: string
A description of the schema if specified when created.
- LatestSchemaVersion
-
- Type: long (int|float)
The latest version of the schema associated with the returned schema definition.
- NextSchemaVersion
-
- Type: long (int|float)
The next version of the schema associated with the returned schema definition.
- RegistryArn
-
- Type: string
The Amazon Resource Name (ARN) of the registry.
- RegistryName
-
- Type: string
The name of the registry.
- SchemaArn
-
- Type: string
The Amazon Resource Name (ARN) of the schema.
- SchemaCheckpoint
-
- Type: long (int|float)
The version number of the checkpoint (the last time the compatibility mode was changed).
- SchemaName
-
- Type: string
The name of the schema.
- SchemaStatus
-
- Type: string
The status of the schema.
- SchemaVersionId
-
- Type: string
The unique identifier of the first schema version.
- SchemaVersionStatus
-
- Type: string
The status of the first schema version created.
- Tags
-
- Type: Associative array of custom strings keys (TagKey) to strings
The tags for the schema.
Errors
-
InvalidInputException:
The input provided was not valid.
-
AccessDeniedException:
Access to a resource was denied.
-
EntityNotFoundException:
A specified entity does not exist.
-
AlreadyExistsException:
A resource to be created or added already exists.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
InternalServiceException:
An internal service error occurred.
CreateScript
$result = $client->createScript([/* ... */]);
$promise = $client->createScriptAsync([/* ... */]);
Transforms a directed acyclic graph (DAG) into code.
Parameter Syntax
$result = $client->createScript([
    'DagEdges' => [
        [
            'Source' => '<string>', // REQUIRED
            'Target' => '<string>', // REQUIRED
            'TargetParameter' => '<string>',
        ],
        // ...
    ],
    'DagNodes' => [
        [
            'Args' => [ // REQUIRED
                [
                    'Name' => '<string>', // REQUIRED
                    'Param' => true || false,
                    'Value' => '<string>', // REQUIRED
                ],
                // ...
            ],
            'Id' => '<string>', // REQUIRED
            'LineNumber' => <integer>,
            'NodeType' => '<string>', // REQUIRED
        ],
        // ...
    ],
    'Language' => 'PYTHON|SCALA',
]);
Parameter Details
Members
- DagEdges
-
- Type: Array of CodeGenEdge structures
A list of the edges in the DAG.
- DagNodes
-
- Type: Array of CodeGenNode structures
A list of the nodes in the DAG.
- Language
-
- Type: string
The programming language of the resulting code from the DAG.
Result Syntax
[ 'PythonScript' => '<string>', 'ScalaCode' => '<string>', ]
Result Details
Members
Errors
-
InvalidInputException:
The input provided was not valid.
-
InternalServiceException:
An internal service error occurred.
-
OperationTimeoutException:
The operation timed out.
CreateSecurityConfiguration
$result = $client->createSecurityConfiguration([/* ... */]);
$promise = $client->createSecurityConfigurationAsync([/* ... */]);
Creates a new security configuration. A security configuration is a set of security properties that can be used by AWS Glue. You can use a security configuration to encrypt data at rest. For information about using security configurations in AWS Glue, see Encrypting Data Written by Crawlers, Jobs, and Development Endpoints.
Parameter Syntax
$result = $client->createSecurityConfiguration([
    'EncryptionConfiguration' => [ // REQUIRED
        'CloudWatchEncryption' => [
            'CloudWatchEncryptionMode' => 'DISABLED|SSE-KMS',
            'KmsKeyArn' => '<string>',
        ],
        'JobBookmarksEncryption' => [
            'JobBookmarksEncryptionMode' => 'DISABLED|CSE-KMS',
            'KmsKeyArn' => '<string>',
        ],
        'S3Encryption' => [
            [
                'KmsKeyArn' => '<string>',
                'S3EncryptionMode' => 'DISABLED|SSE-KMS|SSE-S3',
            ],
            // ...
        ],
    ],
    'Name' => '<string>', // REQUIRED
]);
Parameter Details
Members
- EncryptionConfiguration
-
- Required: Yes
- Type: EncryptionConfiguration structure
The encryption configuration for the new security configuration.
- Name
-
- Required: Yes
- Type: string
The name for the new security configuration.
Result Syntax
[ 'CreatedTimestamp' => <DateTime>, 'Name' => '<string>', ]
Result Details
Members
Errors
-
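AlreadyExistsException:
A resource to be created or added already exists.
-
InvalidInputException:
The input provided was not valid.
-
InternalServiceException:
An internal service error occurred.
-
OperationTimeoutException:
The operation timed out.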
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
CreateTable
$result = $client->createTable([/* ... */]);
$promise = $client->createTableAsync([/* ... */]);
Creates a new table definition in the Data Catalog.
Parameter Syntax
$result = $client->createTable([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionIndexes' => [
        [
            'IndexName' => '<string>', // REQUIRED
            'Keys' => ['<string>', ...], // REQUIRED
        ],
        // ...
    ],
    'TableInput' => [ // REQUIRED
        'Description' => '<string>',
        'LastAccessTime' => <integer || string || DateTime>,
        'LastAnalyzedTime' => <integer || string || DateTime>,
        'Name' => '<string>', // REQUIRED
        'Owner' => '<string>',
        'Parameters' => ['<string>', ...],
        'PartitionKeys' => [
            [
                'Comment' => '<string>',
                'Name' => '<string>', // REQUIRED
                'Parameters' => ['<string>', ...],
                'Type' => '<string>',
            ],
            // ...
        ],
        'Retention' => <integer>,
        'StorageDescriptor' => [
            'BucketColumns' => ['<string>', ...],
            'Columns' => [
                [
                    'Comment' => '<string>',
                    'Name' => '<string>', // REQUIRED
                    'Parameters' => ['<string>', ...],
                    'Type' => '<string>',
                ],
                // ...
            ],
            'Compressed' => true || false,
            'InputFormat' => '<string>',
            'Location' => '<string>',
            'NumberOfBuckets' => <integer>,
            'OutputFormat' => '<string>',
            'Parameters' => ['<string>', ...],
            'SchemaReference' => [
                'SchemaId' => [
                    'RegistryName' => '<string>',
                    'SchemaArn' => '<string>',
                    'SchemaName' => '<string>',
                ],
                'SchemaVersionId' => '<string>',
                'SchemaVersionNumber' => <integer>,
            ],
            'SerdeInfo' => [
                'Name' => '<string>',
                'Parameters' => ['<string>', ...],
                'SerializationLibrary' => '<string>',
            ],
            'SkewedInfo' => [
                'SkewedColumnNames' => ['<string>', ...],
                'SkewedColumnValueLocationMaps' => ['<string>', ...],
                'SkewedColumnValues' => ['<string>', ...],
            ],
            'SortColumns' => [
                [
                    'Column' => '<string>', // REQUIRED
                    'SortOrder' => <integer>, // REQUIRED
                ],
                // ...
            ],
            'StoredAsSubDirectories' => true || false,
        ],
        'TableType' => '<string>',
        'TargetTable' => [
            'CatalogId' => '<string>',
            'DatabaseName' => '<string>',
            'Name' => '<string>',
        ],
        'ViewExpandedText' => '<string>',
        'ViewOriginalText' => '<string>',
    ],
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog in which to create the Table. If none is supplied, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The catalog database in which to create the new table. For Hive compatibility, this name is entirely lowercase.
- PartitionIndexes
-
- Type: Array of PartitionIndex structures
A list of partition indexes, PartitionIndex structures, to create in the table.
- TableInput
-
- Required: Yes
- Type: TableInput structure
The TableInput object that defines the metadata table to create in the catalog.
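A small request for a partitioned table might look like the sketch below. All names and the S3 location are placeholders. The helper undefinedIndexKeys is hypothetical and rests on an assumption this page does not state: that every key in a partition index should name one of the table's partition keys.

```php
<?php
// Hypothetical helper (not an SDK function): lists index keys that do not
// match any PartitionKeys entry, formatted as "IndexName.key".
function undefinedIndexKeys(array $params): array
{
    $partitionKeys = array_column($params['TableInput']['PartitionKeys'] ?? [], 'Name');
    $missing = [];
    foreach ($params['PartitionIndexes'] ?? [] as $index) {
        foreach (array_diff($index['Keys'], $partitionKeys) as $key) {
            $missing[] = $index['IndexName'] . '.' . $key;
        }
    }
    return $missing;
}

$params = [
    // The database name is lowercased for Hive compatibility.
    'DatabaseName' => strtolower('Example_DB'),
    'TableInput' => [
        'Name' => 'example_events',
        'PartitionKeys' => [
            ['Name' => 'year', 'Type' => 'string'],
            ['Name' => 'month', 'Type' => 'string'],
        ],
        'StorageDescriptor' => [
            'Columns' => [
                ['Name' => 'event_id', 'Type' => 'string'],
                ['Name' => 'payload', 'Type' => 'string'],
            ],
            'Location' => 's3://example-bucket/events/',
        ],
    ],
    'PartitionIndexes' => [
        ['IndexName' => 'by_year_month', 'Keys' => ['year', 'month']],
    ],
];

$problems = undefinedIndexKeys($params);

// With a configured Aws\Glue\GlueClient in $client, and no problems found:
// $client->createTable($params);
```

CatalogId is left out here, so per the description above the AWS account ID is used by default.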
Result Syntax
[]
Result Details
Errors
-
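AlreadyExistsException:
A resource to be created or added already exists.
-
InvalidInputException:
The input provided was not valid.
-
EntityNotFoundException:
A specified entity does not exist.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
InternalServiceException:
An internal service error occurred.
-
OperationTimeoutException:
The operation timed out.
-
GlueEncryptionException:
An encryption operation failed.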
CreateTrigger
$result = $client->createTrigger([/* ... */]);
$promise = $client->createTriggerAsync([/* ... */]);
Creates a new trigger.
Parameter Syntax
$result = $client->createTrigger([
    'Actions' => [ // REQUIRED
        [
            'Arguments' => ['<string>', ...],
            'CrawlerName' => '<string>',
            'JobName' => '<string>',
            'NotificationProperty' => [
                'NotifyDelayAfter' => <integer>,
            ],
            'SecurityConfiguration' => '<string>',
            'Timeout' => <integer>,
        ],
        // ...
    ],
    'Description' => '<string>',
    'Name' => '<string>', // REQUIRED
    'Predicate' => [
        'Conditions' => [
            [
                'CrawlState' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED',
                'CrawlerName' => '<string>',
                'JobName' => '<string>',
                'LogicalOperator' => 'EQUALS',
                'State' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT',
            ],
            // ...
        ],
        'Logical' => 'AND|ANY',
    ],
    'Schedule' => '<string>',
    'StartOnCreation' => true || false,
    'Tags' => ['<string>', ...],
    'Type' => 'SCHEDULED|CONDITIONAL|ON_DEMAND', // REQUIRED
    'WorkflowName' => '<string>',
]);
Parameter Details
Members
- Actions
-
- Required: Yes
- Type: Array of Action structures
The actions initiated by this trigger when it fires.
- Description
-
- Type: string
A description of the new trigger.
- Name
-
- Required: Yes
- Type: string
The name of the trigger.
- Predicate
-
- Type: Predicate structure
A predicate to specify when the new trigger should fire.
This field is required when the trigger type is CONDITIONAL.
- Schedule
-
- Type: string
A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
This field is required when the trigger type is SCHEDULED.
- StartOnCreation
-
- Type: boolean
Set to true to start SCHEDULED and CONDITIONAL triggers when created. True is not supported for ON_DEMAND triggers.
- Tags
-
- Type: Associative array of custom strings keys (TagKey) to strings
The tags to use with this trigger. You may use tags to limit access to the trigger. For more information about tags in AWS Glue, see AWS Tags in AWS Glue in the developer guide.
- Type
-
- Required: Yes
- Type: string
The type of the new trigger.
- WorkflowName
-
- Type: string
The name of the workflow associated with the trigger.
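The member descriptions above imply a few cross-field rules: Schedule is required for SCHEDULED triggers, Predicate is required for CONDITIONAL triggers, and StartOnCreation cannot be true for ON_DEMAND triggers. A minimal sketch of a client-side check follows. The function name, trigger name, and job name are hypothetical, and the service performs its own validation regardless.

```php
<?php
// Hypothetical pre-flight check (not part of the SDK) mirroring the constraints
// stated in the createTrigger member descriptions.
function validateTriggerParams(array $params): array
{
    $problems = [];
    $type = $params['Type'] ?? null;

    if ($type === 'SCHEDULED' && empty($params['Schedule'])) {
        $problems[] = 'Schedule is required for SCHEDULED triggers';
    }
    if ($type === 'CONDITIONAL' && empty($params['Predicate'])) {
        $problems[] = 'Predicate is required for CONDITIONAL triggers';
    }
    if ($type === 'ON_DEMAND' && ($params['StartOnCreation'] ?? false) === true) {
        $problems[] = 'StartOnCreation=true is not supported for ON_DEMAND triggers';
    }

    return $problems;
}

$params = [
    'Name' => 'daily-load',                      // illustrative trigger name
    'Type' => 'SCHEDULED',
    'Schedule' => 'cron(15 12 * * ? *)',         // every day at 12:15 UTC
    'StartOnCreation' => true,
    'Actions' => [['JobName' => 'example-job']], // illustrative job name
];
$problems = validateTriggerParams($params);

// If $problems is empty and $client is a configured Aws\Glue\GlueClient:
// $result = $client->createTrigger($params);
```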
Result Syntax
[ 'Name' => '<string>', ]
Result Details
Errors
-
A resource to be created or added already exists.
-
A specified entity does not exist
-
The input provided was not valid.
-
IdempotentParameterMismatchException:
The same unique identifier was associated with two different records.
-
An internal service error occurred.
-
The operation timed out.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
ConcurrentModificationException:
Two processes are trying to modify a resource simultaneously.
CreateUserDefinedFunction
$result = $client->createUserDefinedFunction([/* ... */]);
$promise = $client->createUserDefinedFunctionAsync([/* ... */]);
Creates a new function definition in the Data Catalog.
Parameter Syntax
$result = $client->createUserDefinedFunction([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'FunctionInput' => [ // REQUIRED
        'ClassName' => '<string>',
        'FunctionName' => '<string>',
        'OwnerName' => '<string>',
        'OwnerType' => 'USER|ROLE|GROUP',
        'ResourceUris' => [
            [
                'ResourceType' => 'JAR|FILE|ARCHIVE',
                'Uri' => '<string>',
            ],
            // ...
        ],
    ],
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog in which to create the function. If none is provided, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database in which to create the function.
- FunctionInput
-
- Required: Yes
- Type: UserDefinedFunctionInput structure
A FunctionInput object that defines the function to create in the Data Catalog.
Result Syntax
[]
Result Details
Errors
-
A resource to be created or added already exists.
-
The input provided was not valid.
-
An internal service error occurred.
-
A specified entity does not exist
-
The operation timed out.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
An encryption operation failed.
CreateWorkflow
$result = $client->createWorkflow([/* ... */]);
$promise = $client->createWorkflowAsync([/* ... */]);
Creates a new workflow.
Parameter Syntax
$result = $client->createWorkflow([
    'DefaultRunProperties' => ['<string>', ...],
    'Description' => '<string>',
    'MaxConcurrentRuns' => <integer>,
    'Name' => '<string>', // REQUIRED
    'Tags' => ['<string>', ...],
]);
Parameter Details
Members
- DefaultRunProperties
-
- Type: Associative array of custom strings keys (IdString) to strings
A collection of properties to be used as part of each execution of the workflow.
- Description
-
- Type: string
A description of the workflow.
- MaxConcurrentRuns
-
- Type: int
You can use this parameter to prevent unwanted multiple updates to data, to control costs, or in some cases, to prevent exceeding the maximum number of concurrent runs of any of the component jobs. If you leave this parameter blank, there is no limit to the number of concurrent workflow runs.
- Name
-
- Required: Yes
- Type: string
The name to be assigned to the workflow. It should be unique within your account.
- Tags
-
- Type: Associative array of custom strings keys (TagKey) to strings
The tags to be used with this workflow.
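Because DefaultRunProperties is an associative array of strings to strings, and leaving MaxConcurrentRuns unset means no limit, a request builder might look like the sketch below. The helper name and the sample property values are assumptions made for illustration.

```php
<?php
// Hypothetical helper (not part of the SDK): builds createWorkflow() parameters.
function buildCreateWorkflowParams(string $name, array $runProperties = [], ?int $maxConcurrentRuns = null): array
{
    $params = ['Name' => $name];

    if ($runProperties !== []) {
        // DefaultRunProperties maps string keys to string values, so scalar values are cast.
        $params['DefaultRunProperties'] = array_map('strval', $runProperties);
    }
    if ($maxConcurrentRuns !== null) {
        // Omitting this key leaves the number of concurrent workflow runs unlimited.
        $params['MaxConcurrentRuns'] = $maxConcurrentRuns;
    }

    return $params;
}

$params = buildCreateWorkflowParams('example-workflow', ['batch_size' => 500], 1);

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->createWorkflow($params);
// $result['Name'] would then hold the workflow name.
```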
Result Syntax
[ 'Name' => '<string>', ]
Result Details
Errors
-
A resource to be created or added already exists.
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
ConcurrentModificationException:
Two processes are trying to modify a resource simultaneously.
DeleteClassifier
$result = $client->deleteClassifier([/* ... */]);
$promise = $client->deleteClassifierAsync([/* ... */]);
Removes a classifier from the Data Catalog.
Parameter Syntax
$result = $client->deleteClassifier([
    'Name' => '<string>', // REQUIRED
]);
Parameter Details
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
The operation timed out.
DeleteColumnStatisticsForPartition
$result = $client->deleteColumnStatisticsForPartition([/* ... */]);
$promise = $client->deleteColumnStatisticsForPartitionAsync([/* ... */]);
Delete the partition column statistics of a column.
The Identity and Access Management (IAM) permission required for this operation is DeletePartition.
Parameter Syntax
$result = $client->deleteColumnStatisticsForPartition([
    'CatalogId' => '<string>',
    'ColumnName' => '<string>', // REQUIRED
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionValues' => ['<string>', ...], // REQUIRED
    'TableName' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.
- ColumnName
-
- Required: Yes
- Type: string
Name of the column.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database where the partitions reside.
- PartitionValues
-
- Required: Yes
- Type: Array of strings
A list of partition values identifying the partition.
- TableName
-
- Required: Yes
- Type: string
The name of the partitions' table.
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
An encryption operation failed.
DeleteColumnStatisticsForTable
$result = $client->deleteColumnStatisticsForTable([/* ... */]);
$promise = $client->deleteColumnStatisticsForTableAsync([/* ... */]);
Deletes table statistics of columns.
The Identity and Access Management (IAM) permission required for this operation is DeleteTable.
Parameter Syntax
$result = $client->deleteColumnStatisticsForTable([
    'CatalogId' => '<string>',
    'ColumnName' => '<string>', // REQUIRED
    'DatabaseName' => '<string>', // REQUIRED
    'TableName' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.
- ColumnName
-
- Required: Yes
- Type: string
The name of the column.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database where the partitions reside.
- TableName
-
- Required: Yes
- Type: string
The name of the partitions' table.
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
An encryption operation failed.
DeleteConnection
$result = $client->deleteConnection([/* ... */]);
$promise = $client->deleteConnectionAsync([/* ... */]);
Deletes a connection from the Data Catalog.
Parameter Syntax
$result = $client->deleteConnection([
    'CatalogId' => '<string>',
    'ConnectionName' => '<string>', // REQUIRED
]);
Parameter Details
Members
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
The operation timed out.
DeleteCrawler
$result = $client->deleteCrawler([/* ... */]);
$promise = $client->deleteCrawlerAsync([/* ... */]);
Removes a specified crawler from the AWS Glue Data Catalog, unless the crawler state is RUNNING.
Parameter Syntax
$result = $client->deleteCrawler([
    'Name' => '<string>', // REQUIRED
]);
Parameter Details
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
The operation cannot be performed because the crawler is already running.
-
SchedulerTransitioningException:
The specified scheduler is transitioning.
-
The operation timed out.
DeleteDatabase
$result = $client->deleteDatabase([/* ... */]);
$promise = $client->deleteDatabaseAsync([/* ... */]);
Removes a specified database from a Data Catalog.
After completing this operation, you no longer have access to the tables (and all table versions and partitions that might belong to the tables) and the user-defined functions in the deleted database. AWS Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.
To ensure the immediate deletion of all related resources, before calling DeleteDatabase, use DeleteTableVersion or BatchDeleteTableVersion, DeletePartition or BatchDeletePartition, DeleteUserDefinedFunction, and DeleteTable or BatchDeleteTable, to delete any resources that belong to the database.
Parameter Syntax
$result = $client->deleteDatabase([
    'CatalogId' => '<string>',
    'Name' => '<string>', // REQUIRED
]);
Parameter Details
Members
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
DeleteDevEndpoint
$result = $client->deleteDevEndpoint([/* ... */]);
$promise = $client->deleteDevEndpointAsync([/* ... */]);
Deletes a specified development endpoint.
Parameter Syntax
$result = $client->deleteDevEndpoint([
    'EndpointName' => '<string>', // REQUIRED
]);
Parameter Details
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
An internal service error occurred.
-
The operation timed out.
-
The input provided was not valid.
DeleteJob
$result = $client->deleteJob([/* ... */]);
$promise = $client->deleteJobAsync([/* ... */]);
Deletes a specified job definition. If the job definition is not found, no exception is thrown.
Parameter Syntax
$result = $client->deleteJob([
    'JobName' => '<string>', // REQUIRED
]);
Parameter Details
Result Syntax
[ 'JobName' => '<string>', ]
Result Details
Errors
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
DeleteMLTransform
$result = $client->deleteMLTransform([/* ... */]);
$promise = $client->deleteMLTransformAsync([/* ... */]);
Deletes an AWS Glue machine learning transform. Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed by learning from examples provided by humans. These transformations are then saved by AWS Glue. If you no longer need a transform, you can delete it by calling DeleteMLTransform. However, any AWS Glue jobs that still reference the deleted transform will no longer succeed.
Parameter Syntax
$result = $client->deleteMLTransform([
    'TransformId' => '<string>', // REQUIRED
]);
Parameter Details
Result Syntax
[ 'TransformId' => '<string>', ]
Result Details
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
The operation timed out.
-
An internal service error occurred.
DeletePartition
$result = $client->deletePartition([/* ... */]);
$promise = $client->deletePartitionAsync([/* ... */]);
Deletes a specified partition.
Parameter Syntax
$result = $client->deletePartition([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionValues' => ['<string>', ...], // REQUIRED
    'TableName' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the partition to be deleted resides. If none is provided, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database in which the table in question resides.
- PartitionValues
-
- Required: Yes
- Type: Array of strings
The values that define the partition.
- TableName
-
- Required: Yes
- Type: string
The name of the table that contains the partition to be deleted.
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
DeletePartitionIndex
$result = $client->deletePartitionIndex([/* ... */]);
$promise = $client->deletePartitionIndexAsync([/* ... */]);
Deletes a specified partition index from an existing table.
Parameter Syntax
$result = $client->deletePartitionIndex([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'IndexName' => '<string>', // REQUIRED
    'TableName' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The catalog ID where the table resides.
- DatabaseName
-
- Required: Yes
- Type: string
Specifies the name of a database from which you want to delete a partition index.
- IndexName
-
- Required: Yes
- Type: string
The name of the partition index to be deleted.
- TableName
-
- Required: Yes
- Type: string
Specifies the name of a table from which you want to delete a partition index.
Result Syntax
[]
Result Details
Errors
-
An internal service error occurred.
-
The operation timed out.
-
The input provided was not valid.
-
A specified entity does not exist
-
The CreatePartitions API was called on a table that has indexes enabled.
-
An encryption operation failed.
DeleteRegistry
$result = $client->deleteRegistry([/* ... */]);
$promise = $client->deleteRegistryAsync([/* ... */]);
Delete the entire registry including schema and all of its versions. To get the status of the delete operation, you can call the GetRegistry API after the asynchronous call. Deleting a registry will disable all online operations for the registry such as the UpdateRegistry, CreateSchema, UpdateSchema, and RegisterSchemaVersion APIs.
Parameter Syntax
$result = $client->deleteRegistry([
    'RegistryId' => [ // REQUIRED
        'RegistryArn' => '<string>',
        'RegistryName' => '<string>',
    ],
]);
Parameter Details
Members
- RegistryId
-
- Required: Yes
- Type: RegistryId structure
This is a wrapper structure that may contain the registry name and Amazon Resource Name (ARN).
Result Syntax
[ 'RegistryArn' => '<string>', 'RegistryName' => '<string>', 'Status' => 'AVAILABLE|DELETING', ]
Result Details
Members
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
Access to a resource was denied.
-
ConcurrentModificationException:
Two processes are trying to modify a resource simultaneously.
DeleteResourcePolicy
$result = $client->deleteResourcePolicy([/* ... */]);
$promise = $client->deleteResourcePolicyAsync([/* ... */]);
Deletes a specified policy.
Parameter Syntax
$result = $client->deleteResourcePolicy([
    'PolicyHashCondition' => '<string>',
    'ResourceArn' => '<string>',
]);
Parameter Details
Members
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
An internal service error occurred.
-
The operation timed out.
-
The input provided was not valid.
-
ConditionCheckFailureException:
A specified condition was not satisfied.
DeleteSchema
$result = $client->deleteSchema([/* ... */]);
$promise = $client->deleteSchemaAsync([/* ... */]);
Deletes the entire schema set, including the schema set and all of its versions. To get the status of the delete operation, you can call the GetSchema API after the asynchronous call. Deleting a schema will disable all online operations for the schema, such as the GetSchemaByDefinition and RegisterSchemaVersion APIs.
Parameter Syntax
$result = $client->deleteSchema([
    'SchemaId' => [ // REQUIRED
        'RegistryName' => '<string>',
        'SchemaArn' => '<string>',
        'SchemaName' => '<string>',
    ],
]);
Parameter Details
Members
- SchemaId
-
- Required: Yes
- Type: SchemaId structure
This is a wrapper structure that may contain the schema name and Amazon Resource Name (ARN).
Result Syntax
[ 'SchemaArn' => '<string>', 'SchemaName' => '<string>', 'Status' => 'AVAILABLE|PENDING|DELETING', ]
Result Details
Members
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
Access to a resource was denied.
-
ConcurrentModificationException:
Two processes are trying to modify a resource simultaneously.
DeleteSchemaVersions
$result = $client->deleteSchemaVersions([/* ... */]);
$promise = $client->deleteSchemaVersionsAsync([/* ... */]);
Remove versions from the specified schema. A version number or range may be supplied. If the compatibility mode forbids deleting a version that is necessary, such as BACKWARDS_FULL, an error is returned. Calling the GetSchemaVersions API after this call will list the status of the deleted versions.
When the range of version numbers contains a checkpointed version, the API returns a 409 conflict and does not proceed with the deletion. You must first remove the checkpoint using the DeleteSchemaCheckpoint API before using this API.
You cannot use the DeleteSchemaVersions API to delete the first schema version in the schema set. The first schema version can only be deleted by the DeleteSchema API. This operation will also delete the attached SchemaVersionMetadata under the schema versions. Hard deletes will be enforced on the database.
Parameter Syntax
$result = $client->deleteSchemaVersions([
    'SchemaId' => [ // REQUIRED
        'RegistryName' => '<string>',
        'SchemaArn' => '<string>',
        'SchemaName' => '<string>',
    ],
    'Versions' => '<string>', // REQUIRED
]);
Parameter Details
Members
- SchemaId
-
- Required: Yes
- Type: SchemaId structure
This is a wrapper structure that may contain the schema name and Amazon Resource Name (ARN).
- Versions
-
- Required: Yes
- Type: string
A version range may be supplied which may be of the format:
-
a single version number, 5
-
a range, 5-8 : deletes versions 5, 6, 7, 8
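To make the two Versions formats concrete, the following sketch expands a Versions string into the version numbers it would cover. The helper is hypothetical and only mirrors the two formats described above. It does not call the service and is not how the API itself parses the value.

```php
<?php
// Hypothetical helper (not part of the SDK): expands a Versions string such as "5" or "5-8"
// into the list of version numbers it describes.
function expandVersionRange(string $versions): array
{
    // Single version number, e.g. "5".
    if (preg_match('/^\s*(\d+)\s*$/', $versions, $m)) {
        return [(int) $m[1]];
    }
    // Inclusive range, e.g. "5-8".
    if (preg_match('/^\s*(\d+)\s*-\s*(\d+)\s*$/', $versions, $m)) {
        $start = (int) $m[1];
        $end = (int) $m[2];
        if ($start > $end) {
            throw new InvalidArgumentException("Range start exceeds end: $versions");
        }
        return range($start, $end);
    }
    throw new InvalidArgumentException("Unrecognized version range: $versions");
}

$single = expandVersionRange('5');   // [5]
$range = expandVersionRange('5-8');  // [5, 6, 7, 8]
```

A check like this can be useful for logging which versions a request is expected to remove before the call is sent.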
Result Syntax
[
    'SchemaVersionErrors' => [
        [
            'ErrorDetails' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
            'VersionNumber' => <integer>,
        ],
        // ...
    ],
]
Result Details
Members
- SchemaVersionErrors
-
- Type: Array of SchemaVersionErrorItem structures
A list of SchemaVersionErrorItem objects, each containing an error and schema version.
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
Access to a resource was denied.
-
ConcurrentModificationException:
Two processes are trying to modify a resource simultaneously.
DeleteSecurityConfiguration
$result = $client->deleteSecurityConfiguration([/* ... */]);
$promise = $client->deleteSecurityConfigurationAsync([/* ... */]);
Deletes a specified security configuration.
Parameter Syntax
$result = $client->deleteSecurityConfiguration([
    'Name' => '<string>', // REQUIRED
]);
Parameter Details
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
DeleteTable
$result = $client->deleteTable([/* ... */]);
$promise = $client->deleteTableAsync([/* ... */]);
Removes a table definition from the Data Catalog.
After completing this operation, you no longer have access to the table versions and partitions that belong to the deleted table. AWS Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.
To ensure the immediate deletion of all related resources, before calling DeleteTable, use DeleteTableVersion or BatchDeleteTableVersion, and DeletePartition or BatchDeletePartition, to delete any resources that belong to the table.
Parameter Syntax
$result = $client->deleteTable([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'Name' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the table resides. If none is provided, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database in which the table resides. For Hive compatibility, this name is entirely lowercase.
- Name
-
- Required: Yes
- Type: string
The name of the table to be deleted. For Hive compatibility, this name is entirely lowercase.
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
DeleteTableVersion
$result = $client->deleteTableVersion([/* ... */]);
$promise = $client->deleteTableVersionAsync([/* ... */]);
Deletes a specified version of a table.
Parameter Syntax
$result = $client->deleteTableVersion([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'TableName' => '<string>', // REQUIRED
    'VersionId' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the tables reside. If none is provided, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.
- TableName
-
- Required: Yes
- Type: string
The name of the table. For Hive compatibility, this name is entirely lowercase.
- VersionId
-
- Required: Yes
- Type: string
The ID of the table version to be deleted. A VersionID is a string representation of an integer. Each version is incremented by 1.
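Since a VersionId is an integer carried as a string, ordering version IDs calls for a numeric comparison rather than a plain string sort. The sketch below picks the IDs to delete when only the newest few versions should be kept. The helper name and the sample IDs are hypothetical, and how the existing IDs are obtained is left out.

```php
<?php
// Hypothetical helper (not part of the SDK): returns the VersionIds to delete when keeping
// only the newest $keep versions. Assumes $keep is zero or greater.
function versionIdsToPrune(array $versionIds, int $keep): array
{
    // Compare as integers: a plain string sort would place "10" before "9".
    usort($versionIds, fn(string $a, string $b): int => (int) $a <=> (int) $b);

    // Everything except the last $keep entries is a candidate for deletion.
    return array_slice($versionIds, 0, max(0, count($versionIds) - $keep));
}

$toDelete = versionIdsToPrune(['10', '9', '2', '11'], 2); // ['2', '9']

// With a configured Aws\Glue\GlueClient in $client, each ID could then be passed along:
// foreach ($toDelete as $versionId) {
//     $client->deleteTableVersion([
//         'DatabaseName' => 'example_db',   // illustrative name
//         'TableName' => 'example_table',   // illustrative name
//         'VersionId' => $versionId,
//     ]);
// }
```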
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
DeleteTrigger
$result = $client->deleteTrigger([/* ... */]);
$promise = $client->deleteTriggerAsync([/* ... */]);
Deletes a specified trigger. If the trigger is not found, no exception is thrown.
Parameter Syntax
$result = $client->deleteTrigger([
    'Name' => '<string>', // REQUIRED
]);
Parameter Details
Result Syntax
[ 'Name' => '<string>', ]
Result Details
Errors
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
ConcurrentModificationException:
Two processes are trying to modify a resource simultaneously.
DeleteUserDefinedFunction
$result = $client->deleteUserDefinedFunction([/* ... */]);
$promise = $client->deleteUserDefinedFunctionAsync([/* ... */]);
Deletes an existing function definition from the Data Catalog.
Parameter Syntax
$result = $client->deleteUserDefinedFunction([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'FunctionName' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the function to be deleted is located. If none is supplied, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database where the function is located.
- FunctionName
-
- Required: Yes
- Type: string
The name of the function definition to be deleted.
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
DeleteWorkflow
$result = $client->deleteWorkflow([/* ... */]);
$promise = $client->deleteWorkflowAsync([/* ... */]);
Deletes a workflow.
Parameter Syntax
$result = $client->deleteWorkflow([
    'Name' => '<string>', // REQUIRED
]);
Parameter Details
Result Syntax
[ 'Name' => '<string>', ]
Result Details
Errors
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
ConcurrentModificationException:
Two processes are trying to modify a resource simultaneously.
GetCatalogImportStatus
$result = $client->getCatalogImportStatus([/* ... */]);
$promise = $client->getCatalogImportStatusAsync([/* ... */]);
Retrieves the status of a migration operation.
Parameter Syntax
$result = $client->getCatalogImportStatus([
    'CatalogId' => '<string>',
]);
Parameter Details
Members
Result Syntax
[
    'ImportStatus' => [
        'ImportCompleted' => true || false,
        'ImportTime' => <DateTime>,
        'ImportedBy' => '<string>',
    ],
]
Result Details
Members
- ImportStatus
-
- Type: CatalogImportStatus structure
The status of the specified catalog migration.
Errors
-
An internal service error occurred.
-
The operation timed out.
GetClassifier
$result = $client->getClassifier([/* ... */]);
$promise = $client->getClassifierAsync([/* ... */]);
Retrieve a classifier by name.
Parameter Syntax
$result = $client->getClassifier([
    'Name' => '<string>', // REQUIRED
]);
Parameter Details
Result Syntax
[
    'Classifier' => [
        'CsvClassifier' => [
            'AllowSingleColumn' => true || false,
            'ContainsHeader' => 'UNKNOWN|PRESENT|ABSENT',
            'CreationTime' => <DateTime>,
            'Delimiter' => '<string>',
            'DisableValueTrimming' => true || false,
            'Header' => ['<string>', ...],
            'LastUpdated' => <DateTime>,
            'Name' => '<string>',
            'QuoteSymbol' => '<string>',
            'Version' => <integer>,
        ],
        'GrokClassifier' => [
            'Classification' => '<string>',
            'CreationTime' => <DateTime>,
            'CustomPatterns' => '<string>',
            'GrokPattern' => '<string>',
            'LastUpdated' => <DateTime>,
            'Name' => '<string>',
            'Version' => <integer>,
        ],
        'JsonClassifier' => [
            'CreationTime' => <DateTime>,
            'JsonPath' => '<string>',
            'LastUpdated' => <DateTime>,
            'Name' => '<string>',
            'Version' => <integer>,
        ],
        'XMLClassifier' => [
            'Classification' => '<string>',
            'CreationTime' => <DateTime>,
            'LastUpdated' => <DateTime>,
            'Name' => '<string>',
            'RowTag' => '<string>',
            'Version' => <integer>,
        ],
    ],
]
Result Details
Members
- Classifier
-
- Type: Classifier structure
The requested classifier.
Errors
-
A specified entity does not exist
-
The operation timed out.
GetClassifiers
$result = $client->getClassifiers([/* ... */]);
$promise = $client->getClassifiersAsync([/* ... */]);
Lists all classifier objects in the Data Catalog.
Parameter Syntax
$result = $client->getClassifiers([
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
]);
Parameter Details
Members
Result Syntax
[
    'Classifiers' => [
        [
            'CsvClassifier' => [
                'AllowSingleColumn' => true || false,
                'ContainsHeader' => 'UNKNOWN|PRESENT|ABSENT',
                'CreationTime' => <DateTime>,
                'Delimiter' => '<string>',
                'DisableValueTrimming' => true || false,
                'Header' => ['<string>', ...],
                'LastUpdated' => <DateTime>,
                'Name' => '<string>',
                'QuoteSymbol' => '<string>',
                'Version' => <integer>,
            ],
            'GrokClassifier' => [
                'Classification' => '<string>',
                'CreationTime' => <DateTime>,
                'CustomPatterns' => '<string>',
                'GrokPattern' => '<string>',
                'LastUpdated' => <DateTime>,
                'Name' => '<string>',
                'Version' => <integer>,
            ],
            'JsonClassifier' => [
                'CreationTime' => <DateTime>,
                'JsonPath' => '<string>',
                'LastUpdated' => <DateTime>,
                'Name' => '<string>',
                'Version' => <integer>,
            ],
            'XMLClassifier' => [
                'Classification' => '<string>',
                'CreationTime' => <DateTime>,
                'LastUpdated' => <DateTime>,
                'Name' => '<string>',
                'RowTag' => '<string>',
                'Version' => <integer>,
            ],
        ],
        // ...
    ],
    'NextToken' => '<string>',
]
Result Details
Members
- Classifiers
-
- Type: Array of Classifier structures
The requested list of classifier objects.
- NextToken
-
- Type: string
A continuation token.
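The NextToken member suggests the usual continuation pattern: pass the token from one response into the next request until no token comes back. A minimal sketch under that assumption follows. The helper name and the page size of 50 are illustrative, and $client is assumed to be a configured Aws\Glue\GlueClient (any object exposing getClassifiers() would work).

```php
<?php
// Hypothetical helper (not part of the SDK): follows NextToken until all pages are collected.
function listAllClassifiers(object $client, int $pageSize = 50): array
{
    $classifiers = [];
    $params = ['MaxResults' => $pageSize]; // page size is an illustrative choice

    do {
        $result = $client->getClassifiers($params);

        foreach ($result['Classifiers'] ?? [] as $classifier) {
            $classifiers[] = $classifier;
        }

        // Carry the continuation token into the next request; null ends the loop.
        $params['NextToken'] = $result['NextToken'] ?? null;
    } while (!empty($params['NextToken']));

    return $classifiers;
}

// Usage, assuming $client is a configured Aws\Glue\GlueClient:
// $all = listAllClassifiers($client);
```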
Errors
-
The operation timed out.
GetColumnStatisticsForPartition
$result = $client->getColumnStatisticsForPartition([/* ... */]);
$promise = $client->getColumnStatisticsForPartitionAsync([/* ... */]);
Retrieves partition statistics of columns.
The Identity and Access Management (IAM) permission required for this operation is GetPartition.
Parameter Syntax
$result = $client->getColumnStatisticsForPartition([
    'CatalogId' => '<string>',
    'ColumnNames' => ['<string>', ...], // REQUIRED
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionValues' => ['<string>', ...], // REQUIRED
    'TableName' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.
- ColumnNames
-
- Required: Yes
- Type: Array of strings
A list of the column names.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database where the partitions reside.
- PartitionValues
-
- Required: Yes
- Type: Array of strings
A list of partition values identifying the partition.
- TableName
-
- Required: Yes
- Type: string
The name of the partitions' table.
Result Syntax
[
    'ColumnStatisticsList' => [
        [
            'AnalyzedTime' => <DateTime>,
            'ColumnName' => '<string>',
            'ColumnType' => '<string>',
            'StatisticsData' => [
                'BinaryColumnStatisticsData' => [
                    'AverageLength' => <float>,
                    'MaximumLength' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'BooleanColumnStatisticsData' => [
                    'NumberOfFalses' => <integer>,
                    'NumberOfNulls' => <integer>,
                    'NumberOfTrues' => <integer>,
                ],
                'DateColumnStatisticsData' => [
                    'MaximumValue' => <DateTime>,
                    'MinimumValue' => <DateTime>,
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'DecimalColumnStatisticsData' => [
                    'MaximumValue' => [
                        'Scale' => <integer>,
                        'UnscaledValue' => <string || resource || Psr\Http\Message\StreamInterface>,
                    ],
                    'MinimumValue' => [
                        'Scale' => <integer>,
                        'UnscaledValue' => <string || resource || Psr\Http\Message\StreamInterface>,
                    ],
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'DoubleColumnStatisticsData' => [
                    'MaximumValue' => <float>,
                    'MinimumValue' => <float>,
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'LongColumnStatisticsData' => [
                    'MaximumValue' => <integer>,
                    'MinimumValue' => <integer>,
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'StringColumnStatisticsData' => [
                    'AverageLength' => <float>,
                    'MaximumLength' => <integer>,
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'Type' => 'BOOLEAN|DATE|DECIMAL|DOUBLE|LONG|STRING|BINARY',
            ],
        ],
        // ...
    ],
    'Errors' => [
        [
            'ColumnName' => '<string>',
            'Error' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
        ],
        // ...
    ],
]
Result Details
Members
- ColumnStatisticsList
-
- Type: Array of ColumnStatistics structures
List of ColumnStatistics that were retrieved.
- Errors
-
- Type: Array of ColumnError structures
Errors that occurred while retrieving column statistics data.
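Because a single response can carry both retrieved statistics and per-column errors, callers generally need to inspect both lists. The sketch below indexes each list by column name. The helper name and the sample data, including the error code shown, are made up for illustration. Only the key names follow the result syntax above.

```php
<?php
// Hypothetical helper (not part of the SDK): splits a result shaped like the syntax above
// into retrieved statistics types and per-column error codes, both keyed by column name.
function summarizeColumnStatistics($result): array
{
    $summary = ['ok' => [], 'failed' => []];

    foreach ($result['ColumnStatisticsList'] ?? [] as $stats) {
        $summary['ok'][$stats['ColumnName']] = $stats['StatisticsData']['Type'];
    }
    foreach ($result['Errors'] ?? [] as $error) {
        $summary['failed'][$error['ColumnName']] = $error['Error']['ErrorCode'];
    }

    return $summary;
}

// Illustrative data only; a real response would come from
// $client->getColumnStatisticsForPartition([...]).
$sample = [
    'ColumnStatisticsList' => [
        ['ColumnName' => 'order_id', 'ColumnType' => 'bigint', 'StatisticsData' => ['Type' => 'LONG']],
    ],
    'Errors' => [
        ['ColumnName' => 'notes', 'Error' => ['ErrorCode' => 'ExampleErrorCode', 'ErrorMessage' => 'example']],
    ],
];
$summary = summarizeColumnStatistics($sample);
```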
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
An encryption operation failed.
GetColumnStatisticsForTable
$result = $client->getColumnStatisticsForTable([/* ... */]);
$promise = $client->getColumnStatisticsForTableAsync([/* ... */]);
Retrieves table statistics of columns.
The Identity and Access Management (IAM) permission required for this operation is GetTable.
Parameter Syntax
$result = $client->getColumnStatisticsForTable([
    'CatalogId' => '<string>',
    'ColumnNames' => ['<string>', ...], // REQUIRED
    'DatabaseName' => '<string>', // REQUIRED
    'TableName' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.
- ColumnNames
-
- Required: Yes
- Type: Array of strings
A list of the column names.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database where the partitions reside.
- TableName
-
- Required: Yes
- Type: string
The name of the partitions' table.
Result Syntax
[
    'ColumnStatisticsList' => [
        [
            'AnalyzedTime' => <DateTime>,
            'ColumnName' => '<string>',
            'ColumnType' => '<string>',
            'StatisticsData' => [
                'BinaryColumnStatisticsData' => [
                    'AverageLength' => <float>,
                    'MaximumLength' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'BooleanColumnStatisticsData' => [
                    'NumberOfFalses' => <integer>,
                    'NumberOfNulls' => <integer>,
                    'NumberOfTrues' => <integer>,
                ],
                'DateColumnStatisticsData' => [
                    'MaximumValue' => <DateTime>,
                    'MinimumValue' => <DateTime>,
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'DecimalColumnStatisticsData' => [
                    'MaximumValue' => [
                        'Scale' => <integer>,
                        'UnscaledValue' => <string || resource || Psr\Http\Message\StreamInterface>,
                    ],
                    'MinimumValue' => [
                        'Scale' => <integer>,
                        'UnscaledValue' => <string || resource || Psr\Http\Message\StreamInterface>,
                    ],
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'DoubleColumnStatisticsData' => [
                    'MaximumValue' => <float>,
                    'MinimumValue' => <float>,
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'LongColumnStatisticsData' => [
                    'MaximumValue' => <integer>,
                    'MinimumValue' => <integer>,
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'StringColumnStatisticsData' => [
                    'AverageLength' => <float>,
                    'MaximumLength' => <integer>,
                    'NumberOfDistinctValues' => <integer>,
                    'NumberOfNulls' => <integer>,
                ],
                'Type' => 'BOOLEAN|DATE|DECIMAL|DOUBLE|LONG|STRING|BINARY',
            ],
        ],
        // ...
    ],
    'Errors' => [
        [
            'ColumnName' => '<string>',
            'Error' => [
                'ErrorCode' => '<string>',
                'ErrorMessage' => '<string>',
            ],
        ],
        // ...
    ],
]
Result Details
Members
- ColumnStatisticsList
-
- Type: Array of ColumnStatistics structures
List of ColumnStatistics that were retrieved.
- Errors
-
- Type: Array of ColumnError structures
List of ColumnError entries for the columns whose statistics failed to be retrieved.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
An encryption operation failed.
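A result of the shape above can carry retrieved statistics and per-column failures at the same time. The following is a minimal sketch of separating the two; the helper name `splitColumnStatistics` is hypothetical, and the code relies only on the `ColumnStatisticsList` and `Errors` keys shown in the result syntax. It accepts either the SDK result object or a plain array.

```php
<?php
// Hypothetical helper (not part of the SDK): indexes retrieved statistics
// by column name and collects one message per failed column.
function splitColumnStatistics($result): array
{
    $stats = [];
    foreach ($result['ColumnStatisticsList'] ?? [] as $entry) {
        $stats[$entry['ColumnName']] = $entry['StatisticsData'];
    }

    $failed = [];
    foreach ($result['Errors'] ?? [] as $error) {
        $code = $error['Error']['ErrorCode'] ?? 'Unknown';
        $message = $error['Error']['ErrorMessage'] ?? '';
        $failed[$error['ColumnName']] = $code . ': ' . $message;
    }

    return ['stats' => $stats, 'failed' => $failed];
}
```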
GetConnection
$result = $client->getConnection([/* ... */]);
$promise = $client->getConnectionAsync([/* ... */]);
Retrieves a connection definition from the Data Catalog.
Parameter Syntax
$result = $client->getConnection([ 'CatalogId' => '<string>', 'HidePassword' => true || false, 'Name' => '<string>', // REQUIRED ]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog in which the connection resides. If none is provided, the AWS account ID is used by default.
- HidePassword
-
- Type: boolean
Allows you to retrieve the connection metadata without returning the password. For instance, the AWS Glue console uses this flag to retrieve the connection, and does not display the password. Set this parameter when the caller might not have permission to use the AWS KMS key to decrypt the password, but it does have permission to access the rest of the connection properties.
- Name
-
- Required: Yes
- Type: string
The name of the connection definition to retrieve.
Result Syntax
[ 'Connection' => [ 'ConnectionProperties' => ['<string>', ...], 'ConnectionType' => 'JDBC|SFTP|MONGODB|KAFKA|NETWORK|MARKETPLACE|CUSTOM', 'CreationTime' => <DateTime>, 'Description' => '<string>', 'LastUpdatedBy' => '<string>', 'LastUpdatedTime' => <DateTime>, 'MatchCriteria' => ['<string>', ...], 'Name' => '<string>', 'PhysicalConnectionRequirements' => [ 'AvailabilityZone' => '<string>', 'SecurityGroupIdList' => ['<string>', ...], 'SubnetId' => '<string>', ], ], ]
Result Details
Members
- Connection
-
- Type: Connection structure
The requested connection definition.
Errors
-
A specified entity does not exist
-
The operation timed out.
-
The input provided was not valid.
-
An encryption operation failed.
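As an illustration of the GetConnection parameters described above, the sketch below builds a request that sets `HidePassword`, which may be useful when the caller cannot use the AWS KMS key that decrypts the password. The helper name `connectionMetadataParams` and the connection name in the usage comment are hypothetical.

```php
<?php
// Hypothetical helper: builds GetConnection parameters that return the
// connection metadata without the password.
function connectionMetadataParams(string $name, ?string $catalogId = null): array
{
    $params = [
        'Name' => $name,        // REQUIRED
        'HidePassword' => true, // omit the password from the response
    ];
    if ($catalogId !== null) {
        // When omitted, the AWS account ID is used by default.
        $params['CatalogId'] = $catalogId;
    }
    return $params;
}

// Usage, assuming $client is a configured Aws\Glue\GlueClient:
// $result = $client->getConnection(connectionMetadataParams('my-connection'));
// $type = $result['Connection']['ConnectionType'];
```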
GetConnections
$result = $client->getConnections([/* ... */]);
$promise = $client->getConnectionsAsync([/* ... */]);
Retrieves a list of connection definitions from the Data Catalog.
Parameter Syntax
$result = $client->getConnections([ 'CatalogId' => '<string>', 'Filter' => [ 'ConnectionType' => 'JDBC|SFTP|MONGODB|KAFKA|NETWORK|MARKETPLACE|CUSTOM', 'MatchCriteria' => ['<string>', ...], ], 'HidePassword' => true || false, 'MaxResults' => <integer>, 'NextToken' => '<string>', ]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog in which the connections reside. If none is provided, the AWS account ID is used by default.
- Filter
-
- Type: GetConnectionsFilter structure
A filter that controls which connections are returned.
- HidePassword
-
- Type: boolean
Allows you to retrieve the connection metadata without returning the password. For instance, the AWS Glue console uses this flag to retrieve the connection, and does not display the password. Set this parameter when the caller might not have permission to use the AWS KMS key to decrypt the password, but it does have permission to access the rest of the connection properties.
- MaxResults
-
- Type: int
The maximum number of connections to return in one response.
- NextToken
-
- Type: string
A continuation token, if this is a continuation call.
Result Syntax
[ 'ConnectionList' => [ [ 'ConnectionProperties' => ['<string>', ...], 'ConnectionType' => 'JDBC|SFTP|MONGODB|KAFKA|NETWORK|MARKETPLACE|CUSTOM', 'CreationTime' => <DateTime>, 'Description' => '<string>', 'LastUpdatedBy' => '<string>', 'LastUpdatedTime' => <DateTime>, 'MatchCriteria' => ['<string>', ...], 'Name' => '<string>', 'PhysicalConnectionRequirements' => [ 'AvailabilityZone' => '<string>', 'SecurityGroupIdList' => ['<string>', ...], 'SubnetId' => '<string>', ], ], // ... ], 'NextToken' => '<string>', ]
Result Details
Members
- ConnectionList
-
- Type: Array of Connection structures
A list of requested connection definitions.
- NextToken
-
- Type: string
A continuation token, if the list of connections returned does not include the last of the filtered connections.
Errors
-
A specified entity does not exist
-
The operation timed out.
-
The input provided was not valid.
-
An encryption operation failed.
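GetConnections returns a `NextToken` whenever more connections remain, so a caller that wants the full list has to loop. The sketch below shows one way to do that; the function name `listAllConnections` is hypothetical, and `$client` is assumed to be a configured `Aws\Glue\GlueClient` (or any object exposing `getConnections(array $params)`).

```php
<?php
// Hypothetical helper: follows NextToken until the service stops returning one.
function listAllConnections($client, array $params = []): array
{
    $connections = [];
    unset($params['NextToken']); // always start from the first page

    do {
        $result = $client->getConnections($params);
        foreach ($result['ConnectionList'] ?? [] as $connection) {
            $connections[] = $connection;
        }

        $token = $result['NextToken'] ?? null;
        $hasMore = is_string($token) && $token !== '';
        if ($hasMore) {
            $params['NextToken'] = $token; // continuation call
        }
    } while ($hasMore);

    return $connections;
}
```

The same loop shape applies to the other operations in this section that accept and return `NextToken`, with the list key changed accordingly.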
GetCrawler
$result = $client->getCrawler([/* ... */]);
$promise = $client->getCrawlerAsync([/* ... */]);
Retrieves metadata for a specified crawler.
Parameter Syntax
$result = $client->getCrawler([ 'Name' => '<string>', // REQUIRED ]);
Parameter Details
Result Syntax
[ 'Crawler' => [ 'Classifiers' => ['<string>', ...], 'Configuration' => '<string>', 'CrawlElapsedTime' => <integer>, 'CrawlerSecurityConfiguration' => '<string>', 'CreationTime' => <DateTime>, 'DatabaseName' => '<string>', 'Description' => '<string>', 'LastCrawl' => [ 'ErrorMessage' => '<string>', 'LogGroup' => '<string>', 'LogStream' => '<string>', 'MessagePrefix' => '<string>', 'StartTime' => <DateTime>, 'Status' => 'SUCCEEDED|CANCELLED|FAILED', ], 'LastUpdated' => <DateTime>, 'LineageConfiguration' => [ 'CrawlerLineageSettings' => 'ENABLE|DISABLE', ], 'Name' => '<string>', 'RecrawlPolicy' => [ 'RecrawlBehavior' => 'CRAWL_EVERYTHING|CRAWL_NEW_FOLDERS_ONLY', ], 'Role' => '<string>', 'Schedule' => [ 'ScheduleExpression' => '<string>', 'State' => 'SCHEDULED|NOT_SCHEDULED|TRANSITIONING', ], 'SchemaChangePolicy' => [ 'DeleteBehavior' => 'LOG|DELETE_FROM_DATABASE|DEPRECATE_IN_DATABASE', 'UpdateBehavior' => 'LOG|UPDATE_IN_DATABASE', ], 'State' => 'READY|RUNNING|STOPPING', 'TablePrefix' => '<string>', 'Targets' => [ 'CatalogTargets' => [ [ 'DatabaseName' => '<string>', 'Tables' => ['<string>', ...], ], // ... ], 'DynamoDBTargets' => [ [ 'Path' => '<string>', 'scanAll' => true || false, 'scanRate' => <float>, ], // ... ], 'JdbcTargets' => [ [ 'ConnectionName' => '<string>', 'Exclusions' => ['<string>', ...], 'Path' => '<string>', ], // ... ], 'MongoDBTargets' => [ [ 'ConnectionName' => '<string>', 'Path' => '<string>', 'ScanAll' => true || false, ], // ... ], 'S3Targets' => [ [ 'ConnectionName' => '<string>', 'Exclusions' => ['<string>', ...], 'Path' => '<string>', ], // ... ], ], 'Version' => <integer>, ], ]
Result Details
Members
- Crawler
-
- Type: Crawler structure
The metadata for the specified crawler.
Errors
-
A specified entity does not exist
-
The operation timed out.
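The GetCrawler result nests the current crawler state and the outcome of the last crawl at different levels. A minimal sketch of reading both follows; the helper name `crawlerSummary` is hypothetical, and the fallback value `none` for a crawler that has no `LastCrawl` entry is an assumption made for illustration.

```php
<?php
// Hypothetical helper: one-line summary of a GetCrawler result.
function crawlerSummary($result): string
{
    $crawler = $result['Crawler'];
    $state = $crawler['State'];                        // READY|RUNNING|STOPPING
    $last = $crawler['LastCrawl']['Status'] ?? 'none'; // SUCCEEDED|CANCELLED|FAILED

    return sprintf('%s: state=%s, last crawl=%s', $crawler['Name'], $state, $last);
}
```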
GetCrawlerMetrics
$result = $client->getCrawlerMetrics([/* ... */]);
$promise = $client->getCrawlerMetricsAsync([/* ... */]);
Retrieves metrics about specified crawlers.
Parameter Syntax
$result = $client->getCrawlerMetrics([ 'CrawlerNameList' => ['<string>', ...], 'MaxResults' => <integer>, 'NextToken' => '<string>', ]);
Parameter Details
Members
Result Syntax
[ 'CrawlerMetricsList' => [ [ 'CrawlerName' => '<string>', 'LastRuntimeSeconds' => <float>, 'MedianRuntimeSeconds' => <float>, 'StillEstimating' => true || false, 'TablesCreated' => <integer>, 'TablesDeleted' => <integer>, 'TablesUpdated' => <integer>, 'TimeLeftSeconds' => <float>, ], // ... ], 'NextToken' => '<string>', ]
Result Details
Members
- CrawlerMetricsList
-
- Type: Array of CrawlerMetrics structures
A list of metrics for the specified crawler.
- NextToken
-
- Type: string
A continuation token, if the returned list does not contain the last metric available.
Errors
-
The operation timed out.
GetCrawlers
$result = $client->getCrawlers([/* ... */]);
$promise = $client->getCrawlersAsync([/* ... */]);
Retrieves metadata for all crawlers defined in the customer account.
Parameter Syntax
$result = $client->getCrawlers([ 'MaxResults' => <integer>, 'NextToken' => '<string>', ]);
Parameter Details
Members
Result Syntax
[ 'Crawlers' => [ [ 'Classifiers' => ['<string>', ...], 'Configuration' => '<string>', 'CrawlElapsedTime' => <integer>, 'CrawlerSecurityConfiguration' => '<string>', 'CreationTime' => <DateTime>, 'DatabaseName' => '<string>', 'Description' => '<string>', 'LastCrawl' => [ 'ErrorMessage' => '<string>', 'LogGroup' => '<string>', 'LogStream' => '<string>', 'MessagePrefix' => '<string>', 'StartTime' => <DateTime>, 'Status' => 'SUCCEEDED|CANCELLED|FAILED', ], 'LastUpdated' => <DateTime>, 'LineageConfiguration' => [ 'CrawlerLineageSettings' => 'ENABLE|DISABLE', ], 'Name' => '<string>', 'RecrawlPolicy' => [ 'RecrawlBehavior' => 'CRAWL_EVERYTHING|CRAWL_NEW_FOLDERS_ONLY', ], 'Role' => '<string>', 'Schedule' => [ 'ScheduleExpression' => '<string>', 'State' => 'SCHEDULED|NOT_SCHEDULED|TRANSITIONING', ], 'SchemaChangePolicy' => [ 'DeleteBehavior' => 'LOG|DELETE_FROM_DATABASE|DEPRECATE_IN_DATABASE', 'UpdateBehavior' => 'LOG|UPDATE_IN_DATABASE', ], 'State' => 'READY|RUNNING|STOPPING', 'TablePrefix' => '<string>', 'Targets' => [ 'CatalogTargets' => [ [ 'DatabaseName' => '<string>', 'Tables' => ['<string>', ...], ], // ... ], 'DynamoDBTargets' => [ [ 'Path' => '<string>', 'scanAll' => true || false, 'scanRate' => <float>, ], // ... ], 'JdbcTargets' => [ [ 'ConnectionName' => '<string>', 'Exclusions' => ['<string>', ...], 'Path' => '<string>', ], // ... ], 'MongoDBTargets' => [ [ 'ConnectionName' => '<string>', 'Path' => '<string>', 'ScanAll' => true || false, ], // ... ], 'S3Targets' => [ [ 'ConnectionName' => '<string>', 'Exclusions' => ['<string>', ...], 'Path' => '<string>', ], // ... ], ], 'Version' => <integer>, ], // ... ], 'NextToken' => '<string>', ]
Result Details
Members
- Crawlers
-
- Type: Array of Crawler structures
A list of crawler metadata.
- NextToken
-
- Type: string
A continuation token, if the returned list has not reached the end of those defined in this customer account.
Errors
-
The operation timed out.
GetDataCatalogEncryptionSettings
$result = $client->getDataCatalogEncryptionSettings([/* ... */]);
$promise = $client->getDataCatalogEncryptionSettingsAsync([/* ... */]);
Retrieves the security configuration for a specified catalog.
Parameter Syntax
$result = $client->getDataCatalogEncryptionSettings([ 'CatalogId' => '<string>', ]);
Parameter Details
Members
Result Syntax
[ 'DataCatalogEncryptionSettings' => [ 'ConnectionPasswordEncryption' => [ 'AwsKmsKeyId' => '<string>', 'ReturnConnectionPasswordEncrypted' => true || false, ], 'EncryptionAtRest' => [ 'CatalogEncryptionMode' => 'DISABLED|SSE-KMS', 'SseAwsKmsKeyId' => '<string>', ], ], ]
Result Details
Members
- DataCatalogEncryptionSettings
-
- Type: DataCatalogEncryptionSettings structure
The requested security configuration.
Errors
-
An internal service error occurred.
-
The input provided was not valid.
-
The operation timed out.
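The sketch below reduces a GetDataCatalogEncryptionSettings result to two booleans. The helper name `catalogEncryptionSummary` is hypothetical, and treating a missing member as disabled is an assumption for illustration, not documented service behavior.

```php
<?php
// Hypothetical helper: summarizes the encryption settings of a catalog.
function catalogEncryptionSummary($result): array
{
    $settings = $result['DataCatalogEncryptionSettings'] ?? [];

    // Assumption: an absent mode is treated the same as 'DISABLED'.
    $mode = $settings['EncryptionAtRest']['CatalogEncryptionMode'] ?? 'DISABLED';
    $passwords = $settings['ConnectionPasswordEncryption']['ReturnConnectionPasswordEncrypted'] ?? false;

    return [
        'atRestEncrypted' => $mode === 'SSE-KMS',
        'passwordsEncrypted' => (bool) $passwords,
    ];
}
```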
GetDatabase
$result = $client->getDatabase([/* ... */]);
$promise = $client->getDatabaseAsync([/* ... */]);
Retrieves the definition of a specified database.
Parameter Syntax
$result = $client->getDatabase([ 'CatalogId' => '<string>', 'Name' => '<string>', // REQUIRED ]);
Parameter Details
Members
Result Syntax
[ 'Database' => [ 'CatalogId' => '<string>', 'CreateTableDefaultPermissions' => [ [ 'Permissions' => ['<string>', ...], 'Principal' => [ 'DataLakePrincipalIdentifier' => '<string>', ], ], // ... ], 'CreateTime' => <DateTime>, 'Description' => '<string>', 'LocationUri' => '<string>', 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'TargetDatabase' => [ 'CatalogId' => '<string>', 'DatabaseName' => '<string>', ], ], ]
Result Details
Members
- Database
-
- Type: Database structure
The definition of the specified database in the Data Catalog.
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
An internal service error occurred.
-
The operation timed out.
-
An encryption operation failed.
GetDatabases
$result = $client->getDatabases([/* ... */]);
$promise = $client->getDatabasesAsync([/* ... */]);
Retrieves all databases defined in a given Data Catalog.
Parameter Syntax
$result = $client->getDatabases([ 'CatalogId' => '<string>', 'MaxResults' => <integer>, 'NextToken' => '<string>', 'ResourceShareType' => 'FOREIGN|ALL', ]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog from which to retrieve Databases. If none is provided, the AWS account ID is used by default.
- MaxResults
-
- Type: int
The maximum number of databases to return in one response.
- NextToken
-
- Type: string
A continuation token, if this is a continuation call.
- ResourceShareType
-
- Type: string
Allows you to specify that you want to list the databases shared with your account. The allowable values are FOREIGN or ALL.
- If set to FOREIGN, lists the databases shared with your account.
- If set to ALL, lists the databases shared with your account, as well as the databases in your local account.
Result Syntax
[ 'DatabaseList' => [ [ 'CatalogId' => '<string>', 'CreateTableDefaultPermissions' => [ [ 'Permissions' => ['<string>', ...], 'Principal' => [ 'DataLakePrincipalIdentifier' => '<string>', ], ], // ... ], 'CreateTime' => <DateTime>, 'Description' => '<string>', 'LocationUri' => '<string>', 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'TargetDatabase' => [ 'CatalogId' => '<string>', 'DatabaseName' => '<string>', ], ], // ... ], 'NextToken' => '<string>', ]
Result Details
Members
- DatabaseList
-
- Required: Yes
- Type: Array of Database structures
A list of Database objects from the specified catalog.
- NextToken
-
- Type: string
A continuation token for paginating the returned list of databases, returned if the current segment of the list is not the last.
Errors
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
An encryption operation failed.
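The GetDatabases parameter `ResourceShareType` accepts only `FOREIGN` or `ALL`. A minimal sketch of building the request with a client-side check follows; the helper name `getDatabasesParams` is hypothetical, and the check duplicates validation that the service also performs.

```php
<?php
// Hypothetical helper: builds GetDatabases parameters and rejects an
// unsupported ResourceShareType before the request is sent.
function getDatabasesParams(?string $resourceShareType = null, ?int $maxResults = null): array
{
    $params = [];

    if ($resourceShareType !== null) {
        if (!in_array($resourceShareType, ['FOREIGN', 'ALL'], true)) {
            throw new InvalidArgumentException('ResourceShareType must be FOREIGN or ALL');
        }
        $params['ResourceShareType'] = $resourceShareType;
    }
    if ($maxResults !== null) {
        $params['MaxResults'] = $maxResults;
    }

    return $params;
}

// Usage, assuming $client is a configured Aws\Glue\GlueClient:
// $result = $client->getDatabases(getDatabasesParams('ALL', 50));
```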
GetDataflowGraph
$result = $client->getDataflowGraph([/* ... */]);
$promise = $client->getDataflowGraphAsync([/* ... */]);
Transforms a Python script into a directed acyclic graph (DAG).
Parameter Syntax
$result = $client->getDataflowGraph([ 'PythonScript' => '<string>', ]);
Parameter Details
Result Syntax
[ 'DagEdges' => [ [ 'Source' => '<string>', 'Target' => '<string>', 'TargetParameter' => '<string>', ], // ... ], 'DagNodes' => [ [ 'Args' => [ [ 'Name' => '<string>', 'Param' => true || false, 'Value' => '<string>', ], // ... ], 'Id' => '<string>', 'LineNumber' => <integer>, 'NodeType' => '<string>', ], // ... ], ]
Result Details
Members
- DagEdges
-
- Type: Array of CodeGenEdge structures
A list of the edges in the resulting DAG.
- DagNodes
-
- Type: Array of CodeGenNode structures
A list of the nodes in the resulting DAG.
Errors
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
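GetDataflowGraph returns the DAG as two flat lists, `DagNodes` and `DagEdges`. For traversal it is often more convenient to have an adjacency list keyed by node ID. The following is a minimal sketch of that conversion; the helper name `dagAdjacency` is hypothetical.

```php
<?php
// Hypothetical helper: maps each node Id to the list of Ids it points to.
function dagAdjacency($result): array
{
    $adjacency = [];

    // Register every node first so that nodes without outgoing edges appear.
    foreach ($result['DagNodes'] ?? [] as $node) {
        $adjacency[$node['Id']] = [];
    }
    foreach ($result['DagEdges'] ?? [] as $edge) {
        $adjacency[$edge['Source']][] = $edge['Target'];
    }

    return $adjacency;
}
```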
GetDevEndpoint
$result = $client->getDevEndpoint([/* ... */]);
$promise = $client->getDevEndpointAsync([/* ... */]);
Retrieves information about a specified development endpoint.
When you create a development endpoint in a virtual private cloud (VPC), AWS Glue returns only a private IP address, and the public IP address field is not populated. When you create a non-VPC development endpoint, AWS Glue returns only a public IP address.
Parameter Syntax
$result = $client->getDevEndpoint([ 'EndpointName' => '<string>', // REQUIRED ]);
Parameter Details
Members
Result Syntax
[ 'DevEndpoint' => [ 'Arguments' => ['<string>', ...], 'AvailabilityZone' => '<string>', 'CreatedTimestamp' => <DateTime>, 'EndpointName' => '<string>', 'ExtraJarsS3Path' => '<string>', 'ExtraPythonLibsS3Path' => '<string>', 'FailureReason' => '<string>', 'GlueVersion' => '<string>', 'LastModifiedTimestamp' => <DateTime>, 'LastUpdateStatus' => '<string>', 'NumberOfNodes' => <integer>, 'NumberOfWorkers' => <integer>, 'PrivateAddress' => '<string>', 'PublicAddress' => '<string>', 'PublicKey' => '<string>', 'PublicKeys' => ['<string>', ...], 'RoleArn' => '<string>', 'SecurityConfiguration' => '<string>', 'SecurityGroupIds' => ['<string>', ...], 'Status' => '<string>', 'SubnetId' => '<string>', 'VpcId' => '<string>', 'WorkerType' => 'Standard|G.1X|G.2X', 'YarnEndpointAddress' => '<string>', 'ZeppelinRemoteSparkInterpreterPort' => <integer>, ], ]
Result Details
Members
- DevEndpoint
-
- Type: DevEndpoint structure
A DevEndpoint definition.
Errors
-
A specified entity does not exist
-
An internal service error occurred.
-
The operation timed out.
-
The input provided was not valid.
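Because a VPC development endpoint returns only a private IP address and a non-VPC endpoint returns only a public one, code that connects to an endpoint has to check both fields. A minimal sketch follows; the helper name `devEndpointAddress` is hypothetical.

```php
<?php
// Hypothetical helper: returns whichever address field is populated.
function devEndpointAddress($result): ?string
{
    $endpoint = $result['DevEndpoint'];

    // VPC endpoints populate PrivateAddress; non-VPC endpoints populate PublicAddress.
    foreach (['PrivateAddress', 'PublicAddress'] as $key) {
        if (!empty($endpoint[$key])) {
            return $endpoint[$key];
        }
    }
    return null; // no address available, for example while provisioning
}
```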
GetDevEndpoints
$result = $client->getDevEndpoints([/* ... */]);
$promise = $client->getDevEndpointsAsync([/* ... */]);
Retrieves all the development endpoints in this AWS account.
When you create a development endpoint in a virtual private cloud (VPC), AWS Glue returns only a private IP address and the public IP address field is not populated. When you create a non-VPC development endpoint, AWS Glue returns only a public IP address.
Parameter Syntax
$result = $client->getDevEndpoints([ 'MaxResults' => <integer>, 'NextToken' => '<string>', ]);
Parameter Details
Members
Result Syntax
[ 'DevEndpoints' => [ [ 'Arguments' => ['<string>', ...], 'AvailabilityZone' => '<string>', 'CreatedTimestamp' => <DateTime>, 'EndpointName' => '<string>', 'ExtraJarsS3Path' => '<string>', 'ExtraPythonLibsS3Path' => '<string>', 'FailureReason' => '<string>', 'GlueVersion' => '<string>', 'LastModifiedTimestamp' => <DateTime>, 'LastUpdateStatus' => '<string>', 'NumberOfNodes' => <integer>, 'NumberOfWorkers' => <integer>, 'PrivateAddress' => '<string>', 'PublicAddress' => '<string>', 'PublicKey' => '<string>', 'PublicKeys' => ['<string>', ...], 'RoleArn' => '<string>', 'SecurityConfiguration' => '<string>', 'SecurityGroupIds' => ['<string>', ...], 'Status' => '<string>', 'SubnetId' => '<string>', 'VpcId' => '<string>', 'WorkerType' => 'Standard|G.1X|G.2X', 'YarnEndpointAddress' => '<string>', 'ZeppelinRemoteSparkInterpreterPort' => <integer>, ], // ... ], 'NextToken' => '<string>', ]
Result Details
Members
- DevEndpoints
-
- Type: Array of DevEndpoint structures
A list of DevEndpoint definitions.
- NextToken
-
- Type: string
A continuation token, if not all DevEndpoint definitions have yet been returned.
Errors
-
A specified entity does not exist
-
An internal service error occurred.
-
The operation timed out.
-
The input provided was not valid.
GetJob
$result = $client->getJob([/* ... */]);
$promise = $client->getJobAsync([/* ... */]);
Retrieves an existing job definition.
Parameter Syntax
$result = $client->getJob([ 'JobName' => '<string>', // REQUIRED ]);
Parameter Details
Result Syntax
[ 'Job' => [ 'AllocatedCapacity' => <integer>, 'Command' => [ 'Name' => '<string>', 'PythonVersion' => '<string>', 'ScriptLocation' => '<string>', ], 'Connections' => [ 'Connections' => ['<string>', ...], ], 'CreatedOn' => <DateTime>, 'DefaultArguments' => ['<string>', ...], 'Description' => '<string>', 'ExecutionProperty' => [ 'MaxConcurrentRuns' => <integer>, ], 'GlueVersion' => '<string>', 'LastModifiedOn' => <DateTime>, 'LogUri' => '<string>', 'MaxCapacity' => <float>, 'MaxRetries' => <integer>, 'Name' => '<string>', 'NonOverridableArguments' => ['<string>', ...], 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'NumberOfWorkers' => <integer>, 'Role' => '<string>', 'SecurityConfiguration' => '<string>', 'Timeout' => <integer>, 'WorkerType' => 'Standard|G.1X|G.2X', ], ]
Result Details
Members
- Job
-
- Type: Job structure
The requested job definition.
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
An internal service error occurred.
-
The operation timed out.
GetJobBookmark
$result = $client->getJobBookmark([/* ... */]);
$promise = $client->getJobBookmarkAsync([/* ... */]);
Returns information on a job bookmark entry.
Parameter Syntax
$result = $client->getJobBookmark([ 'JobName' => '<string>', // REQUIRED 'RunId' => '<string>', ]);
Parameter Details
Members
Result Syntax
[ 'JobBookmarkEntry' => [ 'Attempt' => <integer>, 'JobBookmark' => '<string>', 'JobName' => '<string>', 'PreviousRunId' => '<string>', 'Run' => <integer>, 'RunId' => '<string>', 'Version' => <integer>, ], ]
Result Details
Members
- JobBookmarkEntry
-
- Type: JobBookmarkEntry structure
A structure that defines a point that a job can resume processing.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
A value could not be validated.
GetJobRun
$result = $client->getJobRun([/* ... */]);
$promise = $client->getJobRunAsync([/* ... */]);
Retrieves the metadata for a given job run.
Parameter Syntax
$result = $client->getJobRun([ 'JobName' => '<string>', // REQUIRED 'PredecessorsIncluded' => true || false, 'RunId' => '<string>', // REQUIRED ]);
Parameter Details
Members
Result Syntax
[ 'JobRun' => [ 'AllocatedCapacity' => <integer>, 'Arguments' => ['<string>', ...], 'Attempt' => <integer>, 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'ExecutionTime' => <integer>, 'GlueVersion' => '<string>', 'Id' => '<string>', 'JobName' => '<string>', 'JobRunState' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', 'LastModifiedOn' => <DateTime>, 'LogGroupName' => '<string>', 'MaxCapacity' => <float>, 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'NumberOfWorkers' => <integer>, 'PredecessorRuns' => [ [ 'JobName' => '<string>', 'RunId' => '<string>', ], // ... ], 'PreviousRunId' => '<string>', 'SecurityConfiguration' => '<string>', 'StartedOn' => <DateTime>, 'Timeout' => <integer>, 'TriggerName' => '<string>', 'WorkerType' => 'Standard|G.1X|G.2X', ], ]
Result Details
Members
- JobRun
-
- Type: JobRun structure
The requested job-run metadata.
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
An internal service error occurred.
-
The operation timed out.
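A common use of GetJobRun is polling until a run finishes. The sketch below is one way to do that. The function name `waitForJobRun`, the poll count, and the sleep interval are hypothetical, and treating `STOPPED`, `SUCCEEDED`, `FAILED`, and `TIMEOUT` as the finished states is an assumption based on the `JobRunState` values listed in the result syntax. `$client` is assumed to be a configured `Aws\Glue\GlueClient`.

```php
<?php
// Hypothetical helper: polls GetJobRun until the run reaches a finished state.
function waitForJobRun($client, string $jobName, string $runId, int $maxPolls = 60, int $sleepSeconds = 30): string
{
    // Assumption: these JobRunState values mean the run will not change again.
    $finished = ['STOPPED', 'SUCCEEDED', 'FAILED', 'TIMEOUT'];

    for ($i = 0; $i < $maxPolls; $i++) {
        $result = $client->getJobRun([
            'JobName' => $jobName, // REQUIRED
            'RunId' => $runId,     // REQUIRED
        ]);
        $state = $result['JobRun']['JobRunState'];
        if (in_array($state, $finished, true)) {
            return $state;
        }
        if ($sleepSeconds > 0 && $i < $maxPolls - 1) {
            sleep($sleepSeconds);
        }
    }

    throw new RuntimeException("Job run $runId did not finish after $maxPolls polls");
}
```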
GetJobRuns
$result = $client->getJobRuns([/* ... */]);
$promise = $client->getJobRunsAsync([/* ... */]);
Retrieves metadata for all runs of a given job definition.
Parameter Syntax
$result = $client->getJobRuns([ 'JobName' => '<string>', // REQUIRED 'MaxResults' => <integer>, 'NextToken' => '<string>', ]);
Parameter Details
Members
Result Syntax
[ 'JobRuns' => [ [ 'AllocatedCapacity' => <integer>, 'Arguments' => ['<string>', ...], 'Attempt' => <integer>, 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'ExecutionTime' => <integer>, 'GlueVersion' => '<string>', 'Id' => '<string>', 'JobName' => '<string>', 'JobRunState' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', 'LastModifiedOn' => <DateTime>, 'LogGroupName' => '<string>', 'MaxCapacity' => <float>, 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'NumberOfWorkers' => <integer>, 'PredecessorRuns' => [ [ 'JobName' => '<string>', 'RunId' => '<string>', ], // ... ], 'PreviousRunId' => '<string>', 'SecurityConfiguration' => '<string>', 'StartedOn' => <DateTime>, 'Timeout' => <integer>, 'TriggerName' => '<string>', 'WorkerType' => 'Standard|G.1X|G.2X', ], // ... ], 'NextToken' => '<string>', ]
Result Details
Members
- JobRuns
-
- Type: Array of JobRun structures
A list of job-run metadata objects.
- NextToken
-
- Type: string
A continuation token, if not all requested job runs have been returned.
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
An internal service error occurred.
-
The operation timed out.
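As an illustration of working with the `JobRuns` list, the sketch below tallies one page of a GetJobRuns result by `JobRunState`. The helper name `countRunsByState` is hypothetical; it does not follow `NextToken`, so it covers only the page it is given.

```php
<?php
// Hypothetical helper: counts the runs in one GetJobRuns page per state.
function countRunsByState($result): array
{
    $counts = [];
    foreach ($result['JobRuns'] ?? [] as $run) {
        $state = $run['JobRunState'];
        $counts[$state] = ($counts[$state] ?? 0) + 1;
    }
    ksort($counts); // stable, alphabetical order for display

    return $counts;
}
```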
GetJobs
$result = $client->getJobs([/* ... */]);
$promise = $client->getJobsAsync([/* ... */]);
Retrieves all current job definitions.
Parameter Syntax
$result = $client->getJobs([ 'MaxResults' => <integer>, 'NextToken' => '<string>', ]);
Parameter Details
Members
Result Syntax
[ 'Jobs' => [ [ 'AllocatedCapacity' => <integer>, 'Command' => [ 'Name' => '<string>', 'PythonVersion' => '<string>', 'ScriptLocation' => '<string>', ], 'Connections' => [ 'Connections' => ['<string>', ...], ], 'CreatedOn' => <DateTime>, 'DefaultArguments' => ['<string>', ...], 'Description' => '<string>', 'ExecutionProperty' => [ 'MaxConcurrentRuns' => <integer>, ], 'GlueVersion' => '<string>', 'LastModifiedOn' => <DateTime>, 'LogUri' => '<string>', 'MaxCapacity' => <float>, 'MaxRetries' => <integer>, 'Name' => '<string>', 'NonOverridableArguments' => ['<string>', ...], 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'NumberOfWorkers' => <integer>, 'Role' => '<string>', 'SecurityConfiguration' => '<string>', 'Timeout' => <integer>, 'WorkerType' => 'Standard|G.1X|G.2X', ], // ... ], 'NextToken' => '<string>', ]
Result Details
Members
- Jobs
-
- Type: Array of Job structures
A list of job definitions.
- NextToken
-
- Type: string
A continuation token, if not all job definitions have yet been returned.
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
An internal service error occurred.
-
The operation timed out.
GetMLTaskRun
$result = $client->getMLTaskRun([/* ... */]);
$promise = $client->getMLTaskRunAsync([/* ... */]);
Gets details for a specific task run on a machine learning transform. Machine learning task runs are asynchronous tasks that AWS Glue runs on your behalf as part of various machine learning workflows. You can check the stats of any task run by calling GetMLTaskRun with the TaskRunId and its parent transform's TransformId.
Parameter Syntax
$result = $client->getMLTaskRun([ 'TaskRunId' => '<string>', // REQUIRED 'TransformId' => '<string>', // REQUIRED ]);
Parameter Details
Members
Result Syntax
[ 'CompletedOn' => <DateTime>, 'ErrorString' => '<string>', 'ExecutionTime' => <integer>, 'LastModifiedOn' => <DateTime>, 'LogGroupName' => '<string>', 'Properties' => [ 'ExportLabelsTaskRunProperties' => [ 'OutputS3Path' => '<string>', ], 'FindMatchesTaskRunProperties' => [ 'JobId' => '<string>', 'JobName' => '<string>', 'JobRunId' => '<string>', ], 'ImportLabelsTaskRunProperties' => [ 'InputS3Path' => '<string>', 'Replace' => true || false, ], 'LabelingSetGenerationTaskRunProperties' => [ 'OutputS3Path' => '<string>', ], 'TaskType' => 'EVALUATION|LABELING_SET_GENERATION|IMPORT_LABELS|EXPORT_LABELS|FIND_MATCHES', ], 'StartedOn' => <DateTime>, 'Status' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', 'TaskRunId' => '<string>', 'TransformId' => '<string>', ]
Result Details
Members
- CompletedOn
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time when this task run was completed.
- ErrorString
-
- Type: string
The error strings that are associated with the task run.
- ExecutionTime
-
- Type: int
The amount of time (in seconds) that the task run consumed resources.
- LastModifiedOn
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time when this task run was last modified.
- LogGroupName
-
- Type: string
The names of the log groups that are associated with the task run.
- Properties
-
- Type: TaskRunProperties structure
The list of properties that are associated with the task run.
- StartedOn
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time when this task run started.
- Status
-
- Type: string
The status for this task run.
- TaskRunId
-
- Type: string
The unique run identifier associated with this run.
- TransformId
-
- Type: string
The unique identifier of the machine learning transform that this task run belongs to.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
The operation timed out.
-
An internal service error occurred.
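The GetMLTaskRun result reports `ExecutionTime` as the time the run consumed resources, which is not necessarily the same as the elapsed time between `StartedOn` and `CompletedOn`. A minimal sketch of computing the elapsed time follows; the helper name `taskRunWallClockSeconds` is hypothetical, and it assumes both timestamps arrive as DateTime objects, as the result syntax indicates.

```php
<?php
// Hypothetical helper: elapsed seconds between StartedOn and CompletedOn,
// or null when the run has not completed yet.
function taskRunWallClockSeconds($result): ?int
{
    $started = $result['StartedOn'] ?? null;
    $completed = $result['CompletedOn'] ?? null;

    if (!$started instanceof DateTimeInterface || !$completed instanceof DateTimeInterface) {
        return null;
    }
    return $completed->getTimestamp() - $started->getTimestamp();
}
```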
GetMLTaskRuns
$result = $client->getMLTaskRuns([/* ... */]);
$promise = $client->getMLTaskRunsAsync([/* ... */]);
Gets a list of runs for a machine learning transform. Machine learning task runs are asynchronous tasks that AWS Glue runs on your behalf as part of various machine learning workflows. You can get a sortable, filterable list of machine learning task runs by calling GetMLTaskRuns with their parent transform's TransformId and other optional parameters as documented in this section.
This operation returns a list of historic runs and must be paginated.
Parameter Syntax
$result = $client->getMLTaskRuns([ 'Filter' => [ 'StartedAfter' => <integer || string || DateTime>, 'StartedBefore' => <integer || string || DateTime>, 'Status' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', 'TaskRunType' => 'EVALUATION|LABELING_SET_GENERATION|IMPORT_LABELS|EXPORT_LABELS|FIND_MATCHES', ], 'MaxResults' => <integer>, 'NextToken' => '<string>', 'Sort' => [ 'Column' => 'TASK_RUN_TYPE|STATUS|STARTED', // REQUIRED 'SortDirection' => 'DESCENDING|ASCENDING', // REQUIRED ], 'TransformId' => '<string>', // REQUIRED ]);
Parameter Details
Members
- Filter
-
- Type: TaskRunFilterCriteria structure
The filter criteria, in the TaskRunFilterCriteria structure, for the task run.
- MaxResults
-
- Type: int
The maximum number of results to return.
- NextToken
-
- Type: string
A token for pagination of the results. The default is empty.
- Sort
-
- Type: TaskRunSortCriteria structure
The sorting criteria, in the TaskRunSortCriteria structure, for the task run.
- TransformId
-
- Required: Yes
- Type: string
The unique identifier of the machine learning transform.
Result Syntax
[ 'NextToken' => '<string>', 'TaskRuns' => [ [ 'CompletedOn' => <DateTime>, 'ErrorString' => '<string>', 'ExecutionTime' => <integer>, 'LastModifiedOn' => <DateTime>, 'LogGroupName' => '<string>', 'Properties' => [ 'ExportLabelsTaskRunProperties' => [ 'OutputS3Path' => '<string>', ], 'FindMatchesTaskRunProperties' => [ 'JobId' => '<string>', 'JobName' => '<string>', 'JobRunId' => '<string>', ], 'ImportLabelsTaskRunProperties' => [ 'InputS3Path' => '<string>', 'Replace' => true || false, ], 'LabelingSetGenerationTaskRunProperties' => [ 'OutputS3Path' => '<string>', ], 'TaskType' => 'EVALUATION|LABELING_SET_GENERATION|IMPORT_LABELS|EXPORT_LABELS|FIND_MATCHES', ], 'StartedOn' => <DateTime>, 'Status' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', 'TaskRunId' => '<string>', 'TransformId' => '<string>', ], // ... ], ]
Result Details
Members
- NextToken
-
- Type: string
A pagination token, if more results are available.
- TaskRuns
-
- Type: Array of TaskRun structures
A list of task runs that are associated with the transform.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
The operation timed out.
-
An internal service error occurred.
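The `Filter` and `Sort` structures of GetMLTaskRuns can be combined in one request. The sketch below builds parameters for listing failed task runs that started after a given time, newest first. The helper name `failedTaskRunsParams` and the page size of 25 are hypothetical choices for illustration.

```php
<?php
// Hypothetical helper: parameters for "failed runs since $since, newest first".
function failedTaskRunsParams(string $transformId, DateTimeInterface $since): array
{
    return [
        'TransformId' => $transformId, // REQUIRED
        'Filter' => [
            'Status' => 'FAILED',
            // The parameter syntax accepts an integer, string, or DateTime here.
            'StartedAfter' => $since->getTimestamp(),
        ],
        'Sort' => [
            'Column' => 'STARTED',           // REQUIRED when Sort is given
            'SortDirection' => 'DESCENDING', // REQUIRED when Sort is given
        ],
        'MaxResults' => 25, // illustrative page size
    ];
}

// Usage, assuming $client is a configured Aws\Glue\GlueClient:
// $result = $client->getMLTaskRuns(failedTaskRunsParams($transformId, new DateTime('-1 day')));
```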
GetMLTransform
$result = $client->getMLTransform([/* ... */]);
$promise = $client->getMLTransformAsync([/* ... */]);
Gets an AWS Glue machine learning transform artifact and all its corresponding metadata. Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed by learning from examples provided by humans. These transformations are then saved by AWS Glue. You can retrieve their metadata by calling GetMLTransform.
Parameter Syntax
$result = $client->getMLTransform([ 'TransformId' => '<string>', // REQUIRED ]);
Parameter Details
Members
Result Syntax
[ 'CreatedOn' => <DateTime>, 'Description' => '<string>', 'EvaluationMetrics' => [ 'FindMatchesMetrics' => [ 'AreaUnderPRCurve' => <float>, 'ColumnImportances' => [ [ 'ColumnName' => '<string>', 'Importance' => <float>, ], // ... ], 'ConfusionMatrix' => [ 'NumFalseNegatives' => <integer>, 'NumFalsePositives' => <integer>, 'NumTrueNegatives' => <integer>, 'NumTruePositives' => <integer>, ], 'F1' => <float>, 'Precision' => <float>, 'Recall' => <float>, ], 'TransformType' => 'FIND_MATCHES', ], 'GlueVersion' => '<string>', 'InputRecordTables' => [ [ 'CatalogId' => '<string>', 'ConnectionName' => '<string>', 'DatabaseName' => '<string>', 'TableName' => '<string>', ], // ... ], 'LabelCount' => <integer>, 'LastModifiedOn' => <DateTime>, 'MaxCapacity' => <float>, 'MaxRetries' => <integer>, 'Name' => '<string>', 'NumberOfWorkers' => <integer>, 'Parameters' => [ 'FindMatchesParameters' => [ 'AccuracyCostTradeoff' => <float>, 'EnforceProvidedLabels' => true || false, 'PrecisionRecallTradeoff' => <float>, 'PrimaryKeyColumnName' => '<string>', ], 'TransformType' => 'FIND_MATCHES', ], 'Role' => '<string>', 'Schema' => [ [ 'DataType' => '<string>', 'Name' => '<string>', ], // ... ], 'Status' => 'NOT_READY|READY|DELETING', 'Timeout' => <integer>, 'TransformEncryption' => [ 'MlUserDataEncryption' => [ 'KmsKeyId' => '<string>', 'MlUserDataEncryptionMode' => 'DISABLED|SSE-KMS', ], 'TaskRunSecurityConfigurationName' => '<string>', ], 'TransformId' => '<string>', 'WorkerType' => 'Standard|G.1X|G.2X', ]
Result Details
Members
- CreatedOn
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time when the transform was created.
- Description
-
- Type: string
A description of the transform.
- EvaluationMetrics
-
- Type: EvaluationMetrics structure
The latest evaluation metrics.
- GlueVersion
-
- Type: string
This value determines which version of AWS Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see AWS Glue Versions in the developer guide.
- InputRecordTables
-
- Type: Array of GlueTable structures
A list of AWS Glue table definitions used by the transform.
- LabelCount
-
- Type: int
The number of labels available for this transform.
- LastModifiedOn
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time when the transform was last modified.
- MaxCapacity
-
- Type: double
The number of AWS Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.
When the WorkerType field is set to a value other than Standard, the MaxCapacity field is set automatically and becomes read-only.
- MaxRetries
-
- Type: int
The maximum number of times to retry a task for this transform after a task run fails.
- Name
-
- Type: string
The unique name given to the transform when it was created.
- NumberOfWorkers
-
- Type: int
The number of workers of a defined workerType that are allocated when this task runs.
- Parameters
-
- Type: TransformParameters structure
The configuration parameters that are specific to the algorithm used.
- Role
-
- Type: string
The name or Amazon Resource Name (ARN) of the IAM role with the required permissions.
- Schema
-
- Type: Array of SchemaColumn structures
The Map<Column, Type> object that represents the schema that this transform accepts. Has an upper bound of 100 columns.
- Status
-
- Type: string
The last known status of the transform (to indicate whether it can be used or not). One of "NOT_READY", "READY", or "DELETING".
- Timeout
-
- Type: int
The timeout for a task run for this transform in minutes. This is the maximum time that a task run for this transform can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).
- TransformEncryption
-
- Type: TransformEncryption structure
The encryption-at-rest settings of the transform that apply to accessing user data. Machine learning transforms can access user data encrypted in Amazon S3 using KMS.
- TransformId
-
- Type: string
The unique identifier of the transform, generated at the time that the transform was created.
- WorkerType
-
- Type: string
The type of predefined worker that is allocated when this task runs. Accepts a value of Standard, G.1X, or G.2X.
- For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50GB disk, and 2 executors per worker.
- For the G.1X worker type, each worker provides 4 vCPU, 16 GB of memory and a 64GB disk, and 1 executor per worker.
- For the G.2X worker type, each worker provides 8 vCPU, 32 GB of memory and a 128GB disk, and 1 executor per worker.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
The operation timed out.
-
An internal service error occurred.
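As a minimal sketch of consuming the result described above, the helper below takes the result as a plain PHP array and reports the transform's name, whether its status is READY, and its F1 score when one is present. The function name `summarizeTransform` and the sample values are hypothetical; only the keys `Name`, `Status`, and `EvaluationMetrics` / `FindMatchesMetrics` / `F1` come from the result syntax.

```php
<?php
// Hypothetical helper: condenses a GetMLTransform result into a few fields.
function summarizeTransform(array $result): array
{
    return [
        'name'  => $result['Name'] ?? null,
        // READY is treated here as "the transform can be used".
        'ready' => ($result['Status'] ?? null) === 'READY',
        // Evaluation metrics may be absent, so fall back to null.
        'f1'    => $result['EvaluationMetrics']['FindMatchesMetrics']['F1'] ?? null,
    ];
}

// Sample data shaped like the documented result (values are made up).
$sample = [
    'Name'   => 'dedupe-customers',
    'Status' => 'READY',
    'EvaluationMetrics' => [
        'TransformType'      => 'FIND_MATCHES',
        'FindMatchesMetrics' => ['F1' => 0.91],
    ],
];
$summary = summarizeTransform($sample);
```

With a real client, the array would typically come from `$client->getMLTransform([...])->toArray()`.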
GetMLTransforms
$result = $client->getMLTransforms([/* ... */]);
$promise = $client->getMLTransformsAsync([/* ... */]);
Gets a sortable, filterable list of existing AWS Glue machine learning transforms. Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed by learning from examples provided by humans. These transformations are then saved by AWS Glue, and you can retrieve their metadata by calling GetMLTransforms.
Parameter Syntax
$result = $client->getMLTransforms([ 'Filter' => [ 'CreatedAfter' => <integer || string || DateTime>, 'CreatedBefore' => <integer || string || DateTime>, 'GlueVersion' => '<string>', 'LastModifiedAfter' => <integer || string || DateTime>, 'LastModifiedBefore' => <integer || string || DateTime>, 'Name' => '<string>', 'Schema' => [ [ 'DataType' => '<string>', 'Name' => '<string>', ], // ... ], 'Status' => 'NOT_READY|READY|DELETING', 'TransformType' => 'FIND_MATCHES', ], 'MaxResults' => <integer>, 'NextToken' => '<string>', 'Sort' => [ 'Column' => 'NAME|TRANSFORM_TYPE|STATUS|CREATED|LAST_MODIFIED', // REQUIRED 'SortDirection' => 'DESCENDING|ASCENDING', // REQUIRED ], ]);
Parameter Details
Members
- Filter
-
- Type: TransformFilterCriteria structure
The filter transformation criteria.
- MaxResults
-
- Type: int
The maximum number of results to return.
- NextToken
-
- Type: string
A pagination token to offset the results.
- Sort
-
- Type: TransformSortCriteria structure
The sorting criteria.
Result Syntax
[ 'NextToken' => '<string>', 'Transforms' => [ [ 'CreatedOn' => <DateTime>, 'Description' => '<string>', 'EvaluationMetrics' => [ 'FindMatchesMetrics' => [ 'AreaUnderPRCurve' => <float>, 'ColumnImportances' => [ [ 'ColumnName' => '<string>', 'Importance' => <float>, ], // ... ], 'ConfusionMatrix' => [ 'NumFalseNegatives' => <integer>, 'NumFalsePositives' => <integer>, 'NumTrueNegatives' => <integer>, 'NumTruePositives' => <integer>, ], 'F1' => <float>, 'Precision' => <float>, 'Recall' => <float>, ], 'TransformType' => 'FIND_MATCHES', ], 'GlueVersion' => '<string>', 'InputRecordTables' => [ [ 'CatalogId' => '<string>', 'ConnectionName' => '<string>', 'DatabaseName' => '<string>', 'TableName' => '<string>', ], // ... ], 'LabelCount' => <integer>, 'LastModifiedOn' => <DateTime>, 'MaxCapacity' => <float>, 'MaxRetries' => <integer>, 'Name' => '<string>', 'NumberOfWorkers' => <integer>, 'Parameters' => [ 'FindMatchesParameters' => [ 'AccuracyCostTradeoff' => <float>, 'EnforceProvidedLabels' => true || false, 'PrecisionRecallTradeoff' => <float>, 'PrimaryKeyColumnName' => '<string>', ], 'TransformType' => 'FIND_MATCHES', ], 'Role' => '<string>', 'Schema' => [ [ 'DataType' => '<string>', 'Name' => '<string>', ], // ... ], 'Status' => 'NOT_READY|READY|DELETING', 'Timeout' => <integer>, 'TransformEncryption' => [ 'MlUserDataEncryption' => [ 'KmsKeyId' => '<string>', 'MlUserDataEncryptionMode' => 'DISABLED|SSE-KMS', ], 'TaskRunSecurityConfigurationName' => '<string>', ], 'TransformId' => '<string>', 'WorkerType' => 'Standard|G.1X|G.2X', ], // ... ], ]
Result Details
Members
- NextToken
-
- Type: string
A pagination token, if more results are available.
- Transforms
-
- Required: Yes
- Type: Array of MLTransform structures
A list of machine learning transforms.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
The operation timed out.
-
An internal service error occurred.
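The NextToken member in both the request and the result suggests a paging loop. The sketch below shows one way to follow it, assuming the token is simply absent on the last page. The helper `collectTransforms`, the `$fetchPage` callable, and the two fake pages are hypothetical stand-ins so the example runs without the service.

```php
<?php
// Hypothetical helper: follows NextToken until no further page is reported.
// $fetchPage is any callable that accepts request parameters and returns an
// array shaped like the GetMLTransforms result.
function collectTransforms(callable $fetchPage, array $params): array
{
    $transforms = [];
    do {
        $page = $fetchPage($params);
        foreach ($page['Transforms'] ?? [] as $transform) {
            $transforms[] = $transform;
        }
        // Carry the token into the next request; stop when it is missing.
        $params['NextToken'] = $page['NextToken'] ?? null;
    } while ($params['NextToken'] !== null);

    return $transforms;
}

// Request parameters using the documented Filter and Sort members.
$params = [
    'Filter'     => ['Status' => 'READY', 'TransformType' => 'FIND_MATCHES'],
    'Sort'       => ['Column' => 'CREATED', 'SortDirection' => 'DESCENDING'],
    'MaxResults' => 2,
];

// Stand-in for the service, returning two fixed pages (made-up data).
$fakePages = [
    'start' => ['Transforms' => [['Name' => 'a'], ['Name' => 'b']], 'NextToken' => 't1'],
    't1'    => ['Transforms' => [['Name' => 'c']]],
];
$fetchPage = function (array $request) use ($fakePages): array {
    return $fakePages[$request['NextToken'] ?? 'start'];
};

$all = collectTransforms($fetchPage, $params);
```

With a real client, `$fetchPage` could wrap `$client->getMLTransforms($request)->toArray()`.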
GetMapping
$result = $client->getMapping([/* ... */]);
$promise = $client->getMappingAsync([/* ... */]);
Creates mappings.
Parameter Syntax
$result = $client->getMapping([ 'Location' => [ 'DynamoDB' => [ [ 'Name' => '<string>', // REQUIRED 'Param' => true || false, 'Value' => '<string>', // REQUIRED ], // ... ], 'Jdbc' => [ [ 'Name' => '<string>', // REQUIRED 'Param' => true || false, 'Value' => '<string>', // REQUIRED ], // ... ], 'S3' => [ [ 'Name' => '<string>', // REQUIRED 'Param' => true || false, 'Value' => '<string>', // REQUIRED ], // ... ], ], 'Sinks' => [ [ 'DatabaseName' => '<string>', // REQUIRED 'TableName' => '<string>', // REQUIRED ], // ... ], 'Source' => [ // REQUIRED 'DatabaseName' => '<string>', // REQUIRED 'TableName' => '<string>', // REQUIRED ], ]);
Parameter Details
Members
- Location
-
- Type: Location structure
Parameters for the mapping.
- Sinks
-
- Type: Array of CatalogEntry structures
A list of target tables.
- Source
-
- Required: Yes
- Type: CatalogEntry structure
Specifies the source table.
Result Syntax
[ 'Mapping' => [ [ 'SourcePath' => '<string>', 'SourceTable' => '<string>', 'SourceType' => '<string>', 'TargetPath' => '<string>', 'TargetTable' => '<string>', 'TargetType' => '<string>', ], // ... ], ]
Result Details
Members
- Mapping
-
- Required: Yes
- Type: Array of MappingEntry structures
A list of mappings to the specified targets.
Errors
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
A specified entity does not exist
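To illustrate the parameter shape above, here is a small sketch that assembles the required Source entry and the optional Sinks list from database and table names. The helper `buildGetMappingParams` and the database and table names are hypothetical.

```php
<?php
// Hypothetical helper: builds GetMapping parameters from one source table
// and any number of target tables, each given as [database, table].
function buildGetMappingParams(array $source, array $sinks = []): array
{
    $toEntry = function (array $pair): array {
        return ['DatabaseName' => $pair[0], 'TableName' => $pair[1]];
    };

    $params = ['Source' => $toEntry($source)];
    if ($sinks !== []) {
        // Sinks is optional, so it is only added when targets are given.
        $params['Sinks'] = array_map($toEntry, $sinks);
    }
    return $params;
}

$params = buildGetMappingParams(['sales', 'orders_raw'], [['sales', 'orders_clean']]);
```

The resulting array can be passed to `$client->getMapping($params)`.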
GetPartition
$result = $client->getPartition([/* ... */]);
$promise = $client->getPartitionAsync([/* ... */]);
Retrieves information about a specified partition.
Parameter Syntax
$result = $client->getPartition([ 'CatalogId' => '<string>', 'DatabaseName' => '<string>', // REQUIRED 'PartitionValues' => ['<string>', ...], // REQUIRED 'TableName' => '<string>', // REQUIRED ]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the partition in question resides. If none is provided, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database where the partition resides.
- PartitionValues
-
- Required: Yes
- Type: Array of strings
The values that define the partition.
- TableName
-
- Required: Yes
- Type: string
The name of the partition's table.
Result Syntax
[ 'Partition' => [ 'CatalogId' => '<string>', 'CreationTime' => <DateTime>, 'DatabaseName' => '<string>', 'LastAccessTime' => <DateTime>, 'LastAnalyzedTime' => <DateTime>, 'Parameters' => ['<string>', ...], 'StorageDescriptor' => [ 'BucketColumns' => ['<string>', ...], 'Columns' => [ [ 'Comment' => '<string>', 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'Type' => '<string>', ], // ... ], 'Compressed' => true || false, 'InputFormat' => '<string>', 'Location' => '<string>', 'NumberOfBuckets' => <integer>, 'OutputFormat' => '<string>', 'Parameters' => ['<string>', ...], 'SchemaReference' => [ 'SchemaId' => [ 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], 'SchemaVersionId' => '<string>', 'SchemaVersionNumber' => <integer>, ], 'SerdeInfo' => [ 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'SerializationLibrary' => '<string>', ], 'SkewedInfo' => [ 'SkewedColumnNames' => ['<string>', ...], 'SkewedColumnValueLocationMaps' => ['<string>', ...], 'SkewedColumnValues' => ['<string>', ...], ], 'SortColumns' => [ [ 'Column' => '<string>', 'SortOrder' => <integer>, ], // ... ], 'StoredAsSubDirectories' => true || false, ], 'TableName' => '<string>', 'Values' => ['<string>', ...], ], ]
Result Details
Members
- Partition
-
- Type: Partition structure
The requested information, in the form of a Partition object.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
An encryption operation failed.
GetPartitionIndexes
$result = $client->getPartitionIndexes([/* ... */]);
$promise = $client->getPartitionIndexesAsync([/* ... */]);
Retrieves the partition indexes associated with a table.
Parameter Syntax
$result = $client->getPartitionIndexes([ 'CatalogId' => '<string>', 'DatabaseName' => '<string>', // REQUIRED 'NextToken' => '<string>', 'TableName' => '<string>', // REQUIRED ]);
Parameter Details
Members
- CatalogId
-
- Type: string
The catalog ID where the table resides.
- DatabaseName
-
- Required: Yes
- Type: string
Specifies the name of a database from which you want to retrieve partition indexes.
- NextToken
-
- Type: string
A continuation token, included if this is a continuation call.
- TableName
-
- Required: Yes
- Type: string
Specifies the name of a table for which you want to retrieve the partition indexes.
Result Syntax
[ 'NextToken' => '<string>', 'PartitionIndexDescriptorList' => [ [ 'BackfillErrors' => [ [ 'Code' => 'ENCRYPTED_PARTITION_ERROR|INTERNAL_ERROR|INVALID_PARTITION_TYPE_DATA_ERROR|MISSING_PARTITION_VALUE_ERROR|UNSUPPORTED_PARTITION_CHARACTER_ERROR', 'Partitions' => [ [ 'Values' => ['<string>', ...], ], // ... ], ], // ... ], 'IndexName' => '<string>', 'IndexStatus' => 'CREATING|ACTIVE|DELETING|FAILED', 'Keys' => [ [ 'Name' => '<string>', 'Type' => '<string>', ], // ... ], ], // ... ], ]
Result Details
Members
- NextToken
-
- Type: string
A continuation token, present if the current list segment is not the last.
- PartitionIndexDescriptorList
-
- Type: Array of PartitionIndexDescriptor structures
A list of index descriptors.
Errors
-
An internal service error occurred.
-
The operation timed out.
-
The input provided was not valid.
-
A specified entity does not exist
-
The CreatePartitions API was called on a table that has indexes enabled.
GetPartitions
$result = $client->getPartitions([/* ... */]);
$promise = $client->getPartitionsAsync([/* ... */]);
Retrieves information about the partitions in a table.
Parameter Syntax
$result = $client->getPartitions([ 'CatalogId' => '<string>', 'DatabaseName' => '<string>', // REQUIRED 'ExcludeColumnSchema' => true || false, 'Expression' => '<string>', 'MaxResults' => <integer>, 'NextToken' => '<string>', 'Segment' => [ 'SegmentNumber' => <integer>, // REQUIRED 'TotalSegments' => <integer>, // REQUIRED ], 'TableName' => '<string>', // REQUIRED ]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the partitions in question reside. If none is provided, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database where the partitions reside.
- ExcludeColumnSchema
-
- Type: boolean
- Expression
-
- Type: string
An expression that filters the partitions to be returned.
The expression uses SQL syntax similar to the SQL WHERE filter clause. The SQL statement parser JSQLParser parses the expression.
Operators: The following are the operators that you can use in the Expression API call:
- =
-
Checks whether the values of the two operands are equal; if yes, then the condition becomes true.
Example: Assume 'variable a' holds 10 and 'variable b' holds 20.
(a = b) is not true.
- < >
-
Checks whether the values of two operands are equal; if the values are not equal, then the condition becomes true.
Example: (a < > b) is true.
- >
-
Checks whether the value of the left operand is greater than the value of the right operand; if yes, then the condition becomes true.
Example: (a > b) is not true.
- <
-
Checks whether the value of the left operand is less than the value of the right operand; if yes, then the condition becomes true.
Example: (a < b) is true.
- >=
-
Checks whether the value of the left operand is greater than or equal to the value of the right operand; if yes, then the condition becomes true.
Example: (a >= b) is not true.
- <=
-
Checks whether the value of the left operand is less than or equal to the value of the right operand; if yes, then the condition becomes true.
Example: (a <= b) is true.
- AND, OR, IN, BETWEEN, LIKE, NOT, IS NULL
-
Logical operators.
Supported Partition Key Types: The following are the supported partition keys.
-
string
-
date
-
timestamp
-
int
-
bigint
-
long
-
tinyint
-
smallint
-
decimal
If an invalid type is encountered, an exception is thrown.
The following list shows the valid operators on each type. When you define a crawler, the partitionKey type is created as a STRING, to be compatible with the catalog partitions.
Sample API Call:
- MaxResults
-
- Type: int
The maximum number of partitions to return in a single response.
- NextToken
-
- Type: string
A continuation token, if this is not the first call to retrieve these partitions.
- Segment
-
- Type: Segment structure
The segment of the table's partitions to scan in this request.
- TableName
-
- Required: Yes
- Type: string
The name of the partitions' table.
Result Syntax
[ 'NextToken' => '<string>', 'Partitions' => [ [ 'CatalogId' => '<string>', 'CreationTime' => <DateTime>, 'DatabaseName' => '<string>', 'LastAccessTime' => <DateTime>, 'LastAnalyzedTime' => <DateTime>, 'Parameters' => ['<string>', ...], 'StorageDescriptor' => [ 'BucketColumns' => ['<string>', ...], 'Columns' => [ [ 'Comment' => '<string>', 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'Type' => '<string>', ], // ... ], 'Compressed' => true || false, 'InputFormat' => '<string>', 'Location' => '<string>', 'NumberOfBuckets' => <integer>, 'OutputFormat' => '<string>', 'Parameters' => ['<string>', ...], 'SchemaReference' => [ 'SchemaId' => [ 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], 'SchemaVersionId' => '<string>', 'SchemaVersionNumber' => <integer>, ], 'SerdeInfo' => [ 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'SerializationLibrary' => '<string>', ], 'SkewedInfo' => [ 'SkewedColumnNames' => ['<string>', ...], 'SkewedColumnValueLocationMaps' => ['<string>', ...], 'SkewedColumnValues' => ['<string>', ...], ], 'SortColumns' => [ [ 'Column' => '<string>', 'SortOrder' => <integer>, ], // ... ], 'StoredAsSubDirectories' => true || false, ], 'TableName' => '<string>', 'Values' => ['<string>', ...], ], // ... ], ]
Result Details
Members
- NextToken
-
- Type: string
A continuation token, if the returned list of partitions does not include the last one.
- Partitions
-
- Type: Array of Partition structures
A list of requested partitions.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
The operation timed out.
-
An internal service error occurred.
-
An encryption operation failed.
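The Expression and Segment parameters can be assembled as in the sketch below. It builds an equality filter joined with AND and then produces one request per segment. The helper names and the database, table, and key names are hypothetical. Two assumptions are made: segment numbers are zero-based, and single quotes inside a value are escaped by doubling them, as in standard SQL.

```php
<?php
// Hypothetical helper: joins equality conditions on partition keys with AND.
// Values are quoted as strings, in line with the note that crawler-created
// partition keys are typed as STRING.
function buildPartitionExpression(array $conditions): string
{
    $parts = [];
    foreach ($conditions as $key => $value) {
        // Assumption: doubling a single quote keeps it inside the literal.
        $parts[] = sprintf("%s = '%s'", $key, str_replace("'", "''", (string) $value));
    }
    return implode(' AND ', $parts);
}

// Hypothetical helper: one request per segment, so that each segment of the
// table's partitions can be scanned by a separate call.
// Assumption: SegmentNumber starts at 0.
function buildSegmentedRequests(array $base, int $totalSegments): array
{
    $requests = [];
    for ($i = 0; $i < $totalSegments; $i++) {
        $requests[] = $base + [
            'Segment' => ['SegmentNumber' => $i, 'TotalSegments' => $totalSegments],
        ];
    }
    return $requests;
}

$base = [
    'DatabaseName' => 'sales',
    'TableName'    => 'orders',
    'Expression'   => buildPartitionExpression(['year' => '2021', 'month' => '03']),
];
$requests = buildSegmentedRequests($base, 2);
```

Each element of `$requests` can be passed to `$client->getPartitions()` on its own, with NextToken handled separately per segment.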
GetPlan
$result = $client->getPlan([/* ... */]);
$promise = $client->getPlanAsync([/* ... */]);
Gets code to perform a specified mapping.
Parameter Syntax
$result = $client->getPlan([ 'AdditionalPlanOptionsMap' => ['<string>', ...], 'Language' => 'PYTHON|SCALA', 'Location' => [ 'DynamoDB' => [ [ 'Name' => '<string>', // REQUIRED 'Param' => true || false, 'Value' => '<string>', // REQUIRED ], // ... ], 'Jdbc' => [ [ 'Name' => '<string>', // REQUIRED 'Param' => true || false, 'Value' => '<string>', // REQUIRED ], // ... ], 'S3' => [ [ 'Name' => '<string>', // REQUIRED 'Param' => true || false, 'Value' => '<string>', // REQUIRED ], // ... ], ], 'Mapping' => [ // REQUIRED [ 'SourcePath' => '<string>', 'SourceTable' => '<string>', 'SourceType' => '<string>', 'TargetPath' => '<string>', 'TargetTable' => '<string>', 'TargetType' => '<string>', ], // ... ], 'Sinks' => [ [ 'DatabaseName' => '<string>', // REQUIRED 'TableName' => '<string>', // REQUIRED ], // ... ], 'Source' => [ // REQUIRED 'DatabaseName' => '<string>', // REQUIRED 'TableName' => '<string>', // REQUIRED ], ]);
Parameter Details
Members
- AdditionalPlanOptionsMap
-
- Type: Associative array of custom strings keys (GenericString) to strings
A map to hold additional optional key-value parameters.
Currently, these key-value pairs are supported:
- inferSchema: Specifies whether to set inferSchema to true or false for the default script generated by an AWS Glue job. For example, to set inferSchema to true, pass the following key value pair:
--additional-plan-options-map '{"inferSchema":"true"}'
- Language
-
- Type: string
The programming language of the code to perform the mapping.
- Location
-
- Type: Location structure
The parameters for the mapping.
- Mapping
-
- Required: Yes
- Type: Array of MappingEntry structures
The list of mappings from a source table to target tables.
- Sinks
-
- Type: Array of CatalogEntry structures
The target tables.
- Source
-
- Required: Yes
- Type: CatalogEntry structure
The source table.
Result Syntax
[ 'PythonScript' => '<string>', 'ScalaCode' => '<string>', ]
Result Details
Members
Errors
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
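Because GetPlan takes a list of MappingEntry structures, the output of GetMapping can feed it directly. The sketch below shows that hand-off and how the generated code might be read back. The helper names and sample values are hypothetical, and the choice to read ScalaCode for SCALA and PythonScript otherwise is an assumption based on the result syntax.

```php
<?php
// Hypothetical helper: assembles GetPlan parameters from a GetMapping result.
function buildGetPlanParams(array $mappingResult, array $source, string $language = 'PYTHON'): array
{
    return [
        'Mapping'  => $mappingResult['Mapping'],
        'Source'   => $source,
        'Language' => $language,
        // Map values are strings, so the flag is passed as "true".
        'AdditionalPlanOptionsMap' => ['inferSchema' => 'true'],
    ];
}

// Hypothetical helper: picks the generated code that matches the language.
function extractScript(array $planResult, string $language): ?string
{
    $key = $language === 'SCALA' ? 'ScalaCode' : 'PythonScript';
    return $planResult[$key] ?? null;
}

// Sample data shaped like the documented structures (values are made up).
$mappingResult = ['Mapping' => [[
    'SourceTable' => 'orders_raw',   'SourcePath' => 'id',       'SourceType' => 'string',
    'TargetTable' => 'orders_clean', 'TargetPath' => 'order_id', 'TargetType' => 'bigint',
]]];
$params = buildGetPlanParams(
    $mappingResult,
    ['DatabaseName' => 'sales', 'TableName' => 'orders_raw']
);
$script = extractScript(['PythonScript' => 'print("generated")'], 'PYTHON');
```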
GetRegistry
$result = $client->getRegistry([/* ... */]);
$promise = $client->getRegistryAsync([/* ... */]);
Describes the specified registry in detail.
Parameter Syntax
$result = $client->getRegistry([ 'RegistryId' => [ // REQUIRED 'RegistryArn' => '<string>', 'RegistryName' => '<string>', ], ]);
Parameter Details
Members
- RegistryId
-
- Required: Yes
- Type: RegistryId structure
This is a wrapper structure that may contain the registry name and Amazon Resource Name (ARN).
Result Syntax
[ 'CreatedTime' => '<string>', 'Description' => '<string>', 'RegistryArn' => '<string>', 'RegistryName' => '<string>', 'Status' => 'AVAILABLE|DELETING', 'UpdatedTime' => '<string>', ]
Result Details
Members
- CreatedTime
-
- Type: string
The date and time the registry was created.
- Description
-
- Type: string
A description of the registry.
- RegistryArn
-
- Type: string
The Amazon Resource Name (ARN) of the registry.
- RegistryName
-
- Type: string
The name of the registry.
- Status
-
- Type: string
The status of the registry.
- UpdatedTime
-
- Type: string
The date and time the registry was updated.
Errors
-
The input provided was not valid.
-
Access to a resource was denied.
-
A specified entity does not exist
-
An internal service error occurred.
GetResourcePolicies
$result = $client->getResourcePolicies([/* ... */]);
$promise = $client->getResourcePoliciesAsync([/* ... */]);
Retrieves the resource policies set on individual resources by AWS Resource Access Manager during cross-account permission grants. Also retrieves the Data Catalog resource policy.
If you enabled metadata encryption in Data Catalog settings, and you do not have permission on the AWS KMS key, the operation can't return the Data Catalog resource policy.
Parameter Syntax
$result = $client->getResourcePolicies([ 'MaxResults' => <integer>, 'NextToken' => '<string>', ]);
Parameter Details
Members
Result Syntax
[ 'GetResourcePoliciesResponseList' => [ [ 'CreateTime' => <DateTime>, 'PolicyHash' => '<string>', 'PolicyInJson' => '<string>', 'UpdateTime' => <DateTime>, ], // ... ], 'NextToken' => '<string>', ]
Result Details
Members
- GetResourcePoliciesResponseList
-
- Type: Array of GluePolicy structures
A list of the individual resource policies and the account-level resource policy.
- NextToken
-
- Type: string
A continuation token, if the returned list does not contain the last resource policy available.
Errors
-
An internal service error occurred.
-
The operation timed out.
-
The input provided was not valid.
-
An encryption operation failed.
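Each entry in GetResourcePoliciesResponseList carries its policy document as a JSON string in PolicyInJson. A minimal sketch of decoding those strings follows. The helper `decodePolicies`, the choice to key the output by PolicyHash, and the sample policy are hypothetical.

```php
<?php
// Hypothetical helper: decodes each PolicyInJson string into a PHP array,
// keyed by the policy's hash.
function decodePolicies(array $result): array
{
    $decoded = [];
    foreach ($result['GetResourcePoliciesResponseList'] ?? [] as $policy) {
        // JSON_THROW_ON_ERROR (PHP 7.3+) raises JsonException on bad input.
        $decoded[$policy['PolicyHash']] = json_decode(
            $policy['PolicyInJson'],
            true,
            512,
            JSON_THROW_ON_ERROR
        );
    }
    return $decoded;
}

// Sample data shaped like the documented result (values are made up).
$sample = [
    'GetResourcePoliciesResponseList' => [
        ['PolicyHash' => 'abc123', 'PolicyInJson' => '{"Version":"2012-10-17","Statement":[]}'],
    ],
];
$policies = decodePolicies($sample);
```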
GetResourcePolicy
$result = $client->getResourcePolicy([/* ... */]);
$promise = $client->getResourcePolicyAsync([/* ... */]);
Retrieves a specified resource policy.
Parameter Syntax
$result = $client->getResourcePolicy([ 'ResourceArn' => '<string>', ]);
Parameter Details
Members
- ResourceArn
-
- Type: string
The ARN of the AWS Glue resource for which to retrieve the resource policy. If not supplied, the Data Catalog resource policy is returned. Use GetResourcePolicies to view all existing resource policies. For more information, see Specifying AWS Glue Resource ARNs.
Result Syntax
[ 'CreateTime' => <DateTime>, 'PolicyHash' => '<string>', 'PolicyInJson' => '<string>', 'UpdateTime' => <DateTime>, ]
Result Details
Members
- CreateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time at which the policy was created.
- PolicyHash
-
- Type: string
Contains the hash value associated with this policy.
- PolicyInJson
-
- Type: string
Contains the requested policy document, in JSON format.
- UpdateTime
-
- Type: timestamp (string|DateTime or anything parsable by strtotime)
The date and time at which the policy was last updated.
Errors
-
A specified entity does not exist
-
An internal service error occurred.
-
The operation timed out.
-
The input provided was not valid.
GetSchema
$result = $client->getSchema([/* ... */]);
$promise = $client->getSchemaAsync([/* ... */]);
Describes the specified schema in detail.
Parameter Syntax
$result = $client->getSchema([ 'SchemaId' => [ // REQUIRED 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], ]);
Parameter Details
Members
- SchemaId
-
- Required: Yes
- Type: SchemaId structure
This is a wrapper structure to contain schema identity fields. The structure contains:
- SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn, or both SchemaName and RegistryName, must be provided.
- SchemaId$SchemaName: The name of the schema. Either SchemaArn, or both SchemaName and RegistryName, must be provided.
Result Syntax
[ 'Compatibility' => 'NONE|DISABLED|BACKWARD|BACKWARD_ALL|FORWARD|FORWARD_ALL|FULL|FULL_ALL', 'CreatedTime' => '<string>', 'DataFormat' => 'AVRO', 'Description' => '<string>', 'LatestSchemaVersion' => <integer>, 'NextSchemaVersion' => <integer>, 'RegistryArn' => '<string>', 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaCheckpoint' => <integer>, 'SchemaName' => '<string>', 'SchemaStatus' => 'AVAILABLE|PENDING|DELETING', 'UpdatedTime' => '<string>', ]
Result Details
Members
- Compatibility
-
- Type: string
The compatibility mode of the schema.
- CreatedTime
-
- Type: string
The date and time the schema was created.
- DataFormat
-
- Type: string
The data format of the schema definition. Currently only AVRO is supported.
- Description
-
- Type: string
A description of the schema, if one was specified when it was created.
- LatestSchemaVersion
-
- Type: long (int|float)
The latest version of the schema associated with the returned schema definition.
- NextSchemaVersion
-
- Type: long (int|float)
The next version of the schema associated with the returned schema definition.
- RegistryArn
-
- Type: string
The Amazon Resource Name (ARN) of the registry.
- RegistryName
-
- Type: string
The name of the registry.
- SchemaArn
-
- Type: string
The Amazon Resource Name (ARN) of the schema.
- SchemaCheckpoint
-
- Type: long (int|float)
The version number of the checkpoint (the last time the compatibility mode was changed).
- SchemaName
-
- Type: string
The name of the schema.
- SchemaStatus
-
- Type: string
The status of the schema.
- UpdatedTime
-
- Type: string
The date and time the schema was updated.
Errors
-
The input provided was not valid.
-
Access to a resource was denied.
-
A specified entity does not exist
-
An internal service error occurred.
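The SchemaId wrapper accepts either an ARN or a schema name together with a registry name. The sketch below encodes that rule in a small builder that rejects incomplete input before any request is made. The helper `buildSchemaId` and the sample names are hypothetical.

```php
<?php
// Hypothetical helper: builds the SchemaId wrapper from either an ARN or a
// schema name plus registry name, rejecting incomplete input.
function buildSchemaId(?string $arn = null, ?string $schemaName = null, ?string $registryName = null): array
{
    if ($arn !== null) {
        return ['SchemaArn' => $arn];
    }
    if ($schemaName !== null && $registryName !== null) {
        return ['SchemaName' => $schemaName, 'RegistryName' => $registryName];
    }
    throw new InvalidArgumentException(
        'Provide either SchemaArn, or both SchemaName and RegistryName.'
    );
}

$byName = buildSchemaId(null, 'orders-value', 'streaming');
```

The result can be used as the SchemaId member, for example `$client->getSchema(['SchemaId' => $byName])`.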
GetSchemaByDefinition
$result = $client->getSchemaByDefinition([/* ... */]);
$promise = $client->getSchemaByDefinitionAsync([/* ... */]);
Retrieves a schema by the SchemaDefinition. The schema definition is sent to the Schema Registry, canonicalized, and hashed. If the hash is matched within the scope of the SchemaName or ARN (or the default registry, if none is supplied), that schema's metadata is returned. Otherwise, a 404 or NotFound error is returned. Schema versions in Deleted status are not included in the results.
Parameter Syntax
$result = $client->getSchemaByDefinition([ 'SchemaDefinition' => '<string>', // REQUIRED 'SchemaId' => [ // REQUIRED 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], ]);
Parameter Details
Members
- SchemaDefinition
-
- Required: Yes
- Type: string
The definition of the schema for which schema details are required.
- SchemaId
-
- Required: Yes
- Type: SchemaId structure
This is a wrapper structure to contain schema identity fields. The structure contains:
- SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. One of SchemaArn or SchemaName has to be provided.
- SchemaId$SchemaName: The name of the schema. One of SchemaArn or SchemaName has to be provided.
Result Syntax
[ 'CreatedTime' => '<string>', 'DataFormat' => 'AVRO', 'SchemaArn' => '<string>', 'SchemaVersionId' => '<string>', 'Status' => 'AVAILABLE|PENDING|FAILURE|DELETING', ]
Result Details
Members
- CreatedTime
-
- Type: string
The date and time the schema was created.
- DataFormat
-
- Type: string
The data format of the schema definition. Currently only AVRO is supported.
- SchemaArn
-
- Type: string
The Amazon Resource Name (ARN) of the schema.
- SchemaVersionId
-
- Type: string
The schema ID of the schema version.
- Status
-
- Type: string
The status of the schema version.
Errors
-
The input provided was not valid.
-
Access to a resource was denied.
-
A specified entity does not exist
-
An internal service error occurred.
GetSchemaVersion
$result = $client->getSchemaVersion([/* ... */]);
$promise = $client->getSchemaVersionAsync([/* ... */]);
Gets the specified schema by the unique ID assigned when a version of the schema is created or registered. Schema versions in Deleted status are not included in the results.
Parameter Syntax
$result = $client->getSchemaVersion([ 'SchemaId' => [ 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], 'SchemaVersionId' => '<string>', 'SchemaVersionNumber' => [ 'LatestVersion' => true || false, 'VersionNumber' => <integer>, ], ]);
Parameter Details
Members
- SchemaId
-
- Type: SchemaId structure
This is a wrapper structure to contain schema identity fields. The structure contains:
- SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn, or both SchemaName and RegistryName, must be provided.
- SchemaId$SchemaName: The name of the schema. Either SchemaArn, or both SchemaName and RegistryName, must be provided.
- SchemaVersionId
-
- Type: string
The SchemaVersionId of the schema version. This field is required for fetching by schema ID. Either this or the SchemaId wrapper has to be provided.
- SchemaVersionNumber
-
- Type: SchemaVersionNumber structure
The version number of the schema.
Result Syntax
[ 'CreatedTime' => '<string>', 'DataFormat' => 'AVRO', 'SchemaArn' => '<string>', 'SchemaDefinition' => '<string>', 'SchemaVersionId' => '<string>', 'Status' => 'AVAILABLE|PENDING|FAILURE|DELETING', 'VersionNumber' => <integer>, ]
Result Details
Members
- CreatedTime
-
- Type: string
The date and time the schema version was created.
- DataFormat
-
- Type: string
The data format of the schema definition. Currently only AVRO is supported.
- SchemaArn
-
- Type: string
The Amazon Resource Name (ARN) of the schema.
- SchemaDefinition
-
- Type: string
The schema definition for the schema ID.
- SchemaVersionId
-
- Type: string
The SchemaVersionId of the schema version.
- Status
-
- Type: string
The status of the schema version.
- VersionNumber
-
- Type: long (int|float)
The version number of the schema.
Errors
-
The input provided was not valid.
-
Access to a resource was denied.
-
A specified entity does not exist
-
An internal service error occurred.
GetSchemaVersionsDiff
$result = $client->getSchemaVersionsDiff([/* ... */]);
$promise = $client->getSchemaVersionsDiffAsync([/* ... */]);
Fetches the schema version difference in the specified difference type between two stored schema versions in the Schema Registry.
This API allows you to compare two schema versions between two schema definitions under the same schema.
Parameter Syntax
$result = $client->getSchemaVersionsDiff([ 'FirstSchemaVersionNumber' => [ // REQUIRED 'LatestVersion' => true || false, 'VersionNumber' => <integer>, ], 'SchemaDiffType' => 'SYNTAX_DIFF', // REQUIRED 'SchemaId' => [ // REQUIRED 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], 'SecondSchemaVersionNumber' => [ // REQUIRED 'LatestVersion' => true || false, 'VersionNumber' => <integer>, ], ]);
Parameter Details
Members
- FirstSchemaVersionNumber
-
- Required: Yes
- Type: SchemaVersionNumber structure
The first of the two schema versions to be compared.
- SchemaDiffType
-
- Required: Yes
- Type: string
Refers to SYNTAX_DIFF, which is the currently supported diff type.
- SchemaId
-
- Required: Yes
- Type: SchemaId structure
This is a wrapper structure to contain schema identity fields. The structure contains:
- SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. One of SchemaArn or SchemaName has to be provided.
- SchemaId$SchemaName: The name of the schema. One of SchemaArn or SchemaName has to be provided.
- SecondSchemaVersionNumber
-
- Required: Yes
- Type: SchemaVersionNumber structure
The second of the two schema versions to be compared.
Result Syntax
[ 'Diff' => '<string>', ]
Result Details
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
Access to a resource was denied.
-
An internal service error occurred.
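As a minimal sketch of how the required parameters above fit together, the following builds the request array for a diff between two numbered versions. The helper name `buildSchemaDiffParams` and the registry and schema names are hypothetical, and the sketch assumes the schema is identified by name within a registry rather than by ARN.

```php
<?php
// Hypothetical helper (not part of the SDK): builds the parameter array
// for getSchemaVersionsDiff from a schema name and two version numbers.
function buildSchemaDiffParams(
    string $registryName,
    string $schemaName,
    int $firstVersion,
    int $secondVersion
): array {
    return [
        // Assumption: the schema is identified by name, so SchemaArn is omitted.
        'SchemaId' => [
            'RegistryName' => $registryName,
            'SchemaName'   => $schemaName,
        ],
        'FirstSchemaVersionNumber'  => ['VersionNumber' => $firstVersion],
        'SecondSchemaVersionNumber' => ['VersionNumber' => $secondVersion],
        // SYNTAX_DIFF is the only diff type listed in the parameter syntax.
        'SchemaDiffType' => 'SYNTAX_DIFF',
    ];
}

// Placeholder names, for illustration only.
$params = buildSchemaDiffParams('my-registry', 'my-schema', 1, 2);

// With a configured GlueClient, the call would look like:
//   $result = $client->getSchemaVersionsDiff($params);
//   echo $result['Diff'];
```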
GetSecurityConfiguration
$result = $client->getSecurityConfiguration([/* ... */]);
$promise = $client->getSecurityConfigurationAsync([/* ... */]);
Retrieves a specified security configuration.
Parameter Syntax
$result = $client->getSecurityConfiguration([
    'Name' => '<string>', // REQUIRED
]);
Parameter Details
Result Syntax
[ 'SecurityConfiguration' => [ 'CreatedTimeStamp' => <DateTime>, 'EncryptionConfiguration' => [ 'CloudWatchEncryption' => [ 'CloudWatchEncryptionMode' => 'DISABLED|SSE-KMS', 'KmsKeyArn' => '<string>', ], 'JobBookmarksEncryption' => [ 'JobBookmarksEncryptionMode' => 'DISABLED|CSE-KMS', 'KmsKeyArn' => '<string>', ], 'S3Encryption' => [ [ 'KmsKeyArn' => '<string>', 'S3EncryptionMode' => 'DISABLED|SSE-KMS|SSE-S3', ], // ... ], ], 'Name' => '<string>', ], ]
Result Details
Members
- SecurityConfiguration
-
- Type: SecurityConfiguration structure
The requested security configuration.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
GetSecurityConfigurations
$result = $client->getSecurityConfigurations([/* ... */]);
$promise = $client->getSecurityConfigurationsAsync([/* ... */]);
Retrieves a list of all security configurations.
Parameter Syntax
$result = $client->getSecurityConfigurations([ 'MaxResults' => <integer>, 'NextToken' => '<string>', ]);
Parameter Details
Members
Result Syntax
[ 'NextToken' => '<string>', 'SecurityConfigurations' => [ [ 'CreatedTimeStamp' => <DateTime>, 'EncryptionConfiguration' => [ 'CloudWatchEncryption' => [ 'CloudWatchEncryptionMode' => 'DISABLED|SSE-KMS', 'KmsKeyArn' => '<string>', ], 'JobBookmarksEncryption' => [ 'JobBookmarksEncryptionMode' => 'DISABLED|CSE-KMS', 'KmsKeyArn' => '<string>', ], 'S3Encryption' => [ [ 'KmsKeyArn' => '<string>', 'S3EncryptionMode' => 'DISABLED|SSE-KMS|SSE-S3', ], // ... ], ], 'Name' => '<string>', ], // ... ], ]
Result Details
Members
- NextToken
-
- Type: string
A continuation token, if there are more security configurations to return.
- SecurityConfigurations
-
- Type: Array of SecurityConfiguration structures
A list of security configurations.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
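The NextToken continuation pattern described above can be followed with a small loop. This is a sketch only: `paginate` is a hypothetical helper, and the stub closure stands in for a real client call so the example runs without AWS credentials.

```php
<?php
// Hypothetical helper: $fetchPage is any callable that accepts a parameter
// array and returns an array-like result with an optional 'NextToken'.
function paginate(callable $fetchPage, array $params, string $listKey): Generator
{
    do {
        $result = $fetchPage($params);
        foreach ($result[$listKey] ?? [] as $item) {
            yield $item;
        }
        // Pass the continuation token back on the next request.
        $token = $result['NextToken'] ?? null;
        $params['NextToken'] = $token;
    } while (!empty($token));
}

// Stub standing in for the service: two pages keyed by the incoming token.
$pages = [
    ''   => ['SecurityConfigurations' => [['Name' => 'a'], ['Name' => 'b']], 'NextToken' => 't1'],
    't1' => ['SecurityConfigurations' => [['Name' => 'c']]],
];
$stub = function (array $params) use ($pages) {
    return $pages[$params['NextToken'] ?? ''];
};

$names = [];
foreach (paginate($stub, ['MaxResults' => 2], 'SecurityConfigurations') as $config) {
    $names[] = $config['Name'];
}
```

With a real client, the stub could be replaced by a closure such as `function (array $p) use ($client) { return $client->getSecurityConfigurations($p); }`.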
GetTable
$result = $client->getTable([/* ... */]);
$promise = $client->getTableAsync([/* ... */]);
Retrieves the Table definition in a Data Catalog for a specified table.
Parameter Syntax
$result = $client->getTable([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'Name' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the table resides. If none is provided, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.
- Name
-
- Required: Yes
- Type: string
The name of the table for which to retrieve the definition. For Hive compatibility, this name is entirely lowercase.
Result Syntax
[ 'Table' => [ 'CatalogId' => '<string>', 'CreateTime' => <DateTime>, 'CreatedBy' => '<string>', 'DatabaseName' => '<string>', 'Description' => '<string>', 'IsRegisteredWithLakeFormation' => true || false, 'LastAccessTime' => <DateTime>, 'LastAnalyzedTime' => <DateTime>, 'Name' => '<string>', 'Owner' => '<string>', 'Parameters' => ['<string>', ...], 'PartitionKeys' => [ [ 'Comment' => '<string>', 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'Type' => '<string>', ], // ... ], 'Retention' => <integer>, 'StorageDescriptor' => [ 'BucketColumns' => ['<string>', ...], 'Columns' => [ [ 'Comment' => '<string>', 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'Type' => '<string>', ], // ... ], 'Compressed' => true || false, 'InputFormat' => '<string>', 'Location' => '<string>', 'NumberOfBuckets' => <integer>, 'OutputFormat' => '<string>', 'Parameters' => ['<string>', ...], 'SchemaReference' => [ 'SchemaId' => [ 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], 'SchemaVersionId' => '<string>', 'SchemaVersionNumber' => <integer>, ], 'SerdeInfo' => [ 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'SerializationLibrary' => '<string>', ], 'SkewedInfo' => [ 'SkewedColumnNames' => ['<string>', ...], 'SkewedColumnValueLocationMaps' => ['<string>', ...], 'SkewedColumnValues' => ['<string>', ...], ], 'SortColumns' => [ [ 'Column' => '<string>', 'SortOrder' => <integer>, ], // ... ], 'StoredAsSubDirectories' => true || false, ], 'TableType' => '<string>', 'TargetTable' => [ 'CatalogId' => '<string>', 'DatabaseName' => '<string>', 'Name' => '<string>', ], 'UpdateTime' => <DateTime>, 'ViewExpandedText' => '<string>', 'ViewOriginalText' => '<string>', ], ]
Result Details
Members
- Table
-
- Type: Table structure
The Table object that defines the specified table.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
An encryption operation failed.
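To illustrate reading the Table structure shown in the result syntax, the sketch below collects column names and types from StorageDescriptor.Columns and PartitionKeys. The helper `describeColumns` and the sample table are hypothetical. The sample array only mimics the shape of `$result['Table']`.

```php
<?php
// Hypothetical helper: returns "column name => type" pairs from an array
// shaped like the Table structure in the result syntax.
function describeColumns(array $table): array
{
    // Regular columns come first, followed by partition keys.
    $columns = array_merge(
        $table['StorageDescriptor']['Columns'] ?? [],
        $table['PartitionKeys'] ?? []
    );

    $schema = [];
    foreach ($columns as $column) {
        $schema[$column['Name']] = $column['Type'] ?? '';
    }
    return $schema;
}

// Sample data standing in for $result['Table'] from getTable.
$table = [
    'Name' => 'events',
    'StorageDescriptor' => [
        'Columns' => [
            ['Name' => 'id', 'Type' => 'bigint'],
            ['Name' => 'payload', 'Type' => 'string'],
        ],
    ],
    'PartitionKeys' => [
        ['Name' => 'dt', 'Type' => 'string'],
    ],
];

$schema = describeColumns($table);
```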
GetTableVersion
$result = $client->getTableVersion([/* ... */]);
$promise = $client->getTableVersionAsync([/* ... */]);
Retrieves a specified version of a table.
Parameter Syntax
$result = $client->getTableVersion([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'TableName' => '<string>', // REQUIRED
    'VersionId' => '<string>',
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the tables reside. If none is provided, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.
- TableName
-
- Required: Yes
- Type: string
The name of the table. For Hive compatibility, this name is entirely lowercase.
- VersionId
-
- Type: string
The ID value of the table version to be retrieved. A VersionId is a string representation of an integer. Each version is incremented by 1.
Result Syntax
[ 'TableVersion' => [ 'Table' => [ 'CatalogId' => '<string>', 'CreateTime' => <DateTime>, 'CreatedBy' => '<string>', 'DatabaseName' => '<string>', 'Description' => '<string>', 'IsRegisteredWithLakeFormation' => true || false, 'LastAccessTime' => <DateTime>, 'LastAnalyzedTime' => <DateTime>, 'Name' => '<string>', 'Owner' => '<string>', 'Parameters' => ['<string>', ...], 'PartitionKeys' => [ [ 'Comment' => '<string>', 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'Type' => '<string>', ], // ... ], 'Retention' => <integer>, 'StorageDescriptor' => [ 'BucketColumns' => ['<string>', ...], 'Columns' => [ [ 'Comment' => '<string>', 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'Type' => '<string>', ], // ... ], 'Compressed' => true || false, 'InputFormat' => '<string>', 'Location' => '<string>', 'NumberOfBuckets' => <integer>, 'OutputFormat' => '<string>', 'Parameters' => ['<string>', ...], 'SchemaReference' => [ 'SchemaId' => [ 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], 'SchemaVersionId' => '<string>', 'SchemaVersionNumber' => <integer>, ], 'SerdeInfo' => [ 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'SerializationLibrary' => '<string>', ], 'SkewedInfo' => [ 'SkewedColumnNames' => ['<string>', ...], 'SkewedColumnValueLocationMaps' => ['<string>', ...], 'SkewedColumnValues' => ['<string>', ...], ], 'SortColumns' => [ [ 'Column' => '<string>', 'SortOrder' => <integer>, ], // ... ], 'StoredAsSubDirectories' => true || false, ], 'TableType' => '<string>', 'TargetTable' => [ 'CatalogId' => '<string>', 'DatabaseName' => '<string>', 'Name' => '<string>', ], 'UpdateTime' => <DateTime>, 'ViewExpandedText' => '<string>', 'ViewOriginalText' => '<string>', ], 'VersionId' => '<string>', ], ]
Result Details
Members
- TableVersion
-
- Type: TableVersion structure
The requested table version.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
An encryption operation failed.
GetTableVersions
$result = $client->getTableVersions([/* ... */]);
$promise = $client->getTableVersionsAsync([/* ... */]);
Retrieves a list of strings that identify available versions of a specified table.
Parameter Syntax
$result = $client->getTableVersions([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
    'TableName' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the tables reside. If none is provided, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.
- MaxResults
-
- Type: int
The maximum number of table versions to return in one response.
- NextToken
-
- Type: string
A continuation token, if this is not the first call.
- TableName
-
- Required: Yes
- Type: string
The name of the table. For Hive compatibility, this name is entirely lowercase.
Result Syntax
[ 'NextToken' => '<string>', 'TableVersions' => [ [ 'Table' => [ 'CatalogId' => '<string>', 'CreateTime' => <DateTime>, 'CreatedBy' => '<string>', 'DatabaseName' => '<string>', 'Description' => '<string>', 'IsRegisteredWithLakeFormation' => true || false, 'LastAccessTime' => <DateTime>, 'LastAnalyzedTime' => <DateTime>, 'Name' => '<string>', 'Owner' => '<string>', 'Parameters' => ['<string>', ...], 'PartitionKeys' => [ [ 'Comment' => '<string>', 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'Type' => '<string>', ], // ... ], 'Retention' => <integer>, 'StorageDescriptor' => [ 'BucketColumns' => ['<string>', ...], 'Columns' => [ [ 'Comment' => '<string>', 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'Type' => '<string>', ], // ... ], 'Compressed' => true || false, 'InputFormat' => '<string>', 'Location' => '<string>', 'NumberOfBuckets' => <integer>, 'OutputFormat' => '<string>', 'Parameters' => ['<string>', ...], 'SchemaReference' => [ 'SchemaId' => [ 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], 'SchemaVersionId' => '<string>', 'SchemaVersionNumber' => <integer>, ], 'SerdeInfo' => [ 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'SerializationLibrary' => '<string>', ], 'SkewedInfo' => [ 'SkewedColumnNames' => ['<string>', ...], 'SkewedColumnValueLocationMaps' => ['<string>', ...], 'SkewedColumnValues' => ['<string>', ...], ], 'SortColumns' => [ [ 'Column' => '<string>', 'SortOrder' => <integer>, ], // ... ], 'StoredAsSubDirectories' => true || false, ], 'TableType' => '<string>', 'TargetTable' => [ 'CatalogId' => '<string>', 'DatabaseName' => '<string>', 'Name' => '<string>', ], 'UpdateTime' => <DateTime>, 'ViewExpandedText' => '<string>', 'ViewOriginalText' => '<string>', ], 'VersionId' => '<string>', ], // ... ], ]
Result Details
Members
- NextToken
-
- Type: string
A continuation token, if the list of available versions does not include the last one.
- TableVersions
-
- Type: Array of TableVersion structures
A list of strings identifying available versions of the specified table.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
An encryption operation failed.
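Because a VersionId is a string representation of an integer, picking the newest entry from a TableVersions list is safest with a numeric comparison. A lexical sort would place '9' after '10'. The sketch below assumes the list has already been fetched. `latestTableVersion` is a hypothetical helper and the sample data is invented.

```php
<?php
// Hypothetical helper: returns the entry with the highest numeric VersionId,
// or null when the list is empty.
function latestTableVersion(array $tableVersions): ?array
{
    $latest = null;
    foreach ($tableVersions as $version) {
        // Cast to int so that '10' ranks above '9'.
        if ($latest === null || (int) $version['VersionId'] > (int) $latest['VersionId']) {
            $latest = $version;
        }
    }
    return $latest;
}

// Sample data standing in for $result['TableVersions'].
$versions = [
    ['VersionId' => '9'],
    ['VersionId' => '10'],
    ['VersionId' => '2'],
];

$latest = latestTableVersion($versions);
```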
GetTables
$result = $client->getTables([/* ... */]);
$promise = $client->getTablesAsync([/* ... */]);
Retrieves the definitions of some or all of the tables in a given Database.
Parameter Syntax
$result = $client->getTables([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'Expression' => '<string>',
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the tables reside. If none is provided, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The database in the catalog whose tables to list. For Hive compatibility, this name is entirely lowercase.
- Expression
-
- Type: string
A regular expression pattern. If present, only those tables whose names match the pattern are returned.
- MaxResults
-
- Type: int
The maximum number of tables to return in a single response.
- NextToken
-
- Type: string
A continuation token, included if this is a continuation call.
Result Syntax
[ 'NextToken' => '<string>', 'TableList' => [ [ 'CatalogId' => '<string>', 'CreateTime' => <DateTime>, 'CreatedBy' => '<string>', 'DatabaseName' => '<string>', 'Description' => '<string>', 'IsRegisteredWithLakeFormation' => true || false, 'LastAccessTime' => <DateTime>, 'LastAnalyzedTime' => <DateTime>, 'Name' => '<string>', 'Owner' => '<string>', 'Parameters' => ['<string>', ...], 'PartitionKeys' => [ [ 'Comment' => '<string>', 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'Type' => '<string>', ], // ... ], 'Retention' => <integer>, 'StorageDescriptor' => [ 'BucketColumns' => ['<string>', ...], 'Columns' => [ [ 'Comment' => '<string>', 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'Type' => '<string>', ], // ... ], 'Compressed' => true || false, 'InputFormat' => '<string>', 'Location' => '<string>', 'NumberOfBuckets' => <integer>, 'OutputFormat' => '<string>', 'Parameters' => ['<string>', ...], 'SchemaReference' => [ 'SchemaId' => [ 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], 'SchemaVersionId' => '<string>', 'SchemaVersionNumber' => <integer>, ], 'SerdeInfo' => [ 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'SerializationLibrary' => '<string>', ], 'SkewedInfo' => [ 'SkewedColumnNames' => ['<string>', ...], 'SkewedColumnValueLocationMaps' => ['<string>', ...], 'SkewedColumnValues' => ['<string>', ...], ], 'SortColumns' => [ [ 'Column' => '<string>', 'SortOrder' => <integer>, ], // ... ], 'StoredAsSubDirectories' => true || false, ], 'TableType' => '<string>', 'TargetTable' => [ 'CatalogId' => '<string>', 'DatabaseName' => '<string>', 'Name' => '<string>', ], 'UpdateTime' => <DateTime>, 'ViewExpandedText' => '<string>', 'ViewOriginalText' => '<string>', ], // ... ], ]
Result Details
Members
- NextToken
-
- Type: string
A continuation token, present if the current list segment is not the last.
- TableList
-
- Type: Array of Table structures
A list of the requested Table objects.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
The operation timed out.
-
An internal service error occurred.
-
An encryption operation failed.
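The optional GetTables parameters can be assembled so that unset values are left out of the request. This is a sketch under assumptions: `buildGetTablesParams` is a hypothetical helper, and lowercasing the database name is a choice made here based on the Hive compatibility note, not something the SDK requires.

```php
<?php
// Hypothetical helper: builds the getTables parameter array and drops
// optional entries that were not supplied.
function buildGetTablesParams(
    string $databaseName,
    ?string $expression = null,
    ?int $maxResults = null,
    ?string $catalogId = null
): array {
    $params = [
        'CatalogId'    => $catalogId,
        // Assumption: normalize to lowercase for Hive compatibility.
        'DatabaseName' => strtolower($databaseName),
        'Expression'   => $expression,
        'MaxResults'   => $maxResults,
    ];

    // Keep only the entries that have a value.
    return array_filter($params, static function ($value) {
        return $value !== null;
    });
}

// Placeholder database name and pattern, for illustration only.
$params = buildGetTablesParams('Sales_DB', '^orders_.*', 50);

// With a configured GlueClient: $result = $client->getTables($params);
```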
GetTags
$result = $client->getTags([/* ... */]);
$promise = $client->getTagsAsync([/* ... */]);
Retrieves a list of tags associated with a resource.
Parameter Syntax
$result = $client->getTags([
    'ResourceArn' => '<string>', // REQUIRED
]);
Parameter Details
Members
Result Syntax
[ 'Tags' => ['<string>', ...], ]
Result Details
Errors
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
A specified entity does not exist
GetTrigger
$result = $client->getTrigger([/* ... */]);
$promise = $client->getTriggerAsync([/* ... */]);
Retrieves the definition of a trigger.
Parameter Syntax
$result = $client->getTrigger([
    'Name' => '<string>', // REQUIRED
]);
Parameter Details
Result Syntax
[ 'Trigger' => [ 'Actions' => [ [ 'Arguments' => ['<string>', ...], 'CrawlerName' => '<string>', 'JobName' => '<string>', 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'SecurityConfiguration' => '<string>', 'Timeout' => <integer>, ], // ... ], 'Description' => '<string>', 'Id' => '<string>', 'Name' => '<string>', 'Predicate' => [ 'Conditions' => [ [ 'CrawlState' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED', 'CrawlerName' => '<string>', 'JobName' => '<string>', 'LogicalOperator' => 'EQUALS', 'State' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', ], // ... ], 'Logical' => 'AND|ANY', ], 'Schedule' => '<string>', 'State' => 'CREATING|CREATED|ACTIVATING|ACTIVATED|DEACTIVATING|DEACTIVATED|DELETING|UPDATING', 'Type' => 'SCHEDULED|CONDITIONAL|ON_DEMAND', 'WorkflowName' => '<string>', ], ]
Result Details
Members
- Trigger
-
- Type: Trigger structure
The requested trigger definition.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
GetTriggers
$result = $client->getTriggers([/* ... */]);
$promise = $client->getTriggersAsync([/* ... */]);
Gets all the triggers associated with a job.
Parameter Syntax
$result = $client->getTriggers([ 'DependentJobName' => '<string>', 'MaxResults' => <integer>, 'NextToken' => '<string>', ]);
Parameter Details
Members
- DependentJobName
-
- Type: string
The name of the job to retrieve triggers for. The trigger that can start this job is returned, and if there is no such trigger, all triggers are returned.
- MaxResults
-
- Type: int
The maximum size of the response.
- NextToken
-
- Type: string
A continuation token, if this is a continuation call.
Result Syntax
[ 'NextToken' => '<string>', 'Triggers' => [ [ 'Actions' => [ [ 'Arguments' => ['<string>', ...], 'CrawlerName' => '<string>', 'JobName' => '<string>', 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'SecurityConfiguration' => '<string>', 'Timeout' => <integer>, ], // ... ], 'Description' => '<string>', 'Id' => '<string>', 'Name' => '<string>', 'Predicate' => [ 'Conditions' => [ [ 'CrawlState' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED', 'CrawlerName' => '<string>', 'JobName' => '<string>', 'LogicalOperator' => 'EQUALS', 'State' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', ], // ... ], 'Logical' => 'AND|ANY', ], 'Schedule' => '<string>', 'State' => 'CREATING|CREATED|ACTIVATING|ACTIVATED|DEACTIVATING|DEACTIVATED|DELETING|UPDATING', 'Type' => 'SCHEDULED|CONDITIONAL|ON_DEMAND', 'WorkflowName' => '<string>', ], // ... ], ]
Result Details
Members
- NextToken
-
- Type: string
A continuation token, if not all the requested triggers have yet been returned.
- Triggers
-
- Type: Array of Trigger structures
A list of triggers for the specified job.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
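As one way to work with the Triggers list in the result, the sketch below groups trigger names by their Type value (SCHEDULED, CONDITIONAL, or ON_DEMAND). The helper `groupTriggerNamesByType` and the sample triggers are hypothetical.

```php
<?php
// Hypothetical helper: groups trigger names by the 'Type' field of each
// entry in an array shaped like $result['Triggers'].
function groupTriggerNamesByType(array $triggers): array
{
    $groups = [];
    foreach ($triggers as $trigger) {
        $groups[$trigger['Type']][] = $trigger['Name'];
    }
    return $groups;
}

// Sample data standing in for $result['Triggers'].
$triggers = [
    ['Name' => 'nightly', 'Type' => 'SCHEDULED', 'State' => 'ACTIVATED'],
    ['Name' => 'after-load', 'Type' => 'CONDITIONAL', 'State' => 'ACTIVATED'],
    ['Name' => 'hourly', 'Type' => 'SCHEDULED', 'State' => 'DEACTIVATED'],
];

$byType = groupTriggerNamesByType($triggers);
```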
GetUserDefinedFunction
$result = $client->getUserDefinedFunction([/* ... */]);
$promise = $client->getUserDefinedFunctionAsync([/* ... */]);
Retrieves a specified function definition from the Data Catalog.
Parameter Syntax
$result = $client->getUserDefinedFunction([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'FunctionName' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the function to be retrieved is located. If none is provided, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database where the function is located.
- FunctionName
-
- Required: Yes
- Type: string
The name of the function.
Result Syntax
[ 'UserDefinedFunction' => [ 'CatalogId' => '<string>', 'ClassName' => '<string>', 'CreateTime' => <DateTime>, 'DatabaseName' => '<string>', 'FunctionName' => '<string>', 'OwnerName' => '<string>', 'OwnerType' => 'USER|ROLE|GROUP', 'ResourceUris' => [ [ 'ResourceType' => 'JAR|FILE|ARCHIVE', 'Uri' => '<string>', ], // ... ], ], ]
Result Details
Members
- UserDefinedFunction
-
- Type: UserDefinedFunction structure
The requested function definition.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
An encryption operation failed.
GetUserDefinedFunctions
$result = $client->getUserDefinedFunctions([/* ... */]);
$promise = $client->getUserDefinedFunctionsAsync([/* ... */]);
Retrieves multiple function definitions from the Data Catalog.
Parameter Syntax
$result = $client->getUserDefinedFunctions([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>',
    'MaxResults' => <integer>,
    'NextToken' => '<string>',
    'Pattern' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the functions to be retrieved are located. If none is provided, the AWS account ID is used by default.
- DatabaseName
-
- Type: string
The name of the catalog database where the functions are located. If none is provided, functions from all the databases across the catalog will be returned.
- MaxResults
-
- Type: int
The maximum number of functions to return in one response.
- NextToken
-
- Type: string
A continuation token, if this is a continuation call.
- Pattern
-
- Required: Yes
- Type: string
A function-name pattern string that filters the function definitions returned.
Result Syntax
[ 'NextToken' => '<string>', 'UserDefinedFunctions' => [ [ 'CatalogId' => '<string>', 'ClassName' => '<string>', 'CreateTime' => <DateTime>, 'DatabaseName' => '<string>', 'FunctionName' => '<string>', 'OwnerName' => '<string>', 'OwnerType' => 'USER|ROLE|GROUP', 'ResourceUris' => [ [ 'ResourceType' => 'JAR|FILE|ARCHIVE', 'Uri' => '<string>', ], // ... ], ], // ... ], ]
Result Details
Members
- NextToken
-
- Type: string
A continuation token, if the list of functions returned does not include the last requested function.
- UserDefinedFunctions
-
- Type: Array of UserDefinedFunction structures
A list of requested function definitions.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
The operation timed out.
-
An internal service error occurred.
-
An encryption operation failed.
GetWorkflow
$result = $client->getWorkflow([/* ... */]);
$promise = $client->getWorkflowAsync([/* ... */]);
Retrieves resource metadata for a workflow.
Parameter Syntax
$result = $client->getWorkflow([
    'IncludeGraph' => true || false,
    'Name' => '<string>', // REQUIRED
]);
Parameter Details
Members
Result Syntax
[ 'Workflow' => [ 'CreatedOn' => <DateTime>, 'DefaultRunProperties' => ['<string>', ...], 'Description' => '<string>', 'Graph' => [ 'Edges' => [ [ 'DestinationId' => '<string>', 'SourceId' => '<string>', ], // ... ], 'Nodes' => [ [ 'CrawlerDetails' => [ 'Crawls' => [ [ 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'LogGroup' => '<string>', 'LogStream' => '<string>', 'StartedOn' => <DateTime>, 'State' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED', ], // ... ], ], 'JobDetails' => [ 'JobRuns' => [ [ 'AllocatedCapacity' => <integer>, 'Arguments' => ['<string>', ...], 'Attempt' => <integer>, 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'ExecutionTime' => <integer>, 'GlueVersion' => '<string>', 'Id' => '<string>', 'JobName' => '<string>', 'JobRunState' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', 'LastModifiedOn' => <DateTime>, 'LogGroupName' => '<string>', 'MaxCapacity' => <float>, 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'NumberOfWorkers' => <integer>, 'PredecessorRuns' => [ [ 'JobName' => '<string>', 'RunId' => '<string>', ], // ... ], 'PreviousRunId' => '<string>', 'SecurityConfiguration' => '<string>', 'StartedOn' => <DateTime>, 'Timeout' => <integer>, 'TriggerName' => '<string>', 'WorkerType' => 'Standard|G.1X|G.2X', ], // ... ], ], 'Name' => '<string>', 'TriggerDetails' => [ 'Trigger' => [ 'Actions' => [ [ 'Arguments' => ['<string>', ...], 'CrawlerName' => '<string>', 'JobName' => '<string>', 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'SecurityConfiguration' => '<string>', 'Timeout' => <integer>, ], // ... ], 'Description' => '<string>', 'Id' => '<string>', 'Name' => '<string>', 'Predicate' => [ 'Conditions' => [ [ 'CrawlState' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED', 'CrawlerName' => '<string>', 'JobName' => '<string>', 'LogicalOperator' => 'EQUALS', 'State' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', ], // ... 
], 'Logical' => 'AND|ANY', ], 'Schedule' => '<string>', 'State' => 'CREATING|CREATED|ACTIVATING|ACTIVATED|DEACTIVATING|DEACTIVATED|DELETING|UPDATING', 'Type' => 'SCHEDULED|CONDITIONAL|ON_DEMAND', 'WorkflowName' => '<string>', ], ], 'Type' => 'CRAWLER|JOB|TRIGGER', 'UniqueId' => '<string>', ], // ... ], ], 'LastModifiedOn' => <DateTime>, 'LastRun' => [ 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'Graph' => [ 'Edges' => [ [ 'DestinationId' => '<string>', 'SourceId' => '<string>', ], // ... ], 'Nodes' => [ [ 'CrawlerDetails' => [ 'Crawls' => [ [ 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'LogGroup' => '<string>', 'LogStream' => '<string>', 'StartedOn' => <DateTime>, 'State' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED', ], // ... ], ], 'JobDetails' => [ 'JobRuns' => [ [ 'AllocatedCapacity' => <integer>, 'Arguments' => ['<string>', ...], 'Attempt' => <integer>, 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'ExecutionTime' => <integer>, 'GlueVersion' => '<string>', 'Id' => '<string>', 'JobName' => '<string>', 'JobRunState' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', 'LastModifiedOn' => <DateTime>, 'LogGroupName' => '<string>', 'MaxCapacity' => <float>, 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'NumberOfWorkers' => <integer>, 'PredecessorRuns' => [ [ 'JobName' => '<string>', 'RunId' => '<string>', ], // ... ], 'PreviousRunId' => '<string>', 'SecurityConfiguration' => '<string>', 'StartedOn' => <DateTime>, 'Timeout' => <integer>, 'TriggerName' => '<string>', 'WorkerType' => 'Standard|G.1X|G.2X', ], // ... ], ], 'Name' => '<string>', 'TriggerDetails' => [ 'Trigger' => [ 'Actions' => [ [ 'Arguments' => ['<string>', ...], 'CrawlerName' => '<string>', 'JobName' => '<string>', 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'SecurityConfiguration' => '<string>', 'Timeout' => <integer>, ], // ... 
], 'Description' => '<string>', 'Id' => '<string>', 'Name' => '<string>', 'Predicate' => [ 'Conditions' => [ [ 'CrawlState' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED', 'CrawlerName' => '<string>', 'JobName' => '<string>', 'LogicalOperator' => 'EQUALS', 'State' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', ], // ... ], 'Logical' => 'AND|ANY', ], 'Schedule' => '<string>', 'State' => 'CREATING|CREATED|ACTIVATING|ACTIVATED|DEACTIVATING|DEACTIVATED|DELETING|UPDATING', 'Type' => 'SCHEDULED|CONDITIONAL|ON_DEMAND', 'WorkflowName' => '<string>', ], ], 'Type' => 'CRAWLER|JOB|TRIGGER', 'UniqueId' => '<string>', ], // ... ], ], 'Name' => '<string>', 'PreviousRunId' => '<string>', 'StartedOn' => <DateTime>, 'Statistics' => [ 'FailedActions' => <integer>, 'RunningActions' => <integer>, 'StoppedActions' => <integer>, 'SucceededActions' => <integer>, 'TimeoutActions' => <integer>, 'TotalActions' => <integer>, ], 'Status' => 'RUNNING|COMPLETED|STOPPING|STOPPED|ERROR', 'WorkflowRunId' => '<string>', 'WorkflowRunProperties' => ['<string>', ...], ], 'MaxConcurrentRuns' => <integer>, 'Name' => '<string>', ], ]
Result Details
Members
- Workflow
-
- Type: Workflow structure
The resource metadata for the workflow.
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
An internal service error occurred.
-
The operation timed out.
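The Graph member of the result is a list of Nodes plus a list of Edges. The sketch below turns that into a name-keyed adjacency list. It assumes that each edge's SourceId and DestinationId refer to a node's UniqueId, which is not stated on this page, and that the request set IncludeGraph to true. `buildAdjacency` and the sample graph are hypothetical.

```php
<?php
// Hypothetical helper: maps each node name to the names of the nodes it
// points to, given an array shaped like $result['Workflow']['Graph'].
function buildAdjacency(array $graph): array
{
    // Index node names by UniqueId so edges can be resolved to names.
    $names = [];
    foreach ($graph['Nodes'] ?? [] as $node) {
        $names[$node['UniqueId']] = $node['Name'];
    }

    $adjacency = [];
    foreach ($graph['Edges'] ?? [] as $edge) {
        // Assumption: edge IDs match node UniqueId values; fall back to the raw ID.
        $source = $names[$edge['SourceId']] ?? $edge['SourceId'];
        $target = $names[$edge['DestinationId']] ?? $edge['DestinationId'];
        $adjacency[$source][] = $target;
    }
    return $adjacency;
}

// Sample data standing in for a workflow graph.
$graph = [
    'Nodes' => [
        ['UniqueId' => 't1', 'Name' => 'start', 'Type' => 'TRIGGER'],
        ['UniqueId' => 'j1', 'Name' => 'load', 'Type' => 'JOB'],
        ['UniqueId' => 'c1', 'Name' => 'crawl', 'Type' => 'CRAWLER'],
    ],
    'Edges' => [
        ['SourceId' => 't1', 'DestinationId' => 'j1'],
        ['SourceId' => 'j1', 'DestinationId' => 'c1'],
    ],
];

$adjacency = buildAdjacency($graph);
```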
GetWorkflowRun
$result = $client->getWorkflowRun([/* ... */]);
$promise = $client->getWorkflowRunAsync([/* ... */]);
Retrieves the metadata for a given workflow run.
Parameter Syntax
$result = $client->getWorkflowRun([
    'IncludeGraph' => true || false,
    'Name' => '<string>', // REQUIRED
    'RunId' => '<string>', // REQUIRED
]);
Parameter Details
Members
Result Syntax
[ 'Run' => [ 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'Graph' => [ 'Edges' => [ [ 'DestinationId' => '<string>', 'SourceId' => '<string>', ], // ... ], 'Nodes' => [ [ 'CrawlerDetails' => [ 'Crawls' => [ [ 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'LogGroup' => '<string>', 'LogStream' => '<string>', 'StartedOn' => <DateTime>, 'State' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED', ], // ... ], ], 'JobDetails' => [ 'JobRuns' => [ [ 'AllocatedCapacity' => <integer>, 'Arguments' => ['<string>', ...], 'Attempt' => <integer>, 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'ExecutionTime' => <integer>, 'GlueVersion' => '<string>', 'Id' => '<string>', 'JobName' => '<string>', 'JobRunState' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', 'LastModifiedOn' => <DateTime>, 'LogGroupName' => '<string>', 'MaxCapacity' => <float>, 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'NumberOfWorkers' => <integer>, 'PredecessorRuns' => [ [ 'JobName' => '<string>', 'RunId' => '<string>', ], // ... ], 'PreviousRunId' => '<string>', 'SecurityConfiguration' => '<string>', 'StartedOn' => <DateTime>, 'Timeout' => <integer>, 'TriggerName' => '<string>', 'WorkerType' => 'Standard|G.1X|G.2X', ], // ... ], ], 'Name' => '<string>', 'TriggerDetails' => [ 'Trigger' => [ 'Actions' => [ [ 'Arguments' => ['<string>', ...], 'CrawlerName' => '<string>', 'JobName' => '<string>', 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'SecurityConfiguration' => '<string>', 'Timeout' => <integer>, ], // ... ], 'Description' => '<string>', 'Id' => '<string>', 'Name' => '<string>', 'Predicate' => [ 'Conditions' => [ [ 'CrawlState' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED', 'CrawlerName' => '<string>', 'JobName' => '<string>', 'LogicalOperator' => 'EQUALS', 'State' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', ], // ... 
], 'Logical' => 'AND|ANY', ], 'Schedule' => '<string>', 'State' => 'CREATING|CREATED|ACTIVATING|ACTIVATED|DEACTIVATING|DEACTIVATED|DELETING|UPDATING', 'Type' => 'SCHEDULED|CONDITIONAL|ON_DEMAND', 'WorkflowName' => '<string>', ], ], 'Type' => 'CRAWLER|JOB|TRIGGER', 'UniqueId' => '<string>', ], // ... ], ], 'Name' => '<string>', 'PreviousRunId' => '<string>', 'StartedOn' => <DateTime>, 'Statistics' => [ 'FailedActions' => <integer>, 'RunningActions' => <integer>, 'StoppedActions' => <integer>, 'SucceededActions' => <integer>, 'TimeoutActions' => <integer>, 'TotalActions' => <integer>, ], 'Status' => 'RUNNING|COMPLETED|STOPPING|STOPPED|ERROR', 'WorkflowRunId' => '<string>', 'WorkflowRunProperties' => ['<string>', ...], ], ]
Result Details
Members
- Run
-
- Type: WorkflowRun structure
The requested workflow run metadata.
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
An internal service error occurred.
-
The operation timed out.
GetWorkflowRunProperties
$result = $client->getWorkflowRunProperties([/* ... */]);
$promise = $client->getWorkflowRunPropertiesAsync([/* ... */]);
Retrieves the workflow run properties that were set during the run.
Parameter Syntax
$result = $client->getWorkflowRunProperties([
    'Name' => '<string>', // REQUIRED
    'RunId' => '<string>', // REQUIRED
]);
Parameter Details
Members
Result Syntax
[ 'RunProperties' => ['<string>', ...], ]
Result Details
Members
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
An internal service error occurred.
-
The operation timed out.
GetWorkflowRuns
$result = $client->getWorkflowRuns([/* ... */]);
$promise = $client->getWorkflowRunsAsync([/* ... */]);
Retrieves metadata for all runs of a given workflow.
Parameter Syntax
$result = $client->getWorkflowRuns([
    'IncludeGraph' => true || false,
    'MaxResults' => <integer>,
    'Name' => '<string>', // REQUIRED
    'NextToken' => '<string>',
]);
Parameter Details
Members
- IncludeGraph
-
- Type: boolean
Specifies whether to include the workflow graph in the response.
- MaxResults
-
- Type: int
The maximum number of workflow runs to be included in the response.
- Name
-
- Required: Yes
- Type: string
Name of the workflow whose run metadata should be returned.
- NextToken
-
- Type: string
A continuation token, if this is a continuation request.
Result Syntax
[ 'NextToken' => '<string>', 'Runs' => [ [ 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'Graph' => [ 'Edges' => [ [ 'DestinationId' => '<string>', 'SourceId' => '<string>', ], // ... ], 'Nodes' => [ [ 'CrawlerDetails' => [ 'Crawls' => [ [ 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'LogGroup' => '<string>', 'LogStream' => '<string>', 'StartedOn' => <DateTime>, 'State' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED', ], // ... ], ], 'JobDetails' => [ 'JobRuns' => [ [ 'AllocatedCapacity' => <integer>, 'Arguments' => ['<string>', ...], 'Attempt' => <integer>, 'CompletedOn' => <DateTime>, 'ErrorMessage' => '<string>', 'ExecutionTime' => <integer>, 'GlueVersion' => '<string>', 'Id' => '<string>', 'JobName' => '<string>', 'JobRunState' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', 'LastModifiedOn' => <DateTime>, 'LogGroupName' => '<string>', 'MaxCapacity' => <float>, 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'NumberOfWorkers' => <integer>, 'PredecessorRuns' => [ [ 'JobName' => '<string>', 'RunId' => '<string>', ], // ... ], 'PreviousRunId' => '<string>', 'SecurityConfiguration' => '<string>', 'StartedOn' => <DateTime>, 'Timeout' => <integer>, 'TriggerName' => '<string>', 'WorkerType' => 'Standard|G.1X|G.2X', ], // ... ], ], 'Name' => '<string>', 'TriggerDetails' => [ 'Trigger' => [ 'Actions' => [ [ 'Arguments' => ['<string>', ...], 'CrawlerName' => '<string>', 'JobName' => '<string>', 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'SecurityConfiguration' => '<string>', 'Timeout' => <integer>, ], // ... ], 'Description' => '<string>', 'Id' => '<string>', 'Name' => '<string>', 'Predicate' => [ 'Conditions' => [ [ 'CrawlState' => 'RUNNING|CANCELLING|CANCELLED|SUCCEEDED|FAILED', 'CrawlerName' => '<string>', 'JobName' => '<string>', 'LogicalOperator' => 'EQUALS', 'State' => 'STARTING|RUNNING|STOPPING|STOPPED|SUCCEEDED|FAILED|TIMEOUT', ], // ... 
], 'Logical' => 'AND|ANY', ], 'Schedule' => '<string>', 'State' => 'CREATING|CREATED|ACTIVATING|ACTIVATED|DEACTIVATING|DEACTIVATED|DELETING|UPDATING', 'Type' => 'SCHEDULED|CONDITIONAL|ON_DEMAND', 'WorkflowName' => '<string>', ], ], 'Type' => 'CRAWLER|JOB|TRIGGER', 'UniqueId' => '<string>', ], // ... ], ], 'Name' => '<string>', 'PreviousRunId' => '<string>', 'StartedOn' => <DateTime>, 'Statistics' => [ 'FailedActions' => <integer>, 'RunningActions' => <integer>, 'StoppedActions' => <integer>, 'SucceededActions' => <integer>, 'TimeoutActions' => <integer>, 'TotalActions' => <integer>, ], 'Status' => 'RUNNING|COMPLETED|STOPPING|STOPPED|ERROR', 'WorkflowRunId' => '<string>', 'WorkflowRunProperties' => ['<string>', ...], ], // ... ], ]
Result Details
Members
- NextToken
-
- Type: string
A continuation token, if not all requested workflow runs have been returned.
- Runs
-
- Type: Array of WorkflowRun structures
A list of workflow run metadata objects.
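Errors
-
InvalidInputException:
The input provided was not valid.
-
EntityNotFoundException:
A specified entity does not exist.
-
InternalServiceException:
An internal service error occurred.
-
OperationTimeoutException:
The operation timed out.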
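The NextToken member implies a paging loop when a workflow has more runs than fit in one response. The following is a minimal sketch of that loop, not SDK code: the helper name fetchAllWorkflowRuns, its callable-based signature, and the page size of 25 are all assumptions made for illustration. The request is passed in as a callable so that the loop can be exercised without network access.

```php
<?php
// Hypothetical helper (not part of the SDK): follows NextToken until the
// service stops returning one and collects every entry found under 'Runs'.
function fetchAllWorkflowRuns(callable $getWorkflowRuns, string $workflowName, int $pageSize = 25): array
{
    $params = ['Name' => $workflowName, 'MaxResults' => $pageSize];
    $runs = [];
    do {
        $result = $getWorkflowRuns($params);
        foreach ($result['Runs'] ?? [] as $run) {
            $runs[] = $run;
        }
        // Send the token back unchanged to request the next page.
        $token = $result['NextToken'] ?? null;
        $params['NextToken'] = $token;
    } while ($token !== null && $token !== '');

    return $runs;
}

// With a real client (assumed to be an Aws\Glue\GlueClient built elsewhere):
// $runs = fetchAllWorkflowRuns(fn (array $p) => $client->getWorkflowRuns($p), 'my-workflow');
```

The same loop shape should apply to the other operations in this section that accept and return NextToken, with the result key changed accordingly.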
ImportCatalogToGlue
$result = $client->importCatalogToGlue([/* ... */]);
$promise = $client->importCatalogToGlueAsync([/* ... */]);
Imports an existing Amazon Athena Data Catalog to AWS Glue.
Parameter Syntax
$result = $client->importCatalogToGlue([ 'CatalogId' => '<string>', ]);
Parameter Details
Members
Result Syntax
[]
Result Details
Errors
-
InternalServiceException:
An internal service error occurred.
-
OperationTimeoutException:
The operation timed out.
ListCrawlers
$result = $client->listCrawlers([/* ... */]);
$promise = $client->listCrawlersAsync([/* ... */]);
Retrieves the names of all crawler resources in this AWS account, or the resources with the specified tag. This operation allows you to see which resources are available in your account, and their names.
This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tag filtering, only resources with the tag are retrieved.
Parameter Syntax
$result = $client->listCrawlers([ 'MaxResults' => <integer>, 'NextToken' => '<string>', 'Tags' => ['<string>', ...], ]);
Parameter Details
Members
Result Syntax
[ 'CrawlerNames' => ['<string>', ...], 'NextToken' => '<string>', ]
Result Details
Members
Errors
-
OperationTimeoutException:
The operation timed out.
ListDevEndpoints
$result = $client->listDevEndpoints([/* ... */]);
$promise = $client->listDevEndpointsAsync([/* ... */]);
Retrieves the names of all DevEndpoint resources in this AWS account, or the resources with the specified tag. This operation allows you to see which resources are available in your account, and their names.
This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tag filtering, only resources with the tag are retrieved.
Parameter Syntax
$result = $client->listDevEndpoints([ 'MaxResults' => <integer>, 'NextToken' => '<string>', 'Tags' => ['<string>', ...], ]);
Parameter Details
Members
Result Syntax
[ 'DevEndpointNames' => ['<string>', ...], 'NextToken' => '<string>', ]
Result Details
Members
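Errors
-
InvalidInputException:
The input provided was not valid.
-
EntityNotFoundException:
A specified entity does not exist.
-
InternalServiceException:
An internal service error occurred.
-
OperationTimeoutException:
The operation timed out.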
ListJobs
$result = $client->listJobs([/* ... */]);
$promise = $client->listJobsAsync([/* ... */]);
Retrieves the names of all job resources in this AWS account, or the resources with the specified tag. This operation allows you to see which resources are available in your account, and their names.
This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tag filtering, only resources with the tag are retrieved.
Parameter Syntax
$result = $client->listJobs([ 'MaxResults' => <integer>, 'NextToken' => '<string>', 'Tags' => ['<string>', ...], ]);
Parameter Details
Members
Result Syntax
[ 'JobNames' => ['<string>', ...], 'NextToken' => '<string>', ]
Result Details
Members
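Errors
-
InvalidInputException:
The input provided was not valid.
-
EntityNotFoundException:
A specified entity does not exist.
-
InternalServiceException:
An internal service error occurred.
-
OperationTimeoutException:
The operation timed out.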
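The tag-filtered List operations (listCrawlers, listDevEndpoints, listJobs, listTriggers) accept the same three optional members. As a rough sketch, the request array can be assembled by a small helper that leaves out whatever was not supplied. The function name buildTagFilteredListParams and the tag shown in the usage comment are hypothetical.

```php
<?php
// Hypothetical helper: builds the parameter array shared by the tag-filtered
// List* operations, omitting any optional member that was not supplied.
function buildTagFilteredListParams(array $tags = [], ?int $maxResults = null, ?string $nextToken = null): array
{
    $params = [];
    if ($tags !== []) {
        // Tags is an associative array of tag key => tag value.
        $params['Tags'] = $tags;
    }
    if ($maxResults !== null) {
        $params['MaxResults'] = $maxResults;
    }
    if ($nextToken !== null) {
        $params['NextToken'] = $nextToken;
    }

    return $params;
}

// With a real client (assumed to be an Aws\Glue\GlueClient built elsewhere):
// $result = $client->listJobs(buildTagFilteredListParams(['team' => 'analytics'], 50));
// foreach ($result['JobNames'] as $name) { echo $name, PHP_EOL; }
```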
ListMLTransforms
$result = $client->listMLTransforms([/* ... */]);
$promise = $client->listMLTransformsAsync([/* ... */]);
Retrieves a sortable, filterable list of existing AWS Glue machine learning transforms in this AWS account, or the resources with the specified tag. This operation takes the optional Tags field, which you can use as a filter of the responses so that tagged resources can be retrieved as a group. If you choose to use tag filtering, only resources with the tags are retrieved.
Parameter Syntax
$result = $client->listMLTransforms([ 'Filter' => [ 'CreatedAfter' => <integer || string || DateTime>, 'CreatedBefore' => <integer || string || DateTime>, 'GlueVersion' => '<string>', 'LastModifiedAfter' => <integer || string || DateTime>, 'LastModifiedBefore' => <integer || string || DateTime>, 'Name' => '<string>', 'Schema' => [ [ 'DataType' => '<string>', 'Name' => '<string>', ], // ... ], 'Status' => 'NOT_READY|READY|DELETING', 'TransformType' => 'FIND_MATCHES', ], 'MaxResults' => <integer>, 'NextToken' => '<string>', 'Sort' => [ 'Column' => 'NAME|TRANSFORM_TYPE|STATUS|CREATED|LAST_MODIFIED', // REQUIRED 'SortDirection' => 'DESCENDING|ASCENDING', // REQUIRED ], 'Tags' => ['<string>', ...], ]);
Parameter Details
Members
- Filter
-
- Type: TransformFilterCriteria structure
A TransformFilterCriteria used to filter the machine learning transforms.
- MaxResults
-
- Type: int
The maximum size of a list to return.
- NextToken
-
- Type: string
A continuation token, if this is a continuation request.
- Sort
-
- Type: TransformSortCriteria structure
A TransformSortCriteria used to sort the machine learning transforms.
- Tags
-
- Type: Associative array of custom strings keys (TagKey) to strings
Specifies to return only these tagged resources.
Result Syntax
[ 'NextToken' => '<string>', 'TransformIds' => ['<string>', ...], ]
Result Details
Members
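Errors
-
EntityNotFoundException:
A specified entity does not exist.
-
InvalidInputException:
The input provided was not valid.
-
OperationTimeoutException:
The operation timed out.
-
InternalServiceException:
An internal service error occurred.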
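Filter and Sort can be combined in one request. The sketch below builds a request for READY transforms created after a given moment, newest first. The helper name buildReadyTransformsRequest and the default page size are assumptions; the enum values are taken from the parameter syntax above, and the timestamp is passed as an integer, which is one of the forms the syntax lists for CreatedAfter.

```php
<?php
// Hypothetical helper: assembles a listMLTransforms request that returns
// READY transforms created after a given moment, newest first.
function buildReadyTransformsRequest(DateTimeInterface $createdAfter, int $maxResults = 20): array
{
    return [
        'Filter' => [
            'Status' => 'READY',
            // Unix timestamp in seconds.
            'CreatedAfter' => $createdAfter->getTimestamp(),
        ],
        'Sort' => [
            // Both members of Sort are marked REQUIRED when Sort is present.
            'Column' => 'CREATED',
            'SortDirection' => 'DESCENDING',
        ],
        'MaxResults' => $maxResults,
    ];
}

// With a real client (assumed to be an Aws\Glue\GlueClient built elsewhere):
// $result = $client->listMLTransforms(buildReadyTransformsRequest(new DateTimeImmutable('-7 days')));
// $ids = $result['TransformIds'];
```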
ListRegistries
$result = $client->listRegistries([/* ... */]);
$promise = $client->listRegistriesAsync([/* ... */]);
Returns a list of registries that you have created, with minimal registry information. Registries in the Deleting status will not be included in the results. Empty results will be returned if there are no registries available.
Parameter Syntax
$result = $client->listRegistries([ 'MaxResults' => <integer>, 'NextToken' => '<string>', ]);
Parameter Details
Members
Result Syntax
[ 'NextToken' => '<string>', 'Registries' => [ [ 'CreatedTime' => '<string>', 'Description' => '<string>', 'RegistryArn' => '<string>', 'RegistryName' => '<string>', 'Status' => 'AVAILABLE|DELETING', 'UpdatedTime' => '<string>', ], // ... ], ]
Result Details
Members
- NextToken
-
- Type: string
A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.
- Registries
-
- Type: Array of RegistryListItem structures
An array of RegistryListItem objects containing minimal details of each registry.
Errors
-
InvalidInputException:
The input provided was not valid.
-
AccessDeniedException:
Access to a resource was denied.
-
InternalServiceException:
An internal service error occurred.
ListSchemaVersions
$result = $client->listSchemaVersions([/* ... */]);
$promise = $client->listSchemaVersionsAsync([/* ... */]);
Returns a list of schema versions that you have created, with minimal information. Schema versions in Deleted status will not be included in the results. Empty results will be returned if there are no schema versions available.
Parameter Syntax
$result = $client->listSchemaVersions([ 'MaxResults' => <integer>, 'NextToken' => '<string>', 'SchemaId' => [ // REQUIRED 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], ]);
Parameter Details
Members
- MaxResults
-
- Type: int
Maximum number of results required per page. If the value is not supplied, this will be defaulted to 25 per page.
- NextToken
-
- Type: string
A continuation token, if this is a continuation call.
- SchemaId
-
- Required: Yes
- Type: SchemaId structure
This is a wrapper structure to contain schema identity fields. The structure contains:
-
SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn or SchemaName and RegistryName has to be provided.
-
SchemaId$SchemaName: The name of the schema. Either SchemaArn or SchemaName and RegistryName has to be provided.
Result Syntax
[ 'NextToken' => '<string>', 'Schemas' => [ [ 'CreatedTime' => '<string>', 'SchemaArn' => '<string>', 'SchemaVersionId' => '<string>', 'Status' => 'AVAILABLE|PENDING|FAILURE|DELETING', 'VersionNumber' => <integer>, ], // ... ], ]
Result Details
Members
- NextToken
-
- Type: string
A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.
- Schemas
-
- Type: Array of SchemaVersionListItem structures
An array of SchemaVersionListItem objects containing details of each schema version.
Errors
-
InvalidInputException:
The input provided was not valid.
-
AccessDeniedException:
Access to a resource was denied.
-
EntityNotFoundException:
A specified entity does not exist.
-
InternalServiceException:
An internal service error occurred.
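The SchemaId rule (either SchemaArn, or SchemaName together with RegistryName) can be checked on the client side before a request is sent. The following is a sketch under that reading of the rule; the helper name makeSchemaId is hypothetical, and preferring the ARN when both forms are given is a choice made here, not documented behavior.

```php
<?php
// Hypothetical helper: returns a SchemaId array in one of the two accepted
// shapes, or throws if neither shape can be built from the arguments.
function makeSchemaId(?string $schemaArn = null, ?string $schemaName = null, ?string $registryName = null): array
{
    if ($schemaArn !== null) {
        // Assumption: when an ARN is given it is used on its own.
        return ['SchemaArn' => $schemaArn];
    }
    if ($schemaName !== null && $registryName !== null) {
        return ['SchemaName' => $schemaName, 'RegistryName' => $registryName];
    }

    throw new InvalidArgumentException('Provide either SchemaArn, or both SchemaName and RegistryName.');
}

// With a real client (assumed to be an Aws\Glue\GlueClient built elsewhere):
// $result = $client->listSchemaVersions(['SchemaId' => makeSchemaId(null, 'orders', 'sales-registry')]);
```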
ListSchemas
$result = $client->listSchemas([/* ... */]);
$promise = $client->listSchemasAsync([/* ... */]);
Returns a list of schemas with minimal details. Schemas in Deleting status will not be included in the results. Empty results will be returned if there are no schemas available.
When the RegistryId is not provided, all the schemas across registries will be part of the API response.
Parameter Syntax
$result = $client->listSchemas([ 'MaxResults' => <integer>, 'NextToken' => '<string>', 'RegistryId' => [ 'RegistryArn' => '<string>', 'RegistryName' => '<string>', ], ]);
Parameter Details
Members
- MaxResults
-
- Type: int
Maximum number of results required per page. If the value is not supplied, this will be defaulted to 25 per page.
- NextToken
-
- Type: string
A continuation token, if this is a continuation call.
- RegistryId
-
- Type: RegistryId structure
A wrapper structure that may contain the registry name and Amazon Resource Name (ARN).
Result Syntax
[ 'NextToken' => '<string>', 'Schemas' => [ [ 'CreatedTime' => '<string>', 'Description' => '<string>', 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', 'SchemaStatus' => 'AVAILABLE|PENDING|DELETING', 'UpdatedTime' => '<string>', ], // ... ], ]
Result Details
Members
- NextToken
-
- Type: string
A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.
- Schemas
-
- Type: Array of SchemaListItem structures
An array of SchemaListItem objects containing details of each schema.
Errors
-
InvalidInputException:
The input provided was not valid.
-
AccessDeniedException:
Access to a resource was denied.
-
EntityNotFoundException:
A specified entity does not exist.
-
InternalServiceException:
An internal service error occurred.
ListTriggers
$result = $client->listTriggers([/* ... */]);
$promise = $client->listTriggersAsync([/* ... */]);
Retrieves the names of all trigger resources in this AWS account, or the resources with the specified tag. This operation allows you to see which resources are available in your account, and their names.
This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tag filtering, only resources with the tag are retrieved.
Parameter Syntax
$result = $client->listTriggers([ 'DependentJobName' => '<string>', 'MaxResults' => <integer>, 'NextToken' => '<string>', 'Tags' => ['<string>', ...], ]);
Parameter Details
Members
- DependentJobName
-
- Type: string
The name of the job for which to retrieve triggers. The trigger that can start this job is returned. If there is no such trigger, all triggers are returned.
- MaxResults
-
- Type: int
The maximum size of a list to return.
- NextToken
-
- Type: string
A continuation token, if this is a continuation request.
- Tags
-
- Type: Associative array of custom strings keys (TagKey) to strings
Specifies to return only these tagged resources.
Result Syntax
[ 'NextToken' => '<string>', 'TriggerNames' => ['<string>', ...], ]
Result Details
Members
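Errors
-
EntityNotFoundException:
A specified entity does not exist.
-
InvalidInputException:
The input provided was not valid.
-
InternalServiceException:
An internal service error occurred.
-
OperationTimeoutException:
The operation timed out.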
ListWorkflows
$result = $client->listWorkflows([/* ... */]);
$promise = $client->listWorkflowsAsync([/* ... */]);
Lists names of workflows created in the account.
Parameter Syntax
$result = $client->listWorkflows([ 'MaxResults' => <integer>, 'NextToken' => '<string>', ]);
Parameter Details
Members
Result Syntax
[ 'NextToken' => '<string>', 'Workflows' => ['<string>', ...], ]
Result Details
Members
Errors
-
InvalidInputException:
The input provided was not valid.
-
InternalServiceException:
An internal service error occurred.
-
OperationTimeoutException:
The operation timed out.
PutDataCatalogEncryptionSettings
$result = $client->putDataCatalogEncryptionSettings([/* ... */]);
$promise = $client->putDataCatalogEncryptionSettingsAsync([/* ... */]);
Sets the security configuration for a specified catalog. After the configuration has been set, the specified encryption is applied to every catalog write thereafter.
Parameter Syntax
$result = $client->putDataCatalogEncryptionSettings([ 'CatalogId' => '<string>', 'DataCatalogEncryptionSettings' => [ // REQUIRED 'ConnectionPasswordEncryption' => [ 'AwsKmsKeyId' => '<string>', 'ReturnConnectionPasswordEncrypted' => true || false, // REQUIRED ], 'EncryptionAtRest' => [ 'CatalogEncryptionMode' => 'DISABLED|SSE-KMS', // REQUIRED 'SseAwsKmsKeyId' => '<string>', ], ], ]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog to set the security configuration for. If none is provided, the AWS account ID is used by default.
- DataCatalogEncryptionSettings
-
- Required: Yes
- Type: DataCatalogEncryptionSettings structure
The security configuration to set.
Result Syntax
[]
Result Details
Errors
-
InternalServiceException:
An internal service error occurred.
-
InvalidInputException:
The input provided was not valid.
-
OperationTimeoutException:
The operation timed out.
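As an illustration of the DataCatalogEncryptionSettings structure, the sketch below builds only the EncryptionAtRest part and switches between the two documented modes. The helper name buildEncryptionAtRestSettings and the key alias in the usage comment are hypothetical; ConnectionPasswordEncryption is left out on purpose to keep the example small.

```php
<?php
// Hypothetical helper: builds a DataCatalogEncryptionSettings array that
// either disables encryption at rest or enables SSE-KMS with the given key.
function buildEncryptionAtRestSettings(?string $kmsKeyId = null): array
{
    if ($kmsKeyId === null) {
        return ['EncryptionAtRest' => ['CatalogEncryptionMode' => 'DISABLED']];
    }

    return [
        'EncryptionAtRest' => [
            'CatalogEncryptionMode' => 'SSE-KMS',
            'SseAwsKmsKeyId' => $kmsKeyId,
        ],
    ];
}

// With a real client (assumed to be an Aws\Glue\GlueClient built elsewhere):
// $client->putDataCatalogEncryptionSettings([
//     'DataCatalogEncryptionSettings' => buildEncryptionAtRestSettings('alias/my-catalog-key'),
// ]);
```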
PutResourcePolicy
$result = $client->putResourcePolicy([/* ... */]);
$promise = $client->putResourcePolicyAsync([/* ... */]);
Sets the Data Catalog resource policy for access control.
Parameter Syntax
$result = $client->putResourcePolicy([ 'EnableHybrid' => 'TRUE|FALSE', 'PolicyExistsCondition' => 'MUST_EXIST|NOT_EXIST|NONE', 'PolicyHashCondition' => '<string>', 'PolicyInJson' => '<string>', // REQUIRED 'ResourceArn' => '<string>', ]);
Parameter Details
Members
- EnableHybrid
-
- Type: string
If 'TRUE', indicates that you are using both methods to grant cross-account access to Data Catalog resources:
-
By directly updating the resource policy with PutResourcePolicy
-
By using the Grant permissions command on the AWS Management Console.
Must be set to 'TRUE' if you have already used the Management Console to grant cross-account access, otherwise the call fails. Default is 'FALSE'.
- PolicyExistsCondition
-
- Type: string
A value of MUST_EXIST is used to update a policy. A value of NOT_EXIST is used to create a new policy. If a value of NONE or a null value is used, the call does not depend on the existence of a policy.
- PolicyHashCondition
-
- Type: string
The hash value returned when the previous policy was set using PutResourcePolicy. Its purpose is to prevent concurrent modifications of a policy. Do not use this parameter if no previous policy has been set.
- PolicyInJson
-
- Required: Yes
- Type: string
Contains the policy document to set, in JSON format.
- ResourceArn
-
- Type: string
Do not use. For internal use only.
Result Syntax
[ 'PolicyHash' => '<string>', ]
Result Details
Members
Errors
-
EntityNotFoundException:
A specified entity does not exist.
-
InternalServiceException:
An internal service error occurred.
-
OperationTimeoutException:
The operation timed out.
-
InvalidInputException:
The input provided was not valid.
-
ConditionCheckFailureException:
A specified condition was not satisfied.
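PolicyExistsCondition and PolicyHashCondition together allow a guarded create-or-update. The sketch below picks NOT_EXIST when no earlier hash is known and MUST_EXIST plus the hash otherwise. The helper name buildPutResourcePolicyParams is hypothetical, and the policy document (account IDs, Region, action) is a placeholder rather than a recommended policy.

```php
<?php
// Hypothetical helper: encodes the policy document and chooses the
// existence/hash conditions from whether a previous PolicyHash is known.
function buildPutResourcePolicyParams(array $policyDocument, ?string $previousHash = null): array
{
    $params = [
        'PolicyInJson' => json_encode($policyDocument, JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR),
    ];
    if ($previousHash === null) {
        // No earlier policy is assumed to exist, so require that none does.
        $params['PolicyExistsCondition'] = 'NOT_EXIST';
    } else {
        // Update only if the stored policy still has the hash we last saw.
        $params['PolicyExistsCondition'] = 'MUST_EXIST';
        $params['PolicyHashCondition'] = $previousHash;
    }

    return $params;
}

// Placeholder policy document; all identifiers below are made up.
$policy = [
    'Version' => '2012-10-17',
    'Statement' => [[
        'Effect' => 'Allow',
        'Principal' => ['AWS' => 'arn:aws:iam::111122223333:root'],
        'Action' => ['glue:GetTable'],
        'Resource' => ['arn:aws:glue:us-east-1:444455556666:catalog'],
    ]],
];

// With a real client (assumed to be an Aws\Glue\GlueClient built elsewhere):
// $result = $client->putResourcePolicy(buildPutResourcePolicyParams($policy));
// $hash = $result['PolicyHash']; // keep for the next update
```

If the stored policy changed in the meantime, the call would be expected to fail with ConditionCheckFailureException rather than overwrite the newer policy.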
PutSchemaVersionMetadata
$result = $client->putSchemaVersionMetadata([/* ... */]);
$promise = $client->putSchemaVersionMetadataAsync([/* ... */]);
Puts the metadata key value pair for a specified schema version ID. A maximum of 10 key value pairs will be allowed per schema version. They can be added over one or more calls.
Parameter Syntax
$result = $client->putSchemaVersionMetadata([ 'MetadataKeyValue' => [ // REQUIRED 'MetadataKey' => '<string>', 'MetadataValue' => '<string>', ], 'SchemaId' => [ 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], 'SchemaVersionId' => '<string>', 'SchemaVersionNumber' => [ 'LatestVersion' => true || false, 'VersionNumber' => <integer>, ], ]);
Parameter Details
Members
- MetadataKeyValue
-
- Required: Yes
- Type: MetadataKeyValuePair structure
The metadata key's corresponding value.
- SchemaId
-
- Type: SchemaId structure
The unique ID for the schema.
- SchemaVersionId
-
- Type: string
The unique version ID of the schema version.
- SchemaVersionNumber
-
- Type: SchemaVersionNumber structure
The version number of the schema.
Result Syntax
[ 'LatestVersion' => true || false, 'MetadataKey' => '<string>', 'MetadataValue' => '<string>', 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', 'SchemaVersionId' => '<string>', 'VersionNumber' => <integer>, ]
Result Details
Members
- LatestVersion
-
- Type: boolean
The latest version of the schema.
- MetadataKey
-
- Type: string
The metadata key.
- MetadataValue
-
- Type: string
The value of the metadata key.
- RegistryName
-
- Type: string
The name for the registry.
- SchemaArn
-
- Type: string
The Amazon Resource Name (ARN) for the schema.
- SchemaName
-
- Type: string
The name for the schema.
- SchemaVersionId
-
- Type: string
The unique version ID of the schema version.
- VersionNumber
-
- Type: long (int|float)
The version number of the schema.
Errors
-
InvalidInputException:
The input provided was not valid.
-
AccessDeniedException:
Access to a resource was denied.
-
AlreadyExistsException:
A resource to be created or added already exists.
-
EntityNotFoundException:
A specified entity does not exist.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
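Because each call carries a single MetadataKeyValue, attaching several pairs takes one request per pair. The sketch below prepares those requests and rejects a batch larger than the documented limit of 10. The helper name buildMetadataRequests is hypothetical, and the check covers only the batch passed in; pairs already stored on the schema version would count toward the limit as well.

```php
<?php
// Hypothetical helper: turns a key => value map into one
// putSchemaVersionMetadata request array per pair.
function buildMetadataRequests(string $schemaVersionId, array $metadata): array
{
    if (count($metadata) > 10) {
        throw new LengthException('At most 10 key-value pairs are allowed per schema version.');
    }
    $requests = [];
    foreach ($metadata as $key => $value) {
        $requests[] = [
            'SchemaVersionId' => $schemaVersionId,
            'MetadataKeyValue' => [
                'MetadataKey' => (string) $key,
                'MetadataValue' => (string) $value,
            ],
        ];
    }

    return $requests;
}

// With a real client (assumed to be an Aws\Glue\GlueClient built elsewhere):
// foreach (buildMetadataRequests($versionId, ['owner' => 'data-team']) as $request) {
//     $client->putSchemaVersionMetadata($request);
// }
```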
PutWorkflowRunProperties
$result = $client->putWorkflowRunProperties([/* ... */]);
$promise = $client->putWorkflowRunPropertiesAsync([/* ... */]);
Puts the specified workflow run properties for the given workflow run. If a property already exists for the specified run, its value is overwritten; otherwise, the property is added to the existing properties.
Parameter Syntax
$result = $client->putWorkflowRunProperties([ 'Name' => '<string>', // REQUIRED 'RunId' => '<string>', // REQUIRED 'RunProperties' => ['<string>', ...], // REQUIRED ]);
Parameter Details
Members
- Name
-
- Required: Yes
- Type: string
Name of the workflow which was run.
- RunId
-
- Required: Yes
- Type: string
The ID of the workflow run for which the run properties should be updated.
- RunProperties
-
- Required: Yes
- Type: Associative array of custom strings keys (IdString) to strings
The properties to put for the specified run.
Result Syntax
[]
Result Details
Errors
-
AlreadyExistsException:
A resource to be created or added already exists.
-
EntityNotFoundException:
A specified entity does not exist.
-
InvalidInputException:
The input provided was not valid.
-
InternalServiceException:
An internal service error occurred.
-
OperationTimeoutException:
The operation timed out.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
ConcurrentModificationException:
Two processes are trying to modify a resource simultaneously.
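The overwrite-or-add behavior described for this operation can be modeled locally, which may help when predicting what getWorkflowRunProperties will return after a call. This is only a local model under that description, not service code; the function name applyRunProperties is hypothetical.

```php
<?php
// Local model of the described behaviour: keys present in both arrays take
// the new value, new keys are added, and untouched keys stay as they were.
function applyRunProperties(array $existing, array $update): array
{
    // array_replace matches by key, so numeric-looking keys are not renumbered.
    return array_replace($existing, $update);
}

// With a real client (assumed to be an Aws\Glue\GlueClient built elsewhere):
// $client->putWorkflowRunProperties([
//     'Name' => 'my-workflow',
//     'RunId' => $runId,
//     'RunProperties' => ['stage' => 'load', 'batch' => '42'],
// ]);
```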
QuerySchemaVersionMetadata
$result = $client->querySchemaVersionMetadata([/* ... */]);
$promise = $client->querySchemaVersionMetadataAsync([/* ... */]);
Queries for the schema version metadata information.
Parameter Syntax
$result = $client->querySchemaVersionMetadata([ 'MaxResults' => <integer>, 'MetadataList' => [ [ 'MetadataKey' => '<string>', 'MetadataValue' => '<string>', ], // ... ], 'NextToken' => '<string>', 'SchemaId' => [ 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], 'SchemaVersionId' => '<string>', 'SchemaVersionNumber' => [ 'LatestVersion' => true || false, 'VersionNumber' => <integer>, ], ]);
Parameter Details
Members
- MaxResults
-
- Type: int
Maximum number of results required per page. If the value is not supplied, this will be defaulted to 25 per page.
- MetadataList
-
- Type: Array of MetadataKeyValuePair structures
Key-value pairs to search for in the metadata. If none are provided, all of the metadata information is fetched.
- NextToken
-
- Type: string
A continuation token, if this is a continuation call.
- SchemaId
-
- Type: SchemaId structure
A wrapper structure that may contain the schema name and Amazon Resource Name (ARN).
- SchemaVersionId
-
- Type: string
The unique version ID of the schema version.
- SchemaVersionNumber
-
- Type: SchemaVersionNumber structure
The version number of the schema.
Result Syntax
[ 'MetadataInfoMap' => [ '<MetadataKeyString>' => [ 'CreatedTime' => '<string>', 'MetadataValue' => '<string>', 'OtherMetadataValueList' => [ [ 'CreatedTime' => '<string>', 'MetadataValue' => '<string>', ], // ... ], ], // ... ], 'NextToken' => '<string>', 'SchemaVersionId' => '<string>', ]
Result Details
Members
- MetadataInfoMap
-
- Type: Associative array of custom strings keys (MetadataKeyString) to MetadataInfo structures
A map of a metadata key and associated values.
- NextToken
-
- Type: string
A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.
- SchemaVersionId
-
- Type: string
The unique version ID of the schema version.
Errors
-
InvalidInputException:
The input provided was not valid.
-
AccessDeniedException:
Access to a resource was denied.
-
EntityNotFoundException:
A specified entity does not exist.
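MetadataInfoMap nests additional values for a key under OtherMetadataValueList. A sketch of flattening that structure into a plain key => list-of-values map follows; the function name flattenMetadataInfoMap is hypothetical, and the member names are taken from the result syntax above.

```php
<?php
// Hypothetical helper: collapses MetadataInfoMap into
// metadata key => list of all values recorded for that key.
function flattenMetadataInfoMap(array $metadataInfoMap): array
{
    $flat = [];
    foreach ($metadataInfoMap as $key => $info) {
        $values = [$info['MetadataValue']];
        foreach ($info['OtherMetadataValueList'] ?? [] as $other) {
            $values[] = $other['MetadataValue'];
        }
        $flat[$key] = $values;
    }

    return $flat;
}

// With a real client (assumed to be an Aws\Glue\GlueClient built elsewhere):
// $result = $client->querySchemaVersionMetadata(['SchemaVersionId' => $versionId]);
// $byKey = flattenMetadataInfoMap($result['MetadataInfoMap'] ?? []);
```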
RegisterSchemaVersion
$result = $client->registerSchemaVersion([/* ... */]);
$promise = $client->registerSchemaVersionAsync([/* ... */]);
Adds a new version to the existing schema. Returns an error if the new version of the schema does not meet the compatibility requirements of the schema set. This API will not create a new schema set and will return a 404 error if the schema set is not already present in the Schema Registry.
If this is the first schema definition to be registered in the Schema Registry, this API will store the schema version and return immediately. Otherwise, this call has the potential to run longer than other operations due to compatibility modes. You can call the GetSchemaVersion API with the SchemaVersionId to check compatibility modes.
If the same schema definition is already stored in Schema Registry as a version, the schema ID of the existing schema is returned to the caller.
Parameter Syntax
$result = $client->registerSchemaVersion([ 'SchemaDefinition' => '<string>', // REQUIRED 'SchemaId' => [ // REQUIRED 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], ]);
Parameter Details
Members
- SchemaDefinition
-
- Required: Yes
- Type: string
The schema definition using the DataFormat setting for the SchemaName.
- SchemaId
-
- Required: Yes
- Type: SchemaId structure
This is a wrapper structure to contain schema identity fields. The structure contains:
-
SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn or SchemaName and RegistryName has to be provided.
-
SchemaId$SchemaName: The name of the schema. Either SchemaArn or SchemaName and RegistryName has to be provided.
Result Syntax
[ 'SchemaVersionId' => '<string>', 'Status' => 'AVAILABLE|PENDING|FAILURE|DELETING', 'VersionNumber' => <integer>, ]
Result Details
Members
Errors
-
InvalidInputException:
The input provided was not valid.
-
AccessDeniedException:
Access to a resource was denied.
-
EntityNotFoundException:
A specified entity does not exist.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
ConcurrentModificationException:
Two processes are trying to modify a resource simultaneously.
-
InternalServiceException:
An internal service error occurred.
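Since the result can come back with a Status of PENDING, a caller may want to poll GetSchemaVersion with the returned SchemaVersionId until the status settles. The sketch below shows one way to do that. The helper name waitForSchemaVersion, the attempt count, and the delay are assumptions, as is the expectation that the GetSchemaVersion result exposes a Status member with the same values as shown in the result syntax above.

```php
<?php
// Hypothetical helper: polls until the schema version leaves PENDING or the
// attempts run out, and returns the last status seen.
function waitForSchemaVersion(callable $getSchemaVersion, string $schemaVersionId, int $maxAttempts = 10, int $delaySeconds = 2): string
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $result = $getSchemaVersion(['SchemaVersionId' => $schemaVersionId]);
        $status = $result['Status'] ?? 'PENDING';
        if ($status !== 'PENDING') {
            return $status;
        }
        // Pause between attempts, but not after the final one.
        if ($attempt < $maxAttempts && $delaySeconds > 0) {
            sleep($delaySeconds);
        }
    }

    return 'PENDING';
}

// With a real client (assumed to be an Aws\Glue\GlueClient built elsewhere):
// $registered = $client->registerSchemaVersion([/* ... */]);
// $status = waitForSchemaVersion(
//     fn (array $p) => $client->getSchemaVersion($p),
//     $registered['SchemaVersionId']
// );
```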
RemoveSchemaVersionMetadata
$result = $client->removeSchemaVersionMetadata([/* ... */]);
$promise = $client->removeSchemaVersionMetadataAsync([/* ... */]);
Removes a key value pair from the schema version metadata for the specified schema version ID.
Parameter Syntax
$result = $client->removeSchemaVersionMetadata([ 'MetadataKeyValue' => [ // REQUIRED 'MetadataKey' => '<string>', 'MetadataValue' => '<string>', ], 'SchemaId' => [ 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], 'SchemaVersionId' => '<string>', 'SchemaVersionNumber' => [ 'LatestVersion' => true || false, 'VersionNumber' => <integer>, ], ]);
Parameter Details
Members
- MetadataKeyValue
-
- Required: Yes
- Type: MetadataKeyValuePair structure
The value of the metadata key.
- SchemaId
-
- Type: SchemaId structure
A wrapper structure that may contain the schema name and Amazon Resource Name (ARN).
- SchemaVersionId
-
- Type: string
The unique version ID of the schema version.
- SchemaVersionNumber
-
- Type: SchemaVersionNumber structure
The version number of the schema.
Result Syntax
[ 'LatestVersion' => true || false, 'MetadataKey' => '<string>', 'MetadataValue' => '<string>', 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', 'SchemaVersionId' => '<string>', 'VersionNumber' => <integer>, ]
Result Details
Members
- LatestVersion
-
- Type: boolean
The latest version of the schema.
- MetadataKey
-
- Type: string
The metadata key.
- MetadataValue
-
- Type: string
The value of the metadata key.
- RegistryName
-
- Type: string
The name of the registry.
- SchemaArn
-
- Type: string
The Amazon Resource Name (ARN) of the schema.
- SchemaName
-
- Type: string
The name of the schema.
- SchemaVersionId
-
- Type: string
The version ID for the schema version.
- VersionNumber
-
- Type: long (int|float)
The version number of the schema.
Errors
-
InvalidInputException:
The input provided was not valid.
-
AccessDeniedException:
Access to a resource was denied.
-
EntityNotFoundException:
A specified entity does not exist.
ResetJobBookmark
$result = $client->resetJobBookmark([/* ... */]);
$promise = $client->resetJobBookmarkAsync([/* ... */]);
Resets a bookmark entry.
Parameter Syntax
$result = $client->resetJobBookmark([ 'JobName' => '<string>', // REQUIRED 'RunId' => '<string>', ]);
Parameter Details
Members
Result Syntax
[ 'JobBookmarkEntry' => [ 'Attempt' => <integer>, 'JobBookmark' => '<string>', 'JobName' => '<string>', 'PreviousRunId' => '<string>', 'Run' => <integer>, 'RunId' => '<string>', 'Version' => <integer>, ], ]
Result Details
Members
- JobBookmarkEntry
-
- Type: JobBookmarkEntry structure
The reset bookmark entry.
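Errors
-
EntityNotFoundException:
A specified entity does not exist.
-
InvalidInputException:
The input provided was not valid.
-
InternalServiceException:
An internal service error occurred.
-
OperationTimeoutException:
The operation timed out.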
ResumeWorkflowRun
$result = $client->resumeWorkflowRun([/* ... */]);
$promise = $client->resumeWorkflowRunAsync([/* ... */]);
Restarts selected nodes of a previous partially completed workflow run and resumes the workflow run. The selected nodes and all nodes that are downstream from the selected nodes are run.
Parameter Syntax
$result = $client->resumeWorkflowRun([ 'Name' => '<string>', // REQUIRED 'NodeIds' => ['<string>', ...], // REQUIRED 'RunId' => '<string>', // REQUIRED ]);
Parameter Details
Members
- Name
-
- Required: Yes
- Type: string
The name of the workflow to resume.
- NodeIds
-
- Required: Yes
- Type: Array of strings
A list of the node IDs for the nodes you want to restart. The nodes that are to be restarted must have a run attempt in the original run.
- RunId
-
- Required: Yes
- Type: string
The ID of the workflow run to resume.
Result Syntax
[ 'NodeIds' => ['<string>', ...], 'RunId' => '<string>', ]
Result Details
Members
Errors
-
InvalidInputException:
The input provided was not valid.
-
EntityNotFoundException:
A specified entity does not exist.
-
InternalServiceException:
An internal service error occurred.
-
OperationTimeoutException:
The operation timed out.
-
ConcurrentRunsExceededException:
Too many jobs are being run concurrently.
-
IllegalWorkflowStateException:
The workflow is in an invalid state to perform a requested operation.
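NodeIds has to name nodes that had a run attempt in the original run. One way to derive such a list is to walk the Graph member of a workflow run (retrieved with IncludeGraph set to true) and pick job nodes that were attempted but never succeeded. The sketch below does that. The function name selectJobNodesToRestart and the selection rule are assumptions made for illustration; a real workflow may call for a different rule, and crawler nodes are ignored here.

```php
<?php
// Hypothetical helper: returns the UniqueId of every JOB node that has at
// least one job run recorded and no run in the SUCCEEDED state.
function selectJobNodesToRestart(array $graph): array
{
    $nodeIds = [];
    foreach ($graph['Nodes'] ?? [] as $node) {
        if (($node['Type'] ?? null) !== 'JOB') {
            continue;
        }
        $states = array_column($node['JobDetails']['JobRuns'] ?? [], 'JobRunState');
        if ($states !== [] && !in_array('SUCCEEDED', $states, true)) {
            $nodeIds[] = $node['UniqueId'];
        }
    }

    return $nodeIds;
}

// With a real client (assumed to be an Aws\Glue\GlueClient built elsewhere):
// $run = $client->getWorkflowRun(['Name' => $name, 'RunId' => $runId, 'IncludeGraph' => true])['Run'];
// $client->resumeWorkflowRun([
//     'Name' => $name,
//     'RunId' => $runId,
//     'NodeIds' => selectJobNodesToRestart($run['Graph']),
// ]);
```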
SearchTables
$result = $client->searchTables([/* ... */]);
$promise = $client->searchTablesAsync([/* ... */]);
Searches a set of tables based on properties in the table metadata as well as on the parent database. You can search against text or filter conditions.
You can only get tables that you have access to based on the security policies defined in Lake Formation. You need at least read-only access to the table for it to be returned. If you do not have access to all the columns in the table, these columns will not be searched against when returning the list of tables back to you. If you have access to the columns but not the data in the columns, those columns and the associated metadata for those columns will be included in the search.
Parameter Syntax
$result = $client->searchTables([ 'CatalogId' => '<string>', 'Filters' => [ [ 'Comparator' => 'EQUALS|GREATER_THAN|LESS_THAN|GREATER_THAN_EQUALS|LESS_THAN_EQUALS', 'Key' => '<string>', 'Value' => '<string>', ], // ... ], 'MaxResults' => <integer>, 'NextToken' => '<string>', 'ResourceShareType' => 'FOREIGN|ALL', 'SearchText' => '<string>', 'SortCriteria' => [ [ 'FieldName' => '<string>', 'Sort' => 'ASC|DESC', ], // ... ], ]);
Parameter Details
Members
- CatalogId
-
- Type: string
A unique identifier, consisting of account_id.
- Filters
-
- Type: Array of PropertyPredicate structures
A list of key-value pairs, and a comparator used to filter the search results. Returns all entities matching the predicate.
The Comparator member of the PropertyPredicate struct is used only for time fields, and can be omitted for other field types. Also, when comparing string values, such as when Key=Name, a fuzzy match algorithm is used. The Key field (for example, the value of the Name field) is split on certain punctuation characters, for example, -, :, #, etc. into tokens. Then each token is exact-match compared with the Value member of PropertyPredicate. For example, if Key=Name and Value=link, tables named customer-link and xx-link-yy are returned, but xxlinkyy is not returned.
- MaxResults
-
- Type: int
The maximum number of tables to return in a single response.
- NextToken
-
- Type: string
A continuation token, included if this is a continuation call.
- ResourceShareType
-
- Type: string
Allows you to specify that you want to search the tables shared with your account. The allowable values are FOREIGN or ALL.
- If set to FOREIGN, will search the tables shared with your account.
- If set to ALL, will search the tables shared with your account, as well as the tables in your local account.
- SearchText
-
- Type: string
A string used for a text search.
Specifying a value in quotes filters based on an exact match to the value.
- SortCriteria
-
- Type: Array of SortCriterion structures
A list of criteria for sorting the results by a field name, in an ascending or descending order.
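The token matching described under Filters for Key=Name can be approximated locally to predict which table names a given Value would match. The following is a minimal sketch, not the service's implementation: the helper name nameMatchesValue is hypothetical, the delimiter set is limited to the three characters the description lists (the service may split on others), and case-sensitive comparison is an assumption.

```php
<?php
// Hypothetical helper: approximates the token matching described for Key=Name.
// Assumption: only "-", ":" and "#" act as delimiters; the full set used by
// the service is not listed in this reference.
function nameMatchesValue(string $tableName, string $value): bool
{
    // Split the table name into tokens on the example punctuation characters.
    $tokens = preg_split('/[-:#]+/', $tableName, -1, PREG_SPLIT_NO_EMPTY);

    // Each token is compared for exact equality with the predicate value.
    return in_array($value, $tokens, true);
}

foreach (['customer-link', 'xx-link-yy', 'xxlinkyy'] as $name) {
    printf("%s: %s\n", $name, nameMatchesValue($name, 'link') ? 'match' : 'no match');
}
```

With Value=link, the first two names contain a token equal to link, while xxlinkyy is a single token and does not match, which mirrors the example in the Filters description.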
Result Syntax
[ 'NextToken' => '<string>', 'TableList' => [ [ 'CatalogId' => '<string>', 'CreateTime' => <DateTime>, 'CreatedBy' => '<string>', 'DatabaseName' => '<string>', 'Description' => '<string>', 'IsRegisteredWithLakeFormation' => true || false, 'LastAccessTime' => <DateTime>, 'LastAnalyzedTime' => <DateTime>, 'Name' => '<string>', 'Owner' => '<string>', 'Parameters' => ['<string>', ...], 'PartitionKeys' => [ [ 'Comment' => '<string>', 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'Type' => '<string>', ], // ... ], 'Retention' => <integer>, 'StorageDescriptor' => [ 'BucketColumns' => ['<string>', ...], 'Columns' => [ [ 'Comment' => '<string>', 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'Type' => '<string>', ], // ... ], 'Compressed' => true || false, 'InputFormat' => '<string>', 'Location' => '<string>', 'NumberOfBuckets' => <integer>, 'OutputFormat' => '<string>', 'Parameters' => ['<string>', ...], 'SchemaReference' => [ 'SchemaId' => [ 'RegistryName' => '<string>', 'SchemaArn' => '<string>', 'SchemaName' => '<string>', ], 'SchemaVersionId' => '<string>', 'SchemaVersionNumber' => <integer>, ], 'SerdeInfo' => [ 'Name' => '<string>', 'Parameters' => ['<string>', ...], 'SerializationLibrary' => '<string>', ], 'SkewedInfo' => [ 'SkewedColumnNames' => ['<string>', ...], 'SkewedColumnValueLocationMaps' => ['<string>', ...], 'SkewedColumnValues' => ['<string>', ...], ], 'SortColumns' => [ [ 'Column' => '<string>', 'SortOrder' => <integer>, ], // ... ], 'StoredAsSubDirectories' => true || false, ], 'TableType' => '<string>', 'TargetTable' => [ 'CatalogId' => '<string>', 'DatabaseName' => '<string>', 'Name' => '<string>', ], 'UpdateTime' => <DateTime>, 'ViewExpandedText' => '<string>', 'ViewOriginalText' => '<string>', ], // ... ], ]
Result Details
Members
- NextToken
-
- Type: string
A continuation token, present if the current list segment is not the last.
- TableList
-
- Type: Array of Table structures
A list of the requested Table objects. The SearchTables response returns only the tables that you have access to.
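Because NextToken is present whenever the current list segment is not the last, a caller that wants every matching table has to repeat the request. The sketch below shows one way to do that. The function name searchAllTables is hypothetical, and the request is passed in as a callable so the loop can be exercised without a live client.

```php
<?php
// Hypothetical helper: collects every page of a SearchTables-style response.
// $searchTables is any callable that accepts the request array and returns the
// result, for example: fn (array $p) => $client->searchTables($p)
function searchAllTables(callable $searchTables, array $params): array
{
    $tables = [];
    unset($params['NextToken']); // always start from the first segment

    while (true) {
        $result = $searchTables($params);

        foreach ($result['TableList'] ?? [] as $table) {
            $tables[] = $table;
        }

        $token = $result['NextToken'] ?? null;
        if ($token === null || $token === '') {
            return $tables; // no continuation token: this was the last segment
        }
        $params['NextToken'] = $token; // continue from where the service stopped
    }
}
```

For large catalogs it may be preferable to process each segment as it arrives instead of accumulating all tables in memory.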
Errors
-
An internal service error occurred.
-
The input provided was not valid.
-
The operation timed out.
StartCrawler
$result = $client->startCrawler([/* ... */]);
$promise = $client->startCrawlerAsync([/* ... */]);
Starts a crawl using the specified crawler, regardless of what is scheduled. If the crawler is already running, returns a CrawlerRunningException.
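Callers often want "already running" to be treated as a harmless outcome, not a failure. The following sketch shows one possible approach. The helper name startCrawlerIfIdle is hypothetical, and the request is passed in as a callable so the logic can be tried without a live client. It relies on the SDK's service exceptions exposing the error code through getAwsErrorCode().

```php
<?php
// Hypothetical helper: starts a crawler and reports whether a new crawl began.
// $startCrawler is any callable taking the request array, for example:
// fn (array $p) => $client->startCrawler($p)
function startCrawlerIfIdle(callable $startCrawler, string $name): bool
{
    try {
        $startCrawler(['Name' => $name]);
        return true;
    } catch (Throwable $e) {
        // SDK service exceptions expose the error code via getAwsErrorCode().
        $code = method_exists($e, 'getAwsErrorCode') ? $e->getAwsErrorCode() : null;
        if ($code === 'CrawlerRunningException') {
            return false; // already running: treat as a no-op
        }
        throw $e; // anything else is a real failure
    }
}
```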
Parameter Syntax
$result = $client->startCrawler([ 'Name' => '<string>', // REQUIRED ]);
Parameter Details
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
The operation cannot be performed because the crawler is already running.
-
The operation timed out.
StartCrawlerSchedule
$result = $client->startCrawlerSchedule([/* ... */]);
$promise = $client->startCrawlerScheduleAsync([/* ... */]);
Changes the schedule state of the specified crawler to SCHEDULED, unless the crawler is already running or the schedule state is already SCHEDULED.
Parameter Syntax
$result = $client->startCrawlerSchedule([ 'CrawlerName' => '<string>', // REQUIRED ]);
Parameter Details
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
The specified scheduler is already running.
-
SchedulerTransitioningException:
The specified scheduler is transitioning.
-
There is no applicable schedule.
-
The operation timed out.
StartExportLabelsTaskRun
$result = $client->startExportLabelsTaskRun([/* ... */]);
$promise = $client->startExportLabelsTaskRunAsync([/* ... */]);
Begins an asynchronous task to export all labeled data for a particular transform. This task is the only label-related API call that is not part of the typical active learning workflow. You typically use StartExportLabelsTaskRun when you want to work with all of your existing labels at the same time, such as when you want to remove or change labels that were previously submitted as truth. This API operation accepts the TransformId whose labels you want to export and an Amazon Simple Storage Service (Amazon S3) path to export the labels to. The operation returns a TaskRunId. You can check on the status of your task run by calling the GetMLTaskRun API.
Parameter Syntax
$result = $client->startExportLabelsTaskRun([ 'OutputS3Path' => '<string>', // REQUIRED 'TransformId' => '<string>', // REQUIRED ]);
Parameter Details
Members
Result Syntax
[ 'TaskRunId' => '<string>', ]
Result Details
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
The operation timed out.
-
An internal service error occurred.
StartImportLabelsTaskRun
$result = $client->startImportLabelsTaskRun([/* ... */]);
$promise = $client->startImportLabelsTaskRunAsync([/* ... */]);
Enables you to provide additional labels (examples of truth) to be used to teach the machine learning transform and improve its quality. This API operation is generally used as part of the active learning workflow that starts with the StartMLLabelingSetGenerationTaskRun call and that ultimately results in improving the quality of your machine learning transform.
After the StartMLLabelingSetGenerationTaskRun finishes, AWS Glue machine learning will have generated a series of questions for humans to answer. (Answering these questions is often called 'labeling' in machine learning workflows.) In the case of the FindMatches transform, these questions are of the form, “What is the correct way to group these rows together into groups composed entirely of matching records?” After the labeling process is finished, users upload their answers/labels with a call to StartImportLabelsTaskRun. After StartImportLabelsTaskRun finishes, all future runs of the machine learning transform use the new and improved labels and perform a higher-quality transformation.
By default, StartMLLabelingSetGenerationTaskRun continually learns from and combines all labels that you upload unless you set ReplaceAllLabels to true. If you set ReplaceAllLabels to true, StartImportLabelsTaskRun deletes and forgets all previously uploaded labels and learns only from the exact set that you upload. Replacing labels can be helpful if you realize that you previously uploaded incorrect labels, and you believe that they are having a negative effect on your transform quality.
You can check on the status of your task run by calling the GetMLTaskRun operation.
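Checking the status with GetMLTaskRun usually means polling until the task run reaches a final state. The sketch below is an illustration under stated assumptions: the helper name waitForTaskRun is hypothetical, the request is assumed to take TransformId and TaskRunId, the result is assumed to carry a Status member, and the set of terminal status values is assumed. Verify each against the GetMLTaskRun section before relying on it.

```php
<?php
// Hypothetical helper: polls a GetMLTaskRun-style callable until the task run
// reaches a final state. Assumed terminal states: SUCCEEDED, FAILED, STOPPED, TIMEOUT.
// $getMLTaskRun is any callable taking the request array, for example:
// fn (array $p) => $client->getMLTaskRun($p)
function waitForTaskRun(
    callable $getMLTaskRun,
    string $transformId,
    string $taskRunId,
    int $maxAttempts = 60,
    int $delaySeconds = 30
): string {
    $terminal = ['SUCCEEDED', 'FAILED', 'STOPPED', 'TIMEOUT'];

    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $result = $getMLTaskRun([
            'TransformId' => $transformId,
            'TaskRunId' => $taskRunId,
        ]);
        $status = $result['Status'] ?? 'UNKNOWN';

        if (in_array($status, $terminal, true)) {
            return $status;
        }
        if ($attempt < $maxAttempts) {
            sleep($delaySeconds); // wait before asking again
        }
    }

    throw new RuntimeException("Task run $taskRunId did not finish after $maxAttempts checks.");
}
```

The TaskRunId passed in is the value returned by StartImportLabelsTaskRun.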
Parameter Syntax
$result = $client->startImportLabelsTaskRun([ 'InputS3Path' => '<string>', // REQUIRED 'ReplaceAllLabels' => true || false, 'TransformId' => '<string>', // REQUIRED ]);
Parameter Details
Members
- InputS3Path
-
- Required: Yes
- Type: string
The Amazon Simple Storage Service (Amazon S3) path from where you import the labels.
- ReplaceAllLabels
-
- Type: boolean
Indicates whether to overwrite your existing labels.
- TransformId
-
- Required: Yes
- Type: string
The unique identifier of the machine learning transform.
Result Syntax
[ 'TaskRunId' => '<string>', ]
Result Details
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
The operation timed out.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
An internal service error occurred.
StartJobRun
$result = $client->startJobRun([/* ... */]);
$promise = $client->startJobRunAsync([/* ... */]);
Starts a job run using a job definition.
Parameter Syntax
$result = $client->startJobRun([ 'AllocatedCapacity' => <integer>, 'Arguments' => ['<string>', ...], 'JobName' => '<string>', // REQUIRED 'JobRunId' => '<string>', 'MaxCapacity' => <float>, 'NotificationProperty' => [ 'NotifyDelayAfter' => <integer>, ], 'NumberOfWorkers' => <integer>, 'SecurityConfiguration' => '<string>', 'Timeout' => <integer>, 'WorkerType' => 'Standard|G.1X|G.2X', ]);
Parameter Details
Members
- AllocatedCapacity
-
- Type: int
This field is deprecated. Use MaxCapacity instead.
The number of AWS Glue data processing units (DPUs) to allocate to this JobRun. From 2 to 100 DPUs can be allocated; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.
- Arguments
-
- Type: Associative array of custom strings keys (GenericString) to strings
The job arguments specifically for this run. For this job run, they replace the default arguments set in the job definition itself.
You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes.
For information about how to specify and consume your own Job arguments, see the Calling AWS Glue APIs in Python topic in the developer guide.
For information about the key-value pairs that AWS Glue consumes to set up your job, see the Special Parameters Used by AWS Glue topic in the developer guide.
- JobName
-
- Required: Yes
- Type: string
The name of the job definition to use.
- JobRunId
-
- Type: string
The ID of a previous JobRun to retry.
- MaxCapacity
-
- Type: double
The number of AWS Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.
Do not set MaxCapacity if using WorkerType and NumberOfWorkers.
The value that can be allocated for MaxCapacity depends on whether you are running a Python shell job, or an Apache Spark ETL job:
- When you specify a Python shell job (JobCommand.Name="pythonshell"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.
- When you specify an Apache Spark ETL job (JobCommand.Name="glueetl"), you can allocate from 2 to 100 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.
- NotificationProperty
-
- Type: NotificationProperty structure
Specifies configuration properties of a job run notification.
- NumberOfWorkers
-
- Type: int
The number of workers of a defined WorkerType that are allocated when a job runs.
The maximum number of workers you can define is 299 for G.1X, and 149 for G.2X.
- SecurityConfiguration
-
- Type: string
The name of the SecurityConfiguration structure to be used with this job run.
- Timeout
-
- Type: int
The JobRun timeout in minutes. This is the maximum time that a job run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours). This overrides the timeout value set in the parent job.
- WorkerType
-
- Type: string
The type of predefined worker that is allocated when a job runs. Accepts a value of Standard, G.1X, or G.2X.
- For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50GB disk, and 2 executors per worker.
- For the G.1X worker type, each worker provides 4 vCPU, 16 GB of memory and a 64GB disk, and 1 executor per worker.
- For the G.2X worker type, each worker provides 8 vCPU, 32 GB of memory and a 128GB disk, and 1 executor per worker.
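The capacity parameters above constrain one another: MaxCapacity should not be combined with WorkerType and NumberOfWorkers, and NumberOfWorkers has a per-type ceiling. The sketch below checks those documented rules before a request is sent. The helper name buildStartJobRunParams and the job name example-job are hypothetical, and the service may enforce further rules that this check does not cover.

```php
<?php
// Hypothetical helper: assembles StartJobRun parameters and rejects
// combinations that the parameter descriptions rule out.
function buildStartJobRunParams(string $jobName, array $options = []): array
{
    $workerType = $options['WorkerType'] ?? null;
    $workers = $options['NumberOfWorkers'] ?? null;

    // MaxCapacity must not be combined with WorkerType and NumberOfWorkers.
    if (isset($options['MaxCapacity']) && ($workerType !== null || $workers !== null)) {
        throw new InvalidArgumentException(
            'Set either MaxCapacity or WorkerType with NumberOfWorkers, not both.'
        );
    }

    // Documented upper bounds on NumberOfWorkers per worker type.
    $workerLimits = ['G.1X' => 299, 'G.2X' => 149];
    if ($workerType !== null && $workers !== null
        && isset($workerLimits[$workerType])
        && $workers > $workerLimits[$workerType]
    ) {
        throw new InvalidArgumentException(sprintf(
            'NumberOfWorkers may not exceed %d for %s.',
            $workerLimits[$workerType],
            $workerType
        ));
    }

    return ['JobName' => $jobName] + $options;
}

// Example: ten G.1X workers with a one-hour timeout.
$params = buildStartJobRunParams('example-job', [
    'WorkerType' => 'G.1X',
    'NumberOfWorkers' => 10,
    'Timeout' => 60,
]);
// $result = $client->startJobRun($params);
```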
Result Syntax
[ 'JobRunId' => '<string>', ]
Result Details
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
An internal service error occurred.
-
The operation timed out.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
ConcurrentRunsExceededException:
Too many jobs are being run concurrently.
StartMLEvaluationTaskRun
$result = $client->startMLEvaluationTaskRun([/* ... */]);
$promise = $client->startMLEvaluationTaskRunAsync([/* ... */]);
Starts a task to estimate the quality of the transform.
When you provide label sets as examples of truth, AWS Glue machine learning uses some of those examples to learn from them. The rest of the labels are used as a test to estimate quality.
Returns a unique identifier for the run. You can call GetMLTaskRun to get more information about the stats of the EvaluationTaskRun.
Parameter Syntax
$result = $client->startMLEvaluationTaskRun([ 'TransformId' => '<string>', // REQUIRED ]);
Parameter Details
Members
Result Syntax
[ 'TaskRunId' => '<string>', ]
Result Details
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
The operation timed out.
-
An internal service error occurred.
-
ConcurrentRunsExceededException:
Too many jobs are being run concurrently.
-
The machine learning transform is not ready to run.
StartMLLabelingSetGenerationTaskRun
$result = $client->startMLLabelingSetGenerationTaskRun([/* ... */]);
$promise = $client->startMLLabelingSetGenerationTaskRunAsync([/* ... */]);
Starts the active learning workflow for your machine learning transform to improve the transform's quality by generating label sets and adding labels.
When the StartMLLabelingSetGenerationTaskRun finishes, AWS Glue will have generated a "labeling set" or a set of questions for humans to answer.
In the case of the FindMatches transform, these questions are of the form, “What is the correct way to group these rows together into groups composed entirely of matching records?”
After the labeling process is finished, you can upload your labels with a call to StartImportLabelsTaskRun. After StartImportLabelsTaskRun finishes, all future runs of the machine learning transform will use the new and improved labels and perform a higher-quality transformation.
Parameter Syntax
$result = $client->startMLLabelingSetGenerationTaskRun([ 'OutputS3Path' => '<string>', // REQUIRED 'TransformId' => '<string>', // REQUIRED ]);
Parameter Details
Members
Result Syntax
[ 'TaskRunId' => '<string>', ]
Result Details
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
The operation timed out.
-
An internal service error occurred.
-
ConcurrentRunsExceededException:
Too many jobs are being run concurrently.
StartTrigger
$result = $client->startTrigger([/* ... */]);
$promise = $client->startTriggerAsync([/* ... */]);
Starts an existing trigger. See Triggering Jobs for information about how different types of trigger are started.
Parameter Syntax
$result = $client->startTrigger([ 'Name' => '<string>', // REQUIRED ]);
Parameter Details
Result Syntax
[ 'Name' => '<string>', ]
Result Details
Errors
-
The input provided was not valid.
-
An internal service error occurred.
-
A specified entity does not exist
-
The operation timed out.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
ConcurrentRunsExceededException:
Too many jobs are being run concurrently.
StartWorkflowRun
$result = $client->startWorkflowRun([/* ... */]);
$promise = $client->startWorkflowRunAsync([/* ... */]);
Starts a new run of the specified workflow.
Parameter Syntax
$result = $client->startWorkflowRun([ 'Name' => '<string>', // REQUIRED ]);
Parameter Details
Result Syntax
[ 'RunId' => '<string>', ]
Result Details
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
An internal service error occurred.
-
The operation timed out.
-
ResourceNumberLimitExceededException:
A resource numerical limit was exceeded.
-
ConcurrentRunsExceededException:
Too many jobs are being run concurrently.
StopCrawler
$result = $client->stopCrawler([/* ... */]);
$promise = $client->stopCrawlerAsync([/* ... */]);
If the specified crawler is running, stops the crawl.
Parameter Syntax
$result = $client->stopCrawler([ 'Name' => '<string>', // REQUIRED ]);
Parameter Details
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
The specified crawler is not running.
-
The specified crawler is stopping.
-
The operation timed out.
StopCrawlerSchedule
$result = $client->stopCrawlerSchedule([/* ... */]);
$promise = $client->stopCrawlerScheduleAsync([/* ... */]);
Sets the schedule state of the specified crawler to NOT_SCHEDULED, but does not stop the crawler if it is already running.
Parameter Syntax
$result = $client->stopCrawlerSchedule([ 'CrawlerName' => '<string>', // REQUIRED ]);
Parameter Details
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist
-
The specified scheduler is not running.
-
SchedulerTransitioningException:
The specified scheduler is transitioning.
-
The operation timed out.
StopTrigger
$result = $client->stopTrigger([/* ... */]);
$promise = $client->stopTriggerAsync([/* ... */]);
Stops a specified trigger.
Parameter Syntax
$result = $client->stopTrigger([ 'Name' => '<string>', // REQUIRED ]);
Parameter Details
Result Syntax
[ 'Name' => '<string>', ]
Result Details
Errors
-
The input provided was not valid.
-
An internal service error occurred.
-
A specified entity does not exist
-
The operation timed out.
-
ConcurrentModificationException:
Two processes are trying to modify a resource simultaneously.
StopWorkflowRun
$result = $client->stopWorkflowRun([/* ... */]);
$promise = $client->stopWorkflowRunAsync([/* ... */]);
Stops the execution of the specified workflow run.
Parameter Syntax
$result = $client->stopWorkflowRun([ 'Name' => '<string>', // REQUIRED 'RunId' => '<string>', // REQUIRED ]);
Parameter Details
Members
Result Syntax
[]
Result Details
Errors
-
The input provided was not valid.
-
A specified entity does not exist
-
An internal service error occurred.
-
The operation timed out.
-
IllegalWorkflowStateException:
The workflow is in an invalid state to perform a requested operation.
TagResource
$result = $client->tagResource([/* ... */]);
$promise = $client->tagResourceAsync([/* ... */]);
Adds tags to a resource. A tag is a label you can assign to an AWS resource. In AWS Glue, you can tag only certain resources. For information about what resources you can tag, see AWS Tags in AWS Glue.
Parameter Syntax
$result = $client->tagResource([ 'ResourceArn' => '<string>', // REQUIRED 'TagsToAdd' => ['<string>', ...], // REQUIRED ]);
Parameter Details
Members
- ResourceArn
-
- Required: Yes
- Type: string
The ARN of the AWS Glue resource to which to add the tags. For more information about AWS Glue resource ARNs, see the AWS Glue ARN string pattern.
- TagsToAdd
-
- Required: Yes
- Type: Associative array of custom strings keys (TagKey) to strings
Tags to add to this resource.
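When tags are managed declaratively, it can help to compute the TagResource and UntagResource requests from the difference between the current and desired tag sets. The sketch below is one way to do that. The helper name planTagChanges and the ARN in the example are hypothetical, TagsToRemove is assumed to be a list of tag keys, and re-sending a key with a new value is assumed to overwrite the old value.

```php
<?php
// Hypothetical helper: compares current and desired tags and returns the
// request arrays for TagResource and UntagResource (null when nothing to do).
function planTagChanges(string $resourceArn, array $current, array $desired): array
{
    // Keys whose value is new or changed must be (re)added.
    $toAdd = array_diff_assoc($desired, $current);
    // Keys that no longer appear in the desired set are removed.
    $toRemove = array_keys(array_diff_key($current, $desired));

    return [
        'tag' => $toAdd ? ['ResourceArn' => $resourceArn, 'TagsToAdd' => $toAdd] : null,
        'untag' => $toRemove ? ['ResourceArn' => $resourceArn, 'TagsToRemove' => $toRemove] : null,
    ];
}

$plan = planTagChanges(
    'arn:aws:glue:us-east-1:123456789012:job/example-job', // placeholder ARN
    ['env' => 'dev', 'team' => 'data'],
    ['env' => 'prod', 'owner' => 'etl']
);
// if ($plan['tag'])   { $client->tagResource($plan['tag']); }
// if ($plan['untag']) { $client->untagResource($plan['untag']); }
```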
Result Syntax
[]
Result Details
Errors
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
A specified entity does not exist
UntagResource
$result = $client->untagResource([/* ... */]);
$promise = $client->untagResourceAsync([/* ... */]);
Removes tags from a resource.
Parameter Syntax
$result = $client->untagResource([ 'ResourceArn' => '<string>', // REQUIRED 'TagsToRemove' => ['<string>', ...], // REQUIRED ]);
Parameter Details
Members
Result Syntax
[]
Result Details
Errors
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
A specified entity does not exist
UpdateClassifier
$result = $client->updateClassifier([/* ... */]);
$promise = $client->updateClassifierAsync([/* ... */]);
Modifies an existing classifier (a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field is present).
Parameter Syntax
$result = $client->updateClassifier([ 'CsvClassifier' => [ 'AllowSingleColumn' => true || false, 'ContainsHeader' => 'UNKNOWN|PRESENT|ABSENT', 'Delimiter' => '<string>', 'DisableValueTrimming' => true || false, 'Header' => ['<string>', ...], 'Name' => '<string>', // REQUIRED 'QuoteSymbol' => '<string>', ], 'GrokClassifier' => [ 'Classification' => '<string>', 'CustomPatterns' => '<string>', 'GrokPattern' => '<string>', 'Name' => '<string>', // REQUIRED ], 'JsonClassifier' => [ 'JsonPath' => '<string>', 'Name' => '<string>', // REQUIRED ], 'XMLClassifier' => [ 'Classification' => '<string>', 'Name' => '<string>', // REQUIRED 'RowTag' => '<string>', ], ]);
Parameter Details
Members
- CsvClassifier
-
- Type: UpdateCsvClassifierRequest structure
A CsvClassifier object with updated fields.
- GrokClassifier
-
- Type: UpdateGrokClassifierRequest structure
A GrokClassifier object with updated fields.
- JsonClassifier
-
- Type: UpdateJsonClassifierRequest structure
A JsonClassifier object with updated fields.
- XMLClassifier
-
- Type: UpdateXMLClassifierRequest structure
An XMLClassifier object with updated fields.
Result Syntax
[]
Result Details
Errors
-
The input provided was not valid.
-
There was a version conflict.
-
A specified entity does not exist
-
The operation timed out.
UpdateColumnStatisticsForPartition
$result = $client->updateColumnStatisticsForPartition([/* ... */]);
$promise = $client->updateColumnStatisticsForPartitionAsync([/* ... */]);
Creates or updates partition statistics of columns.
The Identity and Access Management (IAM) permission required for this operation is UpdatePartition.
Parameter Syntax
$result = $client->updateColumnStatisticsForPartition([ 'CatalogId' => '<string>', 'ColumnStatisticsList' => [ // REQUIRED [ 'AnalyzedTime' => <integer || string || DateTime>, // REQUIRED 'ColumnName' => '<string>', // REQUIRED 'ColumnType' => '<string>', // REQUIRED 'StatisticsData' => [ // REQUIRED 'BinaryColumnStatisticsData' => [ 'AverageLength' => <float>, // REQUIRED 'MaximumLength' => <integer>, // REQUIRED 'NumberOfNulls' => <integer>, // REQUIRED ], 'BooleanColumnStatisticsData' => [ 'NumberOfFalses' => <integer>, // REQUIRED 'NumberOfNulls' => <integer>, // REQUIRED 'NumberOfTrues' => <integer>, // REQUIRED ], 'DateColumnStatisticsData' => [ 'MaximumValue' => <integer || string || DateTime>, 'MinimumValue' => <integer || string || DateTime>, 'NumberOfDistinctValues' => <integer>, // REQUIRED 'NumberOfNulls' => <integer>, // REQUIRED ], 'DecimalColumnStatisticsData' => [ 'MaximumValue' => [ 'Scale' => <integer>, // REQUIRED 'UnscaledValue' => <string || resource || Psr\Http\Message\StreamInterface>, // REQUIRED ], 'MinimumValue' => [ 'Scale' => <integer>, // REQUIRED 'UnscaledValue' => <string || resource || Psr\Http\Message\StreamInterface>, // REQUIRED ], 'NumberOfDistinctValues' => <integer>, // REQUIRED 'NumberOfNulls' => <integer>, // REQUIRED ], 'DoubleColumnStatisticsData' => [ 'MaximumValue' => <float>, 'MinimumValue' => <float>, 'NumberOfDistinctValues' => <integer>, // REQUIRED 'NumberOfNulls' => <integer>, // REQUIRED ], 'LongColumnStatisticsData' => [ 'MaximumValue' => <integer>, 'MinimumValue' => <integer>, 'NumberOfDistinctValues' => <integer>, // REQUIRED 'NumberOfNulls' => <integer>, // REQUIRED ], 'StringColumnStatisticsData' => [ 'AverageLength' => <float>, // REQUIRED 'MaximumLength' => <integer>, // REQUIRED 'NumberOfDistinctValues' => <integer>, // REQUIRED 'NumberOfNulls' => <integer>, // REQUIRED ], 'Type' => 'BOOLEAN|DATE|DECIMAL|DOUBLE|LONG|STRING|BINARY', // REQUIRED ], ], // ... 
], 'DatabaseName' => '<string>', // REQUIRED 'PartitionValues' => ['<string>', ...], // REQUIRED 'TableName' => '<string>', // REQUIRED ]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.
- ColumnStatisticsList
-
- Required: Yes
- Type: Array of ColumnStatistics structures
A list of the column statistics.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database where the partitions reside.
- PartitionValues
-
- Required: Yes
- Type: Array of strings
A list of partition values identifying the partition.
- TableName
-
- Required: Yes
- Type: string
The name of the partitions' table.
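Each entry in ColumnStatisticsList pairs a Type with the matching type-specific data structure. As an illustration, the sketch below derives a LONG entry from a sample of values. The helper name longColumnStatistics is hypothetical, and the default ColumnType of bigint is an assumption about how the column is declared in the catalog.

```php
<?php
// Hypothetical helper: derives one ColumnStatistics entry for an integer
// column from sampled values (null marks a missing value).
function longColumnStatistics(string $columnName, array $values, string $columnType = 'bigint'): array
{
    $present = array_values(array_filter($values, function ($v) {
        return $v !== null;
    }));

    $data = [
        'NumberOfDistinctValues' => count(array_unique($present)),
        'NumberOfNulls' => count($values) - count($present),
    ];
    if ($present !== []) {
        // MinimumValue and MaximumValue are optional, so omit them for all-null input.
        $data['MinimumValue'] = min($present);
        $data['MaximumValue'] = max($present);
    }

    return [
        'AnalyzedTime' => time(), // Unix timestamp; a string or DateTime is also accepted
        'ColumnName' => $columnName,
        'ColumnType' => $columnType,
        'StatisticsData' => [
            'Type' => 'LONG',
            'LongColumnStatisticsData' => $data,
        ],
    ];
}

$entry = longColumnStatistics('order_count', [3, null, 7, 3]);
```

The resulting array can be placed in ColumnStatisticsList alongside DatabaseName, TableName, and PartitionValues.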
Result Syntax
[ 'Errors' => [ [ 'ColumnStatistics' => [ 'AnalyzedTime' => <DateTime>, 'ColumnName' => '<string>', 'ColumnType' => '<string>', 'StatisticsData' => [ 'BinaryColumnStatisticsData' => [ 'AverageLength' => <float>, 'MaximumLength' => <integer>, 'NumberOfNulls' => <integer>, ], 'BooleanColumnStatisticsData' => [ 'NumberOfFalses' => <integer>, 'NumberOfNulls' => <integer>, 'NumberOfTrues' => <integer>, ], 'DateColumnStatisticsData' => [ 'MaximumValue' => <DateTime>, 'MinimumValue' => <DateTime>, 'NumberOfDistinctValues' => <integer>, 'NumberOfNulls' => <integer>, ], 'DecimalColumnStatisticsData' => [ 'MaximumValue' => [ 'Scale' => <integer>, 'UnscaledValue' => <string || resource || Psr\Http\Message\StreamInterface>, ], 'MinimumValue' => [ 'Scale' => <integer>, 'UnscaledValue' => <string || resource || Psr\Http\Message\StreamInterface>, ], 'NumberOfDistinctValues' => <integer>, 'NumberOfNulls' => <integer>, ], 'DoubleColumnStatisticsData' => [ 'MaximumValue' => <float>, 'MinimumValue' => <float>, 'NumberOfDistinctValues' => <integer>, 'NumberOfNulls' => <integer>, ], 'LongColumnStatisticsData' => [ 'MaximumValue' => <integer>, 'MinimumValue' => <integer>, 'NumberOfDistinctValues' => <integer>, 'NumberOfNulls' => <integer>, ], 'StringColumnStatisticsData' => [ 'AverageLength' => <float>, 'MaximumLength' => <integer>, 'NumberOfDistinctValues' => <integer>, 'NumberOfNulls' => <integer>, ], 'Type' => 'BOOLEAN|DATE|DECIMAL|DOUBLE|LONG|STRING|BINARY', ], ], 'Error' => [ 'ErrorCode' => '<string>', 'ErrorMessage' => '<string>', ], ], // ... ], ]
Result Details
Members
- Errors
-
- Type: Array of ColumnStatisticsError structures
Errors that occurred while updating column statistics data.
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
An encryption operation failed.
UpdateColumnStatisticsForTable
$result = $client->updateColumnStatisticsForTable([/* ... */]);
$promise = $client->updateColumnStatisticsForTableAsync([/* ... */]);
Creates or updates table statistics of columns.
The Identity and Access Management (IAM) permission required for this operation is UpdateTable.
Parameter Syntax
$result = $client->updateColumnStatisticsForTable([ 'CatalogId' => '<string>', 'ColumnStatisticsList' => [ // REQUIRED [ 'AnalyzedTime' => <integer || string || DateTime>, // REQUIRED 'ColumnName' => '<string>', // REQUIRED 'ColumnType' => '<string>', // REQUIRED 'StatisticsData' => [ // REQUIRED 'BinaryColumnStatisticsData' => [ 'AverageLength' => <float>, // REQUIRED 'MaximumLength' => <integer>, // REQUIRED 'NumberOfNulls' => <integer>, // REQUIRED ], 'BooleanColumnStatisticsData' => [ 'NumberOfFalses' => <integer>, // REQUIRED 'NumberOfNulls' => <integer>, // REQUIRED 'NumberOfTrues' => <integer>, // REQUIRED ], 'DateColumnStatisticsData' => [ 'MaximumValue' => <integer || string || DateTime>, 'MinimumValue' => <integer || string || DateTime>, 'NumberOfDistinctValues' => <integer>, // REQUIRED 'NumberOfNulls' => <integer>, // REQUIRED ], 'DecimalColumnStatisticsData' => [ 'MaximumValue' => [ 'Scale' => <integer>, // REQUIRED 'UnscaledValue' => <string || resource || Psr\Http\Message\StreamInterface>, // REQUIRED ], 'MinimumValue' => [ 'Scale' => <integer>, // REQUIRED 'UnscaledValue' => <string || resource || Psr\Http\Message\StreamInterface>, // REQUIRED ], 'NumberOfDistinctValues' => <integer>, // REQUIRED 'NumberOfNulls' => <integer>, // REQUIRED ], 'DoubleColumnStatisticsData' => [ 'MaximumValue' => <float>, 'MinimumValue' => <float>, 'NumberOfDistinctValues' => <integer>, // REQUIRED 'NumberOfNulls' => <integer>, // REQUIRED ], 'LongColumnStatisticsData' => [ 'MaximumValue' => <integer>, 'MinimumValue' => <integer>, 'NumberOfDistinctValues' => <integer>, // REQUIRED 'NumberOfNulls' => <integer>, // REQUIRED ], 'StringColumnStatisticsData' => [ 'AverageLength' => <float>, // REQUIRED 'MaximumLength' => <integer>, // REQUIRED 'NumberOfDistinctValues' => <integer>, // REQUIRED 'NumberOfNulls' => <integer>, // REQUIRED ], 'Type' => 'BOOLEAN|DATE|DECIMAL|DOUBLE|LONG|STRING|BINARY', // REQUIRED ], ], // ... 
], 'DatabaseName' => '<string>', // REQUIRED 'TableName' => '<string>', // REQUIRED ]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.
- ColumnStatisticsList
-
- Required: Yes
- Type: Array of ColumnStatistics structures
A list of the column statistics.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database where the partitions reside.
- TableName
-
- Required: Yes
- Type: string
The name of the partitions' table.
Result Syntax
[ 'Errors' => [ [ 'ColumnStatistics' => [ 'AnalyzedTime' => <DateTime>, 'ColumnName' => '<string>', 'ColumnType' => '<string>', 'StatisticsData' => [ 'BinaryColumnStatisticsData' => [ 'AverageLength' => <float>, 'MaximumLength' => <integer>, 'NumberOfNulls' => <integer>, ], 'BooleanColumnStatisticsData' => [ 'NumberOfFalses' => <integer>, 'NumberOfNulls' => <integer>, 'NumberOfTrues' => <integer>, ], 'DateColumnStatisticsData' => [ 'MaximumValue' => <DateTime>, 'MinimumValue' => <DateTime>, 'NumberOfDistinctValues' => <integer>, 'NumberOfNulls' => <integer>, ], 'DecimalColumnStatisticsData' => [ 'MaximumValue' => [ 'Scale' => <integer>, 'UnscaledValue' => <string || resource || Psr\Http\Message\StreamInterface>, ], 'MinimumValue' => [ 'Scale' => <integer>, 'UnscaledValue' => <string || resource || Psr\Http\Message\StreamInterface>, ], 'NumberOfDistinctValues' => <integer>, 'NumberOfNulls' => <integer>, ], 'DoubleColumnStatisticsData' => [ 'MaximumValue' => <float>, 'MinimumValue' => <float>, 'NumberOfDistinctValues' => <integer>, 'NumberOfNulls' => <integer>, ], 'LongColumnStatisticsData' => [ 'MaximumValue' => <integer>, 'MinimumValue' => <integer>, 'NumberOfDistinctValues' => <integer>, 'NumberOfNulls' => <integer>, ], 'StringColumnStatisticsData' => [ 'AverageLength' => <float>, 'MaximumLength' => <integer>, 'NumberOfDistinctValues' => <integer>, 'NumberOfNulls' => <integer>, ], 'Type' => 'BOOLEAN|DATE|DECIMAL|DOUBLE|LONG|STRING|BINARY', ], ], 'Error' => [ 'ErrorCode' => '<string>', 'ErrorMessage' => '<string>', ], ], // ... ], ]
Result Details
Members
- Errors
-
- Type: Array of ColumnStatisticsError structures
List of ColumnStatisticsErrors.
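A call can succeed overall while individual columns fail, so the Errors member is worth inspecting after every update. The sketch below flattens it into a map from column name to error text. The helper name columnStatisticsFailures is hypothetical, and the fallback strings for missing fields are assumptions.

```php
<?php
// Hypothetical helper: maps each failed column to "ErrorCode: ErrorMessage".
// $result may be the SDK result object or a plain array with the same shape.
function columnStatisticsFailures($result): array
{
    $failures = [];

    foreach ($result['Errors'] ?? [] as $entry) {
        $column = $entry['ColumnStatistics']['ColumnName'] ?? '(unknown column)';
        $failures[$column] = sprintf(
            '%s: %s',
            $entry['Error']['ErrorCode'] ?? 'UnknownError',
            $entry['Error']['ErrorMessage'] ?? ''
        );
    }

    return $failures;
}
```

An empty return value means that no per-column errors were reported.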
Errors
-
A specified entity does not exist
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
An encryption operation failed.
UpdateConnection
$result = $client->updateConnection([/* ... */]);
$promise = $client->updateConnectionAsync([/* ... */]);
Updates a connection definition in the Data Catalog.
Parameter Syntax
$result = $client->updateConnection([ 'CatalogId' => '<string>', 'ConnectionInput' => [ // REQUIRED 'ConnectionProperties' => ['<string>', ...], // REQUIRED 'ConnectionType' => 'JDBC|SFTP|MONGODB|KAFKA|NETWORK|MARKETPLACE|CUSTOM', // REQUIRED 'Description' => '<string>', 'MatchCriteria' => ['<string>', ...], 'Name' => '<string>', // REQUIRED 'PhysicalConnectionRequirements' => [ 'AvailabilityZone' => '<string>', 'SecurityGroupIdList' => ['<string>', ...], 'SubnetId' => '<string>', ], ], 'Name' => '<string>', // REQUIRED ]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog in which the connection resides. If none is provided, the AWS account ID is used by default.
- ConnectionInput
-
- Required: Yes
- Type: ConnectionInput structure
A ConnectionInput object that redefines the connection in question.
- Name
-
- Required: Yes
- Type: string
The name of the connection definition to update.
Result Syntax
[]
Result Details
Errors
-
The input provided was not valid.
-
A specified entity does not exist.
-
The operation timed out.
-
An encryption operation failed.
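Example
A minimal sketch of assembling the UpdateConnection parameters for a JDBC connection. The helper name buildUpdateConnectionParams, the connection name, the JDBC URL, and the property key JDBC_CONNECTION_URL are illustrative assumptions rather than values taken from this page. ConnectionProperties is treated here as a key/value map, which is also an assumption: the syntax above only shows it as an array of strings. The client call itself is left as a comment because it needs the SDK and credentials.

```php
<?php
// Hypothetical helper (not part of the SDK): builds the parameter array
// that would be passed to $client->updateConnection().
function buildUpdateConnectionParams(string $name, string $jdbcUrl, array $extraProperties = []): array
{
    return [
        // Top-level 'Name' identifies the connection definition to update.
        'Name' => $name,
        'ConnectionInput' => [
            // The input structure carries its own required 'Name'.
            'Name' => $name,
            'ConnectionType' => 'JDBC',
            // Keys given explicitly here win over duplicates in $extraProperties.
            'ConnectionProperties' => ['JDBC_CONNECTION_URL' => $jdbcUrl] + $extraProperties,
        ],
    ];
}

$params = buildUpdateConnectionParams(
    'example-connection',
    'jdbc:mysql://db.example.com:3306/sales'
);

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->updateConnection($params);
```

The sketch reuses one name for both Name members; whether the two may differ is not stated on this page.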
UpdateCrawler
$result = $client->updateCrawler([/* ... */]);
$promise = $client->updateCrawlerAsync([/* ... */]);
Updates a crawler. If a crawler is running, you must stop it using StopCrawler before updating it.
Parameter Syntax
$result = $client->updateCrawler([
    'Classifiers' => ['<string>', ...],
    'Configuration' => '<string>',
    'CrawlerSecurityConfiguration' => '<string>',
    'DatabaseName' => '<string>',
    'Description' => '<string>',
    'LineageConfiguration' => [
        'CrawlerLineageSettings' => 'ENABLE|DISABLE',
    ],
    'Name' => '<string>', // REQUIRED
    'RecrawlPolicy' => [
        'RecrawlBehavior' => 'CRAWL_EVERYTHING|CRAWL_NEW_FOLDERS_ONLY',
    ],
    'Role' => '<string>',
    'Schedule' => '<string>',
    'SchemaChangePolicy' => [
        'DeleteBehavior' => 'LOG|DELETE_FROM_DATABASE|DEPRECATE_IN_DATABASE',
        'UpdateBehavior' => 'LOG|UPDATE_IN_DATABASE',
    ],
    'TablePrefix' => '<string>',
    'Targets' => [
        'CatalogTargets' => [
            [
                'DatabaseName' => '<string>', // REQUIRED
                'Tables' => ['<string>', ...], // REQUIRED
            ],
            // ...
        ],
        'DynamoDBTargets' => [
            [
                'Path' => '<string>',
                'scanAll' => true || false,
                'scanRate' => <float>,
            ],
            // ...
        ],
        'JdbcTargets' => [
            [
                'ConnectionName' => '<string>',
                'Exclusions' => ['<string>', ...],
                'Path' => '<string>',
            ],
            // ...
        ],
        'MongoDBTargets' => [
            [
                'ConnectionName' => '<string>',
                'Path' => '<string>',
                'ScanAll' => true || false,
            ],
            // ...
        ],
        'S3Targets' => [
            [
                'ConnectionName' => '<string>',
                'Exclusions' => ['<string>', ...],
                'Path' => '<string>',
            ],
            // ...
        ],
    ],
]);
Parameter Details
Members
- Classifiers
-
- Type: Array of strings
A list of custom classifiers that the user has registered. By default, all built-in classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification.
- Configuration
-
- Type: string
Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior. For more information, see Configuring a Crawler.
- CrawlerSecurityConfiguration
-
- Type: string
The name of the SecurityConfiguration structure to be used by this crawler.
- DatabaseName
-
- Type: string
The AWS Glue database where results are stored, such as: arn:aws:daylight:us-east-1::database/sometable/*.
- Description
-
- Type: string
A description of the crawler.
- LineageConfiguration
-
- Type: LineageConfiguration structure
Specifies data lineage configuration settings for the crawler.
- Name
-
- Required: Yes
- Type: string
The name of the crawler to update.
- RecrawlPolicy
-
- Type: RecrawlPolicy structure
A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
- Role
-
- Type: string
The IAM role, or the Amazon Resource Name (ARN) of an IAM role, that the crawler uses to access customer resources.
- Schedule
-
- Type: string
A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
- SchemaChangePolicy
-
- Type: SchemaChangePolicy structure
The policy for the crawler's update and deletion behavior.
- TablePrefix
-
- Type: string
The table prefix used for catalog tables that are created.
- Targets
-
- Type: CrawlerTargets structure
A list of targets to crawl.
Result Syntax
[]
Result Details
Errors
-
The input provided was not valid.
-
There was a version conflict.
-
A specified entity does not exist.
-
The operation cannot be performed because the crawler is already running.
-
The operation timed out.
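Example
A minimal sketch of building an UpdateCrawler request that changes the S3 targets, the schema change policy, and optionally the schedule. The helper name buildUpdateCrawlerParams, the crawler name, and the bucket path are hypothetical. The client calls are shown as comments; the Name parameter passed to stopCrawler is an assumption, since StopCrawler's parameters are not listed in this section.

```php
<?php
// Hypothetical helper (not part of the SDK): builds an updateCrawler()
// parameter array that targets a list of S3 paths.
function buildUpdateCrawlerParams(string $crawlerName, array $s3Paths, ?string $schedule = null): array
{
    $params = [
        'Name' => $crawlerName, // the only required member
        'Targets' => [
            // Each S3 target is a structure; only 'Path' is filled in here.
            'S3Targets' => array_map(
                fn (string $path): array => ['Path' => $path],
                array_values($s3Paths)
            ),
        ],
        'SchemaChangePolicy' => [
            'UpdateBehavior' => 'UPDATE_IN_DATABASE',
            'DeleteBehavior' => 'LOG',
        ],
    ];
    if ($schedule !== null) {
        $params['Schedule'] = $schedule;
    }
    return $params;
}

$params = buildUpdateCrawlerParams(
    'example-crawler',
    ['s3://example-bucket/raw/'],
    'cron(15 12 * * ? *)'
);

// With a configured Aws\Glue\GlueClient in $client. A running crawler must be
// stopped (and allowed to finish stopping) before it can be updated:
// $client->stopCrawler(['Name' => 'example-crawler']);
// $result = $client->updateCrawler($params);
```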
UpdateCrawlerSchedule
$result = $client->updateCrawlerSchedule([/* ... */]);
$promise = $client->updateCrawlerScheduleAsync([/* ... */]);
Updates the schedule of a crawler using a cron expression.
Parameter Syntax
$result = $client->updateCrawlerSchedule([
    'CrawlerName' => '<string>', // REQUIRED
    'Schedule' => '<string>',
]);
Parameter Details
Members
- CrawlerName
-
- Required: Yes
- Type: string
The name of the crawler whose schedule to update.
- Schedule
-
- Type: string
The updated cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist.
-
The input provided was not valid.
-
There was a version conflict.
-
SchedulerTransitioningException:
The specified scheduler is transitioning.
-
The operation timed out.
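Example
A minimal sketch of producing the Schedule string for a daily run, following the cron(15 12 * * ? *) example above. The helper name dailyCronExpression and the crawler name are hypothetical, and the helper covers only the "every day at a fixed UTC time" case.

```php
<?php
// Hypothetical helper (not part of the SDK): formats a daily schedule
// in the cron(...) form used by the Schedule parameter.
function dailyCronExpression(int $hourUtc, int $minute): string
{
    if ($hourUtc < 0 || $hourUtc > 23 || $minute < 0 || $minute > 59) {
        throw new InvalidArgumentException('Hour must be 0-23 and minute must be 0-59.');
    }
    // The minutes field comes first, then the hours field.
    return sprintf('cron(%d %d * * ? *)', $minute, $hourUtc);
}

$params = [
    'CrawlerName' => 'example-crawler',
    'Schedule' => dailyCronExpression(12, 15), // every day at 12:15 UTC
];

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->updateCrawlerSchedule($params);
```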
UpdateDatabase
$result = $client->updateDatabase([/* ... */]);
$promise = $client->updateDatabaseAsync([/* ... */]);
Updates an existing database definition in a Data Catalog.
Parameter Syntax
$result = $client->updateDatabase([
    'CatalogId' => '<string>',
    'DatabaseInput' => [ // REQUIRED
        'CreateTableDefaultPermissions' => [
            [
                'Permissions' => ['<string>', ...],
                'Principal' => [
                    'DataLakePrincipalIdentifier' => '<string>',
                ],
            ],
            // ...
        ],
        'Description' => '<string>',
        'LocationUri' => '<string>',
        'Name' => '<string>', // REQUIRED
        'Parameters' => ['<string>', ...],
        'TargetDatabase' => [
            'CatalogId' => '<string>',
            'DatabaseName' => '<string>',
        ],
    ],
    'Name' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog in which the metadata database resides. If none is provided, the AWS account ID is used by default.
- DatabaseInput
-
- Required: Yes
- Type: DatabaseInput structure
A DatabaseInput object specifying the new definition of the metadata database in the catalog.
- Name
-
- Required: Yes
- Type: string
The name of the database to update in the catalog. For Hive compatibility, this is folded to lowercase.
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist.
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
An encryption operation failed.
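Example
A minimal sketch of building UpdateDatabase parameters with the name folded to lowercase on the client side, mirroring the Hive-compatibility note above. The helper name buildUpdateDatabaseParams and the sample values are hypothetical. Using strtolower is an assumption about what "folded to lowercase" means, and so is keeping the two Name members identical.

```php
<?php
// Hypothetical helper (not part of the SDK): builds updateDatabase() parameters.
function buildUpdateDatabaseParams(string $name, string $description, array $parameters = []): array
{
    // The service folds the name to lowercase; doing the same locally keeps
    // later lookups by name consistent with what is stored.
    $folded = strtolower($name);

    $input = ['Name' => $folded, 'Description' => $description];
    if ($parameters !== []) {
        $input['Parameters'] = $parameters;
    }

    return ['Name' => $folded, 'DatabaseInput' => $input];
}

$params = buildUpdateDatabaseParams('Sales_DB', 'Curated sales data');

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->updateDatabase($params);
```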
UpdateDevEndpoint
$result = $client->updateDevEndpoint([/* ... */]);
$promise = $client->updateDevEndpointAsync([/* ... */]);
Updates a specified development endpoint.
Parameter Syntax
$result = $client->updateDevEndpoint([
    'AddArguments' => ['<string>', ...],
    'AddPublicKeys' => ['<string>', ...],
    'CustomLibraries' => [
        'ExtraJarsS3Path' => '<string>',
        'ExtraPythonLibsS3Path' => '<string>',
    ],
    'DeleteArguments' => ['<string>', ...],
    'DeletePublicKeys' => ['<string>', ...],
    'EndpointName' => '<string>', // REQUIRED
    'PublicKey' => '<string>',
    'UpdateEtlLibraries' => true || false,
]);
Parameter Details
Members
- AddArguments
-
- Type: Associative array of custom strings keys (GenericString) to strings
The map of arguments to add to the map of arguments used to configure the DevEndpoint.
Valid arguments are:
- "--enable-glue-datacatalog": ""
- "GLUE_PYTHON_VERSION": "3"
- "GLUE_PYTHON_VERSION": "2"
You can specify a version of Python support for development endpoints by using the Arguments parameter in the CreateDevEndpoint or UpdateDevEndpoint APIs. If no arguments are provided, the version defaults to Python 2.
- AddPublicKeys
-
- Type: Array of strings
The list of public keys for the DevEndpoint to use.
- CustomLibraries
-
- Type: DevEndpointCustomLibraries structure
Custom Python or Java libraries to be loaded in the DevEndpoint.
- DeleteArguments
-
- Type: Array of strings
The list of argument keys to be deleted from the map of arguments used to configure the DevEndpoint.
- DeletePublicKeys
-
- Type: Array of strings
The list of public keys to be deleted from the DevEndpoint.
- EndpointName
-
- Required: Yes
- Type: string
The name of the DevEndpoint to be updated.
- PublicKey
-
- Type: string
The public key for the DevEndpoint to use.
- UpdateEtlLibraries
-
- Type: boolean
True if the list of custom libraries to be loaded in the development endpoint needs to be updated, or False if otherwise.
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist.
-
An internal service error occurred.
-
The operation timed out.
-
The input provided was not valid.
-
A value could not be validated.
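Example
A minimal sketch of switching a development endpoint's Python version through AddArguments, using the GLUE_PYTHON_VERSION values listed above. The helper name buildPythonVersionUpdate and the endpoint name are hypothetical.

```php
<?php
// Hypothetical helper (not part of the SDK): builds updateDevEndpoint()
// parameters that set GLUE_PYTHON_VERSION.
function buildPythonVersionUpdate(string $endpointName, string $pythonVersion): array
{
    // Only "2" and "3" appear in the list of valid arguments.
    if (!in_array($pythonVersion, ['2', '3'], true)) {
        throw new InvalidArgumentException('GLUE_PYTHON_VERSION must be "2" or "3".');
    }

    return [
        'EndpointName' => $endpointName,
        // AddArguments is an associative array of argument name => value.
        'AddArguments' => ['GLUE_PYTHON_VERSION' => $pythonVersion],
    ];
}

$params = buildPythonVersionUpdate('example-endpoint', '3');

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->updateDevEndpoint($params);
```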
UpdateJob
$result = $client->updateJob([/* ... */]);
$promise = $client->updateJobAsync([/* ... */]);
Updates an existing job definition.
Parameter Syntax
$result = $client->updateJob([
    'JobName' => '<string>', // REQUIRED
    'JobUpdate' => [ // REQUIRED
        'AllocatedCapacity' => <integer>,
        'Command' => [
            'Name' => '<string>',
            'PythonVersion' => '<string>',
            'ScriptLocation' => '<string>',
        ],
        'Connections' => [
            'Connections' => ['<string>', ...],
        ],
        'DefaultArguments' => ['<string>', ...],
        'Description' => '<string>',
        'ExecutionProperty' => [
            'MaxConcurrentRuns' => <integer>,
        ],
        'GlueVersion' => '<string>',
        'LogUri' => '<string>',
        'MaxCapacity' => <float>,
        'MaxRetries' => <integer>,
        'NonOverridableArguments' => ['<string>', ...],
        'NotificationProperty' => [
            'NotifyDelayAfter' => <integer>,
        ],
        'NumberOfWorkers' => <integer>,
        'Role' => '<string>',
        'SecurityConfiguration' => '<string>',
        'Timeout' => <integer>,
        'WorkerType' => 'Standard|G.1X|G.2X',
    ],
]);
Parameter Details
Members
- JobName
-
- Required: Yes
- Type: string
The name of the job definition to update.
- JobUpdate
-
- Required: Yes
- Type: JobUpdate structure
Specifies the values with which to update the job definition.
Result Syntax
[
    'JobName' => '<string>',
]
Result Details
Errors
-
The input provided was not valid.
-
A specified entity does not exist.
-
An internal service error occurred.
-
The operation timed out.
-
ConcurrentModificationException:
Two processes are trying to modify a resource simultaneously.
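Example
A minimal sketch of building an UpdateJob request that sets the worker type and worker count. The helper name buildUpdateJobParams, the job name, the role name, the script location, and the command name 'glueetl' are illustrative assumptions and do not come from this page.

```php
<?php
// Hypothetical helper (not part of the SDK): builds updateJob() parameters.
function buildUpdateJobParams(
    string $jobName,
    string $role,
    string $scriptLocation,
    string $workerType,
    int $numberOfWorkers
): array {
    // Worker types as listed in the parameter syntax.
    if (!in_array($workerType, ['Standard', 'G.1X', 'G.2X'], true)) {
        throw new InvalidArgumentException("Unsupported WorkerType: $workerType");
    }

    return [
        'JobName' => $jobName,
        'JobUpdate' => [
            'Role' => $role,
            'Command' => [
                'Name' => 'glueetl', // assumed command name
                'ScriptLocation' => $scriptLocation,
            ],
            'WorkerType' => $workerType,
            'NumberOfWorkers' => $numberOfWorkers,
        ],
    ];
}

$params = buildUpdateJobParams(
    'example-job',
    'ExampleGlueRole',
    's3://example-bucket/scripts/job.py',
    'G.1X',
    10
);

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->updateJob($params);
// echo $result['JobName'];
```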
UpdateMLTransform
$result = $client->updateMLTransform([/* ... */]);
$promise = $client->updateMLTransformAsync([/* ... */]);
Updates an existing machine learning transform. Call this operation to tune the algorithm parameters to achieve better results.
After calling this operation, you can call the StartMLEvaluationTaskRun operation to assess how well your new parameters achieved your goals (such as improving the quality of your machine learning transform, or making it more cost-effective).
Parameter Syntax
$result = $client->updateMLTransform([
    'Description' => '<string>',
    'GlueVersion' => '<string>',
    'MaxCapacity' => <float>,
    'MaxRetries' => <integer>,
    'Name' => '<string>',
    'NumberOfWorkers' => <integer>,
    'Parameters' => [
        'FindMatchesParameters' => [
            'AccuracyCostTradeoff' => <float>,
            'EnforceProvidedLabels' => true || false,
            'PrecisionRecallTradeoff' => <float>,
            'PrimaryKeyColumnName' => '<string>',
        ],
        'TransformType' => 'FIND_MATCHES', // REQUIRED
    ],
    'Role' => '<string>',
    'Timeout' => <integer>,
    'TransformId' => '<string>', // REQUIRED
    'WorkerType' => 'Standard|G.1X|G.2X',
]);
Parameter Details
Members
- Description
-
- Type: string
A description of the transform. The default is an empty string.
- GlueVersion
-
- Type: string
This value determines which version of AWS Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see AWS Glue Versions in the developer guide.
- MaxCapacity
-
- Type: double
The number of AWS Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.
When the WorkerType field is set to a value other than Standard, the MaxCapacity field is set automatically and becomes read-only.
- MaxRetries
-
- Type: int
The maximum number of times to retry a task for this transform after a task run fails.
- Name
-
- Type: string
The unique name that you gave the transform when you created it.
- NumberOfWorkers
-
- Type: int
The number of workers of a defined workerType that are allocated when this task runs.
- Parameters
-
- Type: TransformParameters structure
The configuration parameters that are specific to the transform type (algorithm) used. Conditionally dependent on the transform type.
- Role
-
- Type: string
The name or Amazon Resource Name (ARN) of the IAM role with the required permissions.
- Timeout
-
- Type: int
The timeout for a task run for this transform in minutes. This is the maximum time that a task run for this transform can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).
- TransformId
-
- Required: Yes
- Type: string
A unique identifier that was generated when the transform was created.
- WorkerType
-
- Type: string
The type of predefined worker that is allocated when this task runs. Accepts a value of Standard, G.1X, or G.2X.
- For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50 GB disk, and 2 executors per worker.
- For the G.1X worker type, each worker provides 4 vCPU, 16 GB of memory and a 64 GB disk, and 1 executor per worker.
- For the G.2X worker type, each worker provides 8 vCPU, 32 GB of memory and a 128 GB disk, and 1 executor per worker.
Result Syntax
[
    'TransformId' => '<string>',
]
Result Details
Errors
-
A specified entity does not exist.
-
The input provided was not valid.
-
The operation timed out.
-
An internal service error occurred.
-
Access to a resource was denied.
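Example
A minimal sketch of client-side checks for the MaxCapacity rules described above: the 2 to 100 DPU range, and the note that MaxCapacity becomes read-only when WorkerType is not Standard. The helper name buildUpdateMLTransformParams and the transform ID are hypothetical, and rejecting the combination locally is a design choice of this sketch, not documented SDK behavior.

```php
<?php
// Hypothetical helper (not part of the SDK): builds updateMLTransform()
// parameters and applies the MaxCapacity constraints before sending.
function buildUpdateMLTransformParams(
    string $transformId,
    ?float $maxCapacity = null,
    ?string $workerType = null
): array {
    $params = ['TransformId' => $transformId];

    if ($workerType !== null) {
        $params['WorkerType'] = $workerType;
    }

    if ($maxCapacity !== null) {
        // MaxCapacity is set automatically for non-Standard worker types.
        if ($workerType !== null && $workerType !== 'Standard') {
            throw new InvalidArgumentException('MaxCapacity is read-only when WorkerType is not Standard.');
        }
        // Allowed range stated in the MaxCapacity description.
        if ($maxCapacity < 2 || $maxCapacity > 100) {
            throw new InvalidArgumentException('MaxCapacity must be between 2 and 100 DPUs.');
        }
        $params['MaxCapacity'] = $maxCapacity;
    }

    return $params;
}

$params = buildUpdateMLTransformParams('tfm-example-id', 20.0);

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->updateMLTransform($params);
```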
UpdatePartition
$result = $client->updatePartition([/* ... */]);
$promise = $client->updatePartitionAsync([/* ... */]);
Updates a partition.
Parameter Syntax
$result = $client->updatePartition([
    'CatalogId' => '<string>',
    'DatabaseName' => '<string>', // REQUIRED
    'PartitionInput' => [ // REQUIRED
        'LastAccessTime' => <integer || string || DateTime>,
        'LastAnalyzedTime' => <integer || string || DateTime>,
        'Parameters' => ['<string>', ...],
        'StorageDescriptor' => [
            'BucketColumns' => ['<string>', ...],
            'Columns' => [
                [
                    'Comment' => '<string>',
                    'Name' => '<string>', // REQUIRED
                    'Parameters' => ['<string>', ...],
                    'Type' => '<string>',
                ],
                // ...
            ],
            'Compressed' => true || false,
            'InputFormat' => '<string>',
            'Location' => '<string>',
            'NumberOfBuckets' => <integer>,
            'OutputFormat' => '<string>',
            'Parameters' => ['<string>', ...],
            'SchemaReference' => [
                'SchemaId' => [
                    'RegistryName' => '<string>',
                    'SchemaArn' => '<string>',
                    'SchemaName' => '<string>',
                ],
                'SchemaVersionId' => '<string>',
                'SchemaVersionNumber' => <integer>,
            ],
            'SerdeInfo' => [
                'Name' => '<string>',
                'Parameters' => ['<string>', ...],
                'SerializationLibrary' => '<string>',
            ],
            'SkewedInfo' => [
                'SkewedColumnNames' => ['<string>', ...],
                'SkewedColumnValueLocationMaps' => ['<string>', ...],
                'SkewedColumnValues' => ['<string>', ...],
            ],
            'SortColumns' => [
                [
                    'Column' => '<string>', // REQUIRED
                    'SortOrder' => <integer>, // REQUIRED
                ],
                // ...
            ],
            'StoredAsSubDirectories' => true || false,
        ],
        'Values' => ['<string>', ...],
    ],
    'PartitionValueList' => ['<string>', ...], // REQUIRED
    'TableName' => '<string>', // REQUIRED
]);
Parameter Details
Members
- CatalogId
-
- Type: string
The ID of the Data Catalog where the partition to be updated resides. If none is provided, the AWS account ID is used by default.
- DatabaseName
-
- Required: Yes
- Type: string
The name of the catalog database in which the table in question resides.
- PartitionInput
-
- Required: Yes
- Type: PartitionInput structure
The new partition object to update the partition to.
The Values property can't be changed. If you want to change the partition key values for a partition, delete and recreate the partition.
- PartitionValueList
-
- Required: Yes
- Type: Array of strings
List of partition key values that define the partition to update.
- TableName
-
- Required: Yes
- Type: string
The name of the table in which the partition to be updated is located.
Result Syntax
[]
Result Details
Errors
-
A specified entity does not exist.
-
The input provided was not valid.
-
An internal service error occurred.
-
The operation timed out.
-
An encryption operation failed.
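Example
A minimal sketch of building UpdatePartition parameters so that PartitionInput's Values always matches PartitionValueList, since Values can't be changed by an update. The helper name buildUpdatePartitionParams, the database and table names, and the sample Parameters entry are hypothetical; treating Parameters as a key/value map is an assumption.

```php
<?php
// Hypothetical helper (not part of the SDK): builds updatePartition() parameters.
function buildUpdatePartitionParams(
    string $database,
    string $table,
    array $partitionValues,
    array $partitionInput
): array {
    $values = array_values($partitionValues);

    // Values identifies the partition and cannot be changed, so it is forced
    // to equal the key values used to look the partition up.
    $partitionInput['Values'] = $values;

    return [
        'DatabaseName' => $database,
        'TableName' => $table,
        'PartitionValueList' => $values,
        'PartitionInput' => $partitionInput,
    ];
}

$params = buildUpdatePartitionParams(
    'sales_db',
    'orders',
    ['2021', '03'],
    ['Parameters' => ['reviewed' => 'true']]
);

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->updatePartition($params);
```

Other PartitionInput members, such as StorageDescriptor, would be passed in through the last argument as needed.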
UpdateRegistry
$result = $client->updateRegistry([/* ... */]);
$promise = $client->updateRegistryAsync([/* ... */]);
Updates an existing registry, which is used to hold a collection of schemas. The updated properties relate to the registry, and do not modify any of the schemas within the registry.
Parameter Syntax
$result = $client->updateRegistry([
    'Description' => '<string>', // REQUIRED
    'RegistryId' => [ // REQUIRED
        'RegistryArn' => '<string>',
        'RegistryName' => '<string>',
    ],
]);
Parameter Details
Members
- Description
-
- Required: Yes
- Type: string
A description of the registry. If a description is not provided, this field is not updated.
- RegistryId
-
- Required: Yes
- Type: RegistryId structure
This is a wrapper structure that may contain the registry name and Amazon Resource Name (ARN).
Result Syntax
[
    'RegistryArn' => '<string>',
    'RegistryName' => '<string>',
]
Result Details
Members
Errors
-
The input provided was not valid.
-
Access to a resource was denied.
-
A specified entity does not exist.
-
ConcurrentModificationException:
Two processes are trying to modify a resource simultaneously.
-
An internal service error occurred.
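Example
A minimal sketch of filling the RegistryId wrapper from either a registry name or an ARN. The helper name buildUpdateRegistryParams and the registry name are hypothetical, and treating any identifier that starts with "arn:" as an ARN is a simplifying assumption of this sketch.

```php
<?php
// Hypothetical helper (not part of the SDK): builds updateRegistry() parameters.
function buildUpdateRegistryParams(string $registry, string $description): array
{
    // Assumption: identifiers beginning with "arn:" are ARNs, anything else is a name.
    $registryId = strncmp($registry, 'arn:', 4) === 0
        ? ['RegistryArn' => $registry]
        : ['RegistryName' => $registry];

    return [
        'RegistryId' => $registryId,
        'Description' => $description,
    ];
}

$params = buildUpdateRegistryParams('example-registry', 'Schemas for the orders pipeline');

// With a configured Aws\Glue\GlueClient in $client:
// $result = $client->updateRegistry($params);
// echo $result['RegistryArn'];
```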
UpdateSchema
$result = $client->updateSchema([/* ... */]);
$promise = $client->updateSchemaAsync([/* ... */]);
Updates the description, compatibility setting, or version checkpoint for a schema set.
For updating the compatibility setting, the call will not validate compatibility for the entire set of schema versions with the new compatibility setting. If the value for Compatibility is provided, the VersionNumber (a checkpoint) is also required. The API will validate the checkpoint version number for consistency.
If the value for the VersionNumber (checkpoint) is provided, Compatibility is optional and this can be used to set/reset a checkpoint for the schema.
This update will happen only if the schema is in the AVAILABLE state.
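The dependency between Compatibility and VersionNumber described above can be checked before a request is sent. The following is a minimal sketch under stated assumptions: the helper name checkSchemaUpdate is hypothetical, it validates plain values rather than a full request array, and the list of modes is copied from the Compatibility values in the parameter syntax.

```php
<?php
// Hypothetical helper (not part of the SDK): checks the stated rule that
// Compatibility requires a VersionNumber checkpoint, while a VersionNumber
// may be supplied on its own.
function checkSchemaUpdate(?string $compatibility, ?int $versionNumber): bool
{
    $modes = [
        'NONE', 'DISABLED', 'BACKWARD', 'BACKWARD_ALL',
        'FORWARD', 'FORWARD_ALL', 'FULL', 'FULL_ALL',
    ];

    if ($compatibility !== null && !in_array($compatibility, $modes, true)) {
        throw new InvalidArgumentException("Unknown compatibility mode: $compatibility");
    }
    if ($compatibility !== null && $versionNumber === null) {
        throw new InvalidArgumentException('Compatibility requires a VersionNumber checkpoint.');
    }

    return true;
}

checkSchemaUpdate('BACKWARD', 3); // compatibility together with a checkpoint
checkSchemaUpdate(null, 3);       // checkpoint only
```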
Parameter Syntax
$result = $client->updateSchema([ 'Compatibility' => 'NONE|DISABLED|BACKWARD|BACKWARD_ALL|FORWARD|FORWARD_ALL|FULL|FULL_ALL', 'D