Class: Aws::Glue::Client

Inherits:
Seahorse::Client::Base
Includes:
ClientStubs
Defined in:
gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb

Overview

An API client for Glue. To construct a client, you need to configure a :region and :credentials.

client = Aws::Glue::Client.new(
  region: region_name,
  credentials: credentials,
  # ...
)

For details on configuring region and credentials see the developer guide.

See #initialize for a full list of supported configuration options.
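
As a minimal sketch of the construction step above, the options can be assembled as a plain Hash before they are passed to the constructor. The helper name glue_client_options and the values chosen here are illustrative assumptions, not part of the SDK. :stub_responses is enabled so the sketch makes no HTTP requests, and the require is guarded in case the aws-sdk-glue gem is not installed.

```ruby
# Hypothetical helper: collects constructor options in one place.
def glue_client_options(region:, stub: false)
  {
    region: region,          # used to determine the service endpoint
    stub_responses: stub,    # true => no HTTP requests are made
    retry_mode: "standard",  # one of "legacy", "standard", "adaptive"
    max_attempts: 5          # the initial attempt plus up to 4 retries
  }
end

options = glue_client_options(region: "us-east-1", stub: true)

begin
  require "aws-sdk-glue"
  client = Aws::Glue::Client.new(options)
  puts "client region: #{client.config.region}"
rescue LoadError
  # The gem is not available here; the options hash is still valid input.
  puts "aws-sdk-glue not installed; options: #{options.inspect}"
end
```

Keeping the options in a Hash makes it easy to share one configuration between a real client and a stubbed client in tests.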

Instance Attribute Summary

Attributes inherited from Seahorse::Client::Base

#config, #handlers

API Operations

Instance Method Summary

Methods included from ClientStubs

#api_requests, #stub_data, #stub_responses

Methods inherited from Seahorse::Client::Base

add_plugin, api, clear_plugins, define, new, #operation_names, plugins, remove_plugin, set_api, set_plugins

Methods included from Seahorse::Client::HandlerBuilder

#handle, #handle_request, #handle_response

Constructor Details

#initialize(options) ⇒ Client

Returns a new instance of Client.

Parameters:

  • options (Hash)

Options Hash (options):

  • :credentials (required, Aws::CredentialProvider)

    Your AWS credentials. This can be an instance of any one of the following classes:

    • Aws::Credentials - Used for configuring static, non-refreshing credentials.

    • Aws::SharedCredentials - Used for loading static credentials from a shared file, such as ~/.aws/config.

    • Aws::AssumeRoleCredentials - Used when you need to assume a role.

    • Aws::AssumeRoleWebIdentityCredentials - Used when you need to assume a role after providing credentials via the web.

    • Aws::SSOCredentials - Used for loading credentials from AWS SSO using an access token generated from aws login.

    • Aws::ProcessCredentials - Used for loading credentials from a process that outputs to stdout.

    • Aws::InstanceProfileCredentials - Used for loading credentials from an EC2 IMDS on an EC2 instance.

    • Aws::ECSCredentials - Used for loading credentials from instances running in ECS.

    • Aws::CognitoIdentityCredentials - Used for loading credentials from the Cognito Identity service.

    When :credentials are not configured directly, the following locations will be searched for credentials:

    • Aws.config[:credentials]
    • The :access_key_id, :secret_access_key, and :session_token options.
    • ENV['AWS_ACCESS_KEY_ID'], ENV['AWS_SECRET_ACCESS_KEY']
    • ~/.aws/credentials
    • ~/.aws/config
    • EC2/ECS IMDS instance profile - When used by default, the timeouts are very aggressive. Construct and pass an instance of Aws::InstanceProfileCredentials or Aws::ECSCredentials to enable retries and extended timeouts.
  • :region (required, String)

    The AWS region to connect to. The configured :region is used to determine the service :endpoint. When not passed, a default :region is searched for in the following locations:

    • Aws.config[:region]
    • ENV['AWS_REGION']
    • ENV['AMAZON_REGION']
    • ENV['AWS_DEFAULT_REGION']
    • ~/.aws/credentials
    • ~/.aws/config
  • :access_key_id (String)
  • :active_endpoint_cache (Boolean) — default: false

    When set to true, a thread polling for endpoints will be running in the background every 60 secs (default). Defaults to false.

  • :adaptive_retry_wait_to_fill (Boolean) — default: true

    Used only in adaptive retry mode. When true, the request will sleep until there is sufficient client-side capacity to retry the request. When false, the request will raise a RetryCapacityNotAvailableError and will not retry instead of sleeping.

  • :client_side_monitoring (Boolean) — default: false

    When true, client-side metrics will be collected for all API requests from this client.

  • :client_side_monitoring_client_id (String) — default: ""

    Allows you to provide an identifier for this client which will be attached to all generated client side metrics. Defaults to an empty string.

  • :client_side_monitoring_host (String) — default: "127.0.0.1"

    Allows you to specify the DNS hostname or IPv4 or IPv6 address that the client side monitoring agent is running on, where client metrics will be published via UDP.

  • :client_side_monitoring_port (Integer) — default: 31000

    Required for publishing client metrics. The port that the client side monitoring agent is running on, where client metrics will be published via UDP.

  • :client_side_monitoring_publisher (Aws::ClientSideMonitoring::Publisher) — default: Aws::ClientSideMonitoring::Publisher

    Allows you to provide a custom client-side monitoring publisher class. By default, will use the Client Side Monitoring Agent Publisher.

  • :convert_params (Boolean) — default: true

    When true, an attempt is made to coerce request parameters into the required types.

  • :correct_clock_skew (Boolean) — default: true

    Used only in standard and adaptive retry modes. Specifies whether to apply a clock skew correction and retry requests with skewed client clocks.

  • :disable_host_prefix_injection (Boolean) — default: false

    Set to true to stop the SDK from automatically adding a host prefix to the default service endpoint when one is available.

  • :endpoint (String)

    The client endpoint is normally constructed from the :region option. You should only configure an :endpoint when connecting to test or custom endpoints. This should be a valid HTTP(S) URI.

  • :endpoint_cache_max_entries (Integer) — default: 1000

    Used for the maximum size limit of the LRU cache storing endpoints data for endpoint discovery enabled operations. Defaults to 1000.

  • :endpoint_cache_max_threads (Integer) — default: 10

    The maximum number of threads used for polling endpoints to be cached. Defaults to 10.

  • :endpoint_cache_poll_interval (Integer) — default: 60

    When :endpoint_discovery and :active_endpoint_cache are enabled, use this option to configure the time interval in seconds between requests that fetch endpoint information. Defaults to 60 seconds.

  • :endpoint_discovery (Boolean) — default: false

    When set to true, endpoint discovery will be enabled for operations when available.

  • :log_formatter (Aws::Log::Formatter) — default: Aws::Log::Formatter.default

    The log formatter.

  • :log_level (Symbol) — default: :info

    The log level to send messages to the :logger at.

  • :logger (Logger)

    The Logger instance to send log messages to. If this option is not set, logging will be disabled.

  • :max_attempts (Integer) — default: 3

    An integer representing the maximum number of attempts that will be made for a single request, including the initial attempt. For example, setting this value to 5 will result in a request being retried up to 4 times. Used in standard and adaptive retry modes.

  • :profile (String) — default: "default"

    Used when loading credentials from the shared credentials file at ~/.aws/credentials. When not specified, 'default' is used.

  • :retry_backoff (Proc)

    A proc or lambda used for backoff. Defaults to 2**retries * retry_base_delay. This option is only used in the legacy retry mode.

  • :retry_base_delay (Float) — default: 0.3

    The base delay in seconds used by the default backoff function. This option is only used in the legacy retry mode.

  • :retry_jitter (Symbol) — default: :none

    A delay randomiser function used by the default backoff function. Some predefined functions can be referenced by name - :none, :equal, :full, otherwise a Proc that takes and returns a number. This option is only used in the legacy retry mode.

    See https://www.awsarchitectureblog.com/2015/03/backoff.html

  • :retry_limit (Integer) — default: 3

    The maximum number of times to retry failed requests. Only ~ 500 level server errors and certain ~ 400 level client errors are retried. Generally, these are throttling errors, data checksum errors, networking errors, timeout errors, auth errors, endpoint discovery, and errors from expired credentials. This option is only used in the legacy retry mode.

  • :retry_max_delay (Integer) — default: 0

    The maximum number of seconds to delay between retries (0 for no limit) used by the default backoff function. This option is only used in the legacy retry mode.

  • :retry_mode (String) — default: "legacy"

    Specifies which retry algorithm to use. Values are:

    • legacy - The pre-existing retry behavior. This is the default value if no retry mode is provided.

    • standard - A standardized set of retry rules across the AWS SDKs. This includes support for retry quotas, which limit the number of unsuccessful retries a client can make.

    • adaptive - An experimental retry mode that includes all the functionality of standard mode along with automatic client side throttling. This is a provisional mode that may change behavior in the future.

  • :secret_access_key (String)
  • :session_token (String)
  • :simple_json (Boolean) — default: false

    Disables request parameter conversion, validation, and formatting. Also disables response data type conversions. This option is useful when you want to ensure the highest level of performance by avoiding the overhead of walking request parameters and response data structures.

    When :simple_json is enabled, the request parameters hash must be formatted exactly as the Glue API expects.

  • :stub_responses (Boolean) — default: false

    Causes the client to return stubbed responses. By default fake responses are generated and returned. You can specify the response data to return or errors to raise by calling ClientStubs#stub_responses. See ClientStubs for more information.

    Please note: when response stubbing is enabled, no HTTP requests are made, and retries are disabled.

  • :validate_params (Boolean) — default: true

    When true, request parameters are validated before sending the request.

  • :http_proxy (URI::HTTP, String)

    A proxy to send requests through. Formatted like 'http://proxy.com:123'.

  • :http_open_timeout (Float) — default: 15

    The number of seconds to wait when opening an HTTP session before raising a Timeout::Error.

  • :http_read_timeout (Integer) — default: 60

    The default number of seconds to wait for response data. This value can safely be set per-request on the session.

  • :http_idle_timeout (Float) — default: 5

    The number of seconds a connection is allowed to sit idle before it is considered stale. Stale connections are closed and removed from the pool before making a request.

  • :http_continue_timeout (Float) — default: 1

    The number of seconds to wait for a 100-continue response before sending the request body. This option has no effect unless the request has "Expect" header set to "100-continue". Defaults to nil which disables this behaviour. This value can safely be set per request on the session.

  • :http_wire_trace (Boolean) — default: false

    When true, HTTP debug output will be sent to the :logger.

  • :ssl_verify_peer (Boolean) — default: true

    When true, SSL peer certificates are verified when establishing a connection.

  • :ssl_ca_bundle (String)

    Full path to the SSL certificate authority bundle file that should be used when verifying peer certificates. If you do not pass :ssl_ca_bundle or :ssl_ca_directory, the system default will be used if available.

  • :ssl_ca_directory (String)

    Full path of the directory that contains the unbundled SSL certificate authority files for verifying peer certificates. If you do not pass :ssl_ca_bundle or :ssl_ca_directory, the system default will be used if available.
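
The legacy backoff options above can be illustrated with a small calculation. This is a sketch of the documented default formula, 2**retries * retry_base_delay, capped by :retry_max_delay when that value is non-zero. It is not the SDK's own implementation: jitter is left out (as with :none), the retry counter is assumed to start at 0, and the method name legacy_backoff_delay is hypothetical.

```ruby
# Sketch of the documented legacy backoff formula (no jitter).
# retries is assumed to count from 0 for the first retry.
def legacy_backoff_delay(retries, base_delay: 0.3, max_delay: 0)
  delay = (2**retries) * base_delay
  # A max_delay of 0 means "no limit", as described for :retry_max_delay.
  max_delay.positive? ? [delay, max_delay].min : delay
end

# Delays in seconds before each retry under the default :retry_limit of 3.
delays = (0...3).map { |n| legacy_backoff_delay(n) }
puts delays.inspect
```

Setting max_delay to a positive number shows how the cap flattens the exponential growth once the computed delay exceeds it.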



# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 334

def initialize(*args)
  super
end

Instance Method Details

#batch_create_partition(params = {}) ⇒ Types::BatchCreatePartitionResponse

Creates one or more partitions in a batch operation.

Examples:

Request syntax with placeholder values


resp = client.batch_create_partition({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  table_name: "NameString", # required
  partition_input_list: [ # required
    {
      values: ["ValueString"],
      last_access_time: Time.now,
      storage_descriptor: {
        columns: [
          {
            name: "NameString", # required
            type: "ColumnTypeString",
            comment: "CommentString",
            parameters: {
              "KeyString" => "ParametersMapValue",
            },
          },
        ],
        location: "LocationString",
        input_format: "FormatString",
        output_format: "FormatString",
        compressed: false,
        number_of_buckets: 1,
        serde_info: {
          name: "NameString",
          serialization_library: "NameString",
          parameters: {
            "KeyString" => "ParametersMapValue",
          },
        },
        bucket_columns: ["NameString"],
        sort_columns: [
          {
            column: "NameString", # required
            sort_order: 1, # required
          },
        ],
        parameters: {
          "KeyString" => "ParametersMapValue",
        },
        skewed_info: {
          skewed_column_names: ["NameString"],
          skewed_column_values: ["ColumnValuesString"],
          skewed_column_value_location_maps: {
            "ColumnValuesString" => "ColumnValuesString",
          },
        },
        stored_as_sub_directories: false,
      },
      parameters: {
        "KeyString" => "ParametersMapValue",
      },
      last_analyzed_time: Time.now,
    },
  ],
})

Response structure


resp.errors #=> Array
resp.errors[0].partition_values #=> Array
resp.errors[0].partition_values[0] #=> String
resp.errors[0].error_detail.error_code #=> String
resp.errors[0].error_detail.error_message #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the catalog in which the partition is to be created. Currently, this should be the AWS account ID.

  • :database_name (required, String)

    The name of the metadata database in which the partition is to be created.

  • :table_name (required, String)

    The name of the metadata table in which the partition is to be created.

  • :partition_input_list (required, Array<Types::PartitionInput>)

    A list of PartitionInput structures that define the partitions to be created.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 434

def batch_create_partition(params = {}, options = {})
  req = build_request(:batch_create_partition, params)
  req.send_request(options)
end
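
The request syntax above is large, but a typical call needs only the partition values and a storage location per partition. The following sketch builds that minimal input and sends it in slices. The helper names, the Hive-style key=value path layout, and the default batch_size of 100 are assumptions made for illustration; check the service limits before relying on a batch size. The client argument is any object that responds to batch_create_partition, such as an Aws::Glue::Client.

```ruby
# Hypothetical helper: one PartitionInput hash per list of partition values.
def partition_inputs(base_location, partition_keys, value_rows)
  value_rows.map do |values|
    # Hive-style path segment such as "year=2024/month=01" (an assumption).
    suffix = partition_keys.zip(values).map { |k, v| "#{k}=#{v}" }.join("/")
    {
      values: values,
      storage_descriptor: { location: "#{base_location}/#{suffix}" }
    }
  end
end

# Sends the inputs in slices and returns every reported error.
def create_partitions(client, database:, table:, inputs:, batch_size: 100)
  errors = []
  inputs.each_slice(batch_size) do |slice|
    resp = client.batch_create_partition(
      database_name: database,
      table_name: table,
      partition_input_list: slice
    )
    errors.concat(resp.errors.to_a)
  end
  errors
end

inputs = partition_inputs(
  "s3://example-bucket/events", %w[year month], [%w[2024 01], %w[2024 02]]
)
```

Each element of the returned errors array carries partition_values and error_detail, as shown in the response structure above.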

#batch_delete_connection(params = {}) ⇒ Types::BatchDeleteConnectionResponse

Deletes a list of connection definitions from the Data Catalog.

Examples:

Request syntax with placeholder values


resp = client.batch_delete_connection({
  catalog_id: "CatalogIdString",
  connection_name_list: ["NameString"], # required
})

Response structure


resp.succeeded #=> Array
resp.succeeded[0] #=> String
resp.errors #=> Hash
resp.errors["NameString"].error_code #=> String
resp.errors["NameString"].error_message #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog in which the connections reside. If none is provided, the AWS account ID is used by default.

  • :connection_name_list (required, Array<String>)

    A list of names of the connections to delete.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 472

def batch_delete_connection(params = {}, options = {})
  req = build_request(:batch_delete_connection, params)
  req.send_request(options)
end
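
Because resp.succeeded is an Array and resp.errors is a Hash keyed by connection name, the two need to be handled differently. A minimal sketch of summarizing the outcome follows; the method name delete_connections and the summary format are hypothetical, and client is any object that responds to batch_delete_connection.

```ruby
# Deletes the named connections and summarizes the result.
def delete_connections(client, names, catalog_id: nil)
  params = { connection_name_list: names }
  params[:catalog_id] = catalog_id if catalog_id

  resp = client.batch_delete_connection(params)

  # resp.errors maps each failed connection name to its error detail.
  failures = resp.errors.to_h.map do |name, detail|
    "#{name}: #{detail.error_code} (#{detail.error_message})"
  end

  { deleted: resp.succeeded.to_a, failures: failures }
end
```

Omitting :catalog_id from the request lets the service fall back to the AWS account ID, as described in the parameter list.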

#batch_delete_partition(params = {}) ⇒ Types::BatchDeletePartitionResponse

Deletes one or more partitions in a batch operation.

Examples:

Request syntax with placeholder values


resp = client.batch_delete_partition({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  table_name: "NameString", # required
  partitions_to_delete: [ # required
    {
      values: ["ValueString"], # required
    },
  ],
})

Response structure


resp.errors #=> Array
resp.errors[0].partition_values #=> Array
resp.errors[0].partition_values[0] #=> String
resp.errors[0].error_detail.error_code #=> String
resp.errors[0].error_detail.error_message #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog where the partition to be deleted resides. If none is provided, the AWS account ID is used by default.

  • :database_name (required, String)

    The name of the catalog database in which the table in question resides.

  • :table_name (required, String)

    The name of the table that contains the partitions to be deleted.

  • :partitions_to_delete (required, Array<Types::PartitionValueList>)

    A list of PartitionValueList structures that identify the partitions to be deleted.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 523

def batch_delete_partition(params = {}, options = {})
  req = build_request(:batch_delete_partition, params)
  req.send_request(options)
end
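
Each entry in :partitions_to_delete wraps one list of partition values. As a small sketch, the request parameters can be derived from plain arrays; the helper name and the database and table names below are hypothetical, and the values are converted to strings on the assumption that the source data may hold integers.

```ruby
# Wraps each row of partition values in the { values: [...] } shape.
def partitions_to_delete(value_rows)
  value_rows.map { |values| { values: values.map(&:to_s) } }
end

params = {
  database_name: "analytics", # hypothetical database name
  table_name: "events",       # hypothetical table name
  partitions_to_delete: partitions_to_delete([[2024, 1], [2024, 2]])
}

# With a configured client, the call and error handling would look like:
# resp = client.batch_delete_partition(params)
# resp.errors.each do |e|
#   warn "#{e.partition_values.join('/')}: #{e.error_detail.error_code}"
# end
```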

#batch_delete_table(params = {}) ⇒ Types::BatchDeleteTableResponse

Deletes multiple tables at once.

After completing this operation, you no longer have access to the table versions and partitions that belong to the deleted table. AWS Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.

To ensure the immediate deletion of all related resources, before calling BatchDeleteTable, use DeleteTableVersion or BatchDeleteTableVersion, and DeletePartition or BatchDeletePartition, to delete any resources that belong to the table.

Examples:

Request syntax with placeholder values


resp = client.batch_delete_table({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  tables_to_delete: ["NameString"], # required
})

Response structure


resp.errors #=> Array
resp.errors[0].table_name #=> String
resp.errors[0].error_detail.error_code #=> String
resp.errors[0].error_detail.error_message #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog where the table resides. If none is provided, the AWS account ID is used by default.

  • :database_name (required, String)

    The name of the catalog database in which the tables to delete reside. For Hive compatibility, this name is entirely lowercase.

  • :tables_to_delete (required, Array<String>)

    A list of the tables to delete.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 577

def batch_delete_table(params = {}, options = {})
  req = build_request(:batch_delete_table, params)
  req.send_request(options)
end
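
The ordering recommended in the description, deleting table versions and partitions before the table itself, can be sketched as below. The method name purge_table is hypothetical, the version IDs and partition values are assumed to have been collected beforehand, and error handling and batch size limits are omitted. client is any object that responds to the three batch operations.

```ruby
# Deletes dependent resources first, then the table, in the order
# suggested by the BatchDeleteTable description.
def purge_table(client, database:, table:, version_ids:, partition_values:)
  unless version_ids.empty?
    client.batch_delete_table_version(
      database_name: database, table_name: table, version_ids: version_ids
    )
  end

  unless partition_values.empty?
    client.batch_delete_partition(
      database_name: database,
      table_name: table,
      partitions_to_delete: partition_values.map { |v| { values: v } }
    )
  end

  client.batch_delete_table(database_name: database, tables_to_delete: [table])
end
```

Each of the three responses exposes an errors array; a production version would inspect those before moving on to the next step.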

#batch_delete_table_version(params = {}) ⇒ Types::BatchDeleteTableVersionResponse

Deletes a specified batch of versions of a table.

Examples:

Request syntax with placeholder values


resp = client.batch_delete_table_version({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  table_name: "NameString", # required
  version_ids: ["VersionString"], # required
})

Response structure


resp.errors #=> Array
resp.errors[0].table_name #=> String
resp.errors[0].version_id #=> String
resp.errors[0].error_detail.error_code #=> String
resp.errors[0].error_detail.error_message #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog where the tables reside. If none is provided, the AWS account ID is used by default.

  • :database_name (required, String)

    The database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.

  • :table_name (required, String)

    The name of the table. For Hive compatibility, this name is entirely lowercase.

  • :version_ids (required, Array<String>)

    A list of the IDs of versions to be deleted. A VersionId is a string representation of an integer. Each version is incremented by 1.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 625

def batch_delete_table_version(params = {}, options = {})
  req = build_request(:batch_delete_table_version, params)
  req.send_request(options)
end
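
Since a VersionId is a string representation of an integer, version IDs should be compared numerically rather than as strings. The sketch below picks the IDs to pass as :version_ids while keeping the newest few; the method name stale_version_ids and the keep default are assumptions made for illustration.

```ruby
# Returns the version IDs to delete, keeping the `keep` highest-numbered ones.
def stale_version_ids(version_ids, keep: 3)
  # Sort numerically: as strings, "10" would sort before "2".
  sorted = version_ids.sort_by { |id| Integer(id) }
  sorted.first([sorted.size - keep, 0].max)
end

# The result can be passed as :version_ids, for example:
# client.batch_delete_table_version(
#   database_name: "analytics",   # hypothetical
#   table_name: "events",         # hypothetical
#   version_ids: stale_version_ids(all_ids, keep: 2)
# )
```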

#batch_get_crawlers(params = {}) ⇒ Types::BatchGetCrawlersResponse

Returns a list of resource metadata for a given list of crawler names. After calling the ListCrawlers operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.

Examples:

Request syntax with placeholder values


resp = client.batch_get_crawlers({
  crawler_names: ["NameString"], # required
})

Response structure


resp.crawlers #=> Array
resp.crawlers[0].name #=> String
resp.crawlers[0].role #=> String
resp.crawlers[0].targets.s3_targets #=> Array
resp.crawlers[0].targets.s3_targets[0].path #=> String
resp.crawlers[0].targets.s3_targets[0].exclusions #=> Array
resp.crawlers[0].targets.s3_targets[0].exclusions[0] #=> String
resp.crawlers[0].targets.s3_targets[0].connection_name #=> String
resp.crawlers[0].targets.jdbc_targets #=> Array
resp.crawlers[0].targets.jdbc_targets[0].connection_name #=> String
resp.crawlers[0].targets.jdbc_targets[0].path #=> String
resp.crawlers[0].targets.jdbc_targets[0].exclusions #=> Array
resp.crawlers[0].targets.jdbc_targets[0].exclusions[0] #=> String
resp.crawlers[0].targets.mongo_db_targets #=> Array
resp.crawlers[0].targets.mongo_db_targets[0].connection_name #=> String
resp.crawlers[0].targets.mongo_db_targets[0].path #=> String
resp.crawlers[0].targets.mongo_db_targets[0].scan_all #=> Boolean
resp.crawlers[0].targets.dynamo_db_targets #=> Array
resp.crawlers[0].targets.dynamo_db_targets[0].path #=> String
resp.crawlers[0].targets.dynamo_db_targets[0].scan_all #=> Boolean
resp.crawlers[0].targets.dynamo_db_targets[0].scan_rate #=> Float
resp.crawlers[0].targets.catalog_targets #=> Array
resp.crawlers[0].targets.catalog_targets[0].database_name #=> String
resp.crawlers[0].targets.catalog_targets[0].tables #=> Array
resp.crawlers[0].targets.catalog_targets[0].tables[0] #=> String
resp.crawlers[0].database_name #=> String
resp.crawlers[0].description #=> String
resp.crawlers[0].classifiers #=> Array
resp.crawlers[0].classifiers[0] #=> String
resp.crawlers[0].recrawl_policy.recrawl_behavior #=> String, one of "CRAWL_EVERYTHING", "CRAWL_NEW_FOLDERS_ONLY"
resp.crawlers[0].schema_change_policy.update_behavior #=> String, one of "LOG", "UPDATE_IN_DATABASE"
resp.crawlers[0].schema_change_policy.delete_behavior #=> String, one of "LOG", "DELETE_FROM_DATABASE", "DEPRECATE_IN_DATABASE"
resp.crawlers[0].state #=> String, one of "READY", "RUNNING", "STOPPING"
resp.crawlers[0].table_prefix #=> String
resp.crawlers[0].schedule.schedule_expression #=> String
resp.crawlers[0].schedule.state #=> String, one of "SCHEDULED", "NOT_SCHEDULED", "TRANSITIONING"
resp.crawlers[0].crawl_elapsed_time #=> Integer
resp.crawlers[0].creation_time #=> Time
resp.crawlers[0].last_updated #=> Time
resp.crawlers[0].last_crawl.status #=> String, one of "SUCCEEDED", "CANCELLED", "FAILED"
resp.crawlers[0].last_crawl.error_message #=> String
resp.crawlers[0].last_crawl.log_group #=> String
resp.crawlers[0].last_crawl.log_stream #=> String
resp.crawlers[0].last_crawl.message_prefix #=> String
resp.crawlers[0].last_crawl.start_time #=> Time
resp.crawlers[0].version #=> Integer
resp.crawlers[0].configuration #=> String
resp.crawlers[0].crawler_security_configuration #=> String
resp.crawlers_not_found #=> Array
resp.crawlers_not_found[0] #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :crawler_names (required, Array<String>)

    A list of crawler names, which might be the names returned from the ListCrawlers operation.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 708

def batch_get_crawlers(params = {}, options = {})
  req = build_request(:batch_get_crawlers, params)
  req.send_request(options)
end
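
A response from this operation can be reduced to a quick status overview. The following sketch groups crawler names by their state field and reports the names that were not found; the method name crawler_report and the shape of the returned Hash are hypothetical, and resp is any object with the crawlers and crawlers_not_found readers shown in the response structure above.

```ruby
# Groups crawler names by state and lists the names that were not found.
def crawler_report(resp)
  by_state = resp.crawlers
                 .group_by(&:state)
                 .transform_values { |crawlers| crawlers.map(&:name) }

  { by_state: by_state, missing: resp.crawlers_not_found.to_a }
end

# Example call with a configured client (crawler names are hypothetical):
# resp = client.batch_get_crawlers(crawler_names: ["nightly", "adhoc"])
# report = crawler_report(resp)
```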

#batch_get_dev_endpoints(params = {}) ⇒ Types::BatchGetDevEndpointsResponse

Returns a list of resource metadata for a given list of development endpoint names. After calling the ListDevEndpoints operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.

Examples:

Request syntax with placeholder values


resp = client.batch_get_dev_endpoints({
  dev_endpoint_names: ["GenericString"], # required
})

Response structure


resp.dev_endpoints #=> Array
resp.dev_endpoints[0].endpoint_name #=> String
resp.dev_endpoints[0].role_arn #=> String
resp.dev_endpoints[0].security_group_ids #=> Array
resp.dev_endpoints[0].security_group_ids[0] #=> String
resp.dev_endpoints[0].subnet_id #=> String
resp.dev_endpoints[0].yarn_endpoint_address #=> String
resp.dev_endpoints[0].private_address #=> String
resp.dev_endpoints[0].zeppelin_remote_spark_interpreter_port #=> Integer
resp.dev_endpoints[0].public_address #=> String
resp.dev_endpoints[0].status #=> String
resp.dev_endpoints[0].worker_type #=> String, one of "Standard", "G.1X", "G.2X"
resp.dev_endpoints[0].glue_version #=> String
resp.dev_endpoints[0].number_of_workers #=> Integer
resp.dev_endpoints[0].number_of_nodes #=> Integer
resp.dev_endpoints[0].availability_zone #=> String
resp.dev_endpoints[0].vpc_id #=> String
resp.dev_endpoints[0].extra_python_libs_s3_path #=> String
resp.dev_endpoints[0].extra_jars_s3_path #=> String
resp.dev_endpoints[0].failure_reason #=> String
resp.dev_endpoints[0].last_update_status #=> String
resp.dev_endpoints[0].created_timestamp #=> Time
resp.dev_endpoints[0].last_modified_timestamp #=> Time
resp.dev_endpoints[0].public_key #=> String
resp.dev_endpoints[0].public_keys #=> Array
resp.dev_endpoints[0].public_keys[0] #=> String
resp.dev_endpoints[0].security_configuration #=> String
resp.dev_endpoints[0].arguments #=> Hash
resp.dev_endpoints[0].arguments["GenericString"] #=> String
resp.dev_endpoints_not_found #=> Array
resp.dev_endpoints_not_found[0] #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :dev_endpoint_names (required, Array<String>)

    The list of DevEndpoint names, which might be the names returned from the ListDevEndpoints operation.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 772

def batch_get_dev_endpoints(params = {}, options = {})
  req = build_request(:batch_get_dev_endpoints, params)
  req.send_request(options)
end

#batch_get_jobs(params = {}) ⇒ Types::BatchGetJobsResponse

Returns a list of resource metadata for a given list of job names. After calling the ListJobs operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.

Examples:

Request syntax with placeholder values


resp = client.batch_get_jobs({
  job_names: ["NameString"], # required
})

Response structure


resp.jobs #=> Array
resp.jobs[0].name #=> String
resp.jobs[0].description #=> String
resp.jobs[0].log_uri #=> String
resp.jobs[0].role #=> String
resp.jobs[0].created_on #=> Time
resp.jobs[0].last_modified_on #=> Time
resp.jobs[0].execution_property.max_concurrent_runs #=> Integer
resp.jobs[0].command.name #=> String
resp.jobs[0].command.script_location #=> String
resp.jobs[0].command.python_version #=> String
resp.jobs[0].default_arguments #=> Hash
resp.jobs[0].default_arguments["GenericString"] #=> String
resp.jobs[0].non_overridable_arguments #=> Hash
resp.jobs[0].non_overridable_arguments["GenericString"] #=> String
resp.jobs[0].connections.connections #=> Array
resp.jobs[0].connections.connections[0] #=> String
resp.jobs[0].max_retries #=> Integer
resp.jobs[0].allocated_capacity #=> Integer
resp.jobs[0].timeout #=> Integer
resp.jobs[0].max_capacity #=> Float
resp.jobs[0].worker_type #=> String, one of "Standard", "G.1X", "G.2X"
resp.jobs[0].number_of_workers #=> Integer
resp.jobs[0].security_configuration #=> String
resp.jobs[0].notification_property.notify_delay_after #=> Integer
resp.jobs[0].glue_version #=> String
resp.jobs_not_found #=> Array
resp.jobs_not_found[0] #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :job_names (required, Array<String>)

    A list of job names, which might be the names returned from the ListJobs operation.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 833

def batch_get_jobs(params = {}, options = {})
  req = build_request(:batch_get_jobs, params)
  req.send_request(options)
end
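
The list-then-batch-get flow mentioned in the description can be sketched as follows. It assumes that the ListJobs operation is exposed as client.list_jobs and returns job_names plus a next_token for pagination; that operation is not documented in this section, so treat its shape as an assumption. The method name fetch_all_jobs and the batch_size default of 25 are also hypothetical.

```ruby
# Collects every job name via list_jobs, then fetches metadata in slices.
def fetch_all_jobs(client, batch_size: 25)
  names = []
  token = nil

  loop do
    params = token ? { next_token: token } : {}
    page = client.list_jobs(params)
    names.concat(page.job_names.to_a)
    token = page.next_token
    break if token.nil? || token.empty?
  end

  jobs = []
  names.each_slice(batch_size) do |slice|
    resp = client.batch_get_jobs(job_names: slice)
    jobs.concat(resp.jobs.to_a)
  end
  jobs
end
```

Names that no longer exist by the time of the second call appear in resp.jobs_not_found rather than raising an error, so a stricter version could collect those as well.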

#batch_get_partition(params = {}) ⇒ Types::BatchGetPartitionResponse

Retrieves partitions in a batch request.

Examples:

Request syntax with placeholder values


resp = client.batch_get_partition({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  table_name: "NameString", # required
  partitions_to_get: [ # required
    {
      values: ["ValueString"], # required
    },
  ],
})

Response structure


resp.partitions #=> Array
resp.partitions[0].values #=> Array
resp.partitions[0].values[0] #=> String
resp.partitions[0].database_name #=> String
resp.partitions[0].table_name #=> String
resp.partitions[0].creation_time #=> Time
resp.partitions[0].last_access_time #=> Time
resp.partitions[0].storage_descriptor.columns #=> Array
resp.partitions[0].storage_descriptor.columns[0].name #=> String
resp.partitions[0].storage_descriptor.columns[0].type #=> String
resp.partitions[0].storage_descriptor.columns[0].comment #=> String
resp.partitions[0].storage_descriptor.columns[0].parameters #=> Hash
resp.partitions[0].storage_descriptor.columns[0].parameters["KeyString"] #=> String
resp.partitions[0].storage_descriptor.location #=> String
resp.partitions[0].storage_descriptor.input_format #=> String
resp.partitions[0].storage_descriptor.output_format #=> String
resp.partitions[0].storage_descriptor.compressed #=> Boolean
resp.partitions[0].storage_descriptor.number_of_buckets #=> Integer
resp.partitions[0].storage_descriptor.serde_info.name #=> String
resp.partitions[0].storage_descriptor.serde_info.serialization_library #=> String
resp.partitions[0].storage_descriptor.serde_info.parameters #=> Hash
resp.partitions[0].storage_descriptor.serde_info.parameters["KeyString"] #=> String
resp.partitions[0].storage_descriptor.bucket_columns #=> Array
resp.partitions[0].storage_descriptor.bucket_columns[0] #=> String
resp.partitions[0].storage_descriptor.sort_columns #=> Array
resp.partitions[0].storage_descriptor.sort_columns[0].column #=> String
resp.partitions[0].storage_descriptor.sort_columns[0].sort_order #=> Integer
resp.partitions[0].storage_descriptor.parameters #=> Hash
resp.partitions[0].storage_descriptor.parameters["KeyString"] #=> String
resp.partitions[0].storage_descriptor.skewed_info.skewed_column_names #=> Array
resp.partitions[0].storage_descriptor.skewed_info.skewed_column_names[0] #=> String
resp.partitions[0].storage_descriptor.skewed_info.skewed_column_values #=> Array
resp.partitions[0].storage_descriptor.skewed_info.skewed_column_values[0] #=> String
resp.partitions[0].storage_descriptor.skewed_info.skewed_column_value_location_maps #=> Hash
resp.partitions[0].storage_descriptor.skewed_info.skewed_column_value_location_maps["ColumnValuesString"] #=> String
resp.partitions[0].storage_descriptor.stored_as_sub_directories #=> Boolean
resp.partitions[0].parameters #=> Hash
resp.partitions[0].parameters["KeyString"] #=> String
resp.partitions[0].last_analyzed_time #=> Time
resp.partitions[0].catalog_id #=> String
resp.unprocessed_keys #=> Array
resp.unprocessed_keys[0].values #=> Array
resp.unprocessed_keys[0].values[0] #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.

  • :database_name (required, String)

    The name of the catalog database where the partitions reside.

  • :table_name (required, String)

    The name of the partitions' table.

  • :partitions_to_get (required, Array<Types::PartitionValueList>)

    A list of partition values identifying the partitions to retrieve.

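Returns:

  • (Types::BatchGetPartitionResponse)

    Returns a response object that responds to #partitions and #unprocessed_keys.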


# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 921

def batch_get_partition(params = {}, options = {})
  req = build_request(:batch_get_partition, params)
  req.send_request(options)
end
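
The response lists any keys that were not processed in unprocessed_keys. One possible way to handle them, sketched below, is to resubmit those keys in a follow-up call. The helper name fetch_partitions and its max_rounds parameter are hypothetical and not part of the SDK; client is assumed to be any object that responds to batch_get_partition the way Aws::Glue::Client does.

```ruby
# Hypothetical helper (not part of the SDK): fetch partitions and resubmit
# any keys reported back in unprocessed_keys, for at most max_rounds calls.
# `client` is assumed to respond to #batch_get_partition like Aws::Glue::Client.
def fetch_partitions(client, database_name:, table_name:, value_lists:, max_rounds: 3)
  # Each partition is identified by a list of values (Types::PartitionValueList).
  pending = value_lists.map { |values| { values: values } }
  found = []

  max_rounds.times do
    break if pending.empty?

    resp = client.batch_get_partition(
      database_name: database_name,
      table_name: table_name,
      partitions_to_get: pending
    )
    found.concat(resp.partitions.to_a)
    # Keys that were not processed are resubmitted in the next round.
    pending = resp.unprocessed_keys.to_a.map { |key| { values: key.values } }
  end

  { partitions: found, unprocessed: pending }
end
```

Anything still listed under :unprocessed after the last round is left to the caller to handle.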

#batch_get_triggers(params = {}) ⇒ Types::BatchGetTriggersResponse

Returns a list of resource metadata for a given list of trigger names. After calling the ListTriggers operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.

Examples:

Request syntax with placeholder values


resp = client.batch_get_triggers({
  trigger_names: ["NameString"], # required
})

Response structure


resp.triggers #=> Array
resp.triggers[0].name #=> String
resp.triggers[0].workflow_name #=> String
resp.triggers[0].id #=> String
resp.triggers[0].type #=> String, one of "SCHEDULED", "CONDITIONAL", "ON_DEMAND"
resp.triggers[0].state #=> String, one of "CREATING", "CREATED", "ACTIVATING", "ACTIVATED", "DEACTIVATING", "DEACTIVATED", "DELETING", "UPDATING"
resp.triggers[0].description #=> String
resp.triggers[0].schedule #=> String
resp.triggers[0].actions #=> Array
resp.triggers[0].actions[0].job_name #=> String
resp.triggers[0].actions[0].arguments #=> Hash
resp.triggers[0].actions[0].arguments["GenericString"] #=> String
resp.triggers[0].actions[0].timeout #=> Integer
resp.triggers[0].actions[0].security_configuration #=> String
resp.triggers[0].actions[0].notification_property.notify_delay_after #=> Integer
resp.triggers[0].actions[0].crawler_name #=> String
resp.triggers[0].predicate.logical #=> String, one of "AND", "ANY"
resp.triggers[0].predicate.conditions #=> Array
resp.triggers[0].predicate.conditions[0].logical_operator #=> String, one of "EQUALS"
resp.triggers[0].predicate.conditions[0].job_name #=> String
resp.triggers[0].predicate.conditions[0].state #=> String, one of "STARTING", "RUNNING", "STOPPING", "STOPPED", "SUCCEEDED", "FAILED", "TIMEOUT"
resp.triggers[0].predicate.conditions[0].crawler_name #=> String
resp.triggers[0].predicate.conditions[0].crawl_state #=> String, one of "RUNNING", "CANCELLING", "CANCELLED", "SUCCEEDED", "FAILED"
resp.triggers_not_found #=> Array
resp.triggers_not_found[0] #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :trigger_names (required, Array<String>)

    A list of trigger names, which may be the names returned from the ListTriggers operation.

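Returns:

  • (Types::BatchGetTriggersResponse)

    Returns a response object that responds to #triggers and #triggers_not_found.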


# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 979

def batch_get_triggers(params = {}, options = {})
  req = build_request(:batch_get_triggers, params)
  req.send_request(options)
end
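
Because the response separates the triggers that were found from the names that were not, a caller may want both in one structure. The following is a minimal sketch under that assumption; lookup_triggers is a hypothetical helper name, and client is assumed to respond to batch_get_triggers like Aws::Glue::Client.

```ruby
# Hypothetical helper (not part of the SDK): index the returned triggers by
# name and report the requested names that were not found.
def lookup_triggers(client, names)
  resp = client.batch_get_triggers(trigger_names: names)
  {
    found: resp.triggers.to_a.to_h { |trigger| [trigger.name, trigger] },
    missing: resp.triggers_not_found.to_a
  }
end
```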

#batch_get_workflows(params = {}) ⇒ Types::BatchGetWorkflowsResponse

Returns a list of resource metadata for a given list of workflow names. After calling the ListWorkflows operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.

Examples:

Request syntax with placeholder values


resp = client.batch_get_workflows({
  names: ["NameString"], # required
  include_graph: false,
})

Response structure


resp.workflows #=> Array
resp.workflows[0].name #=> String
resp.workflows[0].description #=> String
resp.workflows[0].default_run_properties #=> Hash
resp.workflows[0].default_run_properties["IdString"] #=> String
resp.workflows[0].created_on #=> Time
resp.workflows[0].last_modified_on #=> Time
resp.workflows[0].last_run.name #=> String
resp.workflows[0].last_run.workflow_run_id #=> String
resp.workflows[0].last_run.previous_run_id #=> String
resp.workflows[0].last_run.workflow_run_properties #=> Hash
resp.workflows[0].last_run.workflow_run_properties["IdString"] #=> String
resp.workflows[0].last_run.started_on #=> Time
resp.workflows[0].last_run.completed_on #=> Time
resp.workflows[0].last_run.status #=> String, one of "RUNNING", "COMPLETED", "STOPPING", "STOPPED", "ERROR"
resp.workflows[0].last_run.error_message #=> String
resp.workflows[0].last_run.statistics.total_actions #=> Integer
resp.workflows[0].last_run.statistics.timeout_actions #=> Integer
resp.workflows[0].last_run.statistics.failed_actions #=> Integer
resp.workflows[0].last_run.statistics.stopped_actions #=> Integer
resp.workflows[0].last_run.statistics.succeeded_actions #=> Integer
resp.workflows[0].last_run.statistics.running_actions #=> Integer
resp.workflows[0].last_run.graph.nodes #=> Array
resp.workflows[0].last_run.graph.nodes[0].type #=> String, one of "CRAWLER", "JOB", "TRIGGER"
resp.workflows[0].last_run.graph.nodes[0].name #=> String
resp.workflows[0].last_run.graph.nodes[0].unique_id #=> String
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.name #=> String
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.workflow_name #=> String
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.id #=> String
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.type #=> String, one of "SCHEDULED", "CONDITIONAL", "ON_DEMAND"
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.state #=> String, one of "CREATING", "CREATED", "ACTIVATING", "ACTIVATED", "DEACTIVATING", "DEACTIVATED", "DELETING", "UPDATING"
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.description #=> String
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.schedule #=> String
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.actions #=> Array
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.actions[0].job_name #=> String
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.actions[0].arguments #=> Hash
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.actions[0].arguments["GenericString"] #=> String
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.actions[0].timeout #=> Integer
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.actions[0].security_configuration #=> String
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.actions[0].notification_property.notify_delay_after #=> Integer
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.actions[0].crawler_name #=> String
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.predicate.logical #=> String, one of "AND", "ANY"
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.predicate.conditions #=> Array
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.predicate.conditions[0].logical_operator #=> String, one of "EQUALS"
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.predicate.conditions[0].job_name #=> String
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.predicate.conditions[0].state #=> String, one of "STARTING", "RUNNING", "STOPPING", "STOPPED", "SUCCEEDED", "FAILED", "TIMEOUT"
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.predicate.conditions[0].crawler_name #=> String
resp.workflows[0].last_run.graph.nodes[0].trigger_details.trigger.predicate.conditions[0].crawl_state #=> String, one of "RUNNING", "CANCELLING", "CANCELLED", "SUCCEEDED", "FAILED"
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs #=> Array
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].id #=> String
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].attempt #=> Integer
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].previous_run_id #=> String
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].trigger_name #=> String
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].job_name #=> String
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].started_on #=> Time
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].last_modified_on #=> Time
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].completed_on #=> Time
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].job_run_state #=> String, one of "STARTING", "RUNNING", "STOPPING", "STOPPED", "SUCCEEDED", "FAILED", "TIMEOUT"
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].arguments #=> Hash
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].arguments["GenericString"] #=> String
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].error_message #=> String
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].predecessor_runs #=> Array
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].predecessor_runs[0].job_name #=> String
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].predecessor_runs[0].run_id #=> String
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].allocated_capacity #=> Integer
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].execution_time #=> Integer
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].timeout #=> Integer
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].max_capacity #=> Float
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].worker_type #=> String, one of "Standard", "G.1X", "G.2X"
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].number_of_workers #=> Integer
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].security_configuration #=> String
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].log_group_name #=> String
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].notification_property.notify_delay_after #=> Integer
resp.workflows[0].last_run.graph.nodes[0].job_details.job_runs[0].glue_version #=> String
resp.workflows[0].last_run.graph.nodes[0].crawler_details.crawls #=> Array
resp.workflows[0].last_run.graph.nodes[0].crawler_details.crawls[0].state #=> String, one of "RUNNING", "CANCELLING", "CANCELLED", "SUCCEEDED", "FAILED"
resp.workflows[0].last_run.graph.nodes[0].crawler_details.crawls[0].started_on #=> Time
resp.workflows[0].last_run.graph.nodes[0].crawler_details.crawls[0].completed_on #=> Time
resp.workflows[0].last_run.graph.nodes[0].crawler_details.crawls[0].error_message #=> String
resp.workflows[0].last_run.graph.nodes[0].crawler_details.crawls[0].log_group #=> String
resp.workflows[0].last_run.graph.nodes[0].crawler_details.crawls[0].log_stream #=> String
resp.workflows[0].last_run.graph.edges #=> Array
resp.workflows[0].last_run.graph.edges[0].source_id #=> String
resp.workflows[0].last_run.graph.edges[0].destination_id #=> String
resp.workflows[0].graph.nodes #=> Array
resp.workflows[0].graph.nodes[0].type #=> String, one of "CRAWLER", "JOB", "TRIGGER"
resp.workflows[0].graph.nodes[0].name #=> String
resp.workflows[0].graph.nodes[0].unique_id #=> String
resp.workflows[0].graph.nodes[0].trigger_details.trigger.name #=> String
resp.workflows[0].graph.nodes[0].trigger_details.trigger.workflow_name #=> String
resp.workflows[0].graph.nodes[0].trigger_details.trigger.id #=> String
resp.workflows[0].graph.nodes[0].trigger_details.trigger.type #=> String, one of "SCHEDULED", "CONDITIONAL", "ON_DEMAND"
resp.workflows[0].graph.nodes[0].trigger_details.trigger.state #=> String, one of "CREATING", "CREATED", "ACTIVATING", "ACTIVATED", "DEACTIVATING", "DEACTIVATED", "DELETING", "UPDATING"
resp.workflows[0].graph.nodes[0].trigger_details.trigger.description #=> String
resp.workflows[0].graph.nodes[0].trigger_details.trigger.schedule #=> String
resp.workflows[0].graph.nodes[0].trigger_details.trigger.actions #=> Array
resp.workflows[0].graph.nodes[0].trigger_details.trigger.actions[0].job_name #=> String
resp.workflows[0].graph.nodes[0].trigger_details.trigger.actions[0].arguments #=> Hash
resp.workflows[0].graph.nodes[0].trigger_details.trigger.actions[0].arguments["GenericString"] #=> String
resp.workflows[0].graph.nodes[0].trigger_details.trigger.actions[0].timeout #=> Integer
resp.workflows[0].graph.nodes[0].trigger_details.trigger.actions[0].security_configuration #=> String
resp.workflows[0].graph.nodes[0].trigger_details.trigger.actions[0].notification_property.notify_delay_after #=> Integer
resp.workflows[0].graph.nodes[0].trigger_details.trigger.actions[0].crawler_name #=> String
resp.workflows[0].graph.nodes[0].trigger_details.trigger.predicate.logical #=> String, one of "AND", "ANY"
resp.workflows[0].graph.nodes[0].trigger_details.trigger.predicate.conditions #=> Array
resp.workflows[0].graph.nodes[0].trigger_details.trigger.predicate.conditions[0].logical_operator #=> String, one of "EQUALS"
resp.workflows[0].graph.nodes[0].trigger_details.trigger.predicate.conditions[0].job_name #=> String
resp.workflows[0].graph.nodes[0].trigger_details.trigger.predicate.conditions[0].state #=> String, one of "STARTING", "RUNNING", "STOPPING", "STOPPED", "SUCCEEDED", "FAILED", "TIMEOUT"
resp.workflows[0].graph.nodes[0].trigger_details.trigger.predicate.conditions[0].crawler_name #=> String
resp.workflows[0].graph.nodes[0].trigger_details.trigger.predicate.conditions[0].crawl_state #=> String, one of "RUNNING", "CANCELLING", "CANCELLED", "SUCCEEDED", "FAILED"
resp.workflows[0].graph.nodes[0].job_details.job_runs #=> Array
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].id #=> String
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].attempt #=> Integer
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].previous_run_id #=> String
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].trigger_name #=> String
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].job_name #=> String
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].started_on #=> Time
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].last_modified_on #=> Time
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].completed_on #=> Time
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].job_run_state #=> String, one of "STARTING", "RUNNING", "STOPPING", "STOPPED", "SUCCEEDED", "FAILED", "TIMEOUT"
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].arguments #=> Hash
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].arguments["GenericString"] #=> String
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].error_message #=> String
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].predecessor_runs #=> Array
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].predecessor_runs[0].job_name #=> String
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].predecessor_runs[0].run_id #=> String
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].allocated_capacity #=> Integer
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].execution_time #=> Integer
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].timeout #=> Integer
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].max_capacity #=> Float
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].worker_type #=> String, one of "Standard", "G.1X", "G.2X"
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].number_of_workers #=> Integer
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].security_configuration #=> String
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].log_group_name #=> String
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].notification_property.notify_delay_after #=> Integer
resp.workflows[0].graph.nodes[0].job_details.job_runs[0].glue_version #=> String
resp.workflows[0].graph.nodes[0].crawler_details.crawls #=> Array
resp.workflows[0].graph.nodes[0].crawler_details.crawls[0].state #=> String, one of "RUNNING", "CANCELLING", "CANCELLED", "SUCCEEDED", "FAILED"
resp.workflows[0].graph.nodes[0].crawler_details.crawls[0].started_on #=> Time
resp.workflows[0].graph.nodes[0].crawler_details.crawls[0].completed_on #=> Time
resp.workflows[0].graph.nodes[0].crawler_details.crawls[0].error_message #=> String
resp.workflows[0].graph.nodes[0].crawler_details.crawls[0].log_group #=> String
resp.workflows[0].graph.nodes[0].crawler_details.crawls[0].log_stream #=> String
resp.workflows[0].graph.edges #=> Array
resp.workflows[0].graph.edges[0].source_id #=> String
resp.workflows[0].graph.edges[0].destination_id #=> String
resp.workflows[0].max_concurrent_runs #=> Integer
resp.missing_workflows #=> Array
resp.missing_workflows[0] #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :names (required, Array<String>)

    A list of workflow names, which may be the names returned from the ListWorkflows operation.

  • :include_graph (Boolean)

    Specifies whether to include a graph when returning the workflow resource metadata.

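Returns:

  • (Types::BatchGetWorkflowsResponse)

    Returns a response object that responds to #workflows and #missing_workflows.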


# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 1166

def batch_get_workflows(params = {}, options = {})
  req = build_request(:batch_get_workflows, params)
  req.send_request(options)
end
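
The graph in the response is described by nodes (each with a unique_id and a name) and edges (each with a source_id and a destination_id). As an illustration only, the sketch below turns such a graph into a map from node name to the names of its direct successors. successors_by_name is a hypothetical helper; it assumes the graph was requested (for example with include_graph: true) and only relies on the accessors shown in the response structure above.

```ruby
# Hypothetical helper (not part of the SDK): map each node name to the names
# of the nodes its outgoing edges point to. `graph` is assumed to expose
# #nodes and #edges as in resp.workflows[0].graph.
def successors_by_name(graph)
  names = graph.nodes.to_a.to_h { |node| [node.unique_id, node.name] }

  graph.edges.to_a.each_with_object({}) do |edge, adjacency|
    # Fall back to the raw ID if an edge refers to a node that is not listed.
    source = names.fetch(edge.source_id, edge.source_id)
    target = names.fetch(edge.destination_id, edge.destination_id)
    (adjacency[source] ||= []) << target
  end
end
```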

#batch_stop_job_run(params = {}) ⇒ Types::BatchStopJobRunResponse

Stops one or more job runs for a specified job definition.

Examples:

Request syntax with placeholder values


resp = client.batch_stop_job_run({
  job_name: "NameString", # required
  job_run_ids: ["IdString"], # required
})

Response structure


resp.successful_submissions #=> Array
resp.successful_submissions[0].job_name #=> String
resp.successful_submissions[0].job_run_id #=> String
resp.errors #=> Array
resp.errors[0].job_name #=> String
resp.errors[0].job_run_id #=> String
resp.errors[0].error_detail.error_code #=> String
resp.errors[0].error_detail.error_message #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :job_name (required, String)

    The name of the job definition for which to stop job runs.

  • :job_run_ids (required, Array<String>)

    A list of the JobRunIds that should be stopped for that job definition.

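Returns:

  • (Types::BatchStopJobRunResponse)

    Returns a response object that responds to #successful_submissions and #errors.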


# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 1207

def batch_stop_job_run(params = {}, options = {})
  req = build_request(:batch_stop_job_run, params)
  req.send_request(options)
end
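
Since the response reports successful submissions and errors per job run, a caller will usually want to inspect both lists. The sketch below shows one way to summarize them; stop_job_runs is a hypothetical helper name, and client is assumed to respond to batch_stop_job_run like Aws::Glue::Client.

```ruby
# Hypothetical helper (not part of the SDK): stop several runs of one job and
# summarize which run IDs were accepted and which were rejected.
def stop_job_runs(client, job_name, run_ids)
  resp = client.batch_stop_job_run(job_name: job_name, job_run_ids: run_ids)

  stopped = resp.successful_submissions.to_a.map(&:job_run_id)
  failed = resp.errors.to_a.to_h do |error|
    detail = error.error_detail
    [error.job_run_id, "#{detail.error_code}: #{detail.error_message}"]
  end

  { stopped: stopped, failed: failed }
end
```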

#batch_update_partition(params = {}) ⇒ Types::BatchUpdatePartitionResponse

Updates one or more partitions in a batch operation.

Examples:

Request syntax with placeholder values


resp = client.batch_update_partition({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  table_name: "NameString", # required
  entries: [ # required
    {
      partition_value_list: ["ValueString"], # required
      partition_input: { # required
        values: ["ValueString"],
        last_access_time: Time.now,
        storage_descriptor: {
          columns: [
            {
              name: "NameString", # required
              type: "ColumnTypeString",
              comment: "CommentString",
              parameters: {
                "KeyString" => "ParametersMapValue",
              },
            },
          ],
          location: "LocationString",
          input_format: "FormatString",
          output_format: "FormatString",
          compressed: false,
          number_of_buckets: 1,
          serde_info: {
            name: "NameString",
            serialization_library: "NameString",
            parameters: {
              "KeyString" => "ParametersMapValue",
            },
          },
          bucket_columns: ["NameString"],
          sort_columns: [
            {
              column: "NameString", # required
              sort_order: 1, # required
            },
          ],
          parameters: {
            "KeyString" => "ParametersMapValue",
          },
          skewed_info: {
            skewed_column_names: ["NameString"],
            skewed_column_values: ["ColumnValuesString"],
            skewed_column_value_location_maps: {
              "ColumnValuesString" => "ColumnValuesString",
            },
          },
          stored_as_sub_directories: false,
        },
        parameters: {
          "KeyString" => "ParametersMapValue",
        },
        last_analyzed_time: Time.now,
      },
    },
  ],
})

Response structure


resp.errors #=> Array
resp.errors[0].partition_value_list #=> Array
resp.errors[0].partition_value_list[0] #=> String
resp.errors[0].error_detail.error_code #=> String
resp.errors[0].error_detail.error_message #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the catalog in which the partition is to be updated. Currently, this should be the AWS account ID.

  • :database_name (required, String)

    The name of the metadata database in which the partition is to be updated.

  • :table_name (required, String)

    The name of the metadata table in which the partition is to be updated.

  • :entries (required, Array<Types::BatchUpdatePartitionRequestEntry>)

    A list of up to 100 BatchUpdatePartitionRequestEntry objects to update.

Returns:

  • (Types::BatchUpdatePartitionResponse)

    Returns a response object that responds to #errors.


# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 1309

def batch_update_partition(params = {}, options = {})
  req = build_request(:batch_update_partition, params)
  req.send_request(options)
end
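
The :entries option accepts up to 100 entries per call, so a longer list has to be split across several requests. The following is a minimal sketch of that splitting; update_partitions_in_batches and the constant GLUE_BATCH_UPDATE_LIMIT are hypothetical names, and client is assumed to respond to batch_update_partition like Aws::Glue::Client.

```ruby
# Limit taken from the :entries description above ("up to 100").
GLUE_BATCH_UPDATE_LIMIT = 100

# Hypothetical helper (not part of the SDK): send entries in slices of at most
# GLUE_BATCH_UPDATE_LIMIT and collect the errors reported by every call.
def update_partitions_in_batches(client, database_name:, table_name:, entries:)
  entries.each_slice(GLUE_BATCH_UPDATE_LIMIT).flat_map do |slice|
    resp = client.batch_update_partition(
      database_name: database_name,
      table_name: table_name,
      entries: slice
    )
    resp.errors.to_a
  end
end
```

An empty return value means no call reported an error for any entry.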

#cancel_ml_task_run(params = {}) ⇒ Types::CancelMLTaskRunResponse

Cancels (stops) a task run. Machine learning task runs are asynchronous tasks that AWS Glue runs on your behalf as part of various machine learning workflows. You can cancel a machine learning task run at any time by calling CancelMLTaskRun with a task run's parent transform's TransformID and the task run's TaskRunId.

Examples:

Request syntax with placeholder values


resp = client.cancel_ml_task_run({
  transform_id: "HashString", # required
  task_run_id: "HashString", # required
})

Response structure


resp.transform_id #=> String
resp.task_run_id #=> String
resp.status #=> String, one of "STARTING", "RUNNING", "STOPPING", "STOPPED", "SUCCEEDED", "FAILED", "TIMEOUT"

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :transform_id (required, String)

    The unique identifier of the machine learning transform.

  • :task_run_id (required, String)

    A unique identifier for the task run.

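Returns:

  • (Types::CancelMLTaskRunResponse)

    Returns a response object that responds to #transform_id, #task_run_id, and #status.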


# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 1349

def cancel_ml_task_run(params = {}, options = {})
  req = build_request(:cancel_ml_task_run, params)
  req.send_request(options)
end

#create_classifier(params = {}) ⇒ Struct

Creates a classifier in the user's account. This can be a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field of the request is present.

Examples:

Request syntax with placeholder values


resp = client.create_classifier({
  grok_classifier: {
    classification: "Classification", # required
    name: "NameString", # required
    grok_pattern: "GrokPattern", # required
    custom_patterns: "CustomPatterns",
  },
  xml_classifier: {
    classification: "Classification", # required
    name: "NameString", # required
    row_tag: "RowTag",
  },
  json_classifier: {
    name: "NameString", # required
    json_path: "JsonPath", # required
  },
  csv_classifier: {
    name: "NameString", # required
    delimiter: "CsvColumnDelimiter",
    quote_symbol: "CsvQuoteSymbol",
    contains_header: "UNKNOWN", # accepts UNKNOWN, PRESENT, ABSENT
    header: ["NameString"],
    disable_value_trimming: false,
    allow_single_column: false,
  },
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

Returns:

  • (Struct)

    Returns an empty response.


# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 1405

def create_classifier(params = {}, options = {})
  req = build_request(:create_classifier, params)
  req.send_request(options)
end
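
The kind of classifier created depends on which field of the request is present, so the placeholder syntax above (which shows all four fields at once) is not meant to be sent as is. The sketch below assumes a caller wants exactly one field per request; classifier_params and GLUE_CLASSIFIER_FIELDS are hypothetical names, not part of the SDK.

```ruby
# The four request fields shown in the request syntax above.
GLUE_CLASSIFIER_FIELDS = %i[grok_classifier xml_classifier json_classifier csv_classifier].freeze

# Hypothetical helper (not part of the SDK): build create_classifier params
# that contain exactly one classifier field.
def classifier_params(kind, settings)
  unless GLUE_CLASSIFIER_FIELDS.include?(kind)
    raise ArgumentError, "unknown classifier field: #{kind.inspect}"
  end

  { kind => settings }
end

# Possible usage, with placeholder values:
#   client.create_classifier(
#     classifier_params(:json_classifier, { name: "events", json_path: "$[*]" })
#   )
```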

#create_connection(params = {}) ⇒ Struct

Creates a connection definition in the Data Catalog.

Examples:

Request syntax with placeholder values


resp = client.create_connection({
  catalog_id: "CatalogIdString",
  connection_input: { # required
    name: "NameString", # required
    description: "DescriptionString",
    connection_type: "JDBC", # required, accepts JDBC, SFTP, MONGODB, KAFKA, NETWORK
    match_criteria: ["NameString"],
    connection_properties: { # required
      "HOST" => "ValueString",
    },
    physical_connection_requirements: {
      subnet_id: "NameString",
      security_group_id_list: ["NameString"],
      availability_zone: "NameString",
    },
  },
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog in which to create the connection. If none is provided, the AWS account ID is used by default.

  • :connection_input (required, Types::ConnectionInput)

    A ConnectionInput object defining the connection to create.

Returns:

  • (Struct)

    Returns an empty response.


# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 1445

def create_connection(params = {}, options = {})
  req = build_request(:create_connection, params)
  req.send_request(options)
end

#create_crawler(params = {}) ⇒ Struct

Creates a new crawler with specified targets, role, configuration, and optional schedule. At least one crawl target must be specified, in the s3Targets field, the jdbcTargets field, or the DynamoDBTargets field.

Examples:

Request syntax with placeholder values


resp = client.create_crawler({
  name: "NameString", # required
  role: "Role", # required
  database_name: "DatabaseName",
  description: "DescriptionString",
  targets: { # required
    s3_targets: [
      {
        path: "Path",
        exclusions: ["Path"],
        connection_name: "ConnectionName",
      },
    ],
    jdbc_targets: [
      {
        connection_name: "ConnectionName",
        path: "Path",
        exclusions: ["Path"],
      },
    ],
    mongo_db_targets: [
      {
        connection_name: "ConnectionName",
        path: "Path",
        scan_all: false,
      },
    ],
    dynamo_db_targets: [
      {
        path: "Path",
        scan_all: false,
        scan_rate: 1.0,
      },
    ],
    catalog_targets: [
      {
        database_name: "NameString", # required
        tables: ["NameString"], # required
      },
    ],
  },
  schedule: "CronExpression",
  classifiers: ["NameString"],
  table_prefix: "TablePrefix",
  schema_change_policy: {
    update_behavior: "LOG", # accepts LOG, UPDATE_IN_DATABASE
    delete_behavior: "LOG", # accepts LOG, DELETE_FROM_DATABASE, DEPRECATE_IN_DATABASE
  },
  recrawl_policy: {
    recrawl_behavior: "CRAWL_EVERYTHING", # accepts CRAWL_EVERYTHING, CRAWL_NEW_FOLDERS_ONLY
  },
  configuration: "CrawlerConfiguration",
  crawler_security_configuration: "CrawlerSecurityConfiguration",
  tags: {
    "TagKey" => "TagValue",
  },
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :name (required, String)

    Name of the new crawler.

  • :role (required, String)

    The IAM role or Amazon Resource Name (ARN) of an IAM role used by the new crawler to access customer resources.

  • :database_name (String)

    The AWS Glue database where results are written, such as: arn:aws:daylight:us-east-1::database/sometable/*.

  • :description (String)

    A description of the new crawler.

  • :targets (required, Types::CrawlerTargets)

    A collection of targets to crawl.

  • :schedule (String)

    A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).

  • :classifiers (Array<String>)

    A list of custom classifiers that the user has registered. By default, all built-in classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification.

  • :table_prefix (String)

    The table prefix used for catalog tables that are created.

  • :schema_change_policy (Types::SchemaChangePolicy)

    The policy for the crawler's update and deletion behavior.

  • :recrawl_policy (Types::RecrawlPolicy)

    A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.

  • :configuration (String)

    Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior. For more information, see Configuring a Crawler.

  • :crawler_security_configuration (String)

    The name of the SecurityConfiguration structure to be used by this crawler.

  • :tags (Hash<String,String>)

    The tags to use with this crawler request. You may use tags to limit access to the crawler. For more information about tags in AWS Glue, see AWS Tags in AWS Glue in the developer guide.

Returns:

  • (Struct)

    Returns an empty response.


# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 1585

def create_crawler(params = {}, options = {})
  req = build_request(:create_crawler, params)
  req.send_request(options)
end
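
The :schedule option takes a cron expression such as cron(15 12 * * ? *) for a daily run at 12:15 UTC. As a sketch only, the helpers below build that daily expression and a small parameter hash for a crawler with a single S3 target. daily_cron and s3_crawler_params are hypothetical names, and the values in the usage comment are placeholders.

```ruby
# Hypothetical helper (not part of the SDK): cron expression for a daily run
# at the given UTC hour and minute, in the form shown in the :schedule text.
def daily_cron(hour:, minute:)
  raise ArgumentError, "hour must be in 0..23" unless (0..23).cover?(hour)
  raise ArgumentError, "minute must be in 0..59" unless (0..59).cover?(minute)

  "cron(#{minute} #{hour} * * ? *)"
end

# Hypothetical helper: params for a crawler with one S3 target. Only :name,
# :role and :targets are marked required in the request syntax above.
def s3_crawler_params(name:, role:, database_name:, s3_path:, schedule: nil)
  params = {
    name: name,
    role: role,
    database_name: database_name,
    targets: { s3_targets: [{ path: s3_path }] }
  }
  params[:schedule] = schedule if schedule
  params
end

# Possible usage, with placeholder values:
#   client.create_crawler(
#     s3_crawler_params(name: "nightly", role: "MyGlueRole", database_name: "raw",
#                       s3_path: "s3://my-bucket/data/",
#                       schedule: daily_cron(hour: 12, minute: 15))
#   )
```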

#create_database(params = {}) ⇒ Struct

Creates a new database in a Data Catalog.

Examples:

Request syntax with placeholder values


resp = client.create_database({
  catalog_id: "CatalogIdString",
  database_input: { # required
    name: "NameString", # required
    description: "DescriptionString",
    location_uri: "URI",
    parameters: {
      "KeyString" => "ParametersMapValue",
    },
    create_table_default_permissions: [
      {
        principal: {
          data_lake_principal_identifier: "DataLakePrincipalString",
        },
        permissions: ["ALL"], # accepts ALL, SELECT, ALTER, DROP, DELETE, INSERT, CREATE_DATABASE, CREATE_TABLE, DATA_LOCATION_ACCESS
      },
    ],
    target_database: {
      catalog_id: "CatalogIdString",
      database_name: "NameString",
    },
  },
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog in which to create the database. If none is provided, the AWS account ID is used by default.

  • :database_input (required, Types::DatabaseInput)

    The metadata for the database.

Returns:

  • (Struct)

    Returns an empty response.


# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 1631

def create_database(params = {}, options = {})
  req = build_request(:create_database, params)
  req.send_request(options)
end
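
Only :database_input and its :name are marked required, and :catalog_id defaults to the AWS account ID when omitted. A minimal sketch of building the request without the unused optional fields might look like this; database_params is a hypothetical helper name, not part of the SDK.

```ruby
# Hypothetical helper (not part of the SDK): build create_database params,
# leaving out every optional field that was not supplied.
def database_params(name:, description: nil, location_uri: nil, parameters: nil, catalog_id: nil)
  input = {
    name: name,
    description: description,
    location_uri: location_uri,
    parameters: parameters
  }.compact # drop keys whose value is nil

  { catalog_id: catalog_id, database_input: input }.compact
end

# Possible usage, with placeholder values:
#   client.create_database(database_params(name: "sales", description: "Sales data"))
```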

#create_dev_endpoint(params = {}) ⇒ Types::CreateDevEndpointResponse

Creates a new development endpoint.

Examples:

Request syntax with placeholder values


resp = client.create_dev_endpoint({
  endpoint_name: "GenericString", # required
  role_arn: "RoleArn", # required
  security_group_ids: ["GenericString"],
  subnet_id: "GenericString",
  public_key: "GenericString",
  public_keys: ["GenericString"],
  number_of_nodes: 1,
  worker_type: "Standard", # accepts Standard, G.1X, G.2X
  glue_version: "GlueVersionString",
  number_of_workers: 1,
  extra_python_libs_s3_path: "GenericString",
  extra_jars_s3_path: "GenericString",
  security_configuration: "NameString",
  tags: {
    "TagKey" => "TagValue",
  },
  arguments: {
    "GenericString" => "GenericString",
  },
})

Response structure


resp.endpoint_name #=> String
resp.status #=> String
resp.security_group_ids #=> Array
resp.security_group_ids[0] #=> String
resp.subnet_id #=> String
resp.role_arn #=> String
resp.yarn_endpoint_address #=> String
resp.zeppelin_remote_spark_interpreter_port #=> Integer
resp.number_of_nodes #=> Integer
resp.worker_type #=> String, one of "Standard", "G.1X", "G.2X"
resp.glue_version #=> String
resp.number_of_workers #=> Integer
resp.availability_zone #=> String
resp.vpc_id #=> String
resp.extra_python_libs_s3_path #=> String
resp.extra_jars_s3_path #=> String
resp.failure_reason #=> String
resp.security_configuration #=> String
resp.created_timestamp #=> Time
resp.arguments #=> Hash
resp.arguments["GenericString"] #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :endpoint_name (required, String)

    The name to be assigned to the new DevEndpoint.

  • :role_arn (required, String)

    The IAM role for the DevEndpoint.

  • :security_group_ids (Array<String>)

    Security group IDs for the security groups to be used by the new DevEndpoint.

  • :subnet_id (String)

    The subnet ID for the new DevEndpoint to use.

  • :public_key (String)

    The public key to be used by this DevEndpoint for authentication. This attribute is provided for backward compatibility because the recommended attribute to use is public keys.

  • :public_keys (Array<String>)

    A list of public keys to be used by the development endpoints for authentication. The use of this attribute is preferred over a single public key because the public keys allow you to have a different private key per client.

    If you previously created an endpoint with a public key, you must remove that key to be able to set a list of public keys. Call the UpdateDevEndpoint API with the public key content in the deletePublicKeys attribute, and the list of new keys in the addPublicKeys attribute.

  • :number_of_nodes (Integer)

    The number of AWS Glue Data Processing Units (DPUs) to allocate to this DevEndpoint.

  • :worker_type (String)

    The type of predefined worker that is allocated to the development endpoint. Accepts a value of Standard, G.1X, or G.2X.

    • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50 GB disk, and 2 executors per worker.

    • For the G.1X worker type, each worker maps to 1 DPU (4 vCPU, 16 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.

    • For the G.2X worker type, each worker maps to 2 DPU (8 vCPU, 32 GB of memory, 128 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.

    Known issue: when a development endpoint is created with the G.2X WorkerType configuration, the Spark drivers for the development endpoint will run on 4 vCPU, 16 GB of memory, and a 64 GB disk.

  • :glue_version (String)

    Glue version determines the versions of Apache Spark and Python that AWS Glue supports. The Python version indicates the version supported for running your ETL scripts on development endpoints.

    For more information about the available AWS Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide.

    Development endpoints that are created without specifying a Glue version default to Glue 0.9.

    You can specify a version of Python support for development endpoints by using the Arguments parameter in the CreateDevEndpoint or UpdateDevEndpoint APIs. If no arguments are provided, the version defaults to Python 2.

  • :number_of_workers (Integer)

    The number of workers of a defined workerType that are allocated to the development endpoint.

    The maximum number of workers you can define is 299 for G.1X and 149 for G.2X.

  • :extra_python_libs_s3_path (String)

    The paths to one or more Python libraries in an Amazon S3 bucket that should be loaded in your DevEndpoint. Multiple values must be complete paths separated by a comma.

    You can only use pure Python libraries with a DevEndpoint. Libraries that rely on C extensions, such as the pandas Python data analysis library, are not yet supported.

  • :extra_jars_s3_path (String)

    The path to one or more Java .jar files in an S3 bucket that should be loaded in your DevEndpoint.

  • :security_configuration (String)

    The name of the SecurityConfiguration structure to be used with this DevEndpoint.

  • :tags (Hash<String,String>)

    The tags to use with this DevEndpoint. You may use tags to limit access to the DevEndpoint. For more information about tags in AWS Glue, see AWS Tags in AWS Glue in the developer guide.

  • :arguments (Hash<String,String>)

    A map of arguments used to configure the DevEndpoint.
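The worker limits described under :number_of_workers can be checked before the request is sent. The sketch below is one possible client-side guard, not part of the SDK; the helper name `dev_endpoint_request`, the endpoint name, and the role ARN are placeholders.

```ruby
# Hypothetical helper: builds params for #create_dev_endpoint and rejects
# worker counts above the documented maximum for the chosen worker type.
def dev_endpoint_request(endpoint_name:, role_arn:, worker_type: nil, number_of_workers: nil)
  # Limits quoted in the :number_of_workers description above.
  limit = { "G.1X" => 299, "G.2X" => 149 }[worker_type]
  if limit && number_of_workers && number_of_workers > limit
    raise ArgumentError, "#{worker_type} allows at most #{limit} workers"
  end
  {
    endpoint_name: endpoint_name,
    role_arn: role_arn,
    worker_type: worker_type,
    number_of_workers: number_of_workers,
  }.compact
end

dev_params = dev_endpoint_request(
  endpoint_name: "example-endpoint",
  role_arn: "arn:aws:iam::123456789012:role/ExampleGlueRole", # placeholder ARN
  worker_type: "G.1X",
  number_of_workers: 5
)
# resp = client.create_dev_endpoint(dev_params)
# resp.status #=> String
```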

Returns:

  • (Types::CreateDevEndpointResponse)

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 1830

def create_dev_endpoint(params = {}, options = {})
  req = build_request(:create_dev_endpoint, params)
  req.send_request(options)
end

#create_job(params = {}) ⇒ Types::CreateJobResponse

Creates a new job definition.

Examples:

Request syntax with placeholder values


resp = client.create_job({
  name: "NameString", # required
  description: "DescriptionString",
  log_uri: "UriString",
  role: "RoleString", # required
  execution_property: {
    max_concurrent_runs: 1,
  },
  command: { # required
    name: "GenericString",
    script_location: "ScriptLocationString",
    python_version: "PythonVersionString",
  },
  default_arguments: {
    "GenericString" => "GenericString",
  },
  non_overridable_arguments: {
    "GenericString" => "GenericString",
  },
  connections: {
    connections: ["GenericString"],
  },
  max_retries: 1,
  allocated_capacity: 1,
  timeout: 1,
  max_capacity: 1.0,
  security_configuration: "NameString",
  tags: {
    "TagKey" => "TagValue",
  },
  notification_property: {
    notify_delay_after: 1,
  },
  glue_version: "GlueVersionString",
  number_of_workers: 1,
  worker_type: "Standard", # accepts Standard, G.1X, G.2X
})

Response structure


resp.name #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :name (required, String)

    The name you assign to this job definition. It must be unique in your account.

  • :description (String)

    Description of the job being defined.

  • :log_uri (String)

    This field is reserved for future use.

  • :role (required, String)

    The name or Amazon Resource Name (ARN) of the IAM role associated with this job.

  • :execution_property (Types::ExecutionProperty)

    An ExecutionProperty specifying the maximum number of concurrent runs allowed for this job.

  • :command (required, Types::JobCommand)

    The JobCommand that executes this job.

  • :default_arguments (Hash<String,String>)

    The default arguments for this job.

    You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes.

    For information about how to specify and consume your own Job arguments, see the Calling AWS Glue APIs in Python topic in the developer guide.

    For information about the key-value pairs that AWS Glue consumes to set up your job, see the Special Parameters Used by AWS Glue topic in the developer guide.

  • :non_overridable_arguments (Hash<String,String>)

    Non-overridable arguments for this job, specified as name-value pairs.

  • :connections (Types::ConnectionsList)

    The connections used for this job.

  • :max_retries (Integer)

    The maximum number of times to retry this job if it fails.

  • :allocated_capacity (Integer)

    This parameter is deprecated. Use MaxCapacity instead.

    The number of AWS Glue data processing units (DPUs) to allocate to this Job. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.

  • :timeout (Integer)

    The job timeout in minutes. This is the maximum time that a job run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

  • :max_capacity (Float)

    The number of AWS Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.

    Do not set MaxCapacity if using WorkerType and NumberOfWorkers.

    The value that can be allocated for MaxCapacity depends on whether you are running a Python shell job or an Apache Spark ETL job:

    • When you specify a Python shell job (JobCommand.Name="pythonshell"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.

    • When you specify an Apache Spark ETL job (JobCommand.Name="glueetl") or Apache Spark streaming ETL job (JobCommand.Name="gluestreaming"), you can allocate from 2 to 100 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.

  • :security_configuration (String)

    The name of the SecurityConfiguration structure to be used with this job.

  • :tags (Hash<String,String>)

    The tags to use with this job. You may use tags to limit access to the job. For more information about tags in AWS Glue, see AWS Tags in AWS Glue in the developer guide.

  • :notification_property (Types::NotificationProperty)

    Specifies configuration properties of a job notification.

  • :glue_version (String)

    Glue version determines the versions of Apache Spark and Python that AWS Glue supports. The Python version indicates the version supported for jobs of type Spark.

    For more information about the available AWS Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide.

    Jobs that are created without specifying a Glue version default to Glue 0.9.

  • :number_of_workers (Integer)

    The number of workers of a defined workerType that are allocated when a job runs.

    The maximum number of workers you can define is 299 for G.1X and 149 for G.2X.

  • :worker_type (String)

    The type of predefined worker that is allocated when a job runs. Accepts a value of Standard, G.1X, or G.2X.

    • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50 GB disk, and 2 executors per worker.

    • For the G.1X worker type, each worker maps to 1 DPU (4 vCPU, 16 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.

    • For the G.2X worker type, each worker maps to 2 DPU (8 vCPU, 32 GB of memory, 128 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.
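The MaxCapacity rules above depend on the job command name, so they are easy to get wrong. The following is a rough client-side check written under that reading of the rules; the helper name `job_capacity_problems` and all sample values are hypothetical, and the service remains the final authority on what it accepts.

```ruby
# Hypothetical check of the MaxCapacity rules described above.
# Returns a list of problems; an empty list means the params look consistent.
def job_capacity_problems(params)
  problems = []
  capacity = params[:max_capacity]
  return problems if capacity.nil?

  if params[:worker_type] || params[:number_of_workers]
    problems << "max_capacity cannot be combined with worker_type or number_of_workers"
  end

  case params.dig(:command, :name)
  when "pythonshell"
    problems << "pythonshell jobs accept 0.0625 or 1 DPU" unless [0.0625, 1.0].include?(capacity.to_f)
  when "glueetl", "gluestreaming"
    unless capacity.to_f.between?(2, 100) && capacity.to_f == capacity.to_f.floor
      problems << "Spark jobs accept a whole number of DPUs from 2 to 100"
    end
  end
  problems
end

job_params = {
  name: "example-job",
  role: "ExampleGlueRole", # placeholder role name
  command: { name: "pythonshell", script_location: "s3://example-bucket/script.py" },
  max_capacity: 0.0625,
}
problems = job_capacity_problems(job_params)
raise ArgumentError, problems.join("; ") unless problems.empty?
# resp = client.create_job(job_params)
# resp.name #=> String
```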

Returns:

  • (Types::CreateJobResponse)

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2036

def create_job(params = {}, options = {})
  req = build_request(:create_job, params)
  req.send_request(options)
end

#create_ml_transform(params = {}) ⇒ Types::CreateMLTransformResponse

Creates an AWS Glue machine learning transform. This operation creates the transform and all the necessary parameters to train it.

Call this operation as the first step in the process of using a machine learning transform (such as the FindMatches transform) for deduplicating data. You can provide an optional Description, in addition to the parameters that you want to use for your algorithm.

You must also specify certain parameters for the tasks that AWS Glue runs on your behalf as part of learning from your data and creating a high-quality machine learning transform. These parameters include Role, and optionally, AllocatedCapacity, Timeout, and MaxRetries. For more information, see Jobs.

Examples:

Request syntax with placeholder values


resp = client.create_ml_transform({
  name: "NameString", # required
  description: "DescriptionString",
  input_record_tables: [ # required
    {
      database_name: "NameString", # required
      table_name: "NameString", # required
      catalog_id: "NameString",
      connection_name: "NameString",
    },
  ],
  parameters: { # required
    transform_type: "FIND_MATCHES", # required, accepts FIND_MATCHES
    find_matches_parameters: {
      primary_key_column_name: "ColumnNameString",
      precision_recall_tradeoff: 1.0,
      accuracy_cost_tradeoff: 1.0,
      enforce_provided_labels: false,
    },
  },
  role: "RoleString", # required
  glue_version: "GlueVersionString",
  max_capacity: 1.0,
  worker_type: "Standard", # accepts Standard, G.1X, G.2X
  number_of_workers: 1,
  timeout: 1,
  max_retries: 1,
  tags: {
    "TagKey" => "TagValue",
  },
})

Response structure


resp.transform_id #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :name (required, String)

    The unique name that you give the transform when you create it.

  • :description (String)

    A description of the machine learning transform that is being defined. The default is an empty string.

  • :input_record_tables (required, Array<Types::GlueTable>)

    A list of AWS Glue table definitions used by the transform.

  • :parameters (required, Types::TransformParameters)

    The algorithmic parameters that are specific to the transform type used. Conditionally dependent on the transform type.

  • :role (required, String)

    The name or Amazon Resource Name (ARN) of the IAM role with the required permissions. The required permissions include both AWS Glue service role permissions to AWS Glue resources, and Amazon S3 permissions required by the transform.

    • This role needs AWS Glue service role permissions to allow access to resources in AWS Glue. See Attach a Policy to IAM Users That Access AWS Glue.

    • This role needs permission to your Amazon Simple Storage Service (Amazon S3) sources, targets, temporary directory, scripts, and any libraries used by the task run for this transform.

  • :glue_version (String)

    This value determines which version of AWS Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see AWS Glue Versions in the developer guide.

  • :max_capacity (Float)

    The number of AWS Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the AWS Glue pricing page.

    MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType.

    • If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set.

    • If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set.

    • If WorkerType is set, then NumberOfWorkers is required (and vice versa).

    • MaxCapacity and NumberOfWorkers must both be at least 1.

    When the WorkerType field is set to a value other than Standard, the MaxCapacity field is set automatically and becomes read-only.

  • :worker_type (String)

    The type of predefined worker that is allocated when this task runs. Accepts a value of Standard, G.1X, or G.2X.

    • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50 GB disk, and 2 executors per worker.

    • For the G.1X worker type, each worker provides 4 vCPU, 16 GB of memory and a 64 GB disk, and 1 executor per worker.

    • For the G.2X worker type, each worker provides 8 vCPU, 32 GB of memory and a 128 GB disk, and 1 executor per worker.

    MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType.

    • If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set.

    • If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set.

    • If WorkerType is set, then NumberOfWorkers is required (and vice versa).

    • MaxCapacity and NumberOfWorkers must both be at least 1.

  • :number_of_workers (Integer)

    The number of workers of a defined workerType that are allocated when this task runs.

    If WorkerType is set, then NumberOfWorkers is required (and vice versa).

  • :timeout (Integer)

    The timeout of the task run for this transform in minutes. This is the maximum time that a task run for this transform can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

  • :max_retries (Integer)

    The maximum number of times to retry a task for this transform after a task run fails.

  • :tags (Hash<String,String>)

    The tags to use with this machine learning transform. You may use tags to limit access to the machine learning transform. For more information about tags in AWS Glue, see AWS Tags in AWS Glue in the developer guide.
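The mutual-exclusion rules listed under :max_capacity and :worker_type can be expressed as a small predicate. This is only a sketch of those four rules as stated above; the method name `ml_transform_capacity_valid?` is hypothetical and is not provided by the SDK.

```ruby
# Hypothetical validator for the capacity options of #create_ml_transform,
# following the mutual-exclusion rules listed under :max_capacity.
def ml_transform_capacity_valid?(max_capacity: nil, worker_type: nil, number_of_workers: nil)
  workers_given = !worker_type.nil? || !number_of_workers.nil?
  return false if max_capacity && workers_given              # mutually exclusive
  return false if worker_type.nil? != number_of_workers.nil? # must be set together
  return false if max_capacity && max_capacity < 1           # at least 1
  return false if number_of_workers && number_of_workers < 1 # at least 1
  true
end

ml_transform_capacity_valid?(max_capacity: 10.0)                         # DPU-based sizing
ml_transform_capacity_valid?(worker_type: "G.1X", number_of_workers: 5)  # worker-based sizing
```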

Returns:

  • (Types::CreateMLTransformResponse)

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2233

def create_ml_transform(params = {}, options = {})
  req = build_request(:create_ml_transform, params)
  req.send_request(options)
end

#create_partition(params = {}) ⇒ Struct

Creates a new partition.

Examples:

Request syntax with placeholder values


resp = client.create_partition({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  table_name: "NameString", # required
  partition_input: { # required
    values: ["ValueString"],
    last_access_time: Time.now,
    storage_descriptor: {
      columns: [
        {
          name: "NameString", # required
          type: "ColumnTypeString",
          comment: "CommentString",
          parameters: {
            "KeyString" => "ParametersMapValue",
          },
        },
      ],
      location: "LocationString",
      input_format: "FormatString",
      output_format: "FormatString",
      compressed: false,
      number_of_buckets: 1,
      serde_info: {
        name: "NameString",
        serialization_library: "NameString",
        parameters: {
          "KeyString" => "ParametersMapValue",
        },
      },
      bucket_columns: ["NameString"],
      sort_columns: [
        {
          column: "NameString", # required
          sort_order: 1, # required
        },
      ],
      parameters: {
        "KeyString" => "ParametersMapValue",
      },
      skewed_info: {
        skewed_column_names: ["NameString"],
        skewed_column_values: ["ColumnValuesString"],
        skewed_column_value_location_maps: {
          "ColumnValuesString" => "ColumnValuesString",
        },
      },
      stored_as_sub_directories: false,
    },
    parameters: {
      "KeyString" => "ParametersMapValue",
    },
    last_analyzed_time: Time.now,
  },
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The AWS account ID of the catalog in which the partition is to be created.

  • :database_name (required, String)

    The name of the metadata database in which the partition is to be created.

  • :table_name (required, String)

    The name of the metadata table in which the partition is to be created.

  • :partition_input (required, Types::PartitionInput)

    A PartitionInput structure defining the partition to be created.

Returns:

  • (Struct)

    Returns an empty response.
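Most fields of :partition_input are optional, so a typical request is much shorter than the full syntax above. The sketch below builds a minimal request; the helper name `partition_request` and the sample names are hypothetical, and it assumes that :values carries one entry per partition key of the table.

```ruby
# Hypothetical helper: builds a minimal #create_partition request.
# Values are converted to strings because :values is an array of strings.
def partition_request(database_name:, table_name:, values:, location:)
  {
    database_name: database_name,
    table_name: table_name,
    partition_input: {
      values: values.map(&:to_s),
      storage_descriptor: { location: location },
    },
  }
end

partition_params = partition_request(
  database_name: "example_db",
  table_name: "events",
  values: [2024, 1], # assumed to match the table's partition keys
  location: "s3://example-bucket/events/2024/01/"
)
# client.create_partition(partition_params)
```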

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2319

def create_partition(params = {}, options = {})
  req = build_request(:create_partition, params)
  req.send_request(options)
end

#create_script(params = {}) ⇒ Types::CreateScriptResponse

Transforms a directed acyclic graph (DAG) into code.

Examples:

Request syntax with placeholder values


resp = client.create_script({
  dag_nodes: [
    {
      id: "CodeGenIdentifier", # required
      node_type: "CodeGenNodeType", # required
      args: [ # required
        {
          name: "CodeGenArgName", # required
          value: "CodeGenArgValue", # required
          param: false,
        },
      ],
      line_number: 1,
    },
  ],
  dag_edges: [
    {
      source: "CodeGenIdentifier", # required
      target: "CodeGenIdentifier", # required
      target_parameter: "CodeGenArgName",
    },
  ],
  language: "PYTHON", # accepts PYTHON, SCALA
})

Response structure


resp.python_script #=> String
resp.scala_code #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :dag_nodes (Array<Types::CodeGenNode>)

    A list of the nodes in the DAG.

  • :dag_edges (Array<Types::CodeGenEdge>)

    A list of the edges in the DAG.

  • :language (String)

    The programming language of the resulting code from the DAG.
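Because :dag_edges refer to nodes by id, a quick client-side sanity check can catch edges that point at missing nodes before the request is made. The sketch below is illustrative only: the helper name `dangling_edges` is hypothetical, and the node types and arguments are placeholders that have not been verified against the service.

```ruby
# Hypothetical check: returns the edges whose source or target id
# does not appear in dag_nodes.
def dangling_edges(dag_nodes, dag_edges)
  ids = dag_nodes.map { |node| node[:id] }
  dag_edges.reject { |edge| ids.include?(edge[:source]) && ids.include?(edge[:target]) }
end

dag_nodes = [
  { id: "source0", node_type: "DataSource", args: [{ name: "database", value: "example_db" }] },
  { id: "sink0", node_type: "DataSink", args: [{ name: "path", value: "s3://example-bucket/out" }] },
]
dag_edges = [{ source: "source0", target: "sink0" }]

raise ArgumentError, "edges reference unknown nodes" unless dangling_edges(dag_nodes, dag_edges).empty?
# resp = client.create_script(dag_nodes: dag_nodes, dag_edges: dag_edges, language: "PYTHON")
# resp.python_script #=> String
```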

Returns:

  • (Types::CreateScriptResponse)

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2376

def create_script(params = {}, options = {})
  req = build_request(:create_script, params)
  req.send_request(options)
end

#create_security_configuration(params = {}) ⇒ Types::CreateSecurityConfigurationResponse

Creates a new security configuration. A security configuration is a set of security properties that can be used by AWS Glue. You can use a security configuration to encrypt data at rest. For information about using security configurations in AWS Glue, see Encrypting Data Written by Crawlers, Jobs, and Development Endpoints.

Examples:

Request syntax with placeholder values


resp = client.create_security_configuration({
  name: "NameString", # required
  encryption_configuration: { # required
    s3_encryption: [
      {
        s3_encryption_mode: "DISABLED", # accepts DISABLED, SSE-KMS, SSE-S3
        kms_key_arn: "KmsKeyArn",
      },
    ],
    cloud_watch_encryption: {
      cloud_watch_encryption_mode: "DISABLED", # accepts DISABLED, SSE-KMS
      kms_key_arn: "KmsKeyArn",
    },
    job_bookmarks_encryption: {
      job_bookmarks_encryption_mode: "DISABLED", # accepts DISABLED, CSE-KMS
      kms_key_arn: "KmsKeyArn",
    },
  },
})

Response structure


resp.name #=> String
resp.created_timestamp #=> Time

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :name (required, String)

    The name for the new security configuration.

  • :encryption_configuration (required, Types::EncryptionConfiguration)

    The encryption configuration for the new security configuration.
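As an illustration, the entries of :s3_encryption can be built with a helper that rejects modes outside the accepted list shown in the request syntax. The helper name `s3_encryption_entry` and the configuration name are hypothetical; only the keys and accepted mode strings come from the documentation above.

```ruby
# Hypothetical helper: builds one :s3_encryption entry and rejects modes
# that are not in the accepted list from the request syntax.
def s3_encryption_entry(mode, kms_key_arn: nil)
  allowed = %w[DISABLED SSE-KMS SSE-S3]
  raise ArgumentError, "mode must be one of #{allowed.join(', ')}" unless allowed.include?(mode)
  { s3_encryption_mode: mode, kms_key_arn: kms_key_arn }.compact
end

security_params = {
  name: "example-security-config",
  encryption_configuration: {
    s3_encryption: [s3_encryption_entry("SSE-S3")],
  },
}
# resp = client.create_security_configuration(security_params)
# resp.created_timestamp #=> Time
```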

Returns:

  • (Types::CreateSecurityConfigurationResponse)

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2433

def create_security_configuration(params = {}, options = {})
  req = build_request(:create_security_configuration, params)
  req.send_request(options)
end

#create_table(params = {}) ⇒ Struct

Creates a new table definition in the Data Catalog.

Examples:

Request syntax with placeholder values


resp = client.create_table({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  table_input: { # required
    name: "NameString", # required
    description: "DescriptionString",
    owner: "NameString",
    last_access_time: Time.now,
    last_analyzed_time: Time.now,
    retention: 1,
    storage_descriptor: {
      columns: [
        {
          name: "NameString", # required
          type: "ColumnTypeString",
          comment: "CommentString",
          parameters: {
            "KeyString" => "ParametersMapValue",
          },
        },
      ],
      location: "LocationString",
      input_format: "FormatString",
      output_format: "FormatString",
      compressed: false,
      number_of_buckets: 1,
      serde_info: {
        name: "NameString",
        serialization_library: "NameString",
        parameters: {
          "KeyString" => "ParametersMapValue",
        },
      },
      bucket_columns: ["NameString"],
      sort_columns: [
        {
          column: "NameString", # required
          sort_order: 1, # required
        },
      ],
      parameters: {
        "KeyString" => "ParametersMapValue",
      },
      skewed_info: {
        skewed_column_names: ["NameString"],
        skewed_column_values: ["ColumnValuesString"],
        skewed_column_value_location_maps: {
          "ColumnValuesString" => "ColumnValuesString",
        },
      },
      stored_as_sub_directories: false,
    },
    partition_keys: [
      {
        name: "NameString", # required
        type: "ColumnTypeString",
        comment: "CommentString",
        parameters: {
          "KeyString" => "ParametersMapValue",
        },
      },
    ],
    view_original_text: "ViewTextString",
    view_expanded_text: "ViewTextString",
    table_type: "TableTypeString",
    parameters: {
      "KeyString" => "ParametersMapValue",
    },
    target_table: {
      catalog_id: "CatalogIdString",
      database_name: "NameString",
      name: "NameString",
    },
  },
  partition_indexes: [
    {
      keys: ["NameString"], # required
      index_name: "NameString", # required
    },
  ],
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog in which to create the Table. If none is supplied, the AWS account ID is used by default.

  • :database_name (required, String)

    The catalog database in which to create the new table. For Hive compatibility, this name is entirely lowercase.

  • :table_input (required, Types::TableInput)

    The TableInput object that defines the metadata table to create in the catalog.

  • :partition_indexes (Array<Types::PartitionIndex>)

    A list of partition indexes, PartitionIndex structures, to create in the table.

Returns:

  • (Struct)

    Returns an empty response.
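A table definition usually needs only a name, a column list, and a storage location. The sketch below builds such a request from a plain column map and lowercases the database name, following the Hive-compatibility note above. The helper name `table_request` and all sample values are hypothetical.

```ruby
# Hypothetical helper: builds a minimal #create_table request.
# columns maps a column name to its type string.
def table_request(database_name:, table_name:, columns:, location:)
  {
    database_name: database_name.downcase, # lowercase for Hive compatibility
    table_input: {
      name: table_name,
      storage_descriptor: {
        columns: columns.map { |name, type| { name: name.to_s, type: type } },
        location: location,
      },
    },
  }
end

table_params = table_request(
  database_name: "Example_DB",
  table_name: "events",
  columns: { id: "bigint", payload: "string" },
  location: "s3://example-bucket/events/"
)
# client.create_table(table_params)
```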

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2546

def create_table(params = {}, options = {})
  req = build_request(:create_table, params)
  req.send_request(options)
end

#create_trigger(params = {}) ⇒ Types::CreateTriggerResponse

Creates a new trigger.

Examples:

Request syntax with placeholder values


resp = client.create_trigger({
  name: "NameString", # required
  workflow_name: "NameString",
  type: "SCHEDULED", # required, accepts SCHEDULED, CONDITIONAL, ON_DEMAND
  schedule: "GenericString",
  predicate: {
    logical: "AND", # accepts AND, ANY
    conditions: [
      {
        logical_operator: "EQUALS", # accepts EQUALS
        job_name: "NameString",
        state: "STARTING", # accepts STARTING, RUNNING, STOPPING, STOPPED, SUCCEEDED, FAILED, TIMEOUT
        crawler_name: "NameString",
        crawl_state: "RUNNING", # accepts RUNNING, CANCELLING, CANCELLED, SUCCEEDED, FAILED
      },
    ],
  },
  actions: [ # required
    {
      job_name: "NameString",
      arguments: {
        "GenericString" => "GenericString",
      },
      timeout: 1,
      security_configuration: "NameString",
      notification_property: {
        notify_delay_after: 1,
      },
      crawler_name: "NameString",
    },
  ],
  description: "DescriptionString",
  start_on_creation: false,
  tags: {
    "TagKey" => "TagValue",
  },
})

Response structure


resp.name #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :name (required, String)

    The name of the trigger.

  • :workflow_name (String)

    The name of the workflow associated with the trigger.

  • :type (required, String)

    The type of the new trigger.

  • :schedule (String)

    A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).

    This field is required when the trigger type is SCHEDULED.

  • :predicate (Types::Predicate)

    A predicate to specify when the new trigger should fire.

    This field is required when the trigger type is CONDITIONAL.

  • :actions (required, Array<Types::Action>)

    The actions initiated by this trigger when it fires.

  • :description (String)

    A description of the new trigger.

  • :start_on_creation (Boolean)

    Set to true to start SCHEDULED and CONDITIONAL triggers when created. True is not supported for ON_DEMAND triggers.

  • :tags (Hash<String,String>)

    The tags to use with this trigger. You may use tags to limit access to the trigger. For more information about tags in AWS Glue, see AWS Tags in AWS Glue in the developer guide.
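Which options are required depends on the trigger type, as noted under :schedule, :predicate, and :start_on_creation. The following sketch checks those three conditions on the client side; the helper name `trigger_problems`, the trigger name, and the job name are hypothetical.

```ruby
# Hypothetical check of the type-dependent requirements described above.
# Returns a list of problems; an empty list means the params look consistent.
def trigger_problems(params)
  problems = []
  case params[:type]
  when "SCHEDULED"
    problems << "schedule is required for SCHEDULED triggers" if params[:schedule].nil?
  when "CONDITIONAL"
    problems << "predicate is required for CONDITIONAL triggers" if params[:predicate].nil?
  when "ON_DEMAND"
    problems << "start_on_creation: true is not supported for ON_DEMAND triggers" if params[:start_on_creation]
  end
  problems
end

trigger_params = {
  name: "example-daily-trigger",
  type: "SCHEDULED",
  schedule: "cron(15 12 * * ? *)", # every day at 12:15 UTC
  actions: [{ job_name: "example-job" }],
  start_on_creation: true,
}
problems = trigger_problems(trigger_params)
raise ArgumentError, problems.join("; ") unless problems.empty?
# resp = client.create_trigger(trigger_params)
# resp.name #=> String
```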

Returns:

  • (Types::CreateTriggerResponse)

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2649

def create_trigger(params = {}, options = {})
  req = build_request(:create_trigger, params)
  req.send_request(options)
end

#create_user_defined_function(params = {}) ⇒ Struct

Creates a new function definition in the Data Catalog.

Examples:

Request syntax with placeholder values


resp = client.create_user_defined_function({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  function_input: { # required
    function_name: "NameString",
    class_name: "NameString",
    owner_name: "NameString",
    owner_type: "USER", # accepts USER, ROLE, GROUP
    resource_uris: [
      {
        resource_type: "JAR", # accepts JAR, FILE, ARCHIVE
        uri: "URI",
      },
    ],
  },
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog in which to create the function. If none is provided, the AWS account ID is used by default.

  • :database_name (required, String)

    The name of the catalog database in which to create the function.

  • :function_input (required, Types::UserDefinedFunctionInput)

    A UserDefinedFunctionInput object that defines the function to create in the Data Catalog.

Returns:

  • (Struct)

    Returns an empty response.
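The :function_input structure can be built from a few named values. The sketch below is one way to do that while checking :owner_type against the accepted list; the helper name `function_input`, the class name, and the S3 URI are placeholders rather than values taken from the SDK.

```ruby
# Hypothetical helper: builds :function_input for #create_user_defined_function.
# resources maps a URI to its resource type (JAR, FILE, or ARCHIVE).
def function_input(function_name:, class_name:, owner_name:, owner_type: "USER", resources: {})
  unless %w[USER ROLE GROUP].include?(owner_type)
    raise ArgumentError, "owner_type must be USER, ROLE, or GROUP"
  end
  {
    function_name: function_name,
    class_name: class_name,
    owner_name: owner_name,
    owner_type: owner_type,
    resource_uris: resources.map { |uri, type| { resource_type: type, uri: uri } },
  }
end

udf_params = {
  database_name: "example_db",
  function_input: function_input(
    function_name: "to_upper",
    class_name: "com.example.ToUpper",
    owner_name: "example-owner",
    resources: { "s3://example-bucket/udfs.jar" => "JAR" }
  ),
}
# client.create_user_defined_function(udf_params)
```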

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2692

def create_user_defined_function(params = {}, options = {})
  req = build_request(:create_user_defined_function, params)
  req.send_request(options)
end

#create_workflow(params = {}) ⇒ Types::CreateWorkflowResponse

Creates a new workflow.

Examples:

Request syntax with placeholder values


resp = client.create_workflow({
  name: "NameString", # required
  description: "GenericString",
  default_run_properties: {
    "IdString" => "GenericString",
  },
  tags: {
    "TagKey" => "TagValue",
  },
  max_concurrent_runs: 1,
})

Response structure


resp.name #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :name (required, String)

    The name to be assigned to the workflow. It should be unique within your account.

  • :description (String)

    A description of the workflow.

  • :default_run_properties (Hash<String,String>)

    A collection of properties to be used as part of each execution of the workflow.

  • :tags (Hash<String,String>)

    The tags to be used with this workflow.

  • :max_concurrent_runs (Integer)

    You can use this parameter to prevent unwanted multiple updates to data, to control costs, or in some cases, to prevent exceeding the maximum number of concurrent runs of any of the component jobs. If you leave this parameter blank, there is no limit to the number of concurrent workflow runs.
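Both :default_run_properties and :tags are documented as Hash<String,String>, so values held as symbols, numbers, or booleans need converting first. The sketch below shows one way to do that; the helper name `string_map` and the property names are hypothetical.

```ruby
# Hypothetical helper: coerces keys and values to strings, since
# :default_run_properties and :tags are documented as Hash<String,String>.
def string_map(hash)
  hash.each_with_object({}) { |(key, value), out| out[key.to_s] = value.to_s }
end

workflow_params = {
  name: "example-workflow",
  default_run_properties: string_map({ batch_size: 100, dry_run: false }),
  max_concurrent_runs: 1, # allow only one run at a time
}
# resp = client.create_workflow(workflow_params)
# resp.name #=> String
```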

Returns:

  • (Types::CreateWorkflowResponse)

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2746

def create_workflow(params = {}, options = {})
  req = build_request(:create_workflow, params)
  req.send_request(options)
end
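
As an illustration of the request shape above, here is a minimal sketch that builds the params hash and leaves out any optional key that was not supplied. `build_workflow_params` is a hypothetical helper, not part of the SDK, and the workflow name in the usage comment is made up.

```ruby
# Hypothetical helper (not part of the SDK): builds the params hash for
# create_workflow, leaving out optional keys that were not supplied.
def build_workflow_params(name:, description: nil, run_properties: nil, tags: nil, max_concurrent_runs: nil)
  raise ArgumentError, "name is required" if name.nil? || name.empty?

  params = {
    name: name,
    description: description,
    default_run_properties: run_properties,
    tags: tags,
    max_concurrent_runs: max_concurrent_runs
  }
  # Dropping nil values keeps :max_concurrent_runs out of the request when
  # no limit on concurrent workflow runs is wanted.
  params.compact
end

# Usage, assuming `client` is a configured Aws::Glue::Client:
#   resp = client.create_workflow(build_workflow_params(name: "nightly-etl", max_concurrent_runs: 1))
#   resp.name #=> String
```

Keeping the hash construction separate from the call makes the request easy to inspect before it is sent.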

#delete_classifier(params = {}) ⇒ Struct

Removes a classifier from the Data Catalog.

Examples:

Request syntax with placeholder values


resp = client.delete_classifier({
  name: "NameString", # required
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :name (required, String)

    Name of the classifier to remove.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2768

def delete_classifier(params = {}, options = {})
  req = build_request(:delete_classifier, params)
  req.send_request(options)
end

#delete_column_statistics_for_partition(params = {}) ⇒ Struct

Deletes the partition column statistics of a column.

Examples:

Request syntax with placeholder values


resp = client.delete_column_statistics_for_partition({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  table_name: "NameString", # required
  partition_values: ["ValueString"], # required
  column_name: "NameString", # required
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.

  • :database_name (required, String)

    The name of the catalog database where the partitions reside.

  • :table_name (required, String)

    The name of the partitions' table.

  • :partition_values (required, Array<String>)

    A list of partition values identifying the partition.

  • :column_name (required, String)

    Name of the column.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2807

def delete_column_statistics_for_partition(params = {}, options = {})
  req = build_request(:delete_column_statistics_for_partition, params)
  req.send_request(options)
end

#delete_column_statistics_for_table(params = {}) ⇒ Struct

Deletes the table statistics of a column.

Examples:

Request syntax with placeholder values


resp = client.delete_column_statistics_for_table({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  table_name: "NameString", # required
  column_name: "NameString", # required
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog where the table resides. If none is supplied, the AWS account ID is used by default.

  • :database_name (required, String)

    The name of the catalog database where the table resides.

  • :table_name (required, String)

    The name of the table.

  • :column_name (required, String)

    The name of the column.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2842

def delete_column_statistics_for_table(params = {}, options = {})
  req = build_request(:delete_column_statistics_for_table, params)
  req.send_request(options)
end

#delete_connection(params = {}) ⇒ Struct

Deletes a connection from the Data Catalog.

Examples:

Request syntax with placeholder values


resp = client.delete_connection({
  catalog_id: "CatalogIdString",
  connection_name: "NameString", # required
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog in which the connection resides. If none is provided, the AWS account ID is used by default.

  • :connection_name (required, String)

    The name of the connection to delete.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2869

def delete_connection(params = {}, options = {})
  req = build_request(:delete_connection, params)
  req.send_request(options)
end

#delete_crawler(params = {}) ⇒ Struct

Removes a specified crawler from the AWS Glue Data Catalog, unless the crawler state is RUNNING.

Examples:

Request syntax with placeholder values


resp = client.delete_crawler({
  name: "NameString", # required
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :name (required, String)

    The name of the crawler to remove.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2892

def delete_crawler(params = {}, options = {})
  req = build_request(:delete_crawler, params)
  req.send_request(options)
end
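
Because a crawler in the RUNNING state is not removed, a caller may want to check the state first. The sketch below is one way to do that. `delete_crawler_if_idle` is a hypothetical helper, and `client` is assumed to respond to `get_crawler` and `delete_crawler` the way Aws::Glue::Client does.

```ruby
# Hypothetical helper (not part of the SDK): deletes a crawler only when its
# reported state is not RUNNING. Returns true if a delete was issued.
def delete_crawler_if_idle(client, name)
  state = client.get_crawler(name: name).crawler.state
  return false if state == "RUNNING"

  client.delete_crawler(name: name)
  true
end

# Usage, assuming `client` is a configured Aws::Glue::Client:
#   delete_crawler_if_idle(client, "my-crawler")
```

The check and the delete are two separate calls, so the crawler could start between them; the service can still reject the delete in that case.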

#delete_database(params = {}) ⇒ Struct

Removes a specified database from a Data Catalog.

After completing this operation, you no longer have access to the tables (and all table versions and partitions that might belong to the tables) and the user-defined functions in the deleted database. AWS Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.

To ensure the immediate deletion of all related resources, before calling DeleteDatabase, use DeleteTableVersion or BatchDeleteTableVersion, DeletePartition or BatchDeletePartition, DeleteUserDefinedFunction, and DeleteTable or BatchDeleteTable, to delete any resources that belong to the database.

Examples:

Request syntax with placeholder values


resp = client.delete_database({
  catalog_id: "CatalogIdString",
  name: "NameString", # required
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog in which the database resides. If none is provided, the AWS account ID is used by default.

  • :name (required, String)

    The name of the database to delete. For Hive compatibility, this must be all lowercase.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2935

def delete_database(params = {}, options = {})
  req = build_request(:delete_database, params)
  req.send_request(options)
end
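
The ordering described above (remove dependent resources, then the database) might be sketched as follows. This is a partial illustration under stated assumptions: `delete_database_with_contents` is a hypothetical helper, the caller supplies the table and function names, the batch size of 100 is an assumed per-call limit for `batch_delete_table`, and table versions and partitions are not deleted first, so those would still be cleaned up asynchronously.

```ruby
# Assumed per-call limit on table names for batch_delete_table.
GLUE_TABLE_BATCH_SIZE = 100

# Hypothetical helper (not part of the SDK): deletes the named functions and
# tables, then the database itself. Error entries returned by
# batch_delete_table are ignored in this sketch.
def delete_database_with_contents(client, database:, tables: [], functions: [], catalog_id: nil)
  scope = catalog_id ? { catalog_id: catalog_id } : {}
  in_db = scope.merge(database_name: database)

  functions.each do |fn|
    client.delete_user_defined_function(in_db.merge(function_name: fn))
  end

  tables.each_slice(GLUE_TABLE_BATCH_SIZE) do |batch|
    client.batch_delete_table(in_db.merge(tables_to_delete: batch))
  end

  client.delete_database(scope.merge(name: database))
end

# Usage, assuming `client` is a configured Aws::Glue::Client:
#   delete_database_with_contents(client, database: "sales", tables: %w[orders items], functions: %w[to_cents])
```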

#delete_dev_endpoint(params = {}) ⇒ Struct

Deletes a specified development endpoint.

Examples:

Request syntax with placeholder values


resp = client.delete_dev_endpoint({
  endpoint_name: "GenericString", # required
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :endpoint_name (required, String)

    The name of the DevEndpoint.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2957

def delete_dev_endpoint(params = {}, options = {})
  req = build_request(:delete_dev_endpoint, params)
  req.send_request(options)
end

#delete_job(params = {}) ⇒ Types::DeleteJobResponse

Deletes a specified job definition. If the job definition is not found, no exception is thrown.

Examples:

Request syntax with placeholder values


resp = client.delete_job({
  job_name: "NameString", # required
})

Response structure


resp.job_name #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :job_name (required, String)

    The name of the job definition to delete.

Returns:

  • (Types::DeleteJobResponse)

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 2986

def delete_job(params = {}, options = {})
  req = build_request(:delete_job, params)
  req.send_request(options)
end

#delete_ml_transform(params = {}) ⇒ Types::DeleteMLTransformResponse

Deletes an AWS Glue machine learning transform. Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed from examples provided by humans. AWS Glue then saves these transformations. If you no longer need a transform, you can delete it by calling DeleteMLTransform. However, any AWS Glue jobs that still reference the deleted transform will no longer succeed.

Examples:

Request syntax with placeholder values


resp = client.delete_ml_transform({
  transform_id: "HashString", # required
})

Response structure


resp.transform_id #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :transform_id (required, String)

    The unique identifier of the transform to delete.

Returns:

  • (Types::DeleteMLTransformResponse)

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3020

def delete_ml_transform(params = {}, options = {})
  req = build_request(:delete_ml_transform, params)
  req.send_request(options)
end

#delete_partition(params = {}) ⇒ Struct

Deletes a specified partition.

Examples:

Request syntax with placeholder values


resp = client.delete_partition({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  table_name: "NameString", # required
  partition_values: ["ValueString"], # required
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog where the partition to be deleted resides. If none is provided, the AWS account ID is used by default.

  • :database_name (required, String)

    The name of the catalog database in which the table in question resides.

  • :table_name (required, String)

    The name of the table that contains the partition to be deleted.

  • :partition_values (required, Array<String>)

    The values that define the partition.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3056

def delete_partition(params = {}, options = {})
  req = build_request(:delete_partition, params)
  req.send_request(options)
end

#delete_resource_policy(params = {}) ⇒ Struct

Deletes a specified policy.

Examples:

Request syntax with placeholder values


resp = client.delete_resource_policy({
  policy_hash_condition: "HashString",
  resource_arn: "GlueResourceArn",
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :policy_hash_condition (String)

    The hash value returned when this policy was set.

  • :resource_arn (String)

    The ARN of the AWS Glue resource for the resource policy to be deleted.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3083

def delete_resource_policy(params = {}, options = {})
  req = build_request(:delete_resource_policy, params)
  req.send_request(options)
end
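
A sketch of how `:policy_hash_condition` might be used, assuming it acts as a precondition so that the delete is rejected when the policy has changed since the hash was read. `delete_policy_if_unchanged` is a hypothetical helper; the `get_resource_policy` call in the usage comment is assumed to return the current `policy_hash`.

```ruby
# Hypothetical helper (not part of the SDK): deletes the resource policy,
# passing along the hash the caller saw earlier.
def delete_policy_if_unchanged(client, seen_hash, resource_arn: nil)
  params = { policy_hash_condition: seen_hash }
  params[:resource_arn] = resource_arn if resource_arn
  client.delete_resource_policy(params)
end

# Usage, assuming `client` is a configured Aws::Glue::Client:
#   seen = client.get_resource_policy.policy_hash
#   # ... review the policy ...
#   delete_policy_if_unchanged(client, seen)
```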

#delete_security_configuration(params = {}) ⇒ Struct

Deletes a specified security configuration.

Examples:

Request syntax with placeholder values


resp = client.delete_security_configuration({
  name: "NameString", # required
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :name (required, String)

    The name of the security configuration to delete.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3105

def delete_security_configuration(params = {}, options = {})
  req = build_request(:delete_security_configuration, params)
  req.send_request(options)
end

#delete_table(params = {}) ⇒ Struct

Removes a table definition from the Data Catalog.

After completing this operation, you no longer have access to the table versions and partitions that belong to the deleted table. AWS Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.

To ensure the immediate deletion of all related resources, before calling DeleteTable, use DeleteTableVersion or BatchDeleteTableVersion, and DeletePartition or BatchDeletePartition, to delete any resources that belong to the table.

Examples:

Request syntax with placeholder values


resp = client.delete_table({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  name: "NameString", # required
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog where the table resides. If none is provided, the AWS account ID is used by default.

  • :database_name (required, String)

    The name of the catalog database in which the table resides. For Hive compatibility, this name is entirely lowercase.

  • :name (required, String)

    The name of the table to be deleted. For Hive compatibility, this name is entirely lowercase.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3151

def delete_table(params = {}, options = {})
  req = build_request(:delete_table, params)
  req.send_request(options)
end

#delete_table_version(params = {}) ⇒ Struct

Deletes a specified version of a table.

Examples:

Request syntax with placeholder values


resp = client.delete_table_version({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  table_name: "NameString", # required
  version_id: "VersionString", # required
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog where the tables reside. If none is provided, the AWS account ID is used by default.

  • :database_name (required, String)

    The database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.

  • :table_name (required, String)

    The name of the table. For Hive compatibility, this name is entirely lowercase.

  • :version_id (required, String)

    The ID of the table version to be deleted. A VersionID is a string representation of an integer. Each version is incremented by 1.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3189

def delete_table_version(params = {}, options = {})
  req = build_request(:delete_table_version, params)
  req.send_request(options)
end
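
Since a version ID is a string holding an integer, sorting IDs as plain strings gives the wrong order ("10" sorts before "2"). The sketch below picks the IDs to delete while keeping the newest ones. `stale_version_ids` is a hypothetical helper, and the list of IDs is assumed to have been collected by the caller.

```ruby
# Hypothetical helper (not part of the SDK): given version IDs (strings
# holding integers), returns the IDs to delete so that only the `keep`
# highest-numbered versions remain.
def stale_version_ids(version_ids, keep: 1)
  raise ArgumentError, "keep must be >= 0" if keep.negative?

  # Compare numerically, not lexicographically.
  sorted = version_ids.sort_by { |id| Integer(id, 10) }
  sorted.first([sorted.size - keep, 0].max)
end

# Usage, assuming `client` is a configured Aws::Glue::Client:
#   stale_version_ids(ids, keep: 2).each do |id|
#     client.delete_table_version(database_name: "db", table_name: "t", version_id: id)
#   end
```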

#delete_trigger(params = {}) ⇒ Types::DeleteTriggerResponse

Deletes a specified trigger. If the trigger is not found, no exception is thrown.

Examples:

Request syntax with placeholder values


resp = client.delete_trigger({
  name: "NameString", # required
})

Response structure


resp.name #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :name (required, String)

    The name of the trigger to delete.

Returns:

  • (Types::DeleteTriggerResponse)

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3218

def delete_trigger(params = {}, options = {})
  req = build_request(:delete_trigger, params)
  req.send_request(options)
end

#delete_user_defined_function(params = {}) ⇒ Struct

Deletes an existing function definition from the Data Catalog.

Examples:

Request syntax with placeholder values


resp = client.delete_user_defined_function({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  function_name: "NameString", # required
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog where the function to be deleted is located. If none is supplied, the AWS account ID is used by default.

  • :database_name (required, String)

    The name of the catalog database where the function is located.

  • :function_name (required, String)

    The name of the function definition to be deleted.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3249

def delete_user_defined_function(params = {}, options = {})
  req = build_request(:delete_user_defined_function, params)
  req.send_request(options)
end

#delete_workflow(params = {}) ⇒ Types::DeleteWorkflowResponse

Deletes a workflow.

Examples:

Request syntax with placeholder values


resp = client.delete_workflow({
  name: "NameString", # required
})

Response structure


resp.name #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :name (required, String)

    Name of the workflow to be deleted.

Returns:

  • (Types::DeleteWorkflowResponse)

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3277

def delete_workflow(params = {}, options = {})
  req = build_request(:delete_workflow, params)
  req.send_request(options)
end

#get_catalog_import_status(params = {}) ⇒ Types::GetCatalogImportStatusResponse

Retrieves the status of a migration operation.

Examples:

Request syntax with placeholder values


resp = client.get_catalog_import_status({
  catalog_id: "CatalogIdString",
})

Response structure


resp.import_status.import_completed #=> Boolean
resp.import_status.import_time #=> Time
resp.import_status.imported_by #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the catalog to migrate. Currently, this should be the AWS account ID.

Returns:

  • (Types::GetCatalogImportStatusResponse)

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3308

def get_catalog_import_status(params = {}, options = {})
  req = build_request(:get_catalog_import_status, params)
  req.send_request(options)
end

#get_classifier(params = {}) ⇒ Types::GetClassifierResponse

Retrieves a classifier by name.

Examples:

Request syntax with placeholder values


resp = client.get_classifier({
  name: "NameString", # required
})

Response structure


resp.classifier.grok_classifier.name #=> String
resp.classifier.grok_classifier.classification #=> String
resp.classifier.grok_classifier.creation_time #=> Time
resp.classifier.grok_classifier.last_updated #=> Time
resp.classifier.grok_classifier.version #=> Integer
resp.classifier.grok_classifier.grok_pattern #=> String
resp.classifier.grok_classifier.custom_patterns #=> String
resp.classifier.xml_classifier.name #=> String
resp.classifier.xml_classifier.classification #=> String
resp.classifier.xml_classifier.creation_time #=> Time
resp.classifier.xml_classifier.last_updated #=> Time
resp.classifier.xml_classifier.version #=> Integer
resp.classifier.xml_classifier.row_tag #=> String
resp.classifier.json_classifier.name #=> String
resp.classifier.json_classifier.creation_time #=> Time
resp.classifier.json_classifier.last_updated #=> Time
resp.classifier.json_classifier.version #=> Integer
resp.classifier.json_classifier.json_path #=> String
resp.classifier.csv_classifier.name #=> String
resp.classifier.csv_classifier.creation_time #=> Time
resp.classifier.csv_classifier.last_updated #=> Time
resp.classifier.csv_classifier.version #=> Integer
resp.classifier.csv_classifier.delimiter #=> String
resp.classifier.csv_classifier.quote_symbol #=> String
resp.classifier.csv_classifier.contains_header #=> String, one of "UNKNOWN", "PRESENT", "ABSENT"
resp.classifier.csv_classifier.header #=> Array
resp.classifier.csv_classifier.header[0] #=> String
resp.classifier.csv_classifier.disable_value_trimming #=> Boolean
resp.classifier.csv_classifier.allow_single_column #=> Boolean

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :name (required, String)

    Name of the classifier to retrieve.

Returns:

  • (Types::GetClassifierResponse)

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3364

def get_classifier(params = {}, options = {})
  req = build_request(:get_classifier, params)
  req.send_request(options)
end
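
The response structure above exposes four classifier members (grok, XML, JSON, CSV). Assuming that only one of them is populated for a given classifier, a small helper can find it. `classifier_kind` is a hypothetical name, not an SDK method.

```ruby
# The four members shown in the response structure above.
CLASSIFIER_KINDS = %i[grok_classifier xml_classifier json_classifier csv_classifier].freeze

# Hypothetical helper (not part of the SDK): returns [kind, details] for the
# first populated member of a classifier, or nil if none is set.
def classifier_kind(classifier)
  kind = CLASSIFIER_KINDS.find { |k| !classifier.public_send(k).nil? }
  kind && [kind, classifier.public_send(kind)]
end

# Usage, assuming `client` is a configured Aws::Glue::Client:
#   kind, details = classifier_kind(client.get_classifier(name: "my-classifier").classifier)
#   details.name #=> String
```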

#get_classifiers(params = {}) ⇒ Types::GetClassifiersResponse

Lists all classifier objects in the Data Catalog.

The returned response is a pageable response and is Enumerable. For details on usage see PageableResponse.

Examples:

Request syntax with placeholder values


resp = client.get_classifiers({
  max_results: 1,
  next_token: "Token",
})

Response structure


resp.classifiers #=> Array
resp.classifiers[0].grok_classifier.name #=> String
resp.classifiers[0].grok_classifier.classification #=> String
resp.classifiers[0].grok_classifier.creation_time #=> Time
resp.classifiers[0].grok_classifier.last_updated #=> Time
resp.classifiers[0].grok_classifier.version #=> Integer
resp.classifiers[0].grok_classifier.grok_pattern #=> String
resp.classifiers[0].grok_classifier.custom_patterns #=> String
resp.classifiers[0].xml_classifier.name #=> String
resp.classifiers[0].xml_classifier.classification #=> String
resp.classifiers[0].xml_classifier.creation_time #=> Time
resp.classifiers[0].xml_classifier.last_updated #=> Time
resp.classifiers[0].xml_classifier.version #=> Integer
resp.classifiers[0].xml_classifier.row_tag #=> String
resp.classifiers[0].json_classifier.name #=> String
resp.classifiers[0].json_classifier.creation_time #=> Time
resp.classifiers[0].json_classifier.last_updated #=> Time
resp.classifiers[0].json_classifier.version #=> Integer
resp.classifiers[0].json_classifier.json_path #=> String
resp.classifiers[0].csv_classifier.name #=> String
resp.classifiers[0].csv_classifier.creation_time #=> Time
resp.classifiers[0].csv_classifier.last_updated #=> Time
resp.classifiers[0].csv_classifier.version #=> Integer
resp.classifiers[0].csv_classifier.delimiter #=> String
resp.classifiers[0].csv_classifier.quote_symbol #=> String
resp.classifiers[0].csv_classifier.contains_header #=> String, one of "UNKNOWN", "PRESENT", "ABSENT"
resp.classifiers[0].csv_classifier.header #=> Array
resp.classifiers[0].csv_classifier.header[0] #=> String
resp.classifiers[0].csv_classifier.disable_value_trimming #=> Boolean
resp.classifiers[0].csv_classifier.allow_single_column #=> Boolean
resp.next_token #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :max_results (Integer)

    The size of the list to return (optional).

  • :next_token (String)

    An optional continuation token.

Returns:

  • (Types::GetClassifiersResponse)

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3429

def get_classifiers(params = {}, options = {})
  req = build_request(:get_classifiers, params)
  req.send_request(options)
end
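
The response is pageable, so the SDK can iterate pages itself. For readers who want to see the `next_token` protocol spelled out, here is a manual sketch. `all_classifiers` is a hypothetical helper and the page size of 50 is an arbitrary choice.

```ruby
# Hypothetical helper (not part of the SDK): follows next_token until the
# service stops returning one, and collects every classifier.
def all_classifiers(client, page_size: 50)
  classifiers = []
  token = nil
  loop do
    params = { max_results: page_size }
    params[:next_token] = token if token
    resp = client.get_classifiers(params)
    classifiers.concat(resp.classifiers)
    token = resp.next_token
    # A missing or empty token means there are no more pages.
    break if token.nil? || token.empty?
  end
  classifiers
end

# Usage, assuming `client` is a configured Aws::Glue::Client:
#   all_classifiers(client).size
```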

#get_column_statistics_for_partition(params = {}) ⇒ Types::GetColumnStatisticsForPartitionResponse

Retrieves partition statistics of columns.

Examples:

Request syntax with placeholder values


resp = client.get_column_statistics_for_partition({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  table_name: "NameString", # required
  partition_values: ["ValueString"], # required
  column_names: ["NameString"], # required
})

Response structure


resp.column_statistics_list #=> Array
resp.column_statistics_list[0].column_name #=> String
resp.column_statistics_list[0].column_type #=> String
resp.column_statistics_list[0].analyzed_time #=> Time
resp.column_statistics_list[0].statistics_data.type #=> String, one of "BOOLEAN", "DATE", "DECIMAL", "DOUBLE", "LONG", "STRING", "BINARY"
resp.column_statistics_list[0].statistics_data.boolean_column_statistics_data.number_of_trues #=> Integer
resp.column_statistics_list[0].statistics_data.boolean_column_statistics_data.number_of_falses #=> Integer
resp.column_statistics_list[0].statistics_data.boolean_column_statistics_data.number_of_nulls #=> Integer
resp.column_statistics_list[0].statistics_data.date_column_statistics_data.minimum_value #=> Time
resp.column_statistics_list[0].statistics_data.date_column_statistics_data.maximum_value #=> Time
resp.column_statistics_list[0].statistics_data.date_column_statistics_data.number_of_nulls #=> Integer
resp.column_statistics_list[0].statistics_data.date_column_statistics_data.number_of_distinct_values #=> Integer
resp.column_statistics_list[0].statistics_data.decimal_column_statistics_data.minimum_value.unscaled_value #=> String
resp.column_statistics_list[0].statistics_data.decimal_column_statistics_data.minimum_value.scale #=> Integer
resp.column_statistics_list[0].statistics_data.decimal_column_statistics_data.maximum_value.unscaled_value #=> String
resp.column_statistics_list[0].statistics_data.decimal_column_statistics_data.maximum_value.scale #=> Integer
resp.column_statistics_list[0].statistics_data.decimal_column_statistics_data.number_of_nulls #=> Integer
resp.column_statistics_list[0].statistics_data.decimal_column_statistics_data.number_of_distinct_values #=> Integer
resp.column_statistics_list[0].statistics_data.double_column_statistics_data.minimum_value #=> Float
resp.column_statistics_list[0].statistics_data.double_column_statistics_data.maximum_value #=> Float
resp.column_statistics_list[0].statistics_data.double_column_statistics_data.number_of_nulls #=> Integer
resp.column_statistics_list[0].statistics_data.double_column_statistics_data.number_of_distinct_values #=> Integer
resp.column_statistics_list[0].statistics_data.long_column_statistics_data.minimum_value #=> Integer
resp.column_statistics_list[0].statistics_data.long_column_statistics_data.maximum_value #=> Integer
resp.column_statistics_list[0].statistics_data.long_column_statistics_data.number_of_nulls #=> Integer
resp.column_statistics_list[0].statistics_data.long_column_statistics_data.number_of_distinct_values #=> Integer
resp.column_statistics_list[0].statistics_data.string_column_statistics_data.maximum_length #=> Integer
resp.column_statistics_list[0].statistics_data.string_column_statistics_data.average_length #=> Float
resp.column_statistics_list[0].statistics_data.string_column_statistics_data.number_of_nulls #=> Integer
resp.column_statistics_list[0].statistics_data.string_column_statistics_data.number_of_distinct_values #=> Integer
resp.column_statistics_list[0].statistics_data.binary_column_statistics_data.maximum_length #=> Integer
resp.column_statistics_list[0].statistics_data.binary_column_statistics_data.average_length #=> Float
resp.column_statistics_list[0].statistics_data.binary_column_statistics_data.number_of_nulls #=> Integer
resp.errors #=> Array
resp.errors[0].column_name #=> String
resp.errors[0].error.error_code #=> String
resp.errors[0].error.error_message #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.

  • :database_name (required, String)

    The name of the catalog database where the partitions reside.

  • :table_name (required, String)

    The name of the partitions' table.

  • :partition_values (required, Array<String>)

    A list of partition values identifying the partition.

  • :column_names (required, Array<String>)

    A list of the column names.

Returns:

  • (Types::GetColumnStatisticsForPartitionResponse)

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3511

def get_column_statistics_for_partition(params = {}, options = {})
  req = build_request(:get_column_statistics_for_partition, params)
  req.send_request(options)
end
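
In the response structure above, `statistics_data.type` names which of the typed members carries the values, and the member names follow the pattern `<type>_column_statistics_data`. A sketch that relies on that naming pattern (`typed_statistics` is a hypothetical helper):

```ruby
# Hypothetical helper (not part of the SDK): returns the typed statistics
# member that matches statistics_data.type, for example
# long_column_statistics_data when the type is "LONG".
def typed_statistics(column_statistics)
  data = column_statistics.statistics_data
  member = "#{data.type.downcase}_column_statistics_data"
  data.public_send(member)
end

# Usage, assuming `resp` came from get_column_statistics_for_partition:
#   resp.column_statistics_list.each do |cs|
#     stats = typed_statistics(cs)
#     puts "#{cs.column_name}: #{stats.number_of_nulls} nulls"
#   end
```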

#get_column_statistics_for_table(params = {}) ⇒ Types::GetColumnStatisticsForTableResponse

Retrieves table statistics of columns.

Examples:

Request syntax with placeholder values


resp = client.get_column_statistics_for_table({
  catalog_id: "CatalogIdString",
  database_name: "NameString", # required
  table_name: "NameString", # required
  column_names: ["NameString"], # required
})

Response structure


resp.column_statistics_list #=> Array
resp.column_statistics_list[0].column_name #=> String
resp.column_statistics_list[0].column_type #=> String
resp.column_statistics_list[0].analyzed_time #=> Time
resp.column_statistics_list[0].statistics_data.type #=> String, one of "BOOLEAN", "DATE", "DECIMAL", "DOUBLE", "LONG", "STRING", "BINARY"
resp.column_statistics_list[0].statistics_data.boolean_column_statistics_data.number_of_trues #=> Integer
resp.column_statistics_list[0].statistics_data.boolean_column_statistics_data.number_of_falses #=> Integer
resp.column_statistics_list[0].statistics_data.boolean_column_statistics_data.number_of_nulls #=> Integer
resp.column_statistics_list[0].statistics_data.date_column_statistics_data.minimum_value #=> Time
resp.column_statistics_list[0].statistics_data.date_column_statistics_data.maximum_value #=> Time
resp.column_statistics_list[0].statistics_data.date_column_statistics_data.number_of_nulls #=> Integer
resp.column_statistics_list[0].statistics_data.date_column_statistics_data.number_of_distinct_values #=> Integer
resp.column_statistics_list[0].statistics_data.decimal_column_statistics_data.minimum_value.unscaled_value #=> String
resp.column_statistics_list[0].statistics_data.decimal_column_statistics_data.minimum_value.scale #=> Integer
resp.column_statistics_list[0].statistics_data.decimal_column_statistics_data.maximum_value.unscaled_value #=> String
resp.column_statistics_list[0].statistics_data.decimal_column_statistics_data.maximum_value.scale #=> Integer
resp.column_statistics_list[0].statistics_data.decimal_column_statistics_data.number_of_nulls #=> Integer
resp.column_statistics_list[0].statistics_data.decimal_column_statistics_data.number_of_distinct_values #=> Integer
resp.column_statistics_list[0].statistics_data.double_column_statistics_data.minimum_value #=> Float
resp.column_statistics_list[0].statistics_data.double_column_statistics_data.maximum_value #=> Float
resp.column_statistics_list[0].statistics_data.double_column_statistics_data.number_of_nulls #=> Integer
resp.column_statistics_list[0].statistics_data.double_column_statistics_data.number_of_distinct_values #=> Integer
resp.column_statistics_list[0].statistics_data.long_column_statistics_data.minimum_value #=> Integer
resp.column_statistics_list[0].statistics_data.long_column_statistics_data.maximum_value #=> Integer
resp.column_statistics_list[0].statistics_data.long_column_statistics_data.number_of_nulls #=> Integer
resp.column_statistics_list[0].statistics_data.long_column_statistics_data.number_of_distinct_values #=> Integer
resp.column_statistics_list[0].statistics_data.string_column_statistics_data.maximum_length #=> Integer
resp.column_statistics_list[0].statistics_data.string_column_statistics_data.average_length #=> Float
resp.column_statistics_list[0].statistics_data.string_column_statistics_data.number_of_nulls #=> Integer
resp.column_statistics_list[0].statistics_data.string_column_statistics_data.number_of_distinct_values #=> Integer
resp.column_statistics_list[0].statistics_data.binary_column_statistics_data.maximum_length #=> Integer
resp.column_statistics_list[0].statistics_data.binary_column_statistics_data.average_length #=> Float
resp.column_statistics_list[0].statistics_data.binary_column_statistics_data.number_of_nulls #=> Integer
resp.errors #=> Array
resp.errors[0].column_name #=> String
resp.errors[0].error.error_code #=> String
resp.errors[0].error.error_message #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog where the table resides. If none is supplied, the AWS account ID is used by default.

  • :database_name (required, String)

    The name of the catalog database where the table resides.

  • :table_name (required, String)

    The name of the table.

  • :column_names (required, Array<String>)

    A list of the column names.

Returns:

  • (Types::GetColumnStatisticsForTableResponse)

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3589

def get_column_statistics_for_table(params = {}, options = {})
  req = build_request(:get_column_statistics_for_table, params)
  req.send_request(options)
end

#get_connection(params = {}) ⇒ Types::GetConnectionResponse

Retrieves a connection definition from the Data Catalog.

Examples:

Request syntax with placeholder values


resp = client.get_connection({
  catalog_id: "CatalogIdString",
  name: "NameString", # required
  hide_password: false,
})

Response structure


resp.connection.name #=> String
resp.connection.description #=> String
resp.connection.connection_type #=> String, one of "JDBC", "SFTP", "MONGODB", "KAFKA", "NETWORK"
resp.connection.match_criteria #=> Array
resp.connection.match_criteria[0] #=> String
resp.connection.connection_properties #=> Hash
resp.connection.connection_properties["ConnectionPropertyKey"] #=> String
resp.connection.physical_connection_requirements.subnet_id #=> String
resp.connection.physical_connection_requirements.security_group_id_list #=> Array
resp.connection.physical_connection_requirements.security_group_id_list[0] #=> String
resp.connection.physical_connection_requirements.availability_zone #=> String
resp.connection.creation_time #=> Time
resp.connection.last_updated_time #=> Time
resp.connection.last_updated_by #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog in which the connection resides. If none is provided, the AWS account ID is used by default.

  • :name (required, String)

    The name of the connection definition to retrieve.

  • :hide_password (Boolean)

    Allows you to retrieve the connection metadata without returning the password. For instance, the AWS Glue console uses this flag to retrieve the connection, and does not display the password. Set this parameter when the caller might not have permission to use the AWS KMS key to decrypt the password, but it does have permission to access the rest of the connection properties.
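
As a minimal sketch of the :hide_password option described above, the helper below requests a connection without its password and returns only non-secret details. The helper name connection_summary is hypothetical, and client is assumed to be an Aws::Glue::Client (or any object that responds to #get_connection in the same way).

```ruby
# Hypothetical helper, not part of the SDK.
# `client` is assumed to behave like Aws::Glue::Client.
def connection_summary(client, name)
  # hide_password: true asks Glue to omit the password from the response.
  resp = client.get_connection({ name: name, hide_password: true })
  conn = resp.connection
  {
    name: conn.name,
    type: conn.connection_type,
    # Only the property keys are reported, never the values.
    property_keys: (conn.connection_properties || {}).keys.sort,
  }
end
```

A call such as connection_summary(client, "my-connection") would return a Hash with :name, :type and :property_keys; the connection name here is a placeholder.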

Returns:

See Also:



# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3644

def get_connection(params = {}, options = {})
  req = build_request(:get_connection, params)
  req.send_request(options)
end

#get_connections(params = {}) ⇒ Types::GetConnectionsResponse

Retrieves a list of connection definitions from the Data Catalog.

The returned response is a pageable response and is Enumerable. For details on usage see PageableResponse.

Examples:

Request syntax with placeholder values


resp = client.get_connections({
  catalog_id: "CatalogIdString",
  filter: {
    match_criteria: ["NameString"],
    connection_type: "JDBC", # accepts JDBC, SFTP, MONGODB, KAFKA, NETWORK
  },
  hide_password: false,
  next_token: "Token",
  max_results: 1,
})

Response structure


resp.connection_list #=> Array
resp.connection_list[0].name #=> String
resp.connection_list[0].description #=> String
resp.connection_list[0].connection_type #=> String, one of "JDBC", "SFTP", "MONGODB", "KAFKA", "NETWORK"
resp.connection_list[0].match_criteria #=> Array
resp.connection_list[0].match_criteria[0] #=> String
resp.connection_list[0].connection_properties #=> Hash
resp.connection_list[0].connection_properties["ConnectionPropertyKey"] #=> String
resp.connection_list[0].physical_connection_requirements.subnet_id #=> String
resp.connection_list[0].physical_connection_requirements.security_group_id_list #=> Array
resp.connection_list[0].physical_connection_requirements.security_group_id_list[0] #=> String
resp.connection_list[0].physical_connection_requirements.availability_zone #=> String
resp.connection_list[0].creation_time #=> Time
resp.connection_list[0].last_updated_time #=> Time
resp.connection_list[0].last_updated_by #=> String
resp.next_token #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog in which the connections reside. If none is provided, the AWS account ID is used by default.

  • :filter (Types::GetConnectionsFilter)

    A filter that controls which connections are returned.

  • :hide_password (Boolean)

    Allows you to retrieve the connection metadata without returning the password. For instance, the AWS Glue console uses this flag to retrieve the connection, and does not display the password. Set this parameter when the caller might not have permission to use the AWS KMS key to decrypt the password, but it does have permission to access the rest of the connection properties.

  • :next_token (String)

    A continuation token, if this is a continuation call.

  • :max_results (Integer)

    The maximum number of connections to return in one response.
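
The :next_token and :max_results options above can be combined into a manual paging loop. The following is a sketch under the assumption that client behaves like Aws::Glue::Client; the helper name all_connections is hypothetical.

```ruby
# Hypothetical helper, not part of the SDK.
# Collects connections from every page by following next_token.
def all_connections(client, params = {})
  connections = []
  token = nil
  loop do
    # Only add next_token once a previous page has supplied one.
    request = token ? params.merge(next_token: token) : params
    resp = client.get_connections(request)
    connections.concat(resp.connection_list)
    token = resp.next_token
    break if token.nil? || token.empty?
  end
  connections
end
```

Because the response is described as pageable and Enumerable, iterating over the response object itself may be a simpler alternative; the explicit loop is shown only to make the token handling visible.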

Returns:

See Also:



# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3715

def get_connections(params = {}, options = {})
  req = build_request(:get_connections, params)
  req.send_request(options)
end

#get_crawler(params = {}) ⇒ Types::GetCrawlerResponse

Retrieves metadata for a specified crawler.

Examples:

Request syntax with placeholder values


resp = client.get_crawler({
  name: "NameString", # required
})

Response structure


resp.crawler.name #=> String
resp.crawler.role #=> String
resp.crawler.targets.s3_targets #=> Array
resp.crawler.targets.s3_targets[0].path #=> String
resp.crawler.targets.s3_targets[0].exclusions #=> Array
resp.crawler.targets.s3_targets[0].exclusions[0] #=> String
resp.crawler.targets.s3_targets[0].connection_name #=> String
resp.crawler.targets.jdbc_targets #=> Array
resp.crawler.targets.jdbc_targets[0].connection_name #=> String
resp.crawler.targets.jdbc_targets[0].path #=> String
resp.crawler.targets.jdbc_targets[0].exclusions #=> Array
resp.crawler.targets.jdbc_targets[0].exclusions[0] #=> String
resp.crawler.targets.mongo_db_targets #=> Array
resp.crawler.targets.mongo_db_targets[0].connection_name #=> String
resp.crawler.targets.mongo_db_targets[0].path #=> String
resp.crawler.targets.mongo_db_targets[0].scan_all #=> Boolean
resp.crawler.targets.dynamo_db_targets #=> Array
resp.crawler.targets.dynamo_db_targets[0].path #=> String
resp.crawler.targets.dynamo_db_targets[0].scan_all #=> Boolean
resp.crawler.targets.dynamo_db_targets[0].scan_rate #=> Float
resp.crawler.targets.catalog_targets #=> Array
resp.crawler.targets.catalog_targets[0].database_name #=> String
resp.crawler.targets.catalog_targets[0].tables #=> Array
resp.crawler.targets.catalog_targets[0].tables[0] #=> String
resp.crawler.database_name #=> String
resp.crawler.description #=> String
resp.crawler.classifiers #=> Array
resp.crawler.classifiers[0] #=> String
resp.crawler.recrawl_policy.recrawl_behavior #=> String, one of "CRAWL_EVERYTHING", "CRAWL_NEW_FOLDERS_ONLY"
resp.crawler.schema_change_policy.update_behavior #=> String, one of "LOG", "UPDATE_IN_DATABASE"
resp.crawler.schema_change_policy.delete_behavior #=> String, one of "LOG", "DELETE_FROM_DATABASE", "DEPRECATE_IN_DATABASE"
resp.crawler.state #=> String, one of "READY", "RUNNING", "STOPPING"
resp.crawler.table_prefix #=> String
resp.crawler.schedule.schedule_expression #=> String
resp.crawler.schedule.state #=> String, one of "SCHEDULED", "NOT_SCHEDULED", "TRANSITIONING"
resp.crawler.crawl_elapsed_time #=> Integer
resp.crawler.creation_time #=> Time
resp.crawler.last_updated #=> Time
resp.crawler.last_crawl.status #=> String, one of "SUCCEEDED", "CANCELLED", "FAILED"
resp.crawler.last_crawl.error_message #=> String
resp.crawler.last_crawl.log_group #=> String
resp.crawler.last_crawl.log_stream #=> String
resp.crawler.last_crawl.message_prefix #=> String
resp.crawler.last_crawl.start_time #=> Time
resp.crawler.version #=> Integer
resp.crawler.configuration #=> String
resp.crawler.crawler_security_configuration #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :name (required, String)

    The name of the crawler to retrieve metadata for.
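
One possible use of the response structure above is to check whether a crawler is idle and how its last crawl ended. This sketch assumes client behaves like Aws::Glue::Client; crawler_status is a hypothetical helper name.

```ruby
# Hypothetical helper, not part of the SDK.
# Summarizes the crawler state and the outcome of its last crawl.
def crawler_status(client, name)
  crawler = client.get_crawler({ name: name }).crawler
  last = crawler.last_crawl # may be nil if the crawler has never run
  {
    ready: crawler.state == "READY",
    state: crawler.state,
    last_status: last && last.status,
    last_error: last && last.error_message,
  }
end
```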

Returns:

See Also:



# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3789

def get_crawler(params = {}, options = {})
  req = build_request(:get_crawler, params)
  req.send_request(options)
end

#get_crawler_metrics(params = {}) ⇒ Types::GetCrawlerMetricsResponse

Retrieves metrics about specified crawlers.

The returned response is a pageable response and is Enumerable. For details on usage see PageableResponse.

Examples:

Request syntax with placeholder values


resp = client.get_crawler_metrics({
  crawler_name_list: ["NameString"],
  max_results: 1,
  next_token: "Token",
})

Response structure


resp.crawler_metrics_list #=> Array
resp.crawler_metrics_list[0].crawler_name #=> String
resp.crawler_metrics_list[0].time_left_seconds #=> Float
resp.crawler_metrics_list[0].still_estimating #=> Boolean
resp.crawler_metrics_list[0].last_runtime_seconds #=> Float
resp.crawler_metrics_list[0].median_runtime_seconds #=> Float
resp.crawler_metrics_list[0].tables_created #=> Integer
resp.crawler_metrics_list[0].tables_updated #=> Integer
resp.crawler_metrics_list[0].tables_deleted #=> Integer
resp.next_token #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :crawler_name_list (Array<String>)

    A list of the names of crawlers about which to retrieve metrics.

  • :max_results (Integer)

    The maximum size of a list to return.

  • :next_token (String)

    A continuation token, if this is a continuation call.
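
As an illustration of the metrics fields listed above, the sketch below totals the table counts for a set of crawlers. It reads only the first page of results, assumes client behaves like Aws::Glue::Client, and uses the hypothetical helper name crawler_table_totals.

```ruby
# Hypothetical helper, not part of the SDK.
# Sums created/updated/deleted table counts from the first page of metrics.
def crawler_table_totals(client, crawler_names)
  resp = client.get_crawler_metrics({ crawler_name_list: crawler_names })
  totals = { created: 0, updated: 0, deleted: 0 }
  resp.crawler_metrics_list.each do |metrics|
    # to_i treats a missing (nil) count as zero.
    totals[:created] += metrics.tables_created.to_i
    totals[:updated] += metrics.tables_updated.to_i
    totals[:deleted] += metrics.tables_deleted.to_i
  end
  totals
end
```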

Returns:

See Also:



# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3837

def get_crawler_metrics(params = {}, options = {})
  req = build_request(:get_crawler_metrics, params)
  req.send_request(options)
end

#get_crawlers(params = {}) ⇒ Types::GetCrawlersResponse

Retrieves metadata for all crawlers defined in the customer account.

The returned response is a pageable response and is Enumerable. For details on usage see PageableResponse.

Examples:

Request syntax with placeholder values


resp = client.get_crawlers({
  max_results: 1,
  next_token: "Token",
})

Response structure


resp.crawlers #=> Array
resp.crawlers[0].name #=> String
resp.crawlers[0].role #=> String
resp.crawlers[0].targets.s3_targets #=> Array
resp.crawlers[0].targets.s3_targets[0].path #=> String
resp.crawlers[0].targets.s3_targets[0].exclusions #=> Array
resp.crawlers[0].targets.s3_targets[0].exclusions[0] #=> String
resp.crawlers[0].targets.s3_targets[0].connection_name #=> String
resp.crawlers[0].targets.jdbc_targets #=> Array
resp.crawlers[0].targets.jdbc_targets[0].connection_name #=> String
resp.crawlers[0].targets.jdbc_targets[0].path #=> String
resp.crawlers[0].targets.jdbc_targets[0].exclusions #=> Array
resp.crawlers[0].targets.jdbc_targets[0].exclusions[0] #=> String
resp.crawlers[0].targets.mongo_db_targets #=> Array
resp.crawlers[0].targets.mongo_db_targets[0].connection_name #=> String
resp.crawlers[0].targets.mongo_db_targets[0].path #=> String
resp.crawlers[0].targets.mongo_db_targets[0].scan_all #=> Boolean
resp.crawlers[0].targets.dynamo_db_targets #=> Array
resp.crawlers[0].targets.dynamo_db_targets[0].path #=> String
resp.crawlers[0].targets.dynamo_db_targets[0].scan_all #=> Boolean
resp.crawlers[0].targets.dynamo_db_targets[0].scan_rate #=> Float
resp.crawlers[0].targets.catalog_targets #=> Array
resp.crawlers[0].targets.catalog_targets[0].database_name #=> String
resp.crawlers[0].targets.catalog_targets[0].tables #=> Array
resp.crawlers[0].targets.catalog_targets[0].tables[0] #=> String
resp.crawlers[0].database_name #=> String
resp.crawlers[0].description #=> String
resp.crawlers[0].classifiers #=> Array
resp.crawlers[0].classifiers[0] #=> String
resp.crawlers[0].recrawl_policy.recrawl_behavior #=> String, one of "CRAWL_EVERYTHING", "CRAWL_NEW_FOLDERS_ONLY"
resp.crawlers[0].schema_change_policy.update_behavior #=> String, one of "LOG", "UPDATE_IN_DATABASE"
resp.crawlers[0].schema_change_policy.delete_behavior #=> String, one of "LOG", "DELETE_FROM_DATABASE", "DEPRECATE_IN_DATABASE"
resp.crawlers[0].state #=> String, one of "READY", "RUNNING", "STOPPING"
resp.crawlers[0].table_prefix #=> String
resp.crawlers[0].schedule.schedule_expression #=> String
resp.crawlers[0].schedule.state #=> String, one of "SCHEDULED", "NOT_SCHEDULED", "TRANSITIONING"
resp.crawlers[0].crawl_elapsed_time #=> Integer
resp.crawlers[0].creation_time #=> Time
resp.crawlers[0].last_updated #=> Time
resp.crawlers[0].last_crawl.status #=> String, one of "SUCCEEDED", "CANCELLED", "FAILED"
resp.crawlers[0].last_crawl.error_message #=> String
resp.crawlers[0].last_crawl.log_group #=> String
resp.crawlers[0].last_crawl.log_stream #=> String
resp.crawlers[0].last_crawl.message_prefix #=> String
resp.crawlers[0].last_crawl.start_time #=> Time
resp.crawlers[0].version #=> Integer
resp.crawlers[0].configuration #=> String
resp.crawlers[0].crawler_security_configuration #=> String
resp.next_token #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :max_results (Integer)

    The number of crawlers to return on each call.

  • :next_token (String)

    A continuation token, if this is a continuation request.

Returns:

See Also:



# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3920

def get_crawlers(params = {}, options = {})
  req = build_request(:get_crawlers, params)
  req.send_request(options)
end

#get_data_catalog_encryption_settings(params = {}) ⇒ Types::GetDataCatalogEncryptionSettingsResponse

Retrieves the security configuration for a specified catalog.

Examples:

Request syntax with placeholder values


resp = client.get_data_catalog_encryption_settings({
  catalog_id: "CatalogIdString",
})

Response structure


resp.data_catalog_encryption_settings.encryption_at_rest.catalog_encryption_mode #=> String, one of "DISABLED", "SSE-KMS"
resp.data_catalog_encryption_settings.encryption_at_rest.sse_aws_kms_key_id #=> String
resp.data_catalog_encryption_settings.connection_password_encryption.return_connection_password_encrypted #=> Boolean
resp.data_catalog_encryption_settings.connection_password_encryption.aws_kms_key_id #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog to retrieve the security configuration for. If none is provided, the AWS account ID is used by default.
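
The response fields above can be reduced to two yes/no answers. The following sketch assumes client behaves like Aws::Glue::Client; catalog_encryption_summary is a hypothetical helper name, and omitting catalog_id relies on the documented default of the caller's account ID.

```ruby
# Hypothetical helper, not part of the SDK.
# Reports whether at-rest encryption and password encryption are enabled.
def catalog_encryption_summary(client, catalog_id = nil)
  params = catalog_id ? { catalog_id: catalog_id } : {}
  settings = client.get_data_catalog_encryption_settings(params)
                   .data_catalog_encryption_settings
  at_rest = settings.encryption_at_rest
  passwords = settings.connection_password_encryption
  {
    # "SSE-KMS" is the only non-disabled mode listed in the response structure.
    at_rest_encrypted: !at_rest.nil? && at_rest.catalog_encryption_mode == "SSE-KMS",
    passwords_encrypted: !passwords.nil? &&
      passwords.return_connection_password_encrypted == true,
  }
end
```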

Returns:

See Also:



# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3952

def get_data_catalog_encryption_settings(params = {}, options = {})
  req = build_request(:get_data_catalog_encryption_settings, params)
  req.send_request(options)
end

#get_database(params = {}) ⇒ Types::GetDatabaseResponse

Retrieves the definition of a specified database.

Examples:

Request syntax with placeholder values


resp = client.get_database({
  catalog_id: "CatalogIdString",
  name: "NameString", # required
})

Response structure


resp.database.name #=> String
resp.database.description #=> String
resp.database.location_uri #=> String
resp.database.parameters #=> Hash
resp.database.parameters["KeyString"] #=> String
resp.database.create_time #=> Time
resp.database.create_table_default_permissions #=> Array
resp.database.create_table_default_permissions[0].principal.data_lake_principal_identifier #=> String
resp.database.create_table_default_permissions[0].permissions #=> Array
resp.database.create_table_default_permissions[0].permissions[0] #=> String, one of "ALL", "SELECT", "ALTER", "DROP", "DELETE", "INSERT", "CREATE_DATABASE", "CREATE_TABLE", "DATA_LOCATION_ACCESS"
resp.database.target_database.catalog_id #=> String
resp.database.target_database.database_name #=> String
resp.database.catalog_id #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog in which the database resides. If none is provided, the AWS account ID is used by default.

  • :name (required, String)

    The name of the database to retrieve. For Hive compatibility, this should be all lowercase.

Returns:

See Also:



# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 3998

def get_database(params = {}, options = {})
  req = build_request(:get_database, params)
  req.send_request(options)
end

#get_databases(params = {}) ⇒ Types::GetDatabasesResponse

Retrieves all databases defined in a given Data Catalog.

The returned response is a pageable response and is Enumerable. For details on usage see PageableResponse.

Examples:

Request syntax with placeholder values


resp = client.get_databases({
  catalog_id: "CatalogIdString",
  next_token: "Token",
  max_results: 1,
  resource_share_type: "FOREIGN", # accepts FOREIGN, ALL
})

Response structure


resp.database_list #=> Array
resp.database_list[0].name #=> String
resp.database_list[0].description #=> String
resp.database_list[0].location_uri #=> String
resp.database_list[0].parameters #=> Hash
resp.database_list[0].parameters["KeyString"] #=> String
resp.database_list[0].create_time #=> Time
resp.database_list[0].create_table_default_permissions #=> Array
resp.database_list[0].create_table_default_permissions[0].principal.data_lake_principal_identifier #=> String
resp.database_list[0].create_table_default_permissions[0].permissions #=> Array
resp.database_list[0].create_table_default_permissions[0].permissions[0] #=> String, one of "ALL", "SELECT", "ALTER", "DROP", "DELETE", "INSERT", "CREATE_DATABASE", "CREATE_TABLE", "DATA_LOCATION_ACCESS"
resp.database_list[0].target_database.catalog_id #=> String
resp.database_list[0].target_database.database_name #=> String
resp.database_list[0].catalog_id #=> String
resp.next_token #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :catalog_id (String)

    The ID of the Data Catalog from which to retrieve Databases. If none is provided, the AWS account ID is used by default.

  • :next_token (String)

    A continuation token, if this is a continuation call.

  • :max_results (Integer)

    The maximum number of databases to return in one response.

  • :resource_share_type (String)

    Allows you to specify that you want to list the databases shared with your account. The allowable values are FOREIGN or ALL.

    • If set to FOREIGN, lists the databases shared with your account.

    • If set to ALL, lists the databases shared with your account, as well as the databases in your local account.
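
To illustrate the :resource_share_type values, the sketch below validates the value locally and then collects database names across pages. It assumes client behaves like Aws::Glue::Client; database_names and its keyword arguments are hypothetical.

```ruby
# Hypothetical helper, not part of the SDK.
# Lists database names for the given share type, following next_token.
def database_names(client, share_type: "ALL", catalog_id: nil)
  allowed = %w[FOREIGN ALL]
  unless allowed.include?(share_type)
    raise ArgumentError, "resource_share_type must be one of #{allowed.join(', ')}"
  end

  params = { resource_share_type: share_type }
  params[:catalog_id] = catalog_id if catalog_id
  names = []
  loop do
    resp = client.get_databases(params)
    names.concat(resp.database_list.map(&:name))
    token = resp.next_token
    break if token.nil? || token.empty?
    params = params.merge(next_token: token)
  end
  names
end
```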

Returns:

See Also:



# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 4063

def get_databases(params = {}, options = {})
  req = build_request(:get_databases, params)
  req.send_request(options)
end

#get_dataflow_graph(params = {}) ⇒ Types::GetDataflowGraphResponse

Transforms a Python script into a directed acyclic graph (DAG).

Examples:

Request syntax with placeholder values


resp = client.get_dataflow_graph({
  python_script: "PythonScript",
})

Response structure


resp.dag_nodes #=> Array
resp.dag_nodes[0].id #=> String
resp.dag_nodes[0].node_type #=> String
resp.dag_nodes[0].args #=> Array
resp.dag_nodes[0].args[0].name #=> String
resp.dag_nodes[0].args[0].value #=> String
resp.dag_nodes[0].args[0].param #=> Boolean
resp.dag_nodes[0].line_number #=> Integer
resp.dag_edges #=> Array
resp.dag_edges[0].source #=> String
resp.dag_edges[0].target #=> String
resp.dag_edges[0].target_parameter #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :python_script (String)

    The Python script to transform.
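
The dag_nodes and dag_edges arrays in the response can be folded into an adjacency list for traversal. This is a sketch assuming client behaves like Aws::Glue::Client; dag_adjacency is a hypothetical helper name.

```ruby
# Hypothetical helper, not part of the SDK.
# Maps each node id to the ids of the nodes its outgoing edges point to.
def dag_adjacency(client, python_script)
  resp = client.get_dataflow_graph({ python_script: python_script })
  adjacency = Hash.new { |hash, key| hash[key] = [] }
  # Touch every node so that nodes without outgoing edges still appear.
  resp.dag_nodes.each { |node| adjacency[node.id] }
  resp.dag_edges.each { |edge| adjacency[edge.source] << edge.target }
  adjacency
end
```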

Returns:

See Also:



# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 4103

def get_dataflow_graph(params = {}, options = {})
  req = build_request(:get_dataflow_graph, params)
  req.send_request(options)
end

#get_dev_endpoint(params = {}) ⇒ Types::GetDevEndpointResponse

Retrieves information about a specified development endpoint.

When you create a development endpoint in a virtual private cloud (VPC), AWS Glue returns only a private IP address, and the public IP address field is not populated. When you create a non-VPC development endpoint, AWS Glue returns only a public IP address.

Examples:

Request syntax with placeholder values


resp = client.get_dev_endpoint({
  endpoint_name: "GenericString", # required
})

Response structure


resp.dev_endpoint.endpoint_name #=> String
resp.dev_endpoint.role_arn #=> String
resp.dev_endpoint.security_group_ids #=> Array
resp.dev_endpoint.security_group_ids[0] #=> String
resp.dev_endpoint.subnet_id #=> String
resp.dev_endpoint.yarn_endpoint_address #=> String
resp.dev_endpoint.private_address #=> String
resp.dev_endpoint.zeppelin_remote_spark_interpreter_port #=> Integer
resp.dev_endpoint.public_address #=> String
resp.dev_endpoint.status #=> String
resp.dev_endpoint.worker_type #=> String, one of "Standard", "G.1X", "G.2X"
resp.dev_endpoint.glue_version #=> String
resp.dev_endpoint.number_of_workers #=> Integer
resp.dev_endpoint.number_of_nodes #=> Integer
resp.dev_endpoint.availability_zone #=> String
resp.dev_endpoint.vpc_id #=> String
resp.dev_endpoint.extra_python_libs_s3_path #=> String
resp.dev_endpoint.extra_jars_s3_path #=> String
resp.dev_endpoint.failure_reason #=> String
resp.dev_endpoint.last_update_status #=> String
resp.dev_endpoint.created_timestamp #=> Time
resp.dev_endpoint.last_modified_timestamp #=> Time
resp.dev_endpoint.public_key #=> String
resp.dev_endpoint.public_keys #=> Array
resp.dev_endpoint.public_keys[0] #=> String
resp.dev_endpoint.security_configuration #=> String
resp.dev_endpoint.arguments #=> Hash
resp.dev_endpoint.arguments["GenericString"] #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :endpoint_name (required, String)

    Name of the DevEndpoint to retrieve information for.
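
Since a VPC endpoint is described as returning only a private address and a non-VPC endpoint only a public one, a caller can pick whichever field is populated. The sketch below assumes client behaves like Aws::Glue::Client; dev_endpoint_address is a hypothetical helper name.

```ruby
# Hypothetical helper, not part of the SDK.
# Returns the public address when populated, otherwise the private address.
def dev_endpoint_address(client, endpoint_name)
  endpoint = client.get_dev_endpoint({ endpoint_name: endpoint_name }).dev_endpoint
  public_address = endpoint.public_address
  return public_address unless public_address.nil? || public_address.empty?

  endpoint.private_address
end
```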

Returns:

See Also:



# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 4165

def get_dev_endpoint(params = {}, options = {})
  req = build_request(:get_dev_endpoint, params)
  req.send_request(options)
end

#get_dev_endpoints(params = {}) ⇒ Types::GetDevEndpointsResponse

Retrieves all the development endpoints in this AWS account.

When you create a development endpoint in a virtual private cloud (VPC), AWS Glue returns only a private IP address, and the public IP address field is not populated. When you create a non-VPC development endpoint, AWS Glue returns only a public IP address.

The returned response is a pageable response and is Enumerable. For details on usage see PageableResponse.

Examples:

Request syntax with placeholder values


resp = client.get_dev_endpoints({
  max_results: 1,
  next_token: "GenericString",
})

Response structure


resp.dev_endpoints #=> Array
resp.dev_endpoints[0].endpoint_name #=> String
resp.dev_endpoints[0].role_arn #=> String
resp.dev_endpoints[0].security_group_ids #=> Array
resp.dev_endpoints[0].security_group_ids[0] #=> String
resp.dev_endpoints[0].subnet_id #=> String
resp.dev_endpoints[0].yarn_endpoint_address #=> String
resp.dev_endpoints[0].private_address #=> String
resp.dev_endpoints[0].zeppelin_remote_spark_interpreter_port #=> Integer
resp.dev_endpoints[0].public_address #=> String
resp.dev_endpoints[0].status #=> String
resp.dev_endpoints[0].worker_type #=> String, one of "Standard", "G.1X", "G.2X"
resp.dev_endpoints[0].glue_version #=> String
resp.dev_endpoints[0].number_of_workers #=> Integer
resp.dev_endpoints[0].number_of_nodes #=> Integer
resp.dev_endpoints[0].availability_zone #=> String
resp.dev_endpoints[0].vpc_id #=> String
resp.dev_endpoints[0].extra_python_libs_s3_path #=> String
resp.dev_endpoints[0].extra_jars_s3_path #=> String
resp.dev_endpoints[0].failure_reason #=> String
resp.dev_endpoints[0].last_update_status #=> String
resp.dev_endpoints[0].created_timestamp #=> Time
resp.dev_endpoints[0].last_modified_timestamp #=> Time
resp.dev_endpoints[0].public_key #=> String
resp.dev_endpoints[0].public_keys #=> Array
resp.dev_endpoints[0].public_keys[0] #=> String
resp.dev_endpoints[0].security_configuration #=> String
resp.dev_endpoints[0].arguments #=> Hash
resp.dev_endpoints[0].arguments["GenericString"] #=> String
resp.next_token #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :max_results (Integer)

    The maximum size of information to return.

  • :next_token (String)

    A continuation token, if this is a continuation call.

Returns:

See Also:



# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 4236

def get_dev_endpoints(params = {}, options = {})
  req = build_request(:get_dev_endpoints, params)
  req.send_request(options)
end

#get_job(params = {}) ⇒ Types::GetJobResponse

Retrieves an existing job definition.

Examples:

Request syntax with placeholder values


resp = client.get_job({
  job_name: "NameString", # required
})

Response structure


resp.job.name #=> String
resp.job.description #=> String
resp.job.log_uri #=> String
resp.job.role #=> String
resp.job.created_on #=> Time
resp.job.last_modified_on #=> Time
resp.job.execution_property.max_concurrent_runs #=> Integer
resp.job.command.name #=> String
resp.job.command.script_location #=> String
resp.job.command.python_version #=> String
resp.job.default_arguments #=> Hash
resp.job.default_arguments["GenericString"] #=> String
resp.job.non_overridable_arguments #=> Hash
resp.job.non_overridable_arguments["GenericString"] #=> String
resp.job.connections.connections #=> Array
resp.job.connections.connections[0] #=> String
resp.job.max_retries #=> Integer
resp.job.allocated_capacity #=> Integer
resp.job.timeout #=> Integer
resp.job.max_capacity #=> Float
resp.job.worker_type #=> String, one of "Standard", "G.1X", "G.2X"
resp.job.number_of_workers #=> Integer
resp.job.security_configuration #=> String
resp.job.notification_property.notify_delay_after #=> Integer
resp.job.glue_version #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :job_name (required, String)

    The name of the job definition to retrieve.

Returns:

See Also:



# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 4288

def get_job(params = {}, options = {})
  req = build_request(:get_job, params)
  req.send_request(options)
end

#get_job_bookmark(params = {}) ⇒ Types::GetJobBookmarkResponse

Returns information on a job bookmark entry.

Examples:

Request syntax with placeholder values


resp = client.get_job_bookmark({
  job_name: "JobName", # required
  run_id: "RunId",
})

Response structure


resp.job_bookmark_entry.job_name #=> String
resp.job_bookmark_entry.version #=> Integer
resp.job_bookmark_entry.run #=> Integer
resp.job_bookmark_entry.attempt #=> Integer
resp.job_bookmark_entry.previous_run_id #=> String
resp.job_bookmark_entry.run_id #=> String
resp.job_bookmark_entry.job_bookmark #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :job_name (required, String)

    The name of the job in question.

  • :run_id (String)

    The unique run identifier associated with this job run.

Returns:

See Also:



# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 4326

def get_job_bookmark(params = {}, options = {})
  req = build_request(:get_job_bookmark, params)
  req.send_request(options)
end

#get_job_run(params = {}) ⇒ Types::GetJobRunResponse

Retrieves the metadata for a given job run.

Examples:

Request syntax with placeholder values


resp = client.get_job_run({
  job_name: "NameString", # required
  run_id: "IdString", # required
  predecessors_included: false,
})

Response structure


resp.job_run.id #=> String
resp.job_run.attempt #=> Integer
resp.job_run.previous_run_id #=> String
resp.job_run.trigger_name #=> String
resp.job_run.job_name #=> String
resp.job_run.started_on #=> Time
resp.job_run.last_modified_on #=> Time
resp.job_run.completed_on #=> Time
resp.job_run.job_run_state #=> String, one of "STARTING", "RUNNING", "STOPPING", "STOPPED", "SUCCEEDED", "FAILED", "TIMEOUT"
resp.job_run.arguments #=> Hash
resp.job_run.arguments["GenericString"] #=> String
resp.job_run.error_message #=> String
resp.job_run.predecessor_runs #=> Array
resp.job_run.predecessor_runs[0].job_name #=> String
resp.job_run.predecessor_runs[0].run_id #=> String
resp.job_run.allocated_capacity #=> Integer
resp.job_run.execution_time #=> Integer
resp.job_run.timeout #=> Integer
resp.job_run.max_capacity #=> Float
resp.job_run.worker_type #=> String, one of "Standard", "G.1X", "G.2X"
resp.job_run.number_of_workers #=> Integer
resp.job_run.security_configuration #=> String
resp.job_run.log_group_name #=> String
resp.job_run.notification_property.notify_delay_after #=> Integer
resp.job_run.glue_version #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :job_name (required, String)

    Name of the job definition being run.

  • :run_id (required, String)

    The ID of the job run.

  • :predecessors_included (Boolean)

    True if a list of predecessor runs should be returned.
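
A common use of this call is polling until a run finishes. The sketch below treats STOPPED, SUCCEEDED, FAILED and TIMEOUT as terminal states, which is an assumption drawn from the job_run_state values listed above rather than a documented rule. The helper name wait_for_job_run and its keyword arguments are hypothetical, and client is assumed to behave like Aws::Glue::Client.

```ruby
# Hypothetical helper, not part of the SDK.
# Polls get_job_run until the run reaches an assumed terminal state.
# Returns the job run, or nil if max_attempts is exhausted first.
def wait_for_job_run(client, job_name, run_id,
                     interval: 30, max_attempts: 40,
                     sleeper: ->(seconds) { sleep(seconds) })
  # Assumption: these four states are final; the others are transitional.
  terminal_states = %w[STOPPED SUCCEEDED FAILED TIMEOUT]
  max_attempts.times do |attempt|
    run = client.get_job_run({ job_name: job_name, run_id: run_id }).job_run
    return run if terminal_states.include?(run.job_run_state)

    # No need to wait after the final attempt.
    sleeper.call(interval) if attempt < max_attempts - 1
  end
  nil
end
```

The sleeper argument exists only so the wait can be replaced in tests; in normal use the default sleeps for interval seconds between polls.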

Returns:

See Also:



# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 4386

def get_job_run(params = {}, options = {})
  req = build_request(:get_job_run, params)
  req.send_request(options)
end

#get_job_runs(params = {}) ⇒ Types::GetJobRunsResponse

Retrieves metadata for all runs of a given job definition.

The returned response is a pageable response and is Enumerable. For details on usage see PageableResponse.

Examples:

Request syntax with placeholder values


resp = client.get_job_runs({
  job_name: "NameString", # required
  next_token: "GenericString",
  max_results: 1,
})

Response structure


resp.job_runs #=> Array
resp.job_runs[0].id #=> String
resp.job_runs[0].attempt #=> Integer
resp.job_runs[0].previous_run_id #=> String
resp.job_runs[0].trigger_name #=> String
resp.job_runs[0].job_name #=> String
resp.job_runs[0].started_on #=> Time
resp.job_runs[0].last_modified_on #=> Time
resp.job_runs[0].completed_on #=> Time
resp.job_runs[0].job_run_state #=> String, one of "STARTING", "RUNNING", "STOPPING", "STOPPED", "SUCCEEDED", "FAILED", "TIMEOUT"
resp.job_runs[0].arguments #=> Hash
resp.job_runs[0].arguments["GenericString"] #=> String
resp.job_runs[0].error_message #=> String
resp.job_runs[0].predecessor_runs #=> Array
resp.job_runs[0].predecessor_runs[0].job_name #=> String
resp.job_runs[0].predecessor_runs[0].run_id #=> String
resp.job_runs[0].allocated_capacity #=> Integer
resp.job_runs[0].execution_time #=> Integer
resp.job_runs[0].timeout #=> Integer
resp.job_runs[0].max_capacity #=> Float
resp.job_runs[0].worker_type #=> String, one of "Standard", "G.1X", "G.2X"
resp.job_runs[0].number_of_workers #=> Integer
resp.job_runs[0].security_configuration #=> String
resp.job_runs[0].log_group_name #=> String
resp.job_runs[0].notification_property.notify_delay_after #=> Integer
resp.job_runs[0].glue_version #=> String
resp.next_token #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :job_name (required, String)

    The name of the job definition for which to retrieve all job runs.

  • :next_token (String)

    A continuation token, if this is a continuation call.

  • :max_results (Integer)

    The maximum size of the response.
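
As a small example of working with the job_runs array, the sketch below counts the runs on the first page by state. It assumes client behaves like Aws::Glue::Client; job_run_state_counts and the default of 50 results are hypothetical choices.

```ruby
# Hypothetical helper, not part of the SDK.
# Counts runs per job_run_state on the first page of results only.
def job_run_state_counts(client, job_name, max_results: 50)
  resp = client.get_job_runs({ job_name: job_name, max_results: max_results })
  resp.job_runs.each_with_object(Hash.new(0)) do |run, counts|
    counts[run.job_run_state] += 1
  end
end
```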

Returns:

See Also:



# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 4451

def get_job_runs(params = {}, options = {})
  req = build_request(:get_job_runs, params)
  req.send_request(options)
end

#get_jobs(params = {}) ⇒ Types::GetJobsResponse

Retrieves all current job definitions.

The returned response is a pageable response and is Enumerable. For details on usage see PageableResponse.

Examples:

Request syntax with placeholder values


resp = client.get_jobs({
  next_token: "GenericString",
  max_results: 1,
})

Response structure


resp.jobs #=> Array
resp.jobs[0].name #=> String
resp.jobs[0].description #=> String
resp.jobs[0].log_uri #=> String
resp.jobs[0].role #=> String
resp.jobs[0].created_on #=> Time
resp.jobs[0].last_modified_on #=> Time
resp.jobs[0].execution_property.max_concurrent_runs #=> Integer
resp.jobs[0].command.name #=> String
resp.jobs[0].command.script_location #=> String
resp.jobs[0].command.python_version #=> String
resp.jobs[0].default_arguments #=> Hash
resp.jobs[0].default_arguments["GenericString"] #=> String
resp.jobs[0].non_overridable_arguments #=> Hash
resp.jobs[0].non_overridable_arguments["GenericString"] #=> String
resp.jobs[0].connections.connections #=> Array
resp.jobs[0].connections.connections[0] #=> String
resp.jobs[0].max_retries #=> Integer
resp.jobs[0].allocated_capacity #=> Integer
resp.jobs[0].timeout #=> Integer
resp.jobs[0].max_capacity #=> Float
resp.jobs[0].worker_type #=> String, one of "Standard", "G.1X", "G.2X"
resp.jobs[0].number_of_workers #=> Integer
resp.jobs[0].security_configuration #=> String
resp.jobs[0].notification_property.notify_delay_after #=> Integer
resp.jobs[0].glue_version #=> String
resp.next_token #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :next_token (String)

    A continuation token, if this is a continuation call.

  • :max_results (Integer)

    The maximum number of results to return in a single response.

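Returns:

  • (Types::GetJobsResponse)

    Returns a response object which responds to the following methods:

      • #jobs => Array<Types::Job>
      • #next_token => String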

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 4512

def get_jobs(params = {}, options = {})
  req = build_request(:get_jobs, params)
  req.send_request(options)
end
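
Because get_jobs is paginated, callers often want a flat stream of job definitions rather than individual pages. The sketch below wraps the :next_token handshake in a lazy Enumerator, so later pages are requested only when a consumer asks for more jobs. jobs_enum is a hypothetical helper, and FakeGlueJobs is a stand-in client used to keep the example self-contained; neither is part of the SDK.

```ruby
# Hypothetical helper (not part of aws-sdk-glue): yields jobs one at a time,
# requesting the next page only when the consumer needs more.
def jobs_enum(client, page_size: 25)
  Enumerator.new do |out|
    token = nil
    loop do
      params = { max_results: page_size }
      params[:next_token] = token if token
      resp = client.get_jobs(params)
      resp.jobs.each { |job| out << job }
      token = resp.next_token
      break if token.nil? || token.empty?
    end
  end
end

# Stand-in types so the sketch runs offline (assumptions, not SDK classes).
FakeJob      = Struct.new(:name, :worker_type)
FakeJobsPage = Struct.new(:jobs, :next_token)

class FakeGlueJobs
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Serves two fixed pages and counts how many requests were made.
  def get_jobs(params = {})
    @calls += 1
    if params[:next_token].nil?
      FakeJobsPage.new([FakeJob.new("nightly-etl", "G.1X")], "page-2")
    else
      FakeJobsPage.new([FakeJob.new("adhoc-report", "Standard")], nil)
    end
  end
end

client = FakeGlueJobs.new
first_job = jobs_enum(client).first(1)
puts first_job.map(&:name).inspect # prints ["nightly-etl"]
puts client.calls                  # prints 1 (the second page was never requested)
```

The design choice here is laziness: a consumer that stops early, for example after finding a job by name, avoids the remaining requests.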

#get_mapping(params = {}) ⇒ Types::GetMappingResponse

Creates mappings.

Examples:

Request syntax with placeholder values


resp = client.get_mapping({
  source: { # required
    database_name: "NameString", # required
    table_name: "NameString", # required
  },
  sinks: [
    {
      database_name: "NameString", # required
      table_name: "NameString", # required
    },
  ],
  location: {
    jdbc: [
      {
        name: "CodeGenArgName", # required
        value: "CodeGenArgValue", # required
        param: false,
      },
    ],
    s3: [
      {
        name: "CodeGenArgName", # required
        value: "CodeGenArgValue", # required
        param: false,
      },
    ],
    dynamo_db: [
      {
        name: "CodeGenArgName", # required
        value: "CodeGenArgValue", # required
        param: false,
      },
    ],
  },
})

Response structure


resp.mapping #=> Array
resp.mapping[0].source_table #=> String
resp.mapping[0].source_path #=> String
resp.mapping[0].source_type #=> String
resp.mapping[0].target_table #=> String
resp.mapping[0].target_path #=> String
resp.mapping[0].target_type #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

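Options Hash (params):

  • :source (required, Types::CatalogEntry)

    Specifies the source table.

  • :sinks (Array<Types::CatalogEntry>)

    A list of target tables.

  • :location (Types::Location)

    Parameters for the mapping.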

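Returns:

  • (Types::GetMappingResponse)

    Returns a response object which responds to the following methods:

      • #mapping => Array<Types::MappingEntry>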

# File 'gems/aws-sdk-glue/lib/aws-sdk-glue/client.rb', line 4916

def get_mapping(params = {}, options = {})
  req = build_request(:get_mapping, params)
  req.send_request(options)
end
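
The nested request hash for get_mapping is easy to mistype by hand. As an illustration, the sketch below builds the :source and :sinks portions from plain [database, table] pairs and formats one entry of resp.mapping for display. catalog_entry, mapping_params, and describe_mapping are hypothetical helpers, and FakeMappingEntry stands in for the entries returned by the service; the database and table names are placeholders.

```ruby
# Hypothetical helper: one catalog entry in the shape the request syntax expects.
def catalog_entry(database_name, table_name)
  { database_name: database_name, table_name: table_name }
end

# Hypothetical helper: builds get_mapping params from [database, table] pairs.
# :sinks is omitted when no sinks are given, since only :source is required.
def mapping_params(source:, sinks: [])
  params = { source: catalog_entry(*source) }
  params[:sinks] = sinks.map { |db, table| catalog_entry(db, table) } unless sinks.empty?
  params
end

# Hypothetical helper: renders one entry of resp.mapping as a single line.
def describe_mapping(entry)
  "#{entry.source_table}.#{entry.source_path} (#{entry.source_type}) -> " \
    "#{entry.target_table}.#{entry.target_path} (#{entry.target_type})"
end

# Stand-in for a mapping entry (assumption; mirrors the response structure fields).
FakeMappingEntry = Struct.new(:source_table, :source_path, :source_type,
                              :target_table, :target_path, :target_type)

params = mapping_params(source: %w[sales_db orders], sinks: [%w[warehouse_db orders_clean]])
puts params.inspect

entry = FakeMappingEntry.new("orders", "order_id", "string", "orders_clean", "id", "long")
puts describe_mapping(entry) # prints orders.order_id (string) -> orders_clean.id (long)
```

With a real client, the resulting hash would be passed directly as client.get_mapping(params).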

#get_ml_task_run(params = {}) ⇒ Types::GetMLTaskRunResponse

Gets details for a specific task run on a machine learning transform. Machine learning task runs are asynchronous tasks that AWS Glue runs on your behalf as part of various machine learning workflows. You can check the stats of any task run by calling GetMLTaskRun with the TaskRunID and its parent transform's TransformID.
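
Because task runs are asynchronous, a caller typically polls this operation until the run finishes. The following is a minimal polling sketch under stated assumptions: wait_for_ml_task_run is a hypothetical helper, the set of statuses treated as terminal is an inference from the status values in the response structure rather than something the SDK defines, and FakeGlueML is a stand-in client so the example runs offline.

```ruby
# Assumption: these statuses mean the task run will not change state again.
TERMINAL_ML_STATUSES = %w[STOPPED SUCCEEDED FAILED TIMEOUT].freeze

# Hypothetical helper (not part of aws-sdk-glue): polls get_ml_task_run until
# the status is terminal or max_attempts is exhausted. Returns the final
# status, or nil if the run was still in progress after the last attempt.
def wait_for_ml_task_run(client, transform_id, task_run_id,
                         interval: 30, max_attempts: 20, sleeper: method(:sleep))
  max_attempts.times do
    resp = client.get_ml_task_run(transform_id: transform_id, task_run_id: task_run_id)
    return resp.status if TERMINAL_ML_STATUSES.include?(resp.status)

    sleeper.call(interval)
  end
  nil
end

# Stand-in client that replays a fixed sequence of statuses (assumption, not an SDK class).
FakeTaskRun = Struct.new(:status)

class FakeGlueML
  def initialize(statuses)
    @statuses = statuses.dup
  end

  # Returns the next status, repeating the last one once the list runs out.
  def get_ml_task_run(_params = {})
    FakeTaskRun.new(@statuses.size > 1 ? @statuses.shift : @statuses.first)
  end
end

waits = []
status = wait_for_ml_task_run(
  FakeGlueML.new(%w[RUNNING RUNNING SUCCEEDED]),
  "tfm-placeholder", "tsk-placeholder",
  interval: 5, sleeper: ->(seconds) { waits << seconds }
)
puts status        # prints SUCCEEDED
puts waits.inspect # prints [5, 5]
```

The sleeper argument exists only to make the wait injectable for testing; in normal use the default Kernel#sleep applies.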

Examples:

Request syntax with placeholder values


resp = client.get_ml_task_run({
  transform_id: "HashString", # required
  task_run_id: "HashString", # required
})

Response structure


resp.transform_id #=> String
resp.task_run_id #=> String
resp.status #=> String, one of "STARTING", "RUNNING", "STOPPING", "STOPPED", "SUCCEEDED", "FAILED", "TIMEOUT"
resp.log_group_name #=> String
resp.properties.task_type #=> String, one of "EVALUATION", "LABELING_SET_GENERATION", "IMPORT_LABELS", "EXPORT_LABELS", "FIND_MATCHES"
resp.properties.import_labels_task_run_properties.input_s3_path #=> String
resp.properties.import_labels_task_run_properties.replace #=> Boolean
resp.properties.export_labels_task_run_properties.output_s3_path #=> String
resp.