Class: Aws::DataPipeline::Client

Inherits:
Seahorse::Client::Base
Includes:
ClientStubs
Defined in:
gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb

Overview

An API client for DataPipeline. To construct a client, you need to configure a :region and :credentials.

client = Aws::DataPipeline::Client.new(
  region: region_name,
  credentials: credentials,
  # ...
)

For details on configuring region and credentials see the developer guide.

See #initialize for a full list of supported configuration options.
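When :region is not passed explicitly, the constructor documentation lists several environment variables that are consulted in order. The environment part of that lookup can be sketched in plain Ruby. This is only an illustration, not the SDK's resolver: `region_from_env` is a hypothetical helper, and it deliberately ignores Aws.config and the shared configuration files.

```ruby
# Hypothetical helper: mirrors only the environment-variable part of the
# documented :region lookup. Aws.config and the ~/.aws files are not consulted.
def region_from_env(env = ENV)
  keys = %w[AWS_REGION AMAZON_REGION AWS_DEFAULT_REGION]
  key = keys.find { |name| !env[name].to_s.empty? }
  key ? env[key] : nil
end

# Example input; a real program would call region_from_env with no argument.
region = region_from_env({
  'AMAZON_REGION' => 'eu-west-1',
  'AWS_DEFAULT_REGION' => 'us-east-1'
})
# The result could then be passed explicitly:
#   client = Aws::DataPipeline::Client.new(region: region)
```

Passing the region explicitly like this makes the choice visible in code instead of depending on the process environment.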

Instance Attribute Summary

Attributes inherited from Seahorse::Client::Base

#config, #handlers

API Operations

Instance Method Summary

Methods included from ClientStubs

#api_requests, #stub_data, #stub_responses

Methods inherited from Seahorse::Client::Base

add_plugin, api, clear_plugins, define, new, #operation_names, plugins, remove_plugin, set_api, set_plugins

Methods included from Seahorse::Client::HandlerBuilder

#handle, #handle_request, #handle_response

Constructor Details

#initialize(options) ⇒ Client

Returns a new instance of Client.

Parameters:

  • options (Hash)

Options Hash (options):

  • :credentials (required, Aws::CredentialProvider)

    Your AWS credentials. This can be an instance of any one of the following classes:

    • Aws::Credentials - Used for configuring static, non-refreshing credentials.

    • Aws::SharedCredentials - Used for loading static credentials from a shared file, such as ~/.aws/config.

    • Aws::AssumeRoleCredentials - Used when you need to assume a role.

    • Aws::AssumeRoleWebIdentityCredentials - Used when you need to assume a role after providing credentials via the web.

    • Aws::SSOCredentials - Used for loading credentials from AWS SSO using an access token generated from aws login.

    • Aws::ProcessCredentials - Used for loading credentials from a process that outputs to stdout.

    • Aws::InstanceProfileCredentials - Used for loading credentials from an EC2 IMDS on an EC2 instance.

    • Aws::ECSCredentials - Used for loading credentials from instances running in ECS.

    • Aws::CognitoIdentityCredentials - Used for loading credentials from the Cognito Identity service.

    When :credentials are not configured directly, the following locations will be searched for credentials:

    • Aws.config[:credentials]
    • The :access_key_id, :secret_access_key, and :session_token options.
    • ENV['AWS_ACCESS_KEY_ID'], ENV['AWS_SECRET_ACCESS_KEY']
    • ~/.aws/credentials
    • ~/.aws/config
    • EC2/ECS IMDS instance profile - When used by default, the timeouts are very aggressive. Construct and pass an instance of Aws::InstanceProfileCredentials or Aws::ECSCredentials to enable retries and extended timeouts. Instance profile credential fetching can be disabled by setting ENV['AWS_EC2_METADATA_DISABLED'] to true.
  • :region (required, String)

    The AWS region to connect to. The configured :region is used to determine the service :endpoint. When not passed, a default :region is searched for in the following locations:

    • Aws.config[:region]
    • ENV['AWS_REGION']
    • ENV['AMAZON_REGION']
    • ENV['AWS_DEFAULT_REGION']
    • ~/.aws/credentials
    • ~/.aws/config
  • :access_key_id (String)
  • :active_endpoint_cache (Boolean) — default: false

    When set to true, a thread polling for endpoints will be running in the background every 60 secs (default). Defaults to false.

  • :adaptive_retry_wait_to_fill (Boolean) — default: true

    Used only in adaptive retry mode. When true, the request will sleep until there is sufficient client-side capacity to retry the request. When false, the request will raise a RetryCapacityNotAvailableError and will not retry instead of sleeping.

  • :client_side_monitoring (Boolean) — default: false

    When true, client-side metrics will be collected for all API requests from this client.

  • :client_side_monitoring_client_id (String) — default: ""

    Allows you to provide an identifier for this client which will be attached to all generated client side metrics. Defaults to an empty string.

  • :client_side_monitoring_host (String) — default: "127.0.0.1"

    Allows you to specify the DNS hostname or IPv4 or IPv6 address that the client side monitoring agent is running on, where client metrics will be published via UDP.

  • :client_side_monitoring_port (Integer) — default: 31000

    Required for publishing client metrics. The port that the client side monitoring agent is running on, where client metrics will be published via UDP.

  • :client_side_monitoring_publisher (Aws::ClientSideMonitoring::Publisher) — default: Aws::ClientSideMonitoring::Publisher

    Allows you to provide a custom client-side monitoring publisher class. By default, will use the Client Side Monitoring Agent Publisher.

  • :convert_params (Boolean) — default: true

    When true, an attempt is made to coerce request parameters into the required types.

  • :correct_clock_skew (Boolean) — default: true

    Used only in standard and adaptive retry modes. Specifies whether to apply a clock skew correction and retry requests with skewed client clocks.

  • :defaults_mode (String) — default: "legacy"

    See Aws::DefaultsModeConfiguration for a list of the accepted modes and the configuration defaults that are included.

  • :disable_host_prefix_injection (Boolean) — default: false

    Set to true to disable SDK automatically adding host prefix to default service endpoint when available.

  • :disable_request_compression (Boolean) — default: false

    When set to 'true' the request body will not be compressed for supported operations.

  • :endpoint (String)

    The client endpoint is normally constructed from the :region option. You should only configure an :endpoint when connecting to test or custom endpoints. This should be a valid HTTP(S) URI.

  • :endpoint_cache_max_entries (Integer) — default: 1000

    Used for the maximum size limit of the LRU cache storing endpoints data for endpoint discovery enabled operations. Defaults to 1000.

  • :endpoint_cache_max_threads (Integer) — default: 10

    Used for the maximum threads in use for polling endpoints to be cached, defaults to 10.

  • :endpoint_cache_poll_interval (Integer) — default: 60

    When :endpoint_discovery and :active_endpoint_cache are enabled, use this option to configure the time interval, in seconds, between requests that fetch endpoint information. Defaults to 60 seconds.

  • :endpoint_discovery (Boolean) — default: false

    When set to true, endpoint discovery will be enabled for operations when available.

  • :ignore_configured_endpoint_urls (Boolean)

    Setting to true disables use of endpoint URLs provided via environment variables and the shared configuration file.

  • :log_formatter (Aws::Log::Formatter) — default: Aws::Log::Formatter.default

    The log formatter.

  • :log_level (Symbol) — default: :info

    The log level to send messages to the :logger at.

  • :logger (Logger)

    The Logger instance to send log messages to. If this option is not set, logging will be disabled.

  • :max_attempts (Integer) — default: 3

    An integer representing the maximum number of attempts that will be made for a single request, including the initial attempt. For example, setting this value to 5 will result in a request being retried up to 4 times. Used in standard and adaptive retry modes.

  • :profile (String) — default: "default"

    Used when loading credentials from the shared credentials file at ~/.aws/credentials. When not specified, 'default' is used.

  • :request_min_compression_size_bytes (Integer) — default: 10240

    The minimum size in bytes that triggers compression for request bodies. The value must be a non-negative integer between 0 and 10485780 bytes inclusive.

  • :retry_backoff (Proc)

    A proc or lambda used for backoff. Defaults to 2**retries * retry_base_delay. This option is only used in the legacy retry mode.

  • :retry_base_delay (Float) — default: 0.3

    The base delay in seconds used by the default backoff function. This option is only used in the legacy retry mode.

  • :retry_jitter (Symbol) — default: :none

    A delay randomiser function used by the default backoff function. Some predefined functions can be referenced by name - :none, :equal, :full, otherwise a Proc that takes and returns a number. This option is only used in the legacy retry mode.

    See https://www.awsarchitectureblog.com/2015/03/backoff.html

  • :retry_limit (Integer) — default: 3

    The maximum number of times to retry failed requests. Only ~ 500 level server errors and certain ~ 400 level client errors are retried. Generally, these are throttling errors, data checksum errors, networking errors, timeout errors, auth errors, endpoint discovery, and errors from expired credentials. This option is only used in the legacy retry mode.

  • :retry_max_delay (Integer) — default: 0

    The maximum number of seconds to delay between retries (0 for no limit) used by the default backoff function. This option is only used in the legacy retry mode.

  • :retry_mode (String) — default: "legacy"

    Specifies which retry algorithm to use. Values are:

    • legacy - The pre-existing retry behavior. This is default value if no retry mode is provided.

    • standard - A standardized set of retry rules across the AWS SDKs. This includes support for retry quotas, which limit the number of unsuccessful retries a client can make.

    • adaptive - An experimental retry mode that includes all the functionality of standard mode along with automatic client side throttling. This is a provisional mode that may change behavior in the future.

  • :sdk_ua_app_id (String)

    A unique and opaque application ID that is appended to the User-Agent header as app/ followed by the ID. It should have a maximum length of 50.

  • :secret_access_key (String)
  • :session_token (String)
  • :simple_json (Boolean) — default: false

    Disables request parameter conversion, validation, and formatting. Also disable response data type conversions. This option is useful when you want to ensure the highest level of performance by avoiding overhead of walking request parameters and response data structures.

    When :simple_json is enabled, the request parameters hash must be formatted exactly as the DataPipeline API expects.

  • :stub_responses (Boolean) — default: false

    Causes the client to return stubbed responses. By default fake responses are generated and returned. You can specify the response data to return or errors to raise by calling ClientStubs#stub_responses. See ClientStubs for more information.

    Please note: when response stubbing is enabled, no HTTP requests are made, and retries are disabled.

  • :token_provider (Aws::TokenProvider)

    A Bearer Token Provider. This can be an instance of any one of the following classes:

    • Aws::StaticTokenProvider - Used for configuring static, non-refreshing tokens.

    • Aws::SSOTokenProvider - Used for loading tokens from AWS SSO using an access token generated from aws login.

    When :token_provider is not configured directly, the Aws::TokenProviderChain will be used to search for tokens configured for your profile in shared configuration files.

  • :use_dualstack_endpoint (Boolean)

    When set to true, dualstack enabled endpoints (with .aws TLD) will be used if available.

  • :use_fips_endpoint (Boolean)

    When set to true, fips compatible endpoints will be used if available. When a fips region is used, the region is normalized and this config is set to true.

  • :validate_params (Boolean) — default: true

    When true, request parameters are validated before sending the request.

  • :endpoint_provider (Aws::DataPipeline::EndpointProvider)

    The endpoint provider used to resolve endpoints. This can be any object that responds to #resolve_endpoint(parameters), where parameters is a Struct similar to Aws::DataPipeline::EndpointParameters.

  • :http_proxy (URI::HTTP, String)

    A proxy to send requests through. Formatted like 'http://proxy.com:123'.

  • :http_open_timeout (Float) — default: 15

    The number of seconds to wait when opening a HTTP session before raising a Timeout::Error.

  • :http_read_timeout (Float) — default: 60

    The default number of seconds to wait for response data. This value can safely be set per-request on the session.

  • :http_idle_timeout (Float) — default: 5

    The number of seconds a connection is allowed to sit idle before it is considered stale. Stale connections are closed and removed from the pool before making a request.

  • :http_continue_timeout (Float) — default: 1

    The number of seconds to wait for a 100-continue response before sending the request body. This option has no effect unless the request has "Expect" header set to "100-continue". Defaults to nil which disables this behaviour. This value can safely be set per request on the session.

  • :ssl_timeout (Float) — default: nil

    Sets the SSL timeout in seconds.

  • :http_wire_trace (Boolean) — default: false

    When true, HTTP debug output will be sent to the :logger.

  • :ssl_verify_peer (Boolean) — default: true

    When true, SSL peer certificates are verified when establishing a connection.

  • :ssl_ca_bundle (String)

    Full path to the SSL certificate authority bundle file that should be used when verifying peer certificates. If you do not pass :ssl_ca_bundle or :ssl_ca_directory, the system default will be used if available.

  • :ssl_ca_directory (String)

    Full path of the directory that contains the unbundled SSL certificate authority files for verifying peer certificates. If you do not pass :ssl_ca_bundle or :ssl_ca_directory, the system default will be used if available.



# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 395

def initialize(*args)
  super
end
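The :retry_backoff option above states that the default delay is 2**retries * retry_base_delay. The arithmetic can be sketched as follows. `legacy_backoff_delay` is a hypothetical helper, not an SDK function; the way :retry_max_delay caps the result is an assumption, and jitter is left out (equivalent to :none).

```ruby
# Sketch of the documented default backoff: 2**retries * retry_base_delay.
# Treating max_delay == 0 as "no limit" follows the :retry_max_delay text;
# applying the cap with `min` is an assumption. No jitter is applied.
def legacy_backoff_delay(retries, base_delay: 0.3, max_delay: 0)
  delay = (2**retries) * base_delay
  max_delay.positive? ? [delay, max_delay].min : delay
end

# Delays, in seconds, before the first three retries with the default settings.
delays = (0...3).map { |n| legacy_backoff_delay(n) }
```

The helper only illustrates how the delay grows. It is not written in the form that the :retry_backoff option expects.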

Instance Method Details

#activate_pipeline(params = {}) ⇒ Struct

Validates the specified pipeline and starts processing pipeline tasks. If the pipeline does not pass validation, activation fails.

If you need to pause the pipeline to investigate an issue with a component, such as a data source or script, call DeactivatePipeline.

To activate a finished pipeline, modify the end date for the pipeline and then activate it.

Examples:

Request syntax with placeholder values


resp = client.activate_pipeline({
  pipeline_id: "id", # required
  parameter_values: [
    {
      id: "fieldNameString", # required
      string_value: "fieldStringValue", # required
    },
  ],
  start_timestamp: Time.now,
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :pipeline_id (required, String)

    The ID of the pipeline.

  • :parameter_values (Array<Types::ParameterValue>)

    A list of parameter values to pass to the pipeline at activation.

  • :start_timestamp (Time, DateTime, Date, Integer, String)

    The date and time to resume the pipeline. By default, the pipeline resumes from the last completed execution.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 439

def activate_pipeline(params = {}, options = {})
  req = build_request(:activate_pipeline, params)
  req.send_request(options)
end
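As a usage sketch, parameter values are often easier to keep in a plain Hash and convert to the :parameter_values array at call time. Both helpers below are hypothetical. `client` is assumed to be any object that responds to #activate_pipeline, such as an Aws::DataPipeline::Client.

```ruby
# Hypothetical helper: turns a plain Hash into the :parameter_values array
# shown in the request syntax above.
def to_parameter_values(params)
  params.map { |id, value| { id: id.to_s, string_value: value.to_s } }
end

# Activates a pipeline, optionally from a given start time.
def activate_with_params(client, pipeline_id, params, start_at = nil)
  request = {
    pipeline_id: pipeline_id,
    parameter_values: to_parameter_values(params)
  }
  request[:start_timestamp] = start_at if start_at
  client.activate_pipeline(request)
end
```

A call might look like `activate_with_params(client, pipeline_id, { myInput: "s3://example-bucket/in" })`, where the parameter ID and the bucket are placeholders.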

#add_tags(params = {}) ⇒ Struct

Adds or modifies tags for the specified pipeline.

Examples:

Request syntax with placeholder values


resp = client.add_tags({
  pipeline_id: "id", # required
  tags: [ # required
    {
      key: "tagKey", # required
      value: "tagValue", # required
    },
  ],
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :pipeline_id (required, String)

    The ID of the pipeline.

  • :tags (required, Array<Types::Tag>)

    The tags to add, as key/value pairs.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 470

def add_tags(params = {}, options = {})
  req = build_request(:add_tags, params)
  req.send_request(options)
end
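A small sketch of tagging from a Hash, assuming `client` responds to #add_tags. The helper names `to_tags` and `tag_pipeline` are hypothetical.

```ruby
# Hypothetical helper: converts a Hash into the key/value tag list used by
# the :tags parameter.
def to_tags(hash)
  hash.map { |key, value| { key: key.to_s, value: value.to_s } }
end

# Adds or overwrites the given tags on one pipeline.
def tag_pipeline(client, pipeline_id, tags)
  client.add_tags({ pipeline_id: pipeline_id, tags: to_tags(tags) })
end
```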

#create_pipeline(params = {}) ⇒ Types::CreatePipelineOutput

Creates a new, empty pipeline. Use PutPipelineDefinition to populate the pipeline.

Examples:

Request syntax with placeholder values


resp = client.create_pipeline({
  name: "id", # required
  unique_id: "id", # required
  description: "string",
  tags: [
    {
      key: "tagKey", # required
      value: "tagValue", # required
    },
  ],
})

Response structure


resp.pipeline_id #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :name (required, String)

    The name for the pipeline. You can use the same name for multiple pipelines associated with your AWS account, because AWS Data Pipeline assigns each pipeline a unique pipeline identifier.

  • :unique_id (required, String)

    A unique identifier. This identifier is not the same as the pipeline identifier assigned by AWS Data Pipeline. You are responsible for defining the format and ensuring the uniqueness of this identifier. You use this parameter to ensure idempotency during repeated calls to CreatePipeline. For example, if the first call to CreatePipeline does not succeed, you can pass in the same unique identifier and pipeline name combination on a subsequent call to CreatePipeline. CreatePipeline ensures that if a pipeline already exists with the same name and unique identifier, a new pipeline is not created. Instead, you'll receive the pipeline identifier from the previous attempt. The uniqueness of the name and unique identifier combination is scoped to the AWS account or IAM user credentials.

  • :description (String)

    The description for the pipeline.

  • :tags (Array<Types::Tag>)

    A list of tags to associate with the pipeline at creation. Tags let you control access to pipelines. For more information, see Controlling User Access to Pipelines in the AWS Data Pipeline Developer Guide.

Returns:

  • (Types::CreatePipelineOutput)

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 536

def create_pipeline(params = {}, options = {})
  req = build_request(:create_pipeline, params)
  req.send_request(options)
end
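The idempotency behavior described for :unique_id can be used as in the following sketch: the identifier is generated once and reused on every retry, so a repeated call should return the existing pipeline ID instead of creating a second pipeline. `create_pipeline_once` is a hypothetical helper, and rescuing StandardError is a simplification; real code would rescue narrower error classes.

```ruby
require 'securerandom'

# Sketch: reuse one unique_id across retries of create_pipeline.
# `client` is assumed to respond to #create_pipeline and to return an object
# with a #pipeline_id member.
def create_pipeline_once(client, name:, unique_id: SecureRandom.uuid, attempts: 3)
  tries = 0
  begin
    tries += 1
    client.create_pipeline({ name: name, unique_id: unique_id }).pipeline_id
  rescue StandardError
    # Same name and unique_id on the next attempt, as the parameter text advises.
    retry if tries < attempts
    raise
  end
end
```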

#deactivate_pipeline(params = {}) ⇒ Struct

Deactivates the specified running pipeline. The pipeline is set to the DEACTIVATING state until the deactivation process completes.

To resume a deactivated pipeline, use ActivatePipeline. By default, the pipeline resumes from the last completed execution. Optionally, you can specify the date and time to resume the pipeline.

Examples:

Request syntax with placeholder values


resp = client.deactivate_pipeline({
  pipeline_id: "id", # required
  cancel_active: false,
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :pipeline_id (required, String)

    The ID of the pipeline.

  • :cancel_active (Boolean)

    Indicates whether to cancel any running objects. The default is true, which sets the state of any running objects to CANCELED. If this value is false, the pipeline is deactivated after all running objects finish.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 570

def deactivate_pipeline(params = {}, options = {})
  req = build_request(:deactivate_pipeline, params)
  req.send_request(options)
end
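A pause and resume cycle built from the two operations might look like the sketch below. The helper names are hypothetical, and `client` is assumed to respond to #deactivate_pipeline and #activate_pipeline.

```ruby
# Deactivates a pipeline. With cancel_active: false, running objects are
# allowed to finish first, as described for the :cancel_active parameter.
def pause_pipeline(client, pipeline_id, cancel_active: true)
  client.deactivate_pipeline({ pipeline_id: pipeline_id, cancel_active: cancel_active })
end

# Reactivates a pipeline. Without `from`, the service default applies and the
# pipeline resumes from the last completed execution.
def resume_pipeline(client, pipeline_id, from: nil)
  request = { pipeline_id: pipeline_id }
  request[:start_timestamp] = from if from
  client.activate_pipeline(request)
end
```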

#delete_pipeline(params = {}) ⇒ Struct

Deletes a pipeline, its pipeline definition, and its run history. AWS Data Pipeline attempts to cancel instances associated with the pipeline that are currently being processed by task runners.

Deleting a pipeline cannot be undone. You cannot query or restore a deleted pipeline. To temporarily pause a pipeline instead of deleting it, call SetStatus with the status set to PAUSE on individual components. Components that are paused by SetStatus can be resumed.

Examples:

Request syntax with placeholder values


resp = client.delete_pipeline({
  pipeline_id: "id", # required
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :pipeline_id (required, String)

    The ID of the pipeline.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 599

def delete_pipeline(params = {}, options = {})
  req = build_request(:delete_pipeline, params)
  req.send_request(options)
end

#describe_objects(params = {}) ⇒ Types::DescribeObjectsOutput

Gets the object definitions for a set of objects associated with the pipeline. Object definitions are composed of a set of fields that define the properties of the object.

The returned response is a pageable response and is Enumerable. For details on usage see PageableResponse.

Examples:

Request syntax with placeholder values


resp = client.describe_objects({
  pipeline_id: "id", # required
  object_ids: ["id"], # required
  evaluate_expressions: false,
  marker: "string",
})

Response structure


resp.pipeline_objects #=> Array
resp.pipeline_objects[0].id #=> String
resp.pipeline_objects[0].name #=> String
resp.pipeline_objects[0].fields #=> Array
resp.pipeline_objects[0].fields[0].key #=> String
resp.pipeline_objects[0].fields[0].string_value #=> String
resp.pipeline_objects[0].fields[0].ref_value #=> String
resp.marker #=> String
resp.has_more_results #=> Boolean

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :pipeline_id (required, String)

    The ID of the pipeline that contains the object definitions.

  • :object_ids (required, Array<String>)

    The IDs of the pipeline objects that contain the definitions to be described. You can pass as many as 25 identifiers in a single call to DescribeObjects.

  • :evaluate_expressions (Boolean)

    Indicates whether any expressions in the object should be evaluated when the object descriptions are returned.

  • :marker (String)

    The starting point for the results to be returned. For the first call, this value should be empty. As long as there are more results, continue to call DescribeObjects with the marker value from the previous call to retrieve the next set of results.

Returns:

  • (Types::DescribeObjectsOutput)

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 659

def describe_objects(params = {}, options = {})
  req = build_request(:describe_objects, params)
  req.send_request(options)
end
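Two limits from the parameter descriptions interact here: at most 25 object IDs per call, and paging through `marker` while `has_more_results` is true. The sketch below handles both by hand. `describe_all_objects` is a hypothetical helper, and `client` is assumed to respond to #describe_objects.

```ruby
# Sketch: fetch definitions for any number of object IDs by sending at most
# 25 IDs per call and following `marker` while `has_more_results` is true.
def describe_all_objects(client, pipeline_id, object_ids)
  object_ids.each_slice(25).flat_map do |batch|
    objects = []
    marker = nil
    loop do
      request = { pipeline_id: pipeline_id, object_ids: batch }
      request[:marker] = marker if marker
      resp = client.describe_objects(request)
      objects.concat(resp.pipeline_objects)
      break unless resp.has_more_results
      marker = resp.marker
    end
    objects
  end
end
```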

#describe_pipelines(params = {}) ⇒ Types::DescribePipelinesOutput

Retrieves metadata about one or more pipelines. The information retrieved includes the name of the pipeline, the pipeline identifier, its current state, and the user account that owns the pipeline. Using account credentials, you can retrieve metadata about pipelines that you or your IAM users have created. If you are using an IAM user account, you can retrieve metadata about only those pipelines for which you have read permissions.

To retrieve the full pipeline definition instead of metadata about the pipeline, call GetPipelineDefinition.

Examples:

Request syntax with placeholder values


resp = client.describe_pipelines({
  pipeline_ids: ["id"], # required
})

Response structure


resp.pipeline_description_list #=> Array
resp.pipeline_description_list[0].pipeline_id #=> String
resp.pipeline_description_list[0].name #=> String
resp.pipeline_description_list[0].fields #=> Array
resp.pipeline_description_list[0].fields[0].key #=> String
resp.pipeline_description_list[0].fields[0].string_value #=> String
resp.pipeline_description_list[0].fields[0].ref_value #=> String
resp.pipeline_description_list[0].description #=> String
resp.pipeline_description_list[0].tags #=> Array
resp.pipeline_description_list[0].tags[0].key #=> String
resp.pipeline_description_list[0].tags[0].value #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :pipeline_ids (required, Array<String>)

    The IDs of the pipelines to describe. You can pass as many as 25 identifiers in a single call. To obtain pipeline IDs, call ListPipelines.

Returns:

  • (Types::DescribePipelinesOutput)

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 708

def describe_pipelines(params = {}, options = {})
  req = build_request(:describe_pipelines, params)
  req.send_request(options)
end
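The `fields` array in each description is a list of key/value entries, which is awkward to query directly. The sketch below flattens it into a Hash and batches the request in groups of 25, matching the documented limit. Both helpers are hypothetical. The code assumes each field carries either `string_value` or `ref_value`, and if a key appears more than once, the last entry wins.

```ruby
# Sketch: flatten a list of field entries into a Hash keyed by field name.
def fields_to_hash(fields)
  fields.each_with_object({}) do |field, acc|
    acc[field.key] = field.string_value || field.ref_value
  end
end

# Returns one summary Hash per pipeline, requesting at most 25 IDs per call.
def pipeline_summaries(client, pipeline_ids)
  pipeline_ids.each_slice(25).flat_map do |batch|
    resp = client.describe_pipelines({ pipeline_ids: batch })
    resp.pipeline_description_list.map do |d|
      { id: d.pipeline_id, name: d.name, fields: fields_to_hash(d.fields) }
    end
  end
end
```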

#evaluate_expression(params = {}) ⇒ Types::EvaluateExpressionOutput

Task runners call EvaluateExpression to evaluate a string in the context of the specified object. For example, a task runner can evaluate SQL queries stored in Amazon S3.

Examples:

Request syntax with placeholder values


resp = client.evaluate_expression({
  pipeline_id: "id", # required
  object_id: "id", # required
  expression: "longString", # required
})

Response structure


resp.evaluated_expression #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :pipeline_id (required, String)

    The ID of the pipeline.

  • :object_id (required, String)

    The ID of the object.

  • :expression (required, String)

    The expression to evaluate.

Returns:

  • (Types::EvaluateExpressionOutput)

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 746

def evaluate_expression(params = {}, options = {})
  req = build_request(:evaluate_expression, params)
  req.send_request(options)
end

#get_pipeline_definition(params = {}) ⇒ Types::GetPipelineDefinitionOutput

Gets the definition of the specified pipeline. You can call GetPipelineDefinition to retrieve the pipeline definition that you provided using PutPipelineDefinition.

Examples:

Request syntax with placeholder values


resp = client.get_pipeline_definition({
  pipeline_id: "id", # required
  version: "string",
})

Response structure


resp.pipeline_objects #=> Array
resp.pipeline_objects[0].id #=> String
resp.pipeline_objects[0].name #=> String
resp.pipeline_objects[0].fields #=> Array
resp.pipeline_objects[0].fields[0].key #=> String
resp.pipeline_objects[0].fields[0].string_value #=> String
resp.pipeline_objects[0].fields[0].ref_value #=> String
resp.parameter_objects #=> Array
resp.parameter_objects[0].id #=> String
resp.parameter_objects[0].attributes #=> Array
resp.parameter_objects[0].attributes[0].key #=> String
resp.parameter_objects[0].attributes[0].string_value #=> String
resp.parameter_values #=> Array
resp.parameter_values[0].id #=> String
resp.parameter_values[0].string_value #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :pipeline_id (required, String)

    The ID of the pipeline.

  • :version (String)

    The version of the pipeline definition to retrieve. Set this parameter to latest (default) to use the last definition saved to the pipeline or active to use the last definition that was activated.

Returns:

  • (Types::GetPipelineDefinitionOutput)

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 798

def get_pipeline_definition(params = {}, options = {})
  req = build_request(:get_pipeline_definition, params)
  req.send_request(options)
end
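Because the response members mirror the request syntax of put_pipeline_definition, a definition can in principle be copied from one pipeline to another. The sketch below assumes that each response member responds to #to_h and yields a request-shaped Hash. `copy_definition` is a hypothetical helper.

```ruby
# Sketch: read the definition of one pipeline and write it to another.
# Array(...) guards against members that are nil in the response.
def copy_definition(client, from_id, to_id, version: 'latest')
  source = client.get_pipeline_definition({ pipeline_id: from_id, version: version })
  client.put_pipeline_definition({
    pipeline_id: to_id,
    pipeline_objects: Array(source.pipeline_objects).map(&:to_h),
    parameter_objects: Array(source.parameter_objects).map(&:to_h),
    parameter_values: Array(source.parameter_values).map(&:to_h)
  })
end
```

The target pipeline's existing definition is overwritten, so the response of the put call should be checked for validation errors before activating it.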

#list_pipelines(params = {}) ⇒ Types::ListPipelinesOutput

Lists the pipeline identifiers for all active pipelines that you have permission to access.

The returned response is a pageable response and is Enumerable. For details on usage see PageableResponse.

Examples:

Request syntax with placeholder values


resp = client.list_pipelines({
  marker: "string",
})

Response structure


resp.pipeline_id_list #=> Array
resp.pipeline_id_list[0].id #=> String
resp.pipeline_id_list[0].name #=> String
resp.marker #=> String
resp.has_more_results #=> Boolean

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :marker (String)

    The starting point for the results to be returned. For the first call, this value should be empty. As long as there are more results, continue to call ListPipelines with the marker value from the previous call to retrieve the next set of results.

Returns:

  • (Types::ListPipelinesOutput)

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 838

def list_pipelines(params = {}, options = {})
  req = build_request(:list_pipelines, params)
  req.send_request(options)
end
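The marker protocol described for :marker can be wrapped in an Enumerator so that callers iterate over pipelines without handling pages. This is a sketch with a hypothetical helper name, assuming `client` responds to #list_pipelines. Since the response is documented as pageable and Enumerable, the SDK may already provide page iteration; the sketch only makes the marker handling explicit.

```ruby
# Sketch: lazily yield every entry of pipeline_id_list across all pages.
def pipeline_enumerator(client)
  Enumerator.new do |yielder|
    marker = nil
    loop do
      resp = client.list_pipelines(marker ? { marker: marker } : {})
      resp.pipeline_id_list.each { |entry| yielder << entry }
      break unless resp.has_more_results
      marker = resp.marker
    end
  end
end
```

Because the enumerator is lazy, `pipeline_enumerator(client).first(10)` would stop requesting pages once ten entries have been seen.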

#poll_for_task(params = {}) ⇒ Types::PollForTaskOutput

Task runners call PollForTask to receive a task to perform from AWS Data Pipeline. The task runner specifies which tasks it can perform by setting a value for the workerGroup parameter. The task returned can come from any of the pipelines that match the workerGroup value passed in by the task runner and that were launched using the IAM user credentials specified by the task runner.

If tasks are ready in the work queue, PollForTask returns a response immediately. If no tasks are available in the queue, PollForTask uses long-polling and holds on to a poll connection for up to 90 seconds, during which time the first newly scheduled task is handed to the task runner. To accommodate this, set the socket timeout in your task runner to 90 seconds. The task runner should not call PollForTask again on the same workerGroup until it receives a response, and this can take up to 90 seconds.

Examples:

Request syntax with placeholder values


resp = client.poll_for_task({
  worker_group: "string", # required
  hostname: "id",
  instance_identity: {
    document: "string",
    signature: "string",
  },
})

Response structure


resp.task_object.task_id #=> String
resp.task_object.pipeline_id #=> String
resp.task_object.attempt_id #=> String
resp.task_object.objects #=> Hash
resp.task_object.objects["id"].id #=> String
resp.task_object.objects["id"].name #=> String
resp.task_object.objects["id"].fields #=> Array
resp.task_object.objects["id"].fields[0].key #=> String
resp.task_object.objects["id"].fields[0].string_value #=> String
resp.task_object.objects["id"].fields[0].ref_value #=> String

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :worker_group (required, String)

    The type of task the task runner is configured to accept and process. The worker group is set as a field on objects in the pipeline when they are created. You can only specify a single value for workerGroup in the call to PollForTask. There are no wildcard values permitted in workerGroup; the string must be an exact, case-sensitive match.

  • :hostname (String)

    The public DNS name of the calling task runner.

  • :instance_identity (Types::InstanceIdentity)

    Identity information for the EC2 instance that is hosting the task runner. You can get this value from the instance using http://169.254.169.254/latest/meta-data/instance-id. For more information, see Instance Metadata in the Amazon Elastic Compute Cloud User Guide. Passing in this value proves that your task runner is running on an EC2 instance, and ensures the proper AWS Data Pipeline service charges are applied to your pipeline.

Returns:

  • (Types::PollForTaskOutput)

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 915

def poll_for_task(params = {}, options = {})
  req = build_request(:poll_for_task, params)
  req.send_request(options)
end
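A sequential polling loop that respects the long-poll behavior described above might be sketched as follows. The helper names are hypothetical. The code assumes `task_object` is nil when a poll ends without a task. Since the documented default for :http_read_timeout is 60 seconds, a task runner would likely construct its client with a read timeout of at least 90 seconds.

```ruby
# Sketch of a single long-poll. PollForTask can hold the connection for up to
# 90 seconds, so the client is assumed to be built with a matching timeout:
#   Aws::DataPipeline::Client.new(http_read_timeout: 90)
# Returns the task object, or nil when the poll ended without a task.
def poll_once(client, worker_group:, hostname: nil)
  request = { worker_group: worker_group }
  request[:hostname] = hostname if hostname
  client.poll_for_task(request).task_object
end

# Polls one call at a time, so a second poll for the same worker group is
# never issued before the previous one has returned. `max_polls` bounds the
# loop. Returns the IDs of the tasks that were received.
def run_worker(client, worker_group:, max_polls:)
  handled = []
  max_polls.times do
    task = poll_once(client, worker_group: worker_group)
    next unless task

    handled << task.task_id
    yield task if block_given?
  end
  handled
end
```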

#put_pipeline_definition(params = {}) ⇒ Types::PutPipelineDefinitionOutput

Adds tasks, schedules, and preconditions to the specified pipeline. You can use PutPipelineDefinition to populate a new pipeline.

PutPipelineDefinition also validates the configuration as it adds it to the pipeline. Changes to the pipeline are saved unless one of the following validation errors exists in the pipeline.

  1. An object is missing a name or identifier field.
  2. A string or reference field is empty.
  3. The number of objects in the pipeline exceeds the maximum allowed objects.
  4. The pipeline is in a FINISHED state.

Pipeline object definitions are passed to the PutPipelineDefinition action and returned by the GetPipelineDefinition action.
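
The first two validation rules above can be checked locally before calling the service. The following is a sketch only: `lint_pipeline_objects` is a hypothetical helper, it assumes objects are plain hashes shaped like the request syntax, and it does not check the object-count limit or the pipeline state. The service remains the authority on validity.

```ruby
# Hypothetical pre-flight check (not part of the SDK). Returns a list of
# human-readable problems; an empty list means nothing was found locally.
def lint_pipeline_objects(pipeline_objects)
  problems = []
  pipeline_objects.each_with_index do |obj, index|
    # Fall back to the position when the object has no usable id.
    label = obj[:id].to_s.empty? ? "object ##{index}" : obj[:id]

    # Rule 1: every object needs an id and a name.
    problems << "#{label}: missing id" if obj[:id].to_s.empty?
    problems << "#{label}: missing name" if obj[:name].to_s.empty?

    # Rule 2: a string or reference field must not be empty.
    Array(obj[:fields]).each do |field|
      value = field[:string_value] || field[:ref_value]
      problems << "#{label}: field #{field[:key]} is empty" if value.to_s.empty?
    end
  end
  problems
end
```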

Examples:

Request syntax with placeholder values


resp = client.put_pipeline_definition({
  pipeline_id: "id", # required
  pipeline_objects: [ # required
    {
      id: "id", # required
      name: "id", # required
      fields: [ # required
        {
          key: "fieldNameString", # required
          string_value: "fieldStringValue",
          ref_value: "fieldNameString",
        },
      ],
    },
  ],
  parameter_objects: [
    {
      id: "fieldNameString", # required
      attributes: [ # required
        {
          key: "attributeNameString", # required
          string_value: "attributeValueString", # required
        },
      ],
    },
  ],
  parameter_values: [
    {
      id: "fieldNameString", # required
      string_value: "fieldStringValue", # required
    },
  ],
})

Response structure


resp.validation_errors #=> Array
resp.validation_errors[0].id #=> String
resp.validation_errors[0].errors #=> Array
resp.validation_errors[0].errors[0] #=> String
resp.validation_warnings #=> Array
resp.validation_warnings[0].id #=> String
resp.validation_warnings[0].warnings #=> Array
resp.validation_warnings[0].warnings[0] #=> String
resp.errored #=> Boolean

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :pipeline_id (required, String)

    The ID of the pipeline.

  • :pipeline_objects (required, Array<Types::PipelineObject>)

    The objects that define the pipeline. These objects overwrite the existing pipeline definition.

  • :parameter_objects (Array<Types::ParameterObject>)

    The parameter objects used with the pipeline.

  • :parameter_values (Array<Types::ParameterValue>)

    The parameter values used with the pipeline.

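Returns:

  • (Types::PutPipelineDefinitionOutput)

    Returns a response object that responds to #validation_errors, #validation_warnings, and #errored.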

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 1007

def put_pipeline_definition(params = {}, options = {})
  req = build_request(:put_pipeline_definition, params)
  req.send_request(options)
end

#query_objects(params = {}) ⇒ Types::QueryObjectsOutput

Queries the specified pipeline for the names of objects that match the specified set of conditions.

The returned response is a pageable response and is Enumerable. For details on usage see PageableResponse.

Examples:

Request syntax with placeholder values


resp = client.query_objects({
  pipeline_id: "id", # required
  query: {
    selectors: [
      {
        field_name: "string",
        operator: {
          type: "EQ", # accepts EQ, REF_EQ, LE, GE, BETWEEN
          values: ["string"],
        },
      },
    ],
  },
  sphere: "string", # required
  marker: "string",
  limit: 1,
})

Response structure


resp.ids #=> Array
resp.ids[0] #=> String
resp.marker #=> String
resp.has_more_results #=> Boolean

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :pipeline_id (required, String)

    The ID of the pipeline.

  • :query (Types::Query)

    The query that defines the objects to be returned. The Query object can contain a maximum of ten selectors. The conditions in the query are limited to top-level String fields in the object. These filters can be applied to components, instances, and attempts.

  • :sphere (required, String)

    Indicates whether the query applies to components, instances, or attempts. The possible values are COMPONENT, INSTANCE, and ATTEMPT.

  • :marker (String)

    The starting point for the results to be returned. For the first call, this value should be empty. As long as there are more results, continue to call QueryObjects with the marker value from the previous call to retrieve the next set of results.

  • :limit (Integer)

    The maximum number of object names that QueryObjects will return in a single call. The default value is 100.
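
The marker protocol described above can be sketched as a manual loop. `all_object_ids` is a hypothetical helper, not an SDK method, and `client` is assumed to behave like Aws::DataPipeline::Client. The pageable response mentioned earlier is an alternative to following markers by hand.

```ruby
# Hypothetical helper (not part of the SDK): collects object IDs across all
# pages by following the marker until has_more_results is false.
def all_object_ids(client, pipeline_id, sphere, query: nil, limit: nil)
  ids = []
  marker = nil
  loop do
    params = { pipeline_id: pipeline_id, sphere: sphere }
    params[:query] = query if query
    params[:limit] = limit if limit
    # The first call omits the marker; later calls reuse the previous one.
    params[:marker] = marker if marker

    resp = client.query_objects(params)
    ids.concat(resp.ids)
    break unless resp.has_more_results

    marker = resp.marker
  end
  ids
end
```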

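Returns:

  • (Types::QueryObjectsOutput)

    Returns a response object that responds to #ids, #marker, and #has_more_results.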

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 1077

def query_objects(params = {}, options = {})
  req = build_request(:query_objects, params)
  req.send_request(options)
end

#remove_tags(params = {}) ⇒ Struct

Removes existing tags from the specified pipeline.

Examples:

Request syntax with placeholder values


resp = client.remove_tags({
  pipeline_id: "id", # required
  tag_keys: ["string"], # required
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :pipeline_id (required, String)

    The ID of the pipeline.

  • :tag_keys (required, Array<String>)

    The keys of the tags to remove.

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 1103

def remove_tags(params = {}, options = {})
  req = build_request(:remove_tags, params)
  req.send_request(options)
end

#report_task_progress(params = {}) ⇒ Types::ReportTaskProgressOutput

Task runners call ReportTaskProgress when assigned a task to acknowledge that they have the task. If the web service does not receive this acknowledgement within 2 minutes, it assigns the task in a subsequent PollForTask call. After this initial acknowledgement, the task runner only needs to report progress every 15 minutes to maintain its ownership of the task. You can change this reporting time from 15 minutes by specifying a reportProgressTimeout field in your pipeline.

If a task runner does not report its status after 5 minutes, AWS Data Pipeline assumes that the task runner is unable to process the task and reassigns the task in a subsequent response to PollForTask. Task runners should call ReportTaskProgress every 60 seconds.
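
A minimal sketch of interleaving work with progress reports. `work_with_progress` is a hypothetical helper; it assumes the caller's block performs one bounded slice of work (comfortably shorter than the reporting interval) and returns a truthy value once the task is done.

```ruby
# Hypothetical helper (not part of the SDK). Acknowledges the task first,
# then alternates between progress reports and slices of work.
def work_with_progress(client, task_id)
  loop do
    resp = client.report_task_progress(task_id: task_id)
    # The service signals cancellation through the canceled flag.
    return :canceled if resp.canceled

    # The block does one slice of work and returns truthy when finished.
    return :finished if yield
  end
end
```

Reporting before the first slice of work keeps the initial acknowledgement from waiting on that slice.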

Examples:

Request syntax with placeholder values


resp = client.report_task_progress({
  task_id: "taskId", # required
  fields: [
    {
      key: "fieldNameString", # required
      string_value: "fieldStringValue",
      ref_value: "fieldNameString",
    },
  ],
})

Response structure


resp.canceled #=> Boolean

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :task_id (required, String)

    The ID of the task assigned to the task runner. This value is provided in the response for PollForTask.

  • :fields (Array<Types::Field>)

    Key-value pairs that define the properties of the ReportTaskProgressInput object.

Returns:

  • (Types::ReportTaskProgressOutput)

    Returns a response object that responds to #canceled.

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 1155

def report_task_progress(params = {}, options = {})
  req = build_request(:report_task_progress, params)
  req.send_request(options)
end

#report_task_runner_heartbeat(params = {}) ⇒ Types::ReportTaskRunnerHeartbeatOutput

Task runners call ReportTaskRunnerHeartbeat every 15 minutes to indicate that they are operational. If the AWS Data Pipeline Task Runner is launched on a resource managed by AWS Data Pipeline, the web service can use this call to detect when the task runner application has failed and restart a new instance.
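
A sketch of a heartbeat loop, under stated assumptions: `heartbeat_until_terminated` is a hypothetical helper, the `interval` default mirrors the 15 minutes stated above, and `sleeper` is an injectable callable added here so the loop can be exercised without real delays.

```ruby
# Hypothetical helper (not part of the SDK). Sends heartbeats until the
# service asks the task runner to terminate; returns the number sent.
def heartbeat_until_terminated(client, taskrunner_id, worker_group: nil,
                               hostname: nil, interval: 15 * 60,
                               sleeper: ->(seconds) { sleep(seconds) })
  params = { taskrunner_id: taskrunner_id }
  params[:worker_group] = worker_group if worker_group
  params[:hostname] = hostname if hostname

  beats = 0
  loop do
    resp = client.report_task_runner_heartbeat(params)
    beats += 1
    # A true terminate flag means the runner should shut down.
    return beats if resp.terminate

    sleeper.call(interval)
  end
end
```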

Examples:

Request syntax with placeholder values


resp = client.report_task_runner_heartbeat({
  taskrunner_id: "id", # required
  worker_group: "string",
  hostname: "id",
})

Response structure


resp.terminate #=> Boolean

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :taskrunner_id (required, String)

    The ID of the task runner. This value should be unique across your AWS account. In the case of AWS Data Pipeline Task Runner launched on a resource managed by AWS Data Pipeline, the web service provides a unique identifier when it launches the application. If you have written a custom task runner, you should assign a unique identifier for the task runner.

  • :worker_group (String)

    The type of task the task runner is configured to accept and process. The worker group is set as a field on objects in the pipeline when they are created. You can only specify a single value for workerGroup. There are no wildcard values permitted in workerGroup; the string must be an exact, case-sensitive match.

  • :hostname (String)

    The public DNS name of the task runner.

Returns:

  • (Types::ReportTaskRunnerHeartbeatOutput)

    Returns a response object that responds to #terminate.

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 1204

def report_task_runner_heartbeat(params = {}, options = {})
  req = build_request(:report_task_runner_heartbeat, params)
  req.send_request(options)
end

#set_status(params = {}) ⇒ Struct

Requests that the status of the specified physical or logical pipeline objects be updated in the specified pipeline. This update might not occur immediately, but is eventually consistent. The status that can be set depends on the type of object (for example, DataNode or Activity). You cannot perform this operation on FINISHED pipelines; attempting to do so returns InvalidRequestException.

Examples:

Request syntax with placeholder values


resp = client.set_status({
  pipeline_id: "id", # required
  object_ids: ["id"], # required
  status: "string", # required
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :pipeline_id (required, String)

    The ID of the pipeline that contains the objects.

  • :object_ids (required, Array<String>)

    The IDs of the objects. The corresponding objects can be either physical or components, but not a mix of both types.

  • :status (required, String)

    The status to be set on all the objects specified in objectIds. For components, use PAUSE or RESUME. For instances, use TRY_CANCEL, RERUN, or MARK_FINISHED.
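
Because the accepted status depends on the kind of object, a caller may want to reject mismatches before sending the request. This is a sketch only: `set_object_status` and its `kind:` argument are hypothetical, and the allowed values are copied from the :status description above.

```ruby
# Hypothetical wrapper (not part of the SDK). Validates the status against
# the documented values for the given kind, then delegates to set_status.
def set_object_status(client, pipeline_id, object_ids, status, kind:)
  allowed = {
    component: %w[PAUSE RESUME],
    instance: %w[TRY_CANCEL RERUN MARK_FINISHED]
  }.fetch(kind) { raise ArgumentError, "unknown kind: #{kind.inspect}" }

  unless allowed.include?(status)
    raise ArgumentError,
          "#{status} is not valid for #{kind}; expected one of #{allowed.join(', ')}"
  end

  client.set_status(pipeline_id: pipeline_id, object_ids: object_ids, status: status)
end
```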

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 1242

def set_status(params = {}, options = {})
  req = build_request(:set_status, params)
  req.send_request(options)
end

#set_task_status(params = {}) ⇒ Struct

Task runners call SetTaskStatus to notify AWS Data Pipeline that a task is completed and provide information about the final status. A task runner makes this call regardless of whether the task was successful. A task runner does not need to call SetTaskStatus for tasks that are canceled by the web service during a call to ReportTaskProgress.

Examples:

Request syntax with placeholder values


resp = client.set_task_status({
  task_id: "taskId", # required
  task_status: "FINISHED", # required, accepts FINISHED, FAILED, FALSE
  error_id: "string",
  error_message: "errorMessage",
  error_stack_trace: "string",
})

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :task_id (required, String)

    The ID of the task assigned to the task runner. This value is provided in the response for PollForTask.

  • :task_status (required, String)

    If FINISHED, the task successfully completed. If FAILED, the task ended unsuccessfully. Preconditions use false.

  • :error_id (String)

    If an error occurred during the task, this value specifies the error code. This value is set on the physical attempt object. It is used to display error information to the user. It should not start with the string "Service_", which is reserved by the system.

  • :error_message (String)

    If an error occurred during the task, this value specifies a text description of the error. This value is set on the physical attempt object. It is used to display error information to the user. The web service does not parse this value.

  • :error_stack_trace (String)

    If an error occurred during the task, this value specifies the stack trace associated with the error. This value is set on the physical attempt object. It is used to display error information to the user. The web service does not parse this value.
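
A minimal sketch of reporting the final status of a task. `complete_task` is a hypothetical helper; using the exception class name as error_id is an assumption made for illustration, not a convention required by the service.

```ruby
# Hypothetical helper (not part of the SDK). Runs the caller's block and
# reports FINISHED on success or FAILED, with error details, on failure.
def complete_task(client, task_id)
  begin
    yield
  rescue StandardError => e
    client.set_task_status(
      task_id: task_id,
      task_status: "FAILED",
      error_id: e.class.name, # assumption: class name used as the error code
      error_message: e.message,
      error_stack_trace: Array(e.backtrace).join("\n")
    )
    return false
  end

  # Reported outside the rescue so a failure of this call is not
  # misreported as a task failure.
  client.set_task_status(task_id: task_id, task_status: "FINISHED")
  true
end
```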

Returns:

  • (Struct)

    Returns an empty response.

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 1296

def set_task_status(params = {}, options = {})
  req = build_request(:set_task_status, params)
  req.send_request(options)
end

#validate_pipeline_definition(params = {}) ⇒ Types::ValidatePipelineDefinitionOutput

Validates the specified pipeline definition to ensure that it is well formed and can be run without error.

Examples:

Request syntax with placeholder values


resp = client.validate_pipeline_definition({
  pipeline_id: "id", # required
  pipeline_objects: [ # required
    {
      id: "id", # required
      name: "id", # required
      fields: [ # required
        {
          key: "fieldNameString", # required
          string_value: "fieldStringValue",
          ref_value: "fieldNameString",
        },
      ],
    },
  ],
  parameter_objects: [
    {
      id: "fieldNameString", # required
      attributes: [ # required
        {
          key: "attributeNameString", # required
          string_value: "attributeValueString", # required
        },
      ],
    },
  ],
  parameter_values: [
    {
      id: "fieldNameString", # required
      string_value: "fieldStringValue", # required
    },
  ],
})

Response structure


resp.validation_errors #=> Array
resp.validation_errors[0].id #=> String
resp.validation_errors[0].errors #=> Array
resp.validation_errors[0].errors[0] #=> String
resp.validation_warnings #=> Array
resp.validation_warnings[0].id #=> String
resp.validation_warnings[0].warnings #=> Array
resp.validation_warnings[0].warnings[0] #=> String
resp.errored #=> Boolean

Parameters:

  • params (Hash) (defaults to: {})

    ({})

Options Hash (params):

  • :pipeline_id (required, String)

    The ID of the pipeline.

  • :pipeline_objects (required, Array<Types::PipelineObject>)

    The objects that define the pipeline changes to validate against the pipeline.

  • :parameter_objects (Array<Types::ParameterObject>)

    The parameter objects used with the pipeline.

  • :parameter_values (Array<Types::ParameterValue>)

    The parameter values used with the pipeline.
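
The response structure above can be flattened into a short report. This is a sketch under stated assumptions: `validation_report` is a hypothetical helper and `client` is any object that behaves like Aws::DataPipeline::Client.

```ruby
# Hypothetical helper (not part of the SDK). Validates a definition and
# returns a hash with an ok flag plus flattened error and warning messages.
def validation_report(client, pipeline_id, pipeline_objects)
  resp = client.validate_pipeline_definition(
    pipeline_id: pipeline_id,
    pipeline_objects: pipeline_objects
  )

  {
    ok: !resp.errored,
    # Each entry groups messages under the ID of the offending object.
    errors: Array(resp.validation_errors).flat_map do |entry|
      entry.errors.map { |message| "#{entry.id}: #{message}" }
    end,
    warnings: Array(resp.validation_warnings).flat_map do |entry|
      entry.warnings.map { |message| "#{entry.id}: #{message}" }
    end
  }
end
```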

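Returns:

  • (Types::ValidatePipelineDefinitionOutput)

    Returns a response object that responds to #validation_errors, #validation_warnings, and #errored.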

# File 'gems/aws-sdk-datapipeline/lib/aws-sdk-datapipeline/client.rb', line 1375

def validate_pipeline_definition(params = {}, options = {})
  req = build_request(:validate_pipeline_definition, params)
  req.send_request(options)
end