You are viewing documentation for version 1 of the AWS SDK for Ruby. Version 2 documentation can be found here.

Class: AWS::EMR::Client::V20090331

Inherits:
AWS::EMR::Client show all
Defined in:
lib/aws/emr/client.rb

Constant Summary

Constant Summary

Constants inherited from AWS::EMR::Client

API_VERSION

Instance Attribute Summary

Attributes inherited from Core::Client

#config

Instance Method Summary collapse

Methods inherited from Core::Client

#initialize, #log_warning, #operations, #with_http_handler, #with_options

Constructor Details

This class inherits a constructor from AWS::Core::Client

Instance Method Details

#add_instance_groups(options = {}) ⇒ Core::Response

Calls the AddInstanceGroups API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :instance_groups - required - (Array<) Instance Groups to add.
      • :name - (String) Friendly name given to the instance group.
      • :market - (String) Market type of the Amazon EC2 instances used to create a cluster node. Valid values include:
        • ON_DEMAND
        • SPOT
      • :instance_role - required - (String) The role of the instance group in the cluster. Valid values include:
        • MASTER
        • CORE
        • TASK
      • :bid_price - (String) Bid price for each Amazon EC2 instance in the instance group when launching nodes as Spot Instances, expressed in USD.
      • :instance_type - required - (String) The Amazon EC2 instance type for all instances in the instance group.
      • :instance_count - required - (Integer) Target number of instances for the instance group.
    • :job_flow_id - required - (String) Job flow in which to add the instance groups.

Returns:

  • (Core::Response)

    The #data method of the response object returns a hash with the following structure:

    • :job_flow_id - (String)
    • :instance_group_ids - (Array)

#add_job_flow_steps(options = {}) ⇒ Core::Response

Calls the AddJobFlowSteps API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :job_flow_id - required - (String) A string that uniquely identifies the job flow. This identifier is returned by RunJobFlow and can also be obtained from ListClusters.
    • :steps - required - (Array<) A list of StepConfig to be executed by the job flow.
      • :name - required - (String) The name of the job flow step.
      • :action_on_failure - (String) The action to take if the job flow step fails. Valid values include:
        • TERMINATE_JOB_FLOW
        • TERMINATE_CLUSTER
        • CANCEL_AND_WAIT
        • CONTINUE
      • :hadoop_jar_step - required - (Hash) The JAR file used for the job flow step.
        • :properties - (Array<) A list of Java properties that are set when the step runs. You can use these properties to pass key value pairs to your main function.
          • :key - (String) The unique identifier of a key value pair.
          • :value - (String) The value part of the identified key.
        • :jar - required - (String) A path to a JAR file run during the step.
        • :main_class - (String) The name of the main class in the specified Java file. If not specified, the JAR file should specify a Main-Class in its manifest file.
        • :args - (Array<) A list of command line arguments passed to the JAR file's main function when executed.

Returns:

  • (Core::Response)

    The #data method of the response object returns a hash with the following structure:

    • :step_ids - (Array)

#add_tags(options = {}) ⇒ Core::Response

Calls the AddTags API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :resource_id - required - (String) The Amazon EMR resource identifier to which tags will be added. This value must be a cluster identifier.
    • :tags - required - (Array<) A list of tags to associate with a cluster and propagate to Amazon EC2 instances. Tags are user-defined key/value pairs that consist of a required key string with a maximum of 128 characters, and an optional value string with a maximum of 256 characters.
      • :key - (String) A user-defined key, which is the minimum required information for a valid tag. For more information, see Tagging Amazon EMR Resources.
      • :value - (String) A user-defined value, which is optional in a tag. For more information, see Tagging Amazon EMR Resources.

Returns:

#describe_cluster(options = {}) ⇒ Core::Response

Calls the DescribeCluster API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :cluster_id - required - (String) The identifier of the cluster to describe.

Returns:

  • (Core::Response)

    The #data method of the response object returns a hash with the following structure:

    • :cluster - (Hash)
      • :id - (String)
      • :name - (String)
      • :status - (Hash)
        • :state - (String)
        • :state_change_reason - (Hash)
          • :code - (String)
          • :message - (String)
        • :timeline - (Hash)
          • :creation_date_time - (Time)
          • :ready_date_time - (Time)
          • :end_date_time - (Time)
      • :ec_2_instance_attributes - (Hash)
        • :ec2_key_name - (String)
        • :ec2_subnet_id - (String)
        • :ec_2_availability_zone - (String)
        • :iam_instance_profile - (String)
      • :log_uri - (String)
      • :requested_ami_version - (String)
      • :running_ami_version - (String)
      • :auto_terminate - (Boolean)
      • :termination_protected - (Boolean)
      • :visible_to_all_users - (Boolean)
      • :applications - (Array)
        • :name - (String)
        • :version - (String)
        • :args - (Array)
        • :additional_info - (Hash<String,String>)
      • :tags - (Array)
        • :key - (String)
        • :value - (String)
      • :service_role - (String)

#describe_job_flows(options = {}) ⇒ Core::Response

Calls the DescribeJobFlows API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :created_after - (String<) Return only job flows created after this date and time.
    • :created_before - (String<) Return only job flows created before this date and time.
    • :job_flow_ids - (Array<) Return only job flows whose job flow ID is contained in this list.
    • :job_flow_states - (Array<) Return only job flows whose state is contained in this list.

Returns:

  • (Core::Response)

    The #data method of the response object returns a hash with the following structure:

    • :job_flows - (Array)
      • :job_flow_id - (String)
      • :name - (String)
      • :log_uri - (String)
      • :ami_version - (String)
      • :execution_status_detail - (Hash)
        • :state - (String)
        • :creation_date_time - (Time)
        • :start_date_time - (Time)
        • :ready_date_time - (Time)
        • :end_date_time - (Time)
        • :last_state_change_reason - (String)
      • :instances - (Hash)
        • :master_instance_type - (String)
        • :master_public_dns_name - (String)
        • :master_instance_id - (String)
        • :slave_instance_type - (String)
        • :instance_count - (Integer)
        • :instance_groups - (Array)
          • :instance_group_id - (String)
          • :name - (String)
          • :market - (String)
          • :instance_role - (String)
          • :bid_price - (String)
          • :instance_type - (String)
          • :instance_request_count - (Integer)
          • :instance_running_count - (Integer)
          • :state - (String)
          • :last_state_change_reason - (String)
          • :creation_date_time - (Time)
          • :start_date_time - (Time)
          • :ready_date_time - (Time)
          • :end_date_time - (Time)
        • :normalized_instance_hours - (Integer)
        • :ec2_key_name - (String)
        • :ec2_subnet_id - (String)
        • :placement - (Hash)
          • :availability_zone - (String)
        • :keep_job_flow_alive_when_no_steps - (Boolean)
        • :termination_protected - (Boolean)
        • :hadoop_version - (String)
      • :steps - (Array)
        • :step_config - (Hash)
          • :name - (String)
          • :action_on_failure - (String)
          • :hadoop_jar_step - (Hash)
            • :properties - (Array)
              • :key - (String)
              • :value - (String)
            • :jar - (String)
            • :main_class - (String)
            • :args - (Array)
        • :execution_status_detail - (Hash)
          • :state - (String)
          • :creation_date_time - (Time)
          • :start_date_time - (Time)
          • :end_date_time - (Time)
          • :last_state_change_reason - (String)
      • :bootstrap_actions - (Array)
        • :bootstrap_action_config - (Hash)
          • :name - (String)
          • :script_bootstrap_action - (Hash)
            • :path - (String)
            • :args - (Array)
      • :supported_products - (Array)
      • :visible_to_all_users - (Boolean)
      • :job_flow_role - (String)
      • :service_role - (String)

#describe_step(options = {}) ⇒ Core::Response

Calls the DescribeStep API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :cluster_id - required - (String) The identifier of the cluster with steps to describe.
    • :step_id - required - (String) The identifier of the step to describe.

Returns:

  • (Core::Response)

    The #data method of the response object returns a hash with the following structure:

    • :step - (Hash)
      • :id - (String)
      • :name - (String)
      • :config - (Hash)
        • :jar - (String)
        • :properties - (Hash<String,String>)
        • :main_class - (String)
        • :args - (Array)
      • :action_on_failure - (String)
      • :status - (Hash)
        • :state - (String)
        • :state_change_reason - (Hash)
          • :code - (String)
          • :message - (String)
        • :timeline - (Hash)
          • :creation_date_time - (Time)
          • :start_date_time - (Time)
          • :end_date_time - (Time)

#list_bootstrap_actions(options = {}) ⇒ Core::Response

Calls the ListBootstrapActions API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :cluster_id - required - (String) The cluster identifier for the bootstrap actions to list .
    • :marker - (String) The pagination token that indicates the next set of results to retrieve .

Returns:

  • (Core::Response)

    The #data method of the response object returns a hash with the following structure:

    • :bootstrap_actions - (Array)
      • :name - (String)
      • :script_path - (String)
      • :args - (Array)
    • :marker - (String)

#list_clusters(options = {}) ⇒ Core::Response

Calls the ListClusters API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :created_after - (String<) The creation date and time beginning value filter for listing clusters .
    • :created_before - (String<) The creation date and time end value filter for listing clusters .
    • :cluster_states - (Array<) The cluster state filters to apply when listing clusters.
    • :marker - (String) The pagination token that indicates the next set of results to retrieve.

Returns:

  • (Core::Response)

    The #data method of the response object returns a hash with the following structure:

    • :clusters - (Array)
      • :id - (String)
      • :name - (String)
      • :status - (Hash)
        • :state - (String)
        • :state_change_reason - (Hash)
          • :code - (String)
          • :message - (String)
        • :timeline - (Hash)
          • :creation_date_time - (Time)
          • :ready_date_time - (Time)
          • :end_date_time - (Time)
    • :marker - (String)

#list_instance_groups(options = {}) ⇒ Core::Response

Calls the ListInstanceGroups API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :cluster_id - required - (String) The identifier of the cluster for which to list the instance groups.
    • :marker - (String) The pagination token that indicates the next set of results to retrieve.

Returns:

  • (Core::Response)

    The #data method of the response object returns a hash with the following structure:

    • :instance_groups - (Array)
      • :id - (String)
      • :name - (String)
      • :market - (String)
      • :instance_group_type - (String)
      • :bid_price - (String)
      • :instance_type - (String)
      • :requested_instance_count - (Integer)
      • :running_instance_count - (Integer)
      • :status - (Hash)
        • :state - (String)
        • :state_change_reason - (Hash)
          • :code - (String)
          • :message - (String)
        • :timeline - (Hash)
          • :creation_date_time - (Time)
          • :ready_date_time - (Time)
          • :end_date_time - (Time)
    • :marker - (String)

#list_instances(options = {}) ⇒ Core::Response

Calls the ListInstances API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :cluster_id - required - (String) The identifier of the cluster for which to list the instances.
    • :instance_group_id - (String) The identifier of the instance group for which to list the instances.
    • :instance_group_types - (Array<) The type of instance group for which to list the instances.
    • :marker - (String) The pagination token that indicates the next set of results to retrieve.

Returns:

  • (Core::Response)

    The #data method of the response object returns a hash with the following structure:

    • :instances - (Array)
      • :id - (String)
      • :ec2_instance_id - (String)
      • :public_dns_name - (String)
      • :public_ip_address - (String)
      • :private_dns_name - (String)
      • :private_ip_address - (String)
      • :status - (Hash)
        • :state - (String)
        • :state_change_reason - (Hash)
          • :code - (String)
          • :message - (String)
        • :timeline - (Hash)
          • :creation_date_time - (Time)
          • :ready_date_time - (Time)
          • :end_date_time - (Time)
    • :marker - (String)

#list_steps(options = {}) ⇒ Core::Response

Calls the ListSteps API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :cluster_id - required - (String) The identifier of the cluster for which to list the steps.
    • :step_states - (Array<) The filter to limit the step list based on certain states.
    • :marker - (String) The pagination token that indicates the next set of results to retrieve.

Returns:

  • (Core::Response)

    The #data method of the response object returns a hash with the following structure:

    • :steps - (Array)
      • :id - (String)
      • :name - (String)
      • :status - (Hash)
        • :state - (String)
        • :state_change_reason - (Hash)
          • :code - (String)
          • :message - (String)
        • :timeline - (Hash)
          • :creation_date_time - (Time)
          • :start_date_time - (Time)
          • :end_date_time - (Time)
    • :marker - (String)

#modify_instance_groups(options = {}) ⇒ Core::Response

Calls the ModifyInstanceGroups API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :instance_groups - (Array<) Instance groups to change.
      • :instance_group_id - required - (String) Unique ID of the instance group to expand or shrink.
      • :instance_count - (Integer) Target size for the instance group.
      • :ec2_instance_ids_to_terminate - (Array<) The EC2 InstanceIds to terminate. For advanced users only. Once you terminate the instances, the instance group will not return to its original requested size.

Returns:

#remove_tags(options = {}) ⇒ Core::Response

Calls the RemoveTags API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :resource_id - required - (String) The Amazon EMR resource identifier from which tags will be removed. This value must be a cluster identifier.
    • :tag_keys - required - (Array<) A list of tag keys to remove from a resource.

Returns:

#run_job_flow(options = {}) ⇒ Core::Response

Calls the RunJobFlow API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :name - required - (String) The name of the job flow.
    • :log_uri - (String) The location in Amazon S3 to write the log files of the job flow. If a value is not provided, logs are not created.
    • :additional_info - (String) A JSON string for selecting additional features.
    • :ami_version - (String) The version of the Amazon Machine Image (AMI) to use when launching Amazon EC2 instances in the job flow. The following values are valid: "latest" (uses the latest AMI) The version number of the AMI to use, for example, "2.0" If the AMI supports multiple versions of Hadoop (for example, AMI 1.0 supports both Hadoop 0.18 and 0.20) you can use the JobFlowInstancesConfig HadoopVersion parameter to modify the version of Hadoop from the defaults shown above. For details about the AMI versions currently supported by Amazon Elastic MapReduce, go to AMI Versions Supported in Elastic MapReduce in the Amazon Elastic MapReduce Developer's Guide.
    • :instances - required - (Hash) A specification of the number and type of Amazon EC2 instances on which to run the job flow.
      • :master_instance_type - (String) The EC2 instance type of the master node.
      • :slave_instance_type - (String) The EC2 instance type of the slave nodes.
      • :instance_count - (Integer) The number of Amazon EC2 instances used to execute the job flow.
      • :instance_groups - (Array<) Configuration for the job flow's instance groups.
        • :name - (String) Friendly name given to the instance group.
        • :market - (String) Market type of the Amazon EC2 instances used to create a cluster node. Valid values include:
          • ON_DEMAND
          • SPOT
        • :instance_role - required - (String) The role of the instance group in the cluster. Valid values include:
          • MASTER
          • CORE
          • TASK
        • :bid_price - (String) Bid price for each Amazon EC2 instance in the instance group when launching nodes as Spot Instances, expressed in USD.
        • :instance_type - required - (String) The Amazon EC2 instance type for all instances in the instance group.
        • :instance_count - required - (Integer) Target number of instances for the instance group.
      • :ec2_key_name - (String) The name of the Amazon EC2 key pair that can be used to ssh to the master node as the user called "hadoop."
      • :placement - (Hash) The Availability Zone the job flow will run in.
        • :availability_zone - required - (String) The Amazon EC2 Availability Zone for the job flow.
      • :keep_job_flow_alive_when_no_steps - (Boolean) Specifies whether the job flow should terminate after completing all steps.
      • :termination_protected - (Boolean) Specifies whether to lock the job flow to prevent the Amazon EC2 instances from being terminated by API call, user intervention, or in the event of a job flow error.
      • :hadoop_version - (String) The Hadoop version for the job flow. Valid inputs are "0.18", "0.20", or "0.20.205". If you do not set this value, the default of 0.18 is used, unless the AmiVersion parameter is set in the RunJobFlow call, in which case the default version of Hadoop for that AMI version is used.
      • :ec2_subnet_id - (String) To launch the job flow in Amazon Virtual Private Cloud (Amazon VPC), set this parameter to the identifier of the Amazon VPC subnet where you want the job flow to launch. If you do not specify this value, the job flow is launched in the normal Amazon Web Services cloud, outside of an Amazon VPC. Amazon VPC currently does not support cluster compute quadruple extra large (cc1.4xlarge) instances. Thus you cannot specify the cc1.4xlarge instance type for nodes of a job flow launched in a Amazon VPC.
    • :steps - (Array<) A list of steps to be executed by the job flow.
      • :name - required - (String) The name of the job flow step.
      • :action_on_failure - (String) The action to take if the job flow step fails. Valid values include:
        • TERMINATE_JOB_FLOW
        • TERMINATE_CLUSTER
        • CANCEL_AND_WAIT
        • CONTINUE
      • :hadoop_jar_step - required - (Hash) The JAR file used for the job flow step.
        • :properties - (Array<) A list of Java properties that are set when the step runs. You can use these properties to pass key value pairs to your main function.
          • :key - (String) The unique identifier of a key value pair.
          • :value - (String) The value part of the identified key.
        • :jar - required - (String) A path to a JAR file run during the step.
        • :main_class - (String) The name of the main class in the specified Java file. If not specified, the JAR file should specify a Main-Class in its manifest file.
        • :args - (Array<) A list of command line arguments passed to the JAR file's main function when executed.
    • :bootstrap_actions - (Array<) A list of bootstrap actions that will be run before Hadoop is started on the cluster nodes.
      • :name - required - (String) The name of the bootstrap action.
      • :script_bootstrap_action - required - (Hash) The script run by the bootstrap action.
        • :path - required - (String) Location of the script to run during a bootstrap action. Can be either a location in Amazon S3 or on a local file system.
        • :args - (Array<) A list of command line arguments to pass to the bootstrap action script.
    • :supported_products - (Array<) A list of strings that indicates third-party software to use with the job flow. For more information, go to Use Third Party Applications with Amazon EMR. Currently supported values are: "mapr-m3" - launch the job flow using MapR M3 Edition. "mapr-m5" - launch the job flow using MapR M5 Edition.
    • :new_supported_products - (Array<) A list of strings that indicates third-party software to use with the job flow that accepts a user argument list. EMR accepts and forwards the argument list to the corresponding installation script as bootstrap action arguments. For more information, see Launch a Job Flow on the MapR Distribution for Hadoop. Currently supported values are: "mapr-m3" - launch the job flow using MapR M3 Edition. "mapr-m5" - launch the job flow using MapR M5 Edition. "mapr" with the user arguments specifying "--edition,m3" or "--edition,m5" - launch the job flow using MapR M3 or M5 Edition respectively.
      • :name - (String) The name of the product configuration.
      • :args - (Array<) The list of user-supplied arguments.
    • :visible_to_all_users - (Boolean) Whether the job flow is visible to all IAM users of the AWS account associated with the job flow. If this value is set to true , all IAM users of that AWS account can view and (if they have the proper policy permissions set) manage the job flow. If it is set to false , only the IAM user that created the job flow can view and manage it.
    • :job_flow_role - (String) An IAM role for the job flow. The EC2 instances of the job flow assume this role. The default role is EMRJobflowDefault. In order to use the default role, you must have already created it using the CLI.
    • :service_role - (String) IAM role that Amazon ElasticMapReduce will assume to work with AWS resources on your behalf. You may set this parameter to the name of an existing IAM role.
    • :tags - (Array<) A list of tags to associate with a cluster and propagate to Amazon EC2 instances.
      • :key - (String) A user-defined key, which is the minimum required information for a valid tag. For more information, see Tagging Amazon EMR Resources.
      • :value - (String) A user-defined value, which is optional in a tag. For more information, see Tagging Amazon EMR Resources.

Returns:

  • (Core::Response)

    The #data method of the response object returns a hash with the following structure:

    • :job_flow_id - (String)

#set_termination_protection(options = {}) ⇒ Core::Response

Calls the SetTerminationProtection API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :job_flow_ids - required - (Array<) A list of strings that uniquely identify the job flows to protect. This identifier is returned by RunJobFlow and can also be obtained from DescribeJobFlows .
    • :termination_protected - required - (Boolean) A Boolean that indicates whether to protect the job flow and prevent the Amazon EC2 instances in the cluster from shutting down due to API calls, user intervention, or job-flow error.

Returns:

#set_visible_to_all_users(options = {}) ⇒ Core::Response

Calls the SetVisibleToAllUsers API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :job_flow_ids - required - (Array<) Identifiers of the job flows to receive the new visibility setting.
    • :visible_to_all_users - required - (Boolean) Whether the specified job flows are visible to all IAM users of the AWS account associated with the job flow. If this value is set to True, all IAM users of that AWS account can view and, if they have the proper IAM policy permissions set, manage the job flows. If it is set to False, only the IAM user that created a job flow can view and manage it.

Returns:

#terminate_job_flows(options = {}) ⇒ Core::Response

Calls the TerminateJobFlows API operation.

Parameters:

  • options (Hash) (defaults to: {})
    • :job_flow_ids - required - (Array<) A list of job flows to be shutdown.

Returns: