Document History - AWS Data Pipeline

This documentation is associated with the 2012-10-29 version of AWS Data Pipeline.

Change Description Release Date
Added documentation for performing certain procedures using the AWS CLI. Removed procedures related to the AWS Data Pipeline console.

For more information, see Cloning Your Pipeline, Viewing Pipeline Logs, and Create a pipeline from Data Pipeline templates using the CLI.

26 May 2023
Added more content and samples for migrating from AWS Data Pipeline to alternative services.

Updated the topic on migrating from AWS Data Pipeline to AWS Glue, AWS Step Functions, or Amazon MWAA with more information on each alternative, concept mappings between the services, and samples. For more information, see Migrating workloads from AWS Data Pipeline.

31 March 2023
Added information on AWS Data Pipeline support of IMDSv2.

AWS Data Pipeline supports IMDSv2 for Amazon EMR and Amazon EC2 resources. For more information, see Data Protection in AWS Data Pipeline, EmrCluster, and Ec2Resource.

16 December 2022
Added a topic for migrating from AWS Data Pipeline to alternative services.

Other AWS services now offer customers a better data integration experience. You can migrate typical use cases of AWS Data Pipeline to AWS Glue, AWS Step Functions, or Amazon MWAA. For more information, see Migrating workloads from AWS Data Pipeline.

16 December 2022

Updated the lists of supported Amazon EC2 and Amazon EMR instances.

Updated the list of IDs of the HVM (Hardware Virtual Machine) AMIs used for the instances.

Updated the lists of supported Amazon EC2 and Amazon EMR instances. For more information, see Supported Instance Types for Pipeline Work Activities.

Updated the list of IDs of the HVM (Hardware Virtual Machine) AMIs used for the instances. For more information, see Syntax and search for imageId.

9 November 2018
Added configuration for attaching Amazon EBS volumes to cluster nodes, and for launching an Amazon EMR cluster into a private subnet.

Added configuration options to the EmrCluster object. You can use these options in pipelines that use Amazon EMR clusters.

Use the coreEbsConfiguration, masterEbsConfiguration, and taskEbsConfiguration fields to configure the attachment of Amazon EBS volumes to core, master, and task nodes in the Amazon EMR cluster. For more information, see Attach EBS volumes to cluster nodes.

Use the emrManagedMasterSecurityGroupId, emrManagedSlaveSecurityGroupId, and serviceAccessSecurityGroupId fields to configure an Amazon EMR cluster in a private subnet. For more information, see Configure an Amazon EMR cluster in a private subnet.

For more information about EmrCluster syntax, see EmrCluster.

19 April 2018
Added the list of supported Amazon EC2 and Amazon EMR instances.

Added the list of instances that AWS Data Pipeline creates by default, if you do not specify an instance type in the pipeline definition. Added a list of supported Amazon EC2 and Amazon EMR instances. For more information, see Supported Instance Types for Pipeline Work Activities.

22 March 2018
Added support for on-demand pipelines.
  • Added support for on-demand pipelines, which let you rerun a pipeline by activating it again.

22 February 2016
Additional support for RDS databases
  • Added rdsInstanceId, region, and jdbcDriverJarUri to RdsDatabase.

  • Updated database in SqlActivity to also support RdsDatabase.

17 August 2015
Additional JDBC support
7 July 2015
HadoopActivity, Availability Zone, and Spot Support
  • Added support for submitting parallel work to Hadoop clusters. For more information, see HadoopActivity.

  • Added the ability to request Spot Instances with Ec2Resource and EmrCluster.

  • Added the ability to launch EmrCluster resources in a specified Availability Zone.

1 June 2015
Deactivating pipelines

Added support for deactivating active pipelines. For more information, see Deactivating Your Pipeline.

7 April 2015
Updated templates and console

Added new templates. Updated the Getting Started chapter to use the Getting Started with ShellCommandActivity template. For more information, see Create a pipeline from Data Pipeline templates using the CLI.

25 November 2014
VPC support

Added support for launching resources into a virtual private cloud (VPC).

12 March 2014
Region support

Added support for multiple service regions. In addition to us-east-1, AWS Data Pipeline is supported in eu-west-1, ap-northeast-1, ap-southeast-2, and us-west-2.

20 February 2014
Amazon Redshift support

Added support for Amazon Redshift in AWS Data Pipeline, including a new console template (Copy to Redshift) and a tutorial to demonstrate the template. For more information, see Copy Data to Amazon Redshift Using AWS Data Pipeline, RedshiftDataNode, RedshiftDatabase, and RedshiftCopyActivity.

6 November 2013
PigActivity

Added PigActivity, which provides native support for Pig. For more information, see PigActivity.

15 October 2013
New console template, activity, and data format

Added the new CrossRegion DynamoDB Copy console template, including the new HiveCopyActivity and DynamoDBExportDataFormat.

21 August 2013
Cascading failures and reruns

Added information about AWS Data Pipeline cascading failure and rerun behavior. For more information, see Cascading failures and reruns.

8 August 2013
Troubleshooting video

Added the AWS Data Pipeline Basic Troubleshooting video. For more information, see Troubleshooting.

17 July 2013
Editing active pipelines

Added more information about editing active pipelines and rerunning pipeline components. For more information, see Editing Your Pipeline.

17 July 2013
Use resources in different regions

Added more information about using resources in different regions. For more information, see Using a Pipeline with Resources in Multiple Regions.

17 June 2013
WAITING_ON_DEPENDENCIES status

Changed the CHECKING_PRECONDITIONS status to WAITING_ON_DEPENDENCIES and added the @waitingOn runtime field for pipeline objects.

20 May 2013
DynamoDBDataFormat

Added the DynamoDBDataFormat template.

23 April 2013
Process Web Logs video and Spot Instances support

Introduced the video "Process Web Logs with AWS Data Pipeline, Amazon EMR, and Hive" and added support for Amazon EC2 Spot Instances.

21 February 2013

The initial release of the AWS Data Pipeline Developer Guide.

20 December 2012