Step 2: Save and Validate Your Pipeline - AWS Data Pipeline

Step 2: Save and Validate Your Pipeline


If your pipeline uses an Amazon EMR release version in the 6.x series, you must add a bootstrap action to copy the following Jar file to the Hadoop classpath where MyRegion is the AWS Region where your pipeline runs: s3://dynamodb-dpl-MyRegion/emr-ddb-storage-handler/4.14.0/emr-dynamodb-tools-4.14.0-jar-with-dependencies.jar. For more information, see Amazon EMR 6.1.0 Release and Hadoop 3.x Jar Dependencies.

In addition, you must change the first argument in the step field in EmrActivity with name TableLoadActivity from s3://dynamodb-dpl-MyRegion/emr-ddb-storage-handler/4.11.0/emr-dynamodb-tools-4.11.0-SNAPSHOT-jar-with-dependencies.jar to s3://dynamodb-dpl-MyRegion/emr-ddb-storage-handler/4.14.0/emr-dynamodb-tools-4.14.0-jar-with-dependencies.jar.

You can save your pipeline definition at any point during the creation process. As soon as you save your pipeline definition, AWS Data Pipeline looks for syntax errors and missing values in your pipeline definition. If your pipeline is incomplete or incorrect, AWS Data Pipeline generates validation errors and warnings. Warning messages are informational only, but you must fix any error messages before you can activate your pipeline.

To save and validate your pipeline

  1. Choose Save pipeline.

  2. AWS Data Pipeline validates your pipeline definition and returns either success or error or warning messages. If you get an error message, choose Close and then, in the right pane, choose Errors/Warnings.

  3. The Errors/Warnings pane lists the objects that failed validation. Choose the plus (+) sign next to the object names and look for an error message in red.

  4. When you see an error message, go to the specific object pane where you see the error and fix it. For example, if you see an error message in the DataNodes object, go to the DataNodes pane to fix the error.

  5. After you fix the errors listed in the Errors/Warnings pane, choose Save Pipeline.

  6. Repeat the process until your pipeline validates successfully.