AWS Data Pipeline is no longer available to new customers. Existing customers of AWS Data Pipeline can continue to use the service as normal.
Creating a pipeline using parametrized templates
You can use a parametrized template to customize a pipeline definition. This enables you to create a common pipeline definition but provide different parameters when you add the pipeline definition to a new pipeline.
Add myVariables to the pipeline definition
When you create the pipeline definition file, specify variables using the following syntax: #{myVariable}. The variable name must be prefixed by my. For example, the following pipeline definition file, pipeline-definition.json, includes the following variables: myShellCmd, myS3InputLoc, and myS3OutputLoc.
Note
A pipeline definition has an upper limit of 50 parameters.
{ "objects": [ { "id": "ShellCommandActivityObj", "input": { "ref": "S3InputLocation" }, "name": "ShellCommandActivityObj", "runsOn": { "ref": "EC2ResourceObj" }, "command": "#{myShellCmd}", "output": { "ref": "S3OutputLocation" }, "type": "ShellCommandActivity", "stage": "true" }, { "id": "Default", "scheduleType": "CRON", "failureAndRerunMode": "CASCADE", "schedule": { "ref": "Schedule_15mins" }, "name": "Default", "role": "DataPipelineDefaultRole", "resourceRole": "DataPipelineDefaultResourceRole" }, { "id": "S3InputLocation", "name": "S3InputLocation", "directoryPath": "#{myS3InputLoc}", "type": "S3DataNode" }, { "id": "S3OutputLocation", "name": "S3OutputLocation", "directoryPath": "#{myS3OutputLoc}/#{format(@scheduledStartTime, 'YYYY-MM-dd-HH-mm-ss')}", "type": "S3DataNode" }, { "id": "Schedule_15mins", "occurrences": "4", "name": "Every 15 minutes", "startAt": "FIRST_ACTIVATION_DATE_TIME", "type": "Schedule", "period": "15 Minutes" }, { "terminateAfter": "20 Minutes", "id": "EC2ResourceObj", "name": "EC2ResourceObj", "instanceType":"t1.micro", "type": "Ec2Resource" } ] }
Define parameter objects
You can create a separate file with parameter objects that define the variables in your pipeline definition. For example, the following JSON file, parameters.json, contains parameter objects for the myShellCmd, myS3InputLoc, and myS3OutputLoc variables from the example pipeline definition above.
{ "parameters": [ { "id": "myShellCmd", "description": "Shell command to run", "type": "String", "default": "grep -rc \"GET\" ${INPUT1_STAGING_DIR}/* > ${OUTPUT1_STAGING_DIR}/output.txt" }, { "id": "myS3InputLoc", "description": "S3 input location", "type": "AWS::S3::ObjectKey", "default": "s3://us-east-1.elasticmapreduce.samples/pig-apache-logs/data" }, { "id": "myS3OutputLoc", "description": "S3 output location", "type": "AWS::S3::ObjectKey" } ] }
Note
You could add these objects directly to the pipeline definition file instead of using a separate file.
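If you take the single-file approach, a top-level parameters section sits alongside the objects section in the same definition file. The following is a minimal sketch of that combined layout (trimmed to one activity and one parameter from the example above, so it is not a complete, runnable pipeline):

```json
{
  "objects": [
    {
      "id": "ShellCommandActivityObj",
      "name": "ShellCommandActivityObj",
      "command": "#{myShellCmd}",
      "type": "ShellCommandActivity"
    }
  ],
  "parameters": [
    {
      "id": "myShellCmd",
      "description": "Shell command to run",
      "type": "String"
    }
  ]
}
```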
The following table describes the attributes for parameter objects.
| Attribute | Type | Description |
|---|---|---|
| id | String | The unique identifier of the parameter. To mask the value while it is typed or displayed, add an asterisk ('*') as a prefix. For example, *myVariable. Note that this also encrypts the value before it is stored by AWS Data Pipeline. |
| description | String | A description of the parameter. |
| type | String, Integer, Double, or AWS::S3::ObjectKey | The parameter type that defines the allowed range of input values and validation rules. The default is String. |
| optional | Boolean | Indicates whether the parameter is optional or required. The default is false. |
| allowedValues | List of Strings | Enumerates all permitted values for the parameter. |
| default | String | The default value for the parameter. If you specify a value for this parameter using parameter values, it overrides the default value. |
| isArray | Boolean | Indicates whether the parameter is an array. |
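To make these attributes concrete, the following sketch shows a masked parameter and an optional enumerated parameter. The parameter names and values here are hypothetical (not part of the example pipeline), and the string-valued "true" follows the quoting style used elsewhere in this example; check your definition against the service's validation rules before relying on this exact shape:

```json
{
  "parameters": [
    {
      "id": "*myDbPassword",
      "description": "Database password (masked while typed, encrypted when stored)",
      "type": "String"
    },
    {
      "id": "myLogLevel",
      "description": "Verbosity of the shell command",
      "type": "String",
      "optional": "true",
      "allowedValues": ["INFO", "DEBUG"],
      "default": "INFO"
    }
  ]
}
```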
Define parameter values
You can create a separate file to define your variables using parameter values. For example, the following JSON file, values.json, contains the value for the myS3OutputLoc variable from the example pipeline definition above.
{ "values": { "myS3OutputLoc": "myOutputLocation" } }
Submitting the pipeline definition
When you submit your pipeline definition, you can specify parameters, parameter objects, and parameter values. For example, you can use the put-pipeline-definition AWS CLI command as follows:
```sh
$ aws datapipeline put-pipeline-definition --pipeline-id id --pipeline-definition file://pipeline-definition.json \
    --parameter-objects file://parameters.json --parameter-values-uri file://values.json
```
Note
A pipeline definition has an upper limit of 50 parameters. The size of the file for parameter-values-uri has an upper limit of 15 KB.
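Putting the steps together, a typical CLI flow creates the pipeline, submits the parametrized definition, and then activates it. This is a sketch: the pipeline name, unique ID token, and pipeline ID below are placeholders, and it assumes the AWS CLI is configured with appropriate credentials.

```sh
# Create an empty pipeline; the command returns a pipeline ID (df-... below is a placeholder)
aws datapipeline create-pipeline --name my-parametrized-pipeline --unique-id my-parametrized-pipeline-token

# Submit the definition along with the parameter objects and parameter values
aws datapipeline put-pipeline-definition --pipeline-id df-0123456789ABCDEFGHIJ \
    --pipeline-definition file://pipeline-definition.json \
    --parameter-objects file://parameters.json --parameter-values-uri file://values.json

# Activate the pipeline so the schedule starts running
aws datapipeline activate-pipeline --pipeline-id df-0123456789ABCDEFGHIJ
```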