Workflow definition files - AWS HealthOmics

Workflow definition files

The HealthOmics workflow definition files must meet the following requirements:

  • HealthOmics supports workflow definitions written in WDL, Nextflow, or CWL.

  • Declare all parameters in the workflow definition file. Parameters include input and output locations, Amazon ECR container repositories, and runtime parameters such as allocated memory or CPU.

    Note

    The storage requirements to perform runs may be more than expected due to internal file system usage, so allow for more allocated memory than anticipated in your workflow definition file.

  • Declare the output files in the workflow definition file. If you want to copy intermediate run files to the output location, declare them as workflow outputs.

    The input and output locations must be in the same Region as the workflow run.

  • HealthOmics storage workflow inputs must be in ACTIVE status. HealthOmics does not import inputs that have an ARCHIVED status, which causes the workflow to fail.

The following is an example WDL workflow that reads the contents of an INPUT file and writes them into a RESULT file.

version 1.0

workflow TestFlow {
    input {
        File input_txt_file
    }

    # Copies input file data to output.
    call TxtFileCopyTask{
        input:
            input_txt_file = input_txt_file,
    }

    output {
        File output_txt_file = TxtFileCopyTask.output_txt_file
    }
}

# Task definitions.
task TxtFileCopyTask {
    input {
        File input_txt_file
    }

    command {
        cat ~{input_txt_file} > outfile.txt
    }

    output {
        File output_txt_file = "outfile.txt"
    }

    runtime {
        cpu: 2
        memory: "4 GiB"
        docker: "ACCOUNT-ID.dkr.ecr.us-west-2.amazonaws.com/ubuntu:latest"
    }
}

The input_txt_file.json file contains the following content:

{
    "input_txt_file": {
        "description": "Input file to be copied",
        "required": true
    }
}
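Before starting a run, it can help to confirm locally that every parameter the JSON file marks as required has a value. The following Python sketch is illustrative only and is not part of HealthOmics. It assumes the `description` and `required` fields shown above, treats an entry without a `required` field as not required, and uses a hypothetical helper name (`missing_required_parameters`) and a placeholder S3 URI.

```python
import json


def missing_required_parameters(template: dict, run_parameters: dict) -> list[str]:
    """Return the names marked required in the template but absent from the run.

    Assumption: an entry counts as required only when its "required"
    field is exactly true, as in the JSON example above.
    """
    return sorted(
        name
        for name, spec in template.items()
        if spec.get("required") is True and name not in run_parameters
    )


# The same content as the input_txt_file.json example above.
template = json.loads("""
{
    "input_txt_file": {
        "description": "Input file to be copied",
        "required": true
    }
}
""")

# Hypothetical run parameters; the S3 URI is a placeholder.
run_parameters = {"input_txt_file": "s3://amzn-s3-demo-bucket/input.txt"}

print(missing_required_parameters(template, run_parameters))  # prints []
print(missing_required_parameters(template, {}))  # prints ['input_txt_file']
```

A check like this only catches missing names. It does not verify that the referenced files exist or that they are in the same Region as the run.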

You must zip the workflow definition file and any dependencies, such as subworkflows, before you can use the file to create a workflow with the create-workflow API operation.
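The zip step can be sketched in Python with only the standard library. This is one possible approach under stated assumptions, not a required procedure: the function name `zip_workflow_definition` and the file names `main.wdl` and `sub/tasks.wdl` are hypothetical, and the sketch assumes the main definition and its dependencies all sit under one directory.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def zip_workflow_definition(source_dir: Path) -> bytes:
    """Zip every file under source_dir, storing paths relative to it."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Relative archive names keep subworkflow paths intact.
                archive.write(path, arcname=path.relative_to(source_dir).as_posix())
    return buffer.getvalue()


if __name__ == "__main__":
    # Build a throwaway directory with hypothetical file names, then zip it.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "main.wdl").write_text("version 1.0\n")
        (root / "sub").mkdir()
        (root / "sub" / "tasks.wdl").write_text("version 1.0\n")

        archive_bytes = zip_workflow_definition(root)
        with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
            print(sorted(archive.namelist()))
```

The returned bytes can be written to a .zip file and supplied to the create-workflow operation. Check the API reference for the exact parameter that accepts the archive.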