Nextflow workflow definition specifics
HealthOmics supports Nextflow DSL1 and DSL2. For details, see Nextflow version support.
Nextflow DSL2 is based on the Groovy programming language, so parameters are dynamic
and type coercion is possible using the same rules as Groovy. Parameters and values
supplied by the input JSON are available in the parameters (params) map of
the workflow.
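As a minimal sketch of how input JSON values surface in the params map, the following workflow reads two parameters and coerces one of them using Groovy rules. The parameter names sample_name and threads are hypothetical and chosen only for illustration.

```groovy
// main.nf: illustrative only. `sample_name` and `threads` are hypothetical keys
// that would be supplied in the input JSON, for example:
//   { "sample_name": "NA12878", "threads": "8" }

// Defaults that the input JSON can override.
params.sample_name = 'sample1'
params.threads     = 4

workflow {
    // A value that arrives as a string (such as "8") can be coerced Groovy-style.
    def threads = params.threads as Integer

    log.info "Processing ${params.sample_name} with ${threads} threads"
}
```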
Use nf-schema and nf-validation plugins
Note
Summary of HealthOmics support for plugins:
v22.04 – no support for plugins
v23.10 – supports nf-schema and nf-validation
v24.10 – supports nf-schema
v25.10, v26.04 – supports nf-schema, nf-core-utils, nf-fgbio, and nf-prov
HealthOmics provides the following support for Nextflow plugins:
-
For Nextflow v23.10, HealthOmics pre-installs the nf-validation@1.1.1 plugin.
-
For Nextflow v23.10 and v24.10, HealthOmics pre-installs the nf-schema@2.3.0 plugin.
-
For Nextflow v25.10, HealthOmics pre-installs the nf-schema@2.6.1, nf-core-utils@0.4.0, nf-prov@1.7.0, and nf-fgbio@1.0.1 plugins.
-
For Nextflow v26.04, HealthOmics pre-installs the nf-schema@2.7.2, nf-core-utils@0.4.0, nf-prov@1.7.0, and nf-fgbio@1.0.1 plugins.
-
You cannot retrieve additional plugins during a workflow run. HealthOmics ignores any other plugin versions that you specify in the nextflow.config file.
-
For Nextflow v24 and higher, nf-schema is the new version of the deprecated nf-validation plugin. For more information, see nf-schema in the Nextflow GitHub repository.
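As an illustration, a nextflow.config for a Nextflow v24.10 workflow might declare the pre-installed plugin as follows. This is a sketch that assumes the standard Nextflow plugins block; the version shown is the one listed above for v24.10.

```groovy
// nextflow.config: sketch for a Nextflow v24.10 workflow.
plugins {
    // Matches the version listed above as pre-installed for v23.10 and v24.10.
    // Per the list above, other plugin versions specified here are ignored.
    id 'nf-schema@2.3.0'
}
```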
Specify storage URIs
When you use an Amazon S3 or HealthOmics URI to construct a Nextflow file or path object, the matching object becomes available to the workflow, as long as read access is granted. You can use prefixes or directories for Amazon S3 URIs. For examples, see Amazon S3 input parameter formats.
HealthOmics partially supports the use of glob patterns in Amazon S3 URIs or HealthOmics Storage URIs. Use glob patterns in the workflow definition to create path or file channels. For the expected behavior and exact cases, see Nextflow Handling of Glob pattern in Amazon S3 inputs.
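The following sketch shows both uses in a workflow definition: a single URI turned into a file object, and a glob pattern used to create a path channel. The parameter names input_bam and fastq_dir are hypothetical, and whether a particular pattern is supported depends on the cases described in the referenced topic.

```groovy
// `input_bam` and `fastq_dir` are hypothetical parameters from the input JSON,
// for example "s3://bucket/sample.bam" and "s3://bucket/fastq".
workflow {
    // A single Amazon S3 URI becomes a file object.
    def bam = file(params.input_bam)
    log.info "Input BAM: ${bam}"

    // A glob pattern creates a channel that emits one item per matching object.
    fastqs = channel.fromPath("${params.fastq_dir}/*.fastq.gz")
    fastqs.view()
}
```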
Nextflow directives
You configure Nextflow directives in the Nextflow config file or workflow definition. The following list shows the order of precedence that HealthOmics uses to apply configuration settings, from lowest to highest priority:
-
Global configuration in the config file.
-
Task section of the workflow definition.
-
Task-specific selectors in the config file.
Task retry strategy using errorStrategy
Use the errorStrategy directive to define the strategy for task errors. By default, when a task
returns with an error indication (a non-zero exit status), the task stops and HealthOmics terminates the entire run. If
you set errorStrategy to retry, HealthOmics attempts one retry of the failed task. To
increase the number of retries, see Task retry attempts using maxRetries.
    process myTask {
        label 'my_label'
        errorStrategy 'retry'

        script:
        """
        your-command-here
        """
    }
For information about how HealthOmics handles task retries during a run, see Task Retries.
Task retry attempts using maxRetries
By default, HealthOmics doesn't retry a failed task. If you set errorStrategy to retry without
setting maxRetries, HealthOmics attempts one retry. To allow more retries, set errorStrategy
to retry and configure the maximum number of retries using the maxRetries directive.
The following example sets the maximum number of retries to 3 in the global configuration.
    process {
        errorStrategy = 'retry'
        maxRetries = 3
    }
The following example shows how to set maxRetries in the task section of the workflow definition.
    process myTask {
        label 'my_label'
        errorStrategy 'retry'
        maxRetries 3

        script:
        """
        your-command-here
        """
    }
The following example shows how to specify task-specific configuration in the Nextflow config file, based on the name or label selectors.
    process {
        withLabel: 'my_label' {
            errorStrategy = 'retry'
            maxRetries = 3
        }
        withName: 'myTask' {
            errorStrategy = 'retry'
            maxRetries = 3
        }
    }
Opt out of task retry using omicsRetryOn5xx
For Nextflow v23 and later, HealthOmics supports task retries if the task failed because of service errors (5XX HTTP status codes). By default, HealthOmics attempts up to two retries of a failed task.
You can configure omicsRetryOn5xx to opt out of task retry for service errors. For more
information about task retry in HealthOmics, see Task Retries.
The following example configures omicsRetryOn5xx in the global configuration to opt out of task
retry.
    process {
        omicsRetryOn5xx = false
    }
The following example shows how to configure omicsRetryOn5xx in the task section of the
workflow definition.
    process myTask {
        label 'my_label'
        omicsRetryOn5xx = false

        script:
        """
        your-command-here
        """
    }
The following example shows how to set omicsRetryOn5xx as task-specific configuration in the
Nextflow config file, based on the name or label selectors.
    process {
        withLabel: 'my_label' {
            omicsRetryOn5xx = false
        }
        withName: 'myTask' {
            omicsRetryOn5xx = false
        }
    }
Task duration using the time directive
HealthOmics provides an adjustable quota (see HealthOmics service quotas) to
specify the maximum duration for a run. For Nextflow v23 and later workflows, you can also specify maximum task
durations using the Nextflow time directive.
During new workflow development, setting maximum task duration helps you catch runaway tasks and long-running tasks.
For more information about the Nextflow time directive, see time directive.
HealthOmics provides the following support for the Nextflow time directive:
-
HealthOmics supports 1 minute granularity for the time directive. You can specify a value between 60 seconds and the maximum run duration value.
-
If you enter a value less than 60 seconds, HealthOmics rounds it up to 60 seconds. For values above 60 seconds, HealthOmics rounds down to the nearest minute.
-
If the workflow supports retries for a task, HealthOmics retries the task if it times out.
-
If a task times out (or the last retry times out), HealthOmics cancels the task. Cancellation can take one to two minutes.
-
On task timeout, HealthOmics sets the run and task status to failed, and it cancels the other tasks in the run (for tasks in Starting, Pending, or Running status). HealthOmics exports the outputs from tasks that it completed before the timeout to your designated S3 output location.
-
Time that a task spends in pending status does not count toward the task duration.
-
If the run is part of a run group and the run group times out sooner than the task timer, the run and task transition to failed status.
Specify the timeout duration using one or more of the following units: ms, s, m, h, or d.
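The rounding rules in the preceding list can be modeled with a short plain-Groovy function. This is only a sketch of the documented behavior, not HealthOmics code; the function name effectiveTimeoutSeconds is hypothetical, and the cap at the maximum run duration is left out.

```groovy
// Hypothetical helper that models the documented rounding of the time directive.
def effectiveTimeoutSeconds(requestedSeconds) {
    if (requestedSeconds < 60) {
        return 60                               // round up to the 60-second minimum
    }
    return requestedSeconds.intdiv(60) * 60     // round down to a whole minute
}

assert effectiveTimeoutSeconds(45)  == 60       // time '45s'   is treated as 1 minute
assert effectiveTimeoutSeconds(60)  == 60       // time '1m'    is unchanged
assert effectiveTimeoutSeconds(150) == 120      // time '2m30s' is treated as 2 minutes
```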
The following example shows how to specify global configuration in the Nextflow config file. It sets a global timeout of 1 hour and 30 minutes.
    process {
        time = '1h30m'
    }
The following example shows how to specify a time directive in the task section of the workflow definition.
This example sets a timeout of 3 days, 5 hours, and 4 minutes. This value takes precedence over the global value
in the config file, but doesn't take precedence over a task-specific time directive for my_label in
the config file.
    process myTask {
        label 'my_label'
        time '3d5h4m'

        script:
        """
        your-command-here
        """
    }
The following example shows how to specify task-specific time directives in the Nextflow config file, based
on the name or label selectors. This example sets a global task timeout value of 30 minutes. It sets a value of 2
hours for task myTask and sets a value of 3 hours for tasks with label my_label. For
tasks that match the selector, these values take precedence over the global value and the value in the workflow
definition.
    process {
        time = '30m'

        withLabel: 'my_label' {
            time = '3h'
        }
        withName: 'myTask' {
            time = '2h'
        }
    }
Use Nextflow profiles
Nextflow profiles are named sets of configuration settings that you can select at runtime. Define profiles in
the profiles block of your nextflow.config file:
    profiles {
        standard {
            process.cpus = 2
            process.memory = '4 GB'
        }
        production {
            process.cpus = 16
            process.memory = '64 GB'
            params.input = 's3://bucket/production-data.bam'
        }
    }
When you start a run, specify one or more profiles using the engineSettings parameter. HealthOmics passes
the -profile flag to the Nextflow engine. For more information, see Specify engine settings.
    aws omics start-run \
        --workflow-id workflow-id \
        --role-arn role-arn \
        --output-uri s3://bucket/prefix/ \
        --engine-settings '{"profile": "production"}'
When multiple profiles are specified (for example, "test,docker"), Nextflow applies them in the
order they are specified in the command line. Later profiles override earlier ones for conflicting settings. For
Nextflow versions lower than 26, profiles are applied in the order they are defined in the configuration file
instead of command line order.
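To make the ordering difference concrete, consider the following sketch. The profile names small and large are hypothetical, and the outcomes in the comments follow from the ordering rules described above rather than from a tested run.

```groovy
// nextflow.config: two hypothetical profiles with a conflicting process.cpus setting.
profiles {
    small {
        process.cpus = 2
    }
    large {
        process.cpus = 16
    }
}

// Engine settings: {"profile": "large,small"}
//   Nextflow v26.04:  applied in the order specified (large, then small) -> process.cpus = 2
//   Earlier versions: applied in definition order (small, then large)    -> process.cpus = 16
```

Listing profiles in the same order in which the configuration file defines them avoids the difference between versions.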
Note the following:
-
Profile support is available for all HealthOmics supported Nextflow versions.
-
Profiles can contain parameters, process directives, includeConfig statements, and manifest overrides (including manifest.nextflowVersion).
-
Explicit run parameters take precedence over profile-defined parameter values.
-
If you specify a nonexistent profile, HealthOmics returns a validation error.
-
Profiles must be defined in the workflow definition zip file. HealthOmics doesn't support fetching profile definitions from external sources.
-
If you don't specify a profile, the run uses the standard profile if it's defined under profiles in the workflow definition. Otherwise, the run uses the default (top-level) configuration.
-
When using profiles, we recommend pinning the Nextflow version in your workflow definition using manifest.nextflowVersion to ensure consistent profile application behavior across runs.
Export workflow-level content
For Nextflow v25.10 and later, you can export files produced outside of individual tasks, such as
provenance reports or pipeline DAGs. To export these files, write them to
/mnt/workflow/output/. HealthOmics exports files placed in this directory to the
output/ prefix in your run's Amazon S3 output location.
The following example shows how to configure the nf-prov plugin to write a
provenance report to /mnt/workflow/output/.
    prov {
        formats {
            bco {
                file = "/mnt/workflow/output/pipeline_info/manifest.bco.json"
            }
        }
    }
You can also pass this path as a parameter in your run's input JSON. This approach is common with nf-core
workflows that use params.outdir.
{ "outdir": "/mnt/workflow/output/" }
Export task content
For workflows written in Nextflow, define a publishDir directive to export task content
to your output Amazon S3 bucket. As shown in the following example, set the publishDir value to
/mnt/workflow/pubdir. To export files to Amazon S3, the files must be in this directory.
    nextflow.enable.dsl=2

    workflow {
        CramToBamTask(params.ref_fasta, params.ref_fasta_index, params.ref_dict, params.input_cram, params.sample_name)
        ValidateSamFile(CramToBamTask.out.outputBam)
    }

    // Converts a CRAM file to BAM and indexes it.
    process CramToBamTask {
        container "<account>.dkr.ecr.us-west-2.amazonaws.com/genomes-in-the-cloud"
        publishDir "/mnt/workflow/pubdir"

        input:
        path ref_fasta
        path ref_fasta_index
        path ref_dict
        path input_cram
        val sample_name

        output:
        path "${sample_name}.bam", emit: outputBam
        path "${sample_name}.bai", emit: outputBai

        script:
        """
        set -eo pipefail

        samtools view -h -T $ref_fasta $input_cram |
        samtools view -b -o ${sample_name}.bam -
        samtools index -b ${sample_name}.bam
        mv ${sample_name}.bam.bai ${sample_name}.bai
        """
    }

    // Validates the BAM file and writes a summary report.
    process ValidateSamFile {
        container "<account>.dkr.ecr.us-west-2.amazonaws.com/genomes-in-the-cloud"
        publishDir "/mnt/workflow/pubdir"

        input:
        path input_bam

        output:
        path "validation_report"

        script:
        """
        java -Xmx3G -jar /usr/gitc/picard.jar \
        ValidateSamFile \
        INPUT=${input_bam} \
        OUTPUT=validation_report \
        MODE=SUMMARY \
        IS_BISULFITE_SEQUENCED=false
        """
    }
For Nextflow v25.10 and later, as an alternative to publishDir, you can use workflow outputs to export task content.
The following example shows how to define a workflow output block that
exports task results to Amazon S3.
    process myTask {
        input:
        val data

        output:
        path 'result.txt'

        script:
        """
        echo ${data} > result.txt
        """
    }

    workflow {
        main:
        output_file = myTask('hello')

        publish:
        results = output_file
    }

    output {
        results {
            path '.'
        }
    }

For more information about workflow outputs, see Workflow outputs.
Specify the Nextflow syntax version
Nextflow v26.04.0 uses the strict (v2) syntax parser by default. This is a breaking change for
workflows written using the legacy (v1) syntax, which is the default in Nextflow v25.10.0 and earlier.
For information about the v2 syntax, see Strict syntax.
To run a workflow authored against the legacy (v1) parser, set engineSettings.syntaxVersion
to v1 in the StartRun request:
{ "engineSettings": { "syntaxVersion": "v1" } }
For Nextflow v25.10.0 and earlier, HealthOmics does not support the v2 parser.
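As one illustration of a change that the strict parser can require, the Nextflow strict syntax documentation describes that top-level statements can't be mixed with declarations such as workflow or process blocks. The following sketch, which uses a hypothetical sample_name parameter, shows a legacy pattern and an equivalent form intended to work with both parsers.

```groovy
// Legacy (v1) pattern: a top-level statement next to a workflow declaration.
// The strict (v2) parser is expected to reject this line at the top level:
//
//   println "Starting run for ${params.sample_name}"

// Form intended for both parsers: the statement moves into the entry workflow.
params.sample_name = 'sample1'   // hypothetical parameter, for illustration only

workflow {
    println "Starting run for ${params.sample_name}"
}
```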
Nextflow v26.04 release notes
The following tables summarize HealthOmics support for new features, enhancements, and deprecations released in Nextflow version 26.04.
New features and enhancements
| Feature | From version | HealthOmics support | Notes |
|---|---|---|---|
| Strict syntax parser (default) | 26.04 | Yes | Enabled by default from v26.04. Legacy parser available via syntaxVersion: "v1" in engine settings. |
| Record types | 26.04 | Yes | For more information, see Records. |
| Workflow output summaries | 26.04 | Yes | Prints a summary of workflow outputs on run completion. Output format configurable via outputFormat in engine settings. For more information, see Specify engine settings. |
| Agent logging mode | 26.04 | Yes | Configurable via agentMode in engine settings. For more information, see Specify engine settings. |
| Module system (Nextflow Registry) | 26.04 | No | HealthOmics workflows run in an isolated network with no outbound internet access. You can include modules directly in your workflow zip. |
| Static typing (preview) | 26.04 | No | HealthOmics does not support preview features. |
| Auto-load collection params from files | 26.04 | No | Requires static typing (preview), which HealthOmics does not support. |
| Multi-revision pipelines checkout | 26.04 | N/A | Not applicable. HealthOmics does not use Git-based pipeline checkout. |
Deprecations
| Deprecated item | From version | Impact | Recommended action |
|---|---|---|---|
| listFiles() method | 26.04 | Deprecation warning | Replace with listDirectory(). |
| nextflow.enable.strict flag | 26.04 | No longer needed | Remove from config. Strict mode is now the default. |
| manifest.defaultBranch | 26.04 | No longer needed | Remove from config. HealthOmics does not use Git-based pipeline checkout and has never supported this option. |
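As a sketch of the first recommended action in the table, the following workflow replaces a listFiles() call with listDirectory(). The input_dir parameter is hypothetical, and the example assumes that listDirectory() returns the first-level entries of a directory, as the replacement guidance implies.

```groovy
// `input_dir` is a hypothetical parameter that points to a directory.
workflow {
    def dir = file(params.input_dir)

    // Before (raises a deprecation warning in v26.04):
    //   def entries = dir.listFiles()
    def entries = dir.listDirectory()

    entries.each { entry -> println entry.name }
}
```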