AWS ParallelCluster User Guide

Custom Bootstrap Actions

AWS ParallelCluster can execute arbitrary code either before (pre-install) or after (post-install) the main bootstrap action during cluster creation. This code is typically stored in Amazon Simple Storage Service (Amazon S3) and accessed via HTTPS during cluster creation. The code is executed as root and can be written in any scripting language that the cluster OS supports, typically Bash or Python.

Pre-install actions are called before any cluster deployment bootstrap, such as configuring NAT, Amazon Elastic Block Store (Amazon EBS) or the scheduler. Typical pre-install actions can include modifying storage, adding extra users, or adding packages.

Post-install actions are called after cluster bootstrap is complete, as the last action before an instance is considered complete. Typical post-install actions can include changing scheduler settings, modifying storage, or modifying packages.

Arguments can be passed to scripts by specifying them in the configuration. These are passed double-quoted to the pre-install or post-install actions.

If a pre-install or post-install action fails, the instance bootstrap fails. Success is signaled with an exit code of 0. Any other exit code is considered a failure.
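
As a minimal sketch of the exit-code contract described above, a script can stop at the first failing step and return a non-zero status. The `run_step` helper and the step descriptions are hypothetical names used for illustration; they are not part of AWS ParallelCluster.

```shell
#!/bin/bash

# Hypothetical helper: run a command, report a failure on stderr,
# and return the command's own exit code to the caller.
run_step() {
    description="$1"
    shift
    status=0
    "$@" || status=$?
    if [ "${status}" -ne 0 ]; then
        echo "step failed (${status}): ${description}" >&2
    fi
    return "${status}"
}

# Example usage inside a pre-install or post-install script:
#   run_step "create scratch directory" mkdir -p /scratch || exit 1
#   exit 0   # an exit code of 0 signals success
```

Exiting as soon as a step fails means the bootstrap failure points at the step that actually broke, rather than at a later command that depended on it.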

You can run different code on the master node and on the compute nodes. Source the /etc/parallelcluster/cfnconfig file and evaluate the cfn_node_type environment variable, whose possible values are "MasterServer" for the master node and "ComputeFleet" for compute nodes.

#!/bin/bash

. "/etc/parallelcluster/cfnconfig"

case "${cfn_node_type}" in
    MasterServer)
        echo "I am the master node" >> /tmp/master.txt
    ;;
    ComputeFleet)
        echo "I am a compute node" >> /tmp/compute.txt
    ;;
    *)
    ;;
esac

Configuration

The following configuration settings are used to define pre-install and post-install actions and arguments.

# URL to a preinstall script. This is executed before any of the boot_as_* scripts are run
# (defaults to NONE)
pre_install = NONE
# Arguments to be passed to preinstall script
# (defaults to NONE)
pre_install_args = NONE
# URL to a postinstall script. This is executed after any of the boot_as_* scripts are run
# (defaults to NONE)
post_install = NONE
# Arguments to be passed to postinstall script
# (defaults to NONE)
post_install_args = NONE

Arguments

The first two arguments, $0 and $1, are reserved for the script name and URL. Arguments set through pre_install_args or post_install_args start at $2.

$0 => the script name
$1 => s3 url
$n => args set by pre/post_install_args
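
To illustrate the argument layout above, the following sketch discards the script URL in $1 and prints the remaining user-supplied arguments. The function name `list_user_args` is hypothetical and used here only for illustration.

```shell
#!/bin/bash

# Hypothetical helper: print only the arguments supplied through
# pre_install_args or post_install_args, one per line.
list_user_args() {
    # $1 holds the script URL, so drop it before reading user arguments.
    if [ "$#" -gt 0 ]; then
        shift
    fi
    for arg in "$@"; do
        echo "${arg}"
    done
}

# Example usage at the top level of a script:
#   list_user_args "$@"
```

Passing `"$@"` quoted keeps each argument intact as a separate word when it reaches the function.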

Example

The following steps create a simple post-install script that installs the packages passed as arguments (R, curl, and wget in this example) on the cluster.

  1. Create a script.

    #!/bin/bash

    echo "post-install script has $# arguments"
    for arg in "$@"
    do
        echo "arg: ${arg}"
    done

    yum -y install "${@:2}"
  2. Upload the script with the correct permissions to Amazon S3.

    $ aws s3 cp --acl public-read /path/to/myscript.sh s3://<bucket-name>/myscript.sh
  3. Update the AWS ParallelCluster configuration to include the new post-install action.

    [cluster default]
    ...
    post_install = https://<bucket-name>.s3.amazonaws.com/myscript.sh
    post_install_args = "R curl wget"

    If the bucket does not have public-read permission, use s3 as the URL protocol.

    [cluster default]
    ...
    post_install = s3://<bucket-name>/myscript.sh
    post_install_args = "R curl wget"
  4. Launch the cluster.

    $ pcluster create mycluster
  5. Verify the output.

    $ less /var/log/cfn-init.log
    2019-04-11 10:43:54,588 [DEBUG] Command runpostinstall output: post-install script has 4 arguments
    arg: s3://eu-eu-west-1/test.sh
    arg: R
    arg: curl
    arg: wget
    Loaded plugins: dkms-build-requires, priorities, update-motd, upgrade-helper
    Package R-3.4.1-1.52.amzn1.x86_64 already installed and latest version
    Package curl-7.61.1-7.91.amzn1.x86_64 already installed and latest version
    Package wget-1.18-4.29.amzn1.x86_64 already installed and latest version
    Nothing to do