AWS ParallelCluster User Guide

AWS Services used in AWS ParallelCluster

The following Amazon Web Services (AWS) services are used in AWS ParallelCluster.

AWS Auto Scaling

AWS Auto Scaling is used to manage the ComputeFleet instances. These instances are managed as an Auto Scaling group. The group can either scale elastically with the workload, or stay at a static size that is set by the configuration.
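
The difference between an elastic and a static ComputeFleet can be sketched in a few lines of Python. This is a minimal illustration, not the AWS ParallelCluster implementation: the `FleetLimits` class, its `min_size` and `max_size` fields, and the `desired_capacity` function are hypothetical names introduced here. The sketch assumes that a fleet is static when its minimum and maximum sizes are equal, and that an elastic fleet follows the workload within those limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FleetLimits:
    """Hypothetical size limits for a group of compute instances."""
    min_size: int
    max_size: int

    @property
    def is_static(self) -> bool:
        # Assumption: equal limits leave no room to scale, so the fleet is static.
        return self.min_size == self.max_size


def desired_capacity(limits: FleetLimits, nodes_needed: int) -> int:
    """Clamp the number of nodes the workload needs to the configured limits."""
    return max(limits.min_size, min(limits.max_size, nodes_needed))
```

With `FleetLimits(0, 10)` the result follows the workload up to 10 instances. With `FleetLimits(8, 8)` the result is always 8, regardless of the workload.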

AWS Auto Scaling is not used with AWS Batch clusters.

For more details about AWS Auto Scaling, see https://aws.amazon.com/autoscaling/.

AWS Batch

AWS Batch is the AWS-managed job scheduler that dynamically provisions the optimal quantity and type of compute resources (for example, CPU or memory-optimized instances). It provisions resources based on the volume and the requirements of the batch jobs that are submitted. With AWS Batch, there is no need to install and manage batch computing software or server clusters to run your jobs.

AWS Batch is used only with AWS Batch clusters.

For more details, see https://aws.amazon.com/batch/.

AWS CloudFormation

AWS CloudFormation is the core service used by AWS ParallelCluster. Each cluster is represented as a stack. All resources required by the cluster are defined within the AWS ParallelCluster AWS CloudFormation template. AWS ParallelCluster CLI commands typically map to AWS CloudFormation stack commands, such as create, update, and delete. Instances that are launched within a cluster make HTTPS calls to the AWS CloudFormation endpoint for the Region in which the cluster is launched.
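
The relationship between CLI commands and stack operations can be sketched as a simple lookup. This is an illustration only: `CLI_TO_STACK_ACTION`, `stack_action_for`, and `cloudformation_endpoint` are hypothetical names, and the actual CLI does more than a one-to-one translation. `CreateStack`, `UpdateStack`, and `DeleteStack` are AWS CloudFormation API actions. The endpoint format shown is the one used in the standard AWS partition, and other partitions use a different domain suffix.

```python
# Hypothetical mapping from a CLI verb to the CloudFormation API action it
# would typically trigger. The real CLI performs additional steps.
CLI_TO_STACK_ACTION = {
    "create": "CreateStack",
    "update": "UpdateStack",
    "delete": "DeleteStack",
}


def stack_action_for(cli_command: str) -> str:
    """Return the CloudFormation action for a CLI verb, or raise ValueError."""
    try:
        return CLI_TO_STACK_ACTION[cli_command]
    except KeyError:
        raise ValueError(f"no stack action mapped for {cli_command!r}") from None


def cloudformation_endpoint(region: str) -> str:
    """Build the regional HTTPS endpoint (standard AWS partition assumed)."""
    return f"https://cloudformation.{region}.amazonaws.com"
```

For example, `stack_action_for("delete")` returns `"DeleteStack"`, and a cluster in `us-east-1` would call `https://cloudformation.us-east-1.amazonaws.com`.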

For more details about AWS CloudFormation, see https://aws.amazon.com/cloudformation/.

Amazon CloudWatch

Amazon CloudWatch (CloudWatch) is used to log Docker image build steps and the standard output and error of the AWS Batch jobs.

CloudWatch is used only with AWS Batch clusters.

For more details, see https://aws.amazon.com/cloudwatch/.

AWS CodeBuild

AWS CodeBuild (CodeBuild) is used to automatically and transparently build Docker images at cluster creation time.

CodeBuild is used only with AWS Batch clusters.

For more details, see https://aws.amazon.com/codebuild/.

Amazon DynamoDB

Amazon DynamoDB (DynamoDB) is used to store the minimal state of the cluster. The MasterServer tracks provisioned instances in a DynamoDB table.

DynamoDB is not used with AWS Batch clusters.

For more details, see https://aws.amazon.com/dynamodb/.

Amazon Elastic Block Store

Amazon Elastic Block Store (Amazon EBS) provides persistent storage for shared volumes. All Amazon EBS settings can be passed through the configuration. Amazon EBS volumes can either be initialized empty, or from an existing Amazon EBS snapshot.
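
The two initialization modes can be illustrated with a short Python sketch that reads an example configuration fragment. The section names (`[ebs scratch]`, `[ebs dataset]`), the keys (`shared_dir`, `volume_type`, `volume_size`, `ebs_snapshot_id`), and the snapshot ID are assumptions made for this illustration and should be checked against the configuration reference for your AWS ParallelCluster version. The `describe_ebs_sections` helper is hypothetical and is not part of AWS ParallelCluster.

```python
import configparser

# Illustrative configuration fragment. Section names, keys, and the snapshot
# ID are assumptions for this sketch, not values taken from a real cluster.
EXAMPLE_CONFIG = """
[ebs scratch]
shared_dir = /scratch
volume_type = gp2
volume_size = 100

[ebs dataset]
shared_dir = /dataset
ebs_snapshot_id = snap-0123456789abcdef0
"""


def describe_ebs_sections(config_text: str) -> dict:
    """Report, per [ebs ...] section, whether the volume starts empty."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    result = {}
    for section in parser.sections():
        if not section.startswith("ebs "):
            continue  # ignore sections that do not describe EBS volumes
        label = section.split(" ", 1)[1]
        snapshot = parser.get(section, "ebs_snapshot_id", fallback=None)
        result[label] = f"from snapshot {snapshot}" if snapshot else "empty"
    return result
```

In this fragment, the `scratch` volume has no snapshot setting and is reported as empty, while the `dataset` volume is reported as restored from the given snapshot.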

For more details about Amazon EBS, see https://aws.amazon.com/ebs/.

Amazon Elastic Compute Cloud

Amazon Elastic Compute Cloud (Amazon EC2) provides the computing capacity for AWS ParallelCluster. The MasterServer and ComputeFleet are Amazon EC2 instances. Any instance type that supports HVM can be selected. The MasterServer and ComputeFleet can be different instance types, and the ComputeFleet can also be launched as Spot Instances. Instance store volumes found on the instances are mounted as striped LVM volumes.

For more details about Amazon EC2, see https://aws.amazon.com/ec2/.

Amazon Elastic Container Registry

Amazon Elastic Container Registry (Amazon ECR) stores the Docker images built at cluster creation time. The Docker images are then used by AWS Batch to run the containers for the submitted jobs.

Amazon ECR is used only with AWS Batch clusters.

For more details, see https://aws.amazon.com/ecr/.

AWS Identity and Access Management

AWS Identity and Access Management (IAM) is used within AWS ParallelCluster. It provides a least-privilege IAM role for the Amazon EC2 instances, and this role is specific to each individual cluster. AWS ParallelCluster instances are given access only to the specific API calls that are required to deploy and manage the cluster.

With AWS Batch clusters, IAM roles are also created for the components that are involved with the Docker image building process at cluster creation time. These components include the Lambda functions that are allowed to add and delete Docker images to and from the Amazon ECR repository, and to delete the Amazon S3 bucket that is created for the cluster and CodeBuild project. There are also roles for AWS Batch resources, instances, and jobs.

For more details about IAM, see https://aws.amazon.com/iam/.

AWS Lambda

AWS Lambda (Lambda) runs the functions that orchestrate Docker image creation. Lambda also manages the cleanup of custom cluster resources, such as Docker images stored in the Amazon ECR repository and on Amazon S3.

Lambda is used only with AWS Batch clusters.

For more details, see https://aws.amazon.com/lambda/.

Amazon Simple Notification Service

Amazon Simple Notification Service (Amazon SNS) is used to receive notifications from Auto Scaling. These events are called lifecycle events, and are generated when an instance launches or terminates in an Auto Scaling group. Within AWS ParallelCluster, an Amazon SQS queue is subscribed to the Amazon SNS topic for the Auto Scaling group.
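
A notification that travels from Auto Scaling through Amazon SNS into an Amazon SQS queue arrives wrapped in an Amazon SNS envelope. The following Python sketch shows one way a consumer might unwrap it. It assumes that raw message delivery is turned off, so the queue message body is a JSON envelope whose `Message` field holds the Auto Scaling notification as a JSON string. The `parse_lifecycle_notification` function is a hypothetical helper, and the instance ID in the sample payload is made up.

```python
import json
from typing import Tuple


def parse_lifecycle_notification(sqs_body: str) -> Tuple[str, str]:
    """Return (event, instance_id) from an SNS-wrapped Auto Scaling message."""
    envelope = json.loads(sqs_body)            # outer Amazon SNS envelope
    message = json.loads(envelope["Message"])  # inner notification, stored as a string
    return message.get("Event", ""), message.get("EC2InstanceId", "")


# Illustrative payload only. The instance ID is not a real instance.
sample_body = json.dumps({
    "Type": "Notification",
    "Message": json.dumps({
        "Event": "autoscaling:EC2_INSTANCE_LAUNCH",
        "EC2InstanceId": "i-0123456789abcdef0",
    }),
})

event, instance_id = parse_lifecycle_notification(sample_body)
```

The double decoding is the point of the sketch: the outer JSON document belongs to Amazon SNS, and the inner one belongs to Auto Scaling.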

Amazon SNS is not used with AWS Batch clusters.

For more details about Amazon SNS, see https://aws.amazon.com/sns/.

Amazon Simple Queue Service

Amazon Simple Queue Service (Amazon SQS) is used to hold notification messages from Auto Scaling, sent through Amazon SNS, and notifications from the ComputeFleet instances. Using Amazon SQS decouples the sending of notifications from receiving them, and allows the MasterServer to handle them through polling. The MasterServer runs sqswatcher, which polls the queue. Auto Scaling and the ComputeFleet instances post messages to the queue.
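
The decoupling described above can be sketched with an in-memory stand-in for the queue. This is not the code that runs on the MasterServer: `InMemoryQueue` and `poll_once` are hypothetical names, and the stand-in removes a message as soon as it is received, whereas a consumer of a real Amazon SQS queue deletes each message explicitly after processing it.

```python
from collections import deque
from typing import Callable, List


class InMemoryQueue:
    """Stand-in for a message queue: producers send, a consumer receives later."""

    def __init__(self) -> None:
        self._messages: deque = deque()

    def send(self, body: str) -> None:
        self._messages.append(body)

    def receive(self, max_messages: int = 10) -> List[str]:
        batch: List[str] = []
        while self._messages and len(batch) < max_messages:
            batch.append(self._messages.popleft())
        return batch


def poll_once(queue: InMemoryQueue, handler: Callable[[str], None],
              max_messages: int = 10) -> int:
    """Fetch one batch, pass each message to the handler, return the batch size."""
    batch = queue.receive(max_messages)
    for body in batch:
        handler(body)
    return len(batch)
```

Producers call `send` whenever an event occurs and do not wait for the consumer. The consumer calls `poll_once` on its own schedule, and a return value of 0 means that the queue was empty at that moment.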

Amazon SQS is not used with AWS Batch clusters.

For more details about Amazon SQS, see https://aws.amazon.com/sqs/.

Amazon Simple Storage Service

Amazon Simple Storage Service (Amazon S3) is used to store the AWS ParallelCluster templates used in each Region. AWS ParallelCluster can be configured to allow CLI/SDK tools to use Amazon S3.

When an AWS Batch cluster is used, an Amazon S3 bucket in the customer's account is used for storage. For example, it stores artifacts used by the Docker image creation, and scripts from submitted jobs.

For more details, see https://aws.amazon.com/s3/.