Serverless - High Performance Computing Lens

Serverless

The loosely coupled cloud journey often leads to an environment that is entirely serverless, meaning that you can concentrate on your applications and leave the server provisioning responsibility to managed services. AWS Lambda can run code without the need to provision or manage servers. You pay only for the compute time you consume; there is no charge when your code is not running. You upload your code, and Lambda takes care of everything required to run and scale it. Lambda functions can also be triggered automatically by events from other AWS services.

Scalability is a second advantage of the serverless Lambda approach. Although each worker may be modest in size (for example, a compute core with some memory), the architecture can spawn thousands of concurrent Lambda workers, thus reaching a large compute throughput capacity and earning the HPC label. For example, a large number of files can be analyzed by invocations of the same algorithm, a large number of genomes can be analyzed in parallel, or a large number of gene sites within a genome can be modeled. Both the largest attainable scale and the speed of scaling matter. While server-based architectures require time on the order of minutes to increase capacity in response to a request (even when using virtual machines such as EC2 instances), serverless Lambda functions scale in seconds. AWS Lambda therefore enables HPC infrastructure that responds within seconds to unforeseen requests for compute-intensive results, and that can fulfill a variable number of requests without any resources being wastefully provisioned in advance.
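The fan-out pattern described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function names `build_payloads` and `fan_out`, the payload shape, and the worker function name are all assumptions made for the example. It uses the boto3 Lambda client's `invoke` call with `InvocationType="Event"` to request asynchronous execution, and it accepts an injected client so that the logic can be exercised without AWS credentials.

```python
import json


def build_payloads(object_keys, bucket):
    """Create one small, independent payload per input object (assumed shape)."""
    return [
        json.dumps({"bucket": bucket, "key": key}).encode("utf-8")
        for key in object_keys
    ]


def fan_out(object_keys, bucket, function_name, lambda_client=None):
    """Asynchronously invoke one Lambda worker per object key.

    function_name names a hypothetical worker function. lambda_client is
    injectable; when omitted, a real boto3 client is created.
    """
    if lambda_client is None:
        import boto3  # imported lazily; only needed for real invocations
        lambda_client = boto3.client("lambda")

    accepted = 0
    for payload in build_payloads(object_keys, bucket):
        response = lambda_client.invoke(
            FunctionName=function_name,
            InvocationType="Event",  # asynchronous: do not wait for the result
            Payload=payload,
        )
        # Lambda answers with HTTP 202 when an asynchronous request is accepted.
        if response["StatusCode"] == 202:
            accepted += 1
    return accepted
```

Because each payload is independent, the workers need no communication with one another, which is what makes the workload loosely coupled. The number of workers that actually run at once is bounded by the account's Lambda concurrency limits.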

In addition to compute, there are other serverless services that aid HPC workflows. AWS Step Functions lets you coordinate multiple steps in a pipeline by stitching together different AWS services. For example, an automated genomics pipeline can be created with AWS Step Functions for coordination, Amazon S3 for storage, AWS Lambda for small tasks, and AWS Batch for data processing.
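As a rough sketch of such a pipeline, the following builds a two-step state machine definition in Amazon States Language: a Lambda task for a small validation step, followed by an AWS Batch job for the heavy processing. The state names, the job name, and the helper `build_pipeline_definition` are hypothetical, and the ARNs are caller-supplied placeholders; only the `lambda:invoke` and `batch:submitJob.sync` service integrations are taken from Step Functions itself.

```python
import json


def build_pipeline_definition(lambda_arn, job_queue_arn, job_definition_arn):
    """Return an Amazon States Language definition as a Python dict.

    All three ARNs are placeholders supplied by the caller.
    """
    return {
        "Comment": "Illustrative two-step pipeline: Lambda, then AWS Batch",
        "StartAt": "ValidateInput",
        "States": {
            # Small task: run a Lambda function on the execution input.
            "ValidateInput": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {
                    "FunctionName": lambda_arn,
                    "Payload.$": "$",
                },
                "ResultPath": "$.validation",
                "Next": "ProcessData",
            },
            # Heavy task: submit a Batch job and wait for it to finish (.sync).
            "ProcessData": {
                "Type": "Task",
                "Resource": "arn:aws:states:::batch:submitJob.sync",
                "Parameters": {
                    "JobName": "process-sample",
                    "JobQueue": job_queue_arn,
                    "JobDefinition": job_definition_arn,
                },
                "End": True,
            },
        },
    }


def to_json(definition):
    """Serialize the definition for use when creating a state machine."""
    return json.dumps(definition, indent=2)
```

The resulting JSON string is the kind of document that would be passed as the definition when creating a state machine; the input and output data for each step would typically live in Amazon S3, with only object locations passed between states.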

Serverless architectures are best suited to loosely coupled workloads, or to workflow coordination when combined with another HPC architecture.

Reference Architecture

Figure 4: Example Lambda-deployed loosely coupled workload

Workflow steps:

  1. The user uploads a file to an S3 bucket through the AWS CLI or SDK.

  2. The input file is saved with an incoming prefix (for example, input/).

  3. An S3 event automatically triggers a Lambda function to process the incoming data.

  4. The output file is saved back to the S3 bucket with an outgoing prefix (for example, output/).
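Steps 3 and 4 above might be implemented along the following lines. This is a minimal sketch under stated assumptions: the `process` function is a stand-in (it merely counts lines) for whatever analysis the workload performs, the helper `output_key_for` is hypothetical, and the optional `s3_client` parameter exists only so the handler can be tested without AWS access. The event layout (`Records`, `s3.bucket.name`, `s3.object.key`) follows the S3 event notification format, in which object keys arrive URL-encoded.

```python
import urllib.parse

INPUT_PREFIX = "input/"    # incoming prefix from step 2
OUTPUT_PREFIX = "output/"  # outgoing prefix from step 4


def output_key_for(input_key):
    """Map input/<name> to output/<name>."""
    if not input_key.startswith(INPUT_PREFIX):
        raise ValueError("unexpected key outside the input prefix: " + input_key)
    return OUTPUT_PREFIX + input_key[len(INPUT_PREFIX):]


def process(data):
    """Placeholder computation (assumption): count the lines in the input."""
    return str(len(data.splitlines())).encode("utf-8")


def handler(event, context, s3_client=None):
    """Lambda entry point, invoked once per S3 event notification."""
    if s3_client is None:
        import boto3  # imported lazily; available in the Lambda Python runtime
        s3_client = boto3.client("s3")

    written = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Keys in S3 events are URL-encoded (for example, spaces become '+').
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
        out_key = output_key_for(key)
        s3_client.put_object(Bucket=bucket, Key=out_key, Body=process(body))
        written.append(out_key)
    return {"written": written}
```

Because the output is written to the same bucket that triggers the function, the S3 event notification should be filtered to the input/ prefix; otherwise each output object could trigger the function again.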