Using Data Repositories with Amazon FSx for Lustre

Amazon FSx for Lustre provides high-performance file systems optimized for fast workload processing. It supports workloads such as machine learning, high-performance computing (HPC), video processing, financial modeling, and electronic design automation (EDA). These workloads commonly require data to be presented through a scalable, high-speed file system interface. Their datasets are typically stored on long-term durable data stores such as Amazon S3 or on-premises storage.

When you use Amazon FSx with a durable storage repository, you can ingest and process large volumes of file data in a high-performance file system. At the same time, you can periodically write intermediate results to your data repository. By using this approach, you can restart your workload at any time using the latest data stored in your data repository. When your workload is done, you can write final results from your file system to your data repository and delete your file system.
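One way to script the periodic write-back described above is a data repository task that exports changed files to the linked repository. The following is a minimal sketch, not a definitive implementation: it assumes the file system was created with an export path, that boto3 is installed and configured with credentials, and the file system ID and the helper names `build_export_task_request` and `start_export` are hypothetical.

```python
from typing import Any, Dict, Iterable, Optional


def build_export_task_request(
    file_system_id: str,
    paths: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for the boto3 FSx create_data_repository_task call.

    paths are relative to the file system's mount point; when omitted,
    the Paths key is left out so the task covers the whole file system.
    """
    request: Dict[str, Any] = {
        "Type": "EXPORT_TO_REPOSITORY",
        "FileSystemId": file_system_id,
        # Report is a required parameter; the completion report is disabled here.
        "Report": {"Enabled": False},
    }
    if paths:
        request["Paths"] = list(paths)
    return request


def start_export(request: Dict[str, Any]) -> Dict[str, Any]:
    """Submit the export task. Needs boto3 and AWS credentials (assumption)."""
    import boto3  # imported lazily so the builder above works without boto3

    return boto3.client("fsx").create_data_repository_task(**request)


if __name__ == "__main__":
    # Hypothetical file system ID and directory, shown for illustration only.
    print(build_export_task_request("fs-0123456789abcdef0", ["checkpoints"]))
```

Separating the request builder from the API call keeps the parameters easy to inspect before anything is sent to AWS.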

You can link your Amazon FSx file system to an Amazon S3 durable data repository when you create the file system. For more information, see Step 1: Create Your Amazon FSx for Lustre File System.
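If you create the file system programmatically rather than through the console, the link is expressed as an import path (and optionally an export path) in the Lustre configuration. The sketch below builds such a request for the boto3 `create_file_system` call. It is illustrative only: the bucket name, subnet ID, storage capacity, and the helper names `build_linked_fs_request` and `create_linked_file_system` are assumptions, and the request omits optional settings such as security groups and tags.

```python
from typing import Any, Dict, Optional


def build_linked_fs_request(
    bucket: str,
    subnet_id: str,
    storage_capacity_gib: int = 1200,
    export_prefix: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for the boto3 FSx create_file_system call."""
    # ImportPath links the file system to the S3 bucket at creation time.
    lustre_config: Dict[str, Any] = {"ImportPath": f"s3://{bucket}"}
    if export_prefix is not None:
        # The export path must be in the same bucket as the import path.
        prefix = export_prefix.strip("/")
        lustre_config["ExportPath"] = f"s3://{bucket}/{prefix}"
    return {
        "FileSystemType": "LUSTRE",
        "StorageCapacity": storage_capacity_gib,
        "SubnetIds": [subnet_id],
        "LustreConfiguration": lustre_config,
    }


def create_linked_file_system(request: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request. Needs boto3 and AWS credentials (assumption)."""
    import boto3  # imported lazily so the builder above works without boto3

    return boto3.client("fsx").create_file_system(**request)


if __name__ == "__main__":
    # Hypothetical bucket and subnet, shown for illustration only.
    print(build_linked_fs_request("example-bucket", "subnet-0123456789abcdef0",
                                  export_prefix="results"))
```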

Amazon FSx is deeply integrated with Amazon S3. This integration means that you can seamlessly access the objects stored in your Amazon S3 buckets from applications that mount your Amazon FSx file system. When you use Amazon FSx with a data repository, you can import your data into your Amazon FSx file system as needed. You can also run your compute-intensive workloads on Amazon EC2 instances in the AWS Cloud and export the results to your data repository after your workload is complete.

Amazon FSx also supports cloud-bursting workloads that start from on-premises file systems: you can copy data from on-premises clients to your file system over AWS Direct Connect or a VPN connection.

Important

If you have linked one or more Amazon FSx file systems to a durable data repository on Amazon S3, don't delete the Amazon S3 bucket until all linked file systems have been deleted.
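Before deleting a bucket, it can help to check whether any file system still points at it. The sketch below filters file system descriptions, in the shape returned by the boto3 `describe_file_systems` call, for those whose import or export path is in a given bucket. It assumes the link is recorded under `LustreConfiguration.DataRepositoryConfiguration`, as it is for file systems linked at creation time; the helper names and sample IDs are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Optional


def bucket_of(s3_uri: Optional[str]) -> Optional[str]:
    """Return the bucket name from an s3:// URI, or None if it is not one."""
    if not s3_uri or not s3_uri.startswith("s3://"):
        return None
    return s3_uri[len("s3://"):].split("/", 1)[0]


def linked_file_system_ids(
    file_systems: Iterable[Dict[str, Any]], bucket: str
) -> List[str]:
    """List IDs of file systems whose import or export path uses the bucket."""
    linked = []
    for fs in file_systems:
        repo = fs.get("LustreConfiguration", {}).get(
            "DataRepositoryConfiguration", {}
        )
        paths = (repo.get("ImportPath"), repo.get("ExportPath"))
        if any(bucket_of(p) == bucket for p in paths):
            linked.append(fs["FileSystemId"])
    return linked


def fetch_file_systems() -> List[Dict[str, Any]]:
    """Collect all file system descriptions. Needs boto3 and credentials."""
    import boto3  # imported lazily so the helpers above work without boto3

    pages = boto3.client("fsx").get_paginator("describe_file_systems").paginate()
    return [fs for page in pages for fs in page["FileSystems"]]


if __name__ == "__main__":
    # Hypothetical descriptions, shown for illustration only.
    sample = [
        {"FileSystemId": "fs-aaaa", "LustreConfiguration": {
            "DataRepositoryConfiguration": {"ImportPath": "s3://example-bucket"}}},
        {"FileSystemId": "fs-bbbb", "LustreConfiguration": {}},
    ]
    print(linked_file_system_ids(sample, "example-bucket"))
```

An empty result only covers the account and Region that were queried, so treat it as a safeguard rather than proof that the bucket is unused.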