New — File Release for Amazon FSx for Lustre

August 18, 2023

Amazon FSx for Lustre provides fully managed shared storage with the scalability and high performance of the open-source Lustre file system to support your Linux-based workloads. FSx for Lustre is for workloads where storage speed and throughput matter. It helps you avoid storage bottlenecks, increase utilization of compute resources, and decrease time to value for workloads that include artificial intelligence (AI) and machine learning (ML), high performance computing (HPC), financial modeling, and media processing. FSx for Lustre integrates natively with Amazon Simple Storage Service (Amazon S3), synchronizing changes in both directions with automatic import and export, so that you can access your Amazon S3 data lakes through a high-performance POSIX-compliant file system on demand.

Today, I’m excited to announce file release for FSx for Lustre. This feature helps you manage your data lifecycle by releasing file data that has been synchronized with Amazon S3. File release frees up storage space so that you can continue writing new data to the file system while retaining on-demand access to released files through FSx for Lustre lazy loading from Amazon S3. You specify a directory to release from and, optionally, a minimum amount of time since last access. Only files in the specified directory that have not been accessed within that time (if specified) are released. Because colder file data then lives only in S3, you can also take advantage of S3 tiering.

You can initiate file release tasks from the AWS Management Console, or by making an API call using the AWS CLI or an AWS SDK. You can also use Amazon EventBridge Scheduler to run release tasks at regular intervals. If you want, you can receive a completion report at the end of each release task.

Initiating a Release Task
As an example, let’s look at how to use the console to initiate a release task. To specify criteria for files to release (for example, directories or time since last access), we define release data repository tasks (DRTs). DRTs release all files that are synchronized with Amazon S3 and that meet the specified criteria. It’s worth noting that release DRTs are processed in sequence. This means that if you submit a release DRT while another DRT (for example, import or export) is in progress, the release DRT will be queued but not processed until after the import or export DRT has completed.

Note: For the data repository association to work, automatic backups for the file system must be disabled (use the Backups tab to do this). Also ensure that the file system and the associated S3 bucket are in the same AWS Region.

I already have an FSx for Lustre file system named my-fsx-test.

I create a data repository association, which is a link between a directory on the file system and an S3 bucket or prefix.

I specify the name of the S3 bucket or an S3 prefix to be associated with the file system.

After the data repository association has been created, I select Create release task.

The release task releases the files and directories that match your criteria. Remember that these files or directories must be synchronized with an S3 bucket for the release to work. If you specified a minimum time since last access (in addition to the directory), only files that have not been accessed within that period are released.
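To make the selection criteria concrete, here is a minimal sketch in Python of the rule described above. It is only an illustration and not how FSx for Lustre implements the check: the `FileInfo` record, the `is_release_candidate` function, and the example paths are hypothetical, and treating a file whose age exactly equals the threshold as eligible is my assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from pathlib import PurePosixPath
from typing import Optional


@dataclass
class FileInfo:
    """Hypothetical record describing one file on the file system."""
    path: str              # absolute path of the file
    last_access: datetime  # time the file was last accessed
    synced_to_s3: bool     # True if the file data is synchronized with S3


def is_release_candidate(
    f: FileInfo,
    directory: str,
    now: datetime,
    min_days_since_access: Optional[int] = None,
) -> bool:
    """Return True if the file matches the release criteria in this sketch."""
    # Only data that is already synchronized with S3 can be released.
    if not f.synced_to_s3:
        return False
    # The file must live somewhere under the directory named in the task.
    if PurePosixPath(directory) not in PurePosixPath(f.path).parents:
        return False
    # Without a last-access threshold, every synchronized file qualifies.
    if min_days_since_access is None:
        return True
    # Assumption: a file exactly at the threshold counts as eligible.
    return now - f.last_access >= timedelta(days=min_days_since_access)
```

With a 30-day threshold, a synchronized file under the chosen directory that was last read 40 days ago qualifies, while one read two days ago does not.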

In my example, I chose to Disable completion reports. If you choose to Enable completion reports instead, the release task produces a report when it finishes.
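The same release task can also be submitted programmatically. Below is a minimal sketch in Python that builds the request for the FSx `CreateDataRepositoryTask` API and passes it to boto3. The helper names, the file system ID, the path, and the report location are hypothetical placeholders, and the parameter shapes reflect my reading of the FSx API reference, so verify them against the current documentation before relying on them.

```python
from typing import Any, Dict, List, Optional


def build_release_task_request(
    file_system_id: str,
    paths: List[str],
    min_days_since_access: Optional[int] = None,
    report_s3_path: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a release data repository task."""
    request: Dict[str, Any] = {
        "FileSystemId": file_system_id,
        "Type": "RELEASE_DATA_FROM_FILESYSTEM",
        "Paths": list(paths),
        # Completion reports disabled, as in the console example.
        "Report": {"Enabled": False},
    }
    if min_days_since_access is not None:
        # Only release files not accessed for at least this many days.
        request["ReleaseConfiguration"] = {
            "DurationSinceLastAccess": {
                "Unit": "DAYS",
                "Value": min_days_since_access,
            }
        }
    if report_s3_path is not None:
        # Enable a completion report written to the given S3 location.
        request["Report"] = {
            "Enabled": True,
            "Path": report_s3_path,
            "Format": "REPORT_CSV_20191124",
            "Scope": "FAILED_FILES_ONLY",
        }
    return request


def submit_release_task(request: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request to Amazon FSx (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the builder works without boto3 installed

    return boto3.client("fsx").create_data_repository_task(**request)
```

For example, `build_release_task_request("fs-0123456789abcdef0", ["/data"], min_days_since_access=30)` describes a task that releases files under a hypothetical `/data` directory that have not been accessed for 30 days, with completion reports disabled. Keeping the request builder separate from the call makes it easy to inspect the payload before anything is sent.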

Files that have been released can still be accessed using existing FSx for Lustre functionality to automatically retrieve data from Amazon S3 back to the file system on demand. This is because, although released, their metadata stays on the file system.

File release won’t automatically prevent your file system from becoming full. It remains important to ensure that you don’t write more data than the available storage capacity before you run the next release task.
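One way to reason about this is a simple headroom estimate. The sketch below, in Python, is a back-of-the-envelope calculation rather than anything FSx for Lustre provides: the function names are hypothetical, and it assumes a steady write rate that you measure yourself.

```python
def hours_until_full(free_bytes: float, write_rate_bytes_per_hour: float) -> float:
    """Estimate how long the remaining capacity lasts at a steady write rate."""
    if write_rate_bytes_per_hour <= 0:
        return float("inf")  # nothing is being written, so the space never runs out
    return free_bytes / write_rate_bytes_per_hour


def needs_earlier_release(
    free_bytes: float,
    write_rate_bytes_per_hour: float,
    hours_until_next_release: float,
) -> bool:
    """Return True if the file system would fill before the next release task."""
    return hours_until_full(free_bytes, write_rate_bytes_per_hour) < hours_until_next_release


GIB = 1024 ** 3
# 1,000 GiB free at 100 GiB per hour lasts 10 hours, so a release task
# that is 24 hours away would run too late.
print(needs_earlier_release(1000 * GIB, 100 * GIB, 24))
```

If the estimate shows the file system filling up first, you can schedule release tasks more often or reduce the minimum time since last access so that more data qualifies.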

Now Available
File release on FSx for Lustre is available today in all AWS Regions where FSx for Lustre is supported, on all new or existing S3-linked file systems running Lustre version 2.12 or later. There is no additional cost for using file release. However, if you later access released files from the file system, you will incur normal Amazon S3 request and data retrieval costs, where applicable, when those files are read back into the file system.

To learn more, visit the Amazon FSx for Lustre page, and please send feedback to AWS re:Post for Amazon FSx for Lustre or through your usual AWS support contacts.