LUSTRE

Lustre is a high-performance scratch file system used for data-intensive cluster computing. Users should conduct all I/O-intensive compute work on the Lustre file system. Lustre is scalable and is usually composed of many servers – metadata servers (MDS) and object storage servers (OSS) – with possibly thousands of clients (compute nodes). Our Lustre file system provides 2.1 petabytes of storage with high aggregate throughput.

Each PI receives a default allocation of 1 terabyte, which should be enough for most researchers. However, we realize that many researchers will need more space: an increase to 10 terabytes is available upon request from the PI, and allocations beyond 10 TB are also possible upon request, on a temporary basis.

There are two main directories in each Lustre allocation (illustrated by the sketch after this list):

  • “work” (a symlink to /scratch/group/PI-name) – a group directory where members of a research group can easily share files. 
  • “scratch” (a symlink to /scratch/users/userid) – a personal directory for each user. All data in this directory is private to the user. 
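
The following minimal Python sketch illustrates the intended split between the two directories: private intermediate output goes to scratch, and a finished result is copied into work for the group to use. The PI-name and file names are hypothetical placeholders.

    import os
    import shutil

    # Private per-user scratch directory (symlink target shown above).
    scratch = os.path.join("/scratch/users", os.environ["USER"])

    # Shared group directory; "PI-name" is a placeholder for your PI's name.
    work = "/scratch/group/PI-name"

    # Write intermediate output to private scratch...
    result = os.path.join(scratch, "run42_output.dat")  # hypothetical file
    with open(result, "w") as f:
        f.write("example results\n")

    # ...then copy the finished result to the group directory to share it.
    shutil.copy2(result, os.path.join(work, os.path.basename(result)))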

We recommend that you do not compile code on Lustre; compile on the ZFS system instead. Compiling code requires many small-file operations and repeated metadata queries about each file. Lustre does not perform at its best when a program repeatedly asks for metadata, for example by checking whether a file exists, listing the contents of a directory, or repeating these small operations over many files.
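
As a rough sketch of the access pattern to avoid (the directory name is a placeholder), the naive loop below issues two metadata requests per file, while os.scandir() can often answer file-type questions from the directory listing itself:

    import os

    src = "/scratch/users/userid/build"  # placeholder directory of many small files

    # Anti-pattern on Lustre: each exists()/stat() call below is a separate
    # metadata request that must be answered by the metadata server (MDS).
    for name in os.listdir(src):
        path = os.path.join(src, name)
        if os.path.exists(path):                 # metadata request
            print(name, os.stat(path).st_size)  # another metadata request

    # Gentler alternative: os.scandir() can usually answer is_file() from
    # the directory entry itself, avoiding one round trip per file.
    for entry in os.scandir(src):
        if entry.is_file():
            print(entry.name)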

Lustre should not be used to store executables, as these files are often much smaller than the large files Lustre is optimized to serve.

Note: Lustre allocations are for temporary data and are NOT backed up. In the near future, files older than 6 months will be automatically deleted from Lustre. Users should plan to migrate their data off of Lustre once they are done using it for their calculations.
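
One simple way to migrate results (a sketch only; the userid, PI-name, and project paths are placeholders) is to copy them into the backed-up ZFS “data” directory described in the next section, then free the Lustre space:

    import shutil

    # Placeholders: substitute your own userid, PI name, and project name.
    src = "/scratch/users/userid/finished_project"
    dst = "/data/PI-name/finished_project"

    # Copy results to the backed-up ZFS "data" directory, then remove the
    # scratch copy so the space is freed before the purge.
    shutil.copytree(src, dst)
    shutil.rmtree(src)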

To read more about best practices, click here.

ZFS

The ZFS file system should be used mainly for storing large amounts of data and executables, and for compiling code. ZFS is a combination of a file system and a logical volume manager, originally designed by Sun Microsystems. It is usually mounted on the compute nodes via NFSv4.

Each PI receives a default quota of 1 terabyte; group members can share data within the PI's data directory. This quota should be enough for most research groups, but it can be increased up to 10 TB upon request from the PI.

The following directories utilize ZFS:

  • “data” (a symlink to /data/PI-name) – the “data” file system is backed up to a remote location.
  • $HOME directory – 50 GB quota – the home directory can be used to host critical files such as software applications. It is backed up to a remote location. $HOME directories are private to each user.
  • ~/code directory – a directory found within each user's $HOME directory, intended as a repository for scripts, executables, and environments (see the sketch after this list).
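
For example, a user might install a personal script into ~/code and mark it executable. This is a minimal sketch; the script name is a placeholder:

    import os
    import shutil
    import stat

    # Expand ~/code for the current user and make sure it exists.
    code_dir = os.path.expanduser("~/code")
    os.makedirs(code_dir, exist_ok=True)

    # Copy a script (placeholder name) into ~/code and make it executable.
    target = os.path.join(code_dir, "my_analysis.py")
    shutil.copy2("my_analysis.py", target)
    os.chmod(target, os.stat(target).st_mode | stat.S_IXUSR)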

To read more about best practices, click here.