Scientific Computing Linux Clusters Cold storage solution

Data that are no longer actively used but are still stored on BriefCASE need to be archived to a different service. This can be either because the data are not needed anymore or because the original project that needed the data has ended. Such data may be archived either for later use or for regulatory reasons. BriefCASE is meant for data that are actively worked on, so keeping inactive data there causes inefficiencies and high cost.

To optimize the available resources, you can sign up for the automated archival. The automated archival collects statistics on all folders in the BriefCASE storage and creates a list of folders that are larger than 10000 KB (about 10 MB) and have not been modified for over 365 days. This list is saved in the BriefCASE folder in human-readable form (a tab-separated txt file). After a grace period, the listed folders are moved to the MAD storage. MAD has the same secure protocols as BriefCASE and is accessible from the Scientific Computing (SciC) Linux Clusters, but at a slower speed. This allows the data to be easily moved back at a later point in time if they are actively needed again.
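The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the two criteria (size above 10000 KB, no modification for over 365 days), not the actual archival script: the function names, the tuple layout, and the example paths are assumptions made for this sketch.

```python
from datetime import datetime, timedelta

SIZE_THRESHOLD_KB = 10_000      # from the text: folders above 10000 KB
MAX_IDLE = timedelta(days=365)  # from the text: not modified for over 365 days


def is_movable(size_kb, last_change, now):
    """Return True if a folder meets both archival criteria."""
    return size_kb > SIZE_THRESHOLD_KB and (now - last_change) > MAX_IDLE


def select_movable(folders, now):
    """Filter (size_kb, last_change, full_path) tuples down to movable ones."""
    return [f for f in folders if is_movable(f[0], f[1], now)]


# Hypothetical folder statistics, for illustration only.
now = datetime(2024, 6, 1)
folders = [
    (250_000, datetime(2022, 1, 15), "/data/LABNAME/old_project"),    # large and stale
    (250_000, datetime(2024, 5, 1), "/data/LABNAME/active_project"),  # large but recent
    (500, datetime(2020, 3, 2), "/data/LABNAME/small_notes"),         # stale but small
]
for size_kb, change, path in select_movable(folders, now):
    print(path)  # prints /data/LABNAME/old_project
```

Only the first folder is selected, because a folder must satisfy both conditions at once.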

Sign up:

Prerequisites:

  • Request a MAD account (or have a MAD account with the same group access as your BriefCASE account). You might need a separate PAS group and two keygivers to regulate access to it.
  • Have a working /data group folder with proper group permissions set up on the folders to be archived.

Sign up:

To sign up for the automatic archival, please send the following information to hpcsupport@partners.org:

  • BriefCASE storage: The name of the BriefCASE storage you want to sign up, and the name of your current MAD account to use for archival. If you do not have a MAD account yet, a new MAD account name will be created accordingly.

    • For new MAD accounts:

      • Key Givers (PHS Username): Two individuals who have keygiver privileges. They can be the same ones as for BriefCASE.
      • Data classification: (public, institutional, confidential). Choose the classification of the most confidential data you might have to move to MAD.
      • Institution (MGH, BWH, PHS): your institution
      • lab/PI name: 
      • Department
      • Storage required: Please be conservative with this number as we can always expand the quota on request. General rule of thumb is request what will be needed for the year or a little more.
      • Purpose: Research / Clinical / Other (describe in comments)
      • Additional admins (Mass General Brigham user name): You can nominate other people to receive the automated messages sent by the archival service.
      • Funding source: the specific grant or fund number you want the MAD storage to be charged against.  

Once your MAD account is ready, you can move files to it for archival. In addition, every two months we will check whether there are any movable files in your BriefCASE storage. Movable files in this case are folders that take up more than 10 MB and have had no modifications within a year. If we find folders meeting these criteria and they total at least 10 GB, you will receive an e-mail notification. You will also find a file named movable_files_LABNAME.txt in your /data/LABNAME folder. The file is a tab-separated list with:

  • size: Size of the folder in KB
  • change: Date of the last change
  • full_path: The path to the folder. 
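If you prefer to inspect the list with a script, the following Python sketch reads a tab-separated listing and sums the folder sizes. It assumes the file starts with a header row naming the three columns (size, change, full_path) and that dates look like 2022-01-15; neither detail is confirmed here, so check your actual file first. The sample rows are made up.

```python
import csv
import io


def read_movable(handle):
    """Parse a tab-separated listing into dicts with an integer size in KB."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        row["size"] = int(row["size"])
        rows.append(row)
    return rows


# Hypothetical file content; a header row is an assumption of this sketch.
sample = (
    "size\tchange\tfull_path\n"
    "250000\t2022-01-15\t/data/LABNAME/old_project\n"
    "12000\t2021-11-03\t/data/LABNAME/raw_scans\n"
)
rows = read_movable(io.StringIO(sample))
total_kb = sum(r["size"] for r in rows)
print(f"{len(rows)} folders, {total_kb} KB in total")
# prints: 2 folders, 262000 KB in total
```

To read the real file, pass an open file handle instead, for example open("/data/LABNAME/movable_files_LABNAME.txt", newline="").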

You can review the file. This gives you the opportunity to manually delete a row if you want to keep a folder on BriefCASE, or to add a row if you want to archive something that is not listed. You can also look at the folders themselves and remove them from disk if you are sure the data are not needed anymore.
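Editing the list in a text editor is usually enough. For longer lists, the Python sketch below shows one way to drop the rows for folders you want to keep on BriefCASE. It assumes that full_path is the last tab-separated column on each line; the function name and the example paths are hypothetical.

```python
def drop_folders(listing, keep_on_briefcase):
    """Return the listing without the rows whose path should stay on BriefCASE."""
    kept = []
    for line in listing.splitlines():
        path = line.split("\t")[-1]  # assumption: full_path is the last column
        if path not in keep_on_briefcase:
            kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical listing content, for illustration only.
listing = (
    "250000\t2022-01-15\t/data/LABNAME/old_project\n"
    "12000\t2021-11-03\t/data/LABNAME/raw_scans\n"
)
edited = drop_folders(listing, {"/data/LABNAME/raw_scans"})
print(edited, end="")
```

After this edit only the old_project row remains in the list, so raw_scans would stay on BriefCASE.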

Two weeks after the notification, the folders listed in movable_files_LABNAME.txt will be moved to MAD automatically. Note that, depending on size, this process may take a while, so some folders may already have been moved while others are still on BriefCASE.

You can find more information about the MAD service and how to access it here:

Go to KB0037642 in the IS Service Desk
