BriefCASE storage in-depth

Redundancy levels

Traditional RAID protection distributes file and parity-check data across independent disk drives. BriefCASE instead distributes file and parity data across independent file servers, an approach termed "object-RAID".

BriefCASE redundancy levels can be set on a per-file or per-directory basis.  By default, files under 64kB are stored with object-RAID1 and files over 64kB are stored with object-RAID5.  In both cases, a second, orthogonal layer of protection is applied, termed "vertical parity".  Object-RAID5 + vertical parity provides a level of protection comparable to traditional RAID6, with the advantage that object-RAID rebuild times are much shorter.  Object-RAID6 will be available in a future release.  Vertical parity overhead is about 3% (1 in 32 sectors, or 3.125%), and it protects against the loss of a disk sector.

The redundancy overhead of object-RAID5 + vertical parity for large files is 1.143570, meaning that 1GB of data occupies 1.143570GB of formatted disk space.  The Linux "du" (disk usage) command by itself reports the size on disk, and with the "--apparent-size" flag reports the actual file size:

du -B 1000000 test.bam
195764    test.bam
du --apparent-size -B 1000000 test.bam
171208    test.bam
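
Dividing the first figure by the second gives the effective overhead for a particular file. The following is a minimal sketch rather than a BriefCASE utility: the function names "overhead_ratio" and "file_overhead" are made up for illustration, and GNU du and awk are assumed to be available. For a single file the result can differ slightly from the nominal figure, for example because of block rounding.

```shell
# overhead_ratio SIZE_ON_DISK APPARENT_SIZE
# Prints size-on-disk divided by apparent size, to four decimal places.
# (Hypothetical helper, not part of BriefCASE.)
overhead_ratio() {
    awk -v disk="$1" -v apparent="$2" 'BEGIN {
        if (apparent <= 0) { print "n/a"; exit }   # avoid dividing by zero
        printf "%.4f\n", disk / apparent
    }'
}

# file_overhead FILE
# Measures both sizes in bytes with GNU du and prints their ratio.
file_overhead() {
    disk=$(du -B 1 "$1" | cut -f1)
    apparent=$(du --apparent-size -B 1 "$1" | cut -f1)
    overhead_ratio "$disk" "$apparent"
}

# Using the test.bam figures shown above:
overhead_ratio 195764 171208    # prints 1.1434
```

To measure a file directly, run "file_overhead test.bam".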

Folder usage

BriefCASE includes utilities optimized for that filesystem.  "pan_du" calculates disk usage based on file size and number of files in each folder.  Note that the default of 20 parallel threads places a heavy load on the system and should only be used off-hours.  At other times, limit the thread count:

pan_du -t 2 /data/lab_folder
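
One way to apply this rule in a script is to pick the thread count from the time of day. This is only a sketch under stated assumptions: "pan_du_threads" is a hypothetical helper, "off-hours" is assumed to mean before 08:00 or from 18:00 onward (weekends are not considered), and passing "-t 20" is assumed to be equivalent to the default.

```shell
# pan_du_threads HOUR
# Prints the thread count to pass to "pan_du -t" for a given hour (0-23).
# ASSUMPTION: off-hours means before 08:00 or from 18:00 onward.
pan_du_threads() {
    hour=${1#0}        # strip one leading zero so "08" compares as 8
    hour=${hour:-0}    # "0" becomes empty after stripping; restore it
    if [ "$hour" -ge 8 ] && [ "$hour" -lt 18 ]; then
        echo 2         # working hours: light load
    else
        echo 20        # off-hours: the documented default thread count
    fi
}

# Example invocation (not run here):
#   pan_du -t "$(pan_du_threads "$(date +%H)")" /data/lab_folder
```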

A usage report is automatically generated each week for /data folders on BriefCASE.  Look for a file named "usage_report.txt".  To quickly identify files and folders taking up the most space, use the command

sort -n -k 5 usage_report.txt | tail -n 100
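
The command above lists the largest entries last. If a largest-first listing is more convenient, the same sort can be reversed. The sketch below assumes, as the command above does, that the size is a plain number in whitespace-separated column 5 of the report. "top_usage" is a hypothetical helper name, not a BriefCASE command.

```shell
# top_usage REPORT [N]
# Prints the N largest entries of a usage report (default 10), largest first.
# ASSUMPTION: column 5 holds the size as a plain number.
top_usage() {
    sort -k 5,5nr "$1" | head -n "${2:-10}"
}

# Example invocation (not run here):
#   top_usage usage_report.txt 100
```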

Quotas

Each folder has an associated hard and soft quota.  The hard quota is a fixed limit and is the value shown in the "Size" column of the "df" command output:

cd /data/test
df -H .
Filesystem            Size  Used Avail Use% Mounted on
panfs://10.129.86.180/hpc/groups/test
                       64T   49T   16T  77% .

The "Used" column reported by "df" does not match billed usage, so do not rely on it.  It may include space occupied by snapshots, but this has not been confirmed.
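
Because the filesystem name is long, the "df" output above wraps onto two lines, which makes it awkward to read from a script. A minimal sketch for extracting just the "Size" value follows. It relies on the POSIX "-P" option, which keeps each filesystem on a single line. The function names "df_size_kb" and "hard_quota_kb" are hypothetical, and the sketch assumes the filesystem name contains no spaces.

```shell
# df_size_kb
# Reads "df -P -k" output on standard input and prints the second column
# (total size in 1K blocks) of the first data row.
df_size_kb() {
    awk 'NR == 2 { print $2 }'
}

# hard_quota_kb [DIR]
# Prints the "Size" value, in 1K blocks, for DIR (default: current directory).
hard_quota_kb() {
    df -P -k "${1:-.}" | df_size_kb
}

# Example invocation (not run here):
#   hard_quota_kb /data/test
```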

Email Scientific Computing to request a quota expansion if you anticipate exceeding the current quota.  There is also a soft quota, set at a lower value, which alerts ERIS HPC staff when a folder is close to its quota.  You can request to be included on this email alert.

Extended Attributes

Extended attributes can be used to set the redundancy level, file locking settings, and extended permissions for a file or folder.

For details, see KB0027969 in the IS Service Desk.
