Scientific Computing Report 2021
ERIS Scientific Computing provides a range of computational platforms, resources and support for research and innovation across MGB.
This report provides an overview of the resources available to the Mass General Brigham research and innovation communities and how they are used.
ERISOne/ERISTwo Linux Cluster
Account creation history
The account creation form handles account creation for both ERISOne and ERISTwo, the CPU-based high performance computing platform. Fig. 1 shows the total number of accounts created each year since 2015. Although account creation peaked in 2019, 2021 is a close second with 710 new users, a 4.5% increase over the prior year.
Of the 710 new users, the majority reported they were from Brigham and Women’s Hospital (42%), Massachusetts General Hospital (36%) or from Mass General Brigham (MGB Corporate) (11%). Figure 2 below shows the full breakdown of new accounts created in 2021 by institution.
Currently, there are 2467 active Linux accounts and a total of 250 groups.
ERISOne Access Methods
New users were asked how they intend to use the cluster and which access method they will most likely use (multiple answers are possible). Fig. 4A shows the answers from new users for each year. Most users use a command-line interface. There is a strong trend toward the web portals, Jupyter Notebook and RStudio. Notably, while remote desktop access has remained steady, use of the web portals has increased nearly every year.
Number of available nodes per cluster
|Special||6 (aristo, celeste, lm001, plato, seed1, and socrates)|
Number of resources per cluster
Please note that the data below is from nodes that are online at the time of data collection. Counts are subject to change based on nodes being online/offline.
|Total RAM (GB)||Total CPUs||Total Cores|
We provide a wide range of applications for each platform. New installations on ERISOne are currently closed. New applications are installed only on ERISTwo, which runs the latest OS. A total of 574 modules have been built on ERISTwo, with more becoming available each day.
Web Portals Usage
JupyterHub is an open-source web application that allows users to create and share documents that contain live code, equations, visualizations and narrative text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.
Figure 4 depicts open sessions over time. Please note that JupyterHub sessions stay open unless the user explicitly closes them. This explains the continual increase, with gaps representing reboots or outages. The service needs to be restarted periodically, which clears inactive sessions. The overall maximum number of users, 198, occurred in February 2020, while the maximum in 2021 was 195 users in March.
RStudio is an integrated development environment for R statistical computing. Figure 5 shows the number of active users per day. Unlike JupyterHub, RStudio closes inactive sessions automatically, which explains the frequent drops in usage and the relatively stable trend, in contrast to the spikes in JupyterHub usage. As with JupyterHub, there is a gap in the recorded data from mid-April until mid-May of 2019 due to a failure of the data collection database.
Shiny applications developed in R are a valuable tool for our community. Shiny provides web interactivity and access to Briefcase data and ERISOne resources.
- 19 public apps
- 17 private (PAS login required) apps
This represents a doubling from 2020.
The NoMachine Remote Desktops are a group of nodes on ERISOne and ERISTwo that host remote, graphical Linux desktop sessions that users connect to virtually. This allows users to work in a full Linux GUI while accessing their cluster data and any cluster software modules, which is especially helpful when working with graphical applications like MATLAB. Inactive sessions are terminated after 2 weeks.
Figure 6 below shows usage over time of the remote desktop sessions specifically on ERISOne rgs machines. In 2021, we worked to remove non-functioning graphical nodes, including several rgs nodes on ERISOne. This accounts for the slight decline throughout the year. In their place, we recently installed 8 new grx graphical nodes in ERISTwo for remote desktop usage. The grx nodes for ERISTwo remote desktop sessions are not included in the graph below.
HPCWIN3/4 Windows Analytics Servers
While the number of new Linux users continues to grow, the number of new users on HPCWIN3/4 was relatively stable from 2017 until 2020. In 2020 we saw a 30% decrease in new users, and in 2021 we saw a further decrease of 26%.
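To put the two consecutive declines in perspective, the short sketch below compounds them into a single cumulative change. It assumes each percentage is measured relative to the immediately preceding year, which the report does not state explicitly; the function name `compound_change` is illustrative only.

```python
def compound_change(*fractional_changes):
    """Combine successive year-over-year changes into one cumulative change.

    Each argument is a fractional change relative to the previous year,
    e.g. -0.30 for a 30% decrease (an assumption about how the report's
    percentages are defined).
    """
    factor = 1.0
    for change in fractional_changes:
        factor *= 1.0 + change
    return factor - 1.0


# 30% decrease in 2020 followed by a 26% decrease in 2021.
cumulative = compound_change(-0.30, -0.26)
print(f"Cumulative change in new users: {cumulative:.1%}")
```

Under that assumption, the number of new HPCWIN3/4 users in 2021 would be roughly half (about 52%) of the level seen before 2020.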
With respect to the user base, as shown in Figure 8, the distribution of users is very similar to the Linux cluster and follows roughly the same breakdown across MGB institutions.
This past year, the newest cluster, ERISXdl, opened to new users during its pilot phase. Currently free for users, ERISXdl aims to give researchers the opportunity and necessary resources to complete deep learning and other GPU-powered analyses. The cluster is made up of 3 login nodes and 5 NVIDIA DGX compute nodes, each equipped with NVIDIA V100 GPUs. Data tracking for this cluster began in June 2021, with limited usage due to the small size of the cluster and the limited nature of the pilot testing phase. Figure 9 shows the number of jobs submitted to the cluster daily, and Figure 10 shows the number of active users each month.
As seen below, while the month with the largest number of active users was July 2021, job submissions peaked in November, when there were 13 active users. Sudden spikes in job submissions can often be traced to a few users who submit large numbers of jobs in batch, but normal day-to-day job submissions remain below 50 jobs per day. Please note that daily jobs are counted by the starting date of each job, as jobs can run for up to 14 days on the cluster. This means that while only a few new jobs might be submitted on any given day, the 5 DGX nodes may still be occupied by longer jobs submitted several days prior. Additionally, the daily job count includes both completed and incomplete (failed or canceled) jobs.
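The counting rule described above (one count per job on its start date, regardless of how the job ended) can be sketched as follows. This is a minimal illustration, not the actual reporting pipeline: the record layout, job IDs, timestamps and state labels are all hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical job records: (job_id, start_time, final_state).
jobs = [
    ("1001", "2021-11-01T09:15:00", "COMPLETED"),
    ("1002", "2021-11-01T23:50:00", "FAILED"),
    ("1003", "2021-11-02T00:05:00", "CANCELED"),
    ("1004", "2021-11-02T10:30:00", "COMPLETED"),
]


def daily_job_counts(records):
    """Count jobs per calendar day, keyed by each job's start date.

    Every job is counted once on the day it started, whatever its final
    state, so failed and canceled jobs are included in the total.
    """
    counts = Counter()
    for _job_id, start_time, _state in records:
        counts[datetime.fromisoformat(start_time).date()] += 1
    return dict(sorted(counts.items()))


counts = daily_job_counts(jobs)
for day, n in counts.items():
    print(day, n)
```

Because a job is attributed only to its start date, a day with few new submissions can still have fully occupied nodes running jobs that started earlier.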
- Deploy 60 additional nodes on ERISTwo.
- Deploy 3PB of additional storage on Briefcase.
- Open ERISXdl for production with charge-back implementation.