Computational Resources

ERIS provides a range of computational resources, platforms and scientific computing support for research and innovation at Mass General Brigham hospitals.  Our high-performance analysis servers, compute clusters and storage are relied upon daily for data processing and analysis by research groups across the organization.  Clinical computational workflows such as genome sequencing and radiation dosimetry are also supported. 

Service Model - What is the cost?

Shared resources are available to all users at no cost. This includes a basic storage quota, shared computational resources and assistance from the ERIS Support teams.  Additional storage and compute capacity can be acquired using research funds.

Selecting a Computational Resource

When choosing a computational resource from our list of services below, consider:

  • Your preferred computing platform (Microsoft Windows, Linux or Hadoop)
  • Which platforms your software application runs on
  • Whether you will work interactively with applications and data, or submit many jobs together for batch processing
  • How large your data is and how much storage it will require
  • How much memory the application will require
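
If you are unsure how to answer the last two questions, one rough way to get numbers is to measure them on a small test run. The sketch below is illustrative only and is not an ERIS-provided tool: the function names are hypothetical, it assumes a Unix-like system (the Python `resource` module is not available on Windows), and it reports the peak memory of the Python process itself, so the application would need to run inside that process for the figure to be meaningful.

```python
import os
import resource
import sys


def directory_size_bytes(root):
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def peak_memory_bytes():
    """Peak resident memory of this process so far.

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak if sys.platform == "darwin" else peak * 1024


def human(n):
    """Format a byte count using binary units (KiB, MiB, ...)."""
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if n < 1024 or unit == "TiB":
            return f"{n:.1f} {unit}"
        n /= 1024


if __name__ == "__main__":
    # Directory to measure; defaults to the current directory.
    data_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    print("Data on disk:", human(directory_size_bytes(data_dir)))
    print("Peak memory of this process:", human(peak_memory_bytes()))
```

Measurements from a small sample generally need to be scaled up to the full dataset, and memory use does not always grow linearly with input size, so treat the output as a starting estimate.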

‼ For details on the HPC Remediation Plan, see the section below.

HPC Remediation Plan

A modernized, more reliable version of the ERISTwo HPC cluster, rebuilt from the existing infrastructure using current-generation tools

MGB Digital is currently in the process of improving the ERISTwo Linux Cluster as part of our HPC remediation plan. The remediation project leverages the existing ERISTwo cluster hardware while implementing new operational and system-level processes designed to enhance performance, stability, and usability, building a more modern and sustainable HPC platform. This upgraded version of ERISTwo is what we’re calling ERIS Nucleus.

Approach

We are taking the existing ERISTwo and ERISXdl infrastructure and rebuilding it using modern HPC cluster management tools and methodologies. The project follows a phased approach:

  • Phase 1: Establish foundational infrastructure and services
  • Phase 2: Deploy ERISTwo Nucleus (a subset of existing cluster nodes) as a modern HPC cluster
  • Phase 3: Integrate the remaining cluster nodes into ERISTwo Nucleus

This is a major upgrade that includes a new RHEL 9 operating system, an updated software toolbox, and an upgraded 100 GbE network fabric. These changes will require many existing workflows and scientific applications to be thoroughly tested, so your participation in testing is essential.

Why are we doing this?

The ERISTwo HPC cluster has been experiencing performance, consistency, and stability issues. Additionally, the ERISXdl GPU cluster’s Slurm/Kubernetes setup has proven complex for many AI/ML and other workloads.

What are some of the underlying reasons for these issues?

  • The existing 40 GbE network is underperforming and poorly configured.
  • The ERISTwo cluster still relies on legacy ERISOne services and processes, making day-to-day operations fragile.
  • The cluster is manually managed and configured, increasing the risk of configuration drift and instability.
  • Cluster services lack resilience, creating additional risks of instability.

ERIS Linux Cluster

High-performance computing system with a job scheduler for batch jobs, storage, and remote desktops with GPUs for graphical applications

ERISXdl Linux GPU Platform

The ERISXdl (ERIS Extreme Deep Learning) platform provides efficient multi-GPU performance designed for deep learning applications

Analytics Enclave Hub

A highly secure, privacy-aware data ecosystem equipped with self-service AI, machine learning, and research data tools.

Windows Analysis Servers

Powerful servers with scientific applications installed and remote desktop capability, running on a Microsoft Windows OS