ERISTwo Quick User Guide

What is ERISTwo?

ERISTwo is our next-generation CentOS 7 Linux cluster. It is currently in production to enable the transition off ERISOne. ERISTwo includes a change of administrative tools, an updated job scheduler, and an updated application module system. If you require GPU machines, please use ERISXdl; there is no support for new GPU applications on ERISTwo. Storage from ERISOne, both home folders and Briefcase, is available on ERISTwo and ERISXdl.

Why a new environment?

ERISOne has long been managed with administration tools that were supported by CentOS 6 but are not compatible with CentOS 7. To ease the transition for admins as well as users, we decided to provide a separate environment where all cluster features can be thoroughly tested. All new nodes, including adopted nodes, are being deployed with CentOS 7, and all new applications are being implemented for the ERISTwo environment configuration.

Why should I use it?

IMPORTANT: All users are urged to test ERISTwo with their workloads and should be able to migrate their work to ERISTwo. As users migrate their workloads, we will migrate nodes to increase the capacity of ERISTwo. Only applications that are not compatible with CentOS 7 will be maintained on ERISOne, and computational resources will be reduced according to the number of people using each system. Please let us know if any features are missing on ERISTwo.

See also the FAQ: ERISTwo Linux Cluster (beta).

Comparison Table for Users
                  ERISOne                    ERISTwo
OS                CentOS 6                   CentOS 7
Filesystem        Panasas Panfs              Panasas Panfs
Scheduler         LSF 8.0.1                  LSF 9.1.3.0
Login nodes       Yes                        Yes
Applications      Legacy modules/easybuild   New modules
Remote Desktops   Yes                        Yes
Filemovers        Yes                        Yes
RStudio Pro       No (Have been moved)       Yes
Jupyter           No (Have been moved)       Yes
Shiny             No (Have been moved)       Yes

 

How to log in

If you already have access to ERISOne, you can log in via SSH. If not, you will need to request an account by filling out the ERISOne Account Request form.

ERISTwo can be accessed by ssh:
$ ssh <userID>@eristwo.partners.org

You will land on one of our two login nodes, eris2n4 or eris2n5. As before, no large jobs should be run on these nodes; all jobs must be submitted to the compute nodes.

 

General Usage 

Overview 

ERISTwo is a Linux cluster. Currently, it is only accessible via ssh from a command line interface (bash). Jobs should be submitted through the LSF job scheduler; this includes GPU jobs and file transfer jobs. Applications are loaded via the lmod module system. Unlike on ERISOne, there is no need to activate lmod, since it is loaded by default.

 

LSF 9 

Platform Load Sharing Facility (or simply LSF) is a workload management platform and job scheduler for distributed high-performance computing (HPC). We have implemented LSF version 9.1.3.0. The general idea is that each computational job is submitted to the system, which distributes the jobs to the available nodes, providing each user with a fair share of the cluster and maximizing efficiency.

Do not run computational jobs on the login nodes. Computational jobs on the login nodes will be terminated, and we reserve the right to ban users who repeatedly violate this rule.

 

How to submit a job 

The general syntax of a job submission is: 

$ bsub [options] < script.lsf

Note: The “<” is important when you pass an lsf script.

The lsf script contains descriptions of the job. An example of an lsf script can be found in each user's lsf folder: ~/lsf/test.lsf 
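
As a rough illustration of what such a script can look like, the sketch below writes a small job file and submits it only if bsub is available. It is not a copy of ~/lsf/test.lsf: the file name example_job.lsf, the job name, and the requested resources are placeholders chosen for this example. The #BSUB lines carry the same options that could otherwise be passed to bsub on the command line.

```shell
# Create a hypothetical job script; all names and values are placeholders.
cat > example_job.lsf <<'EOF'
#!/bin/bash
# Job name (placeholder)
#BSUB -J example_job
# Queue to run in
#BSUB -q normal
# Number of cores
#BSUB -n 1
# Memory reservation, in the same units as the other examples in this guide
#BSUB -R 'rusage[mem=4000]'
# Standard output and error files; %J expands to the job ID
#BSUB -o example_job.%J.out
#BSUB -e example_job.%J.err

echo "Running on $(hostname)"
EOF

# Submit only where LSF is installed; note the "<" before the script name.
if command -v bsub >/dev/null 2>&1; then
    bsub < example_job.lsf
else
    echo "bsub not found: run this on an ERISTwo login node"
fi
```

Options given on the bsub command line can be combined with those in the script, so the same file can be reused with, for example, a different queue.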

 

The options can be specified either in the script or on the bsub command line. Some important options:

  • -q <queue_name> : Specify the queue for the job
  • -n <number of cores> : Request that number of cores for the job
  • -R '<requirements>' : Resource requirements, mainly memory

Note that if you don’t specify an option, its default value is used.

Example: 

$ bsub -q normal -R 'rusage[mem=64000]' < my_script.lsf

Depending on the job requirements, you need to choose the right queue for your job (right now there are only 5 queues).

How to start an interactive job

To start a regular interactive job, you can run, for example:

$ bsub -Is -q interactive /bin/bash

To start an interactive job with a memory reservation:

$ bsub -Is -q interactive -R 'rusage[mem=64000]' /bin/bash

Queues 

ERISTwo currently has the following queues:

Queue         Memory limits   Max run time   Job limit   PEND limit
GPU           -               4 days         100         200
Normal        <32G            15 days        500         1000
Filemove      -               5 days         100         200
Bigmem        >32G            5 days         100         200
Interactive   -               5 days         5           0


  • GPU: Limited GPU machines are available. Users must request/renew access at @email.
  • Normal: Most cluster jobs fit the normal queue. For all jobs with a memory requirement of less than 32G.
  • Filemove: Transfer files between ERISOne and an external mount.
  • Bigmem: Only for jobs with a memory requirement of more than 32G.
  • Interactive: To request an interactive session.

If you don’t specify a queue, the job will start in the normal queue, which is meant for average-size jobs. If you require GPU acceleration, you need to use the GPU queue. Filemove is meant for transferring files. Note that you should not run any large file transfers (> 100MB) on the login node.
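
The memory rule for choosing between the normal and bigmem queues can be written as a small helper. This is only a sketch: pick_queue is a hypothetical function, it assumes the memory figure is a whole number of MB with 32G taken as 32000, and it sends a request of exactly 32G to bigmem, a boundary case the table does not settle.

```shell
# pick_queue MEM_MB: print the queue suggested by the memory limits table.
# Assumptions: MEM_MB is a whole number of megabytes; 32G is taken as 32000 MB.
pick_queue() {
    if [ "$1" -lt 32000 ]; then
        echo normal
    else
        echo bigmem
    fi
}

mem=48000
queue=$(pick_queue "$mem")
# Print the command instead of running it, so the sketch is safe to try anywhere.
echo "bsub -q $queue -R 'rusage[mem=$mem]' < my_script.lsf"
```

The helper only covers the memory rule; GPU, file transfer, and interactive work still need their own queues named explicitly.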

The queues can be seen by the command bqueues 

$ bqueues
QUEUE_NAME      PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP
filemove          1  Open:Active       -    -    -    -     0     0     0     0
normal            1  Open:Active       -    -    -    - 27069 27019    50     0
gpu               1  Open:Active       -    -    -    -     0     0     0     0
bigmem            1  Open:Active       -    -    -    -     0     0     0     0
interactive       1  Open:Active       -    -    -    -     0     0     0     0

This command also shows how many jobs are currently in each queue, which gives you an idea of how long it will take for new jobs to start.
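
To pull just the pending and running counts out of that listing, the output can be filtered with awk. The sketch below assumes the default column layout shown above, where PEND is the ninth field and RUN the tenth; queue_load is a hypothetical helper name. It reads from standard input, so it can be tried on saved output without access to the scheduler.

```shell
# queue_load: read bqueues output on stdin, print "queue pending running".
# Assumes the default 11-column layout (PEND = field 9, RUN = field 10).
queue_load() {
    awk 'NR > 1 && NF >= 11 { print $1, $9, $10 }'
}

# Try it on the header and two lines copied from the listing above.
queue_load <<'EOF'
QUEUE_NAME      PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP
normal            1  Open:Active       -    -    -    - 27069 27019    50     0
interactive       1  Open:Active       -    -    -    -     0     0     0     0
EOF
```

On the cluster, the same helper can be fed directly with: bqueues | queue_load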

Jobs 

To see the current status of the jobs use:

$ bjobs [options] [jobID]

When used without any options or arguments, it shows all your current jobs. To see jobs by all users, use “-u all”. To restrict the listing to one queue, use “-q [queue_name]”.

Example:

$ bjobs -q gpu -u all

shows jobs by all users currently in the gpu queue.

 

Modules 

Environment modules are a great way to manage applications on a multi-user, multi-node system. The general problem is that users have different requirements for which software, and which version of it, they need. If multiple versions of the same software (especially libraries) are installed, the system needs rules to know which version to use. This is done by setting certain environment variables. Modules allow you to set those variables with a simple command. On ERISOne we used tcl environment modules by default, while ERISTwo uses lmod. Lmod is a more advanced module system that is better at handling complex dependencies and adds some user features. However, the basic usage and functionality are the same, and even old module files can be used. (Note: ERISOne module files point to CentOS 6 applications and should not be called.)

 

Basic usage of lmod 

In order to use an application that is available via lmod, you need to load the corresponding module: 

$ module load [modulename]/[version_number]

You don’t need to specify a version number; if you don’t, the default version is loaded. If no default is set, the highest version number is loaded.

To see the available modules, use:

$ module avail

This will show a long list of modules and the path where the corresponding module file is located. If you are looking for a specific application, you can run:

$ module avail [application_name]

This will list all modules whose names contain application_name, regardless of capitalization, so the list may include other modules that merely have the application name as part of their name. Note that this search feature is unique to lmod and will not work with the tcl environment modules on ERISOne.

If you need more information about a module, you can use the following:

$ module spider [module_name]

This shows the available versions and general information about the module.

To see which modules are currently loaded, use:

$ module list

Note that some modules load other modules as dependencies, so do not be surprised if you see several modules listed after loading only one.

 

Example: 

$ module load CMake
$ module list

Currently Loaded Modules: 

  1) GCCcore/7.3.0   2) ncurses/6.1-GCCcore-7.3.0   3) CMake/3.11.4-GCCcore-7.3.0 

You can unload modules with:

$ module unload [module_name]

  

Important: When you use an application in a batch job, you need to add the “module load” command to the lsf file.  
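
For instance, a batch script that needs CMake could load the module before calling it. The sketch below writes such a script; the file name cmake_job.lsf, the job name, and the resource requests are placeholders for illustration, and the module name is the one used in the example above.

```shell
# Write a hypothetical job script that loads a module before using it.
cat > cmake_job.lsf <<'EOF'
#!/bin/bash
#BSUB -J cmake_job
#BSUB -q normal
#BSUB -n 1

# Load the application inside the job script, then use it.
module load CMake
cmake --version
EOF

# Show how the script would be submitted on the cluster.
echo "Submit with: bsub < cmake_job.lsf"
```

Pinning a version in the module load line (for example the full name reported by module list) keeps the job reproducible if the default version changes later.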

 

 

Go to KB0037393 in the IS Service Desk