October 30, 2024
The operating system and software on compute nodes require periodic updates as bug fixes and stability improvements are released in the Linux distribution. Configuration changes may also be required to improve performance or stability. Before these changes are applied to all cluster nodes, ERISTwo will announce on the ERISTwo mailing list that upgraded nodes are available for testing.
Testing Cluster jobs prior to compute node updates
Compute node updates are first made available on a group of testing nodes for all users to validate their applications on the new platform before being applied throughout the cluster. To run a job on the latest testing platform, use these additional options with "bsub":
bsub -sla testing_sc -m testing_hg < test.lsf
replacing "test.lsf" with your job script. The '-sla' option gives access to the testing nodes, and the '-m' option restricts the job to only run on a testing node.
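As a minimal sketch of the submission step above, the following shell helper wraps the same "bsub" options and adds a dry-run mode so the command line can be inspected before anything is submitted. The function names "testing_bsub_cmd" and "submit_to_testing" and the "DRY_RUN" variable are hypothetical and not part of LSF or the cluster configuration; only the "-sla testing_sc" and "-m testing_hg" options come from this page.

```shell
#!/bin/sh
# Hypothetical helpers (not provided by LSF or the cluster).

# Print the bsub command line that targets the testing nodes.
# $1: path to the job script, e.g. test.lsf
testing_bsub_cmd() {
    printf 'bsub -sla testing_sc -m testing_hg < %s\n' "$1"
}

# Submit a job script to the testing nodes.
# With DRY_RUN=1 the command is only printed, not executed.
submit_to_testing() {
    script=$1
    if [ ! -r "$script" ]; then
        echo "cannot read job script: $script" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        testing_bsub_cmd "$script"
    else
        # Same options as documented above.
        bsub -sla testing_sc -m testing_hg < "$script"
    fi
}
```

For example, after sourcing these functions, running "DRY_RUN=1 submit_to_testing test.lsf" prints the command without contacting the scheduler, and "submit_to_testing test.lsf" submits the job.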
Forcing a job to only run on a specific operating system version
The current operating system version of all nodes (or a specific node) can be checked with the command
lshosts -s osver
RESOURCE    VALUE    LOCATION
osver       6.7      aristo
osver       6.7      celeste
osver       7.4      atto008
osver       6.5      cmu004
osver       6.5      cmu005
which returns version numbers, e.g. "6.5", in the VALUE column. To view the status of nodes matching a specific operating system version, use
bhosts -R 'select[osver=6.5]'
and to submit a job that must run on version "6.7" use the same resource select condition as an option to "bsub":
bsub -R 'select[osver=6.7]'
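The version check and the resource selection above can be combined in a small script. The sketch below assumes that "lshosts -s osver" prints rows in the RESOURCE / VALUE / LOCATION layout shown earlier; the function names "hosts_with_osver" and "osver_select" are hypothetical helpers, not LSF commands.

```shell
#!/bin/sh
# Hypothetical helpers (not provided by LSF or the cluster).

# Read `lshosts -s osver` output on stdin and print the host names
# whose VALUE column equals the requested version.
# $1: operating system version, e.g. 6.5
hosts_with_osver() {
    awk -v ver="$1" '
        # Compare as strings so that "6.5" does not match "6.50".
        $1 == "osver" && ($2 "") == (ver "") {
            # Print every name in the LOCATION column, one per line.
            for (i = 3; i <= NF; i++) print $i
        }'
}

# Print the resource requirement string for a given version,
# in the form used with bhosts -R and bsub -R above.
# $1: operating system version, e.g. 6.7
osver_select() {
    printf 'select[osver=%s]\n' "$1"
}
```

With these functions sourced, "lshosts -s osver | hosts_with_osver 6.5" lists the nodes reporting version 6.5, and a job can be restricted to version 6.7 with: bsub -R "$(osver_select 6.7)" < test.lsf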
Notes
The number and types of nodes available in the testing group will vary.
Updates are applied to remote desktop nodes, nodes in the "interact" queue, nodes in the "matlab_hg" node group, and nodes in the "testing_hg" node group earlier than to all other compute nodes.