October 31, 2024
Introduction
On the Scientific Computing (SciC) Linux Clusters, it is important to choose the correct queue so that your job is scheduled as quickly as possible and has access to the resources your application needs. This page lists the most common ERISTwo queues, which apply equally to general use and to research groups with access to dedicated nodes.
Working with Queues
Set the queue to use with the "-q" option to bsub:
bsub -q long < my_script.lsf
Or by putting the same option in the header of your script:
#BSUB -q long
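Put together, a minimal job script with the queue set in its header might look like the sketch below. The job name and output file name are placeholders rather than site conventions; -J and -o are standard bsub options, and %J expands to the job ID.

```shell
#!/bin/bash
#BSUB -q long                    # queue to submit to
#BSUB -J example_job             # job name (placeholder)
#BSUB -o example_job.%J.out      # output file; %J is replaced by the job ID

# Everything below runs on the compute node once the job is dispatched.
job_host=$(uname -n)
echo "Job started on ${job_host}"
```

Saved as my_script.lsf, this would be submitted with "bsub < my_script.lsf", with no need to repeat "-q long" on the command line.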
Specifying resource requirements
This example requests 4 CPU cores and 10GB of memory (specified in MB):
bsub -q big-multi -n 4 -R 'rusage[mem=10000]' < my_script.lsf
The same options given in an LSF script:
#BSUB -q big-multi
#BSUB -n 4
#BSUB -R rusage[mem=10000]
The next example sets both the memory requirement and the memory limit, which are both needed when reserving more than 40GB. Here 64GB is reserved:
bsub -q big-multi -M 64000 -R 'rusage[mem=64000]' < my_script.lsf
The same options given in an LSF script:
#BSUB -q big-multi
#BSUB -M 64000
#BSUB -R rusage[mem=64000]
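The rule that the memory limit must accompany the reservation above 40GB can be captured in a small helper. This is only a sketch: mem_options is a hypothetical function, and it assumes 1GB is written as 1000MB, as in the examples above.

```shell
#!/bin/bash
# Hypothetical helper: print the bsub memory options for a reservation in GB.
# Assumes 1GB = 1000MB, matching the examples above (10GB -> 10000).
mem_options() {
    local gb=$1
    local mb=$((gb * 1000))
    if [ "$gb" -gt 40 ]; then
        # Above 40GB the memory limit (-M) is set equal to the reservation.
        echo "-M ${mb} -R 'rusage[mem=${mb}]'"
    else
        echo "-R 'rusage[mem=${mb}]'"
    fi
}

# Print the command instead of running it, so it can be checked first.
echo "bsub -q big-multi $(mem_options 64) < my_script.lsf"
```

Printing the command first makes it easy to confirm the options before submitting a job.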
Job arrays
In a job array, the requested CPU/memory allocation is assigned to each job in the array. Requesting multiple job slots with "-n SLOTS" does not do this, except in the "mpi" queue.
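As an illustration, a job array script might look like the sketch below. The array name, range, memory value, and input file naming are assumptions made for the example; the array syntax for -J, the %I output placeholder, and the LSB_JOBINDEX variable are standard LSF features.

```shell
#!/bin/bash
#BSUB -q medium
#BSUB -J "my_array[1-10]"         # array of 10 jobs (name and range are placeholders)
#BSUB -n 1
#BSUB -R rusage[mem=4000]         # requested for each job in the array
#BSUB -o my_array.%J.%I.out       # %I is replaced by the array index

# LSF sets LSB_JOBINDEX for each array element; default to 1 so the
# script can also be tried outside the scheduler.
task_id=${LSB_JOBINDEX:-1}
input_file="input_${task_id}.txt" # hypothetical input naming scheme
echo "Array element ${task_id} would process ${input_file}"
```

Each array element runs the same script and uses its own index to pick its input.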
Queue script examples
Several script templates are placed in each home folder when the account is created. Look for them in:
ls ~/lsf/templates/bsub
To test them, copy each example to a different folder, for example under ~/lsf, and then submit the job as described in the example. Read each example for more detailed information.
If you have deleted your ~/lsf folder, you can copy it from /lsf/copy/templates.
Standard Queues on ERISTwo
The job scheduler offers several job queues to which you can submit your jobs. Each queue is optimized for different types of job, based on:
- run time
- memory requirement
- number of CPUs used in parallel
Summary
vshort
The "vshort" queue is a high priority queue for very short jobs requiring 1GB or less memory.
- Default memory allocation is 1GB and should not exceed 4GB.
- Maximum runtime is 15 minutes.
short
The "short" queue is a priority queue for short jobs taking less than 1 hour with modest memory requirements.
- Default memory allocation is 2GB.
- Minimum runtime is 10s.
- Maximum runtime is 1 hour.
- Memory requirement should be specified if more than 2GB and should not exceed 4GB.
- Ideal for single threaded applications (-n 1)
medium
The "medium" queue is a priority queue for jobs taking less than 1 day with modest memory requirements.
- Default memory allocation is 2GB.
- Minimum runtime is 1min.
- Maximum runtime is 24 hours.
- Memory requirement should be specified if more than 2GB and should not exceed 8GB.
- Ideal for applications using 4 or fewer CPU cores per job (-n 4 or less)
normal
The "normal" queue is a general queue for jobs taking less than 3 days.
- Default memory allocation is 2GB.
- Minimum runtime is 1min.
- Maximum runtime is 3 days.
- Memory requirement should be specified if more than 2GB and should not exceed 8GB.
- Maximum CPU allocation is 6 CPU cores per job
long
The "long" queue is suitable for long-running jobs with modest memory requirements.
- Default memory allocation is 2GB.
- Minimum runtime is 1min.
- Maximum runtime is 1 week.
- Memory requirement should be specified if more than 2GB and should not exceed 8GB.
- Ideal for applications using 4 or fewer CPU cores per job (-n 4 or less).
vlong
The "vlong" queue is suitable for long-running jobs with modest memory requirements.
- Default memory allocation is 2GB.
- Minimum runtime is 1min.
- Default max runtime is 4 weeks.
- Memory requirement should be specified if more than 2GB and should not exceed 4GB.
- Email Scientific Computing if you require access to queues with longer run time.
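The maximum runtimes of the vshort through vlong queues can be turned into a quick lookup. The function below is a hypothetical helper, not a site-provided tool, and it deliberately ignores the memory, CPU, and minimum runtime limits, which still need to be checked against the queue descriptions.

```shell
#!/bin/bash
# Hypothetical helper: suggest a queue from the expected runtime in minutes,
# using only the maximum runtimes listed above.
queue_for_runtime() {
    local minutes=$1
    if   [ "$minutes" -le 15 ];    then echo vshort   # 15 minutes
    elif [ "$minutes" -le 60 ];    then echo short    # 1 hour
    elif [ "$minutes" -le 1440 ];  then echo medium   # 24 hours
    elif [ "$minutes" -le 4320 ];  then echo normal   # 3 days
    elif [ "$minutes" -le 10080 ]; then echo long     # 1 week
    elif [ "$minutes" -le 40320 ]; then echo vlong    # 4 weeks
    else
        echo "none"                                   # longer than any listed queue
        return 1
    fi
}

# Example: a job expected to take 90 minutes.
queue_for_runtime 90
```

A result of "none" means the job runs longer than any queue listed here allows.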
big
The "big" queue is suitable for single threaded, large memory jobs, 8GB or more.
- Big single node jobs will be dispatched fastest from this queue.
- Only 1-6 job slots (CPUs) can be allocated.
- Minimum runtime is 1min.
- Memory requirement should be specified if more than 16GB.
- Memory limit must also be set equal to memory reservation if more than 40GB.
- Maximum memory limit is 498GB.
big-multi
The "big-multi" queue is suitable for multi threaded, large memory jobs, 8GB or more.
- Big single node multi-core jobs will be dispatched fastest from this queue.
- Number of CPU cores required should be specified with the "-n THREADS" option.
- Ideal for applications using between 4 and 12 (or 16) CPUs per job (-n 4 or more)
- Minimum runtime is 1min.
- Memory requirement should be specified if more than 8GB.
- Memory limit must also be set equal to memory reservation if more than 40GB.
- Maximum memory limit is 498GB.
mpi
The "mpi" queue is for jobs using an implementation of the Message Passing Interface to run jobs spanning several compute nodes.
- Job slots can be allocated on different hosts.
- Maximum runtime is 4 weeks.
- Memory requirement should be specified if more than 2GB and should not exceed 4GB per job slot.
- Example submission script to follow in the test folder.
Additional Queues
Queue for an interactive command line session
To open a login session on a compute node use the following:
bsub -Is /bin/bash
or, if X11 forwarding is required
bsub -Is -XF /bin/bash
Remember to request more than one job slot and additional memory if multi-threaded or large memory applications are to be run in the session (e.g. 10GB, 4 concurrent CPUs):
bsub -Is -n 4 -R 'rusage[mem=10000]' /bin/bash
Rerunnable queue for access to more resources
Submit to the "rerunnable" queue to use idle time on "Adopt-a-Node" nodes that are not otherwise available to your jobs. If the "Adopt-a-Node" owner requires the node while your job is running on it (via the rerunnable queue), your job will be terminated and resubmitted to the top of the queue to run elsewhere.
- Default memory allocation is 4GB.
- Maximum runtime is 4 weeks.
- Memory requirement should be specified if more than 4GB and should not exceed 12GB.
- !! Job scripts require testing to ensure they work after being terminated and restarted !!
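One way to make a script safe to restart is to record progress and skip completed work on the next run. The sketch below assumes the work can be split into numbered steps; run_steps, the state file name, and the step count are all hypothetical and stand in for your own application logic.

```shell
#!/bin/bash
#BSUB -q rerunnable
#BSUB -J restartable_job          # job name (placeholder)

# Hypothetical pattern: the state file holds the number of the last
# completed step, so a restarted job resumes instead of starting over.
run_steps() {
    local state_file=$1
    local total_steps=$2
    local last_done=0
    if [ -f "$state_file" ]; then
        last_done=$(cat "$state_file")
    fi
    local step=$((last_done + 1))
    while [ "$step" -le "$total_steps" ]; do
        echo "Running step ${step}"      # replace with the real work
        echo "$step" > "$state_file"     # record progress only after the step finishes
        step=$((step + 1))
    done
    echo "All ${total_steps} steps complete"
}

run_steps "${STATE_FILE:-progress.state}" 5
```

To test the restart behaviour, stop the script partway through, run it again, and confirm that it continues from the recorded step.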
Job slot allocation
Only the mpi queue will allocate job slots on different hosts when multiple job slots are requested with "-n SLOTS". All other queues allocate all job slots on the same host. Job arrays are treated differently and will run jobs on different hosts in all queues.
Priority node allocation
Labs that have priority nodes through the "Adopt-a-Node" program get priority access to those nodes by submitting to the following queues:
- vlong
- long
- normal
- medium
- short
- vshort
- big
- big-multi
- interact
- matlab
- matlabdce
The "defaultlow" queue does not give access to priority nodes.
Other requirements
Please contact Scientific Computing if none of the above queues fits your requirements.