Graphics Processing Units (GPUs) on the ERISOne computing cluster

Introduction

Graphics Processing Units (GPUs) can significantly accelerate some types of computational tasks. The ERISOne computing cluster includes 8 nodes, each with 3 Nvidia Tesla M2070 cards. Using this capability requires software that supports GPU acceleration.

Getting access to the GPU nodes

An ERISOne cluster account does not automatically include access to the GPU queue. To request access to the GPU nodes, please contact us with a brief description of the intended use for GPU processing and the applications that will be used. Once you have been added to the gpu group, you can submit jobs to the GPU queue.
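One quick way to check whether access has been granted is to look at your group memberships. The sketch below assumes that the gpu group mentioned above is an ordinary Unix group visible to the id command; if the group is managed only inside LSF, this check will not see it. The function name in_group is illustrative.

```shell
#!/bin/bash
# Return success if the current user belongs to the named Unix group.
# Assumption: "gpu" is a Unix group, not only an LSF user group.
in_group() {
    id -Gn | tr ' ' '\n' | grep -qxF -- "$1"
}

if in_group gpu; then
    echo "You are in the gpu group."
else
    echo "Not in the gpu group yet (or it is not a Unix group)."
fi
```

Group changes usually take effect only in new login sessions, so log out and back in before re-checking.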

The gpu queue and the CUDA module

All jobs for GPU computing should be submitted to the "gpu" queue.

An environment module for cuda/6.5.14 is available; this is the currently recommended version. Each GPU node includes three GPU cards. The job scheduler currently has no mechanism to reserve a specific GPU card, so to avoid potential conflicts with jobs submitted by other users, we recommend submitting every GPU job with a job slot reservation of 3, whether it will use 1, 2 or 3 GPU cards. You will need to specify the device ID (1, 2 or 3) in the LSF script file. To load the CUDA module, use:

module load cuda/6.5.14

An example LSF script might look like this:

#!/bin/bash
#BSUB -J test-gpu
#BSUB -q gpu
#BSUB -o output/test-%J.out
#BSUB -e output/test-%J.err
module load cuda/6.5.14
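The example above does not yet request the three job slots recommended earlier, and it does not launch a program. The following is a minimal sketch of a fuller script, written as a small generator so the job name and command can be varied. It assumes that "#BSUB -n 3" combined with the resource string "span[hosts=1]" is an acceptable way to reserve all three slots on a single node of this cluster. The helper name write_gpu_job and the executable ./my_gpu_app are hypothetical.

```shell
#!/bin/bash
# Sketch: print an LSF job script for the "gpu" queue to stdout.
# write_gpu_job and ./my_gpu_app are hypothetical names for illustration.
write_gpu_job() {
    local job_name=$1   # used for #BSUB -J and the output file names
    shift               # remaining arguments form the command to run

    # Unquoted EOF so ${job_name} and $* are expanded; %J is left for LSF,
    # which replaces it with the job ID.
    cat <<EOF
#!/bin/bash
#BSUB -J ${job_name}
#BSUB -q gpu
#BSUB -n 3
#BSUB -R "span[hosts=1]"
#BSUB -o output/${job_name}-%J.out
#BSUB -e output/${job_name}-%J.err
module load cuda/6.5.14
# Device selection is application specific; set it here as your software requires.
$*
EOF
}

# Example: show the script for a hypothetical application.
write_gpu_job test-gpu ./my_gpu_app
```

To use it, redirect the output to a file such as test-gpu.lsf, create the output directory with mkdir -p output as a precaution, and submit with bsub < test-gpu.lsf.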

OpenCL support is included in the CUDA libraries. There is no need to separately load OpenCL.

Compiling your GPU software

If the application you require is not in the list of installed modules, you can request an installation or compile the software yourself. This section describes compiling GPU-enabled code on the cluster.

  • Open an SSH terminal session on the software compilation node
ssh eris1pm01
  • Choose the installation location
    • For most people, your home folder or a lab shared folder under /data is the best location
    • Some groups maintain environment modules and should compile under /apps/lab/<your_lab>
  • Load the cuda module, which is needed to compile software for the GPU nodes
module load cuda/6.5.14
  • Follow the software's own instructions for compilation
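The steps above can be checked with a short smoke test, run on the compilation node after loading the cuda module. The sketch writes a tiny CUDA program that reports how many devices are visible, then compiles it with nvcc if the compiler is on the PATH. The file names are hypothetical, and the -arch=sm_20 flag is an assumption based on the Tesla M2070 being a Fermi-generation card (compute capability 2.0); adjust it if your software's build instructions say otherwise.

```shell
#!/bin/bash
# Sketch: verify the CUDA toolchain by compiling a tiny test program.
workdir=$(mktemp -d)          # scratch directory; any writable folder works
src="$workdir/devcount.cu"    # hypothetical file name

# Quoted 'EOF' keeps the shell from expanding anything inside the source.
cat > "$src" <<'EOF'
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("CUDA devices visible: %d\n", count);
    return 0;
}
EOF

if command -v nvcc >/dev/null 2>&1; then
    # sm_20 targets Fermi cards such as the M2070 (assumption, see above).
    nvcc -arch=sm_20 -o "$workdir/devcount" "$src" &&
        echo "Built $workdir/devcount"
else
    echo "nvcc not found: run 'module load cuda/6.5.14' first" >&2
fi
```

The compilation node may not have a GPU of its own, so run the resulting binary through the gpu queue rather than directly on that node.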

GPU computing resources

  • For a list of the gpu computing nodes, type:

bhosts gpu_hg

  • Access to the GPU nodes is via the "gpu" queue. This queue accepts both batch and interactive jobs.
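For interactive work, a session on a GPU node can be requested with the -Is option of bsub, which starts an interactive job with a pseudo-terminal. The sketch below assumes that the three-slot recommendation given earlier also applies to interactive jobs, and that "span[hosts=1]" is accepted here to keep the slots on one node. The helper name gpu_shell is hypothetical.

```shell
#!/bin/bash
# Sketch: open an interactive shell on a GPU node through the "gpu" queue.
# gpu_shell is a hypothetical helper name.
gpu_shell() {
    if ! command -v bsub >/dev/null 2>&1; then
        echo "bsub not found: run this on a cluster login node" >&2
        return 127
    fi
    # -Is: interactive job with a pseudo-terminal
    # -n 3 plus span[hosts=1]: three slots on a single node (assumption)
    bsub -Is -q gpu -n 3 -R "span[hosts=1]" /bin/bash
}
```

After defining the function, run gpu_shell on a login node, then load the cuda module inside the new session before starting your application.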

Coming soon

  • Support for MPI jobs with multiple GPUs
