cuda
Link to section 'Description' of 'cuda' Description
CUDA is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU).
Link to section 'Versions' of 'cuda' Versions
- Scholar: 12.1.0
- Gilbreth: 8.0.61, 9.0.176, 10.0.130, 10.2.89, 11.0.3, 11.2.0, 11.7.0, 12.1.1
- Anvil: 11.0.3, 11.2.2, 11.4.2, 12.0.1
- Gautschi: 12.6.0
Link to section 'Module' of 'cuda' Module
You can load the module with:
module load cuda
Link to section 'Monitor Activity and Drivers' of 'cuda' Monitor Activity and Drivers
Users can check the available GPUs, their current usage, the installed NVIDIA driver version, and running processes with the command nvidia-smi. The output should look something like this:
User@gilbreth-fe00:~/cuda $ nvidia-smi
Sat May 27 23:26:14 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.48.07    Driver Version: 515.48.07    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A30          Off  | 00000000:21:00.0 Off |                    0 |
| N/A   29C    P0    29W / 165W |  19802MiB / 24576MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     29152      C   python                           9107MiB |
|    0   N/A  N/A     53947      C   ...020.11-py38/GP/bin/python     2611MiB |
|    0   N/A  N/A     71769      C   ...020.11-py38/GP/bin/python     1241MiB |
|    0   N/A  N/A     72821      C   ...8/TorchGPU_env/bin/python     2657MiB |
|    0   N/A  N/A     91986      C   ...2-4/internal/bin/gdesmond      931MiB |
+-----------------------------------------------------------------------------+
We can see that the node gilbreth-fe00 is running driver version 515.48.07, which supports up to CUDA version 11.7. We do not recommend running jobs on front-end nodes, but here we can see three Python processes and one gdesmond process using the GPU.
Link to section 'Compile a CUDA code' of 'cuda' Compile a CUDA code
The vector_addition.cu example below is modified from the textbook Learn CUDA Programming.
#include <stdio.h>
#include <stdlib.h>

#define N 512

void host_add(int *a, int *b, int *c) {
    for (int idx = 0; idx < N; idx++)
        c[idx] = a[idx] + b[idx];
}

// Fill the array with its index values.
void fill_array(int *data) {
    for (int idx = 0; idx < N; idx++)
        data[idx] = idx;
}

void print_output(int *a, int *b, int *c) {
    for (int idx = 0; idx < N; idx++)
        printf("\n %d + %d = %d", a[idx], b[idx], c[idx]);
}

int main(void) {
    int *a, *b, *c;
    int size = N * sizeof(int);

    // Allocate space for host copies of a, b, c and set up input values
    a = (int *)malloc(size); fill_array(a);
    b = (int *)malloc(size); fill_array(b);
    c = (int *)malloc(size);

    host_add(a, b, c);
    print_output(a, b, c);

    free(a); free(b); free(c);
    return 0;
}
We can compile the code with the nvcc compiler:
nvcc -o vector_addition vector_addition.cu
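Note that as written, this program runs entirely on the CPU; nvcc compiles it, but no work is offloaded to the GPU. Below is a minimal sketch of a device version, following the usual CUDA port of this textbook example; the kernel name and the launch configuration (4 blocks of 128 threads) are our own choices, and error checking is omitted for brevity:

```cuda
#include <stdio.h>
#include <stdlib.h>

#define N 512

// Each thread computes one element; the global index is derived
// from the block and thread indices.
__global__ void device_add(int *a, int *b, int *c) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < N)
        c[idx] = a[idx] + b[idx];
}

void fill_array(int *data) {
    for (int idx = 0; idx < N; idx++)
        data[idx] = idx;
}

int main(void) {
    int *a, *b, *c;          // host copies
    int *d_a, *d_b, *d_c;    // device copies
    int size = N * sizeof(int);

    a = (int *)malloc(size); fill_array(a);
    b = (int *)malloc(size); fill_array(b);
    c = (int *)malloc(size);

    cudaMalloc((void **)&d_a, size);
    cudaMalloc((void **)&d_b, size);
    cudaMalloc((void **)&d_c, size);

    // Copy inputs to the device
    cudaMemcpy(d_a, a, size, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b, size, cudaMemcpyHostToDevice);

    // 4 blocks of 128 threads cover all N = 512 elements
    device_add<<<N / 128, 128>>>(d_a, d_b, d_c);

    // Copy the result back to the host
    cudaMemcpy(c, d_c, size, cudaMemcpyDeviceToHost);

    for (int idx = 0; idx < N; idx++)
        printf("\n %d + %d = %d", a[idx], b[idx], c[idx]);

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(a); free(b); free(c);
    return 0;
}
```

This version compiles the same way (nvcc -o vector_addition vector_addition.cu), but it must be run on a node with a GPU, for example through a batch job.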
Link to section 'Example job script' of 'cuda' Example job script
#!/bin/bash
#SBATCH -A XXX
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=20
#SBATCH --cpus-per-task=1
#SBATCH --gpus-per-node=1
#SBATCH --time 1:00:00
module purge
module load gcc/XXX
module load cuda/XXX
# Compile vector_addition.cu
nvcc -o vector_addition vector_addition.cu

# Run the vector_addition program on the allocated GPU node
./vector_addition
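One caveat when debugging GPU jobs: CUDA runtime calls and kernel launches report errors only if you check for them, so a failing kernel can otherwise pass silently. A common community pattern is a checking macro like the sketch below (the macro name and usage lines are our own illustration, not part of the textbook example):

```cuda
#include <stdio.h>
#include <stdlib.h>

// Wrap every CUDA runtime call so failures abort with a useful message.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "CUDA error: %s at %s:%d\n",              \
                    cudaGetErrorString(err), __FILE__, __LINE__);     \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Example usage:
//   CUDA_CHECK(cudaMalloc((void **)&d_a, size));
//   my_kernel<<<blocks, threads>>>(d_a);
//   CUDA_CHECK(cudaGetLastError());       // catches launch errors
//   CUDA_CHECK(cudaDeviceSynchronize());  // catches errors during execution
```

Kernel launches themselves return no status, which is why the pattern pairs cudaGetLastError after the launch with cudaDeviceSynchronize to surface errors from the asynchronous execution.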