Error when processing on HPC: Unable to allocate space for the FFT calculation. This might be due to insufficient memory on the GPU.

Hello,
Error message: Unable to allocate space for the FFT calculation. This might be due to insufficient memory on the GPU.
I get this error when processing multiple images on a Slurm server. The code uses both the GPU and multi-core computing. The for loop over the images is not parallelized; within each image, the cores work together to produce the result for that single image.
The error appears after around 4000 images. I tried clearing all variables after each image and resetting the GPU device every 2000 images, but the error persists.
The error stops the calculation, yet the server reports a return code of 0 (which means a normal exit on the server).
Please help.

Accepted Answer

Joss Knight on 5 Mar 2023
At a guess, you are trying to share one GPU between multiple workers on a pool. Depending on how the work is scheduled, one or two workers may have allocated all of the GPU memory, leaving none for the others.
Options:
  • Reduce the size of the pool
  • If on R2022b or later, try setting the gpuDevice CachePolicy to "minimum"
  • Place your code inside a try... catch block and ignore out of memory errors, or use the CPU instead if the GPU errors
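
The third option could be sketched roughly as follows. This is only an illustration: `ifft2WithCpuFallback` is a hypothetical helper name, the choice of `ifft2` is an assumption about the image pipeline, and the `catch` deliberately handles any GPU-side error (not only out-of-memory), which may be broader than you want.

```matlab
function Y = ifft2WithCpuFallback(X)
% Hypothetical helper (not part of MATLAB): try the inverse 2-D FFT on the
% GPU first and fall back to the CPU if the GPU attempt fails.
    try
        % Copy the input to the GPU, transform it there, and bring the
        % result back to host memory.
        Y = gather(ifft2(gpuArray(X)));
    catch err
        % Any failure of the GPU attempt lands here: out of memory,
        % no supported device, or no Parallel Computing Toolbox.
        warning('ifft2WithCpuFallback:gpuFailed', ...
            'GPU ifft2 failed (%s); using the CPU instead.', err.message);
        Y = ifft2(X);
    end
end

% For the second option (R2022b or later, with a GPU selected), the cache
% policy would be set once on each worker, for example:
%   gpu = gpuDevice;
%   gpu.CachePolicy = "minimum";
```

Calling `Y = ifft2WithCpuFallback(img)` inside the image loop keeps the job running when a single image fails on the GPU, at the cost of a slower CPU transform for that image.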
  5 Comments
Joss Knight on 6 Mar 2023
The ifft is computed in a data-parallel way but there is no overlap between the computations being run on different workers that share a GPU. Some overheads will be reduced but the main gains you see will be the fact that you have 4 GPUs.
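
If there are more workers than GPUs, one way to keep the sharing even is to assign devices round-robin. The sketch below assumes a hypothetical helper named `gpuIndexForWorker`; the commented usage assumes an open parallel pool and `spmdIndex`, which is available from R2022b.

```matlab
function idx = gpuIndexForWorker(workerIndex, numGpus)
% Hypothetical helper: map a 1-based worker index to a 1-based GPU index,
% cycling through the devices so that workers are spread evenly.
    idx = mod(workerIndex - 1, numGpus) + 1;
end

% Possible usage on a pool (assumes at least one GPU is available):
%   spmd
%       gpuDevice(gpuIndexForWorker(spmdIndex, gpuDeviceCount));
%   end
```

With 4 GPUs and 8 workers, workers 1 and 5 would share the first device, workers 2 and 6 the second, and so on.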

Release

R2022b
