GPU memory overhead dependent on fft dimension.
Hello all, I have a question regarding memory management during MATLAB's gpuArray/fft operation. I have a large NxM matrix (roughly N = 10E3, M = 20E3) where I wish to take an fft along the M dimension. For CPU operations I would normally permute the matrix so that the fft acts along the 1st (column) dimension, for speed.
On the GPU, if I run the fft along the 1st dimension, I slam into the memory ceiling of my GPU. However, if I apply it along the row dimension I do not. I assume this has to do with whether MATLAB is doing N asynchronous ffts in the row direction, versus a single massive matrix operation in the column direction.
So, 4 questions:
- Is my assumption true?
- Are GPU operations still faster in the column direction? (I partly answered this myself: I got a 3x speed advantage with the snippet below.)
- Is there a way to know what the GPU memory need will be for the fft? If so, I can try chunking up the fft based on the GPU memory available.
- Is there another implementation that will have the speed of the column operation without the memory issues? I am going to try doing this as an arrayfun just to see.
Code snippet:
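x = gpuArray.rand(10000,10000);   % 10000-by-10000 real double matrix on the GPU
xp = x.';                         % transposed copy, so the same data runs along rows
gputimeit(@() fft(x,[],1))        % fft down the columns (dimension 1)
gputimeit(@() fft(xp,[],2))       % fft along the rows (dimension 2)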
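To make the chunking idea from the third question concrete, here is a minimal sketch of what I have in mind. It assumes the full matrix lives in host memory as real doubles and that the fft is taken along dimension 2. The names chunkedRowFFT, blockRows and safetyFactor are made up for illustration, and safetyFactor is only a guess at how much workspace the fft needs relative to its complex output, since I do not know the real figure.

```matlab
function Y = chunkedRowFFT(X, safetyFactor)
%CHUNKEDROWFFT fft along dimension 2 of a host matrix, one block of rows at a time.
%   Hypothetical helper. Assumes X is a real double matrix in host memory.
%   safetyFactor is a guessed multiple of the complex output size per block,
%   not a documented figure; tune it by experiment.
    if nargin < 2
        safetyFactor = 4;                     % assumption, not a measured value
    end
    [N, M] = size(X);
    dev = gpuDevice;                          % currently selected GPU
    bytesPerRow = 16 * M;                     % one row of complex double output
    rowsPerBlock = blockRows(dev.AvailableMemory, bytesPerRow, safetyFactor, N);

    Y = complex(zeros(N, M));                 % result is assembled on the host
    for first = 1:rowsPerBlock:N
        last = min(first + rowsPerBlock - 1, N);
        xg = gpuArray(X(first:last, :));      % upload one block of rows
        Y(first:last, :) = gather(fft(xg, [], 2));   % transform, then download
    end
end

function r = blockRows(availableBytes, bytesPerRow, safetyFactor, maxRows)
% Largest row count whose estimated footprint fits, clamped to [1, maxRows].
    r = floor(availableBytes / (safetyFactor * bytesPerRow));
    r = max(1, min(r, maxRows));
end
```

Calling Y = chunkedRowFFT(X) should give the same result as fft(X,[],2) on the CPU, at the cost of the host-to-device transfers for each block. If the matrix fits on the GPU and only the fft workspace is the problem, the same loop could index into a gpuArray instead of uploading each block.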
Thanks all.
D. Plotnick
on 2 Jul 2018