ifft2 on GPU array

Bruno Alvisio on 3 Jan 2022
Edited: Matt J on 4 Jan 2022
I am trying to compute the ifft2 of multiple matrices. The simplest code snippet is:
gAs = gpuArray.rand(999, 519, 20);
gBs = gpuArray.rand(999, 519);
ifft2(gAs .* gBs, "symmetric");
Error using gpuArray/ifft2
An invalid array was used on the GPU.
At first I thought that I was using all the GPU memory, so I tried single-precision GPU arrays, but the error persisted. However, I then tried the following code (larger first dimension), and it worked just fine.
gAs = gpuArray.rand(1000, 519, 2);
gBs = gpuArray.rand(1000, 519);
ifft2(gAs .* gBs, "symmetric");
I know that I can also loop over the slices of gAs, and that works, but I want to get some speedup by doing it in one call to ifft2.
I would like to understand why this is happening and whether there is a way to pad the matrices so that I can still get the ifft2 of the original matrices.
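The slice-by-slice fallback mentioned above can be sketched roughly as follows. This is only an illustration: the small CPU arrays A and B are hypothetical stand-ins for gAs and gBs (on a GPU they would be created with gpuArray.rand), and the variable names P and out are not from the original post.

```matlab
% Minimal sketch of the per-slice fallback (illustrative names and sizes).
A = rand(9, 5, 4);        % stand-in for gAs; use gpuArray.rand(...) on a GPU
B = rand(9, 5);           % stand-in for gBs
P = A .* B;               % implicit expansion multiplies every page by B

% Preallocate the output with the same class (and device) as P.
out = zeros(size(P), 'like', P);

% Transform one 2-D page at a time instead of the whole 3-D stack.
for k = 1:size(P, 3)
    out(:, :, k) = ifft2(P(:, :, k), 'symmetric');
end
```

Preallocating with 'like' keeps the result on the GPU when P is a gpuArray, so the loop does not force transfers back to the host.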
For reference:
>> gpuDevice()
ans =
CUDADevice with properties:
Name: 'Tesla V100-SXM2-32GB'
Index: 1
ComputeCapability: '7.0'
SupportsDouble: 1
DriverVersion: 11.2000
ToolkitVersion: 11
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 3.4090e+10
AvailableMemory: 3.3167e+10
MultiprocessorCount: 80
ClockRateKHz: 1530000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 0
CanMapHostMemory: 1
DeviceSupported: 1
DeviceAvailable: 1
DeviceSelected: 1
  3 Comments
Bruno Alvisio on 3 Jan 2022
Right, thank you. Shouldn't this code throw the same error, since the matrices are not symmetric? (In my case it runs fine.)
gAs = gpuArray.rand(999, 519);
gBs = gpuArray.rand(999, 519);
ifft2(gAs .* gBs, "symmetric");
Thanks again
Walter Roberson on 3 Jan 2022
Sorry, I would have to boot into a different operating system to test (GPU computing is not supported on my macOS system).


Accepted Answer

Matt J on 4 Jan 2022
Edited: Matt J on 4 Jan 2022
I think you should probably just omit the 'symmetric' flag. On the GPU (mine at least), it doesn't seem to make a big difference in performance:
A = gpuArray.rand(512,512,512);
gputimeit(@() ifft2(A,'symmetric') ) % 0.0706 seconds
gputimeit(@() ifft2(A) ) % 0.0753 seconds
Whether this indicates sub-optimal software design on MathWorks' part, I'm not sure. On the CPU, the 'symmetric' flag means the software does fewer flops, but on a parallel system like the GPU, the number of flops is not what matters most.
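As a rough check of what dropping the flag does to the result, the CPU-only sketch below compares both calls on input that really is conjugate symmetric (the spectrum of a real array). The test array x and all variable names are illustrative assumptions, not part of the original answer. For input that is not conjugate symmetric, such as a plain random matrix, the two calls are not expected to agree, because the call without the flag returns a complex result.

```matlab
% Compare ifft2 with and without the 'symmetric' flag on the CPU.
x = rand(8, 6);                  % real test array (hypothetical data)
X = fft2(x);                     % spectrum of a real array is conjugate symmetric

yFull = ifft2(X);                % general inverse transform
ySym  = ifft2(X, 'symmetric');   % inverse transform that assumes symmetry

% Any imaginary part left in yFull is round-off only.
maxImag = max(abs(imag(yFull(:))));
% Largest difference between the two real results.
maxDiff = max(abs(real(yFull(:)) - ySym(:)));
fprintf('max imag: %g, max diff: %g\n', maxImag, maxDiff);
```

If the data are known to be conjugate symmetric, taking real(ifft2(X)) is therefore a reasonable substitute for the flagged call.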

More Answers (1)

Matt J on 3 Jan 2022
Edited: Matt J on 3 Jan 2022
I think it's a bug, but one workaround might be:
fn=@(z,d) ifft(z,[],d,'symmetric');
out = fn( fn(gAs .* gBs,1) ,2);
  2 Comments
Bruno Alvisio on 4 Jan 2022
Thanks for the answer. The code you provided is correct.
I have noticed, though, that there is very often a discrepancy between the results of the function handle fn and ifft2, even for 2-dimensional matrices, when their sizes are greater than ~4. I created the following code snippet. If it is run multiple times, it sometimes displays "not equal".
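clear all;
close all;
% Apply the 1-D symmetric inverse FFT along dimension d.
fn = @(z, d) ifft(z, [], d, "symmetric");
m = 5;
n = 4;
a = gpuArray.rand(m, n);
b = gpuArray.rand(m, n);
c = ifft2(a .* b, "symmetric");   % direct 2-D transform
d = fn(fn(a .* b, 1), 2);         % two successive 1-D transforms
% Element-wise tolerance check; compare first, then test for any mismatch.
mismatch = abs(c - d) > eps(max(abs(c), abs(d)));
if any(mismatch(:))
    disp("not equal")
end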
If I understand correctly, are you suggesting that there is a bug in ifft2 when the symmetric flag is provided?
Matt J on 4 Jan 2022
Edited: Matt J on 4 Jan 2022
It seems I made a conceptual error. ifft(ifft(X,[],1,'symmetric'),[],2,'symmetric') is not a valid replacement for ifft2(X,'symmetric') unless X is symmetric about both the x and y axes.
However, it does seem like a bug that only certain array sizes work for gpuArray.ifft2(). The CPU version of ifft2() doesn't have that problem.
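The difference between the two nested 1-D transforms and the 2-D transform can be checked on the CPU with a small sketch like the one below. It assumes a hypothetical real test array x, whose spectrum X is conjugate symmetric in the 2-D sense while most of its individual rows and columns are not. All variable names are illustrative.

```matlab
% CPU-only sketch: 2-D symmetric inverse vs. two nested 1-D symmetric inverses.
x = rand(5, 4);                 % real test array (hypothetical data)
X = fft2(x);                    % conjugate symmetric as a 2-D spectrum

% 1-D symmetric inverse FFT along dimension d.
fn = @(z, d) ifft(z, [], d, 'symmetric');

yDirect    = ifft2(X, 'symmetric');   % treats X as 2-D conjugate symmetric
ySeparable = fn(fn(X, 1), 2);         % assumes symmetry along each dimension

% Distance of each result from the original array.
errDirect    = max(abs(yDirect(:) - x(:)));
errSeparable = max(abs(ySeparable(:) - x(:)));
fprintf('direct: %g, separable: %g\n', errDirect, errSeparable);
```

The direct call should recover x to round-off. The nested version is generally expected to show a much larger error, since the first 1-D pass already forces a real intermediate result.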

Release

R2021b
