ifft2 on GPU array

Question

Bruno Alvisio on 3 Jan 2022

https://ch.mathworks.com/matlabcentral/answers/1621490-ifft2-on-gpu-array

Edited: Matt J on 4 Jan 2022

I am trying to compute the ifft2 of multiple matrices. The simplest code snippet is:

gAs = gpuArray.rand(999, 519, 20);   
gBs = gpuArray.rand(999, 519);      
ifft2(gAs .* gBs, "symmetric");           
Error using gpuArray/ifft2 
An invalid array was used on the GPU. 

At first I thought that I was running out of GPU memory, so I also tried single-precision GPU arrays. However, I then tried the following code (a bigger matrix) and it worked just fine.

gAs = gpuArray.rand(1000, 519, 2);   
gBs = gpuArray.rand(1000, 519);      
ifft2(gAs .* gBs, "symmetric");

I know that I can also do a for-loop through gAs slices and it works but I want to get some speedup by doing it in one call to ifft2.
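The per-slice loop mentioned above can be sketched as follows. This is only an illustration of the workaround as described in the question; the output variable name `out` and the preallocation with `zeros(..., "like", ...)` are assumptions, not taken from the original post.

```matlab
% Sketch of the slice-by-slice workaround (variable name "out" is hypothetical).
gAs = gpuArray.rand(999, 519, 20);
gBs = gpuArray.rand(999, 519);

% Preallocate a real result on the GPU with the same size and class as gAs.
out = zeros(size(gAs), "like", gAs);

% Each 2-D page is transformed separately, so ifft2 only ever sees a 999-by-519 input.
for k = 1:size(gAs, 3)
    out(:, :, k) = ifft2(gAs(:, :, k) .* gBs, "symmetric");
end
```

The loop trades a single batched call for 20 smaller ones, which is exactly the overhead the question is trying to avoid.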

I wanted to understand why this is happening and if there is a way in which I can pad the matrices so that I can still get the ifft2 of the original matrices.

For reference:

>> gpuDevice()
ans = 
  CUDADevice with properties:
                      Name: 'Tesla V100-SXM2-32GB'
                     Index: 1
         ComputeCapability: '7.0'
            SupportsDouble: 1
             DriverVersion: 11.2000
            ToolkitVersion: 11
        MaxThreadsPerBlock: 1024
          MaxShmemPerBlock: 49152
        MaxThreadBlockSize: [1024 1024 64]
               MaxGridSize: [2.1475e+09 65535 65535]
                 SIMDWidth: 32
               TotalMemory: 3.4090e+10
           AvailableMemory: 3.3167e+10
       MultiprocessorCount: 80
              ClockRateKHz: 1530000
               ComputeMode: 'Default'
      GPUOverlapsTransfers: 1
    KernelExecutionTimeout: 0
          CanMapHostMemory: 1
           DeviceSupported: 1
           DeviceAvailable: 1
            DeviceSelected: 1

3 Comments

Bruno Alvisio on 3 Jan 2022

Right. Thank you. Shouldn't this code throw the same error, since the matrices are not symmetric? In my case it runs fine:

gAs = gpuArray.rand(999, 519);
gBs = gpuArray.rand(999, 519);
ifft2(gAs .* gBs, "symmetric");

Thanks again

Walter Roberson on 3 Jan 2022

Sorry, I would have to boot into a different operating system to test (GPU is not supported on my MacOS.)

Answer 1

Matt J on 4 Jan 2022

https://ch.mathworks.com/matlabcentral/answers/1621490-ifft2-on-gpu-array#answer_867705

Edited: Matt J on 4 Jan 2022

I think you should probably just omit the 'symmetric' flag. On the GPU (mine at least), it doesn't seem to make a big difference in performance:

A = gpuArray.rand(512,512,512);

gputimeit(@() ifft2(A,'symmetric') ) % 0.0706 seconds

gputimeit(@() ifft2(A) ) % 0.0753 seconds

Whether this is an indication of sub-optimal software design on MathWorks' part, I'm not sure. On the CPU, the 'symmetric' flag means the software does fewer flops, but on a parallel system like the GPU, it's not the number of flops that matters.
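As a rough sketch of what omitting the flag looks like in practice: when the spectrum really is conjugate symmetric (here it is built with fft2 from real data, which is an assumed setup and not the asker's random input), the plain ifft2 result has an imaginary part at rounding-error level that can be dropped with real(). Whether the plain call avoids the error for the asker's array sizes is not confirmed in this thread.

```matlab
% Sketch under assumptions: the spectrum X comes from real data, so it is
% conjugate symmetric. The names x, X, y, maxImag, maxErr are illustrative.
x = gpuArray.rand(999, 519, 20);   % real-valued pages
X = fft2(x);                       % page-wise 2-D FFT

y = ifft2(X);                      % no 'symmetric' flag
maxImag = max(abs(imag(y(:))));    % size of the leftover imaginary part
y = real(y);                       % discard it explicitly

maxErr = max(abs(y(:) - x(:)));    % reconstruction error against the input
fprintf("max imaginary part: %g, max reconstruction error: %g\n", ...
    gather(maxImag), gather(maxErr));
```

For input that is not conjugate symmetric, such as the product of two random arrays in the question, real(ifft2(X)) and ifft2(X, "symmetric") should not be assumed to agree.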

Answer 2

Matt J on 3 Jan 2022

https://ch.mathworks.com/matlabcentral/answers/1621490-ifft2-on-gpu-array#answer_867155

Edited: Matt J on 3 Jan 2022

I think it's a bug, but one solution might be,

fn=@(z,d) ifft(z,[],d,'symmetric');
out = fn( fn(gAs .* gBs,1)  ,2);

2 Comments

Bruno Alvisio on 4 Jan 2022

Thanks for the answer. The code you provided is correct.

I have noticed, though, that there is very often a discrepancy between the results of the function handle fn and ifft2, even for 2-dimensional matrices, when their sizes are greater than ~4. I created the following code snippet. When run multiple times, it sometimes displays "not equal".

clear all;
close all;
fn=@(z,d) ifft(z, [], d, "symmetric");
m = 5;
n = 4;
a = gpuArray.rand(m, n);
b = gpuArray.rand(m, n);
c = ifft2(a .* b, "symmetric");
d = fn(fn(a .* b, 1), 2);
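% Report a mismatch if any element differs by more than the eps-based tolerance.
% (The negation is folded into ">" because "~" binds tighter than "<=".)
if any(abs(c - d) > eps(max(abs(c), abs(d))), "all")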
    disp("not equal")
end

IIUC, you are suggesting that there is a bug in ifft2 when the symmetric flag is provided?

Matt J on 4 Jan 2022

Edited: Matt J on 4 Jan 2022

It seems I had a conceptual error. ifft(ifft(X,[],1,'symmetric'),[],2,'symmetric') is not a valid replacement for ifft2(X,'symmetric') unless X is symmetric about both the x and y axes.

However, it does seem like a bug that only certain array sizes work for gpuArray.ifft2(). The CPU version of ifft2() doesn't have that problem.
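The conceptual point can be checked with a small sketch. It assumes a spectrum X built with fft2 from real data, so that X is conjugate symmetric in the 2-D sense, and it uses small CPU arrays so that it runs without a GPU. The names x, X, y2, ySep, err2 and errSep are illustrative only.

```matlab
% Sketch: compare ifft2(...,'symmetric') with two nested 1-D symmetric ifft calls
% on a spectrum that is conjugate symmetric in 2-D but not along each axis alone.
rng(0);                     % fixed seed so repeated runs behave the same
x = rand(5, 4);             % real data
X = fft2(x);                % 2-D conjugate-symmetric spectrum

fn = @(z, d) ifft(z, [], d, "symmetric");

y2   = ifft2(X, "symmetric");     % 2-D inverse, treats X as 2-D conjugate symmetric
ySep = fn(fn(X, 1), 2);           % forces a real result after the first dimension

err2   = max(abs(y2(:)   - x(:)));   % error of the 2-D call against the original data
errSep = max(abs(ySep(:) - x(:)));   % error of the nested 1-D calls

fprintf("ifft2 error: %g, nested ifft error: %g\n", err2, errSep);
```

The intermediate result after inverting along the first dimension is complex in general, so forcing it to be real with the flag discards information. That is why the nested form only matches when X is symmetric along each axis separately.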
