Tutorial / classes / training for developing asynchronous CUDA and MEX code?

Hello, I'm trying to improve the performance of my code, which uses a GPU for calculations that primarily use MTimes. I have several lines of code that I would like to run asynchronously. A rough sample of the code is shown below, with the asynchronous portion identified.
N = 128; % Simulation Size
[Nx,Ny,Nz] = deal(N,N,N); % Simulation Cube
Hxey = gpuArray(ones(Nx+1,Ny,Nz)); % Represents Constant
Hxez = gpuArray(ones(Nx+1,Ny,Nz)); % Represents Constant
Hyez = gpuArray(ones(Nx,Ny+1,Nz)); % Represents Constant
Hyex = gpuArray(ones(Nx,Ny+1,Nz)); % Represents Constant
Hzex = gpuArray(ones(Nx,Ny,Nz+1)); % Represents Constant
Hzey = gpuArray(ones(Nx,Ny,Nz+1)); % Represents Constant
Ex = gpuArray(zeros(Nx,Ny+1,Nz+1)); % Ex cells
Ey = gpuArray(zeros(Nx+1,Ny,Nz+1)); % Ey cells
Ez = gpuArray(zeros(Nx+1,Ny+1,Nz)); % Ez cells
Hx = gpuArray(zeros(Nx+1,Ny,Nz)); % Hx cells
Hy = gpuArray(zeros(Nx,Ny+1,Nz)); % Hy cells
Hz = gpuArray(zeros(Nx,Ny,Nz+1)); % Hz cells
%% Representation of code I want to perform asynchronously using CUDA / MEX
%% Begin asynchronous portion
Hx = Hx+Hxey.*diff(Ey,1,3)+Hxez.*diff(Ez,1,2); % Central Difference
Hy = Hy+Hyez.*diff(Ez,1,1)+Hyex.*diff(Ex,1,3); % Central Difference
Hz = Hz+Hzex.*diff(Ex,1,2)+Hzey.*diff(Ey,1,1); % Central Difference
%% End asynchronous portion
Are there any classes or training courses offered by MathWorks, or tutorials available online, that can teach me how to implement this?
Thank you!

Accepted Answer

Joss Knight on 9 Mar 2022
Make all the variables involved gpuArray objects and those lines of code will run as asynchronously as the GPU allows. This means that the last line of code will exit before Hz has finished being computed.
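One way to observe this behaviour is to time a single update with and without an explicit synchronization. The following is a minimal sketch, not part of the original answer: it assumes Parallel Computing Toolbox and a supported GPU, uses random stand-in data with the array shapes from the question, and the variable names `dev`, `tLaunch`, and `tTotal` are illustrative.

```matlab
% Sketch: compare the time until the statement returns with the time until the GPU finishes.
N = 128;                                  % same simulation size as in the question
Ey   = rand(N+1, N,   N+1, 'gpuArray');   % random stand-ins for the field arrays
Ez   = rand(N+1, N+1, N,   'gpuArray');
Hxey = ones(N+1, N, N, 'gpuArray');       % constant coefficients, as in the question
Hxez = ones(N+1, N, N, 'gpuArray');
Hx   = zeros(N+1, N, N, 'gpuArray');

dev = gpuDevice;                          % handle to the currently selected GPU
wait(dev);                                % make sure earlier GPU work has finished

tic;
Hx = Hx + Hxey.*diff(Ey,1,3) + Hxez.*diff(Ez,1,2);
tLaunch = toc;                            % the statement may return before the GPU is done

wait(dev);                                % block until all queued GPU work completes
tTotal = toc;                             % elapsed time including the computation itself
fprintf('after launch: %.3g s, after wait: %.3g s\n', tLaunch, tTotal);
```

If the second time is noticeably larger than the first, the statement returned while the GPU was still working. For routine benchmarking, `gputimeit` performs this synchronization for you.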
  1 Comment
Nathan Zechar on 12 Mar 2022
Thanks, I forgot to add gpuArray() to all my variables in this example. My actual code already does this, so it appears it already runs asynchronously.
With that out of the way, I would still like to optimize my code further. It appears I could still speed it up by a factor of 3 according to this source:
Are there any recommended tutorials for programming in CUDA C and using that code from MATLAB?