CUDAKernel Object crashes GPU
Hi,
I am running a calculation in MATLAB on the GPU using a CUDAKernel object.
It was working fine with a grid size of 41x41, but with other grid sizes the GPU crashes. It does not seem to be a memory problem, since it works with 61x61 but crashes with 55x55. The calculations are fine (I compared them to a CPU calculation).
After loading all my data to the GPU, and before the kernel execution, I saw that around 1.5GB of its 2GB of 'Dedicated GPU Memory' was still free.
Does the size of the "Result" vector that I send to the GPU change during the calculation? I send all zeros, and then each thread calculates a value for a different cell in the vector.
The error message I get when I close MATLAB is:
NVIDIA OpenGL Driver Unable to recover from a kernel exception. The application must close.
Error code: 3 (subcode 2)
I tried changing the Global Settings in the NVIDIA Control Panel to 3D App - Visual Simulation.
This trick worked with the 55x55 grid, but did not solve the problem for other sizes such as 71x71, which makes me think it is a step in the right direction but not quite sufficient.
Thank you very much, I am looking forward to your help.
8 Comments
Joss Knight
on 28 Sep 2018
The most likely explanation is a bug in your CUDAKernel C++ code. You should post that code.
Omer Hamburger
on 28 Sep 2018
Joss Knight
on 30 Sep 2018
There are many things that could go wrong with this code if you are not calling it with the right inputs, for instance if b does not have numReplica*(colShift-1) + mWidth elements, or if Result is not long enough. So you'd better post the code you are using to define the inputs and the launch parameters of the CUDAKernel.
Omer Hamburger
on 30 Sep 2018
Joss Knight
on 2 Oct 2018
Edited: Joss Knight
on 2 Oct 2018
Of course the calling code is relevant! It's essential that you define your CUDAKernel block and grid dimensions correctly and that you're passing the right data of the right size and shape to the kernel. This is especially important for your kernel because CUDAKernel will not automatically detect the correct block and grid dimensions from your data, because you are processing a chunk of the input with each thread. All I need to know is the properties of your CUDAKernel and the size and shape of all the arguments to feval.
Result cannot grow during your calculation; if the kernel writes past its end, that will cause an illegal memory access. So Result must be initialized to have mHeight*numReplica elements.
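As a rough illustration of the size requirements mentioned in this thread, the sketch below checks the element counts of b and Result before launch and derives a 1-D launch configuration. The numeric values, the threadsPerBlock choice, the kernel variable k, and the one-thread-per-element-of-Result mapping are all assumptions made for illustration, since the original kernel and calling code are not shown here.

```matlab
% Hypothetical values; substitute the ones from your own script.
mWidth = 55; mHeight = 55; colShift = 3; numReplica = 4;

% Element counts the kernel is said to expect.
expectedB      = numReplica*(colShift - 1) + mWidth;
expectedResult = mHeight*numReplica;

% Stand-ins for the real arrays (in practice these would be gpuArrays).
b      = zeros(expectedB, 1);
Result = zeros(expectedResult, 1);   % preallocated; the kernel cannot grow it

% Fail early on the host instead of crashing the driver.
assert(numel(b) == expectedB, ...
    'b has %d elements, expected %d', numel(b), expectedB);
assert(numel(Result) == expectedResult, ...
    'Result has %d elements, expected %d', numel(Result), expectedResult);

% Launch configuration, ASSUMING one thread per element of Result.
threadsPerBlock = 64;    % hypothetical; must not exceed k.MaxThreadsPerBlock
numBlocks = ceil(expectedResult / threadsPerBlock);

% With a real CUDAKernel object k (hypothetical name) you would then set:
% k.ThreadBlockSize = [threadsPerBlock 1 1];
% k.GridSize        = [numBlocks 1 1];
```

Because the block count is rounded up, the launch usually contains more threads than Result has elements (256 threads for 220 elements with the values above), so the kernel itself has to skip any thread whose index is past the end of Result.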
Omer Hamburger
on 5 Oct 2018
Joss Knight
on 6 Oct 2018
Sorry, I can't interpret all that. Please just display the CUDAKernel object so I can see all its properties, show the line of code where you call feval, call size on all the array input arguments and show me the results, and give me the value of all the scalar input arguments ( mWidth, mHeight, colShift, SizeSparse, numReplica ).
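One possible way to collect the requested information is sketched below. The helper describe and the kernel variable k are hypothetical names, and the array and scalar values are stand-ins so the snippet runs on its own; replace them with the variables you actually pass to feval.

```matlab
% Stand-in values; replace with the real variables from your script.
mWidth = 55; mHeight = 55; colShift = 3; SizeSparse = 100; numReplica = 4;
b      = zeros(numReplica*(colShift - 1) + mWidth, 1);
Result = zeros(mHeight*numReplica, 1);

% Display every property of the kernel (ThreadBlockSize, GridSize, ...).
% Commented out because k, your CUDAKernel object, is not defined here.
% disp(k)

% Hypothetical helper: one line per array argument with its class and size.
% class() and size() also work on gpuArray inputs.
describe = @(name, x) sprintf('%s: class %s, size %s', ...
    name, class(x), mat2str(size(x)));

disp(describe('b', b));
disp(describe('Result', Result));

% Values of the scalar arguments.
fprintf('mWidth=%d mHeight=%d colShift=%d SizeSparse=%d numReplica=%d\n', ...
    mWidth, mHeight, colShift, SizeSparse, numReplica);
```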
Omer Hamburger
on 8 Oct 2018
Answers (0)