Using CUDA mex files interoperably with gpuArray

I have a 3rd party black box CUDA mex file. Every time I use it, it puts my GPU in a state such that gpuArray cannot subsequently use it. This is in R2020a, but I've observed similar problems in previous versions.
>> a=gpuArray(1)
Error using gpuArray
An unexpected error occurred during CUDA execution. The CUDA error was:
CUDA_ERROR_CONTEXT_IS_DESTROYED
The only remedy once this occurs is to restart MATLAB. Is there any way for gpuArray and my CUDA MEX file to share the same GPU without nuclear devastation?
  3 Comments
Edric Ellis on 15 Jan 2021
Unfortunately, I suspect there is not. It is definitely intended that you should be able to use CUDA MEX files together with gpuArray. If the CUDA MEX files are putting the GPU into a bad state, then reset(gpuDevice) is indeed the best option. You can use save to save gpuArray data to a file (providing you do this before the GPU is in a bad state).
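The save-then-reset workflow described above might look roughly like the sketch below. The variable names, the file name gpu_backup.mat, and the MEX function name blackBoxCudaMex are all hypothetical. The data is gathered to the host before saving, which is a cautious assumption rather than a requirement, and whether reset(gpuDevice) actually recovers the device depends on how the MEX file damaged the CUDA context.

```matlab
% Hypothetical names throughout; a sketch, not a guaranteed recovery recipe.
backupFile = 'gpu_backup.mat';

A = gpuArray(magic(4));        % some data living on the GPU

% 1. Back up the data BEFORE the GPU can end up in a bad state.
A_host = gather(A);            % copy the data to host memory
save(backupFile, 'A_host');    % write the host copy to a MAT-file

% 2. Call the problematic MEX function here (hypothetical name).
% out = blackBoxCudaMex(in);

% 3. Reset the device. This invalidates all existing gpuArray variables.
reset(gpuDevice);

% 4. Restore the data onto the freshly reset device.
S = load(backupFile);
A = gpuArray(S.A_host);
```

After the reset, any gpuArray variable created earlier is no longer valid, which is why the data is reloaded from the file rather than reused.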
Matt J on 15 Jan 2021
"It is definitely intended that you should be able to use CUDA MEX files together with gpuArray."
Thanks, Edric, but can you speculate what basic mistake the MEX might be making that would put the GPU in a bad state, and how it could be fixed? How exactly is a CUDA MEX file supposed to be aware of and operate within the same CUDA context as gpuArray code?

Accepted Answer

Joss Knight on 16 Jan 2021
Edited: Joss Knight on 16 Jan 2021
A third-party CUDA MEX file could have been built with a different version of the CUDA toolkit, and thus a different version of the CUDA runtime. If so, you will not be able to safely pass gpuArray objects into and out of the function.
To avoid this, obtain the source code and compile the function yourself using MEXCUDA.
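As a rough sketch of that rebuild step, assuming the source is available as a single file called blackBoxCudaMex.cu (a hypothetical name) and that a supported CUDA toolkit and host compiler are installed:

```matlab
% Report the CUDA versions MATLAB is working with, to compare against
% whatever toolkit the third-party binary was built with.
d = gpuDevice;
fprintf('CUDA toolkit version used by MATLAB: %g\n', d.ToolkitVersion);
fprintf('CUDA version supported by driver:    %g\n', d.DriverVersion);

% Rebuild the MEX file from source so it is compiled against the
% toolkit that matches this MATLAB release.
% blackBoxCudaMex.cu is a hypothetical file name.
mexcuda -v blackBoxCudaMex.cu
```

The -v flag only makes the build verbose, which helps confirm which compiler and libraries were picked up.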
That said, there's nothing to guarantee it won't work...but in this case it looks like your MEX function is resetting the device itself and thus destroying the CUDA context (for instance, it could be calling the CUDA C runtime function cudaDeviceReset). You won't be able to use it safely with gpuArray variables in your MATLAB workspace. Again, if you had access to the source code, you could modify this behaviour and even have MATLAB successfully respond to the reset event.
  3 Comments
Joss Knight on 16 Jan 2021
Well, I was being loose with my words. It's possible to have a call to reset in MATLAB successfully clear up persistent state in a MEX function. Equivalently, it is possible to cleanly reset the device from a MEX function (typically by actually executing the MATLAB function reset(gpuDevice) rather than calling the CUDA API). One of the things you can theoretically do by managing device reset is save GPU variables or return them to the host so the data is not lost.
However, it's pretty rare to need to reset the device in a MEX function, so perhaps your best bet is to work out why the CUDA context is being destroyed and fix that problem.
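One way to narrow down where the context is being destroyed is to probe the GPU from MATLAB right after each call into the MEX file. The snippet below is only a sketch: blackBoxCudaMex is a hypothetical name, and the probe simply repeats the gpuArray(1) call that failed in the question.

```matlab
% Hypothetical diagnostic: is the GPU still usable from MATLAB after the MEX call?
gpuOk = true;
try
    % out = blackBoxCudaMex(in);   % hypothetical name: call the MEX under test here
    probe = gpuArray(1);           % the same call that failed in the question
    probe = probe + 1;             % force some work on the device
    wait(gpuDevice);               % block until the GPU has finished
catch err
    gpuOk = false;
    fprintf('GPU unusable after MEX call: %s\n', err.message);
end
```

If gpuOk comes back false only after certain MEX calls, those code paths are the first place to look for a device reset in the source.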
Matt J on 22 Jan 2021
Getting rid of the cudaDeviceReset calls fixed it. Thanks!
