Using a templated CUDA kernel via MATLAB
Hello,
Is it possible to use a C++-style templated CUDA kernel via MATLAB's GPU Computing interface?
For example, consider the following (useless) toy code:
template<typename T>
__global__ void get_nans(T*, const int*);

template<>
__global__ void get_nans<double>(double* out, const int* dims)
{
    const int tx = blockIdx.x*blockDim.x + threadIdx.x;
    const int ty = blockIdx.y*blockDim.y + threadIdx.y;
    if ((tx < dims[1]) && (ty < dims[0]))
        out[tx*dims[0] + ty] = nan(0);
}

template<>
__global__ void get_nans<float>(float* out, const int* dims)
{
    const int tx = blockIdx.x*blockDim.x + threadIdx.x;
    const int ty = blockIdx.y*blockDim.y + threadIdx.y;
    if ((tx < dims[1]) && (ty < dims[0]))
        out[tx*dims[0] + ty] = nanf(0);
}
I then compile this into PTX code, but when I try to instantiate the kernel object in MATLAB I get the following error:
>> k = parallel.gpu.CUDAKernel( 'get_nans.ptx', 'get_nans.cu' );
Error using handleKernelArgs (line 61)
Found multiple matching entries in the PTX code. Matches found:
_Z16get_nansIdEvPT_PKS0_S3_S3_PKiS5_
_Z16get_nansIfEvPT_PKS0_S3_S3_PKiS5_
Thank you,
Alex
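One possible workaround, offered as a sketch rather than a confirmed fix: the error arises because both template specializations end up as PTX entries whose mangled names contain "get_nans", so the lookup is ambiguous. Keeping the templated logic in a __device__ function and exposing one extern "C" kernel per type gives each entry a unique, unmangled name. The names fill_nans, get_nans_double and get_nans_float below are hypothetical, chosen only for illustration.

```cuda
// Templated device-side implementation shared by all entry points.
// (fill_nans is a hypothetical helper name, not part of the original code.)
template <typename T>
__device__ void fill_nans(T* out, const int* dims, T value)
{
    const int tx = blockIdx.x * blockDim.x + threadIdx.x;
    const int ty = blockIdx.y * blockDim.y + threadIdx.y;
    if ((tx < dims[1]) && (ty < dims[0]))
        out[tx * dims[0] + ty] = value;
}

// extern "C" turns off C++ name mangling, so each kernel appears in the
// PTX under exactly the name written here.
extern "C" __global__ void get_nans_double(double* out, const int* dims)
{
    fill_nans<double>(out, dims, nan(""));
}

extern "C" __global__ void get_nans_float(float* out, const int* dims)
{
    fill_nans<float>(out, dims, nanf(""));
}
```

As far as I know, parallel.gpu.CUDAKernel also accepts an optional third argument naming the entry point, so after recompiling to PTX each kernel could be selected explicitly, for example parallel.gpu.CUDAKernel('get_nans.ptx', 'get_nans.cu', 'get_nans_double'). The same argument may also allow picking one of the original mangled names directly, though I have not verified that against the templated source.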