gpucoder.transpose

Optimized GPU implementation of the MATLAB transpose function

Description

B = gpucoder.transpose(A) performs an efficient out-of-place, non-conjugate transpose of A on the GPU using shared memory. When called from MATLAB® (outside the code generation context), gpucoder.transpose calls the built-in transpose function.
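As a small illustration of what "non-conjugate" means here, the sketch below uses the built-in transpose function, which the description above says gpucoder.transpose falls back to when called from MATLAB. The matrix values are arbitrary and chosen only for illustration.

```matlab
% Complex test matrix (illustrative values only)
A = [1+2i, 3-4i; 5+6i, 7-8i];

B = transpose(A);   % non-conjugate transpose, equivalent to A.'
C = A';             % conjugate transpose, shown for contrast

% Rows and columns are swapped, but imaginary parts keep their sign
assert(isequal(B, A.'));
assert(isequal(B, [1+2i, 5+6i; 3-4i, 7-8i]));

% The conjugate transpose differs from B by a complex conjugate
assert(isequal(C, conj(B)));
```

For real-valued inputs the two operations give the same result, so the distinction matters only for complex data.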

Examples

This example generates CUDA® code to transpose a matrix.

In one file, write an entry-point function myTranspose that accepts a matrix input A. Use the gpucoder.transpose function to generate a GPU-efficient implementation for transposing A.

function B = myTranspose(A)
     B = gpucoder.transpose(A);
end

Use the codegen function to generate a CUDA MEX function.

codegen -config coder.gpuConfig('mex') -args {ones(1024,1024,'double')} -report myTranspose
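Once code generation succeeds, the generated MEX function can be checked against the built-in transpose. The sketch below assumes that codegen used its default output name, myTranspose_mex, and that a supported CUDA GPU is available. The random test matrix is only illustrative.

```matlab
% Assumes the codegen command above produced myTranspose_mex (default name)
A = rand(1024,1024);        % same size and class as the -args specification
B = myTranspose_mex(A);     % run the generated CUDA MEX function

% A transpose only rearranges elements, so the result should match exactly
assert(isequal(B, A.'));
```

The input must match the size and class given to -args, because the MEX function is generated for that specific input type.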

Input Arguments

A: Input array, specified as a vector or matrix.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical | char
Complex Number Support: Yes

Output Arguments

B: Transposed array, returned as a vector or matrix.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical | char

Limitations

  • gpucoder.transpose does not support inputs with more than two dimensions.
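One possible way to work within this limitation, sketched here as an assumption rather than a documented pattern, is to transpose a 3-D array one 2-D page at a time. The function name pagewiseTranspose is hypothetical, the input is assumed to be a numeric 3-D array, and the efficiency of the code generated for this loop has not been verified.

```matlab
function B = pagewiseTranspose(A)
    % Hypothetical helper: transposes each 2-D page of a numeric 3-D array.
    % Assumes A is m-by-n-by-p; each slice A(:,:,k) is a matrix, which
    % satisfies the two-dimension limit of gpucoder.transpose.
    [m, n, p] = size(A);
    B = zeros(n, m, p, 'like', A);   % preallocate with the class of A
    for k = 1:p
        B(:,:,k) = gpucoder.transpose(A(:,:,k));
    end
end
```

When run in MATLAB, the result should match permute(A,[2 1 3]), which can serve as a reference when validating generated code.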

Version History

Introduced in R2019a