Manipulating matrices/vectors in GPU array operation

I am currently working on tasks that manipulate matrices/vectors in a large number of different ways in parallel. I attempted to do this with gpuArray/arrayfun, but found that it can only deal with scalars: the nested functions require all inputs and outputs to be scalars. On the other hand, based on my experiments, a for-loop implementation gets no parallel acceleration.
Consider the following example: M is an n-by-m matrix, and I need to obtain n(n-1)/2 matrices as results, each obtained by exchanging one of the possible pairs of rows of M. The code (test.m) below gives for-loop implementations on the CPU and the GPU respectively:
gpu=gpuDevice();
n=200;
M=rand(n,10);
P=nchoosek(1:n,2);    % all pairs of row indices, one pair per row
M1=gpuArray(M);
P1=gpuArray(P);
% CPU version
timer=tic();
R=cell(size(P,1),1);
for i=1:size(P,1)
    p=P(i,1);
    q=P(i,2);
    N=M;
    N(p,:)=M(q,:);    % exchange rows p and q
    N(q,:)=M(p,:);
    R{i,1}=N;
end
CT=toc(timer)
% GPU version
timer1=tic();
R1=cell(size(P1,1),1);
for j=1:size(P1,1)
    p1=P1(j,1);
    q1=P1(j,2);
    N1=M1;
    N1(p1,:)=M1(q1,:);
    N1(q1,:)=M1(p1,:);
    R1{j,1}=N1;
end
wait(gpu);            % make sure all GPU work has finished before stopping the timer
GT=toc(timer1)
>> test
CT =
0.20855
GT =
52.834
The GPU implementation is significantly slower. Can this task be done with gpuArray/arrayfun? Or is there some further vectorization I could apply?

Answers (1)

Joss Knight on 8 May 2018
It's hard to know how to get started answering your question. Using the GPU requires vectorized code, so you need to translate loops into vector and matrix operations. See the documentation for Vectorization.
For instance, in your first loop you are permuting rows of N and M. Well, the vectorized version of this is to use vector indexing:
for i=1:size(P,1)
    p=P(i,1);
    q=P(i,2);
    N=M;
    N(p,:)=M(q,:);
    N(q,:)=M(p,:);
end
is equivalent to
N(P(:,1),:) = N(P(:,2),:);
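As a rough sketch of how vector indexing could produce all of the row-swapped matrices from the question in one step, the code below builds one column of row indices per pair and gathers every result into an n-by-m-by-K array. The names idx, K and R3 are made up for illustration, and the small sizes are only there so the result is easy to check. Wrapping M in gpuArray should let the same indexing run on the GPU, though that is an assumption and has not been timed here.

```matlab
n = 5; m = 3;
M = rand(n, m);
P = nchoosek(1:n, 2);          % K-by-2 list of row pairs
K = size(P, 1);

% idx(:,k) is the row order for the k-th result:
% 1:n with entries P(k,1) and P(k,2) exchanged
idx = repmat((1:n)', 1, K);
k = 1:K;
idx(sub2ind([n K], P(:,1)', k)) = P(:,2)';
idx(sub2ind([n K], P(:,2)', k)) = P(:,1)';

% Gather all rows at once, then reshape to n-by-m-by-K
R3 = reshape(M(idx(:), :), n, K, m);
R3 = permute(R3, [1 3 2]);     % R3(:,:,k) is M with rows P(k,1) and P(k,2) exchanged
```

Each page R3(:,:,k) plays the role of R{k} in the original loop. For n=200 and m=10 the full array holds about 40 million doubles, so memory use is worth checking before moving it to the GPU.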
Of course, you are much better off using a permutation (or pivot) matrix so that your entire permutation can be written as a matrix multiplication. This is a matrix where for output row p, the q'th column is set to 1 and the rest to zero.
PM = accumarray(P, 1);
N = PM * N;
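For a single pair of rows, the permutation-matrix idea can be sketched as follows. This is a minimal illustration, assuming one swap (p, q) on a small matrix. The name perm and the example values are hypothetical, and the matrix is built here by reordering the rows of eye(n) rather than with accumarray.

```matlab
n = 4;
M = magic(n);
p = 1; q = 3;                  % rows to exchange (example values)

perm = 1:n;
perm([p q]) = [q p];           % row order after the swap
I = eye(n);
PM = I(perm, :);               % row p has its 1 in column q, row q has its 1 in column p

N = PM * M;                    % same result as M(perm, :)
```

All other rows of PM keep their 1 on the diagonal, so the multiplication leaves those rows of M unchanged.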
  2 Comments
Sicheng Zhang on 9 May 2018
Thank you very much Joss! So the key idea is that I need to design a mechanism to vectorize my code such that all manipulations become equivalent to a single matrix multiplication, is that correct? The example I gave was just the simplest case. In my work I'm dealing with much more complex manipulations. If I need multiple matrices as outputs, which have different sizes and cannot be concatenated, is it still possible to vectorize?
Joss Knight on 10 May 2018
No, vectorizing code means using matrix operations, element-wise operations, indexing, logical masks, permutations, reductions, accumulations, scalar dimension expansion, sorting, and functions like accumarray, arrayfun and others. You should start with the vectorization documentation.
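A few of the techniques in that list can be illustrated on small made-up data. The sketch below is only an example of the general style, not code from the original problem; the variables x, g, y, s, A and B are hypothetical.

```matlab
x = [3 -1 4 -1 5 -9 2 6];
g = [1  1 2  2 3  3 4  4];     % group label for each element of x

% Logical mask instead of an if statement inside a loop
y = x;
y(x < 0) = 0;                  % y = [3 0 4 0 5 0 2 6]

% Accumulation instead of a loop that sums per group
s = accumarray(g', y')';       % s = [3 4 5 8]

% Scalar dimension expansion: subtract each column mean from a matrix
A = [1 2; 3 4; 5 6];
B = bsxfun(@minus, A, mean(A, 1));   % B = [-2 -2; 0 0; 2 2]
```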
