can this loop be vectorized?

Here's a simplified example of some data I have. I'd like to eliminate the loop because my x and y arrays in reality have tens of thousands of elements, not just three.
X = repmat(0:5:1000,201,1);
Y = repmat((1000:-5:0)',1,201);
x = [50 326 800];
y = [900 429 600];
H = NaN(201,201,length(x));
for k = 1:length(x)
H(:,:,k) = hypot(X-x(k),Y-y(k));
end

 Accepted Answer

Matt J
Matt J on 30 May 2014
Edited: Matt J on 30 May 2014
Doing bsxfun(@minus,X,...) will be inefficient since X contains a lot of replicated data. So, I propose,
X0=(0:5:1000).'; sX=length(X0);
Y0=(1000:-5:0).'; sY=length(Y0);
[I,J]=ndgrid(1:sY,1:sX);
Xc=bsxfun(@minus,X0,x(:).').^2;
Yc=bsxfun(@minus,Y0,y(:).').^2;
H=reshape( sqrt( Yc(I,:)+Xc(J,:)) ,sY,sX,[]);

5 Comments

Matt J
Matt J on 30 May 2014
Edited: Matt J on 30 May 2014
I get about a factor of 2 speed-up over
[Y,X]=ndgrid(Y0,X0);
H2 = hypot(bsxfun(@minus,X,reshape(x,1,1,N)),...
bsxfun(@minus,Y,reshape(y,1,1,N)));
for N=1000.
Nice. +1
The only caveat is that the data should indeed be repeated and that it was not only a simplified example by @Chad.
A still faster modification. No data replication at all. 1.5 times faster than my first proposal,
X0=(0:5:1000).'; sX=length(X0);
Y0=(1000:-5:0).'; sY=length(Y0);
Xc=reshape( bsxfun(@minus,X0,x(:).').^2 , 1,sX,N);
Yc=reshape( bsxfun(@minus,Y0,y(:).').^2, sY,1,N);
H=sqrt( bsxfun(@plus,Xc,Yc) );
This thread is fun to watch! I appreciate your clear examples; I'm learning quite a bit as I follow along.
To respond to a couple of points raised:
1. The data don't necessarily need to be repeated. If it's faster to not repeat the X and Y vectors, I'm happy to leave them as vectors.
2. I like the idea of taking out the square root to speed up the calculation because it's just as easy for me to compare the H.^2 values to a distance squared.
The reason I'm calculating H is because I have a surface Z and I want to identify all data in Z that are more than 80 meters from the points given by x and y. I'll end up setting
Z(min(H,[],3)>80) = NaN;
If it's faster, the sqrt function can be removed and I'll change the line above to
Z(min(H,[].3)>80^2) = NaN;
I've adapted your solution and shared it as a function on File Exchange here . Thanks all for your help on this.

Sign in to comment.

More Answers (3)

José-Luis
José-Luis on 30 May 2014
Edited: José-Luis on 30 May 2014
bsxfun() is your friend:
X = repmat(0:5:1000,201,1);
Y = repmat((1000:-5:0)',1,201);
x(1,1,1:3) = [50 326 800];
y(1,1,1:3) = [900 429 600];
tic
H = NaN(201,201,length(x));
for k = 1:length(x)
H(:,:,k) = hypot(X-x(k),Y-y(k));
end
toc
tic
H_new = hypot(bsxfun(@minus,X,x),bsxfun(@minus,Y,y));
toc
Vectorized code is not necessarily faster. If performance is really an issue, you could try a mex file.
EDIT
Another option:
tic
H_nuevo = hypot(repmat(X,[1 1 3]) - repmat(x,[201 201 1]) , ...
repmat(Y,[1 1 3]) - repmat(y,[201 201 1]) );
toc
An idea to increase the efficiency of your code is to try to compare squared distances instead, since the sqrt() is relatively expensive to perform.

6 Comments

as is hypot()
Out of curiosity I went ahead an mexed this and I'm seeing about a three times slowdown.
My bad... I meant the sqrt in hypot. I would not have guessed to a slow down with a mex file.
@Jose, the linear algebra and math libraries that MATLAB uses are highly optimized and multithreaded for this type of operation. Simple MEX code won't necessarily (I cautiously say even usually) outperform it.
Thanks Sean. Good to know. I find it hard to know a-priori whether Matlab is going to perform decently or will be mired in overheads. Any good rule of thumb?
What we say at seminars is:
  • You'll always see a speedup for signal processing, communications, or other applications where there is a persistent state that needs to be updated
  • Anything with fixed-point
You won't see a speedup for:
  • Linear Algebra, ffts, or functions that use IPP.
But, your milage will vary :)

Sign in to comment.

I'm seeing a slight (1.25ish times) speedup with the bsxfun approach
function speedyhypot()
X = repmat(0:5:1000,201,1);
Y = repmat((1000:-5:0)',1,201);
x = [50 326 800];
y = [900 429 600];
t1 = timeit(@FirstWay);
t2 = timeit(@SecondWay);
disp([t1 t2])
function FirstWay
H = NaN(201,201,length(x));
for k = 1:length(x)
H(:,:,k) = hypot(X-x(k),Y-y(k));
end
end
function SecondWay
H2 = hypot(bsxfun(@minus,X,reshape(x,1,1,3)),bsxfun(@minus,Y,reshape(y,1,1,3)));
end
end
Jan
Jan on 31 May 2014
Edited: Jan on 31 May 2014
hypot is smart with avoiding overflows. But for limited values sqrt(a^2+b^2) is much faster:
for k = 1:length(x)
H(:,:,k) = sqrt((X-x(k)).^2 + (Y-y(k)).^2);
end
I get a speedup of factor 3, much more than the bsxfun tricks.
The power operator is expensive, so it is worth to try to reduce the squaring by using the identity: (a-b)^2 == a^2 - 2*a*b + b^2:
X2 = X .^ 2;
Y2 = Y .^ 2;
for k = 1:length(x)
H(:,:,k) = sqrt(X2 - X*(2*x(k)) + x(k)^2 + Y2 - Y*(2*y(k)) + y(k)^2);
end
2*X*x(k) is less efficient than X*(2*x(k)), because in the first case you have two multiplications of a scalar with a matrix, but in the second one one matrix operation only.
But this is slower than the first method. It seems like Matlab smartly performs a multiplikation for .^2 and not a power operation.

5 Comments

Matt J
Matt J on 31 May 2014
Edited: Matt J on 31 May 2014
I get a speedup of factor 3, much more than the bsxfun tricks.
For me, bsxfun is still about 15% faster (for length(x)=4000). The gap increases to a factor of 2 if we drop the sqrt() operation, as Chad said is permissible in his comment here.
so it is worth to try to reduce the squaring by using the identity: (a-b)^2=a^2 - 2*a*b + b^2
There is a sacrifice in numerical accuracy from doing so. You end up subtracting two very large numbers.
@Matt J: I'm using the ancient Matlab version 2011b and should have mention this.
Sean de Wolski found a speedup of 1.25 with bsxfun. In your solution hypot is replaced by sqrt(a^2+b^2) and I think that this is the main reason for the speedup. The additional 15% with bsxfun are fine, but considering hypot as bottleneck is important.
I do not find any numerical differences between the solutions with the provided test data. This is obviously caused bythe integer values, which can be represented exaclty. For real data numerical differences will occur. But I though it is interesting to mention, that it is even not faster.
Matt J
Matt J on 1 Jun 2014
Edited: Matt J on 1 Jun 2014
In your solution hypot is replaced by sqrt(a^2+b^2) and I think that this is the main reason for the speedup.
I don't think so. Below is a test script that includes all three methods, mine, yours, and Sean's re-implemented without hypot. The timings I get are as follows,
Elapsed time is 1.846706 seconds. %Sean
Elapsed time is 0.881087 seconds. %Jan
Elapsed time is 0.427472 seconds. %Matt
Note that I have dropped the sqrt operations because Chad says he doesn't need them. When I put them back in, the gap between yours and mine diminishes,
Elapsed time is 4.260694 seconds. %Sean
Elapsed time is 1.546904 seconds. %Jan
Elapsed time is 1.291443 seconds. %Matt
but there is still a considerable gap between mine and Sean's.
N=4000; x=rand(1,N); y=x;
X0=(0:5:1000).'; sX=length(X0);
Y0=(1000:-5:0).'; sY=length(Y0);
[Y,X]=ndgrid(Y0,X0);
%%%Sean
tic;
H = ( bsxfun(@minus,X,reshape(x,1,1,N)).^2 - ...
bsxfun(@minus,Y,reshape(y,1,1,N)).^2 );
toc
%%%Jan
tic;
H=zeros(sY,sX,N);
for k = 1:length(x)
H(:,:,k) = ((X-x(k)).^2 + (Y-y(k)).^2);
end
toc
%%%Matt
tic;
Xc=reshape( bsxfun(@minus,X0,x(:).') , 1,sX,N);
Yc=reshape( bsxfun(@minus,Y0,y(:).'), sY,1,N);
H=( bsxfun(@plus,Xc,Yc) );
toc
Jan
Jan on 1 Jun 2014
Edited: Jan on 1 Jun 2014
@Matt J: Which Matlab version is used for your timings?
Matt J
Matt J on 1 Jun 2014
Edited: Matt J on 1 Jun 2014
R2013b. I get similar timings in R2012a, though.

Sign in to comment.

Categories

Find more on Historical Contests in Help Center and File Exchange

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!