How to vectorize this for loop
Show older comments
for i = 1:no_frame
frame(:,i) = y(start:stop).*wind_sam;
start = start+h_size;
stop = start+w_sam-1;
end
3 Comments
Rik
on 23 Jan 2018
Are wind_sam h_size and w_sam scalar values or functions? And is y a vector or a function?
What is an "array function"? Is wind_sam a vector, a matrix or is it really a "function"? Please provide some example data, e.g. produced by rand.
Did you pre-allocate the output frame? If not, this might be more efficient than a vectorization, if creating the required temporary array to store all y(start:stop) is time consuming.
Jan
on 23 Jan 2018
@serena: Please post a working example, which e.g. explains the initial value of start and whether the subvectors of y overlap or not. Can hsize be smaller than w_sam - 1? Seeing a part of the code only impedes the suggestion of improvements.
Answers (3)
Walter Roberson
on 23 Jan 2018
0 votes
You have an initial value for start. Remove the first start-1 values from y. Reshape into rows of h_size. Take the first w_sam rows of the result.
1 Comment
Jan
on 23 Jan 2018
hsize could be smaller than w_sam-1. Then the overlapping subvectors of y cannot be represented by reshaping the vector.
David Goodmanson
on 23 Jan 2018
Edited: David Goodmanson
on 23 Jan 2018
Hi serena,
I believe this works, assuming that wind_sam is a column vector of length w_sam. The idea is to make an index matrix and use it to address the contents of y.
I did not use variable names start and stop because they are existing Matlab function names.
st0 = 5; % whatever the first start index is
st = st0 + h_size*(0:no_frame-1); % vector of starting indices
ind = st + (0:w_sam-1)'; % matrix of start-to-stop indices, in columns
frame1 = y(ind).*wind_sam;
If you have an older Matlab version and line for ind does not work, you could use
ind = repmat(st,w_sam,1) + repmat((0:w_sam-1)',1,no_frame);
11 Comments
serena dsouza
on 24 Jan 2018
Edited: Walter Roberson
on 26 Jan 2018
Jan
on 25 Jan 2018
@serena dsouza: Please mark you code with the mouse and press the "{} Code" button. Then the code is readable.
serena dsouza
on 26 Jan 2018
David Goodmanson
on 26 Jan 2018
Hi serena,
you didn't supply fs, l or y so I just picked some numbers so that the size of the 'frame' matrix came out all right. y is not really an issue assuming it is a column vector. I just used a very large column of random numbers for comparison purposes.
Running the for loop method vs. the answer I came up with before, I still get agreement in the frame calculation. (One thing the code does not do compared to yours is come up with the last values for start and stop, if it so happens you need those later). If you do not get agreement, could you supply an example where it doesn't work? Thanks.
% pick some stuff
fs = 32000;
l = 20000;
y = rand(1e6,1);
wind_int = 20; % Window duration 20 ms
ovlap = 50; % Overlap 50%
w_sam = (wind_int*fs/1000);
ovlap_sam = floor(w_sam*ovlap/100);
h_size = w_sam-ovlap_sam; % HOP size 200 ms
no_frame = floor(l/h_size)-1; % Number of frames
wind_sam = hamming(w_sam);
start = 1;
% ----------
% insert a line to save the first start point for the new code
st0 = start;
%-----------
stop = w_sam;
frame = zeros(w_sam,no_frame);
for i = 1:no_frame
frame(:,i) = y(start:stop).*wind_sam;
start = start+h_size;
stop = start+w_sam-1;
end
% new code
st = st0 + h_size*(0:no_frame-1); % vector of starting points
ind = st + (0:w_sam-1)'; % matrix of start-to-stop indices, in columns
%ind = repmat(st,w_sam,1) + repmat((0:w_sam-1)',1,no_frame); % long way to do same thing
frame1 = y(ind).*wind_sam;
max(max(abs(frame1-frame)))
ans = 0
Jan
on 26 Jan 2018
When I run the original loop and the vectorized method 100 times, I get these timings by tic/toc:
Elapsed time is 0.020065 seconds. % Loop
Elapsed time is 0.062524 seconds. % Vectorized
This means:
- The loop is efficient.
- At least for inputs of the test data, which are not tiny, the total runtime is very small. Then trying to optimize this piece of code can be a waste of time. Search in the net for "premature optimization".
David Goodmanson
on 27 Jan 2018
Hi Jan,
This is interesting. In this particular case I did not expect that the matrix way would be slower. When I tried this 10k times on my pc, the matrix way took "only" twice as long as the do loop way and not three times as long. Going to uint32 for the index variable got it down to 1.5 times as long. But it's still slower.
Jan
on 27 Jan 2018
@David: You have a large index matrix for the vectorized code. Then Matlab has to check for each index, if it is inside the boundaries. For y(a:b) this is checked for a and b only and the subvector can be accessed more efficiently, e.g. because a:b is not created explicitly. If you write: y([a:b]), this advantage vanishes.
serena dsouza
on 27 Jan 2018
Edited: Walter Roberson
on 28 Jan 2018
@serena: What does this mean? What is not getting what? Did you read my comments and take into account, that the loop is faster? Please be so kind to answer my question, although it might be clear for yourself already: can hsize be smaller than w_sam - 1? I'd try to suggest a method with logical indices, if it is clear, what you want to calculate exactly. It would be easier if you post a code, which runs successfully.
Walter Roberson
on 28 Jan 2018
Jan, your description of the benefits of y(a:b) over y([a:b]) do not agree with the example documentation at https://www.mathworks.com/help/matlab/ref/subsref.html#br5htww-6 which show that subsref is called with expanded indices.
@Walter: You are right, there is a discrepancy. Then I dare to consider the explanations at the link as simplification, which does not take the JIT acceleration into account. Usually I trust the documentation for good reasons, but here I am suspicious because of the timings:
function Untitled
x = rand(1, 1e6);
y = zeros(1000, 1000);
% Warm up - not used for measuring:
for k = 1:1000
a = (k - 1) * 1000 + 1;
y(:, k) = x([a:a+999]);
end
tic
for rep = 1:100
for k = 1:1000
a = (k - 1) * 1000 + 1;
y(:, k) = x([a:a+999]);
end
end
toc
tic
for rep = 1:100
for k = 1:1000
a = (k - 1) * 1000 + 1;
v = a:a+999;
y(:, k) = x(v);
end
end
toc
tic
for rep = 1:100
for k = 1:1000
a = (k - 1) * 1000 + 1;
y(:, k) = x(a:a+999);
end
end
toc
Elapsed time is 1.640906 seconds. % x([a:b])
Elapsed time is 1.636042 seconds. % v=a:b; x(v)
Elapsed time is 0.737971 seconds. % x(a:b)
I can imagine 2 explanations for the speed up of x(a:b):
- a:b is not created explicitly and range checks are omitted for the inner elements.
- a faster memcpy method instead of an elementwise copy (in addition, therefore 1. must be assumed also).
The 2. idea can be excluded by using a:2:b for indexing, which shows a comparable advantage for x(a:2:b) compared to x([a:2:b]).
Therefore I claim boldly, that 1. is applied.
What a pity that the JIT is not documented. But, wait, even blaming the JIT is a too cheeky speculation:
feature jit off
feature accel off
and enabling the profiler does not influence the timings remarkably. Anyway, something smarter than calling subref must happen here. How would you explain the speed difference?
Maybe we should ask TMW for an explanation. Their argument for not documenting the JIT was: "If we publish the methods, the users will optimize their code for the JIT. But we want to optimize the JIT for the real user code." But this is not mutual exclusive.
Walter Roberson
on 26 Jan 2018
0 votes
If you have the Communications toolbox, I suggest calling buffer()
Categories
Find more on Loops and Conditional Statements in Help Center and File Exchange
Products
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!