Vectorizing the colon problem

7 views (last 30 days)
xtremecheez
xtremecheez on 21 Nov 2017
Edited: Jan on 25 Nov 2017
I need help with vectorizing this colon. I've looked at the solution presented by Loren http://blogs.mathworks.com/loren/2008/10/13/vectorizing-the-notion-of-colon/, but mine is a slightly different problem and I am having trouble extending Loren's work.
Tavg, cbl, and cbu are vectors of equal but unknown length (could be anywhere from length 2 to hundreds of elements, hence the need for vectorization). T is a very long vector (~10000 elements). cbl and cbu contain the lower and upper indices of interesting regions in which the average value across that region is placed into every cell of the region (Tavg); the regions do not overlap NOR touch (see example below). Every region has at least length 1.
for ii = 1:length(Tavg)
T(cbl(ii):cbu(ii)) = Tavg(ii);
end
Edit: longer code fragment. Given the internal energy of each cell in a spatially discretized domain, I am finding the temperature in each cell. "bisection_eff" is a function that performs like fzero, but with array inputs (credit here: https://www.mathworks.com/matlabcentral/fileexchange/28150-bisection-method-root-finding). To decrease the number of calls to bisection_eff by avoiding redundant calculation, I find regions of nearly constant internal energy, indexed by cbl and cbu. Each region only requires one call of bisection_eff, using the average value of e_Jmol.
Clarification: I misspoke by saying the indexed regions can overlap even by one element. There is no overlap between regions. See example below.
Note: I have omitted from the code below some if statements to catch empty vectors and nans, so no need to worry about fringe cases for this question.
ediff = find(diff(round(e_Jmol,0))); %round e_Jmol a bit for speed
ediff2 = find(diff(ediff));
%identifying regions of nearly constant internal energy
cbl = [1;ediff(ediff2)+1;ediff(end)+1]'; %lower bounds
cbu = [ediff(1);ediff(ediff2+1);len]'; %upper bounds
cbavg = round((cbl+cbu)/2); %indices in the middle of constant region
cbej = e_Jmol(cbavg); %corresponding internal energy
ejend = e_Jmol(end); %ambient internal energy
%select narrow starting bounds for bisection
reg = 1*(cbej/ejend < 1.5) + 3*(cbej/ejend > 6); %low T (250-500) or high T (1500-3500)
reg(reg==0) = 2; %middle T (500-1500)
Tlb = 250*(reg==1) + 500*(reg==2) + 1500*(reg==3);
Tub = 500*(reg==1) + 1500*(reg==2) + 3500*(reg==3);
Tavg = bisection_eff(Tlb',Tub',Ru,e_Jmol(cbavg),conc(cbavg,:),A(:,cbavg,:));
for ii = 1:length(Tavg)
T(cbl(ii):cbu(ii)) = Tavg(ii);
end
Example numbers:
e_Jmol (10000x1) = 1.0e+04 * [6.6113 6.6113 6.6112 6.6145 6.7086 0.8103 0.8102 0.8101 0.8102 0.8102 0.8102 0.8102 etc...];
ediff (10x1) = [2 3 4 5 6 7 8 1000 1001 1002];
ediff2 (9x1) = [1 2 3 4 5 6 7 8 9];
cbl (11x1) = [1 3 4 5 6 7 8 9 1001 1002 1003 ];
cbu (11x1) = [2 3 4 5 6 7 8 1000 1001 1002 10000];
cbavg (11x1) = [2 3 4 5 6 7 8 505 1001 1002 5502];
cbej (11x1) = 1.0e+04 * [6.6113 6.6112 6.6145 6.7086 0.8103 0.8102 0.8101 0.8102 0.8031 0.5998 0.5986];
reg (11x1) = [3 3 3 3 1 1 1 1 1 1 1];
Tavg (11x1) = [1936 1936 1937 1965 369.6 369.4 369.4 369.4 366.5 286.3 285.8];
  1 Comment
Jan
Jan on 22 Nov 2017
@xtremecheez: An example of real data would be useful. Optimizing the code depend on the distribution of the data and the detail if the regions can overlap by 1 element is fundamental. Please attach some input data and explain, how you obtain cbl and cbu - perhaps the optimization can be started here already.

Sign in to comment.

Accepted Answer

Jan
Jan on 22 Nov 2017
Edited: Jan on 25 Nov 2017
the regions do not overlap but may touch (endpoint of reg. 1 =
startpoint of reg. 2)
This means, that the regions can overlap by 1 element. Therefore the order of the processing matters. Then a parallelization is not trivial.
Did you pre-allocate T before the loop?
T = zeros(1, max(cbu)); % Or nan()?
for ii = 1:length(Tavg)
T(cbl(ii):cbu(ii)) = Tavg(ii);
end
Without a pre-allocation T grows iteratively, which wastes a lot of resources.
Do you have a C compiler installed? Then using a Mex function might accelerate the processing.
[EDITED] The C code:
// File: MColonAssign.c
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
double *Tavg, *cbl, *cbu, *T, *Tp, *Tf, LenT;
size_t n;
Tavg = mxGetPr(prhs[0]);
cbl = mxGetPr(prhs[1]);
cbu = mxGetPr(prhs[2]);
LenT = mxGetScalar(prhs[3]);
plhs[0] = mxCreateDoubleMatrix(LenT, 1, mxREAL);
T = mxGetPr(plhs[0]);
n = mxGetNumberOfElements(prhs[0]);
while (n--) {
Tp = T + ((size_t) *cbl++) - 1;
Tf = T + (size_t) *cbu++;
while (Tp < Tf) {
*Tp++ = *Tavg;
}
Tavg++;
}
}
Compile it by:
mex -O MColonAssign.c
and call it as:
T = MColonAssign(Tavg, cbl, cbu, cbu(end));
Here it is assumed, that "cbu(end)" contains the maximum size of the output T.
This C code is fragile: As soon as any cbu is outside the valid range, the code let your Matlab session crash. Use it with care or add checks for values and types of the input.
For your tiny example data, the C code needs the half run time compared to the Matlab code.

More Answers (2)

Walter Roberson
Walter Roberson on 21 Nov 2017
Any solution to this is likely going to involve a minimum of one temporary vector the same size as T. It becomes questionable that vectorizing would give you any performance gains, especially if T is a large fraction of your memory (in which case working with the temporary vectors could require swapping.)
  2 Comments
xtremecheez
xtremecheez on 22 Nov 2017
Okay. If vectorizing will not be more efficient than the for loop, then I can eat the cost of the loop and look for other opportunities to improve performance. Thanks.
Walter Roberson
Walter Roberson on 22 Nov 2017
The computations that created cbl and cbu: is it possible that at some point you had a logical vector the same length as T and that you analyzed it to find "runs" of true elements with the start and end indices being put into cbl and cbu ?
If that does happen to be the case, then:
copies_of_avg = repelems(Tavg, cbu - cbl + 1);
T(That_hypothetical_logical_mask) = copies_of_avg;
would store the values in a vectorized way.
The part that is tricky to vectorize efficiently is forming the logical mask -- but if you happen to have that mask already...

Sign in to comment.


Matt J
Matt J on 22 Nov 2017
Edited: Matt J on 22 Nov 2017
n=length(T);
m=length(cbl);
e=ones(m,1);
cbu(cbu==n)=[];
S=sparse(cbl,e,Tavg,n,1) - sparse(cbu+1,e,Tavg,n,1);
T(:)=T(:)+cumsum(S);

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!