How to store and reuse coefficients in a for loop

Question

Matthew Kehoe on 23 Jul 2021

https://ch.mathworks.com/matlabcentral/answers/884339-how-to-store-and-reuse-coefficients-in-a-for-loop

Edited: per isakson on 25 Jul 2021

My Matlab code has a subroutine that repeatedly executes a double for loop. While testing large simulations, this subroutine is called over 10^6 times. I am curious if I can change/reuse part of the subroutine so that I can improve performance.

My original Matlab code (with test data) is shown below.

% Test Data
clc; clear all;
M = 30;
N = 30;
Nx = 32;
f_n_m = rand(Nx,N+1,M+1);
epsilon = 0.3;
delta = 0.4;
SumType = randi(3,1);
f = zeros(Nx,1);
coeff = zeros(N+1,M+1);
% Original Implementation
for j=1:Nx
  % Can this double for loop be performed once instead of Nx times?
  for r=0:N
    for s=0:M
      coeff(r+1,s+1) = f_n_m(j,r+1,s+1);
    end
  end
  if(SumType==1)
    % My code calls separate subroutines which need the value of coeff.
    % For testing purposes I have put f(j) = test data. These three if
    % statements call external functions in my code.
    f(j) = coeff(j)*epsilon*delta*N*M;
  elseif(SumType==2)
    f(j) = 2*coeff(j)*epsilon*delta*N*M;
  elseif(SumType==3)
    f(j) = 3*coeff(j)*epsilon*delta*N*M;
  end
end

My first (bad) idea was to change the above code to the code below.

f2 = zeros(Nx,1);
coeff2 = zeros(N+1,M+1);
% Move the double for loop here
for j=1:Nx
  for r=0:N
    for s=0:M
      coeff2(r+1,s+1) = f_n_m(j,r+1,s+1);
    end
  end
end
% Then run the original for loop 
for j=1:Nx
  if(SumType==1)
    f2(j) = coeff2(j)*epsilon*delta*N*M;
  elseif(SumType==2)
    f2(j) = 2*coeff2(j)*epsilon*delta*N*M;
  elseif(SumType==3)
    f2(j) = 3*coeff2(j)*epsilon*delta*N*M;
  end
end
% However, this gives different answers: coeff2 is overwritten on every pass
% of the first loop, so the second loop only sees the slice for j = Nx,
% whereas f(j) was computed from the slice for its own j.
diff = norm(f-f2,inf) % large
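
The overwriting can be checked directly. The following is a minimal sketch, assuming small deterministic test data (built with reshape instead of rand) so the result is easy to inspect; the sizes are illustrative only.

```matlab
% Small deterministic stand-in for the test data (sizes are assumptions)
N = 2; M = 3; Nx = 4;
f_n_m = reshape(1:Nx*(N+1)*(M+1), Nx, N+1, M+1);

% The hoisted double loop from the attempt above
coeff2 = zeros(N+1, M+1);
for j = 1:Nx
  for r = 0:N
    for s = 0:M
      coeff2(r+1,s+1) = f_n_m(j,r+1,s+1);
    end
  end
end

% After the loop, coeff2 holds only the slice belonging to the last j
last_slice = squeeze(f_n_m(Nx,:,:));
only_last_survives = isequal(coeff2, last_slice);
```

Because coeff2 is a single (N+1)-by-(M+1) matrix, it cannot hold Nx different slices at once.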

I'm curious if there is a more efficient way of writing

for j=1:Nx
  % start inner double for loop
  for r=0:N
    for s=0:M
      coeff(r+1,s+1) = f_n_m(j,r+1,s+1);
    end
  end
  % end inner double for loop
  if(SumType==1)
    f(j) = % An external function involving coeff, epsilon, delta, N, M
  elseif(SumType==2)
    f(j) = % A different external function involving coeff, epsilon, delta, N, M
  elseif(SumType==3)
    f(j) = % A third external function involving coeff, epsilon, delta, N, M
  end
end

so that the inner double for loop is only calculated once instead of Nx times. Is this a limitation of the way the function is written?

Answer 1

per isakson on 23 Jul 2021

Caveat: I don't fully understand your code and what I say might not be relevant to your real project.

"% Can this double for loop be performed once instead of Nx times?" The short answer is no, because coeff is 2D and f_n_m is 3D. Maybe, you can make coeff 3D. Depends on how you will use coeff.

"f(jj) = coeff(jj)*epsilon*delta*N*M;" What is coeff(jj) supposed to return? This is linear indexing, which returns a scalar.

Proposal: Replace the double for-loop by alt_coeff = f_n_m( :, :, jj );. Notice that I have modified the indexing of f_n_m so that jj is the third index. Run the function cssm1() with profile(). alt_coeff improves speed significantly.

cssm1
ans = "Happy end"
function    out = cssm1
    % Test Data
    % clc; clear all;
    M = 30;
    N = 30;
    Nx = 32;
    f_n_m = rand(N+1,M+1,Nx);
    epsilon = 0.3;
    delta = 0.4;
    SumType = randi(3,1);
    
    f = zeros(Nx,1);
    coeff = zeros(N+1,M+1);
    
    % Original Implementation
    for jj=1:Nx
        % Can this double for loop be performed once instead of Nx times?
        for r=0:N
            for s=0:M
                coeff(r+1,s+1) = f_n_m(r+1,s+1,jj);
            end
        end
        alt_coeff = f_n_m( :, :, jj );
        
        if(SumType==1)
            % My code calls separate subroutines which need the value of coeff.
            % For testing purposes I have put f(j) = test data. These three if
            % statements call external functions in my code.
            f(jj) = coeff(jj)*epsilon*delta*N*M;
        elseif(SumType==2)
            f(jj) = 2*coeff(jj)*epsilon*delta*N*M;
        elseif(SumType==3)
            f(jj) = 3*coeff(jj)*epsilon*delta*N*M;
        end
    end
    out = "Happy end";
end
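
A quick way to convince yourself that the slice is a drop-in replacement for the double loop is to compare the two results. This is a minimal sketch with small deterministic data (the sizes and the chosen jj are assumptions for illustration):

```matlab
N = 3; M = 4; Nx = 5;
f_n_m = reshape(1:(N+1)*(M+1)*Nx, N+1, M+1, Nx);   % jj is the third index
jj = 2;

% Element-by-element copy, as in the original double loop
coeff = zeros(N+1, M+1);
for r = 0:N
    for s = 0:M
        coeff(r+1,s+1) = f_n_m(r+1,s+1,jj);
    end
end

% Single slicing operation
alt_coeff = f_n_m(:, :, jj);

same = isequal(coeff, alt_coeff);   % true when both hold the same values
```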

per isakson on 24 Jul 2021

Edited: per isakson on 25 Jul 2021

"I don't see why it is necessary to change the indexing." May be, "better" is a more appropriate word than "necessary". I'll try to explain. I start with a little bit of background.

Matlab uses column-major order to store arrays. Since The MathWorks doesn't want to bother the ordinary user with "low level stuff", the documentation doesn't discuss the importance of order to performance (at least, I failed to find it). For example, it is not mentioned in Techniques to Improve Performance. (It's crucial in communication with other languages (e.g. C) as described in MATLAB Data.)

Wikipedia provides a good description at Row- and column-major order. I pick an important statement: "[...] modern CPUs process sequential data more efficiently than nonsequential data". (The larger the array the more "sequential" matters.)

My conclusion regarding column-major and performance is: Choose the order of the dimensions of large arrays so that data will be processed sequentially (or even better in big chunks of contiguous data). Below is a really simple example of picking the elements of an array in the order they are stored in memory:

cssm3
  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
%
function cssm3
    vec = 1:24;
    array = reshape( vec, 2,3,4 );
    for kk = 1:4
        for jj = 1:3
            for ii = 1:2
                fprintf( '%3d', array( ii, jj, kk ) );
            end
        end
    end
    fprintf('\n')
end
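
As a complement to the example above, the following sketch (using the same 2-by-3-by-4 array) shows which linear indices, i.e. which memory positions, a slice touches. A slice along the last dimension is one contiguous block, while a slice along the first dimension picks every second element.

```matlab
array = reshape(1:24, 2, 3, 4);   % values equal their own linear index

% Page kk: all of dimensions 1 and 2, one index in dimension 3
kk = 2;
page = array(:, :, kk);
page_idx = (kk-1)*6 + (1:6);      % six consecutive linear indices, 7..12

% "Row" ii: one index in dimension 1, all of dimensions 2 and 3
ii = 1;
row = array(ii, :, :);
row_idx = 1:2:24;                 % every second linear index (stride 2)

page_is_contiguous = isequal(page(:).', array(page_idx));
row_is_strided     = isequal(row(:).',  array(row_idx));
```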

Why alt_coef_orig = f_n_m_orig(jj,:,:); doesn't work: alt_coef_orig is a 3D array with a leading singleton dimension. It works if you index it as a 3D array. Or you can remove the singleton dimension with the function squeeze(), i.e. isequal(squeeze(alt_coef_orig), alt_coeff) is true. The problem is that squeeze takes time. alt_coeff is 2D because Matlab automagically removes trailing singleton dimensions in no time. (@Simon Chan proposes that you change the order of the dimensions.)
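
The difference in shape is easy to see with size(). A minimal sketch, assuming small deterministic data and the names f_n_m_orig (original order) and f_n_m (reordered) used above:

```matlab
N = 3; M = 4; Nx = 5;
f_n_m_orig = reshape(1:Nx*(N+1)*(M+1), Nx, N+1, M+1);   % (Nx, N+1, M+1)
f_n_m      = permute(f_n_m_orig, [2 3 1]);              % (N+1, M+1, Nx)
jj = 2;

alt_coef_orig = f_n_m_orig(jj, :, :);   % leading singleton dimension is kept
alt_coeff     = f_n_m(:, :, jj);        % trailing singleton dimension is dropped

size_orig = size(alt_coef_orig);        % [1 4 5]
size_alt  = size(alt_coeff);            % [4 5]

same_after_squeeze = isequal(squeeze(alt_coef_orig), alt_coeff);
```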

Some comments on your code

f_n_m = rand(Nx,N+1,M+1); The order of the dimensions looks arbitrary to me. Is there a good reason for this order?
rand() is not a good choice to make test data in an early stage of code development. That's because it makes it difficult to judge whether the code produces the expected result.
Why create coeff at all? In your example it's only used to supply a scalar value to the calculation of f(jj). That scalar value should be possible to extract with proper indexing of f_n_m.
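
For the last point, one possible way to get the scalar without building coeff is sketched below. It assumes the original (Nx,N+1,M+1) layout, small deterministic data, and that j never exceeds (N+1)*(M+1); the subscripts r and s are names introduced here for illustration.

```matlab
N = 3; M = 4; Nx = 5;
f_n_m = reshape(1:Nx*(N+1)*(M+1), Nx, N+1, M+1);

all_match = true;
for j = 1:Nx
    % What the original code does: build the slice, then index it linearly
    coeff = squeeze(f_n_m(j, :, :));
    via_coeff = coeff(j);

    % Direct extraction: translate the linear index j into (r, s)
    [r, s] = ind2sub([N+1, M+1], j);
    direct = f_n_m(j, r, s);

    all_match = all_match && (via_coeff == direct);
end
```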

Matthew Kehoe on 24 Jul 2021

Edited: Matthew Kehoe on 25 Jul 2021

Thanks for the helpful information (and comments). To address the comments -

First Comment: All of my Matlab code creates dimensions of order (Nx,N+1,M+1). This is arbitrary and has no meaning. I consistently create for loops where Nx (the largest value) is before N and M which are smaller. A local test

M = 20;
N = 20;
Nx = 32;
f_n_m = rand(N+1,M+1,Nx);
f_n_m_2 = rand(Nx,N+1,M+1);
ntests = 10000;
% Method 1
tic
for ii = 1:ntests
    for j=1:Nx
      j_n_m = f_n_m(:,:,j);
    end
end
toc
% Method 2
tic
for ii = 1:ntests
    for j=1:Nx
      j_n_m_2 = f_n_m_2(j,:,:);
    end
end
toc
% Method 1 is faster

shows that Method 1 is faster. I should review all of my code and update the double/triple for loops so that the loop goes through the column last (I think this is how column-major works. I will look further into this tomorrow as it is 1:00am here and I need to sleep).

Second Comment: I agree that rand()/randi() is inefficient. It isn't part of my production code and was only created for putting code in the forum. My real code is thousands of lines long (with multiple functions). I don't create data through rand().

Third Comment: There is no need to create coeff. Writing

for j=1:Nx
  if(SumType==1)
    % This is what my production code calls. I didn't show what
    % taylorsum_2_coeff, padesum2, padesum2_safe do as that would be too much
    % detail and would make the question "not self-contained."
    f(j) = taylorsum_2_coeff(f_n_m(:,:,j),Eps,delta,N,M);
  elseif(SumType==2)
    f(j) = padesum2(f_n_m(:,:,j),Eps,delta,N,M);
  else
    f(j) = padesum2_safe(f_n_m(:,:,j),Eps,delta,N,M);
  end
end
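
If the data still arrives in (Nx,N+1,M+1) order, one option is to reorder it once before the loop and to pick the summation routine once as well. The sketch below is only an illustration: the anonymous functions are hypothetical stand-ins for the real summation routines (which are not shown in this thread), and the sizes are assumptions.

```matlab
N = 3; M = 4; Nx = 5;
epsilon = 0.3; delta = 0.4; SumType = 2;
f_n_m = reshape(1:Nx*(N+1)*(M+1), Nx, N+1, M+1);   % original layout

% Reorder once so that each j owns a contiguous (N+1)-by-(M+1) page
pages = permute(f_n_m, [2 3 1]);

% Choose the routine once; these handles are hypothetical placeholders
switch SumType
    case 1
        sumfun = @(c, e, d, N, M) sum(c(:))*e*d;
    case 2
        sumfun = @(c, e, d, N, M) 2*sum(c(:))*e*d;
    otherwise
        sumfun = @(c, e, d, N, M) 3*sum(c(:))*e*d;
end

f = zeros(Nx, 1);
for j = 1:Nx
    f(j) = sumfun(pages(:, :, j), epsilon, delta, N, M);
end
```

Whether the one-time permute pays off depends on how often the array is reused; profiling the real code is the only reliable check.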

is faster. I will review (and test locally) changing (Nx,N+1,M+1) to (N+1,M+1,Nx) as my production code has a large number of for loops that loop through something similar to

for j=1:Nx
  for n=1:N
    for m=1:M
      % Do a bunch of stuff and eventually get to
      f_n_m = (j,n+1,m+1);
    end
  end
end

or

for j=1:Nx
   f_n_m = (j,:,:);
end

which may be inefficient as the index j should go in the third dimension. It is interesting to observe that in this setup

M = 20;
N = 20;
Nx = 32;
Gnm = rand(N+1,M+1,Nx);
Gnm_2 = rand(Nx,N+1,M+1);
ntests = 10000;
% Method 1
tic
for ii = 1:ntests
    for n=0:N
      for m=0:M
          f_n_m = Gnm(n+1,m+1,:);
      end
    end
end
toc
% Method 2
tic
for ii = 1:ntests
    for n=0:N
      for m=0:M
          f_n_m_2 = Gnm_2(:,n+1,m+1);
      end
    end
end
toc
% Method 2 is faster even though it is of size (Nx,N+1,M+1)
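
This result looks consistent with column-major storage rather than contradicting it: in this test the colon sits in the first dimension of Gnm_2, so the requested elements are neighbours in memory. A minimal sketch that computes the memory stride of both access patterns with sub2ind (the particular n and m are arbitrary assumptions):

```matlab
N = 20; M = 20; Nx = 32;
n = 3; m = 5;                       % any fixed pair of indices will do
ones_n = repmat(n+1, 1, Nx);
ones_m = repmat(m+1, 1, Nx);

% Linear indices touched by Gnm(n+1,m+1,:) in an (N+1,M+1,Nx) array
idx1 = sub2ind([N+1, M+1, Nx], ones_n, ones_m, 1:Nx);

% Linear indices touched by Gnm_2(:,n+1,m+1) in an (Nx,N+1,M+1) array
idx2 = sub2ind([Nx, N+1, M+1], 1:Nx, ones_n, ones_m);

stride1 = unique(diff(idx1));       % (N+1)*(M+1) = 441 elements apart
stride2 = unique(diff(idx2));       % 1, i.e. contiguous
```

So the better layout depends on which dimension the code sweeps with the colon most often.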

Answer 2

Simon Chan on 23 Jul 2021

Edited: Simon Chan on 23 Jul 2021

The loop of finding the coefficient can be entirely replaced by:

new_coeff = permute(f_n_m,[2 3 1]);

Notice that the size of new_coeff is 31x31x32 and new_coeff(:,:,1) is equivalent to your coeff when j = 1.
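
The equivalence can be verified for every j, not only j = 1. A minimal sketch using the sizes from the question and deterministic data in place of rand:

```matlab
M = 30; N = 30; Nx = 32;
f_n_m = reshape(1:Nx*(N+1)*(M+1), Nx, N+1, M+1);

new_coeff = permute(f_n_m, [2 3 1]);   % dimensions become (N+1, M+1, Nx)
new_size = size(new_coeff);            % [31 31 32]

all_pages_match = true;
for j = 1:Nx
    % Reference: the coeff matrix the original double loop builds for this j
    coeff = squeeze(f_n_m(j, :, :));
    all_pages_match = all_pages_match && isequal(new_coeff(:, :, j), coeff);
end
```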

Matthew Kehoe on 25 Jul 2021

This also works.
