Multiplying matrices as cells - 4 dimensional with variable no. of matrices (code works now!)

%I'm trying to work with 4 dimensional cells containing matrices that need multiplying together.
%I have three matrices, A_1, A_2 & B, which are multiplied together to get Y. Each matrix - A_1, A_2 & B has different values depending on frequency, f.
%I've done this the long way first - with three separate frequency vectors: -
%% 1ST METHOD
f1 = 20
f2 = 500
f3 = 10000
%Then building the matrices, A_1, A_2 & B: -
A_1_f1 = [(f1 + 0) (f1 + 1);(f1 + 2) (f1 + 3)]
A_1_f2 = [(f2 + 0) (f2 + 1);(f2 + 2) (f2 + 3)]
A_1_f3 = [(f3 + 0) (f3 + 1);(f3 + 2) (f3 + 3)]
A_2_f1 = [(f1 + 1) (f1 + 2);(f1 + 3) (f1 + 4)]
A_2_f2 = [(f2 + 1) (f2 + 2);(f2 + 3) (f2 + 4)]
A_2_f3 = [(f3 + 1) (f3 + 2);(f3 + 3) (f3 + 4)]
B1 = [(f1 + 6);(f1 + 7)]
B2 = [(f2 + 6);(f2 + 7)]
B3 = [(f3 + 6);(f3 + 7)]
%And simple multiplication: -
Y1 = A_1_f1*A_2_f1*B1
Y2 = A_1_f2*A_2_f2*B2
Y3 = A_1_f3*A_2_f3*B3
%RESULTS: -
%Y1 = [48966;53738],
%Y2 = [509543046;511579178],
%Y3 = [4.0038e+12;4.0046e+12]
%THAT'S THE 1ST LONG WAY!
%% 2ND METHOD
%Then I've successfully reduced this so that f1, f2 & f3 become the vector, f: -
f = [20 500 10000]
A_1(1,1,:) = f + 0
A_1(1,2,:) = f + 1
A_1(2,1,:) = f + 2
A_1(2,2,:) = f + 3
A_2(1,1,:) = f + 1
A_2(1,2,:) = f + 2
A_2(2,1,:) = f + 3
A_2(2,2,:) = f + 4
B(1,1,:) = f + 6
B(2,1,:) = f + 7
ref(1,1,:) = f
ref(2,1,:) = f %(this is just a reference matrix for the next part)
Y = arrayfun(@(ind)A_1(:,:,ind)*A_2(:,:,ind)*B(:,:,ind)...
,1:size(ref,3),'uniformOutput',false);
Y = cat(3, Y{:});
%RESULS: -
%Y =
% val(:,:,1) =
% 48966
% 53738
% val(:,:,2) =
% 509543046
% 511579178
% val(:,:,3) =
% 4.0038e+12
% 4.0046e+12
%THAT'S THE MEDIUM - STILL POTENTIALLY LONG WAY - DEPENDING ON HOW MANY 'A' MATRICES THERE ARE.
%% 3RD METHOD
%This is good, but now I need to introduce more 'A' matrices, like A_3, A_4, A_5 etc.
%I've tried using 4-D cells, so that A_1, A_2, A_3, A_4... become just A, like this: -
A(1,1,1,:) = f + 0
A(1,2,1,:) = f + 1
A(2,1,1,:) = f + 2
A(2,2,1,:) = f + 3
A(1,1,2,:) = f + 1
A(1,2,2,:) = f + 2
A(2,1,2,:) = f + 3
A(2,2,2,:) = f + 4
%but that's where I get stuck. I'm thought a for loop method would work, but I couldn't get it work. I basically don't know how to multiply these out so I get the same results as above.
%If anyone knows how to do this that would be great - many thanks in advance.

 Accepted Answer

As you have only a limited number of values for f, you should avoid using complicated 3D or 4D arrays (indexing them is not always faster than looping indexing based on simpler 2D arrays), or solutions based on ARRAYFUN, and go for a much simpler solution based on a cell array for storing Y's.
Also, your matrices are quite simple so you could define them in a simpler way that you are doing above, i.e. you seem to have (for any given value of f):
A1 = f + [0 1; 2 3]
A2 = A1 + 1
B = f + [6; 7]
Using a cell array for storing results Y, you could implement the following:
f = [20 500 10000] ;
n = numel(f) ;
Y = cell(n,1) ;
for k = 1 : n
A1 = f(k) + [0 1; 2 3] ;
Y{k} = A1 * (A1 + 1) * (f(k) + [6; 7]) ;
end
Solutions could be then accessed as follows:
Y{1}
ans =
48966
53738
Y[2}
...
etc
Note that this solutions is flexible, as the size of the cell array and the upper bound of the loop adapt to the number of elements in the f vector. Also, I assumed that you just need Y's in the end and not successive values of A1, A2, and B. You could use a few extra cell arrays if you needed to store them.

10 Comments

Hi thanks for the response. I should have said that the code is simplified; I'll actually have f as quite large, maybe 1 to 2000 in increments of 1 (or bigger if it won't run it).
Also the equations used to get the matrix values are simplified - these will be much more complicated, but I do have a function for that, which I'll call each time it requires a new matrix.
@Tom: Please do not post simplified code without explaining this detail.
So you are looking for a solution that is more efficient than
f = [20 500 10000] ;
n = numel(f) ;
Y = cell(n,1) ;
for k = 1 : n
A1 = get_A1(f((k)) ;
A2 = get_A2(f((k)) ;
B = get_B(f((k)) ;
Y{k} = A1 * A2 * B ;
end
with n roughly in the range 1e3-1e4, A1 and A2 2x2 matrices (larger?), and B 2x1, all returned by external functions?
If so, we would need to know a bit more about these functions.
I've been using this simplified version of my code just to get this process working. I'm getting there. Basically, 'A' represents something that is divided up into sections - hence, A1, A2, A3 etc. I need to investigate what happens when I use more or less sections.
I think it would make more sense to deal with the 'A' matrices alone - multiplying them all together to create one 2x2 matrix called 'A'. Then just multiply that by B.
Here's my progress: -
f = [1 4 10]
A(1,1,1,:) = f + 0
A(1,2,1,:) = f + 1
A(2,1,1,:) = f + 2
A(2,2,1,:) = f + 3
A(1,1,2,:) = f + 1
A(1,2,2,:) = f + 2
A(2,1,2,:) = f + 3
A(2,2,2,:) = f + 4
A(1,1,3,:) = f + 2
A(1,2,3,:) = f + 3
A(2,1,3,:) = f + 4
A(2,2,3,:) = f + 5
B(1,1,1,:) = f + 6
B(2,1,1,:) = f + 7
ref(1,1,1,:) = f
ref(1,2,1,:) = f %(this is just a reference matrix for the next part)
ref(2,1,1,:) = f
ref(2,2,1,:) = f
n = 3
Y(:,:,1,:) = A(:,:,1,:)
for k = 1:(n - 1)
Y(:,:,1,:) = arrayfun(@(ind) Y(:,:,1,ind).*A(:,:,k + 1 ,ind)...
,1:size(ref,4),'uniformOutput',false);
end
Y = cat(4, Y{:});
As you can tell - it doesn't work. But hopefully you can see what I'm trying to do and help me to tweak it so it works.
I've managed to get it to do what I want, but only with one frequency value. The problem seems to be that the matrices will only multiply out properly if '*' is used to multiply them - not '.*'. But then as soon as more than one frequency value, f, is used, it will not multiply the matrices unless '.*' is used, which does not result in proper matrix multiplication.
The following gives me a 2x2 matrix of [95 118;211 262], which is correct, but I need it to work for multiple values of f.
%f = [1 4 10]
f = 1;
A(1,1,1,:) = f + 0;
A(1,2,1,:) = f + 1;
A(2,1,1,:) = f + 2;
A(2,2,1,:) = f + 3;
A(1,1,2,:) = f + 1;
A(1,2,2,:) = f + 2;
A(2,1,2,:) = f + 3;
A(2,2,2,:) = f + 4;
A(1,1,3,:) = f + 2;
A(1,2,3,:) = f + 3;
A(2,1,3,:) = f + 4;
A(2,2,3,:) = f + 5;
B(1,1,1,:) = f + 6;
B(2,1,1,:) = f + 7;
ref(1,1,1,:) = f;
ref(1,2,1,:) = f; %(this is just a reference matrix for the next part)
ref(2,1,1,:) = f;
ref(2,2,1,:) = f;
n = 3;
Y(:,:,1,:) = A(:,:,1,:);
for k = 1:(n - 1)
Y(:,:,1,:) = Y(:,:,1,:)*A(:,:,k + 1,:);
end
It is actually difficult to understand why you need a 4D array, or a 3D defined element by element using a ':' sub for a 4th singleton dimension.
I think that it would be more productive if you would define exactly what it is that you want to to, because reasoning on a simplified problem where you define matrices component by component using a syntax which doesn't make a lot of sense (e.g. A(2,2,3,:) = f + 5;) is not ideal for us to propose a solution.
Hi, yes I see that now - I apologize if I've wasted anyone's time. Actually doing it in that way was initially helping me to get my head around exactly what was going on with 4D arrays. Anyway, I've made some progress, and here's my real code. The code compares two methods of modelling the impedance of an old style megaphone. The analytical method is from 'kappa' down, and the main code before that basically works by dividing the shape into cylindrical sections - each with slightly different radius from one end to the other.
rho0 = 1.21; %approx. air density (kg/m^3)
c = 343; %approx. speed of sound (m/s)
x = 0.0774; %length (m)
r_T = 0.0016; %small end radius (m)
r_M = 0.0273; %large end radius (m)
S_T = pi*r_T^2; %small end surface area (m^2)
S_M = pi*r_M^2; %large end surface area (m^2)
f_min = 1; %min. freq. (Hz)
f_max = 20000; %max. freq. (Hz)
n_f = 20000; %no. of frequency increments
f = linspace(1,f_max,n_f)'; %frequency range (Hz)
omega = f*2*pi; %angular frequency range (s^-1)
lambda = c./f; %wavelength range (m)
k = 2*pi./lambda; %wave number
n_Du = 500; %no. of short cylindrical sections used (INDEPENDENT VARIABLE)
x_Du = x/n_Du; %length of each section (m)
Z0 = rho0*c; %characteristic impedance of air
Z0_A = rho0*c/S_T; %characteristic impedance of air in the acoustic domain
b = (r_M/r_T)^(1/(n_Du - 1)); %exponential growth factor
r_Du = r_T*b.^((1:n_Du)-1); %radius of each subsequent duct section (m)
S_Du = pi.*r_Du.^2; %surface area of each subsequent section (m^2)
x_M = 2.*k*r_M; %Throat Bessel function subject (1st order) (2ka)
R_1 = 1 - 2.*besselj(1,x_M)./x_M; %piston resistance function
X_1 = 2.*StruveH1(x_M)./x_M; %piston reactance function (K&F,2009,p.186)
Z_r = Z0_A.*(R_1 + 1i.*X_1); %radiation load at large end
Z_rMX(1,1,:) = Z_r; %2x1 matrix version of radiation impedance (just Z_r at the top...)
Z_rMX(2,1,:) = ones(length(f),1); %2x1 matrix version of radiation impedance (...& 1 at the bottom)
MX21_ref(1,1,:) = ones(length(f),1); %[2;1] reference matrix just for sizing
MX21_ref(2,1,:) = ones(length(f),1); %[2;1] reference matrix just for sizing
Z_DuMXi = zeros(2,2,n_Du,n_f); %preallocate 2x2 matrix for impedance of each duct piece
for i = 1:n_f
for j = 1:n_Du
Z_DuMXi(1,1,j,i) = cos(2*pi*f(i)*x_Du/343);
Z_DuMXi(1,2,j,i) = sin(2*pi*f(i)*x_Du/343)*1i*415/S_Du(j);
Z_DuMXi(2,1,j,i) = sin(2*pi*f(i)*x_Du/343)*1i*S_Du(j)/415;
Z_DuMXi(2,2,j,i) = cos(2*pi*f(i)*x_Du/343);
end
end
Z_DuMX(:,:,1,:) = Z_DuMXi(:,:,1,:);
for l = 1:n_f
for m = 1:(n_Du - 1)
Z_DuMX(:,:,1,l) = Z_DuMX(:,:,1,l)*Z_DuMXi(:,:,m + 1,l);
end
end
Z_DuMX = squeeze(Z_DuMX); %reduce size of matrix
Z_TMX = arrayfun(@(ind) Z_DuMX(:,:,ind)*Z_rMX(:,:,ind),1:size(MX21_ref,3),'uniformOutput',false);%mult 2-ports
Z_TMX = cat(3, Z_TMX{:}); %convrt cell to 3D array
Z_T = arrayfun(@(ind) Z_TMX(1,1,ind)./Z_TMX(2,1,ind),1:size(MX21_ref,3),'uniformOutput',false);
Z_T = cat(1, Z_T{:}); %reduce 2 by 1 matrix to single value
R_T = real(Z_T);
X_T = imag(Z_T);
figure(2); plot(f,R_T,'r','LineWidth',2); hold on;
figure(2); plot(f,X_T,'g','LineWidth',2); hold on;
%figure(3); plot(f,R_T); hold all;
%figure(3); plot(f,X_T); hold off;
kappa = (k.^2 - beta^2).^0.5;
theta = atan(beta./kappa);
Z_T_ii = (rho0*c/S_T).*((Z_r.*cos(kappa*x + theta) + 1i.*(rho0*c/S_M).*sin(kappa*x))./...
(rho0*c/S_M.*cos(kappa*x - theta) + 1i.*Z_r.*sin(kappa*x)));
R_T_ii = real(Z_T_ii);
X_T_ii = imag(Z_T_ii);
figure(2); plot(f,R_T_ii,'b'); hold on;
figure(2); plot(f,X_T_ii,'m'); hold off;
Its taking about 30 seconds to do this and it'd be great to get this time down. I was previously using a function to build my matrices, but because I have two dependent variables now I can't see how to do it. Before I introduced the varying surface areas of subsequent sections, I was using a function like this: -
function [chain] = pipe(rho0,c,l,S,f)
kl = 2*pi*f*l/c;
%build matrix
chain(1,1,:) = cos(kl);
chain(1,2,:) = sin(kl)*1i*rho0*c/S;
chain(2,1,:) = sin(kl)*1i*S/(rho0*c);
chain(2,2,:) = cos(kl);
I tried doing it with S as a vector of values - rather than just one - and using
chain(1,1,:,:)
etc.
but then I thought how do I tell MATLAB that the third place is for S and the fourth place is for f?
No problem; thank you for the full code, it is easier to understand now. Do you know how to use the profiler in MATLAB? If not, type
profile viewer
in the command window; it will open a profiling tool. Assuming that the current path is the folder that contains your script/M-file, type the name of this script in the field labeled "Run this code" at the top of the profiling window (or the function call if it is a function), and click on [Start Profiling]. You will obtain a report after the run and you can click on your script name in the report table to get the detail.
You will see that more than 60% of the time is spent on the line
Z_DuMX(:,:,1,l) = Z_DuMX(:,:,1,l)*Z_DuMXi(:,:,m + 1,l);
and that most of the rest of the time is spent in the loop that defines Z_DuMXi. These are therefore what needs to be optimized in priority.
About the line defining Z_DuMX, at each iteration you are computing a matrix product of two 2x2 matrices.. is it correct? This would be difficult to optimize, and might require that you work with arrays of complex numbers instead of arrays of real and imaginary parts.
To optimize the block defining Z_DuMXi, you can vectorize calls to SIN/COS as follows (you'll have to test that it really works the way you want though, and fine tune if necessary): after the block
for i = 1:n_f
for j = 1:n_Du
Z_DuMXi(1,1,j,i) = cos(2*pi*f(i)*x_Du/343);
Z_DuMXi(1,2,j,i) = sin(2*pi*f(i)*x_Du/343)*1i*415/S_Du(j);
Z_DuMXi(2,1,j,i) = sin(2*pi*f(i)*x_Du/343)*1i*S_Du(j)/415;
Z_DuMXi(2,2,j,i) = cos(2*pi*f(i)*x_Du/343);
end
end
insert
[f_mesh, S_Du_mesh] = meshgrid(f, S_Du) ;
k_mesh = 2*pi*f_mesh*x_Du/343 ;
Z_DuMXi2(1,1,:,:) = cos(k_mesh);
Z_DuMXi2(2,2,:,:) = Z_DuMXi2(1,1,:,:) ;
Z_DuMXi2(1,2,:,:) = sin(k_mesh)*1i*415./S_Du_mesh;
Z_DuMXi2(2,1,:,:) = sin(k_mesh)*1i.*S_Du_mesh/415;
all(Z_DuMXi(:) == Z_DuMXi2(:)) ; % Test.
The call to ALL is testing whether Z_DuMXi2's elements are equal to Z_DuMXi's. It outputs 1 (true), which seems to indicate that the vectorized version works. Once you are sure that it works, you can replace the first version with the second, and rename Z_DuMXi2 into Z_DuMXi.
If you time the two blocks using TIC/TOC, you will see that the computation time goes from about 21s for executing the first nested loop to less than 1s for executing the vectorized version.
Now about the general approach, 4D arrays are not that easy to work with and they are much slower computing-wise than 2D arrays. The reason is that they are cut into frames that are not stored at contiguous locations in memory. Sometimes, just permuting dimensions so you work on the first two dimensions instead of e.g. 1 and 3, will reduce the computation time by a factor 1000. This to say that once you'll have vectorized everything (if possible), you'll hit a wall that will be defined by indexing operations in arrays that are larger than 2D.
That's great, many thanks Cedric - I've used the meshgrid version to lessen the time a bit, and the profile viewer is new to me too - very useful!
You're welcome! The profiler is a great tool; if you look at the number of calls of what takes time in your code, you'll see that no operation is inherently slow, but that you are repeating them millions times because you address/index arrays element per element. If you can vectorize these operations and reduce 10e6 loops 1 or 500 whether you fully vectorize or vectorize by row/column, you win.

Sign in to comment.

More Answers (0)

Products

Asked:

Tom
on 5 Apr 2013

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!