How to preallocate memory for storing data in same mat file?

Hi, I wrote the code below and would like to preallocate memory so that it runs faster. I know that once I preallocate I can no longer append and instead need to index into the array to store the output. Can you suggest how to do that for the code below?
Here the value of f is a 1*5449 double. Final output is 5449*5449 double.
clc;
n=1; %system order
m=1; %number of inputs
p=6;%number of outputs
Final = [];
for i = 1:7783
    for j = 1:50
        if exist(['ID_',num2str(i),'_file_',num2str(j),'_Variables','.mat'],'file')
            load(['ID_',num2str(i),'_file_',num2str(j),'_Variables','.mat']);
            A1 = A{1};
            A1 = A1 / max(abs(eig(A1)));
            B1 = B{1};
            C1 = C{1};
            index = 1;
            for k = 1:7783
                for l = 1:50
                    if exist(['ID_',num2str(k),'_file_',num2str(l),'_Variables','.mat'],'file')
                        load(['ID_',num2str(k),'_file_',num2str(l),'_Variables','.mat']);
                        A2 = A{1};
                        A2 = A2 / max(abs(eig(A2)));
                        B2 = B{1};
                        C2 = C{1};
                        f(index) = distance1_matlab(A1,A2,B1,B2,C1,C2);
                        index = index + 1;
                    end
                end
            end
            Final = [Final;f];
        end
    end
end
save('Distance','Final');
Sunny on 21 Oct 2018
Edited: Sunny on 21 Oct 2018
Thanks. I changed the program to this. I think this is faster. A is 10*10 double, B is 1*10 and C is 6*10. Now the cell arrays f, o and g are 1*5449.
clc;
n=10; %system order
m=1; %number of inputs
p=6;%number of outputs
Final = [];
k = 1;
for i = 1:7783
    for j = 1:50
        if exist(['ID_',num2str(i),'_file_',num2str(j),'_Variables','.mat'],'file')
            load(['ID_',num2str(i),'_file_',num2str(j),'_Variables','.mat']);
            f{k} = A{1};
            o{k} = B{1};
            g{k} = C{1};
            k = k+1;
        end
    end
end
save('Rescaled_A_Values_All_States','f');
save('Rescaled_B_Values_All_States','o');
save('Rescaled_C_Values_All_States','g');
for c = 1:5449
    A1 = f{c};
    A1 = A1 / max(abs(eig(A1)));
    B1 = o{c};
    C1 = g{c};
    index = 1;
    for d = 1:5449
        A2 = f{d};
        A2 = A2 / max(abs(eig(A2)));
        B2 = o{d};
        C2 = g{d};
        q(index) = distance1_matlab(A1,A2,B1,B2,C1,C2);
        index = index + 1;
    end
    Final = [Final;q];
end
Guillaume on 21 Oct 2018
Well, yes it's going to be much faster. You're reading each file only once. You're still doing N^2 unnecessary eigs and related calculations. And nearly 99% of the files you test for existence don't exist, so it'd be faster to do a dir so the OS just tells you which files are there.
Finally, depending on what distance1_matlab does, it may well be that your 2nd loop is not needed.
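As a rough illustration of the first two points, the sketch below normalises each A once (N eig calls rather than N^2) and fills a preallocated matrix by index instead of appending rows. The small matrices are stand-in data, and pairDistance is a hypothetical placeholder for distance1_matlab, whose contents are not shown in this thread; it is assumed to return a scalar.

```matlab
% Sketch only: deterministic stand-ins for the f, o and g cell arrays.
N = 4;                                    % 5449 in the thread
f = arrayfun(@(k) magic(3) + k*eye(3), 1:N, 'UniformOutput', false);
o = arrayfun(@(k) k*(1:3),             1:N, 'UniformOutput', false);
g = arrayfun(@(k) k*ones(2, 3),        1:N, 'UniformOutput', false);

% Hypothetical placeholder for distance1_matlab (assumed scalar output).
pairDistance = @(A1, A2, B1, B2, C1, C2) ...
    norm(A1 - A2, 'fro') + norm(B1 - B2) + norm(C1 - C2, 'fro');

% Normalise every A once, outside the double loop.
fn = cellfun(@(A) A / max(abs(eig(A))), f, 'UniformOutput', false);

% Preallocate the full result, then store each value by index.
Final = zeros(N, N);
for c = 1:N
    for d = 1:N
        Final(c, d) = pairDistance(fn{c}, fn{d}, o{c}, o{d}, g{c}, g{d});
    end
end
```

Row c of Final here plays the role of the vector q that the loop above appends on each pass.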


Accepted Answer

Guillaume on 21 Oct 2018
Depending on what distance1_matlab does, this code could be significantly improved.
I'm also assuming that all files that match the pattern 'ID_*_file_*_Variables.mat' need to be loaded.
filelist = dir('ID_*_file_*_Variables.mat'); %get list of files that exist
fileids = regexp({filelist.name}, 'ID_(\d+)_file_(\d+)_', 'tokens', 'once'); %extract numeric ids as text
fileids = str2double(vertcat(fileids{:})); %and convert to numeric
%you may want to sort fileids and filelist to match the order of your original loops
%it's trivial to do. For now I assume it does not matter.
filedata = struct('A', cell(numel(filelist), 1), 'B', [], 'C', []); %preallocate structure to receive file content and final result
%note that A, B and C are very poor field names.
for fileiter = 1:numel(filelist)
    filecontent = load(filelist(fileiter).name);
    filedata(fileiter).A = filecontent.A{1} / max(abs(eig(filecontent.A{1})));
    filedata(fileiter).B = filecontent.B{1};
    filedata(fileiter).C = filecontent.C{1};
end
[cartprod1, cartprod2] = ndgrid(filedata); %cartesian product of all files with themselves
distance = arrayfun(@(s1, s2) distance1_matlab(s1.A, s2.A, s1.B, s2.B, s1.C, s2.C), cartprod1, cartprod2); %assumes that the result of distance1_matlab is scalar
Note that that last line assumes distance1_matlab returns a scalar. If not, change it to:
distance = arrayfun(@(s1, s2) distance1_matlab(s1.A, s2.A, s1.B, s2.B, s1.C, s2.C), cartprod1, cartprod2, 'UniformOutput', false);
If you want the result in the same form as your original Final, then:
distance = distance(:); %if distance1_matlab returns a scalar
distance = vertcat(distance{:}); %otherwise
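A possible variant of the pairing step, sketched under assumptions: it builds the grid from indices rather than from the struct array itself, which sidesteps any question of whether ndgrid accepts non-numeric input in a given release. The filedata contents are stand-in values, and pairDistance is a hypothetical placeholder for distance1_matlab, assumed to return a scalar.

```matlab
% Sketch only: small stand-in struct array in place of the loaded files.
N = 3;
filedata = struct('A', cell(N, 1), 'B', [], 'C', []);
for k = 1:N
    filedata(k).A = eye(2) / k;
    filedata(k).B = [k, 1];
    filedata(k).C = k * ones(2);
end

% Hypothetical placeholder for distance1_matlab (assumed scalar output).
pairDistance = @(s1, s2) ...
    norm(s1.A - s2.A, 'fro') + norm(s1.B - s2.B) + norm(s1.C - s2.C, 'fro');

% Pair up indices rather than the structs themselves.
[row, col] = ndgrid(1:N);
distance = arrayfun(@(r, c) pairDistance(filedata(r), filedata(c)), row, col);
% distance(r, c) compares file r with file c, so it is an N-by-N matrix.
```

Because arrayfun returns an array the same size as row and col, the output comes back as an N-by-N matrix without any appending.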
Sunny on 26 Oct 2018
@Guillaume
Can I use parfor instead of for to speed up execution with parallel processing? Do the loops synchronize?
Guillaume on 26 Oct 2018
I doubt that using parfor for the loading loop would help much. The slow part of that is not the processor but the disk access. If anything, it's possible that parfor will slow things down as parallel threads compete for disk access. You'll only know if you try.
I don't know if the parallel toolbox can parallelise arrayfun (I don't have the toolbox). arrayfun is a for loop in disguise. Parallelising that code could certainly result in a speed-up.
However, as I've said (twice now), depending on what distance1_matlab does, it's likely that this 2nd loop/arrayfun is not needed at all and that the function can be vectorised. This would probably be the most efficient way to improve your code. Hence why I asked for the details of this function.
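For anyone who wants to try the parallel route on the distance step, here is a minimal sketch with the arrayfun written out as an explicit loop over rows. On the synchronisation question: parfor iterations must be independent of each other, they run in no guaranteed order, and the statement after the loop only executes once every iteration has finished. Without the Parallel Computing Toolbox, MATLAB runs the loop serially. The data and pairDistance (a hypothetical placeholder for distance1_matlab, assumed to return a scalar) are stand-ins.

```matlab
% Sketch only: small stand-in struct array in place of the loaded files.
N = 3;
filedata = struct('A', cell(N, 1), 'B', [], 'C', []);
for k = 1:N
    filedata(k).A = eye(2) / k;
    filedata(k).B = [k, 1];
    filedata(k).C = k * ones(2);
end

% Hypothetical placeholder for distance1_matlab (assumed scalar output).
pairDistance = @(s1, s2) ...
    norm(s1.A - s2.A, 'fro') + norm(s1.B - s2.B) + norm(s1.C - s2.C, 'fro');

distance = zeros(N, N);            % preallocated; each iteration writes one row
parfor r = 1:N
    rowvals = zeros(1, N);         % temporary local to this iteration
    for c = 1:N
        rowvals(c) = pairDistance(filedata(r), filedata(c));
    end
    distance(r, :) = rowvals;      % sliced assignment, independent per iteration
end
```

Whether this beats the serial version depends on how expensive each distance call is relative to the cost of sending filedata to the workers, so it is worth timing both.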

Release

R2018b
