Faster way of appending data to large struct array

I have 700k files of velocities (vx); each file holds an array of size 16 (x) by 77 (y). Here is my code to create a large struct array so that I can access the data and then perform statistics at each x and y location. In the end I want a time series of the data for each x and y location, so at each x and y I will have vx (velocity) for all 700k time instances.
I have read many recommendations on here saying that struct and cell arrays are not efficient or fast for storing and accessing data, and that plain numeric arrays or vectors should be used instead. But I don't know how to implement that for my case. The bottleneck is the growing size of the array. For 7k samples it takes 151 seconds, but for the real 700k data set it takes days to process!
Can someone please recommend a speed improvement to the following:
ref_x = 16;
ref_y = 77;
NoFiles = 7000; % but for real case I have 700000
stats = [];
%preallocate struct
stats = struct('vx',zeros(NoFiles, ref_x, ref_y));
tic
for i = 1: NoFiles
    % code that loads file
    % each new file has new.vx data
    new.vx = rand(16,77);
    for k = 1: ref_x
        if i == 1
            stats(k).vx = new.vx(k,:);
        else
            stats(k).vx = cat(1,stats(k).vx,new.vx(k,:));
        end
    end
end
toc
Elapsed time is 37.361091 seconds.
% more code here to reshape arrays in desired format.

Accepted Answer

Matt J
Matt J on 27 Mar 2023
Edited: Matt J on 27 Mar 2023
That is quite a bit of data. Even in single precision, the array will consume ~3 GB. This should be faster, though.
stats=nan(ref_x,ref_y,NoFiles,'single');
tic
for i = 1: NoFiles
    % code that loads file
    % each new file has new.vx data
    stats(:,:,i) = new.vx;
end
toc
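As a rough sketch of the step that follows (not part of the original answer): once the 3-D array is filled, the time series at a location is a slice along the third dimension, and per-location statistics can be taken along that same dimension. The sizes come from the question. The random data, the indices ix and iy, and the names bytesNeeded, vxSeries, vxMean and vxStd are placeholders introduced here for illustration.

```matlab
ref_x = 16;
ref_y = 77;
NoFiles = 7000;   % 700000 in the real case

% Estimated memory for the array in single precision (4 bytes per element)
bytesNeeded = ref_x * ref_y * NoFiles * 4;
fprintf('Approximate size: %.2f GB\n', bytesNeeded / 1e9);

% Preallocate once, then fill one page per file
stats = nan(ref_x, ref_y, NoFiles, 'single');
for i = 1:NoFiles
    new.vx = rand(ref_x, ref_y);   % placeholder for the data loaded from file i
    stats(:, :, i) = new.vx;
end

% Time series of vx at one (x, y) location: a NoFiles-by-1 column vector
ix = 3;  iy = 10;                  % hypothetical location of interest
vxSeries = squeeze(stats(ix, iy, :));

% Per-location statistics over time, computed along the third dimension
vxMean = mean(stats, 3, 'omitnan');     % ref_x-by-ref_y
vxStd  = std(stats, 0, 3, 'omitnan');   % ref_x-by-ref_y
```

With this layout no reshaping of a struct array is needed afterwards. Any location's history is available by indexing, at the cost of holding the whole array in memory.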
  3 Comments
Matt J
Matt J on 27 Mar 2023
Seems like the difference is non-impactful in this context.
N=3000;
E=rand(100);
stats=nan([size(E),N],'single');
tic
for i = 1: N
    % code that loads file
    % each new file has new.vx data
    stats(:,:,i) = E;
end
toc
Elapsed time is 0.045564 seconds.
stats=inf([size(E),N],'single');
tic
for i = 1: N
    % code that loads file
    % each new file has new.vx data
    stats(:,:,i) = E;
end
toc
Elapsed time is 0.047948 seconds.
Peter Manovski
Peter Manovski on 27 Mar 2023
Edited: Peter Manovski on 27 Mar 2023
Thank you, that works way better! And yes, a high-speed camera can generate a lot of data quickly!
In my case I prefer NaN, because when I compute statistics, zeros or other fill values can bias the data.
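To illustrate the point about bias, here is a small sketch with made-up numbers. It assumes one of three slots was never filled, for example because a file failed to load. The variable names and the 2-by-2 frame are hypothetical.

```matlab
% Three "files" of a 2-by-2 field; the third slot never receives data
filledZeros = zeros(2, 2, 3, 'single');
filledNaN   = nan(2, 2, 3, 'single');

frame = single([1 2; 3 4]);        % made-up velocity data
for i = 1:2                        % only two of the three slots get data
    filledZeros(:, :, i) = frame;
    filledNaN(:, :, i)   = frame;
end

% With zeros, the empty slot is counted as a real measurement of 0
meanZeros = mean(filledZeros, 3);

% With NaN, the empty slot can be skipped explicitly
meanNaN = mean(filledNaN, 3, 'omitnan');

disp(meanZeros(1, 1));   % 2/3 of the true value, pulled toward zero
disp(meanNaN(1, 1));     % equals frame(1, 1)
```

Note that without the 'omitnan' flag, mean returns NaN for any location that still contains a NaN, which at least makes missing data visible rather than silently biasing the result.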

Release

R2021b
