Could someone please help me speed up my code?

I have code that will end up taking days to execute. The problem is that I have to pass a huge number of values to "mean" and a couple of other MATLAB functions. I have 12 .mat files to process, but I can't get through even one file in less than a few days. I really need help finding a way to speed everything up. To check the code, just name a file KCTDI001A.mat and store a 14002450x2 random matrix in it (the code reads the data from a variable named FileData).
clear
clc
addpath('C:\filepath');
AllFiles = [];
filenames = dir('C:\filepath');
profile clear
profile on
for ii = 3:length(filenames); % Start at third file (i.e., don't include "." and "..")
    %Get the filename
    filename_timestamp = filenames(ii).name;
    Index = findstr('A',filename_timestamp);
    n = str2num(filename_timestamp(6:Index(1)-1));
    File_Name = sprintf('KCTDI0%dA.mat',n);
    DataFile = load(File_Name);
    ACC_Data = DataFile.FileData(:,:);
    for k = 1:13978444;
        x = (ACC_Data(1+(k-1):24007+(k-1),:));
        RMS(k,:) = sqrt(mean(x(:,:).^2));
    end
    new_name = sprintf('KCTDI0%dA_RMS.mat',n);
    save(new_name,'RMS');
end
profile off
profile viewer
  5 Comments
Guillaume on 27 Jun 2016
Also, using addpath just so you can load files from a different directory is not good practice. This would be much better:
root = 'C:\filepath';
filenames = dir(root);
for ...
...
DataFile = load(fullfile(root, File_Name));
Also, note that your code will never load a file named KCTDI001A.mat (as you suggest creating) since, for n = 1, the name your sprintf generates is KCTDI01A.mat (one fewer 0).
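Both points can be sketched together as follows. This is only an illustration under assumptions: the folder is the placeholder path from the question, the wildcard pattern 'KCTDI*A.mat' and the three-digit zero padding are guesses based on the one file name mentioned (KCTDI001A.mat), and the variable name files is made up here.

```matlab
% Hypothetical sketch: folder, wildcard pattern and padding width are assumptions.
root = 'C:\filepath';
files = dir(fullfile(root, 'KCTDI*A.mat'));   % only matching files, so no '.' or '..' to skip
for ii = 1:numel(files)
    % Build the full path instead of relying on addpath
    DataFile = load(fullfile(root, files(ii).name));
    % ... process DataFile.FileData here ...
end

% %03d pads the number with zeros to three digits
n = 1;
File_Name = sprintf('KCTDI%03dA.mat', n);
```

With the %03d format, n = 1 produces KCTDI001A.mat, which matches the name suggested in the question.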
Tony Pate on 27 Jun 2016
The bottleneck is the "mean" function, so I would like to find an alternative to it, or an alternative to the overall approach. I need the root mean square of each window of data that I select, which is why I have the line "RMS(k,:) = sqrt(mean(x.^2));". If there is a quicker way to compute the RMS this many times, that would also work. I am just having a hard time running this program quickly with "mean" slowing things down so much.


Accepted Answer

Guillaume on 27 Jun 2016
See my comment to the question about the (:, :). The extra brackets in x = (ACC...) also do not help readability.
To speed up the loop you could certainly take out the squaring and the square root:
ACC_Data = DataFile.FileData.^2; %do the squaring only once
RMS = zeros(13978444, 2); %would be better if sizes were not hardcoded
for k = 1:13978444 %semicolon not needed
    RMS(k,:) = mean(ACC_Data(1+(k-1):24007+(k-1),:));
end
RMS = sqrt(RMS);
But this is not going to help much with speed because you're still sliding over lots of rows.
If you have MATLAB R2016a or newer, then you can use movmean to calculate the moving average without a loop:
RMS = sqrt(movmean(DataFile.FileData .^ 2, 24007, 1, 'EndPoints', 'discard'));
If not, you can simply do a convolution with a constant vector of the right length and value:
RMS = sqrt(conv2(DataFile.FileData .^ 2, ones(24007, 1) / 24007, 'valid'));
  2 Comments
Tony Pate on 27 Jun 2016
Thank you. I will try this instead and see if it speeds up my program enough. I am not familiar with "conv2" or "movmean", but I will research and find out if they do the calculations I need correctly.
Guillaume on 27 Jun 2016
Well, movmean is just a moving average and is exactly what you are doing, so yes, it does the calculation correctly.
A convolution with a constant function is also a moving average. Due to the way it's implemented, it may result in negligible differences (in the last few decimals only).
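One way to check this for yourself is to compare the loop, movmean and conv2 on a small array. The following is just a sketch: the array size and window length are made-up values chosen so it runs in a moment, the variable names are arbitrary, and movmean needs R2016a or newer.

```matlab
% Small synthetic data; sizes are assumptions chosen so the check runs quickly.
data = randn(1000, 2);
N = 7;                         % window length (24007 in the question)
K = size(data, 1) - N + 1;     % number of complete windows

% Reference: explicit loop, same computation as in the question
ref = zeros(K, 2);
for k = 1:K
    ref(k, :) = sqrt(mean(data(k:k+N-1, :).^2));
end

% Moving average along dimension 1, keeping only complete windows
viaMov = sqrt(movmean(data.^2, N, 1, 'Endpoints', 'discard'));

% Convolution with a constant kernel, keeping only complete windows
viaConv = sqrt(conv2(data.^2, ones(N, 1) / N, 'valid'));

% Largest absolute deviation from the loop result
maxDiffMov  = max(abs(ref(:) - viaMov(:)));
maxDiffConv = max(abs(ref(:) - viaConv(:)));
```

Both differences should be at the level of floating-point round-off, so compare with a tolerance rather than with isequal.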


More Answers (3)

Roger Stafford on 27 Jun 2016
It is the repeated 'mean' of 24007 elements at a time, taken 13978444 times, that is the time-consuming part of your computation. You would greatly increase your speed if you computed the column-wise cumulative sum of the squares of the elements and used that to compute the equivalent of the mean instead. There is of course a loss of accuracy over such a large number of cumulative sums, but perhaps that would be acceptable to you. If not, you could still break things up into overlapping cumulative blocks whose lengths are only relatively small multiples of 24007. You would still gain a lot of speed that way. Having to add almost the same set of numbers repeatedly in forming your means is bound to be inefficient.
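The cumulative-sum idea could look roughly like this. It is a minimal sketch on made-up small data, not a tested solution for the full 14-million-row files: the variable names are arbitrary, and the max(..., 0) guard is an extra precaution added here in case round-off makes a window sum slightly negative.

```matlab
% Small synthetic data; sizes are assumptions so the sketch runs quickly.
data = randn(1000, 2);
N = 7;                                        % window length (24007 in the question)

sq = data.^2;
% Prepend a row of zeros so that c(i+1,:) is the sum of the first i rows
c = [zeros(1, size(sq, 2)); cumsum(sq, 1)];
% Sum over rows k..k+N-1 is c(k+N,:) - c(k,:)
windowSum = c(N+1:end, :) - c(1:end-N, :);
% Guard against tiny negative values caused by round-off before the square root
rmsCum = sqrt(max(windowSum, 0) / N);
```

Each window then costs one subtraction per column instead of a sum over N elements, at the price of the accumulated round-off mentioned above.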

Jan Orwat on 27 Jun 2016
  1. If you have to use a loop, preallocate the variable RMS, because it seems to be changing size on every iteration. With 14M iterations that may take "ages".
  2. It looks like you are computing a moving average. Vectorize the code. Try movmean if you have MATLAB R2016a or newer. You can also do it via convolution, using conv/conv2, filter/filter2, fft, etc.
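As an illustration of the filter route, here is a minimal sketch on made-up small data (the sizes and variable names are assumptions). filter produces a running mean over the current and the previous N-1 rows, so the first N-1 output rows cover incomplete windows and are dropped.

```matlab
% Small synthetic data; sizes are assumptions so the sketch runs quickly.
data = randn(1000, 2);
N = 7;                                   % window length (24007 in the question)

% FIR filter with N equal coefficients, applied down each column
avgSq = filter(ones(N, 1) / N, 1, data.^2);
% Row N is the first one that averages a complete window (rows 1..N)
rmsFilt = sqrt(avgSq(N:end, :));
```

The result has one row per complete window, in the same order as the loop in the question.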

Thorsten on 27 Jun 2016
Edited: Thorsten on 27 Jun 2016
I found this to run much faster (about 23s on my machine): preallocate RMS_new, move the squaring and the division by N (to compute the mean) out of the loop, and then in each iteration subtract a single element from the previous mean and add a single element to it; finally, take the square root.
K = 13978444;
N = 24007;
ACC_DataN = (ACC_Data.^2)/N;
RMS_new = nan(K, size(ACC_DataN, 2));
RMS_new(1, :) = sum(ACC_DataN(1:1+N-1, :));
for i = 2:K
    RMS_new(i,:) = RMS_new(i-1,:) - ACC_DataN(i-1,:) + ACC_DataN(i+N-1,:);
end
RMS_new = sqrt(RMS_new);
  1 Comment
Tony Pate on 27 Jun 2016
This method sped things up a lot. Thank you for your help. I am going to test run all of the ideas that everyone gave me and find the best possible scenario.
