Will splitting a for loop up make code run quicker?
I have some code where I read data from a file and then perform some pretty heavy analysis on it. This is repeated for 650 files, and the whole process takes around 3 hours to run. The general code is:
for i = 1:650
    file_name = sprintf('file%d', i);
    load(file_name);
    .......... %rest of code
end
As the same actions are performed independently on each file, I was wondering if there's any way to split this for loop up so that the code can be run on several files simultaneously, instead of having to wait for the previous iteration to finish. Would this even make my code run quicker? I've briefly heard about 'parallel computing' before, but I'm not too familiar with the concept or how to implement it.
Thanks.
9 Comments
per isakson
on 6 Jan 2022
Yes, see: Parallel Computing Toolbox
Jan
on 6 Jan 2022
Calling load() without capturing its output in a variable creates variables dynamically. This can slow down the processing by a factor of up to 100, because it impedes the JIT acceleration in a way comparable to the evil eval(). So prefer data = load(file_name) instead.
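A minimal sketch of loading into an output variable. The variable name `signal` and the temporary file are assumptions made only to keep the example self-contained; the original post does not say what the files contain.

```matlab
% Create a small example file so the sketch runs on its own.
% 'signal' is a hypothetical variable name, not taken from the original post.
file_name = fullfile(tempdir, 'file1.mat');
signal = rand(1, 1000);
save(file_name, 'signal');
clear signal

% Load into a struct instead of creating variables dynamically in the workspace.
data = load(file_name);   % struct with one field per variable saved in the file
signal = data.signal;     % explicit access: the origin of 'signal' is visible in the code
result = mean(signal);    % stand-in for the real analysis
```

Because every variable now appears on the left-hand side of an assignment, both the reader and MATLAB's code analysis can see where it comes from.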
Use the profiler to find the bottleneck of your code. If it is the load() command, the disk access needs more time than the processing. In that case parallel processing cannot accelerate the code substantially: it does not matter whether 1 or 8 threads are waiting for the slow disk.
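As a rough first check before opening the profiler, the time spent loading and the time spent processing can be accumulated separately with tic/toc. This is only a sketch: the example files, the variable name `signal`, the svd call standing in for the analysis, and the script name `my_analysis_script` are all placeholders.

```matlab
% Self-contained setup: write a few small example files to a temporary folder.
folder = tempname;
mkdir(folder);
n_files = 5;
for k = 1:n_files
    signal = rand(200);
    save(fullfile(folder, sprintf('file%d.mat', k)), 'signal');
end

t_load = 0;   % accumulated time spent reading from disk
t_proc = 0;   % accumulated time spent in the analysis
for k = 1:n_files
    t = tic;
    data = load(fullfile(folder, sprintf('file%d.mat', k)));
    t_load = t_load + toc(t);

    t = tic;
    s = svd(data.signal);   % stand-in for the heavy analysis
    t_proc = t_proc + toc(t);
end
fprintf('load: %.3f s, processing: %.3f s\n', t_load, t_proc);

% For a per-line breakdown of the real code, the profiler can be used instead:
%   profile on;  my_analysis_script;  profile viewer
```

If the load time dominates, the disk is the limiting factor and extra workers are unlikely to help much.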
If importing from the disk and processing take the same amount of time, parallel processing can save at most about 50% of the total time.
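The 50% figure can be made explicit with a simple model. It assumes that the disk reads do not speed up with more workers and that the processing scales ideally with the number of workers N; both are simplifications, and real timings will differ.

```latex
% T_load: total time spent reading files, T_proc: total serial processing time
T(N) = T_{\mathrm{load}} + \frac{T_{\mathrm{proc}}}{N}

% With T_load = T_proc = T(1)/2:
T(N) = \frac{T(1)}{2}\left(1 + \frac{1}{N}\right),
\qquad
\text{saving} = 1 - \frac{T(N)}{T(1)} = \frac{1}{2}\left(1 - \frac{1}{N}\right)

% Example: N = 8 gives a saving of 7/16 = 43.75\%; as N \to \infty the saving approaches 50\%.
```

Under these assumptions, a 3-hour run on 8 workers would take about 3 × 9/16 ≈ 1.7 hours.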
S
on 7 Jan 2022
S
on 7 Jan 2022
KSSV
on 7 Jan 2022
Points to consider:
- You can get all the required files using dir and then loop over its output.
- Depending on what data the files contain and how it is stored, you need to use an appropriate function to read them.
- The achievable speed-up also depends on what you are doing in the %rest of code. Unless this is known, we cannot comment on it.
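The first point above can be sketched as follows. The folder, the file pattern `file*.mat`, the variable name `signal`, and the `analyse` handle are hypothetical stand-ins, since the actual files and analysis are not shown in the question.

```matlab
% Self-contained setup: a temporary folder with a few example .mat files.
folder = tempname;
mkdir(folder);
for k = 1:3
    signal = k * ones(1, 10);
    save(fullfile(folder, sprintf('file%d.mat', k)), 'signal');
end

% The analysis wrapped in a function handle so every file is treated the same way.
% In real code this would more likely be a function in its own file.
analyse = @(data) sum(data.signal);

files = dir(fullfile(folder, 'file*.mat'));   % list matching files, no hard-coded count
results = zeros(1, numel(files));             % preallocate the output
for k = 1:numel(files)
    data = load(fullfile(files(k).folder, files(k).name));
    results(k) = analyse(data);
end
```

Note that the order of the dir output is not numeric (file10 is typically listed before file2), so results that depend on file order may need an explicit sort.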
S
on 8 Jan 2022
Chunru
on 8 Jan 2022
% The first thing you can try:
parfor i = 1:650
    file_name = sprintf('file%d', i);
    data = load(file_name);   % inside parfor, load needs an output argument
    .......... %rest of code, reading its variables from the struct data
end
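A slightly fuller sketch of the same idea, showing how per-file results can be collected from a parfor loop. The example files, the variable name `signal`, and the sum standing in for the analysis are assumptions for illustration only.

```matlab
% Self-contained setup: a temporary folder with a few example .mat files.
folder = tempname;
mkdir(folder);
n_files = 4;
for k = 1:n_files
    signal = k * ones(1, 10);
    save(fullfile(folder, sprintf('file%d.mat', k)), 'signal');
end

results = zeros(1, n_files);   % sliced output: each iteration writes only its own element
parfor k = 1:n_files
    file_name = fullfile(folder, sprintf('file%d.mat', k));
    data = load(file_name);            % load with an output argument is allowed in parfor
    results(k) = sum(data.signal);     % stand-in for the real analysis
end
```

The iterations must be independent of each other for this to work, which matches the situation described in the question. Without the Parallel Computing Toolbox, parfor simply runs as an ordinary serial loop.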
Stephen23
on 8 Jan 2022
"Would this even make my code run quicker?"
No. Yes. Maybe. Who knows?
Any potential benefit of using parallel computation depends on many factors that we do not know about your code.
However, it is best to avoid trying to micro-manage MATLAB's JIT optimization: when beginners try to do something very cunning to speed things up, it just gets in the way of the JIT engine (which is written to speed up clearly written, well-organized code).
Jumping onto the parallel-computation bandwagon is certainly not an alternative to writing better code, for example:
- replacing directly LOADing into the workspace with LOADing into an output variable.
- not saving your code in the same location as your data files, using absolute/relative filenames instead.
- avoiding the variable name i (which is also the imaginary unit).
- no doubt many other improvements in the code that you did not show us.
"I've briefly heard about 'parallel computing' before, but I'm not too familiar with the concept or how to implement it"
Learning good MATLAB practices would be a much better use of your time, until you find a bottleneck that really can only be solved using parallel computation.
S
on 8 Jan 2022
Answers (0)