Parfeval command slower than a classic for loop

I have four .mat files ('input1.mat', 'input2.mat', 'input3.mat', 'input4.mat') containing four different input datasets (in the form of matrices) to be run through the same function, PFPD_paralleli.m. I want each dataset to be processed by PFPD_paralleli.m on its own worker, four workers in parallel, and to do this I use the parfeval command.
The function PFPD_paralleli.m returns a single vector (En), and I would like to store the four En vectors in a 1x4 output cell array (one vector per worker).
Below I show you the code.
clear all; clc; close all;
% Check if a parallel pool is already open, if not, open one
if isempty(gcp('nocreate'))
    parpool('local', 4); % You can adjust the number of workers based on your system's capabilities
end
tic
% List of input files
inputFiles = {'input1.mat', 'input2.mat', 'input3.mat', 'input4.mat'};
% Preallocate cell array to store futures
futures = cell(1, numel(inputFiles));
% Loop through each input file and submit a task to the parallel pool
for i = 1:numel(inputFiles)
    inputFile = inputFiles{i};
    % Use parfeval to asynchronously evaluate PFPD_paralleli
    futures{i} = parfeval(@PFPD_paralleli, 1, inputFile); % 1 is the number of output arguments
end
% Collect the result of each task (blocks until that task has finished)
En = cell(1, numel(inputFiles));
for i = 1:numel(inputFiles)
    result = fetchOutputs(futures{i});
    En{i} = result;
end
CPU_time=toc
Below is the same code without parallel computing.
% List of input files
inputFiles = {'input1.mat', 'input2.mat', 'input3.mat', 'input4.mat'};
En = cell(1, numel(inputFiles));
tic
for i = 1:numel(inputFiles)
    inputFile = inputFiles{i};
    En{i} = PFPD_paralleli(inputFile);
    time(i) = toc;
end
CPU_time=toc
Question: The CPU time of the first code is about 40 seconds, whereas that of the second code is about 30 seconds. Why is the parallel option slower in this case? Could it depend on the specific function that is run in parallel (it calls some sub-functions internally)? I know that parallel computing is not necessarily faster, but I don't think the parallel overhead could be more than 10 seconds.
Thank you!

Answers (1)

Sai Teja G on 20 Nov 2023
Hi Giovanni,
I understand that you expected the first code, which utilizes parallel processing, to execute more swiftly than the second, sequential code. However, this was not the case. There are several potential explanations for this outcome. Below, I outline some possible reasons and suggest strategies to investigate and possibly mitigate the issue to a certain degree.
Parallel computing in MATLAB, while powerful, can sometimes lead to slower performance than sequential execution due to various types of overhead. The initialization of the parallel pool (`parpool`) can be time-consuming, particularly when first started in a MATLAB session. Additionally, the need to transfer data between the main MATLAB process and the parallel workers can introduce delays, especially with large datasets. Overhead also arises from scheduling and managing the execution of tasks across multiple workers.
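Note that in the script from the question, `tic` is called after `parpool`, so pool startup is not part of the measured 40 seconds. To get a rough feel for the remaining per-task overhead (scheduling plus data transfer), one option is to time a trivial `parfeval` call. The snippet below is only a sketch: it uses `sum` on a tiny vector as a stand-in task, and the repeat count of 5 is an arbitrary choice.

```matlab
% Rough estimate of the round-trip overhead of one parfeval call.
% Assumption: a trivial task (sum of a tiny vector) stands in for "no real work".
pool = gcp;                       % reuse the current pool, or start one with default settings

numTrials = 5;                    % arbitrary number of repeats
overhead = zeros(1, numTrials);
for k = 1:numTrials
    t = tic;
    f = parfeval(pool, @sum, 1, [1 2 3]);   % submit a trivial task
    result = fetchOutputs(f);               % block until the result is back
    overhead(k) = toc(t);
end

fprintf('Round-trip time per trivial task: %s seconds\n', mat2str(overhead, 3));
```

The first trial is often slower than the rest. If the later values are small fractions of a second, scheduling overhead alone is unlikely to explain a 10-second gap, and the cause is more likely in how the function itself runs on the workers.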
The effectiveness of parallel computing depends on the nature of the task. For small or quick computations, the parallel management overhead may not be justified, as parallel computing typically excels with longer, more computationally demanding tasks. Other factors that can affect performance include the degree of optimization for parallel execution within the function or its sub-functions, competition for system resources, and the number of workers relative to available CPU cores. Excessive workers can cause resource contention and context switching, leading to inefficiencies.
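One concrete thing to check here is the number of computational threads. The MATLAB client can use several threads for many built-in operations, whereas pool workers typically run with a single computational thread by default, so a "serial" run may already be using multiple cores. A minimal sketch of how to compare the two, using `maxNumCompThreads` on the client and on every worker:

```matlab
% Compare computational threads on the client and on each pool worker.
pool = gcp;                                      % current pool, or start a default one

clientThreads = maxNumCompThreads;               % threads available to the client session

f = parfevalOnAll(pool, @maxNumCompThreads, 1);  % run the same query on every worker
workerThreads = fetchOutputs(f);                 % one value per worker

fprintf('Client threads: %d\n', clientThreads);
fprintf('Worker threads: %s\n', mat2str(workerThreads(:)'));
fprintf('Workers in pool: %d\n', pool.NumWorkers);
```

If the client reports several threads and each worker reports one, a function dominated by multithreaded built-ins (large matrix operations, for example) may gain little from being split across workers.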
To understand the specific reasons for slower parallel execution in your case, it's advisable to profile the function to identify bottlenecks, conduct scaling tests with different numbers of workers, check for any internal parallelism within the functions being used, and assess the computation's complexity and duration. These steps can help determine whether the slower performance is due to the inherent overhead of parallel processing or to other issues that can be addressed to improve speed. You can refer to the MATLAB documentation for more details.
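As a starting point for such a comparison, the sketch below times each task on the client and then on the workers, using the `StartDateTime` and `FinishDateTime` properties of the futures. The function `work` is a hypothetical stand-in (a small SVD computation) and `n` and `numTasks` are arbitrary; replace the call with `PFPD_paralleli` and your input files.

```matlab
% Compare per-task run time on the client with per-task run time on the workers.
% Assumption: "work" is a hypothetical stand-in for PFPD_paralleli.
work = @(n) sum(svd(rand(n)));
n = 300;                          % arbitrary problem size
numTasks = 4;

pool = gcp;                       % current pool, or start a default one

% Time each call on the client
serialTime = zeros(1, numTasks);
for i = 1:numTasks
    t = tic;
    work(n);
    serialTime(i) = toc(t);
end

% Submit the same tasks to the pool and wait for all of them
futures(1:numTasks) = parallel.FevalFuture;      % preallocate an array of futures
for i = 1:numTasks
    futures(i) = parfeval(pool, work, 1, n);
end
wait(futures);

% Time each task actually spent running on its worker
workerTime = seconds([futures.FinishDateTime] - [futures.StartDateTime]);

fprintf('Per-task time on client:  %s s\n', mat2str(serialTime, 3));
fprintf('Per-task time on workers: %s s\n', mat2str(workerTime, 3));
```

If each task takes clearly longer on a worker than on the client, the slowdown comes from how the function runs on the workers (fewer threads, memory or disk contention) rather than from scheduling overhead.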
Hope it helps!
