Appropriate usage of Parallel Pool

Haris K. on 7 Jun 2020
Commented: Matt J on 11 Jun 2020
Hi. I have two related questions:
Question 1: Which of the following 4 is best to use in order to take advantage of my PC's multi-core setup? If any other method would be more appropriate (e.g. using gcp), please let me know. Methods 3 and 4 are identical to Methods 1 and 2, except that the loop is 'parfor' instead of 'for'.
Question 2: My PC also has a NVIDIA Quadro P3200 GPU. Could you please possibly also provide an additional script (for the same CV-lasso setup) potentially utilising the GPU, too?
%Method 1
for i = 1:10
    pool = parpool('local',5)
    opts = statset('UseParallel',true);
    b = lasso(X(:,:,i),y,'CV',10,'Options',opts);
    delete(pool)
end

%Method 2
pool = parpool('local',5)
for i = 1:10
    opts = statset('UseParallel',true);
    b = lasso(X(:,:,i),y,'CV',10,'Options',opts);
end
delete(pool)

%Method 3
parfor i = 1:10
    pool = parpool('local',5)
    opts = statset('UseParallel',true);
    b = lasso(X(:,:,i),y,'CV',10,'Options',opts);
    delete(pool)
end

%Method 4
pool = parpool('local',5)
parfor i = 1:10
    opts = statset('UseParallel',true);
    b = lasso(X(:,:,i),y,'CV',10,'Options',opts);
end
delete(pool)
Thank you for your time and help!
PS: I mention my GPU model not to show off, but in case someone tech-savvy says something like 'Your GPU is not powerful enough, and it's not worth the trouble. Stick to utilising your multi-core setup instead'. I am really looking for the strategy that gives me the lowest possible runtime given the resources.
  1 Comment
Haris K. on 7 Jun 2020
My intuition says that there would be added benefit when using both UseParallel and parfor, since the former treats the CV & lambda iterations in parallel, while the latter treats the iterations over X in parallel. But basically, with my question, I wanted to clarify whether or not there are any complications when using both.
Let's wait to see if anyone else can further add to the discussion.
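One way to probe the "both at once" intuition is to check where an inner parallel loop actually runs when it is called from inside an outer parfor. As far as I understand the documentation, a parfor inside a function that is called from a parfor body runs serially on that worker, and 'UseParallel' relies on the same mechanism. The sketch below is only an illustration: innerWorkerIDs and ids are hypothetical names, and the loop sizes are arbitrary.

```matlab
% Sketch (assumes Parallel Computing Toolbox): record which worker executes
% each iteration of an inner parfor that is called from an outer parfor.
pool = gcp;                      % reuse the current pool, or start the default one
ids = cell(1,4);
parfor i = 1:4
    ids{i} = innerWorkerIDs();   % hypothetical helper defined at the end of the script
end
% Each cell holds the worker IDs seen by one outer iteration.
celldisp(ids)

function id = innerWorkerIDs()
    % Inner parfor: when called on a worker, this is expected to run serially.
    id = zeros(1,8);
    parfor k = 1:8
        t = getCurrentTask();    % task object of the worker running this iteration
        id(k) = t.ID;
    end
end
```

If every cell contains a single repeated ID, the inner loop did not spread over other workers, which would suggest that combining 'UseParallel' with parfor adds little beyond parfor alone.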

Accepted Answer

Haris K. on 11 Jun 2020
Edited: Haris K. on 11 Jun 2020
In case someone finds this comparison helpful: I changed the methods a bit, as I realized that it is not good practice to delete the pool inside the loop. I also added the no-parallel case (now Method 4). The exact code and the respective times are shown below:
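clear
N = 30;              % number of slices in X
T = 1000;            % observations per slice
y = randn(T,1);
X = rand(T,80,N);
for j = 1:10         % repeat every timing 10 times
    % Method 1: serial loop, parallel CV inside lasso
    % (the timed region includes pool startup and shutdown)
    tic
    pool = parpool('local',6);
    for i = 1:N
        opts = statset('UseParallel',true);
        b = lasso(X(:,:,i),y,'CV',10,'Options',opts);
    end
    delete(pool)
    tau(j,1) = toc;
    % Method 2: parfor over slices, no UseParallel
    tic
    pool = parpool('local',6);
    parfor i = 1:N
        b = lasso(X(:,:,i),y,'CV',10);
    end
    delete(pool)
    tau(j,2) = toc;
    % Method 3: parfor over slices plus UseParallel
    tic
    pool = parpool('local',6);
    parfor i = 1:N
        opts = statset('UseParallel',true);
        b = lasso(X(:,:,i),y,'CV',10,'Options',opts);
    end
    delete(pool);
    tau(j,3) = toc;
    % Method 4: no parallelism at all
    tic
    for i = 1:N
        b = lasso(X(:,:,i),y,'CV',10);
    end
    tau(j,4) = toc;
end
mean(tau)            % column means: one average time per method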
The results (mean elapsed time in seconds over the 10 repetitions; columns are Methods 1 to 4) were:
ans =
51.6030 33.5555 34.2442 82.0831
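Since the timed regions above include starting and deleting the pool, it may be worth timing the pool startup separately from the lasso work. The following is a minimal sketch of that idea, not a rerun of the benchmark: the small sizes and the names tStart and tWork are illustrative assumptions, and it needs the Parallel Computing and Statistics and Machine Learning toolboxes.

```matlab
% Sketch: time pool startup separately from the parfor work.
% Sizes are deliberately small and purely illustrative.
N = 4;  T = 200;
y = randn(T,1);
X = rand(T,10,N);

tic
pool = gcp('nocreate');          % returns [] if no pool is running
if isempty(pool)
    pool = parpool;              % start a pool with the default profile
end
tStart = toc;                    % near zero if a pool already existed

tic
parfor i = 1:N
    b = lasso(X(:,:,i),y,'CV',5);   % b is a temporary variable inside parfor
end
tWork = toc;

fprintf('pool startup: %.2f s, parfor work: %.2f s\n', tStart, tWork);
```

Keeping the pool open across repetitions means only tWork is repeated, so the comparison between methods is not dominated by a fixed startup cost.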
  3 Comments
Haris K. on 11 Jun 2020
You are absolutely right, but the loop over j is part of the 'test', in the sense that what I am interested in measuring is how long the i-loop takes to run. Instead of reporting a single time figure for each method (i.e. j=1), which could reflect one-off factors that helped or hurt my runtime, such as an open browser running processes in the background, I thought it would be a better idea to run it 10 times and take the average.
Let me also take the chance to thank you for your helpful comments here, and for the many other eye-opening posts of yours that have helped me throughout the years. Cheers Matt!
Matt J on 11 Jun 2020
You are quite welcome.

More Answers (1)

Matt J on 7 Jun 2020
Edited: Matt J on 7 Jun 2020
Question 1: Which of the following 4 is best to use
I'm sure it depends on the dimensions of X(:,:,i), but here's what I would try first:
pool = parpool('local',5)
opts = statset('UseParallel',true);
clear b
for i = 10:-1:1   % counting down allocates b at full size on the first pass
    b{i} = lasso(X(:,:,i),y,'CV',10,'Options',opts);
end
I would probably then compare that to parfor without UseParallel:
parfor i = 1:10
    b{i} = lasso(X(:,:,i),y,'CV',10);
end
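If the goal is to keep one coefficient vector per slice rather than the whole regularization path, the parfor variant could be extended along these lines. This is a sketch under assumptions: picking the column at FitInfo.IndexMinMSE is my choice of selection rule, bBest is a hypothetical name, and the data sizes are made up so the snippet runs on its own.

```matlab
% Sketch: keep the coefficients at the lambda with minimum cross-validated MSE.
T = 200;  p = 10;  nSlices = 4;      % illustrative sizes only
y = randn(T,1);
X = rand(T,p,nSlices);

bBest = zeros(p, nSlices);           % one column of coefficients per slice
parfor i = 1:nSlices
    [B, FitInfo] = lasso(X(:,:,i), y, 'CV', 10);
    bBest(:,i) = B(:, FitInfo.IndexMinMSE);   % sliced output, valid in parfor
end
```

Storing a fixed-size column per iteration keeps bBest a sliced variable, so each worker writes only its own column.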
Question 2: My PC also has a NVIDIA Quadro P3200 GPU. Could you please possibly also provide an additional script (for the same CV-lasso setup) potentially utilising the GPU, too?
It doesn't look like lasso() supports gpuArray input, unfortunately.
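Whether that holds on a given release can be checked directly. The snippet below is only a probe, assuming the Parallel Computing Toolbox is installed; the variable names and sizes are illustrative, and it reports whichever outcome occurs rather than assuming one.

```matlab
% Sketch: probe whether lasso accepts gpuArray input on this installation.
supported = false;
if gpuDeviceCount > 0
    Xg = gpuArray(rand(200,10));
    yg = gpuArray(randn(200,1));
    try
        bg = lasso(Xg, yg);
        supported = true;
        fprintf('lasso accepted gpuArray input; output class: %s\n', class(bg));
    catch err
        fprintf('lasso rejected gpuArray input: %s\n', err.message);
    end
else
    disp('No GPU device detected.');
end
```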
  3 Comments
Haris K. on 7 Jun 2020
Edited: Haris K. on 7 Jun 2020
Is there a reason why you wouldn't use both UseParallel and parfor?
Matt J on 7 Jun 2020
Edited: Matt J on 7 Jun 2020
Mostly just intuition. If you have enough work in each call to lasso() to keep a core busy, I don't know why it would be beneficial to try even more parallel splitting. But ultimately, experimentation is the only way to know which strategy is optimal.

Release

R2019b
