Parallel computing with solving linear system of equations
Hi MATLAB community,
I am interested in solving two very large linear systems of equations. I am using two iterative schemes, pcg and bicgstab. Is there a way to solve the two systems in a parallel fashion? I have 24 cores, and I would like to make use of them.
Thanks
Diab
2 Comments
Walter Roberson
on 29 Jan 2019
spmd or parfeval() if you have Parallel Computing Toolbox.
But make sure to adjust maxNumCompThreads so that each of the workers can access more than one core.
Diab Abueidda
on 29 Jan 2019
Edited: Diab Abueidda
on 29 Jan 2019
Thanks, Walter, for the answer! Yes, I have the Parallel Computing Toolbox.
Could you sketch the implementation? Let's say that I am interested in solving Ku=f.
Accepted Answer
Walter Roberson
on 29 Jan 2019
parpool(2);
spmd
    maxNumCompThreads(12);   % let each worker use up to 12 computational threads
    if labindex == 1
        % do first computation here
    else
        % do second computation here
    end
end
19 Comments
Edric Ellis
on 30 Jan 2019
In recent versions of Parallel Computing Toolbox, you can set up the number of threads directly on the cluster. This can be done either using the Cluster Profile Manager, or you can do it programmatically, like this:
c = parcluster('local');
c.NumThreads = 12;
parpool(c, 2);
spmd
    % Assert that the setting was applied correctly
    assert(maxNumCompThreads == 12)
end
Diab Abueidda
on 1 Feb 2019
So if I write the following, will the pcg function make use of the 24 cores?
c = parcluster('local');
c.NumThreads = 24;
parpool(c, 2);
spmd
    % Assert that the setting was applied correctly
    assert(maxNumCompThreads == 24)
end
u = pcg(K,F,1e-6,1000)
Diab Abueidda
on 1 Feb 2019
So will something like this work?
c = parcluster('local');
c.NumThreads = 24;
parpool(c, 2);
spmd
    % Assert that the setting was applied correctly
    assert(maxNumCompThreads == 24)
    u = pcg(K,F,1e-6,1000)
end
Walter Roberson
on 1 Feb 2019
That appears to request the same task on each of the workers. That would be valid and useful if the task involves random numbers.
You also have a conflict that you have indicated that you only have 24 cores, but here you appear to be granting 24 threads to each of the two workers. That could have problems unless you have hyperthreading turned on so that there are a total of 48 threads on the system. But beware that hyperthreading generally makes the system slower unless some of the threads are waiting for resources. Hyperthreading is a form of fast context switching, suitable for servicing short I/O interrupts (for example) but not suitable for running two high performance computations simultaneously.
Diab Abueidda
on 1 Feb 2019
I am not sure I really know the difference between threads and cores. I am sure that I have 24 "MATLAB workers" in total on a virtual machine. The numbers are not random. Basically, I have a very large linear system of equations that I am trying to solve iteratively, and I am checking whether there is a possibility of solving it in a parallel fashion.
Walter Roberson
on 1 Feb 2019
Most likely you have access to 24 computation units, not to 24 physical cores with hyperthreading turned on that might permit each to run two semi-simultaneous computations.
TL;DR: request 12 threads per worker, the way I showed. The 12 is per worker, and you have 2 workers, for a total of 24 threads.
However, by the time you get past 7-ish threads per worker, it is common to see only marginal performance improvements from adding more threads. You could often get more done by dividing the work among three workers with 8 threads each instead of two workers with 12 threads each. But a lot depends upon exactly what the computation is and how large the arrays are.
Diab Abueidda
on 1 Feb 2019
Thanks, Walter, for the explanation. This is very useful. How can I write it then? You mentioned that the way I am doing it makes the workers do the same task. Thanks in advance.
Walter Roberson
on 1 Feb 2019
c = parcluster('local');
c.NumThreads = 12;
parpool(c, 2);
spmd
    % Assert that the setting was applied correctly
    assert(maxNumCompThreads == 12)   % just to double check
    if labindex == 1
        u = pcg(K,F,1e-6,1000);
    else
        u = [];   % placeholder: replace with the second computation
    end
end
Diab Abueidda
on 1 Feb 2019
I don't follow the if statement. Could you elaborate? My question was whether the pcg function itself can be processed in a parallel fashion. Also, when you say u does something else, how can u be used in the following step without having been evaluated?
Please excuse my ignorance of parallel computing. I am new to it.
Walter Roberson
on 1 Feb 2019
You had posted:
"I am using two iterative schemes, pcg and bicgstab. Is there a way to solve the two systems in a parallel fashion?"
Everything that is to be done in common should be done before the spmd. At the point where you want two different simultaneous operations to start, you invoke spmd, and inside the spmd, compare labindex == 1 and have it proceed on the first path, and otherwise proceed on the second path. You can assign the outputs to a common variable. Then after the spmd, index the variable as if it were a cell array: u{1} would be the result computed by lab (worker) #1, u{2} what was computed by lab (worker) #2.
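The pattern described above could be sketched as follows. This is only an illustration under stated assumptions: K1, K2, f1 and f2 are small hypothetical test systems built with gallery, standing in for the real matrices, and the pool is assumed to have at least two workers. In recent releases labindex is documented as superseded by spmdIndex, but labindex is what this thread uses.

```matlab
% Two small symmetric positive definite test systems
% (hypothetical stand-ins for the real K and f).
n  = 200;
K1 = gallery('tridiag', n, -1, 2, -1);
K2 = gallery('tridiag', n, -1, 4, -1);
f1 = ones(n, 1);
f2 = ones(n, 1);

% Everything common is done before the spmd block.
if isempty(gcp('nocreate'))
    parpool(2);                      % one worker per system
end

spmd
    if labindex == 1
        % Worker 1 solves the first system with pcg.
        [u, flag] = pcg(K1, f1, 1e-6, 1000);
    else
        % Any other worker solves the second system with bicgstab.
        [u, flag] = bicgstab(K2, f2, 1e-6, 1000);
    end
end

% After spmd, u is a Composite: index it like a cell array.
u1 = u{1};   % result computed by worker 1
u2 = u{2};   % result computed by worker 2
```

Note that this runs two independent solves side by side; it does not make a single pcg call any faster.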
Diab Abueidda
on 1 Feb 2019
Edited: Diab Abueidda
on 1 Feb 2019
Oops! I apologize for not making it clear.
The output of pcg, which is u, will be fed to the second function, bicgstab. My question was about how to find u with pcg using parallel computing. The same applies to bicgstab: how can it be executed in parallel, if there is a way to do that? In other words, do the MATLAB functions pcg and bicgstab support parallel computing?
Walter Roberson
on 2 Feb 2019
The pcg documentation says it can use a GPU, and also says it can work with sparse distributed arrays: https://www.mathworks.com/help/distcomp/run-matlab-functions-with-distributed-arrays.html. However, there is no 'useparallel' option.
When I look at the code, it appears to me that it uses matrix operations in various places, and that those matrix operations would probably automatically use as many cores as there are available; I am not certain of that due to the complication of sparse arrays.
Joss Knight
on 2 Feb 2019
Iterative algorithms, by their very nature, do not parallelize well. If you want to try to maximize your use of CPU cores you could convert your data to a distributed array and run any of the iterative solvers (such as pcg). However, I am very doubtful that this could ever provide an improvement in performance; firstly because, as Walter says, the individual algebraic operations used by the solvers (like matrix multiply, matrix factorizations etc) are already well multi-threaded and making good use of your CPU cores, and secondly because there is naturally a lot of communication going on between workers when you run these functions using distributed arrays, which slows everything down. So typically we'd only advise use of distributed when you have arrays that are too large to fit in the memory of a single machine, and you do it for scaling reasons, not performance reasons.
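For reference, the distributed-array route mentioned above might look like the sketch below. It assumes Parallel Computing Toolbox with an available pool, and it uses a hypothetical Poisson test matrix from gallery in place of the real K. As noted, this is mainly a way to handle matrices that do not fit in one machine's memory, so it should not be expected to run faster.

```matlab
% Hypothetical test problem: 2-D Poisson matrix (sparse, SPD).
n = 100;
K = gallery('poisson', n);          % n^2-by-n^2 sparse matrix
f = ones(size(K, 1), 1);

if isempty(gcp('nocreate'))
    parpool;                        % default pool size from the profile
end

% Spread the data across the workers of the pool.
Kd = distributed(K);
fd = distributed(f);

% Same pcg call as usual; the solver operates on the distributed data.
ud = pcg(Kd, fd, 1e-6, 1000);

% Bring the solution back to the client as an ordinary array.
u = gather(ud);
```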
Diab Abueidda
on 3 Feb 2019
Thanks, Joss, for the comment! Then what would be the fastest way to solve a very large linear system of equations, given the potential of using 24 MATLAB workers?
Thanks
Diab
Walter Roberson
on 3 Feb 2019
The fastest... sometimes that involves reducing down to fewer cores, as adding too many cores can add communication overhead without enough performance gain to balance it. Start with a smaller array and do some timing to determine how performance changes with the number of cores.
Secondly, especially if you can get away with single precision calculations, then switching to GPU can sometimes give much much better performance. But I would not count on it for sparse arrays.
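The timing experiment suggested above could be set up roughly like this. It is a minimal sketch: the Poisson test matrix and the list of thread counts are illustrative assumptions, and a single tic/toc measurement per setting is noisy, so repeated runs would be more reliable.

```matlab
% Smaller hypothetical test problem for timing.
K = gallery('poisson', 100);        % 10000-by-10000 sparse SPD matrix
f = ones(size(K, 1), 1);

origThreads  = maxNumCompThreads;   % remember the current setting
threadCounts = unique(min([1 2 4 8], origThreads));
elapsed      = zeros(size(threadCounts));

for k = 1:numel(threadCounts)
    maxNumCompThreads(threadCounts(k));   % limit computational threads
    tic;
    [~, flag] = pcg(K, f, 1e-6, 1000);    % flag == 0 means converged
    elapsed(k) = toc;
    fprintf('%2d threads: %.3f s (flag = %d)\n', ...
            threadCounts(k), elapsed(k), flag);
end

maxNumCompThreads(origThreads);     % restore the original setting
```

If the times stop improving beyond some thread count, that count is a reasonable upper limit for threads per worker.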