parfor and ticBytes/tocBytes
I am trying to use parfor on a 40-core machine. At the moment I see no improvement beyond using 16 cores in the loop, and I am wondering if this is related to communication overhead.
I have used ticBytes/tocBytes to establish that the data transferred per core is going down with the number of CPUs, but the total transferred is going up.
My question is, which of these statistics is most relevant to performance? Or, to oversimplify, is most of the message-passing effort done in serial or in parallel?
Thanks, Ben
Answers (1)
Ankitha Kollegal Arjun
on 2 Nov 2017
Hi Ben,
I understand you want to know why your program that uses 'parfor' stops getting faster as the number of workers increases.
There are a lot of factors that need to be taken into account when measuring the performance of parallel programs. Here are some troubleshooting steps which can be used to determine the bottlenecks and improve the performance:
1. Check the CPU utilization of the client MATLAB. Since each iteration is independent when using parfor, the client should typically use close to 0% of CPU cycles while the workers use close to 100%. If the client MATLAB shows a lot of CPU utilization during parfor execution, it is quite possible that data is being exchanged continuously between the client and the worker MATLAB sessions. Such constant communication is overhead and should be avoided.
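2. Profile the code. The serial profiler (profile) records only what runs in the client MATLAB session, so to obtain profiling information for the workers as well, use the parallel profiler (mpiprofile), or temporarily change the parfor to a for loop and profile the loop body serially.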
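As a rough way to put numbers on the data exchange described in step 1, something along the following lines may help. It is only a sketch: it assumes a parallel pool is available, and the matrix A, the iteration count n, and the loop body are made-up placeholders rather than your actual code.

```matlab
% Sketch: measure wall-clock time and bytes transferred for one parfor loop.
% A, n and the loop body are hypothetical placeholders.
pool = gcp();                 % start or reuse the current parallel pool
A = rand(1000);               % used whole inside the loop, so it is broadcast
n = 200;
result = zeros(1, n);

ticBytes(pool);               % start counting bytes for every worker
t = tic;
parfor i = 1:n
    result(i) = i * sum(A(:));    % placeholder computation
end
elapsed = toc(t);
bytes = tocBytes(pool);       % numWorkers-by-2: [sent to workers, received from workers]

fprintf('Elapsed time:            %.3f s\n', elapsed);
fprintf('Total sent to workers:   %.0f bytes\n', sum(double(bytes(:, 1))));
fprintf('Largest single worker:   %.0f bytes\n', max(double(bytes(:, 1))));
```

Repeating this for different pool sizes and comparing the elapsed time with both the total and the per-worker byte counts should give an indication of whether data transfer grows in step with the slowdown.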
In general, the following suggestions can be adopted to improve the performance of a program that uses "parfor":
1. Inside the "parfor" loop, make a local copy of any variable that was created outside the loop, and use that local copy in subsequent calculations. For smaller variables, this prevents continuous communication between the client MATLAB and the workers, thereby avoiding the communication overhead.
2. If the data is very large, it is better to save it in a MAT file and load that file directly on the workers by specifying the complete path to the MAT file. This avoids the overhead of a large data transfer between the client and the workers.
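3. It is recommended that "parfor" not be used inside another loop (for example, a for loop), because the "parfor" overhead is then incurred again on every pass of the enclosing loop.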
4. The following documentation provides some tips on improving parfor performance:
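Suggestions 2 and 3 above could be sketched roughly as follows. This is an illustration under assumptions, not a drop-in solution: the file name parfor_demo_data.mat, the variable bigData, and the loop bodies are hypothetical, and the file path must be visible to every worker. Using parallel.pool.Constant to run the load once per worker is my choice here; calling s = load(dataFile) inside the loop body also works but reloads the file on every iteration.

```matlab
% Sketch for suggestion 2: load a MAT file on the workers instead of
% sending the data from the client. File and variable names are hypothetical.
dataFile = fullfile(tempdir, 'parfor_demo_data.mat');  % complete path
bigData = rand(500);
save(dataFile, 'bigData');

% With a function handle, the load is evaluated once on each worker.
c = parallel.pool.Constant(@() load(dataFile, 'bigData'));

n = 50;
colSums = zeros(1, n);
parfor i = 1:n
    s = c.Value;                       % struct from load, already on the worker
    colSums(i) = sum(s.bigData(:, i)); % placeholder computation
end

% Sketch for suggestion 3: one parfor on the outer loop, a plain for inside,
% rather than a parfor nested inside a for loop.
nOuter = 8;
nInner = 100;
out = zeros(nOuter, nInner);
parfor k = 1:nOuter
    row = zeros(1, nInner);            % temporary variable created on the worker
    for j = 1:nInner
        row(j) = k * j;                % placeholder computation
    end
    out(k, :) = row;                   % sliced output variable
end

delete(dataFile);                      % remove the demo file
```

With the parfor on the outer loop, the parallel loop is set up once and each worker gets a larger chunk of work per iteration.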
Hope this helps.