Very bad performance of parallel tasks calling a Matlab precompiled executable

Hi all,
I’m having trouble understanding the behaviour of Matlab executables in a parallel environment. I’m not talking about Matlab's Parallel Computing Toolbox at all, but about a much simpler procedure.
My C++ program creates 128 MPI tasks. Each task makes a system call that invokes an executable that was written in Matlab and compiled with mcc (with the explicit flag “-singleCompThread”). Each of these system calls applies the same function to an independent subset of my data, and there is no need for communication between processors.
That should be the ideal setting for parallel computing… but my performance is miserable. The speed-up for 128 processors is not even near 100x: it is about 15x.
This I cannot understand. I would expect either rather good speed-ups (as typically obtained in such “embarrassingly parallel” applications) or a speed-up of 1x (if the MCR allowed only a single instance). This in-between behaviour puzzles me…
Has somebody come across this problem at some point?
Thanks, Daniel

Accepted Answer

Daniel on 27 Aug 2012
Well, I found the answer to my own question, in case other people come across the same problem: I just needed to provide a path for a temporary folder used by the MCR library.
I included these lines in the script that I submit to the cluster:
export MCR_CACHE_ROOT=./tmp/mcr_cache
mkdir -p $MCR_CACHE_ROOT
... and that's it...
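For a submission script that is reused across jobs, the two lines above can be wrapped in a small helper. This is only a sketch: `setup_mcr_cache` and the `mcr_cache_<job>` naming are hypothetical and not part of MATLAB or the MCR; the only thing taken from the answer is that `MCR_CACHE_ROOT` must point at an existing, writable directory before the compiled executable starts. Giving each job its own directory is an assumption made here to keep concurrent jobs from sharing one cache, not something the original fix required.

```shell
#!/bin/sh
# Sketch only: setup_mcr_cache is a hypothetical helper, not a MATLAB tool.
# It creates a job-specific cache directory and exports MCR_CACHE_ROOT
# so that compiled MATLAB executables started afterwards inherit it.
setup_mcr_cache() {
    base="${1:-./tmp}"   # base directory; ./tmp mirrors the path used above
    job="${2:-$$}"       # job identifier (assumption); defaults to this shell's PID
    MCR_CACHE_ROOT="$base/mcr_cache_$job"
    mkdir -p "$MCR_CACHE_ROOT" || return 1   # fail if the directory cannot be created
    export MCR_CACHE_ROOT
}

setup_mcr_cache ./tmp
echo "MCR cache root: $MCR_CACHE_ROOT"
```

The helper would be called once in the job script before the MPI program is launched, so that every task inherits the variable. Whether a shared or a per-task cache performs better on a given cluster is something to measure rather than assume.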

More Answers (1)

Walter Roberson on 25 Aug 2012
There is quite a high start-up time for MATLAB executables. They have to do the internal equivalent of starting up MATLAB.
Note that there is expected to be little speed-up for a MATLAB executable compared to starting MATLAB. You do get the benefit of not having to parse the routines (but you could pcode to avoid that within MATLAB), but on the other hand each executable could end up unpacking all the CTF components into a directory, which would usually be more work than the parsing.
You might get better performance by using MATLAB as the coordinating routine, starting a pool of workers.
  2 Comments
Daniel on 26 Aug 2012
Thanks Walter, but I guess that shouldn't be the problem in this case: each of the MPI tasks, assigned to a single core, takes hours to complete, so the start-up time should be negligible compared to the computing time.
