Are accumarray/cellfun/arrayfun/etc multithreaded?

Eric Sampson on 19 Apr 2013
Commented: Jan on 22 Jan 2018
As per the subject line: does anyone know? If not, could TMW (The MathWorks) add this without a lot of effort?

Answers (2)

Jan on 21 Apr 2013
Edited: Jan on 21 Apr 2013
It is not trivial to add multi-threading to, e.g., cellfun. Its source code was shipped with older Matlab versions, and multi-threading could be added easily for the built-in commands (those specified as strings, which are unfortunately marked as "Backward Compatibility" only, although they are very efficient). But for functions that are not built in, the called functions could have side effects, e.g. persistent counters, output to files, etc. Distributing the job to different threads would then cause serious errors. Example:
x = {1,2; 3,4}
cellfun(@(c) fprintf('%d ', c), x)
With multi-threading, the order of the output is no longer well-defined.
How could cellfun (etc.) decide whether there are dependencies between the function calls? As long as independence is not guaranteed, automatic multi-threading would be a bug.
  4 Comments
Jan on 22 Jan 2018
@Alexander Laut: The "UniformOutput" flag is not meant for unpredictable functions. It is for cases in which the outputs of the function cannot be concatenated into an array.
MathWorks will surely not introduce a flag that is used at your own risk only. But you can do this easily yourself if you have an old version of Matlab, e.g. R13, which included many C sources such as cellfun.c.
A multi-threaded cellfun would have to call Matlab multiple times, but Matlab is not thread-safe. Even if you only call functions from the C-Mex libraries, like mxCreateNumericMatrix, a crash is guaranteed.
But even if a future version of Matlab is thread-safe, there is still the problem of defining a suitable number of threads: MathWorks decided to implement multi-threading for sum if it is applied to a vector with more than 88999 elements. This is a double-edged decision, because floating-point summation is not a stable operation and depends on the order of the operands (example: 1e17 + 1 - 1e17 returns 0, but 1e17 - 1e17 + 1 yields 1). In R2009a, the results a and b of
x = rand(1, 89000);
a = sum(x)
b = sum(x)
could differ randomly due to rounding errors. The sum was calculated in 2 threads, and the result depended on which thread finished first. This was fixed in later versions, but since sum became multi-threaded, the result for huge vectors depends on the number of cores used.
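The order dependence described above can be reproduced without any threads. The following is only a sketch: splitting the vector into two halves is an assumption used to imitate two partial sums, and it does not claim to show how sum is actually implemented.

```matlab
% Floating-point addition is not associative.
a = (1e17 + 1) - 1e17;   % the 1 is lost: 1e17 + 1 rounds back to 1e17
b = (1e17 - 1e17) + 1;   % the large terms cancel first, so the 1 survives
fprintf('a = %g, b = %g\n', a, b);

% Imitate a two-thread sum by adding two partial sums (assumed split point).
x    = rand(1, 89000);
half = floor(numel(x) / 2);
sOne = sum(x);
sTwo = sum(x(1:half)) + sum(x(half+1:end));

% The difference is tiny, but it is not necessarily zero.
fprintf('difference: %g\n', sOne - sTwo);
```

Whether the last difference is exactly zero depends on the data and on how sum groups its operands on the machine at hand.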
But back to the problem: sum has the perfect property that you can predict how much processing time it needs. MathWorks decided to set the limit at 89000 elements, because starting a thread is very expensive, so for short vectors the single-threaded version is faster. But what would you do in cellfun? How could the function decide how many threads to start? There is no way to estimate how much time the called function needs, or whether it needs the same time for the different cell elements. There are methods to control this dynamically, but they are expensive.
If you have a huge data set stored in a cell and want to apply a function that can be distributed to multiple threads, use the method suggested by Nimrod: run a parfor loop and check carefully whether it is faster than a single-threaded cellfun approach.
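A minimal sketch of such a timing check, assuming the Parallel Computing Toolbox is available (without it, parfor should simply run serially). The data and the function fcn are hypothetical stand-ins, and fcn must not have side effects.

```matlab
% Hypothetical data and workload, for illustration only.
data = num2cell(rand(1, 2000));        % 1-by-2000 cell of scalars
fcn  = @(v) sum(sin(v * (1:1000)));    % side-effect-free function

% Single-threaded reference.
tic;
serialOut = cellfun(fcn, data);
tSerial = toc;

% parfor version. The first run also pays for starting the pool.
parOut = zeros(size(data));
tic;
parfor k = 1:numel(data)
    parOut(k) = fcn(data{k});
end
tParallel = toc;

fprintf('cellfun: %.3f s, parfor: %.3f s\n', tSerial, tParallel);
```

Running the parfor part a second time gives a fairer comparison, because the pool is already open by then. For cheap functions the communication overhead can easily outweigh the gain.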



Nimrod on 6 Sep 2016
I found an easy solution.
You have to divide your original cell array into several cell arrays (stored in an outer cell array), let's say 32.
Then you iterate over these cells with parfor and use cell2mat to put everything back together.
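The steps above might look roughly like this. It is a sketch under assumptions: the data, the function fcn and the way the chunk boundaries are computed are made up for illustration, and fcn is assumed to return one scalar per cell and to have no side effects.

```matlab
% Hypothetical input, for illustration only.
data = num2cell(rand(1, 1000));    % 1-by-1000 cell of scalars
fcn  = @(v) v.^2;                  % side-effect-free function

% Split the cell array into nChunks smaller cell arrays.
nChunks = 32;
n       = numel(data);
edges   = round(linspace(0, n, nChunks + 1));   % chunk boundaries
chunks  = cell(1, nChunks);
for k = 1:nChunks
    chunks{k} = data(edges(k) + 1 : edges(k + 1));
end

% Process each chunk with an ordinary cellfun inside parfor.
results = cell(1, nChunks);
parfor k = 1:nChunks
    results{k} = cellfun(fcn, chunks{k});
end

% Put the pieces back together in their original order.
out = cell2mat(results);
```

Because each chunk is written to its own slot of results, the output order does not depend on which worker finishes first.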
