Are accumarray/cellfun/arrayfun/etc multithreaded?

Question

Eric Sampson on 19 Apr 2013


https://ch.mathworks.com/matlabcentral/answers/72798-are-accumarray-cellfun-arrayfun-etc-multithreaded

Commented: Jan on 22 Jan 2018

As per the subject line, anyone know? If not, could this be done by TMW without a lot of effort?


Answer 1

Jan on 21 Apr 2013


Edited: Jan on 21 Apr 2013


It is not trivial to add multi-threading to e.g. cellfun. Its source code was shipped with older Matlab versions, and multi-threading for the built-in commands (those specified as strings, which are unfortunately marked as "Backward Compatibility" only, although they are very efficient) can be added easily. But for functions that are not built in, the called functions could have side effects, e.g. persistent counters, output to files, etc. Distributing the job across different threads would then cause serious errors. Example:

x = {1,2; 3,4}
cellfun(@(c) fprintf('%d ', c), x)

With multi-threading, the result is no longer well-defined.

How could cellfun (etc.) decide whether there are dependencies between the function calls? As long as independence is not guaranteed, automatic multi-threading would be a bug.
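A side effect does not have to be visible output. The following is a minimal sketch (not part of the original answer) that uses the global random number generator as hidden shared state; the variable names are only illustrative:

```matlab
x = {1, 2; 3, 4};

% Draw four reference values from a freshly seeded generator.
rng(0);
r = rand(1, 4);

% Re-seed and let each callback draw one value. Every call advances the
% global generator state, so which element receives which value depends
% on the order in which cellfun makes the calls.
rng(0);
out = cellfun(@(c) rand, x);

% The same four values come back, but their placement in out is only
% fixed as long as the calls run one after another in a fixed order.
sameValues = isequal(sort(out(:)), sort(r(:)));
```

If the calls were distributed over threads, the mapping between cell elements and drawn values could change from run to run, even with the same seed.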


Alexander Laut on 22 Jan 2018

I believe I understand your point, given your example of printing the contents of the cell. However, I am curious whether your concern is that the outputs would appear in some arbitrary order, or that they would conflict in a critical way (specifically for your example).

If backwards compatibility is the main issue, then so be it, but I think it would still be a nice feature to include an optional flag that allows it to run in parallel, even if only at your own risk. The 'uniformoutput',false flag already seems like an option that may have been added to deal with unpredictable functions.

Jan on 22 Jan 2018


@Alexander Laut: The "uniformoutput" flag is not meant for unpredictable functions; it is for cases where the outputs of the function cannot be concatenated into an array.

MathWorks will surely not introduce a flag that is to be used at your own risk only. But you can do this yourself easily if you have an old version of Matlab, e.g. R13, which included many C sources such as cellfun.c.

A multi-threaded cellfun would have to call Matlab multiple times, but Matlab is not thread-safe. Even if you only call functions from the C-mex libraries, like mxCreateNumericMatrix, a crash is guaranteed.

But even if a future version of Matlab is thread-safe, there is still the problem of defining a suitable number of threads: MathWorks decided to implement multi-threading for sum if it is applied to a vector with more than 88999 elements. This is a double-edged decision, because summation is not a stable operation and depends on the order of the operands (example: 1e17 + 1 - 1e17 returns 0, but 1e17 - 1e17 + 1 yields 1). In R2009a the results a and b of

x = rand(1, 89000);
a = sum(x)
b = sum(x)

could differ randomly due to rounding errors. The sum was calculated in 2 threads and the result depended on which thread finished first. This was fixed in later versions, but ever since sum became multi-threaded, the result of summing huge vectors depends on the number of cores used.
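The order dependence can be reproduced without any threads. This is a small sketch, assuming that splitting the vector into two halves is a fair stand-in for what two threads would do; it is not how sum is implemented internally:

```matlab
% Floating-point addition is not associative.
a = (1e17 + 1) - 1e17;   % 0: the 1 is below the spacing of doubles near 1e17
b = (1e17 - 1e17) + 1;   % 1

% Summing in one pass versus summing two halves and adding the partial sums.
x = rand(1, 89000);
half = numel(x) / 2;
sAll   = sum(x);
sSplit = sum(x(1:half)) + sum(x(half+1:end));

% The relative difference is tiny, but it need not be exactly zero.
relDiff = abs(sAll - sSplit) / abs(sAll);
```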

But back to the problem: sum has the perfect property that you can predict how much processing time it needs. MathWorks decided to set the limit at 89000 because starting a thread is very expensive, so for short vectors the single-threaded version is faster. But what would you do in cellfun? How could the function decide how many threads to start? There is no way to estimate how much time the called function needs, or whether it needs the same time for the different cell elements. There are methods to control this dynamically, but they are expensive.
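To get a feeling for how uneven the per-element cost can be, one can time the callback for each cell separately. The sketch below assumes a hypothetical callback (svd) and hypothetical inputs of different sizes; the actual numbers depend on the machine:

```matlab
% Hypothetical inputs: matrices of very different sizes in one cell array.
data = {rand(20), rand(200), rand(60)};

% Hypothetical callback whose run time depends strongly on the input size.
work = @(m) svd(m);

% Measure the run time of the callback for every cell element.
t = cellfun(@(m) timeit(@() work(m)), data);

% Ratio between the slowest and the fastest element.
ratio = max(t) / min(t);
```

A scheduler inside cellfun would not have this information in advance, and measuring it costs time as well.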

If you have a huge data set stored in a cell and want to apply a function that can be distributed to multiple threads, use the method suggested by Nimrod: run a parfor loop and check carefully whether it is faster than a single-threaded cellfun approach.


Answer 2

Nimrod on 6 Sep 2016


I found an easy solution.

You have to divide your original cell array into several cell arrays (stored in a cell array), let's say 32.

Then you iterate over these cells with parfor and use cell2mat to put everything back together.
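A minimal sketch of this approach could look as follows. The input data and the function f are hypothetical placeholders, f is assumed to have no side effects and to return a scalar, and parfor needs the Parallel Computing Toolbox to actually run in parallel (without it the loop runs serially):

```matlab
% Hypothetical input: a 1-by-1000 cell array of scalars.
data = num2cell(rand(1, 1000));

% Hypothetical per-element function without side effects.
f = @(v) v.^2;

% Split the cell array into 32 consecutive chunks of roughly equal size.
nChunks = 32;
edges   = round(linspace(0, numel(data), nChunks + 1));
chunks  = cell(1, nChunks);
for k = 1:nChunks
    chunks{k} = data(edges(k) + 1 : edges(k + 1));
end

% Process each chunk independently; each iteration runs a plain cellfun.
results = cell(1, nChunks);
parfor k = 1:nChunks
    results{k} = cellfun(f, chunks{k});
end

% Concatenate the per-chunk row vectors back into one result.
out = cell2mat(results);
```

Whether this beats a single cellfun call depends on how expensive f is compared to the overhead of sending the chunks to the workers, so timing both variants is advisable.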
