Distributing arrays to workers for local processing?

How do I access parts of distributed data on the workers/labs?
I have a large time series and want to run some functions on smaller chunks of it. I already have a working parfor implementation.
I want to try it with distributed arrays/spmd, but I don't know how to access the local data once I distribute the array.
matlabpool start 100;
% size(myMat) is 800000 x 2
myMatdb= distributed(myMat');
spmd
chunk_of_data = myMatdb;
[out_of_chunk] = objFun(params, chunk_of_data);
end
This works, but all the labs/workers have the full data rather than a small chunk of it.
I would like to explore codistributed with the codistributor1d option to have more control over the distribution. Still, how do I tell each worker to operate only on its local copy and not on the total Composite?
For some strange reason, functions like getLocalPart, localPart, etc. aren't available in my MATLAB R2011b.

Accepted Answer

Jill Reese on 19 Sep 2012
Hello. If you are able to successfully open a matlabpool with your installation of R2011b, then you must have the Parallel Computing Toolbox. In that case, the getLocalPart function should also be available to you. What is the output from typing the following at the MATLAB command line:
which getLocalPart
Assuming that you can get the issue with getLocalPart sorted out (perhaps by calling technical support), this is how you would proceed with distributed arrays/spmd:
matlabpool open 100 % this will open 100 workers using your default configuration
% I assume that myMat was already loaded as a standard MATLAB array
size(myMat) % You've stated that myMat is 800000 x 2
% There are a lot of rows, so let's use codistributor1d to distribute the
% rows across all the workers in the pool. This must be done inside the
% spmd block because that's where codistributed arrays and codistributors live.
spmd
    % Create a scheme to distribute the first dimension of a matrix (its
    % rows) as evenly as possible across all the workers in the pool
    codist = codistributor1d(1);
    % Use the scheme to create distributed data
    myMatdb = codistributed(myMat, codist);
    % Each worker operates on its own data
    chunk_of_data = getLocalPart(myMatdb);
    [out_of_chunk] = objFun(params, chunk_of_data);
    % Create a new array from the local outputs. I assume that out_of_chunk
    % is the same size as chunk_of_data on each worker so that the
    % distribution scheme of myMatdb can be reused. getCodistributor returns
    % that scheme complete with its global size, which codistributed.build
    % expects.
    fullOutput = codistributed.build(out_of_chunk, getCodistributor(myMatdb));
end
% fullOutput and myMatdb can be used as distributed arrays outside of the spmd block
You can find more information here:
help getLocalPart
help codistributed.build
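To see that each worker really holds only its own rows, the same steps can be tried on a small array first. This is a minimal sketch, not a definitive implementation: it assumes a pool is already open, uses a 1000 x 2 random matrix as a stand-in for myMat, and replaces objFun with a simple doubling step (both are illustrative assumptions, not part of the original problem).

```matlab
% Assumes a pool is already open (e.g. via matlabpool on R2011b).
% myMat and the doubling step are hypothetical stand-ins for illustration.
myMat = rand(1000, 2);

spmd
    codist  = codistributor1d(1);             % distribute along rows
    myMatdb = codistributed(myMat, codist);   % split myMat across the workers
    chunk   = getLocalPart(myMatdb);          % only the rows this worker holds

    % Report how many rows each worker actually holds
    fprintf('Lab %d of %d holds %d rows\n', labindex, numlabs, size(chunk, 1));

    outChunk = 2 * chunk;                     % stand-in for objFun(params, chunk)

    % Reassemble the local results using the scheme of myMatdb
    fullOutput = codistributed.build(outChunk, getCodistributor(myMatdb));
end

result = gather(fullOutput);                  % collect the full array on the client
```

If the row counts printed by the workers add up to 1000 and result equals 2 * myMat, the distribution is behaving as intended and the real objFun can be swapped in.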
  1 Comment
nah on 20 Sep 2012
Thanks Jill. Yes, a which getLocalPart outputs:
which getLocalPart
/storage/shares/matlabr2011b/toolbox/distcomp/parallel/@codistributed/getLocalPart.m % codistributed method
For some weird reason, it is not in the PATH and hence the call is not recognized as a command. Will have to add it manually.
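One possible reading of the "% codistributed method" note in that output is that getLocalPart is a class method, so MATLAB would resolve it only when its argument is a codistributed array, which exists inside an spmd block. The sketch below is a hedged way to check this, assuming the Parallel Computing Toolbox is installed; the 4 x 4 test matrix is just an illustrative stand-in.

```matlab
% Check whether getLocalPart is listed among the codistributed methods
hasMethod = ismember('getLocalPart', methods('codistributed'));
fprintf('getLocalPart is a codistributed method: %d\n', hasMethod);

spmd
    D = codistributed(magic(4));          % codistributed inside spmd
    localPiece = getLocalPart(D);         % dispatches to the class method
    totalElems = gplus(numel(localPiece)); % sum of local sizes over all labs
end

% totalElems is a Composite on the client; every lab holds the same sum
disp(totalElems{1});
```

If the call succeeds here but fails at the command line, the likely cause is that it was called on something other than a codistributed array rather than a missing PATH entry.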
