How I can modify this code to split the cells of a cell array?

Question

MA on 19 Sep 2021

https://ch.mathworks.com/matlabcentral/answers/1456284-how-i-can-modify-this-code-to-split-the-cells-of-a-cell-array

Commented: Walter Roberson on 6 Oct 2022

I am working on the following code, which goes through a folder and reads the data files as tables:

for k = 1 : length(theFiles)
  baseFileName = theFiles(k).name;
  fullFileName = fullfile(myFolder, baseFileName);

  t = readtable(fullFileName);
  %output = t.output;
  %fprintf(1, 'Now reading %s\n', fullFileName);

  idxBkpt = find(diff([t.GDALT])<0);
  split_indices = sort(1+idxBkpt);  %beginnings of blocks
  blk = diff([1 reshape(split_indices, 1, []) size(t,1)+1]);
  splits = mat2cell(t, blk, size(t,2));
  celldisp(splits);
end

What I am struggling with is the following:

I want to create a matrix that in the first cell of each row it will store the name of the data file I am processing and in the rest of the cells of that row it will store the cell array 'splits'. Is there an efficient way to do that?


Answer 1

Walter Roberson on 19 Sep 2021

https://ch.mathworks.com/matlabcentral/answers/1456284-how-i-can-modify-this-code-to-split-the-cells-of-a-cell-array#answer_790529


nfiles = length(theFiles);
results = cell(nfiles,2);
fullnames = fullfile({theFiles.folder}, {theFiles.name});
results(:,1) = fullnames(:);
for k = 1 : nfiles
  fullFileName = fullnames{k};  
  t = readtable(fullFileName);
  idxBkpt = find(diff([t.GDALT])<0);
  split_indices = sort(1+idxBkpt);  %beginnings of blocks
  blk = diff([1 reshape(split_indices, 1, []) size(t,1)+1]);
  splits = mat2cell(t, blk, size(t,2));
  results{k,2} = splits;
end

So results will be a cell array that is nfiles x 2. results{k,1} will be filename #k. results{k,2} will be the cell array splits -- and you will need to index that cell array to get to the pieces.

You cannot just use a cell array with N + 1 columns because it appears that the number of splits for each file may be different.
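To make the nested indexing concrete, here is a minimal sketch on a toy table. The table values, the VALUE column, and the name 'file1.txt' are made up for illustration; only the GDALT column name and the splitting steps come from the code above.

```matlab
% Toy stand-in for one data file (hypothetical values): GDALT rises, then resets.
t = table([100; 200; 300; 100; 200; 150; 250; 350], (1:8).', ...
          'VariableNames', {'GDALT', 'VALUE'});

idxBkpt = find(diff(t.GDALT) < 0);       % rows after which GDALT drops
split_indices = 1 + idxBkpt;             % first row of each later block
blk = diff([1 reshape(split_indices, 1, []) size(t,1)+1]);   % rows per block
splits = mat2cell(t, blk, size(t,2));    % column cell, one table per block

results = cell(1, 2);
results{1,1} = 'file1.txt';              % hypothetical file name
results{1,2} = splits;

% Two levels of braces: first pick the file, then pick the block.
secondBlock = results{1,2}{2};           % table holding rows 4 and 5 of t
nBlocks = numel(results{1,2});           % number of blocks found in this file
```

The first brace index selects the row for a file and the second selects one block inside that file's cell of splits, which is why the number of blocks is free to differ from file to file.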

11 Comments

Walter Roberson on 19 Sep 2021


"I want to create a matrix that in the first cell of each row it will store the name of the data file I am processing and in the rest of the cells of that row it will store the cell array 'splits'"

An example of the desired outcome, according to what you explained:

{
 filename1 file1_splits{1} file1_splits{2} file1_splits{3}
 filename2 file2_splits{1} file2_splits{2}
 filename3 file3_splits{1} file3_splits{2} file3_splits{3} file3_splits{4}
 }

That is "first cell of each row it will store the name of the data file" -- that's column 1

"and in the rest of the cells of that row it will store the cell array 'splits'" -- that's the rest of the columns.

That's what you are asking for. But there is a problem: cell arrays cannot have variable numbers of columns for different rows. You need one of two approaches:

{
 filename1 {file1_splits{1} file1_splits{2} file1_splits{3}}
 filename2 {file2_splits{1} file2_splits{2}}
 filename3 {file3_splits{1} file3_splits{2} file3_splits{3} file3_splits{4}}
 }

which is what I programmed; OR

{
 filename1 file1_splits{1} file1_splits{2} file1_splits{3}              []
 filename2 file2_splits{1} file2_splits{2}              []              []
 filename3 file3_splits{1} file3_splits{2} file3_splits{3} file3_splits{4}
 }

in which case each cell row is padded with [] (or similar) for the unused columns, with the number of columns being according to the maximum number of splits for any file.

The second possibility requires one of a small number of strategies:

1. Figure out the maximum number of splits ahead of time by scanning each file, throwing away the content each time and retaining only the maximum number of splits seen so far; then construct a cell array with that number of columns and go back and re-read the files. This requires re-reading and re-splitting each file.
2. Figure out the maximum number of splits as you go, by reading each file and working out the splits, keeping the unsplit tables and the maximum number of splits so far; then construct a cell array with that number of columns and go back and split the saved tables into the cell array. This requires re-splitting each file but not re-reading it.
3. Read and split the tables into a cell array like I already posted; afterwards, scan to find the maximum number of splits, and repack the cells of splits into columns. This does not re-read or re-split anything, and never needs to grow an array dynamically.
4. Read and split the tables as you go. As you read one in and split it, if the number of splits it needs is greater than the maximum number of splits so far, pad out the existing cell array to the required number of columns; then, either way, store into only the number of columns needed for the data, leaving the other columns of the row empty. The logic for this approach is relatively easy, but it means that you are not pre-allocating the cell array and need to grow it from time to time.

The most efficient one of those would be #3, which starts by reading the files and splitting in exactly the way I posted before, but then follows up with a counting phase (easy) and then a repacking phase.
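The counting and repacking phases can be sketched roughly as follows. This is only an illustration: the results cell here is a hand-made stand-in for the nfiles x 2 output of the reading loop, with hypothetical file names and plain numbers in place of the split tables.

```matlab
% Stand-in for the nfiles x 2 cell built by the reading loop (hypothetical
% contents; numbers take the place of the split tables).
results = { 'file1.txt', {11; 12; 13}
            'file2.txt', {21; 22}
            'file3.txt', {31; 32; 33; 34} };
nfiles = size(results, 1);

% Counting phase: splits per file, and the largest count over all files.
numsplits = cellfun(@numel, results(:,2));
maxsplits = max(numsplits);

% Repacking phase: allocate once, then copy each file's splits into
% columns 2 .. numsplits(k)+1 of its own row.
packed = cell(nfiles, 1 + maxsplits);
packed(:,1) = results(:,1);
for k = 1 : nfiles
    packed(k, 2:numsplits(k)+1) = reshape(results{k,2}, 1, []);
end
```

Cells that are never assigned keep the [] that cell() creates, so the padding needs no extra step. The reshape to a row is there only to make the orientation of the right-hand side explicit.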

MA on 6 Oct 2022


Hello again @Walter Roberson,

I have been trying to use the code you provided previously on another data set, and I am getting the following error. Could you please help me figure out how to fix it?

Unable to perform assignment because the size of the left side is 8-by-23 and the size of the right side is 22-by-1.
Error in Ionosonde_DatA_2 (line 88)
   packed(:,1:numsplits(K)+1) = DATA{K,1};

This is the code I am running:

   t1 = JRO(JRO.YEAR == Years(1), :);
doy=unique(tt.DayOfYear);
DATA = cell(length(doy),1);
 for n = 1 : length(doy)
    t2=t1(t1.DayOfYear == doy(n), :);  
      idxBkpt = find(diff([t2.GDALT])<0);
      split_indices = sort(1+idxBkpt);  %beginnings of blocks
      blk = diff([1 reshape(split_indices, 1, []) size(t2,1)+1]);
      splits = mat2cell(t2, blk, size(t2,2));
      DATA{n,1} = splits;
 end
numsplits = cellfun(@numel, DATA(:,1));
maxsplits = max(numsplits);
packed = cell(length(doy), 1+maxsplits);
for K = 1 : length(doy)
   packed(:,1:numsplits(K)+1) = DATA{K,1};
end

The table t1 is attached.

Thank you so much in advance.

Walter Roberson on 6 Oct 2022


t1 = JRO(JRO.YEAR == Years(1), :);
doy = unique(t1.DayOfYear);    %changed
DATA = cell(length(doy),1);
for n = 1 : length(doy)
  t2 = t1(t1.DayOfYear == doy(n), :);
  idxBkpt = find(diff([t2.GDALT])<0);
  split_indices = sort(1+idxBkpt);  %beginnings of blocks
  blk = diff([1 reshape(split_indices, 1, []) size(t2,1)+1]);
  splits = mat2cell(t2, blk, size(t2,2));
  DATA{n,1} = splits;
end
numsplits = cellfun(@numel, DATA(:,1));
maxsplits = max(numsplits);
packed = cell(length(doy), 1+maxsplits);
for K = 1 : length(doy)
  packed(K,2:numsplits(K)+1) = DATA{K,1};   %changed
end

Also remember to put the file names into packed(:,1).
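For anyone hitting the same message, a small sketch of why the original assignment failed may help. The 8-by-23 and 22-by-1 sizes are taken from the error text; the empty cells, the row number 3, and the doy values are placeholders. Using the day-of-year values as row labels is an assumption, since this second data set has no file names.

```matlab
packed  = cell(8, 23);     % sizes from the error message (assumes maxsplits is 22)
splitsK = cell(22, 1);     % stand-in for DATA{K,1}
n = numel(splitsK);

% Original left side: every row and columns 1..n+1, an 8-by-23 block.
lhsAll = size(packed(:, 1:n+1));

% Corrected left side: a single row and columns 2..n+1, a 1-by-22 block,
% which has the same number of elements as the right side.
lhsRow = size(packed(3, 2:n+1));
packed(3, 2:n+1) = reshape(splitsK, 1, []);

% Row labels for column 1 (assumption: one day-of-year value per row).
doy = (1:8).';             % hypothetical values
packed(:,1) = num2cell(doy);
```

Column 1 is reserved for the label, which is why the data columns start at 2 in the corrected loop.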
