How can I modify this code to split the cells of a cell array?

I am working on the following code, which goes through a folder and reads the data files as tables:
for k = 1 : length(theFiles)
baseFileName = theFiles(k).name;
fullFileName = fullfile(myFolder, baseFileName);
t = readtable(fullFileName);
%output = t.output;
%fprintf(1, 'Now reading %s\n', fullFileName);
idxBkpt = find(diff([t.GDALT])<0);
split_indices = sort(1+idxBkpt); %beginnings of blocks
blk = diff([1 reshape(split_indices, 1, []) size(t,1)+1]);
splits = mat2cell(t, blk, size(t,2));
celldisp(splits);
end
Now, what I am struggling with is the following:
I want to create a matrix that in the first cell of each row it will store the name of the data file I am processing and in the rest of the cells of that row it will store the cell array 'splits'. Is there an efficient way to do that?
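
To make the splitting step in the loop above concrete, here is a small sketch on a made-up table. The GDALT values and the Sample column are invented for illustration; the assumption is that each block starts where GDALT drops.

```matlab
% Toy altitude column (made-up values); a new block starts wherever the value drops.
GDALT = [100; 200; 300; 150; 250; 120; 220; 320; 420];
Sample = (1:numel(GDALT)).';
t = table(GDALT, Sample);

idxBkpt = find(diff(t.GDALT) < 0);    % last row of every block except the final one
split_indices = 1 + idxBkpt;          % first row of every block after the first
blk = diff([1 reshape(split_indices, 1, []) size(t,1)+1]);   % number of rows per block
splits = mat2cell(t, blk, size(t,2)); % one sub-table per block

disp(blk)                             % row counts of the blocks
```

Here GDALT drops after rows 3 and 5, so blk is [3 2 4] and splits holds three sub-tables.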

 Accepted Answer

nfiles = length(theFiles);
results = cell(nfiles,2);
fullnames = fullfile({theFiles.folder}, {theFiles.name});
results(:,1) = fullnames(:);
for k = 1 : nfiles
fullFileName = fullnames{k};
t = readtable(fullFileName);
idxBkpt = find(diff([t.GDALT])<0);
split_indices = sort(1+idxBkpt); %beginnings of blocks
blk = diff([1 reshape(split_indices, 1, []) size(t,1)+1]);
splits = mat2cell(t, blk, size(t,2));
results{k,2} = splits;
end
So results will be a cell array that is nfiles x 2. results{k,1} will be filename #k. results{k,2} will be the cell array splits -- and you will need to index that cell array to get to the pieces.
You cannot just use a cell array with N + 1 columns because it appears that the number of splits for each file may be different.
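
As a rough illustration of that indexing, the sketch below builds a small stand-in for results (the file names and table contents are made up) and walks through every piece with results{k,2}{j}.

```matlab
% Stand-in for the results cell built by the loop above (made-up names and data).
results = { 'fileA.txt', {table((1:3).'); table((4:5).')} ; ...
            'fileB.txt', {table((6:9).')} };

for k = 1 : size(results, 1)
    pieces = results{k, 2};           % cell array of sub-tables for file k
    for j = 1 : numel(pieces)
        piece = pieces{j};            % same as results{k,2}{j}
        fprintf('%s: piece %d has %d rows\n', results{k,1}, j, size(piece, 1));
    end
end
```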

11 Comments

Thank you for your quick reply. You are right, the number of splits is different for each file. But isn't there some other way to get each piece into its own cell within a specific row?
Or, if that does not work for rows, is it the same for columns (i.e. putting the file names in the first row and, under each name, storing the corresponding cells of the cell array in that column)?
"I want to create a matrix that in the first cell of each row it will store the name of the data file I am processing and in the rest of the cells of that row it will store the cell array 'splits',"
An example of the desired outcome, according to what you explained:
{
filename1 file1_splits{1} file1_splits{2} file1_splits{3}
filename2 file2_splits{1} file2_splits{2}
filename3 file3_splits{1} file3_splits{2} file3_splits{3} file3_splits{4}
}
That is "first cell of each row it will store the name of the data file" -- that's column 1
"and in the rest of the cells of that row it will store the cell array 'splits'" -- that's the rest of the columns.
That's what you are asking for. But there is a problem: cell arrays cannot have variable numbers of columns for different rows. You need one of two approaches:
{
filename1 {file1_splits{1} file1_splits{2} file1_splits{3}}
filename2 {file2_splits{1} file2_splits{2}}
filename3 {file3_splits{1} file3_splits{2} file3_splits{3} file3_splits{4}}
}
which is what I programmed; OR
{
filename1 file1_splits{1} file1_splits{2} file1_splits{3} [] [] []
filename2 file2_splits{1} file2_splits{2} [] [] [] []
filename3 file3_splits{1} file3_splits{2} file3_splits{3} file3_splits{4} [] []
}
in which case each cell row is padded with [] (or similar) for the unused columns, with the number of columns being according to the maximum number of splits for any file.
The second possibility requires one of a small number of strategies:
  1. Figure out the maximum number of splits ahead of time by scanning each file, throwing away the content each time and just retaining knowledge of the maximum number of splits seen so far; then constructing a cell array with that number of columns and going back and re-reading the files; this requires re-reading and re-splitting each file
  2. Figure out the maximum number of splits ahead of time as you go, by reading the file and figuring out the splits, and keeping the unsplit tables and the knowledge of the maximum number of splits so far; then constructing a cell array with that number of columns and going back and splitting the saved table into the cell array. This requires re-splitting each file but not re-reading it
  3. Read and split the tables into a cell array as I already posted; afterwards, scan to find the maximum number of splits, and repack the cell of splits into columns. This needs no re-reading or re-splitting, and never has to grow an array dynamically
  4. Read and split the tables as you go. As you read in one and split it, if the number of splits it would need is greater than the maximum number of splits so far, then pad out the existing cell to the required number of columns; then either way, store into only the number of columns needed for the data, leaving the other columns for the row empty. The logic for this approach is relatively easy, but it does mean that you are not pre-allocating the cell array and need to grow it from time to time
The most efficient of those would be #3, which starts by reading the files and splitting in exactly the way I posted before, but then follows up with a counting phase (easy) and then a repacking phase.
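
For contrast with #3, strategy #4 (pad as you go) might be sketched as follows. The names and pieces variables are made-up stand-ins for the file names and for the splits of each file; in real use the pieces would come from readtable and mat2cell inside the loop.

```matlab
% Made-up stand-ins: three file names and the already-split pieces of each file.
names  = {'file1', 'file2', 'file3'};
pieces = { {1; 2; 3}, {4; 5}, {6; 7; 8; 9} };   % pieces{k} plays the role of splits

packed = cell(numel(names), 1);       % start with the filename column only
packed(:, 1) = names(:);
for k = 1 : numel(names)
    n = numel(pieces{k});
    if size(packed, 2) < n + 1
        packed{end, n + 1} = [];      % widen the cell array; new cells hold []
    end
    packed(k, 2 : n + 1) = reshape(pieces{k}, 1, []);   % fill only row k
end
```

The array is widened twice here (for file1 and again for file3), which is the occasional growth that #3 avoids.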
The time and effort you are putting into writing such an explanation to help me out is sincerely appreciated. I am speechless, and I am really thankful for your help; you have no idea how much this encourages me to continue learning MATLAB and not give up.
Allow me to ask about method 3: after finding the maximum number of splits, I tried using cell2mat to repack the cell array, but it gave me an error. What do you recommend using instead?
numsplits = cellfun(@numel, results(:,2));
maxsplits = max(numsplits);
packed = cellfun(nfile, 1+maxsplits);
packed(:,1) = results(:,1); %filenames
for K = 1 : nfile
packed(:,2:numsplits(K)+1) = results{K,2};
end
Thank you for the quick reply.
It is giving me this error....
@Walter Roberson, any advice on how to fix such an error?
numsplits = cellfun(@numel, results(:,2)); %number of splits per file
maxsplits = max(numsplits);
packed = cell(nfiles, 1+maxsplits); %unused cells stay []
packed(:,1) = results(:,1); %filenames
for K = 1 : nfiles
packed(K,2:numsplits(K)+1) = results{K,2}; %fill row K only
end
Thank you very very much for all the help, it worked perfectly now.
Hello again @Walter Roberson,
I have been trying to use the code you provided previously on another data set and I am getting the following error. Would you mind helping me figure out how to fix it?
Unable to perform assignment because the size of the left side is 8-by-23 and the size of the right side is 22-by-1.
Error in Ionosonde_DatA_2 (line 88)
packed(:,1:numsplits(K)+1) = DATA{K,1};
And this is the code I am implementing:
t1 = JRO(JRO.YEAR == Years(1), :);
doy=unique(tt.DayOfYear);
DATA = cell(length(doy),1);
for n = 1 : length(doy)
t2=t1(t1.DayOfYear == doy(n), :);
idxBkpt = find(diff([t2.GDALT])<0);
split_indices = sort(1+idxBkpt); %beginnings of blocks
blk = diff([1 reshape(split_indices, 1, []) size(t2,1)+1]);
splits = mat2cell(t2, blk, size(t2,2));
DATA{n,1} = splits;
end
numsplits = cellfun(@numel, DATA(:,1));
maxsplits = max(numsplits);
packed = cell(length(doy), 1+maxsplits);
for K = 1 : length(doy)
packed(:,1:numsplits(K)+1) = DATA{K,1};
end
The table t1 is attached.
Thank you so much in advance.
t1 = JRO(JRO.YEAR == Years(1), :);
doy = unique(t1.DayOfYear); %changed
DATA = cell(length(doy),1);
for n = 1 : length(doy)
t2=t1(t1.DayOfYear == doy(n), :);
idxBkpt = find(diff([t2.GDALT])<0);
split_indices = sort(1+idxBkpt); %beginnings of blocks
blk = diff([1 reshape(split_indices, 1, []) size(t2,1)+1]);
splits = mat2cell(t2, blk, size(t2,2));
DATA{n,1} = splits;
end
numsplits = cellfun(@numel, DATA(:,1));
maxsplits = max(numsplits);
packed = cell(length(doy), 1+maxsplits);
for K = 1 : length(doy)
packed(K,2:numsplits(K)+1) = DATA{K,1}; %changed
end
and remember to put the file names into packed(:,1)
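
In this day-of-year version there are no file names, so one option (an assumption about what the label column should hold) is to store the day of year in column 1. A self-contained sketch with made-up doy values and placeholder pieces:

```matlab
doy  = [12; 45; 300];                 % made-up day-of-year values
DATA = { {1; 2}; {3}; {4; 5; 6} };    % DATA{n,1} plays the role of splits for day n

numsplits = cellfun(@numel, DATA(:,1));
maxsplits = max(numsplits);
packed = cell(length(doy), 1+maxsplits);
packed(:,1) = num2cell(doy);          % label column: one day of year per row
for K = 1 : length(doy)
    packed(K, 2:numsplits(K)+1) = reshape(DATA{K,1}, 1, []);   % fill row K only
end
```

Rows with fewer pieces than maxsplits keep [] in their trailing cells.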

Release

R2021a
