How I can modify this code to split the cells of a cell array?
I am working on the following code, which goes through a folder and reads the data files as tables:
for k = 1 : length(theFiles)
baseFileName = theFiles(k).name;
fullFileName = fullfile(myFolder, baseFileName);
t = readtable(fullFileName);
%output = t.output;
%fprintf(1, 'Now reading %s\n', fullFileName);
idxBkpt = find(diff([t.GDALT])<0);
split_indices = sort(1+idxBkpt); %beginnings of blocks
blk = diff([1 reshape(split_indices, 1, []) size(t,1)+1]);
splits = mat2cell(t, blk, size(t,2));
celldisp(splits);
end
What I am struggling with is the following:
I want to create a matrix that in the first cell of each row it will store the name of the data file I am processing, and in the rest of the cells of that row it will store the cell array 'splits'. Is there an efficient way to do that?
Accepted Answer
Walter Roberson
on 19 Sep 2021
nfiles = length(theFiles);
results = cell(nfiles,2);
fullnames = fullfile({theFiles.folder}, {theFiles.name});
results(:,1) = fullnames(:);
for k = 1 : nfiles
fullFileName = fullnames{k};
t = readtable(fullFileName);
idxBkpt = find(diff([t.GDALT])<0);
split_indices = sort(1+idxBkpt); %beginnings of blocks
blk = diff([1 reshape(split_indices, 1, []) size(t,1)+1]);
splits = mat2cell(t, blk, size(t,2));
results{k,2} = splits;
end
So results will be a cell array that is nfiles x 2. results{k,1} will be filename #k. results{k,2} will be the cell array splits -- and you will need to index that cell array to get to the pieces.
You cannot just use a cell array with N + 1 columns because it appears that the number of splits for each file may be different.
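As a rough illustration of the nested indexing, here is a minimal sketch on a toy table. The table values, the VALUE column, and the file name 'file1.txt' are made up for the example; only the splitting lines come from the code above.

```matlab
% Toy table standing in for one data file (values are made up for illustration).
t = table([100; 200; 300; 150; 250; 120], (1:6).', ...
    'VariableNames', {'GDALT', 'VALUE'});

% Same splitting logic as above: a new block starts wherever GDALT drops.
idxBkpt = find(diff(t.GDALT) < 0);          % rows 3 and 5 are followed by a drop
split_indices = sort(1 + idxBkpt);          % blocks begin at rows 4 and 6
blk = diff([1 reshape(split_indices, 1, []) size(t,1)+1]);   % block heights
splits = mat2cell(t, blk, size(t,2));       % column cell, one table per block

% Store it the same way as results{k,2}; 'file1.txt' is a hypothetical name.
results = cell(1, 2);
results{1,1} = 'file1.txt';
results{1,2} = splits;

% Two levels of indexing: first the file row, then the piece within that file.
secondBlock = results{1,2}{2};              % second block of file 1 (a table)
```

Here results{1,2} returns the whole cell of pieces for the first file, and the trailing {2} picks one table out of it.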
11 Comments
Thank you for your quick reply. You are right, the number of splits is different for each file. But isn't there any other way to get each piece into its own cell within a specific row?
Or, if that does not work for rows, is it the same with columns (i.e., having the file names in the first row and, under each name, a column that stores the corresponding cells of the cell array)?
Walter Roberson
on 19 Sep 2021
I want to create a matrix that in the first cell of each row it will store the name of the data file i am processing and in the rest of the cells of that row it will store the cell array 'splits',
An example of that desired outcome according to what you explained
{
filename1 file1_splits{1} file1_splits{2} file1_splits{3}
filename2 file2_splits{1} file2_splits{2}
filename3 file3_splits{1} file3_splits{2} file3_splits{3} file3_splits{4}
}
That is "first cell of each row it will store the name of the data file" -- that's column 1
"and in the rest of the cells of that row it will store the cell array 'splits'" -- that's the rest of the columns.
That's what you are asking for. But there is a problem: cell arrays cannot have variable numbers of columns for different rows. You need one of two approaches:
{
filename1 {file1_splits{1} file1_splits{2} file1_splits{3}}
filename2 {file2_splits{1} file2_splits{2}}
filename3 {file3_splits{1} file3_splits{2} file3_splits{3} file3_splits{4}}
}
which is what I programmed; OR
{
filename1 file1_splits{1} file1_splits{2} file1_splits{3} [] [] []
filename2 file2_splits{1} file2_splits{2} [] [] [] []
filename3 file3_splits{1} file3_splits{2} file3_splits{3} file3_splits{4} [] []
}
in which case each cell row is padded with [] (or similar) for the unused columns, with the number of columns being according to the maximum number of splits for any file.
The second possibility requires one of a small number of strategies:
1. Figure out the maximum number of splits ahead of time by scanning each file, throwing away the content each time and just retaining knowledge of the maximum number of splits seen so far; then construct a cell array with that number of columns and go back and re-read the files. This requires re-reading and re-splitting each file.
2. Figure out the maximum number of splits as you go, by reading each file and figuring out the splits, and keeping the unsplit tables and the knowledge of the maximum number of splits so far; then construct a cell array with that number of columns and go back and split the saved tables into the cell array. This requires re-splitting each file but not re-reading it.
3. Read and split the tables into a cell array like I already posted; afterwards, scan to find the maximum number of splits, and repack the cell of splits into columns. This does not require re-reading or re-splitting, and never needs to grow an array dynamically.
4. Read and split the tables as you go. As you read in one and split it, if the number of splits it would need is greater than the maximum number of splits so far, then pad out the existing cell to the required number of columns; then either way, store into only the number of columns needed for the data, leaving the other columns for the row empty. The logic for this approach is relatively easy, but it does mean that you are not pre-allocating the cell array and need to grow it from time to time.
The most efficient one of those would be #3... which starts by reading the files and splitting in exactly the way I posted before, but then follows-up with a counting phase (easy) and then a repacking phase.
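For comparison, the last strategy in the list (growing the cell array as you go) might look roughly like the sketch below. The names and the allSplits data are hypothetical stand-ins; in real use the line that fetches splits would be replaced by the read-and-split code for file k.

```matlab
% Hypothetical stand-ins: made-up file names, and one 'splits' cell per file
% (plain numbers are used here in place of tables).
names = {'a.txt'; 'b.txt'; 'c.txt'};
allSplits = {{1; 2; 3}, {4; 5}, {6; 7; 8; 9}};

packed = cell(numel(names), 1);      % starts with only the name column
packed(:,1) = names;
for k = 1 : numel(names)
    splits = allSplits{k};           % in real use: read and split file k here
    n = numel(splits);
    if n + 1 > size(packed, 2)
        packed{end, n + 1} = [];     % grow to n+1 columns; new cells hold []
    end
    packed(k, 2:n+1) = splits;       % fill only the columns this row needs
end
```

With this toy data packed ends up 3-by-5, and the unused cells in the shorter rows stay as []. The trade-off is the one described above: the logic is short, but the array is resized whenever a file has more splits than any file before it.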
MA
on 19 Sep 2021
The time and effort you are putting into writing such an explanation to help me out is sincerely appreciated. I am speechless, and I am really thankful for your help. You have no idea how much this encourages me to continue learning MATLAB and not give up.
Allow me to ask about method 3. After finding the maximum number of splits, I tried using cell2mat to repack the cell array, but it gave me an error. What do you recommend using instead?
Walter Roberson
on 19 Sep 2021
numsplits = cellfun(@numel, results(:,2));
maxsplits = max(numsplits);
packed = cellfun(nfile, 1+maxsplits);
packed(:,1) = results(:,1); %filenames
for K = 1 : nfile
packed(:,2:numsplits(K)+1) = results{K,2};
end
Thank you for the quick reply.
It is giving me this error...

MA
on 19 Sep 2021
@Walter Roberson any advice how to fix such an error?
Walter Roberson
on 19 Sep 2021
numsplits = cellfun(@numel, results(:,2)); %number of pieces per file
maxsplits = max(numsplits);
packed = cell(nfiles, 1+maxsplits); %column 1 is for the filenames
packed(:,1) = results(:,1); %filenames
for K = 1 : nfiles
packed(K,2:numsplits(K)+1) = results{K,2}; %fill row K only
end
Walter Roberson
on 20 Sep 2021
The code is intended to go after the reading loop, when results() has been fully populated. That is, it goes after the code I posted in https://www.mathworks.com/matlabcentral/answers/1456284-how-i-can-modify-this-code-to-read-files-in-a-folder-into-tables#answer_790529
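Once a padded array like this exists, reading the pieces for one row back out could be sketched as follows. The packed array here is a toy with made-up names and numbers in place of tables.

```matlab
% Toy 'packed' array: column 1 holds made-up names, the remaining columns
% hold the pieces, padded with [] where a row has fewer pieces.
packed = {'a.txt', 10, 20, 30, []; ...
          'b.txt', 40, 50, [], []};

K = 2;                                     % row to inspect
row = packed(K, 2:end);                    % all piece slots for that row
pieces = row(~cellfun(@isempty, row));     % drop the [] padding
```

Note that this filter cannot tell padding apart from a piece that is itself empty, so it assumes every real piece has at least one row.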
MA
on 20 Sep 2021
Thank you very very much for all the help, it worked perfectly now.
MA
on 6 Oct 2022
I have been trying to use the code you provided previously on another data set and I am getting the following error. Would you mind helping me figure out how to fix it?
Unable to perform assignment because the size of the left side is 8-by-23 and the size of the right side is 22-by-1.
Error in Ionosonde_DatA_2 (line 88)
packed(:,1:numsplits(K)+1) = DATA{K,1};
And this is the code I am using:
t1 = JRO(JRO.YEAR == Years(1), :);
doy=unique(tt.DayOfYear);
DATA = cell(length(doy),1);
for n = 1 : length(doy)
t2=t1(t1.DayOfYear == doy(n), :);
idxBkpt = find(diff([t2.GDALT])<0);
split_indices = sort(1+idxBkpt); %beginnings of blocks
blk = diff([1 reshape(split_indices, 1, []) size(t2,1)+1]);
splits = mat2cell(t2, blk, size(t2,2));
DATA{n,1} = splits;
end
numsplits = cellfun(@numel, DATA(:,1));
maxsplits = max(numsplits);
packed = cell(length(doy), 1+maxsplits);
for K = 1 : length(doy)
packed(:,1:numsplits(K)+1) = DATA{K,1};
end
The table t1 is attached.
Thank you so much in advance.
Walter Roberson
on 6 Oct 2022
t1 = JRO(JRO.YEAR == Years(1), :);
doy = unique(t1.DayOfYear); %changed
DATA = cell(length(doy),1);
for n = 1 : length(doy)
t2=t1(t1.DayOfYear == doy(n), :);
idxBkpt = find(diff([t2.GDALT])<0);
split_indices = sort(1+idxBkpt); %beginnings of blocks
blk = diff([1 reshape(split_indices, 1, []) size(t2,1)+1]);
splits = mat2cell(t2, blk, size(t2,2));
DATA{n,1} = splits;
end
numsplits = cellfun(@numel, DATA(:,1));
maxsplits = max(numsplits);
packed = cell(length(doy), 1+maxsplits);
for K = 1 : length(doy)
packed(K,2:numsplits(K)+1) = DATA{K,1}; %changed
end
And remember to put the file names into packed(:,1).
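In the day-of-year variant there are no file names, so one option is to label each row with its day-of-year value instead. This is an assumption about what is wanted, not something stated in the thread, and the doy values below are made up.

```matlab
% Made-up day-of-year values standing in for unique(t1.DayOfYear).
doy = [12; 13; 15];

packed = cell(numel(doy), 3);     % column 1 is reserved for the row labels
packed(:,1) = num2cell(doy);      % one label per row (assumed labeling scheme)
```

num2cell wraps each numeric value in its own cell, which is what a paren-indexed assignment into a cell column expects.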