Delete Empty Rows in a Cell Array

Hello. Here's what I am trying to do.
1. Read in data from Excel in a cell array.
2. Delete the empty rows (i.e., rows with no data).
3. Write the array to a sheet in Excel.
So far, my code deletes the first empty row but then goes no farther.
Here is what I have based on https://stackoverflow.com/questions/31818057/deleting-empty-rows-in-a-cell-array:
[~, ~, DeleteEmptyWilliams]=xlsread('StimuliInExcel.xlsx', 'WilliamsList')
for m = 1:124
NewWilliams=DeleteEmptyWilliams(~all(cellfun('isempty',DeleteEmptyWilliams(:,1:8)), 2),:)
end
xlswrite('StimuliInExcel.xlsx',NewWilliams,'WilliamsListNoEmpty')
I have also tried the following code by adding (m,1:8), but it doesn't work either---instead, I get this error: "Index exceeds matrix dimensions". I expect this error occurred because I am deleting rows as I go, so the array's dimensions change.
[~, ~, DeleteEmptyWilliams]=xlsread('StimuliInExcel.xlsx', 'WilliamsList')
for m = 1:124
NewWilliams=DeleteEmptyWilliams(~all(cellfun('isempty',DeleteEmptyWilliams(m,1:8)), 2),:)
end
xlswrite('StimuliInExcel.xlsx',NewWilliams,'WilliamsListNoEmpty')
I tried the following code based on the suggestion at https://www.mathworks.com/matlabcentral/newsreader/view_thread/164617, but that did not work. I get this error: "Input array is empty".
[~, ~, DeleteEmptyWilliams]=xlsread('StimuliInExcel.xlsx', 'WilliamsList')
for m = 1:124
NewWilliams=DeleteEmptyWilliams(all(cellfun(@isempty,DeleteEmptyWilliams(:,1:8)), 2),:)
end
xlswrite('StimuliInExcel.xlsx',NewWilliams,'WilliamsListNoEmpty')
I would appreciate any help you can provide.

 Accepted Answer

Image Analyst
Image Analyst on 30 Jun 2017
Edited: Image Analyst on 30 Jun 2017
Why not just use readtable() to read in the file, then use all() and ismissing() to remove totally empty rows?
Attach a workbook with the paperclip icon if you want a demo.

3 Comments

Hello. I am attaching my code and the Excel file. My "delete empty row" attempts are commented out near the bottom.
Try this:
t = readtable('StimuliInExcel.xls', 'Sheet', 'WilliamsList')
% Extract columns 3-8
t = t(:, 3:8)
% Find out which entries are missing
notThere = ismissing(t)
% Find out what rows have every column as missing
badRows = all(notThere, 2)
% Extract only good rows
t = t(~badRows, :)
When I use this code, it works perfectly. I really appreciate your help!
I added Name-Value pair arguments to readtable() and writetable() so that the first row of my data would not be read or written as variable names.
The entire code looks like this:
t=readtable('StimuliInExcel.xls', 'Sheet', 'WilliamsList', 'ReadVariableNames', false)
notThere=ismissing(t)
badrows=all(notThere,2)
t=t(~badrows,:)
writetable(t, 'StimuliInExcel.xls', 'Sheet', 'WilliamsListNoEmpty', 'WriteVariableNames', false)

Sign in to comment.

More Answers (1)

dpb
dpb on 26 Jun 2017
Edited: dpb on 27 Jun 2017
[~,~,raw]=xlsread('StimuliInExcel.xlsx', 'WilliamsList');
raw=raw(~all(cellfun(isempty,raw),2),:);
xlswrite('StimuliInExcel.xlsx',raw,'WilliamsListNoEmpty')

14 Comments

Before I try this, can you tell me where x is coming from?
Jan
Jan on 26 Jun 2017
Edited: Jan on 26 Jun 2017
I assume
raw = raw(cellfun(@(x),~all(isempty(x),2),raw),:);
will not work. The anonymous functions gets each cell element as x. The "~all" belongs to the outside of cellfun. Or did I miss a point?
raw = raw(~all(cellfun('isempty',raw(:, 1:8)), 2), :);
This means, that your first approach looks fine already, only the "for m" loop is not required.
Anonymous functions are slow in cellfun.
The "x" is the placeholder for the anonymous function here. "x" is set iteratively to all cell elements.
Bad typing at the keyboard here...don't need the anonymous function, was overthinking it...fixed up the Answer. The ~all() goes around the result of the cellfun to make sure the whole row is empty first, then the negation of that is the row index vector to keep.
Thank you for getting back to me! I tried what you suggested. When I type this in
[~, ~, raw]=xlsread('StimuliInExcel.xlsx', 'WilliamsList')
raw=raw(~all(cellfun('isempty',raw(:,1:8)),2),:)
xlswrite('StimuliInExcel.xlsx',raw,'WilliamsListNoEmpty')
it's not working. The result is just copying what's in 'WilliamsList' to the sheet called 'WilliamsListNoEmpty'. Am I missing something?
Perhaps the cells aren't all empty?
Take a smaller subsection of the raw array at the command line that you can see visually and test...
~all(cellfun(@isempty,raw(1:5,1:8)),2)
for example will give you the first five rows.
cellfun(@isempty,raw(1:5,1:8))
will show the 2D array of what each cell result is.
Don't have your actual data file so can't check it, but the logic does work if each cell is empty as you say is.
There is an issue here if the cell contains a string so that the result of isempty on the cell is a vector rather than a single T/F value, but that will show up as an error that need to use the 'uniformoutput',false optional argument which you don't say you're getting so shouldn't be the issue.
When I run my code and then type
raw
on the command line, I get a row of [NaN]s on the lines that correspond to the empty rows in my Excel spreadsheet. That suggests they are empty, right?
dpb
dpb on 29 Jun 2017
Edited: dpb on 29 Jun 2017
No. NaN isn't empty, it's "not a number". If they were empty you would see simply a set of empty [].
Illustration...
>> c=num2cell(nan(2,3)) % a cell array of nan's
c =
[NaN] [NaN] [NaN]
[NaN] [NaN] [NaN]
>> cellfun(@isempty,c) % is it empty?? (No)
ans =
0 0 0
0 0 0
>> c=cell(2,3) % an empty cell array
c =
[] [] []
[] [] []
>> cellfun(@isempty,c) % shows each cell is (empty, that is) (&)
ans =
1 1 1
1 1 1
>> c=num2cell([nan(2,3); randn(1,3)]) % a mixed array
c =
[ NaN] [ NaN] [ NaN]
[ NaN] [ NaN] [ NaN]
[-1.1471] [-1.0689] [-0.8095]
>> all(cellfun(@isfinite,c),2) % which rows are all ok (finite)
ans =
0
0
1
>> c(all(cellfun(@isfinite,c),2),:) % save those only
ans =
[-1.1471] [-1.0689] [-0.8095]
>> cell2mat(c(all(cellfun(@isfinite,c),2),:))
ans =
-1.1471 -1.0689 -0.8095
>>
NB: the last which is what you really may want to do before writing back out to convert from the cell array to a numeric native double array.
Doing these kinds of exercises at the keyboard/command window will go a long ways towards learning how to manipulate cell arrays in particular and to figure out how to accomplish goals such as this.
Using small subsets that are able to see results in toto on one screen helps too; the size of real datasets often overwhelms the key lessons by just the sheer volume of output but the syntax is universal irrespective of size.
(&) Note that the cell array itself is not empty; it's only the individual cell content within the array. Hence the need for cellfun --
> c=cell(2,3) % the same cell array
c =
[] [] []
[] [] []
>> isempty(c) % check if the array is empty (no!)
ans =
0
>>
You note the array itself is NOT empty; it contains six cells. It just so happens that all six cells are empty, but the container isn't--big difference.
"... I get a row of [NaN]s on the lines that correspond to the empty rows in my Excel spreadsheet."
NB: Unless there are "holes" in the numeric data and other text such that [num,txt,raw]=xlsread(... creates a mismatch between rows returned by data and text vis a vis raw and you need both text and numeric values, there's no reason to use raw and have to do the post-processing cleanup--just read the numeric data as numeric and the empty rows will be imported as NaN in numeric 2D array which can be handled with all(num,2) directly since is already numeric. Or, depending on the arrangement, xlsread may simply skip the empty cell entirely and you'll get only the values you want from the git-go.
Thank you for this very helpful information! Perhaps I want something like this:
[~, ~, raw]=xlsread('StimuliInExcel.xlsx', 'WilliamsList')
raw=raw(~all(cellfun(@isnan,raw(:,1:8), 'UniformOutput', false),2),:)
xlswrite('StimuliInExcel.xlsx',raw,'WilliamsListNoEmpty')
When I use that code, I get this error:
Undefined function 'all' for input arguments of type 'cell'.
Well, that's a symptom that you're using @isnan on an empty cell where your previous posting indicated that the returned array contained array of [NaN]. Which is it?
As IA points out an alternate route similar to mine of suggesting using the specific returns from xlsread instead of trying to parse the raw data could be simpler solution.
As also noted, without the actual file (or a similar one if it's extremely large that illustrates the form) we're kinda' shootin' in the dark.
Both solutions shown above work for the appropriate input form; that form, however, seems to not be the same by the symptoms given.
Hello. I am attaching my code and the Excel file. My "delete empty row" attempts are commented out near the bottom.
[d,t,r]=xlsread('StimulilnExcel.xls');
returns for r
>> r
r =
[1] [NaN] 'RunExperiment' 'Scene1_ABS_CON' '1_CUP' [1] [0] [ 1] '.'
[1] [NaN] 'RunExperiment' 'Scene2_ABS_CON' '1_CUP' [0] [1] [ 1] '.'
[1] [NaN] 'RunExperiment' 'Scene1_ABS_CON' '2_TOILET_PAPER' [0] [1] [ 1] '.'
[1] [NaN] 'RunExperiment' 'Scene2_ABS_CON' '2_TOILET_PAPER' [1] [0] [ 1] '.'
[1] [NaN] 'RunExperiment' 'Scene3_ABS_CON' '3_BOWL' [1] [0] [ 2] '.'
[1] [NaN] 'RunExperiment' 'Scene4_ABS_CON' '3_BOWL' [0] [1] [ 2] '.'
[1] [NaN] 'RunExperiment' 'Scene3_ABS_CON' '4_GLOVE' [0] [1] [ 2] '.'
...
There are no records with missing values for the full record; the second column is all [NaN] and it looks like the last column isn't of any value, either, but the records are all complete with data in every column except 2.
Hence, you're barking up the wrong tree here...what was the real problem??
If you visually inspect the Excel file, the CompleteList sheet has no missing values, but the WilliamsList and PalmerList sheets do. Image Analyst provided a great answer.
Thank you for all your help and the information you provided!

Sign in to comment.

Categories

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!