Removing empty cells with non-zero dimensions
My code needs to deal with a cell array X, each cell of which is itself a cell array, containing a double array. For example, X could look as follows:
X = cell(N,1);
for i=1:N
X{i}=cell(1,10);
for j=1:10
X{i}{j} = randi(10, 5,2); %each cell contains a double array of size (5,2)
end
end
While manipulating my code, some rows of these double arrays might get removed. For example:
for i=1:N
for j=1:10
X{i}{j}(X{i}{j}(:,1) < 3,:) = [];
end
end
In some cases, all rows of some double arrays get removed, resulting in a 0×2 empty double matrix. This nonzero size is causing problems elsewhere in my code. How do I efficiently replace these with 0×0 empty arrays?
My current approach is to call the following for-loop after each set of manipulations that might result in empty arrays with nonzero size.
for i=1:N
for j=1:10
if isempty(X{i}{j})
X{i}{j} = [];
end
end
end
However, I'm fairly certain that there is a better way of doing this. Any suggestions?
Edit: I want to emphasize that I do not want to remove the empty cells. What I do want is to replace any 0x2 empty double matrices with 0x0 matrices.
The 10 cells inside each X{i} represent "physical" lattice sites in my simulation. An empty cell does have a meaning, and should not be removed.
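To make the distinction concrete, here is a minimal sketch of how deleting every row leaves a 0×2 array that still counts as empty, and how assigning [] resets it to 0×0. The variables A and B are illustrative only and do not appear in the code above.

```matlab
A = [1 7; 2 9];          % illustrative data: every first-column value is below 3
A(A(:,1) < 3, :) = [];   % deleting all rows keeps the column count
sizeA = size(A);         % 0-by-2
B = A;
if isempty(B)            % isempty is true whenever any dimension is 0
    B = [];              % reset to the plain 0-by-0 empty
end
sizeB = size(B);         % 0-by-0
```

Both A and B satisfy isempty, so any code that depends on size or on concatenation is what would notice the difference.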
Accepted Answer
More Answers (1)
Bruno Luong
on 24 Aug 2020
Edited: Bruno Luong
on 24 Aug 2020
I like your for-loop; you might speed it up a little bit:
for i=1:N
Xi = X{i};
Xi(cellfun('isempty',Xi)) = {[]}; % switch to string from Rik's remark
X{i} = Xi;
end
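As a small self-contained check of the masked assignment used above, the sketch below applies it to a hypothetical 1×3 inner cell array (the data is made up for illustration). It shows that the cells are kept and only their contents change.

```matlab
Xi = {zeros(0,2), [4 5; 6 7], zeros(0,2)};   % hypothetical inner cell array
mask = cellfun('isempty', Xi);               % true where the content is empty
Xi(mask) = {[]};                             % parentheses: overwrite cell contents
sizes = cellfun(@size, Xi, 'UniformOutput', false);  % size of each cell's content
```

Note that the right-hand side must be {[]}. Writing Xi(mask) = [] would delete the cells themselves, which is exactly what the question wants to avoid.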
13 Comments
Bruno Luong
on 24 Aug 2020
Edited: Bruno Luong
on 24 Aug 2020
You can replace the outer for-loop with cellfun:
X = cellfun(@ReplaceEmpty, X, 'UniformOutput', false);
function Xi = ReplaceEmpty(Xi)
Xi(cellfun('isempty',Xi)) = {[]}; % switch to string from Rik's remark
end
The OP's original nested loops are actually 1.99x faster than the loop in your answer and 1.84x faster than the one in my answer, on average; the difference is mainly due to cellfun.
Each version was timed 1000 times, and the median values were compared.
Your loop isn't really different from mine. It unpacks and repacks the cell array, which adds a tiny bit more time.
Bruno Luong
on 24 Aug 2020
Yeah, that's why I first state that I like OP's for-loop.
I'm still out there looking for an example where CELLFUN/ARRAYFUN beats a FOR-LOOP.
Bruno Luong
on 24 Aug 2020
"I expected something using cellfun to be faster"
I don't understand where a lot of people get this expectation from. CELLFUN/ARRAYFUN is a scam. It does provide compact code, that's all.
"CELLFUN/ARRAYFUN is a scam" 😄
Generally, vectorization is faster than loops, which initially gave for-loops a bad rep. But loop speed has generally increased, especially with MATLAB's JIT compilation. cellfun, arrayfun, etc. all have internal loops anyway. Their main attraction is the reduction in lines of code and, sometimes, improved readability (certainly not always; sometimes they are very difficult to interpret). For simple operations, loops, even nested loops, are often faster.
Though in this case the main slowdown is due to your use of the handle style, instead of the char input to cellfun:
N=100;
X = cell(N,1);for i=1:N,X{i}=cell(1,10);for j=1:10,X{i}{j}=randi(10,5,2);end,end
for i=1:N,for j=1:10,X{i}{j}(X{i}{j}(:,1)<3,:)=[];end,end
[timeit(@()cellfun_handle(X)) %42 microseconds
timeit(@()cellfun_str(X)) % 2.1 microseconds
timeit(@()for_fun(X))] % 1.5 microseconds
function out=cellfun_handle(X)
out=cellfun(@isempty, X);
end
function out=cellfun_str(X)
out=cellfun('isempty', X);
end
function out=for_fun(X)
out=false(size(X));
for n=1:numel(X)
out(n)=isempty(X{n}); % test the n-th cell's content, not the whole cell array
end
end
Bruno Luong
on 24 Aug 2020
This is the fastest according to my benchmark:
for i=1:N
Xi = X{i};
for j=1:10
if isempty(Xi{j})
Xi{j} = [];
end
end
X{i} = Xi;
end
I suppose that extra time is saved by not sorting through overloaded versions of the function. Thanks for that reminder!
Bruno Luong
on 24 Aug 2020
Edited: Bruno Luong
on 24 Aug 2020
@Rik, historically CELLFUN has had a special fast implementation for a small number of functions, which are invoked through the string 'xx' and not the handle @xx. 'isempty' is among them.
At some point TMW recommended not using the string form. I would have thought they had moved the special implementation over to the @xx syntax; obviously not. So thanks for reminding us; TMW must get to work and implement what they left unfinished.
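For reference, a minimal sketch of the two call styles on a hypothetical test cell array C (the data is made up for illustration). Both forms should return the same logical mask for built-in types, so only the speed differs. As far as I know, the name form does not dispatch to overloaded methods, so treat it as an assumption that your cell contents are ordinary built-in arrays.

```matlab
C = {[], zeros(0,2), 1:3, 'abc', {}};        % hypothetical test data
viaName    = cellfun('isempty', C);          % legacy function-name syntax
viaHandle  = cellfun(@isempty, C);           % function-handle syntax
sameResult = isequal(viaName, viaHandle);    % true when both masks agree
```

Wrapping each call in timeit, as in Rik's benchmark above, is the way to check the speed difference on your own release.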
Bruno Luong
on 25 Aug 2020
Edited: Bruno Luong
on 25 Aug 2020
Well, there is a very simple explanation:
With X{i}{j} you tell MATLAB to index twice, first with i and then with j.
With Xi{j} there is only one indexing operation, with j, since Xi is a variable. Inside the for-loop it makes a difference.
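If you want to measure this on your own machine, here is a rough timing sketch. The function names fixNested and fixHoisted and the test size N = 1000 are my own choices for illustration, and local functions in a script file require R2016b or later. No timings are quoted because they depend on the release and hardware.

```matlab
N = 1000;                                    % hypothetical problem size
X = cell(N,1);
for i = 1:N
    X{i} = repmat({zeros(0,2)}, 1, 10);      % worst case: every inner cell is 0x2
end
tNested  = timeit(@() fixNested(X));         % inner loop indexes X{i}{j}
tHoisted = timeit(@() fixHoisted(X));        % inner loop indexes Xi{j}
fprintf('nested: %.3g s, hoisted: %.3g s\n', tNested, tHoisted);

function X = fixNested(X)
% Double indexing in the inner loop, as in the original question
for i = 1:numel(X)
    for j = 1:numel(X{i})
        if isempty(X{i}{j})
            X{i}{j} = [];
        end
    end
end
end

function X = fixHoisted(X)
% Unpack X{i} once, work on the local variable, then repack
for i = 1:numel(X)
    Xi = X{i};
    for j = 1:numel(Xi)
        if isempty(Xi{j})
            Xi{j} = [];
        end
    end
    X{i} = Xi;
end
end
```

Both functions produce the same output, so the comparison only measures the cost of the extra indexing step.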