'splitapply' syntax to index multiple columns in a timetable

Eric Escoto on 19 Nov 2021
Commented: Kelly Kearney on 20 Nov 2021
I would like to create a timetable by applying the splitapply function to an existing timetable.
testTT = timetable(tmin, duration_min, splitapply(@mean, TT1.I, grp), ...
splitapply(@mean, TT1.KE, grp), ...
splitapply(@sum, TT1.num, grp), ...
splitapply(@mean, TT1.D, grp), ...
splitapply(@sum, TT1{10:1033}), grp));
In the last splitapply call, I'm trying to apply the 'sum' function to 1024 columns at once, according to the grouping variable 'grp'. I know my syntax is incorrect, but I also don't know whether what I'm trying is possible (at least the way I'm doing it).
Is there a better approach?
  2 Comments
Kelly Kearney on 20 Nov 2021
Unfortunately that syntax isn't valid. You'd need to modify the call to sum to loop over multiple inputs:
% Small sample table
TT1 = array2table(rand(10,5), 'variablenames', {'one','two','three','four','five'});
grp = randi(2, 10, 1);
% One-liner with sum wrapper
testTT1 = array2table(...
splitapply(@(varargin) cellfun(@sum, varargin), TT1, grp), ...
'variableNames', TT1.Properties.VariableNames)
testTT1 = 2×5 table
      one       two       three      four      five 
    _______    ______    _______    ______    ______
    0.25613    1.6855    0.49839    1.1605     1.032
     5.3389    4.5527     1.9127     3.246    4.0721
% Using loops
testTT2 = table;
vname = TT1.Properties.VariableNames;
for ii = 1:length(vname)
testTT2.(vname{ii}) = splitapply(@sum, TT1.(vname{ii}), grp);
end
testTT2
testTT2 = 2×5 table
      one       two       three      four      five 
    _______    ______    _______    ______    ______
    0.25613    1.6855    0.49839    1.1605     1.032
     5.3389    4.5527     1.9127     3.246    4.0721
% The I-wish-it-worked-this-way syntax
testTTbad = splitapply(@sum, TT1, grp);
Error using splitapply (line 132)
Applying the function 'sum' to the 1st group of data generated the following error:

Too many input arguments.
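
The error above follows from how splitapply treats table input: each table variable is handed to the function as its own argument, so sum receives five inputs at once. A minimal sketch that makes this visible by counting the arguments per group (the table T and grouping vector g here are made-up illustration data, not the poster's):

```matlab
% Hypothetical 4-variable table and a two-group grouping vector
T = array2table(magic(4), 'VariableNames', {'a','b','c','d'});
g = [1; 1; 2; 2];

% Count how many inputs the function receives for each group.
% splitapply passes one argument per table variable, so each count is 4.
nIn = splitapply(@(varargin) numel(varargin), T, g);
```

This is why the varargin wrapper works: it collects the per-variable arguments into a cell array, and cellfun then applies sum to each one.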


Answers (1)

Kelly Kearney on 19 Nov 2021
Edited: Kelly Kearney on 20 Nov 2021
I really wish splitapply did allow you to pass a table as input and apply the same function to each column of that input. Unfortunately, it doesn't (it will accept table input, but then expects a function that treats each table column as a separate input variable). Instead, a loop is probably the easiest way to do this:
testTT = timetable(tmin, duration_min, splitapply(@mean, TT1.I, grp), ...
splitapply(@mean, TT1.KE, grp), ...
splitapply(@sum, TT1.num, grp), ...
splitapply(@mean, TT1.D, grp));
% Add the summed variables
vname = TT1.Properties.VariableNames(10:1033);
for ii = 1:length(vname)
testTT.(vname{ii}) = splitapply(@sum, TT1.(vname{ii}), grp);
end
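
A possible alternative to the loop, offered only as a sketch and not as part of the original answer: if the columns to be summed are all numeric, they can be extracted as a single matrix with brace indexing and passed to splitapply in one call. The sample table, the grouping vector, and the column range `cols` below are stand-ins for TT1, grp, and 10:1033.

```matlab
% Small sample table standing in for TT1 (hypothetical data)
TT1 = array2table(rand(10,5), ...
    'VariableNames', {'one','two','three','four','five'});
grp = [1; 1; 2; 2; 2; 1; 2; 1; 2; 2];

cols = 2:5;   % stand-in for the 10:1033 range in the question

% TT1{:, cols} returns one numeric matrix. splitapply splits its rows by
% group; sum(x,1) forces column-wise sums even when a group has one row.
sums = splitapply(@(x) sum(x,1), TT1{:, cols}, grp);

% Wrap the result (one row per group) back into a table with the
% original variable names.
sumT = array2table(sums, ...
    'VariableNames', TT1.Properties.VariableNames(cols));
```

Each row of `sums` corresponds to one group and each column to one of the selected variables. The explicit dimension argument matters: plain `@sum` would collapse a single-row group to a scalar, and the outputs could then no longer be concatenated.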

Release

R2021a
