Extracting data based on categories

1 view (last 30 days)
Allie
Allie on 11 Jul 2019
Edited: Guillaume on 12 Jul 2019
I have a table (z) with two columns, year and annual precipitation. I have a separate vector (E) of three categories (0,1,2) that correspond to the years in the first table. The categories are unevenly spaced. I would like to create three new tables of year and annual precipitation, one for each category. How do I extract the correct year and precipitation data for each category and populate new tables?
z=['01-Jan-1973' 0.0114583333333333
'01-Jan-1974' 0.0918518518518519
'01-Jan-1975' 0.0529032258064516
'01-Jan-1976' 0.213571428571429
'01-Jan-1977' 0.0857692307692308]
E=[0 1 0 2 1]
I would like to create three new tables in the same format as z, like below.
A=['01-Jan-1973' 0.0114583333333333
'01-Jan-1975' 0.0529032258064516]
B=['01-Jan-1974' 0.0918518518518519
'01-Jan-1977' 0.0857692307692308]
C=['01-Jan-1976' 0.213571428571429]
My actual tables and vectors are 1x79, making extracting the three new tables challenging to do by hand.
  2 Comments
Guillaume
Guillaume on 11 Jul 2019
Can you give an example of inputs and desired output. Preferably using valid matlab syntax.
Allie
Allie on 11 Jul 2019
I updated the question. Do you need more information still?

Sign in to comment.

Accepted Answer

Star Strider
Star Strider on 11 Jul 2019
The splitapply function (R2015b and later releases) is perfect for this!
z = {'01-Jan-1973' 0.0114583333333333
'01-Jan-1974' 0.0918518518518519
'01-Jan-1975' 0.0529032258064516
'01-Jan-1976' 0.213571428571429
'01-Jan-1977' 0.0857692307692308};
E=[0 1 0 2 1];
Out = splitapply(@(x){x}, z, E(:)+1);
A = Out{1}
B = Out{2}
C = Out{3}
producing:
A =
2×2 cell array
{'01-Jan-1973'} {[0.0114583333333333]}
{'01-Jan-1975'} {[0.0529032258064516]}
B =
2×2 cell array
{'01-Jan-1974'} {[0.0918518518518519]}
{'01-Jan-1977'} {[0.0857692307692308]}
C =
1×2 cell array
{'01-Jan-1976'} {[0.213571428571429]}
  4 Comments
Star Strider
Star Strider on 12 Jul 2019
I did not know your dates were in datetime format, so that is likely the reason the splitapply code failed.
I added the accumarray code later when I realised that you may not have splitapply. I’m glad it worked!
Guillaume
Guillaume on 12 Jul 2019
Edited: Guillaume on 12 Jul 2019
Note: If you are actually using a table (or a timetable, which would be better probably), the best solution might be not to split the table at all and just add the E as an extra table variable.
Matlab has plenty of aggregation functions (rowfun, varfun, splitapply as shown here, groupsummary) which allows you to apply the same function to each group of a table all at once. More often, the code is actually simpler and faster if the data is not split beforehand.
e.g:
%demo table
T = table(datetime({'01-Jan-1973'; '01-Jan-1974'; '01-Jan-1975'; '01-Jan-1976'; '01-Jan-1977'}), ...
[0.0114583333333333; 0.0918518518518519; 0.0529032258064516; 0.213571428571429; 0.0857692307692308], ...
'VariableNames', {'Date', 'Value'});
T.group = [0; 1; 0; 2; 1]
groupsummary(T, 'group', {'mean', 'sum', 'std'}, 'Value')
outputs:
ans =
3×5 table
group GroupCount mean_Value sum_Value std_Value
_____ __________ __________________ __________________ ___________________
0 2 0.0321807795698924 0.0643615591397849 0.0293059645132893
1 2 0.0888105413105414 0.177621082621083 0.00430106261490964
2 1 0.213571428571429 0.213571428571429 0

Sign in to comment.

More Answers (0)

Categories

Find more on Cell Arrays in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!