MATLAB Answers

Allie
0

Extracting data based on categories

Asked by Allie
on 11 Jul 2019
Latest activity Edited by Guillaume
on 12 Jul 2019
I have a table (z) with two columns, year and annual precipitation. I have a separate vector (E) of three categories (0,1,2) that correspond to the years in the first table. The categories are unevenly spaced. I would like to create three new tables of year and annual precipitation, one for each category. How do I extract the correct year and precipitation data for each category and populate new tables?
z=['01-Jan-1973' 0.0114583333333333
'01-Jan-1974' 0.0918518518518519
'01-Jan-1975' 0.0529032258064516
'01-Jan-1976' 0.213571428571429
'01-Jan-1977' 0.0857692307692308]
E=[0 1 0 2 1]
I would like to create three new tables in the same format as z, like below.
A=['01-Jan-1973' 0.0114583333333333
'01-Jan-1975' 0.0529032258064516]
B=['01-Jan-1974' 0.0918518518518519
'01-Jan-1977' 0.0857692307692308]
C=['01-Jan-1976' 0.213571428571429]
My actual tables and vectors are 1x79, making extracting the three new tables challenging to do by hand.

  2 Comments

Guillaume
on 11 Jul 2019
Can you give an example of inputs and desired output. Preferably using valid matlab syntax.
Allie
on 11 Jul 2019
I updated the question. Do you need more information still?

Sign in to comment.

1 Answer

Answer by Star Strider
on 11 Jul 2019
 Accepted Answer

The splitapply function (R2015b and later releases) is perfect for this!
z = {'01-Jan-1973' 0.0114583333333333
'01-Jan-1974' 0.0918518518518519
'01-Jan-1975' 0.0529032258064516
'01-Jan-1976' 0.213571428571429
'01-Jan-1977' 0.0857692307692308};
E=[0 1 0 2 1];
Out = splitapply(@(x){x}, z, E(:)+1);
A = Out{1}
B = Out{2}
C = Out{3}
producing:
A =
2×2 cell array
{'01-Jan-1973'} {[0.0114583333333333]}
{'01-Jan-1975'} {[0.0529032258064516]}
B =
2×2 cell array
{'01-Jan-1974'} {[0.0918518518518519]}
{'01-Jan-1977'} {[0.0857692307692308]}
C =
1×2 cell array
{'01-Jan-1976'} {[0.213571428571429]}

  4 Comments

Show 1 older comment
Allie
on 12 Jul 2019
When I used both codes on my data, split array only produced single column arrays with the precipitation data and not the associated dates. Could this be because the dates are in datetime format?
The accumarry function produced both columns I wanted!
I did not know your dates were in datetime format, so that is likely the reason the splitapply code failed.
I added the accumarray code later when I realised that you may not have splitapply. I’m glad it worked!
Guillaume
on 12 Jul 2019
Note: If you are actually using a table (or a timetable, which would be better probably), the best solution might be not to split the table at all and just add the E as an extra table variable.
Matlab has plenty of aggregation functions (rowfun, varfun, splitapply as shown here, groupsummary) which allows you to apply the same function to each group of a table all at once. More often, the code is actually simpler and faster if the data is not split beforehand.
e.g:
%demo table
T = table(datetime({'01-Jan-1973'; '01-Jan-1974'; '01-Jan-1975'; '01-Jan-1976'; '01-Jan-1977'}), ...
[0.0114583333333333; 0.0918518518518519; 0.0529032258064516; 0.213571428571429; 0.0857692307692308], ...
'VariableNames', {'Date', 'Value'});
T.group = [0; 1; 0; 2; 1]
groupsummary(T, 'group', {'mean', 'sum', 'std'}, 'Value')
outputs:
ans =
3×5 table
group GroupCount mean_Value sum_Value std_Value
_____ __________ __________________ __________________ ___________________
0 2 0.0321807795698924 0.0643615591397849 0.0293059645132893
1 2 0.0888105413105414 0.177621082621083 0.00430106261490964
2 1 0.213571428571429 0.213571428571429 0

Sign in to comment.