Averaging values of unique ID's with duplicated dates

Question

Colkissi on 22 Jul 2019


Edited: Adam Danz on 23 Jul 2019

I have a table of size 10743 x 5, but I am mainly interested in three of its columns: 3, 4 and 5.

I want to select the unique ID's in column 5, and find the averages of the corresponding value in column 3 with corresponding duplicated dates in column 4.

Please, how do I go about it?

Also, part of column 3 contains NaN values.

I have pasted part of the table below:

    "AAE551"    "02/02/1995"    228.35    728692    AAE551
    "AAE551"    "02/02/2006"    242.72    732710    AAE551
    "AAE551"    "02/20/2009"    246.05    733824    AAE551
    "AAE551"    "02/23/2010"    246.27    734192    AAE551
    "AAE551"    "02/26/1992"    226.60    727620    AAE551
    "AAE551"    "02/27/1991"    225.30    727256    AAE551
    "AAE551"    "03/05/1990"    222.70    726897    AAE551
    "AAE551"    "03/14/2006"    242.58    732750    AAE551
    "AAE551"    "03/21/2006"    242.72    732757    AAE551
    "AAE551"    "04/04/1990"    224.30    726927    AAE551
    "AAE551"    "04/09/2006"    242.81    732776    AAE551
    "AAE551"    "04/13/2006"    242.92    732780    AAE551
    "AAE551"    "05/02/1991"    224.70    727320    AAE551
    "AAE551"    "05/05/1993"    227.45    728054    AAE551
    "AAE551"    "05/06/2010"    246.33    734264    AAE551
    "AAE551"    "05/07/1990"    222.90    726960    AAE551
    "AAE551"    "05/07/1992"    226.33    727691    AAE551
    "AAE551"    "05/30/1989"    220.60    726618    AAE551
    "AAE560"    "02/04/1981"    46.300    723581    AAE560
    "AAE560"    "02/07/1980"    46.300    723218    AAE560
    "AAE560"    "02/10/1995"    42.820    728700    AAE560
    "AAE560"    "02/14/1986"    40.100    725417    AAE560
    "AAE560"    "02/17/1983"    44.800    724324    AAE560
    "AAE560"    "02/18/1987"    40.400    725786    AAE560
    "AAE560"    "02/25/1984"    42.000    724697    AAE560
    "AAE560"    "02/26/1988"    40.500    726159    AAE560
    "AAE560"    "02/28/1985"    40.700    725066    AAE560
    "AAE560"    "03/02/1992"    41.300    727625    AAE560
    "AAE560"    "03/04/1998"    37.650    729818    AAE560
    "AAE560"    "03/08/2006"    39.120    732744    AAE560

4 Comments

Colkissi on 22 Jul 2019

Hi Trung,

Thanks for your quick response. However, I am new to Matlab, so could you please walk me through the code itself?

Colkissi on 22 Jul 2019

Mat_table.m

Hi the cyclist,

I have attached the MAT file.

Thanks.


Answer 1

Adam Danz on 22 Jul 2019


Edited: Adam Danz on 22 Jul 2019

We can't run your code since we do not have access to the csv file. We'd benefit more from having a mat file containing the table you're working with. Judging from the use of readtable(), it seems that you are working with tables rather than another data type but you haven't provided us with the header names so I will make them up.

    T =
  1×5 table
    
    CodeStr      DateStr      Value        Date         Code  
    ________    __________    ______    __________    ________
    "AAE551"    "02/02/95"    228.35    7.2869e+05    "AAE551"

"I want to select the unique ID's in column 5"

unqID = unique(T.Code);

"...and find the averages of the corresponding value in column 3 with corresponding duplicated dates in column 4. "

   [groupID,GroupKey] = findgroups(T.Date); 
   groupMeans = splitapply(@(x)mean(x,'omitnan'),T.Value,groupID); 

If you want to calculate the average "Value" for each date and for each 'Code', group on both variables at once with the lines below, or use the lines above within a loop.

    [groupID, dateKey, codeKey] = findgroups(T.Date, T.Code); % one group per (Date, Code) pair
    groupMeans = splitapply(@(x)mean(x,'omitnan'),T.Value,groupID); 

*This is untested
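
As a small self-contained check of the per-ID, per-date averaging described above, here is a sketch on a toy table. The values are made up for illustration, and the variable names Code, Date and Value follow the invented headers used in this answer, not the real file. It assumes findgroups is given both grouping variables so that each (Code, Date) pair becomes one group.

```matlab
% Toy data (made up): two IDs, each with one duplicated date, plus one NaN
Code  = ["AAE551"; "AAE551"; "AAE551"; "AAE560"; "AAE560"; "AAE560"];
Date  = [732710; 732710; 732750; 723581; 723581; 723218];
Value = [242.0; 244.0; 242.58; 46.3; NaN; 40.1];
T = table(Code, Date, Value);

% One group per unique (Code, Date) pair; the keys are returned as extra outputs
[groupID, codeKey, dateKey] = findgroups(T.Code, T.Date);

% Average the values in each group, skipping NaN entries
groupMeans = splitapply(@(x) mean(x, 'omitnan'), T.Value, groupID);

% Collect the keys and averages into a summary table
avgTable = table(codeKey, dateKey, groupMeans, ...
    'VariableNames', {'Code', 'Date', 'Avg'});
disp(avgTable)
```

The duplicated date for each ID collapses to a single row, and the NaN is ignored rather than propagating into the average.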

8 Comments

Adam Danz on 23 Jul 2019

Edited: Adam Danz on 23 Jul 2019

Replace the last block with this. There is no need for a loop.

%% Setting up dates with spring months
t1 = datetime(1980,1,1);
t2 = datetime(2018,12,31);
t = transpose(t1:t2);
spring_months = [2 3 4 5];
Lia = ismember(month(t),spring_months);
t3 = t(Lia);
date = datenum(t3);
set_dates = datenum(set_table.Field_Collection_Start_Date);
mem_of = ismember(set_dates, date);
new_dataset = set_table(mem_of,:);
[groupID,GroupKey] = findgroups(floor(new_dataset.Mat_Date)); %floor() removes time of day
groupMeans = splitapply(@(x)mean(x,'omitnan'),new_dataset.Result_Value,groupID);
% Create table of daily averages
dailyAvg = table(GroupKey, groupMeans, 'VariableNames',{'Date', 'Avg'})
% OR IF YOU WANT DATES TO BE STRINGS
dailyAvg = table(datestr(GroupKey,'dd/mm/yyyy'), groupMeans, 'VariableNames',{'Date', 'Avg'}); 

Result

 head(dailyAvg)    % day/month/year
ans =
  8×2 table
       Date        Avg  
    __________    ______
    02/02/1980       452
    04/02/1980    333.25
    05/02/1980    65.511
    06/02/1980    305.97
    07/02/1980      42.2
    10/02/1980     -16.3
    15/02/1980     22.35
    17/02/1980       662
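
The spring-month filter in the code above builds every calendar day from 1980 to 2018 and matches the data against that list. A possibly simpler sketch tests the month of each row directly. The data below is made up, the 'MM/dd/yyyy' input format is an assumption based on the sample pasted in the question, and unlike the original code this version does not restrict the range of years.

```matlab
% Made-up date strings in month/day/year format (assumed from the question)
DateStr = ["02/02/1995"; "06/15/1995"; "05/30/1989"; "12/01/2001"];
Value   = [228.35; 230.00; 220.60; 225.10];
T = table(DateStr, Value);

% Convert the strings to datetime values
dt = datetime(T.DateStr, 'InputFormat', 'MM/dd/yyyy');

% Keep only rows whose month is February through May
springMonths = [2 3 4 5];
isSpring = ismember(month(dt), springMonths);
springRows = T(isSpring, :);
disp(springRows)
```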

Adam Danz on 23 Jul 2019

Edited: Adam Danz on 23 Jul 2019

Hi Kissi, I put more time into this than I wanted so if you have further questions, please show that you've made several attempts to solve the problems. Note that T.Well_ID is kept as strings.

%% Read table from spreadsheet
filename = 'EIMResults_2019Jan09_24765.csv';
opts = detectImportOptions(filename);
opts = setvartype(opts,{'Location_ID','Field_Collection_Start_Date'},'string');
opts = setvartype(opts,{'Result_Value'},'double');
D = readtable(filename,opts);
A = D(:,3);
B = D(:,8);
C = D(:,50);
T = [A B C];
T.Mat_Date = datenum(T.(2));
% T.Well_ID = char(T.(1));    %KEEP THESE AS STRINGS
T.Well_ID = T.(1);    
%% Sort Main_Table with respect to well_id
[~,idx] = sortrows(T(:,5)); % sort on Well_ID only (column 5) and keep the sort indices
set_table = T(idx,:); % sort the whole table using the sort indices
%% Setting up dates with spring months
t1 = datetime(1980,1,1);
t2 = datetime(2018,12,31);
t = transpose(t1:t2);
spring_months = [2 3 4 5];
Lia = ismember(month(t),spring_months);
t3 = t(Lia);
date = datenum(t3);
set_dates = datenum(set_table.Field_Collection_Start_Date);
mem_of = ismember(set_dates, date);
new_dataset = set_table(mem_of,:);
unqWellID = unique(new_dataset.Well_ID);
loopData = cell(3,numel(unqWellID));  % 3 because there will be 3 columns in the table
for i = 1:size(unqWellID,1)
    currIdx = strcmp(new_dataset.Well_ID,unqWellID(i)); 
    [groupID,GroupKey] = findgroups(floor(new_dataset.Mat_Date(currIdx))); %floor() removes time of day
    groupMeans = splitapply(@(x)mean(x,'omitnan'),new_dataset.Result_Value(currIdx),groupID);
    loopData{1,i} = repmat(unqWellID(i),size(GroupKey'));
    loopData(2:3,i) = {GroupKey'; groupMeans'}; 
end
% Create table 
T_DAILYAVG = table([loopData{1,:}]', ...
    datestr([loopData{2,:}]','dd/mm/yyyy'), ...
    [loopData{3,:}]', ...
    'VariableNames',{'Well_ID', 'Date', 'Avg'});  

Result

head(T_DAILYAVG)
ans =
  8×3 table
    Well_ID        Date        Avg  
    ________    __________    ______
    "AAE551"    30/05/1989     220.6
    "AAE551"    05/03/1990     222.7
    "AAE551"    04/04/1990     224.3
    "AAE551"    07/05/1990     222.9
    "AAE551"    27/02/1991     225.3
    "AAE551"    02/05/1991     224.7
    "AAE551"    26/02/1992     226.6
    "AAE551"    07/05/1992    226.33

Colkissi on 23 Jul 2019

Thank you Adam, this is exactly how I wanted it. I am sorry for taking much of your time, but then again I'm really grateful for that time spent to help me out.

I am new to Matlab and tried all I could but no results. Anyways, thanks again!

Adam Danz on 23 Jul 2019

Edited: Adam Danz on 23 Jul 2019

Glad I could help!
