Calculate avergae values per hour, day, month and year
5 views (last 30 days)
Show older comments
I have ~23 years of hourly data in a large matrix (5 columns and over 5 millions rows), like this:
YEAR / MONTH / DAY / HOUR / DATA
1994 3 7 4 25.786
1994 3 7 4 25.686
1994 3 7 5 25.746
1994 3 7 6 25.686
1994 3 7 6 24.786
1994 3 7 6 25.686
1994 3 7 7 26.746
1994 3 7 8 22.686
....
2016 10 24 0 27.686
2016 10 24 0 28.746
2016 10 24 1 25.686
Where...
YEAR= 1994:1:2016 (with leap and regular years)
MONTH= 1:12 (during leap and regular years)
DAY= 1:31 (with 28-31 days depending on leap and regular years)
HOUR= 0-23 (0=time between midnight and 1am)
Unfortunately series doesn't start at MONTH 1, DAY 1, HOUR 0, thinking in loop here. Also HOUR values do not have the same time step (some days can have 3 values other days can have 48 values, etc).
Any suggestions on how to obtain the data average at: 1) each hour (per day per month per year), 2) each day (per month per year), and 3) each month (per year).
I am also interested on how to calculate the data average per: 1) year (23 years), 2) month (12 months), 3) day (366 days), and 4) hour (24 hours).
Thank you for your suggestions.
0 Comments
Accepted Answer
Andrei Bobrov
on 20 Dec 2016
Edited: Andrei Bobrov
on 21 Dec 2016
Let data - your data.
%avergae values per hour
[ah,~,ch] = unique(data(:,1:4),'rows');
out_hour = [ah,accumarray(ch,data(:,5),[],@nanmean)];
%avergae values per day
[ad,~,cd] = unique(data(:,1:3),'rows');
out_day = [ad,accumarray(cd,data(:,5),[],@nanmean)];
%avergae values per month
[am,~,cm] = unique(data(:,1:2),'rows');
out_month = [am,accumarray(cm,data(:,5),[],@nanmean)];
%avergae values per year
[ay,~,cy] = unique(data(:,1:2),'rows');
out_year = [ay,accumarray(cy,data(:,5),[],@nanmean)];
9 Comments
Steven Lord
on 14 Jun 2019
If you have your data in a table or a timetable I recommend using Sean de Wolski's approach below. If you have a table you'll need to convert it into a timetable first using the table2timetable function as retime is only defined for timetable arrays.
Lucas Guimaraes
on 1 Apr 2021
Edited: Lucas Guimaraes
on 1 Apr 2021
Hello,
thank you for that. Helped me too in my case.
But help me again, please haha. If I have to calculate the standard deviations of these data. How do I do?
thank you!
Lucas
More Answers (1)
Sean de Wolski
on 20 Dec 2016
Edited: Sean de Wolski
on 20 Dec 2016
% Your data
D = ...
[1994 3 7 4 25.786
1994 3 7 4 25.686
1994 3 7 5 25.746
1994 3 7 6 25.686
1994 3 7 6 24.786
1994 3 7 6 25.686
1994 3 7 7 26.746
1994 3 7 8 22.686
2016 10 24 0 27.686
2016 10 24 0 28.746
2016 10 24 1 25.686];
% Make Datetime
dt = datetime(D(:,1),D(:,2),D(:,3),D(:,4),0,0);
% Make timetable
tt = timetable(dt,D(:,end),'VariableNames',{'Data'})
%%Retiming
% Monthly
rmmissing(retime(tt,'monthly',@mean))
% Yearly
rmmissing(retime(tt,'yearly',@mean))
You can pass whatever function you want in instead of @mean.
4 Comments
See Also
Categories
Find more on Timetables in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!