Combine two matrices according to datetime in first column

5 views (last 30 days)
Dear all,
I have two matrices 'rawT' and 'rawV' (see data.mat). I want to combine columns 2 until 4 of 'rawV' with the columns 1 until 4 of 'rawT' according to the date time in column 1.
For example, row 2 columns 2 until 4 of 'rawV'
must be in row 12 columns 5 until 7 of 'rawT'. This is because the first columns of both 'rawV' and 'rawT' contain nearly the same date and time.
My plan was to index the first column and match both first columns. I know that the date time values of the first columns don't match exactly with each other and the matrices are not the same length. Those pose problems.
tV = datetime(rawV(2:length(rawV),1),'Format','dd-MM-yyyy HH:mm');
tT = datetime(rawT(2:length(rawT),1),'Format','dd-MM-yyyy HH:mm');
idx = ismember(tV,tT);

Accepted Answer

Cris LaPierre
Cris LaPierre on 17 May 2022
You are going to have a challenge here because the tolerance indicated in your example (~1 minute) is much larger than the smallest step size in you data. This means that using +/- 1 min everywhere might result in combining duplicate rows.
In addition, you have a couple places where your time goes backwards, and then once sorted, several times where the adjacent rows have the same time, but different values. There is also one instance where you just have a date and no time. You will need to work out a solution for these situations.
There are two possible approaches I would start with: Convert your cells to timetables (cell2table and table2timetable), then use outerjoin or use synchronize to combine the two tables, horizontally concatenating the two data sets.
load TVKdata.mat
% Conver cells to timetables. I choose to sort
rawTt = cell2table(rawT(2:end,:),'VariableNames', rawT(1,:));
rawTt.tijd = datetime(rawTt.tijd,'InputFormat','dd-MM-yyyy HH:mm:ss');
rawTT = table2timetable(rawTt);
rawTT = sortrows(rawTT,"tijd");
rawVt = cell2table(rawV(2:end,:),'VariableNames', rawV(1,:));
rawVt.tijd = datetime(rawVt.tijd,'InputFormat','dd-MM-yyyy HH:mm:ss');
rawVV = table2timetable(rawVt);
rawVV = sortrows(rawVV,"tijd");
% Option 1: Join tables
joinedData = outerjoin(rawTT,rawVV)
joinedData = 210160×6 timetable
tijd Temperatuur asfalt Temperatuur dek Temperatuur onderflens Tilt dwars Tilt lengte Voegopening ____________________ __________________ _______________ ______________________ __________ ___________ ___________ 02-Jul-2020 15:45:43 NaN NaN 17.403 NaN NaN NaN 02-Jul-2020 16:00:01 NaN NaN 17.645 NaN NaN NaN 02-Jul-2020 16:15:01 NaN NaN 17.549 NaN NaN NaN 02-Jul-2020 16:30:01 NaN NaN 17.659 NaN NaN NaN 02-Jul-2020 16:45:02 NaN NaN 17.692 NaN NaN NaN 02-Jul-2020 17:00:01 NaN NaN 17.694 NaN NaN NaN 02-Jul-2020 17:00:41 NaN 20.689 NaN NaN NaN NaN 02-Jul-2020 17:15:01 NaN 20.806 17.594 NaN NaN NaN 02-Jul-2020 17:30:01 NaN 20.943 17.535 NaN NaN NaN 02-Jul-2020 17:45:01 NaN 20.872 17.843 NaN NaN NaN 02-Jul-2020 18:00:01 NaN 20.65 17.77 NaN NaN NaN 02-Jul-2020 18:00:55 NaN NaN NaN NaN NaN 35 02-Jul-2020 18:15:01 NaN 20.662 17.917 NaN NaN NaN 02-Jul-2020 18:15:02 NaN NaN NaN NaN NaN 35.2 02-Jul-2020 18:30:01 NaN 20.526 17.885 NaN NaN NaN 02-Jul-2020 18:30:02 NaN NaN NaN NaN NaN 34.2
% Option 2: synchronize
rawTT(ismissing(rawTT.tijd),:) = [];
finalTT = synchronize(rawTT,rawVV)
finalTT = 210135×6 timetable
tijd Temperatuur asfalt Temperatuur dek Temperatuur onderflens Tilt dwars Tilt lengte Voegopening ____________________ __________________ _______________ ______________________ __________ ___________ ___________ 02-Jul-2020 15:45:43 NaN NaN 17.403 NaN NaN NaN 02-Jul-2020 16:00:01 NaN NaN 17.645 NaN NaN NaN 02-Jul-2020 16:15:01 NaN NaN 17.549 NaN NaN NaN 02-Jul-2020 16:30:01 NaN NaN 17.659 NaN NaN NaN 02-Jul-2020 16:45:02 NaN NaN 17.692 NaN NaN NaN 02-Jul-2020 17:00:01 NaN NaN 17.694 NaN NaN NaN 02-Jul-2020 17:00:41 NaN 20.689 NaN NaN NaN NaN 02-Jul-2020 17:15:01 NaN 20.806 17.594 NaN NaN NaN 02-Jul-2020 17:30:01 NaN 20.943 17.535 NaN NaN NaN 02-Jul-2020 17:45:01 NaN 20.872 17.843 NaN NaN NaN 02-Jul-2020 18:00:01 NaN 20.65 17.77 NaN NaN NaN 02-Jul-2020 18:00:55 NaN NaN NaN NaN NaN 35 02-Jul-2020 18:15:01 NaN 20.662 17.917 NaN NaN NaN 02-Jul-2020 18:15:02 NaN NaN NaN NaN NaN 35.2 02-Jul-2020 18:30:01 NaN 20.526 17.885 NaN NaN NaN 02-Jul-2020 18:30:02 NaN NaN NaN NaN NaN 34.2
Note that the number of rows in both options is less than the sum of rows from rawT and rawV (210583), meaning it has combined rows that have the same exact time.
  2 Comments
Tessa van Kol
Tessa van Kol on 18 May 2022
Thank you for you solution. The next step is indeed to clean up the data further.
Cris LaPierre
Cris LaPierre on 18 May 2022
With your data in a timetable, you may be interested in the retime function.

Sign in to comment.

More Answers (0)

Categories

Find more on Data Type Conversion in Help Center and File Exchange

Products


Release

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!