time analysis of data

michael on 2 Nov 2022
Commented: Walter Roberson on 4 Nov 2022
Hi,
I have the following text file:
id, start_time, stop_time, value
1 XXX1 YYY1 13
9 XXX2 YYY2 65
7 XXX3 YYY3 82
1 XXX4 YYY4 5
1 XXX5 YYY5 6
etc...
The id is related to the variable name; the times are Year-Month-Day-Hour-Minute-Second-Millisecond (all concatenated, without dashes or any other characters).
The time ranges of different rows can overlap partially, fully, or not at all.
I'd like to perform some analysis and plot the data (let's say id 1 and 7 on the same plot).
It can happen that id 7 appears only once (spanning from its first data arrival until the end of the data collection) while id 1 appears 100 times (starting from its own first data arrival, which can differ from that of id 7, until the end of the data collection).
How would you recommend handling the data in MATLAB (reading it, storing it and plotting it so that access is user friendly)?

Answers (1)

Walter Roberson on 2 Nov 2022
I might suggest using detectImportOptions() on the file. Then use setvartype() to set the second and third columns to 'datetime', and use setvaropts() to set the InputFormat for those variables to 'uuuuMMddHHmmssSSS' or as appropriate (year, month, day, 24-hour hour, minutes, seconds, fractions of a second). Then call readtable() with the altered options.
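As a rough sketch of those steps: the sample file below is made up for illustration (comma-delimited, 17-digit timestamps), so adjust the delimiter and the InputFormat to match the real file. This assumes a release that has detectImportOptions(), which as far as I know means R2016b or later.

```matlab
% Write a small hypothetical sample file (comma-delimited, 17-digit timestamps).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'id,start_time,stop_time,value\n');
fprintf(fid, '1,20221102130000000,20221102130400500,13\n');
fprintf(fid, '9,20221102130100000,20221102130300000,65\n');
fprintf(fid, '7,20221102130000250,20221102130900000,82\n');
fclose(fid);

% Detect the layout, then override the type and format of columns 2 and 3.
opts = detectImportOptions(fname, 'Delimiter', ',');
opts = setvartype(opts, [2 3], 'datetime');
opts = setvaropts(opts, [2 3], 'InputFormat', 'uuuuMMddHHmmssSSS');
T = readtable(fname, opts);
delete(fname);

% Show milliseconds when displaying the times.
T.start_time.Format = 'yyyy-MM-dd HH:mm:ss.SSS';
T.stop_time.Format  = 'yyyy-MM-dd HH:mm:ss.SSS';
disp(T)
```

Once the times are datetime values, differences such as T.stop_time - T.start_time come out as durations, which is convenient for the later analysis.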
You can do subset selection such as
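% Select the rows for id 7.
mask7 = T.id == 7;
subset7 = T(mask7, :);
% One row per segment: start and stop time (the column is named stop_time in the file).
t7 = [subset7.start_time, subset7.stop_time];
% Append a nan column so plot() breaks the line after each segment.
t7(:,end+1) = nan;
% Repeat each value for both endpoints, then add the matching nan break.
v7 = repelem(subset7.value, 1, 2);
v7(:,end+1) = nan;
% Flatten row by row into column vectors for plot().
t7 = reshape(t7.', [], 1);
v7 = reshape(v7.', [], 1);
plot(t7, v7)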
The purpose of this code is to draw a horizontal line at "value" from the start time for the value to the end time of the value. The nans are there to force it to break the drawing after each segment.
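To put two ids (say 1 and 7) on the same plot, the same segment construction can be repeated in a loop. This is only a sketch: the table below is hypothetical sample data standing in for what readtable() would return, and NaT is used as the missing-value marker on the datetime side.

```matlab
% Hypothetical sample data standing in for the table read from the file.
id = [1; 9; 7; 1];
start_time = datetime(2022, 11, 2, 13, [0; 1; 0; 5], 0);
stop_time  = datetime(2022, 11, 2, 13, [4; 3; 9; 8], 0);
value = [13; 65; 82; 5];
T = table(id, start_time, stop_time, value);

ids = [1 7];   % the ids to draw together
figure
for i = 1:numel(ids)
    S = T(T.id == ids(i), :);
    n = height(S);
    % One row per segment: [start, stop, break marker].
    t = [S.start_time, S.stop_time, NaT(n, 1)];
    v = [S.value, S.value, NaN(n, 1)];
    % Flatten row by row so each segment is followed by its break.
    t = reshape(t.', [], 1);
    v = reshape(v.', [], 1);
    plot(t, v, 'DisplayName', sprintf('id %d', ids(i)));
    hold on
end
hold off
legend('show')
```

Each id gets its own line object, so colors and legend entries stay separate even when one id has a single segment and another has many.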
  3 Comments
Walter Roberson on 4 Nov 2022
Intersection of what and what?
If you have two different rows, A and B, and you want to know if their times overlap, then
latest_start = max(T.start_time(A), T.start_time(B));
earliest_end = min(T.stop_time(A), T.stop_time(B));
then if latest_start > earliest_end then they do not overlap in time.
If you want to find all of the rows that overlap with each other, then unfortunately that generally takes time proportional to the square of the number of rows, as in general you would need to compare each row to every other row. There just might be a way to reduce that cost using a modification of the ideas used for QuickSort, or perhaps some kind of quadtree representation; if so, you might be able to get the cost down to being proportional to n*log(n) instead of n*n.
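One possible way to cut down the comparisons, sketched below with made-up intervals: sort the rows by start time, and for each row stop scanning as soon as a later row starts after the current row has ended. This is a swapped-in technique, not the QuickSort or quadtree idea; the cost is the sort plus the comparisons actually made, so it is still quadratic in the worst case where nearly everything overlaps.

```matlab
% Hypothetical intervals; in practice use T.start_time and T.stop_time.
start_time = datetime(2022, 11, 2, 13, [0; 1; 0; 5; 20], 0);
stop_time  = datetime(2022, 11, 2, 13, [4; 3; 9; 8; 25], 0);

% Sort by start time, keeping track of the original row numbers.
[s, order] = sort(start_time);
e = stop_time(order);
n = numel(s);

pairs = zeros(0, 2);   % each row will hold the original indices of an overlapping pair
for i = 1:n-1
    for j = i+1:n
        if s(j) > e(i)
            break      % later rows start even later, so none can overlap row i
        end
        pairs(end+1, :) = sort([order(i), order(j)]); %#ok<AGROW>
    end
end
pairs = sortrows(pairs);
disp(pairs)
```

Here rows that merely touch (one stops exactly when the other starts) count as overlapping, matching the latest_start > earliest_end test above; change the comparison to >= if touching should not count.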
Walter Roberson on 4 Nov 2022
Did you end up switching to a newer version of MATLAB? You had earlier posted pointing out that you do not have readtable because you are using a very old version.
For your very old version, you would probably want to use textscan() -- which was introduced in R14. The Year-Month-Day-Hour-Minute-Second-Millisecond stream of digits is probably too long to be represented exactly in double precision, unless the year is only 2 digits and there are at most 3 millisecond digits. If you have a 4 digit year then you would need to use %s fields to read the time representations and then use datenum() to convert them into serial date numbers. If you are using 2 digit years then you might be able to fit them in as floating point numbers instead of as character vectors, but since you still need to convert them to serial date numbers it is probably easier to use %s anyhow.
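A minimal sketch of that route, assuming whitespace-separated data rows under one header line and fixed-width 17-digit timestamps (4 digit year, 3 millisecond digits). The sample file is made up, and the timestamps are split into fields by character position and passed to the six-argument form of datenum(), which is an assumption on my part about what is safest on an old release.

```matlab
% Write a small hypothetical sample file in the layout from the question.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'id, start_time, stop_time, value\n');
fprintf(fid, '1 20221102130000000 20221102130400500 13\n');
fprintf(fid, '7 20221102130000250 20221102130900000 82\n');
fclose(fid);

% Read the ids and values as numbers and the timestamps as strings.
fid = fopen(fname, 'r');
C = textscan(fid, '%f %s %s %f', 'HeaderLines', 1);
fclose(fid);
delete(fname);
id = C{1};
value = C{4};

% Convert both timestamp columns to serial date numbers.
tnum = cell(1, 2);
for k = 1:2
    s = char(C{k+1});                                % N-by-17 char matrix
    f = @(cols) str2double(cellstr(s(:, cols)));     % numeric field from columns
    sec = f(13:14) + f(15:17) / 1000;                % seconds plus milliseconds
    tnum{k} = datenum(f(1:4), f(5:6), f(7:8), f(9:10), f(11:12), sec);
end
start_num = tnum{1};
stop_num  = tnum{2};
disp(datestr(start_num, 'yyyy-mm-dd HH:MM:SS.FFF'))
```

Serial date numbers are in days, so (stop_num - start_num) * 86400 gives the length of each interval in seconds.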

Release

R14SP2
