MATLAB Answers

How to check for gaps between datetimes in list of files in a folder?

I have a folder of files whose names contain datetimes. I want to check that there is exactly one file every 2 minutes, with no gaps between files.
To do this I have converted the datetimes of the files to serial date numbers, and calculated what 2 mins is in terms of serial date. I then use this to check if the gap between each file is equal to this or not. However, even though I have checked manually that the numbers 'gap' and 'correctgap' are the same, my script tells me that they are not. I am also having a problem in that the second for loop in my script reverts back to the first, rather than cycling within the loop to the number of files within d.
Can anyone see the problem? Thanks a lot!!
%Checking for gaps between files
paths={'H:\SoundTrap\wav\GoatIsland\001_GoatIsland_5100'};
dateFormat='yymmddHHMMSS';
correctgap=0.006944444496185;
lastfile=strsplit('5100.190610124350.wav','.'); %filename of first file in folder
lastfile=lastfile(2);
lastfile=datenum(lastfile,dateFormat);
for i=1:length(paths)
    path=char(paths(i));
    d=dir(fullfile(path,'*.wav'));
    filecount=length(d);
    %for loop that doesn't work:
    for j=2:length(d) %start on second file (comparing second to first)
        DateTime=strsplit(d(j).name,'.');
        DateTime=DateTime(2);
        DateTime=datenum(DateTime,dateFormat);
        gap=DateTime - lastfile; % dateTime
        if gap==correctgap %doesn't work, when values are the same, error('gap!') is returned
            disp('good')
        else
            error('gap!')
        end
        lastfile=DateTime;
    end
end


Accepted Answer

Mohammad Sami on 22 Apr 2020
This is likely an issue with the precision of the double value. Instead of checking for exact equality, you may want to check whether the absolute difference is less than a certain threshold:
threshold = 0.1 / (24*3600); % 0.1 second. change as needed
%.... your code .....%
if abs(gap-correctgap) < threshold
disp('good');
else
error('gap!');
end
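
To make the units explicit: datenum values are measured in days, so one second is 1/(24*3600) and the threshold above is 0.1 second. The following is only a sketch of why the exact comparison fails, using two file names from the question; the variable names oneSecond, t1, t2 and isGood are made up for illustration. The hard-coded correctgap in the question differs from 10/(24*60) by roughly 5e-11 days, which is smaller than eps for date numbers of this size, so it looks like subtraction round-off rather than a real timing difference.

```matlab
% datenum values are in days, so express every other unit in days.
oneSecond  = 1/(24*3600);      % one second in days
correctgap = 10/(24*60);       % ten minutes in days
threshold  = 0.1*oneSecond;    % 0.1 second tolerance, as in the answer

% Two timestamps ten minutes apart, taken from file names in the question.
t1 = datenum('190610124350','yymmddHHMMSS');
t2 = datenum('190610125350','yymmddHHMMSS');
gap = t2 - t1;

% The subtraction carries round-off on the order of eps(t1),
% so gap need not equal correctgap bit for bit.
fprintf('gap - correctgap = %.3g days, eps(t1) = %.3g days\n', ...
    gap - correctgap, eps(t1));

% Comparing with a tolerance is robust to that round-off.
isGood = abs(gap - correctgap) < threshold;
disp(isGood)
```

The tolerance only needs to be much larger than eps(t1) and much smaller than the smallest timing difference you care about, so 0.1 second is one reasonable choice among many.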

  2 Comments

Louise Wilson on 26 Apr 2020
Thanks Mohammad! That fixes it. Just to clarify, are you saying that 0.1/(24*3600) is one second in the datenum format?
Louise Wilson on 26 Apr 2020
I have this now and it works, but in my output table the only thing I can get to work is single digits. Ideally I would have datetime in the first column and either 'good' or 'gap!' in the second column. How do I change the format of the output so that the output table can contain both a datenum in the first column and text in the second column?
paths={'Y:\SoundTrap\wav\GoatIsland\001_GoatIsland_5100'};
row=1;
output=[];
dateFormat='yymmddHHMMSS';
correctgap=0.006944444496185; %correct gap when 10 minute interval
threshold=0.1/(24*3600);
lastfile=strsplit('5100.190610124350.wav','.'); %filename of first file in folder
lastfile=lastfile(2);
lastfile=datenum(lastfile,dateFormat);
for i=1:length(paths)
    path=char(paths(i));
    d=dir(fullfile(path,'*.wav'));
    filecount=length(d);
    for j=2:length(d) %start on second file (comparing second to first)
        DateTime=strsplit(d(j).name,'.');
        DateTime=DateTime(2);
        DateTime=datenum(DateTime,dateFormat);
        gap=DateTime - lastfile; % dateTime
        output(row,1)=DateTime;
        if abs(gap-correctgap) < threshold
            disp('good')
            output(row,2)='1'; %1=49
        else
            error('gap!')
            output(row,2)='0'; %0=48
        end
        lastfile=DateTime;
        row=row+1;
    end
end
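
For the mixed output asked about here, one possible approach is a table, since each table column can hold a different type, whereas a numeric matrix stores '1' as its character code 49. This is a minimal sketch, not the code from this thread: the timestamps are made up for illustration, and the column names Time and Status are hypothetical. Note also that error() stops the script, so anything after it in the loop never runs; the sketch records a label instead of raising an error.

```matlab
% Made-up timestamps standing in for the values parsed from the file names.
timestamps = datetime(2019,6,10,12,43,50) + minutes([0; 10; 20; 40]);

% Gap to the previous file; the first file has no predecessor.
gaps = [minutes(NaN); diff(timestamps)];

% Label every row, then overwrite the rows whose gap is close to 10 minutes.
status = repmat("gap!", size(gaps));
status(abs(gaps - minutes(10)) < seconds(1)) = "good";
status(1) = "n/a";                         % nothing to compare the first file to

% A table can mix a datetime column with a text column.
output = table(timestamps, status, 'VariableNames', {'Time','Status'});
disp(output)
```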


More Answers (1)

Peter Perkins on 27 Apr 2020
Louise, if you really want one array that contains both time and text, timetable is the right thing to use (though I would argue that you want logical, not text).
I recommend that you don't use datenum at all unless you are using a very old version of MATLAB. This will not suffer from the round-off problems you are having:
>> fname = '5100.190610124350.wav';
>> timestamp = datetime(fname,'InputFormat',"'5100.'yyMMddHHmmss'.wav'")
timestamp =
datetime
10-Jun-2019 12:43:50
When you subtract each timestamp from the previous, you will get a duration, and that's fine. Better, in fact.
Notice how I've embedded '5100.' and '.wav' as literals in the input format. There are other ways to peel that onion (e.g. the strsplit you were using), but the above is straightforward. Maybe if the '5100.' part is not always the same, you'd want to use strsplit.
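
If the leading serial number does vary, one alternative to embedding it as a literal is to pull out the timestamp text first and parse it afterwards. This is only a sketch under the assumption that every name has the form serial.yyMMddHHmmss.wav; the file names below are hypothetical, and regexp is used here in place of strsplit because it handles the whole cell array in one call.

```matlab
% Hypothetical file names with different device serial numbers.
fnames = {'5100.190610124350.wav'; ...
          '5101.190610125350.wav'; ...
          '5102.190610130350.wav'};

% Extract the 12 digits that sit between the two dots, whatever the serial is.
stamps = regexp(fnames, '(?<=\.)\d{12}(?=\.)', 'match', 'once');

% No literals are needed in the format once the serial and extension are gone.
timestamps = datetime(stamps, 'InputFormat', 'yyMMddHHmmss');
disp(timestamps)
```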
There are easy ways to get your output with no loop. Assume you have the struct that you get back from dir, with a name field.
>> d = dir
d =
6×1 struct array with fields:
name
folder
date
bytes
isdir
datenum
>> fnames = {d.name}'
fnames =
6×1 cell array
{'5100.190610124350.wav'}
{'5100.190610125350.wav'}
{'5100.190610130351.wav'}
{'5100.190610131350.wav'}
{'5100.190610132350.wav'}
{'5100.190610133350.wav'}
>> timestamps = datetime(fnames,'InputFormat',"'5100.'yyMMddHHmmss'.wav'")
timestamps =
6×1 datetime array
10-Jun-2019 12:43:50
10-Jun-2019 12:53:50
10-Jun-2019 13:03:51
10-Jun-2019 13:13:50
10-Jun-2019 13:23:50
10-Jun-2019 13:33:50
>> gaps = [NaN; diff(timestamps)]
gaps =
6×1 duration array
NaN
00:10:00
00:10:01
00:09:59
00:10:00
00:10:00
>> output = timetable(timestamps,gaps,gaps==minutes(10))
output =
6×2 timetable
         timestamps           gaps      Var2
    ____________________    ________    _____
    10-Jun-2019 12:43:50    NaN         false
    10-Jun-2019 12:53:50    00:10:00    true
    10-Jun-2019 13:03:51    00:10:01    false
    10-Jun-2019 13:13:50    00:09:59    false
    10-Jun-2019 13:23:50    00:10:00    true
    10-Jun-2019 13:33:50    00:10:00    true
I don't know if that's actually what you want: notice that the 3rd file's timestamp is off by 1 sec, so you get a false, but the 4th also gets a false even though its timestamp is on a ten-minute boundary, because the gap from the 3rd is not 10 min. Maybe you want to look at differences from the first file, not the previous; I don't know. That would be easy in the above code.
>> (timestamps(4) - timestamps(1)) / minutes(10)
ans =
3
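
Extending that idea to every file at once might look like the sketch below. It is an illustration rather than part of the original answer: the timestamps are copied from the example output, while steps, tol and onSchedule are made-up names and the half-second tolerance is an arbitrary choice.

```matlab
% Timestamps copied from the example above.
timestamps = datetime(2019,6,10, [12;12;13;13;13;13], ...
                                 [43;53; 3;13;23;33], ...
                                 [50;50;51;50;50;50]);

% Offset of every file from the first, in units of the 10-minute schedule.
steps = (timestamps - timestamps(1)) / minutes(10);

% A file is on schedule when its offset is close to a whole number of intervals.
tol = seconds(0.5) / minutes(10);          % half a second, chosen arbitrarily
onSchedule = abs(steps - round(steps)) < tol;

disp(table(timestamps, steps, onSchedule))
```

With this check only the 3rd file is flagged, because the 4th is back on a ten-minute boundary relative to the first file.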

  2 Comments

Louise Wilson on 28 Apr 2020
Hi Peter, thanks for your response.
Using duration makes sense to me! The four digit number at the start is the serial number of the device I used to record the file, so that can be any of ten different serial numbers.
I do want to look at differences between subsequent files, not from the first file; I am interested in the difference between each file and the next. Basically this is a hydrophone which is scheduled to record every 10 minutes, and I want to make sure that no recording was skipped, e.g. that no gap between recordings is 20 minutes. Seconds aren't important. So I guess I could change 'minutes(10)' to 'minutes(18)' or something and that would catch anything?
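
One caveat on that last idea: gaps==minutes(18) would only flag a gap of exactly 18 minutes, so catching skipped recordings needs a greater-than comparison (or a tolerance around 10 minutes). A minimal sketch, assuming made-up timestamps with one skipped recording and an arbitrary one-minute tolerance; expected, tolerance, skipped and nMissing are hypothetical names.

```matlab
% Made-up timestamps: the recording that should follow the 3rd file is missing.
timestamps = datetime(2019,6,10,12,43,50) ...
             + minutes([0; 10; 20; 40; 50]) ...
             + seconds([0;  0;  1;  0;  0]);

gaps = [minutes(NaN); diff(timestamps)];   % first file has no predecessor

expected  = minutes(10);
tolerance = minutes(1);                    % arbitrary; seconds do not matter here

% Flag any gap clearly longer than the schedule. NaN compares as false.
skipped = gaps > expected + tolerance;

% Rough count of recordings missing inside each flagged gap.
nMissing = round(gaps(skipped) / expected) - 1;

disp(timetable(timestamps, gaps, skipped))
```

Small jitter such as the one-second offset on the 3rd file stays below the tolerance, so only the roughly 20-minute gap is reported.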

Release

R2019a
