Removing Rows from Array Based on Date/Time Value in Cell
Liza Miller
on 29 Jul 2020
Commented: Liza Miller
on 30 Jul 2020
I am trying to remove all rows of data for which the "last order date" (column 17 of my CSV file) is after 8/31/2019. My code was weeding out some, but a few later dates were left behind. I switched all the date values to serial date numbers in hopes that would simplify the process, but it's still not weeding out anything after 7183 (the serial date number for 8/31/19). I am out of ideas and would greatly appreciate any feedback.
Code:
segment = readtable('myFile.csv');
a = 1;
rows = height(segment);
newDate = [];
for f = 1:rows
addDate = datenum(segment{f, 17});
newDate(f) = addDate;
end
cutoffDateOG = datetime('08/31/0019','InputFormat','MM/dd/uuuu', 'Format', 'preserveinput');
cutoffDate = datenum(cutoffDateOG);
while a < rows
if newDate(a)>cutoffDate || isnat(segment{a,17}) == 1
segment(a,:) = [];
rows = rows-1;
end
a = a+1;
end
end
Accepted Answer
Cris LaPierre
on 30 Jul 2020
Ok, I'll take you through the steps I'd follow. I'll be sharing videos from our Exploratory Data Analysis specialization on Coursera. If you have time, you might find it a great way to learn the latest features in MATLAB for data analysis. You can access all but the graded content for free.
You'll have to get the data into MATLAB first. I suggest using the Import Tool. This video may be helpful for that. If you prefer, you can generate the code used to import the data from the Import Tool. This video on generating and reusing code created by the import tool shows you how.
Make sure to set the data type of the variables you want to use (each column of a table is a variable). Most important is to set the Last Order Date to datetime. You can learn more about working with dates and times here.
Then you'll want to access specific variables from the table for your comparison. This video will explain how to do this. And since you actually want to access a subset of the data in your table, perhaps this video will also be helpful.
% Import data setting date formats
opts = detectImportOptions("MatLab Test Data.csv");
opts = setvaropts(opts,{'x_LastOrderDate','x_FirstShowAttendedDate','x_LastShowAttendedDate','x_NextShowAttendingDate','x_LastCompShowDate'},'InputFormat',"MM/dd/yy");
opts = setvaropts(opts,'DateCreated','InputFormat','MM/dd/yy HH:mm');
segment = readtable("MatLab Test Data.csv",opts);
% This is the part that removes the rows of data with Last Order Date after the cutoff date
% Specify the cutoff date (8/31/2019, as stated in the question)
cutoffDate = datetime(2019,08,31);
% Remove rows where Last Order Date is after the cutoff date
segment(segment.x_LastOrderDate>cutoffDate,:)=[];
% Remove rows where Last Order Date has a value of NaT
segment=rmmissing(segment,"DataVariables","x_LastOrderDate");
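If it helps to see the filtering step in isolation, here is a minimal sketch on a tiny made-up table. The table `demo` and its variables `CustomerID` and `LastOrderDate` are hypothetical stand-ins, not names from the original file, and the cutoff of 8/31/2019 is taken from the question. It builds one logical "keep" mask instead of deleting in two passes.

```matlab
% Hypothetical stand-in for the imported table (values made up for illustration)
LastOrderDate = [datetime(2019,8,30); datetime(2019,8,31); datetime(2019,9,1); NaT; datetime(2020,1,15)];
CustomerID = (1:5)';
demo = table(CustomerID, LastOrderDate);

% Cutoff date assumed from the question (8/31/2019)
cutoffDate = datetime(2019,8,31);

% Keep rows that have a date and whose date is on or before the cutoff
keep = ~isnat(demo.LastOrderDate) & demo.LastOrderDate <= cutoffDate;
demo = demo(keep,:);

disp(demo)
```

Logical indexing selects all matching rows in one step, so there is no index bookkeeping. Deleting rows one at a time inside a loop is easier to get wrong, because each deletion shifts the later rows up by one.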
More Answers (1)
Cris LaPierre
on 29 Jul 2020
Don't use datenums. I would suggest converting your dates (segment{f,17}) to datetimes. Then you can just use regular relational operators (>, <, ==, etc.).
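As a rough illustration of what that looks like, the sketch below compares a small made-up datetime array against a cutoff. The variable names `d`, `isAfter`, and `missingDate` are hypothetical. It also shows that a comparison against NaT evaluates to false, so missing dates have to be detected separately with isnat.

```matlab
% Made-up dates for illustration; not taken from the original data
d = [datetime(2019,8,31); datetime(2019,9,1); NaT];
cutoffDate = datetime(2019,8,31);

% Relational operators work element-wise on datetime arrays
isAfter = d > cutoffDate;

% NaT > cutoffDate is false, so missing dates need their own test
missingDate = isnat(d);

disp(table(d, isAfter, missingDate))
```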
4 Comments
Cris LaPierre
on 29 Jul 2020
Edited: Cris LaPierre
on 29 Jul 2020
You may be making this more complicated than it needs to be. Without a representative data set, we can only guess, but I suspect something like this should work. Ignore the first few lines. I'm creating a dummy table of various data types for the example.
% Dummy data set including a range of dates including NaT entries.
Dates = [datetime(2019,01,01):days(1):datetime('today') repmat(NaT,1,24)];
segment = table(Dates',zeros(length(Dates),1),zeros(length(Dates),1,'single'),cell(length(Dates),1),strings(length(Dates),1));
segment.Properties.VariableNames=["Dates","Double","Single","Cell","String"];
% Specify the cutoff date
cutoffDate = datetime(2019,08,13);
% Remove rows where Dates has a value of NaT
segment=rmmissing(segment,"DataVariables","Dates")
% Remove rows where Date is after cut off date
segment(segment.Dates>cutoffDate,:)=[];
Since Dates starts on the first day of 2019, and 8/13/2019 is the 225th day of that year, segment should end up as a table with 225 rows and 5 columns.
day(cutoffDate,'dayofyear')
ans =
225
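To check that row count reproducibly, here is a small sketch of the same idea with a fixed end date instead of datetime('today'). The end date of 12/31/2019 and the second variable `Value` are arbitrary choices for this example, not part of the original comment.

```matlab
% Daily dates for all of 2019 plus 24 NaT entries (end date chosen arbitrarily)
Dates = [datetime(2019,1,1):days(1):datetime(2019,12,31), repmat(NaT,1,24)]';
segment = table(Dates, zeros(numel(Dates),1), 'VariableNames', {'Dates','Value'});

cutoffDate = datetime(2019,8,13);

% Remove rows where Dates is NaT, then rows after the cutoff date
segment = rmmissing(segment, 'DataVariables', 'Dates');
segment(segment.Dates > cutoffDate, :) = [];

% The remaining rows run from Jan 1 through the cutoff, one per day
rowsLeft = height(segment);
expectedRows = day(cutoffDate, 'dayofyear');
disp([rowsLeft expectedRows])
```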