Time info is not read correctly by readtable
Show older comments
I'm using the below command to read my Excel files (see attached for an example):
A = readtable('test.xlsx');
I notice one interesting thing:
My time column is read as NaN, when they are formatted as 'time' (like 02:45:31 PM)within Excel.
The fix is pretty simple, I open the Excel file, right click and choose "Format cell" and choose "Number" and select their decimal places. Then, everything will be read properly.
Here is my question. I have a lot of such Excel files, is there a way I can fix my Matlab code to address this issue, instead of manually going through my time columns?
Thanks!
6 Comments
J. Alex Lee
on 8 Apr 2020
Does it work in releases prior to 2020a? It seems they changed default behavior of readtable in some annoying ways (i'm sure on balance it's an improvement).
Does your excel file have only time information, or does it have date as well? I find Excel's handling of times very confusing, as you can subtract dates (to get a duration) and still format with "am/pm" so that a 23 hour duration can be called a 11:00PM duration (it seems).
Anyway, maybe you can use detectImportOptions with setvartype to force reading the column as a duration?
Leon
on 8 Apr 2020
J. Alex Lee
on 8 Apr 2020
i wouldn't fault TMW for that...Excel tries to guess what we want when we enter "3:00". Does it mean 3:00am today? or a time duration of 3 hours? or 3 minutes? And why should Matlab try to guess what Excel guessed we wanted? The reality is that it DOES try to do so with the convenience function readtable(). In a lot of cases it works (kudos), but when it doesn't work, we can tell Matlab what we wanted by using detectImportOptions. However, it can't account for the fact that Excel files were generated poorly or with inconsistent storage of dates and times.
If you still have issues post an example of your Excel for forum members to check out.
Leon
on 8 Apr 2020
J. Alex Lee
on 9 Apr 2020
Odd, readtable cannot read your 4th column at all, no matter what datatype is specified (inc. string and char). detectImportOptions decided it was a char, for me, and returned {0x0 char}s. Couldn't read in 2019b either.
But, when I copy-pasted the rows of your excel into a new excel file, readtable worked on the new file; detectImportOptions decided TIME_UTC was of type double.
I'm at my wit's end, maybe someone else can shed light on this situation. I attached the copy-pasted excel (matched the sheet name for good measure), and noticed the file sizes are different.
Walter Roberson
on 9 Apr 2020
My Excel 2011 for Mac says that the original file is invalid and will not open it.
Answers (1)
Divya Gaddipati
on 13 Apr 2020
Hi,
This is a known issue and our development team is working on it.
Until this issue is resolved, please use Microsoft Excel to save this file as an "Excel Workbook (.XLS)" instead. The readtable function should work as expected on the .XLS workbook.
>> A = readtable('test.xls');
>> opts = detectImportOptions('test.xls');
>> preview('test.xls',opts)
ans =
8×4 table
Year Month Day TIME_UTC
____ _____ ___ ______________________________
2017 9 18 {'20:32:53.99999999999522700'}
2017 9 18 {'20:35:53.99999999999458800'}
2017 9 18 {'20:37:14.00000000000069775'}
2017 9 18 {'20:38:40.99999999999510950'}
2017 9 18 {'20:40:04.00000000000168875'}
2017 9 18 {'20:41:28.00000000000522950'}
2017 9 18 {'20:42:39.00000000000033900'}
2017 9 18 {'20:43:45.00000000000106275'}
Hope this helps!
Categories
Find more on Data Import from MATLAB in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!