MATLAB Answers


Import of tables from R where the first line describing the column names is one element shorter

Asked by Johan Gustafsson on 15 Sep 2019
Latest activity Answered by Johan Gustafsson on 15 Sep 2019
Many files exported from R look something like this:
"var1" \t "var2"
"row1" \t val1 \t val2
"row2" \t val2 \t val2
The problem is that the line describing the variables is one element shorter than the data lines, because it has no entry for the row-name column, and readtable doesn't handle that. Is there any way I can make that work? Editing the input file by changing the first row to
\t "var1" \t "var2"
fixes the problem.
I'm trying to read it with the line:
f = readtable(filename, 'ReadVariableNames',true, 'ReadRowNames', true, 'Delimiter', '\t');
This should be a standard thing, but I just cannot make it work. I don't want to have to edit the input files all the time.
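The manual fix described above (a leading tab on the first line) can be automated on a temporary copy, so the original export stays untouched. The following is only a sketch under assumptions: the sample file and its contents are invented for illustration, and it relies on the observation in the question that readtable accepts the header once the leading tab is present.

```matlab
% Stand-in for an R export (hypothetical sample data, written to a temp file).
filename = [tempname, '.txt'];
fid = fopen(filename, 'w');
fprintf(fid, '"var1"\t"var2"\n"row1"\t1\t2\n"row2"\t3\t4\n');
fclose(fid);

% Copy the file with one tab prepended, so that the header line gets an
% empty placeholder for the row-name column.
txt = fileread(filename);
tmpfile = [tempname, '.txt'];
fid = fopen(tmpfile, 'w');
assert(fid ~= -1, 'Could not open %s for writing.', tmpfile);
fwrite(fid, [sprintf('\t'), txt]);
fclose(fid);

% Same call as in the question, applied to the patched copy.
f = readtable(tmpfile, 'ReadVariableNames', true, 'ReadRowNames', true, 'Delimiter', '\t');
delete(tmpfile);
```

For very large files, copying the whole file just to add one character may be wasteful; in that case reading the header separately is probably the better route.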


Release

R2016b

2 Answers

Answer by Guillaume
on 15 Sep 2019
Edited by Guillaume
on 15 Sep 2019
 Accepted Answer

Yes, readtable expects the variable name line to have a placeholder (DimensionName) for the row name column. I suggest you raise an enhancement request with MathWorks.
Here is a roundabout way to get it to work:
% First grab the variable names. MATLAB should add an extra variable name at the end of the list to match the number of data columns.
% Ignore row names for now.
opts = detectImportOptions(yourfile, 'ReadVariableNames', true, 'VariableNamesLine', 1);
varnames = opts.VariableNames;
% Then tell MATLAB that there are row names. That messes up the variable names, so take them from the previous opts.
opts = detectImportOptions(yourfile, 'ReadVariableNames', true, 'VariableNamesLine', 1, 'ReadRowNames', true);
opts.VariableNames = ['RowNames', varnames(1:end-1)]; % Still need a name for the row-name column.
opts.DataLines = [2, Inf]; % That is also messed up.
result = readtable(yourfile, opts)
It works on the file I've tested, but because of the complex heuristics of detectImportOptions it may break on more complex files.
Tested on R2019b. Not sure how it behaves with R2016b, where detectImportOptions may not be as sophisticated.



Answer by Johan Gustafsson on 15 Sep 2019

Thanks, but this does not work for me. I suspect that the ReadVariableNames parameter of detectImportOptions is something that comes with 2019b. Is that so? I tried upgrading to 2018b, but it didn't help. I get the following error:
Error using detectImportOptions
'ReadVariableNames' is not a recognized parameter. For a list of valid name-value pair arguments, see the
documentation for detectImportOptions.
Is there another trick I could use? I was thinking I could do something using fgetl and a regexp, but that seems like a messy way to do it.
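The fgetl idea need not be very messy. Here is a minimal sketch, assuming the header is a single tab-delimited line of optionally quoted names; strsplit stands in for the regexp, and the sample file is made up for illustration. It avoids detectImportOptions entirely and uses only readtable options, so it may be an option on older releases, though it has not been checked against R2016b here.

```matlab
% Stand-in for an R export (hypothetical sample data, written to a temp file).
filename = [tempname, '.txt'];
fid = fopen(filename, 'w');
fprintf(fid, '"var1"\t"var2"\n"row1"\t1\t2\n"row2"\t3\t4\n');
fclose(fid);

% Read the short header line by hand.
fid = fopen(filename, 'r');
assert(fid ~= -1, 'Could not open %s.', filename);
headerLine = fgetl(fid);
fclose(fid);

% Split on tabs, then strip the double quotes and any surrounding blanks.
names = strtrim(strrep(strsplit(headerLine, sprintf('\t')), '"', ''));
names = matlab.lang.makeValidName(names); % guard against invalid identifiers

% Skip the header line and treat the first column as row names.
f = readtable(filename, 'FileType', 'text', 'Delimiter', '\t', ...
    'HeaderLines', 1, 'ReadVariableNames', false, 'ReadRowNames', true);

% Attach the names read from the header; this errors if the counts differ.
f.Properties.VariableNames = names;
```

If the header can contain duplicate names, passing the list through matlab.lang.makeUniqueStrings before the last assignment should avoid an error there.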
