readtable error : read all columns into a single column

8 views (last 30 days)
I am using readtable to read a *.txt file into table
data = readtable('test.txt', 'Format','auto')
the data file have 10 columns, but in the loaded data table, all columns are cancatecated into a single column, any ideas how to fix this?
thank you.
  4 Comments
dpb
dpb on 9 Nov 2023
You forgot to attach the data file so we have to guess...
First, try
data = readtable('test.txt');
with no additional parameters; generally the input scanner is pretty competent about finding the delimiter on its own.
If that fails, then telling it what the delimiter is will be next; the image you attached is from Excel where you imported it, it is NOT the text file; the Q? is whether Excel was able to detect the delimiter on its own or did you have to use the whizard to separate columns -- if the text file is fixed-width, that is generally the case and would be one likely cause for your result in MATLAB.
Guohua
Guohua on 9 Nov 2023
Please see the attached .txt file, adding "ExpectedNumVariables" won't solve the issue.
I also tried "Format"
readtable(filepath,'Format','%s%s%s%f%f%f%f%f%f%f%f%f%f')
This wont work either.

Sign in to comment.

Answers (2)

Cris LaPierre
Cris LaPierre on 9 Nov 2023
It looks like MATLAB is not correctly identifying the delimitar, which is a tab. You can manually specify that.
Some fo the datatypes are also incorrectly selected. You can use theImport Tool to ineratively set the import parameters and then generate a script.
You could also do that manually.
opts = delimitedTextImportOptions("NumVariables",13);
opts.Delimiter = "\t";
opts.VariableNamesLine = 1;
opts.DataLines = [2,inf];
opts = setvartype(opts,["datetime", "string", "string", "double", "double", "double", "double", "double", "double", "double", "double", "double", "double"]);
data = readtable("data_20091130.txt",opts)
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
data = 939×13 table
Date x_cusip x_sedol ModelComposite_USHISTORY GrowthComposite_USHISTORY QualityComposite_USHISTORY SentimentComposite_USHISTORY ValuationComposite_USHISTORY GrowthOrientedIndustriesForBacktest EquityDistressedFlag_USHistory DividendGrowthProb DistressedCompositeGlobal_Rank_9 DIV_GR_PROB_LogitGlobal_Monthly ___________ ___________ _________ ________________________ _________________________ __________________________ ____________________________ ____________________________ ___________________________________ ______________________________ __________________ ________________________________ _______________________________ 30-Nov-2009 "03216N103" <missing> NaN NaN -0.26873 NaN NaN 1 NaN NaN NaN NaN 30-Nov-2009 "016275109" <missing> NaN NaN 0.028927 NaN NaN 0 NaN NaN NaN NaN 30-Nov-2009 "860341106" "B298752" NaN NaN -0.19644 NaN NaN 0 NaN NaN NaN NaN 30-Nov-2009 "816074405" "B01RQH8" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "002034957" <missing> NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "00846U101" "2520153" 0.025038 1.3957 -1.3517 0.052121 -1.4895 1 0 NaN 7 NaN 30-Nov-2009 <missing> <missing> NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "004403101" "2005650" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "043436104" "2855855" 0.097163 -0.91015 0.72884 0.43454 1.1715 1 0 NaN 10 NaN 30-Nov-2009 "013104104" "2012467" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "002824100" "2002305" -0.17297 0.83537 0.02089 -0.04318 -0.27188 0 0 NaN 10 0.49791 30-Nov-2009 "00081T108" "B0G7SZ5" -0.57832 -1.5548 -0.4272 -0.62305 -0.11965 0 0 NaN 3 NaN 30-Nov-2009 "G0450A105" "2740542" -0.40851 0.65944 0.38131 -1.0877 -0.058327 0 0 NaN 5 NaN 30-Nov-2009 "025195405" "B064919" -0.33473 -1.2372 0.94208 -2.3694 0.79164 0 1 NaN 1 NaN 30-Nov-2009 "008190100" "2034524" -0.068901 0.29572 -0.19788 -0.19018 0.45785 1 0 NaN 9 NaN 30-Nov-2009 "032654105" "2032067" 1.3453 -0.11685 0.14799 2.7512 0 1 0 NaN 9 0.65386
  1 Comment
Cris LaPierre
Cris LaPierre on 9 Nov 2023
Using your second approach also works, but with some additional specifications.
readtable("data_20091130.txt","Format",'%{MM/dd/yyyy}D%s%s%f%f%f%f%f%f%f%f%f%f',...
'Delimiter','\t','TextType','string')
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
ans = 939×13 table
Date x_cusip x_sedol ModelComposite_USHISTORY GrowthComposite_USHISTORY QualityComposite_USHISTORY SentimentComposite_USHISTORY ValuationComposite_USHISTORY GrowthOrientedIndustriesForBacktest EquityDistressedFlag_USHistory DividendGrowthProb DistressedCompositeGlobal_Rank_9 DIV_GR_PROB_LogitGlobal_Monthly __________ ___________ _________ ________________________ _________________________ __________________________ ____________________________ ____________________________ ___________________________________ ______________________________ __________________ ________________________________ _______________________________ 11/30/2009 "03216N103" "" NaN NaN -0.26873 NaN NaN 1 NaN NaN NaN NaN 11/30/2009 "016275109" "" NaN NaN 0.028927 NaN NaN 0 NaN NaN NaN NaN 11/30/2009 "860341106" "B298752" NaN NaN -0.19644 NaN NaN 0 NaN NaN NaN NaN 11/30/2009 "816074405" "B01RQH8" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "002034957" "" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "00846U101" "2520153" 0.025038 1.3957 -1.3517 0.052121 -1.4895 1 0 NaN 7 NaN 11/30/2009 "" "" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "004403101" "2005650" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "043436104" "2855855" 0.097163 -0.91015 0.72884 0.43454 1.1715 1 0 NaN 10 NaN 11/30/2009 "013104104" "2012467" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "002824100" "2002305" -0.17297 0.83537 0.02089 -0.04318 -0.27188 0 0 NaN 10 0.49791 11/30/2009 "00081T108" "B0G7SZ5" -0.57832 -1.5548 -0.4272 -0.62305 -0.11965 0 0 NaN 3 NaN 11/30/2009 "G0450A105" "2740542" -0.40851 0.65944 0.38131 -1.0877 -0.058327 0 0 NaN 5 NaN 11/30/2009 "025195405" "B064919" -0.33473 -1.2372 0.94208 -2.3694 0.79164 0 1 NaN 1 NaN 11/30/2009 "008190100" "2034524" -0.068901 0.29572 -0.19788 -0.19018 0.45785 1 0 NaN 9 NaN 11/30/2009 "032654105" "2032067" 1.3453 -0.11685 0.14799 2.7512 0 1 0 NaN 9 0.65386

Sign in to comment.


Voss
Voss on 9 Nov 2023
filename = 'data_20091130.txt';
% detect import options, specifying tab as delimiter:
opts = detectImportOptions(filename,'Delimiter','\t');
% set variable 2 and 3's data to be interpreted as character vectors:
opts = setvartype(opts,opts.VariableNames([2 3]),{'char','char'});
% read the file:
T = readtable(filename,opts)
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
T = 939×13 table
Date x_cusip x_sedol ModelComposite_USHISTORY GrowthComposite_USHISTORY QualityComposite_USHISTORY SentimentComposite_USHISTORY ValuationComposite_USHISTORY GrowthOrientedIndustriesForBacktest EquityDistressedFlag_USHistory DividendGrowthProb DistressedCompositeGlobal_Rank_9 DIV_GR_PROB_LogitGlobal_Monthly ___________ _____________ ___________ ________________________ _________________________ __________________________ ____________________________ ____________________________ ___________________________________ ______________________________ __________________ ________________________________ _______________________________ 30-Nov-2009 {'03216N103'} {0×0 char } NaN NaN -0.26873 NaN NaN 1 NaN {0×0 char} NaN NaN 30-Nov-2009 {'016275109'} {0×0 char } NaN NaN 0.028927 NaN NaN 0 NaN {0×0 char} NaN NaN 30-Nov-2009 {'860341106'} {'B298752'} NaN NaN -0.19644 NaN NaN 0 NaN {0×0 char} NaN NaN 30-Nov-2009 {'816074405'} {'B01RQH8'} NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'002034957'} {0×0 char } NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'00846U101'} {'2520153'} 0.025038 1.3957 -1.3517 0.052121 -1.4895 1 0 {0×0 char} 7 NaN 30-Nov-2009 {0×0 char } {0×0 char } NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'004403101'} {'2005650'} NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'043436104'} {'2855855'} 0.097163 -0.91015 0.72884 0.43454 1.1715 1 0 {0×0 char} 10 NaN 30-Nov-2009 {'013104104'} {'2012467'} NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'002824100'} {'2002305'} -0.17297 0.83537 0.02089 -0.04318 -0.27188 0 0 {0×0 char} 10 0.49791 30-Nov-2009 {'00081T108'} {'B0G7SZ5'} -0.57832 -1.5548 -0.4272 -0.62305 -0.11965 0 0 {0×0 char} 3 NaN 30-Nov-2009 {'G0450A105'} {'2740542'} -0.40851 0.65944 0.38131 -1.0877 -0.058327 0 0 {0×0 char} 5 NaN 30-Nov-2009 {'025195405'} {'B064919'} -0.33473 -1.2372 0.94208 -2.3694 0.79164 0 1 {0×0 char} 1 NaN 30-Nov-2009 {'008190100'} {'2034524'} -0.068901 0.29572 -0.19788 -0.19018 0.45785 1 0 {0×0 char} 9 NaN 30-Nov-2009 {'032654105'} {'2032067'} 1.3453 -0.11685 0.14799 2.7512 0 1 0 {0×0 char} 9 0.65386

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!