Best way to read large text files (over 2 million rows) into MATLAB?

I need to read in a .csv file with 4 columns and over 2 million rows. Each column consists of a 3-row header followed by 50,000 numerical values; this pattern of a header followed by 50,000 numbers repeats hundreds of times within the same columns until I have over 2 million rows of data.
What is the fastest and most efficient way to read these columns into MATLAB? It isn't a big deal if the cells that contain strings get read in as NaN; I can always fix that after the file has been read in.
The code that I am currently using to read in the data (shown below) takes over 3 hours and completely freezes my computer while it runs.
filename = 'input.csv';
delimiter = ',';
startRow = 1;
%% Read columns of data as strings:
% For more information, see the TEXTSCAN documentation.
formatSpec = '%s%s%s%s%[^\n\r]';
%% Open the text file.
fileID = fopen(filename,'r');
%% Read columns of data according to format string.
% This call is based on the structure of the file used to generate this
% code. If an error occurs for a different file, try regenerating the
% code from the Import Tool.
dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, ...
    'HeaderLines', startRow-1, 'ReturnOnError', false);
%% Close the text file.
fclose(fileID);

Answers (1)

The file consists of many blocks of header lines followed by numerical data(?). There is no high-level function in MATLAB that reads a file like yours directly.
  • Does "50,000 numbers" translate to 12,500 rows?
  • the entire file as one string variable in MATLAB will be approx. 0.2 GB
  • the numerical data converted to double will be less than 0.1 GB
That should fit comfortably in memory.
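
A rough back-of-the-envelope check of those sizes, as a sketch only. The row and column counts come from the question; the figure of about 50 characters per row is purely an assumption for illustration, since the real line length is not known. MATLAB stores each char in 2 bytes and each double in 8 bytes.

```matlab
nRows = 2e6;                 % "over 2 million rows" (from the question)
nCols = 4;                   % four columns (from the question)

% Numeric data as double: 8 bytes per value.
gbDouble = nRows * nCols * 8 / 1e9;            % 0.064 GB, below 0.1 GB

% Whole file as one char vector: 2 bytes per character.
charsPerRow = 50;            % ASSUMPTION, not taken from the real file
gbChar = nRows * charsPerRow * 2 / 1e9;        % 0.2 GB under that assumption

% If "50,000 numbers" means values spread over 4 columns:
rowsPerBlock = 50000 / nCols;                  % 12,500 rows per block
```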
I think that the fastest and most efficient way is to
  1. read the entire file into one string variable
  2. split the string into sub-strings, each of which contains the header lines followed by numerical data
  3. parse the sub-strings with textscan
Filling in the details requires more information on the format of the file.
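
The three steps above could be sketched roughly as follows. This is not a definitive implementation, because the exact file format is unknown. It assumes every block has exactly nHeader header lines followed by exactly nData rows of four comma-separated numbers. The function name readBlocks and its arguments are hypothetical, chosen for illustration. Instead of parsing the header lines, the sketch skips them by line index.

```matlab
function data = readBlocks(filename, nHeader, nData)
% readBlocks  Hypothetical helper: parse repeated header+data blocks.
% ASSUMPTION: each block is nHeader header lines followed by nData rows
% of four comma-separated numbers.

txt = fileread(filename);                 % step 1: whole file as one string
txt(txt == sprintf('\r')) = [];           % drop carriage returns, if any

eol = find(txt == sprintf('\n'));         % positions of the line feeds
if isempty(eol) || eol(end) < numel(txt)
    eol(end+1) = numel(txt) + 1;          % last line has no trailing newline
end
lineStart = [1, eol(1:end-1) + 1];        % first character of every line
lineEnd   = eol - 1;                      % last character of every line

blockLen = nHeader + nData;               % lines per block
nBlocks  = floor(numel(eol) / blockLen);  % incomplete trailing block is ignored
data     = cell(nBlocks, 1);

for k = 1:nBlocks
    first = (k - 1) * blockLen + nHeader + 1;    % first data line of block k
    last  = k * blockLen;                        % last data line of block k
    sub   = txt(lineStart(first):lineEnd(last)); % step 2: one sub-string
    C     = textscan(sub, '%f%f%f%f', 'Delimiter', ',', ...
                     'CollectOutput', true);     % step 3: parse the numbers
    data{k} = C{1};
end

data = vertcat(data{:});                  % one N-by-4 double matrix
end
```

For the file in the question the call might look like data = readBlocks('input.csv', 3, 50000), or with 12500 as the last argument if "50,000 numbers" means 12,500 rows of four values. If the blocks do not all have the same length, the block boundaries would have to be found from the header text instead.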

Asked:

on 8 Aug 2014

Edited:

on 9 Aug 2014
