Reading a huge text file using Textscan()

I have a .txt file with about 3235K lines and 25 space-delimited columns. I use the following code to read the file:
fread = fopen('data.txt', 'r');
formatSpec ='%f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f';
C = textscan(fread, formatSpec);
A = [C{:}];
fclose(fread)
My problem is that it only reads the first 249K lines: A is a 249000x25 double.
This is the output when I typed 'memory':
Maximum possible array: 55317 MB (5.800e+10 bytes) *
Memory available for all arrays: 55317 MB (5.800e+10 bytes) *
Memory used by MATLAB: 859 MB (9.002e+08 bytes)
Physical Memory (RAM): 8101 MB (8.494e+09 bytes)
Is there any way I can read the whole file?

Accepted Answer

Star Strider on 15 Aug 2015
The file itself may be the problem. There could be a text line in it that is stopping textscan.
Try this to see:
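C = textscan(fread, formatSpec);                    % reads until the first line it cannot parse
A = [C{:}];
fseek(fread,0,0);                                   % offset 0 from the current position (origin 0), so the pointer stays where textscan stopped
C = textscan(fread, formatSpec, 'HeaderLines',1);   % skip the offending line, then keep reading
A = [A; C{:}];                                      % A is already a numeric matrix, so append the new rows directly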
You can also use fgets or fgetl to see what the next lines are, and then change the subsequent textscan calls so they work with your file.
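As a rough sketch of that inspection step, the snippet below writes a small three-column demo file and shows the line that stopped textscan. The file contents, the '#PART_BREAK' marker, and the names fname and fidin are made up for illustration; the real file would need the 25-column format and may contain a different kind of stray line.

```matlab
% Sketch: find out what stopped textscan (demo data, not the original file).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1 2 3\n4 5 6\n#PART_BREAK\n7 8 9\n');   % stray line between two blocks
fclose(fid);

fidin = fopen(fname, 'r');
formatSpec = '%f %f %f';              % the file in the question would need 25 x %f
C = textscan(fidin, formatSpec);      % stops at the first field it cannot convert
badLine = '';
if ~feof(fidin)
    stopPos = ftell(fidin);           % byte offset where reading stopped
    badLine = fgetl(fidin);           % remainder of the line at that position
    if ischar(badLine)
        fprintf('Stopped at byte %d after %d rows; next text: "%s"\n', ...
                stopPos, numel(C{1}), badLine);
        disp(double(badLine));        % character codes expose control characters
    end
end
fclose(fidin);
delete(fname);
```

Displaying the character codes is useful when the offending line contains non-printing characters that do not show up in the Command Window.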
NOTE: By naming your file ID ‘fread’, you are inadvertently ‘shadowing’ (or ‘overshadowing’) the built-in fread function. This could be a problem if you need that function later in your code. Rename your file ID ‘fidin’ or something similar to avoid the problem.
  2 Comments
Hg on 15 Aug 2015
You're right. I combined multiple text files, so there are EOF characters between the files. By the way, what is the code above supposed to do?
Star Strider on 15 Aug 2015
Thank you.
The fseek call sets the file pointer so that reading resumes from the point where textscan stopped. The second ‘A’ assignment simply concatenates the data already in ‘A’ with the newly read data.
I just wrote one iteration for illustration purposes. You can put the second and subsequent iterations into a loop if you need to.
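A minimal sketch of such a loop is shown below. It is a variant of the code above, not the same code: it discards the unreadable line with fgetl instead of 'HeaderLines', and it assumes every numeric block ends on a complete row. The demo file, the '#PART_BREAK' marker, and the names fname and fidin are illustrative only.

```matlab
% Sketch: read all numeric blocks, skipping the stray line between them.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1 2 3\n4 5 6\n#PART_BREAK\n7 8 9\n');   % demo data with one stray line
fclose(fid);

formatSpec = '%f %f %f';              % the file in the question would need 25 x %f
fidin = fopen(fname, 'r');
A = [];
while ~feof(fidin)
    C = textscan(fidin, formatSpec);  % read until the next unparseable field or end of file
    A = [A; C{:}];                    % append the rows from this block (may be empty)
    if ~feof(fidin)
        fgetl(fidin);                 % discard the line that stopped textscan
    end
end
fclose(fidin);
delete(fname);
disp(size(A))
```

Each pass either reads rows or consumes one line with fgetl, so the loop always advances toward the end of the file.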
