A faster way to read a large text file

Azza Ahmed on 2 Jul 2012
Hi,
I am reading a large text file (~3 GB) in chunks. Each time I open the file, I read the header lines, then read a chunk of, let's say, 1000 lines, and then close the file before processing the data. When I want to read the next chunk, I re-open the file and step through it line by line with "fgetl" until I reach the line where I stopped previously, and only then grab a new chunk of data. Is there a faster way to grab chunks of lines than stepping through the file from the beginning, line by line, every time?
Any help with this will be highly appreciated.
Best wishes
AA

Answers (2)

Jonathan Sullivan on 2 Jul 2012
Edited: Jonathan Sullivan on 2 Jul 2012
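Before you close the file, store your current position in it (ftell returns a byte offset from the start of the file, not a line number). Then, when you reopen the file, seek to that position.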
Right before you close:
file_position = ftell(fid);
Then after you open it:
fseek(fid,file_position,'bof');
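
To see the two calls working together, here is a minimal sketch of the round trip. The temporary demo file and its contents are made up purely for illustration; only fopen, fgetl, ftell, fseek and fclose are doing the real work.

```matlab
% Minimal sketch: resume reading where the previous pass stopped.
% The demo file and its six lines are assumptions for illustration only.
filename = [tempname '.txt'];
fid = fopen(filename, 'w');
fprintf(fid, 'line %d\n', 1:6);   % the format is reused for each value
fclose(fid);

% First pass: read three lines, remember the byte offset, then close.
fid = fopen(filename, 'r');
for k = 1:3
    fgetl(fid);
end
file_position = ftell(fid);       % byte offset of the start of line 4
fclose(fid);

% Second pass: reopen and jump straight to the saved offset.
fid = fopen(filename, 'r');
status = fseek(fid, file_position, 'bof');   % 0 on success, -1 on failure
next_line = fgetl(fid);           % first line after the saved position
fclose(fid);
delete(filename);

disp(next_line)
```

The second pass never touches the first three lines, so the cost of resuming should not grow with how far into the file you already are.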

Azza Ahmed on 2 Jul 2012
Edited: Walter Roberson on 2 Jul 2012
Hi Jonathan,
Many thanks for getting back to me. I tried something similar earlier, but I had difficulty working out where to put the fseek in my code. Here is the code that reads the file:
StartLine = 11;
NumberOflinesToGet = 1000;
fid1 = fopen(filename);
if fid1 == -1
    disp(' ')
    disp('Operation was unsuccessful - Check the filename')
    y = [];
    return
end
for k = 1:StartLine-1   % this reads the header lines
    fgetl(fid1);
end
k = 1;
while ~feof(fid1) && k <= NumberOflinesToGet   % this reads my data of 1000 lines
    d = fgetl(fid1);
    a = str2num(d);
    a = a(ColumnsToGet);
    y(k,:) = a;
    k = k + 1;
end
if feof(fid1)
    file_position = 0;
else
    file_position = ftell(fid1);
end
fclose(fid1);
So would you kindly show me where to put the fseek? I tried putting it immediately after the end of the first "if" block, but that seemed to complicate things.
Many thanks
AA
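
One possible placement, sketched under some assumptions: the reading code is wrapped in a function (called read_chunk here, a hypothetical name) that takes the saved file_position as an input and returns the updated one. The fseek then goes in place of the header-skipping loop on every call after the first. The ischar check on the result of fgetl is an addition that is not in the code above.

```matlab
function [y, file_position] = read_chunk(filename, file_position, ColumnsToGet)
% READ_CHUNK  Hypothetical wrapper around the reading code above.
% Pass file_position = 0 on the first call, then pass back the value
% returned by the previous call.
StartLine = 11;              % first data line, as in the code above
NumberOflinesToGet = 1000;   % chunk size, as in the code above
y = [];

fid1 = fopen(filename);
if fid1 == -1
    disp('Operation was unsuccessful - Check the filename')
    return
end

if file_position == 0
    % First call: skip the header lines one by one.
    for k = 1:StartLine-1
        fgetl(fid1);
    end
else
    % Later calls: jump directly to where the previous chunk ended.
    fseek(fid1, file_position, 'bof');
end

k = 1;
while k <= NumberOflinesToGet
    d = fgetl(fid1);
    if ~ischar(d)            % fgetl returns -1 once the file is exhausted
        break
    end
    a = str2num(d);
    y(k,:) = a(ColumnsToGet);
    k = k + 1;
end

if feof(fid1)
    file_position = 0;       % same "end of file" convention as the code above
else
    file_position = ftell(fid1);
end
fclose(fid1);
end
```

With this layout the caller would start with file_position = 0, call read_chunk repeatedly while feeding the returned position back in, and stop once it comes back as 0 again.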
