Load parts of VERY LARGE text file content and create a smaller matrix
I have a very large text file. A sample of the format is attached. There is a header whose lines are marked with $$.
The rest of the data is organized in sections: each begins with Start 1, Pos #, followed by two columns of data that run
up to Start 2, Pos #, and so on. Note that the number of rows after Pos # is NOT fixed.
The length of the two columns after Start #, Pos # ranges from 100 to around 500,000 rows.
The Scan # ranges from 1 to around 4000.
I want to read the sections sequentially: the two columns after Start 1, Pos # up to just before Start 2, Pos #, then the two columns after Start 2, Pos #, and so on.
I have tried textscan with block size but this is not working well.
It is not possible to load all data directly into Matlab.
Any directions will be greatly appreciated.
dpb on 11 Mar 2015
Edited: dpb on 12 Mar 2015
OK, stuff's taken care of and I'm in for the evening (we're back to family farm; retired from the consulting gig so this is my fun at keeping hand in a little). Anyway, the basic outline is--
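fid=fopen('sample.txt','r'); % placeholder file name; substitute your own
id=cell2mat(textscan(fid,'Start %d','headerlines',5)); % first section ID
pos=cell2mat(textscan(fid,'Pos %f')); % position value for the section
dat=cell2mat(textscan(fid,'%f %f')); % two data columns for the section
fprintf('\n Section %d\n', id), fprintf('%.3f %d\n',dat.')
% process first section here
while ~feof(fid) % w/ the header out of the way, do rest of file...
  id=cell2mat(textscan(fid,'Start %d')); % next section ID, no header to skip
  pos=cell2mat(textscan(fid,'Pos %f'));
  dat=cell2mat(textscan(fid,'%f %f'));
  fprintf('\n Section %d\n', id), fprintf('%.3f %d\n',dat.')
  % process subsequent sections here, of course...
end
fid=fclose(fid);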
As you see, you're lucky with the blank line in the file: it terminates the numeric conversion, so all you need to handle the indeterminate section lengths is the two-field format, which returns the array in the right shape. Note I also went ahead and cast the cell output from textscan to an ordinary array at the time of the read; I almost always do this unless there's some specific reason for needing a cell array.
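To tie this back to the "smaller matrix" in the question, here is a minimal sketch of the same read loop that keeps only one summary row per section instead of the full data. It is only an illustration under stated assumptions: the attachment isn't reproduced here, so the script writes its own tiny test file in an assumed layout (five $$ header lines, then Start / Pos / two columns / blank line), and the names fname, summary and hdr as well as the choice of reduction (row count and maximum of column 2) are hypothetical placeholders for whatever processing is actually wanted.

```matlab
% Write a tiny test file in the ASSUMED layout (not the real attachment):
% 5 header lines marked $$, then sections of Start / Pos / two columns.
fname = [tempname '.txt'];                 % hypothetical scratch file
fid = fopen(fname,'w');
fprintf(fid,'$$ header line %d\n',1:5);
fprintf(fid,'Start 1\nPos 10.5\n0.100 3\n0.200 4\n0.300 5\n\n');
fprintf(fid,'Start 2\nPos 11.5\n0.100 7\n0.200 9\n\n');
fclose(fid);

% Read one section at a time; only the current section is held in memory.
fid = fopen(fname,'r');
assert(fid > 0,'could not open %s',fname)
summary = zeros(0,4);                      % [id pos nRows maxOfCol2]
hdr = 5;                                   % skip the header on the first read only
while ~feof(fid)
    id  = cell2mat(textscan(fid,'Start %d','HeaderLines',hdr));
    pos = cell2mat(textscan(fid,'Pos %f'));
    dat = cell2mat(textscan(fid,'%f %f'));
    hdr = 0;
    % Stop rather than loop forever if the text does not match the layout.
    if isempty(id) || isempty(pos) || isempty(dat), break, end
    % Assumed reduction: keep one row of statistics per section.
    summary(end+1,:) = [double(id) pos size(dat,1) max(dat(:,2))]; %#ok<AGROW>
end
fclose(fid);
delete(fname)                              % remove the scratch file
disp(summary)
```

The isempty check is there because feof alone never becomes true if textscan stalls on a line it cannot convert. With around 4000 sections, growing summary inside the loop is cheap; preallocate it if the section count is known in advance.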