read ascii non-delimited file
How do I read a numeric, non-delimited file in MATLAB? I tried using
fid=fopen('filename','r')
This returned fid = -1, and then textscan does not work.
The load function does not work on non-delimited files.
Is there any other way to read such a file?
Here is an example of how my data file looks:
-99.9999-99.9999-99.9999-99.9999 -1.2828 -1.2812
 -1.0910 -1.1864 -1.2920-99.9999-99.9999-99.9999
-99.9999-99.9999-99.9999-99.9999-99.9999-99.9999
Thanks in advance!
6 Comments
Joseph Cheng
on 25 Jun 2014
Edited: Joseph Cheng
on 25 Jun 2014
There might be something a bit strange going on with your fopen statement. The file format shouldn't matter; you should be getting a positive number. A -1 means the file failed to open (file not found, insufficient permissions, etc.), and since the open failed, textscan can't work either. I'd revisit the fopen statement.
Is the data all negative? Or is it delimited by '-'? How do you differentiate 99.999999.9999, or even 99.99990.9999?
"Is the data all negative? Or is it delimited by '-'? How do you differentiate 99.999999.9999, or even 99.99990.9999?"
It's a fixed-width field, each is 8 columns wide. The values with preceding '-' signs are negative, the rest aren't. The substring you made up of 99.999999.9999 isn't valid; two positive 99.9999 values would be printed as ' 99.9999 99.9999' as the format used was F8.4 (%8.4f in C/Matlab). It appears in the file the -99.9999 value is probably a missing value indicator.
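To make the F8.4 point concrete, here is a minimal sketch (the variable names are only for illustration) showing that sprintf with '%8.4f' always emits 8 characters per value, so negative values run together while positive ones keep a leading blank:

```matlab
% Each value occupies exactly 8 columns under '%8.4f' (Fortran F8.4).
neg = sprintf('%8.4f', [-99.9999 -99.9999]);   % no blank between the fields
pos = sprintf('%8.4f', [ 99.9999  99.9999]);   % each field starts with a blank

disp(['[' neg ']']);
disp(['[' pos ']']);

% Both strings are 2 fields x 8 columns = 16 characters long.
widthNeg = numel(neg);
widthPos = numel(pos);
```

This is why a substring such as 99.999999.9999 cannot occur in a file written with this format, as long as every value fits in the field.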
Joseph Cheng
on 26 Jun 2014
My comment was before he formatted it in that fashion. It previously was just all concatenated together where it appeared to be all in one line.
dpb
on 26 Jun 2014
Perhaps you'd have to come from a Fortran background, going back to punch-card days where column-delimited fields were (and are) the norm, to presume the run-together look was only a fig-newton of the word-wrap formatting... I'll not bother to mention how old that must make me. :)
Joseph Cheng
on 26 Jun 2014
Luckily/unluckily, I never really "had" to do stuff on punch cards. However, I've been on the receiving end of some horrendous text file formatting where neither delimiters nor even fixed widths could be taken for granted.
Had I seen the question as the poster has reformatted it now, it would have been clear there was a formatting pattern.
dpb
on 26 Jun 2014
"...been on the receiving end of some horrendous text file formatting where neither delimiters nor even fixed widths could be taken for granted."
And probably most came from C or like languages where the "stream file" concept of run-on runs rampant. In Fortran even "list-directed" output is quite easy to parse.
I still can't believe I whiffed on the precision specifier for so long with Matlab in reading a fixed-width field, though. I've not yet taken the time to go back to the earlier release to ensure it worked there, too, though...maybe I'll find it was broken before but I'm not holding my breath.
Accepted Answer
More Answers (3)
per isakson
on 26 Jun 2014
Edited: per isakson
on 27 Jun 2014
Try
fid = fopen( 'cssm.txt', 'r' );
cac = textscan( fid, '%8.4f%8.4f%8.4f%8.4f%8.4f%8.4f', 'Whitespace',' ');
fclose( fid )
celldisp( cac )
which returns
cac{1} =
-99.9999
-1.0910
-99.9999
cac{2} =
-99.9999
-1.1864
-99.9999
cac{3} =
-99.9999
-1.2920
-99.9999
cac{4} =
-99.9999
-99.9999
-99.9999
cac{5} =
-1.2828
-99.9999
-99.9999
cac{6} =
-1.2812
-99.9999
-99.9999
>>
and where cssm.txt contains
-99.9999-99.9999-99.9999-99.9999 -1.2828 -1.2812
 -1.0910 -1.1864 -1.2920-99.9999-99.9999-99.9999
-99.9999-99.9999-99.9999-99.9999-99.9999-99.9999
In response to comments:
To read many, N, columns with identical format use
N = 6;
format_spec = repmat( '%8.4f', 1, N );
fid = fopen( 'cssm.txt', 'r' );
cac = textscan( fid, format_spec, 'CollectOutput', true );
fclose( fid );
celldisp( cac )
which returns
cac{1} =
-99.9999 -99.9999 -99.9999 -99.9999 -1.2828 -1.2812
-1.0910 -1.1864 -1.2920 -99.9999 -99.9999 -99.9999
-99.9999 -99.9999 -99.9999 -99.9999 -99.9999 -99.9999
One more revisit: The precision specifier is NOT needed
N = 6;
% format_spec = repmat( '%8.4f', 1, N );
format_spec = repmat( '%8f', 1, N );
fid = fopen( 'cssm.txt', 'r' );
cac = textscan( fid, format_spec, 'CollectOutput', true );
fclose( fid );
celldisp( cac )
returns
cac{1} =
-99.9999 -99.9999 -99.9999 -99.9999 -1.2828 -1.2812
-1.0910 -1.1864 -1.2920 -99.9999 -99.9999 -99.9999
-99.9999 -99.9999 -99.9999 -99.9999 -99.9999 -99.9999
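For anyone who wants to try this without the original file, here is a self-contained sketch. It writes two of the sample rows to a temporary file in the F8.4 layout, reads them back with the repmat format, and then replaces -99.9999 with NaN. Treating -99.9999 as a missing-value flag is only an assumption (it was suggested in the comments on the question), and the temporary file name is made up for the example.

```matlab
% Sample values taken from the question; the file layout (F8.4) is assumed.
vals = [-99.9999 -99.9999 -99.9999 -99.9999  -1.2828  -1.2812
         -1.0910  -1.1864  -1.2920 -99.9999 -99.9999 -99.9999];
N = size(vals, 2);

% Write the rows as fixed-width fields with no delimiters.
fname = [tempname '.txt'];                       % hypothetical scratch file
fid = fopen(fname, 'w');
fprintf(fid, [repmat('%8.4f', 1, N) '\n'], vals.');
fclose(fid);

% Read them back with an explicit width and precision per field.
fid = fopen(fname, 'r');
cac = textscan(fid, repmat('%8.4f', 1, N), 'CollectOutput', true);
fclose(fid);
delete(fname);

% Assumption: -99.9999 marks a missing value, so map it to NaN.
data = cac{1};
data(abs(data + 99.9999) < 1e-6) = NaN;
```

Using NaN rather than the sentinel keeps the flagged entries out of later statistics, provided the functions used are told to ignore NaN.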
7 Comments
+123!!!
Per, I don't know why I hadn't ever thought to explicitly encode the precision as well as the field width (although in my defense I'll note that in all my years of complaining, nobody else has ever pointed it out, either). The latter includes the response from TMW on the previous behavior/enhancement request.
That does, however, cause the scanner to actually stop and then pick up again at the right spot.
Now can you explain how/why that's enough and a simple '%8f' isn't and is that somehow/somewhere documented within C? I've never seen anything that implied it, certainly, in all the looking I've done, but I'm certainly anything but a C-lizard.
Anyway, thanks for sticking that in here! I'll make the note on the aforementioned request and ask for a documentation enhancement.
PS. The 'whitespace' parameter isn't needed--I wondered at first if that was the trick that somehow I had missed initially but it is simply having the explicit exact format that is the deal, it seems.
vacube
on 26 Jun 2014
per isakson
on 26 Jun 2014
Edited: per isakson
on 26 Jun 2014
vacube, I've added a modified script to my answer.
per isakson
on 26 Jun 2014
Edited: per isakson
on 26 Jun 2014
dpb, I cannot recall how I came to try this format specification. Anyhow, it was definitely not based on the documentation of C. I've never used C. To me, this works automagically.
I'm not always comfortable with default values. Obviously, whitespace is not needed. My first reading was interrupted by " -1." I added whitespace,' ' and it worked. Now, I cannot reproduce the problem. However, with whitespace,'' it doesn't work.
It is a good idea to ask TMW for documentation.
OK, Per, I thought perhaps you did have intimate knowledge of the C spec for a basis. Certainly the explicit format string should work; again why I hadn't used it previously in my many previous set-tos with Matlab and fixed-column-width files I can't fathom; certainly did much with them in Fortran.
I understand the empty string for whitespace not working; that would eliminate the needed blanks in some of the fields. Who knows on the occasional irreproducible glitch? Generally it's a typo that can't be reproduced or some other "gotcha".
Anyway, again, I'm certainly pleased to have discovered the result, even if there are still nuances I don't fully understand.
I've had a multitude of conversations with TMW support over the years on it, although I guess I have never posed the question of what's different in parsing a fixed-width field of W characters with %Wf vis-a-vis %W.Pf, that's true.
Anyway, thanks again, and thanks for the followup...I'll go away now on this particular subject. :)
per isakson
on 27 Jun 2014
Edited: per isakson
on 27 Jun 2014
Surprise:
>> sscanf( '1.23.45.6', '%3f%3f%3f' )
ans =
1.2000
3.4000
5.6000
and
>> textscan( '1.23.45.6', '%3f%3f%3f' )
ans =
[1.2000] [3.4000] [5.6000]
The sample case here is an anomaly; all columns are separated by either a '-' or a blank, and that's enough to get the parser back on track. When the fields are truly distinguished only by column width, without blanks or signs to help, the parser gets lost. To see the issue, try the test.dat file in the earlier thread.
Or, even easier to see...
>>sscanf( '1.2   5.6', '%3f%3f%3f' )
ans =
1.2000
5.6000
>>
The intermediate 'bbb' column gets skipped over...
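One way around the skipped blank field is to cut the string by column position first and convert afterwards. A minimal sketch, assuming every field is exactly w characters wide and the string length is a multiple of w:

```matlab
s = '1.2   5.6';          % three 3-character fields, the middle one blank
w = 3;                    % assumed fixed field width

% Reshape so each row holds one field, then convert field by field.
fields = cellstr(reshape(s, w, []).');
vals   = str2double(fields);   % a blank field converts to NaN
```

Here the blank column keeps its position in the result as NaN instead of being silently dropped.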
dpb
on 25 Jun 2014
I see IA answered the problem about opening the file; that's just the beginning as Matlab has no way to read a nondelimited, fixed-width file with native methods. There's an extended thread here that I just checked on status of earlier today...
for a long and ugly story.
For your case with a fixed set of fields for all the data, probably the simplest is to just read as character data and then arrange to parse the columns in order...I put the two records into a file and...
>> type vacube.dat
-99.9999-99.9999-99.9999-99.9999 -1.2828 -1.2812 -1.0910 -1.1864 -1.2920
-99.9999-99.9999-99.9999-99.9999-99.9999-99.9999-99.9999-99.9999-99.9999
>> c=textread('vacube.dat','%s','delimiter','\n'); c=char(c{:});
>> d=zeros(size(c,1),size(c,2)/8);
>> j=0; for i=1:8:size(c,2), j=j+1; d(:,j)=str2num(c(:,i:i+7)); end  % size(c,2) is the line width; length(c) would be wrong if there were more rows than columns
>> d
d =
-99.9999 -99.9999 -99.9999 -99.9999 -1.2828 -1.2812 -1.0910 -1.1864 -1.2920
-99.9999 -99.9999 -99.9999 -99.9999 -99.9999 -99.9999 -99.9999 -99.9999 -99.9999
>>
Above
a) reads into cell array, converts to character
b) preallocates a data array based on length of line and known 8-wide data fields
c) loops over the columns 8 at a time and converts the substrings, storing results in the appropriate column of the data array.
Image Analyst
on 25 Jun 2014
Is filename a variable? If so, try it without the quotes:
fid = fopen(filename,'rt')
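Either way, it helps to ask fopen why it failed. A minimal sketch, using a deliberately non-existent (hypothetical) path to show the second output of fopen:

```matlab
% Hypothetical path that is assumed not to exist, to trigger the failure case.
filename = fullfile(tempdir, 'no_such_fixed_width_file_12345.txt');

[fid, msg] = fopen(filename, 'r');
if fid == -1
    % msg carries the system's explanation, e.g. a "no such file" message.
    fprintf('Could not open %s: %s\n', filename, msg);
else
    % ... read the data here, then release the handle.
    fclose(fid);
end
```

If the message says the file does not exist, check the current folder (pwd) or pass the full path.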
2 Comments
vacube
on 26 Jun 2014
That's the repmat trick...
fmt=repmat('%8.4f',1,360);
data=cell2mat(textscan(fid,fmt));
The above cell2mat will convert the cell array from textscan to a regular array for subsequent ease of addressing (by eliminating the {} notation).
As an aside, I still wish MATLAB followed the Fortran FORMAT style so one could write
fmt='360F8.4'
-- much simpler and more legible syntax than repmat
Since it's regular and you know the number of columns, you can also just use '%8.4f' and then reshape the returned vector, remembering that the values arrive in file (row) order while MATLAB fills arrays column by column, hence the transpose.
dat=reshape(textread(filename,'%8.4f'),360,[]).';
While deprecated, textread has much of the functionality of textscan and a couple of advantages where you don't need all of its fancier cousin's features: it does away with fopen/fclose by handling the file opening internally, and it returns a regular double array instead of a cell array, saving the extra step of converting.
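The reshape-and-transpose step is easy to check on a toy vector. A small sketch with 3 columns per record instead of 360 (the numbers are made up for illustration):

```matlab
% Values in the order they would be read from a file with rows
% [1 2 3] and [4 5 6].
v    = (1:6).';
ncol = 3;                      % fields per record (360 in the real file)

% reshape fills column by column, so each record lands in a column;
% the transpose turns records back into rows.
dat = reshape(v, ncol, []).';
```

Leaving out the transpose would give a 3-by-2 array with one record per column.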