How can I read in this file of varying data types?

I need to read the year, magnitude, and location (city and state) from this file and then print them back out. I am running into issues on the second line because of New Madrid, MO, where the town name contains a space.
.txt file:
January 7, 2016 4.8 Fairview, OK
February 7, 1812 7.5 New Madrid, MO
February 10, 2016 2.8 Avenal, CA
August 23, 2011 5.8 Mineral, VA
fid = fopen('earthquake_data.txt','r');
fprintf('Year\t\tMoment Magnitude\t\tLocation\t\tMercalli Intensity\n');
for i = 1:4
    yearis = fscanf(fid,'%*s%*s%f',1);
    mag = fscanf(fid,'%f',1);
    townis = fscanf(fid,'%*s%s',1);
    stateis = fscanf(fid,'%s',1);
    intensity = 0;
    if mag < 2
        intensity = '1';
    elseif mag < 3
        intensity = '1-2';
    elseif mag < 4
        intensity = '2-4';
    elseif mag < 5
        intensity = '4-6';
    elseif mag < 6
        intensity = '6-8';
    elseif mag < 7
        intensity = '7-10';
    else
        intensity = '>8';
    end
    fprintf('%d\t\t %.2f \t\t\t%s \t\t%s\n',yearis,mag,townis,intensity)
end
  1 Comment
dpb on 28 Feb 2019
Edited: dpb on 28 Feb 2019
The problem is a poorly constructed file: its records contain embedded blanks, but the strings are not written as double-quoted text that '%q' could read.
The first option would be to fix the generator of the input file.
Failing that, you would have to read each record in its entirety and parse it by context to separate the various pieces.
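As a rough illustration of parsing one record by context, here is a minimal sketch using a regular expression with named tokens. The pattern and the field names (year, mag, town, state) are my own assumptions based on the four sample lines in the question; a file with other layouts (for example a state that is not a two-letter code) would need a different pattern.

```matlab
% Sketch only: parse one whole record with a regular expression.
% The pattern and the token names are illustrative assumptions.
rec = 'February 7, 1812 7.5 New Madrid, MO';

% month day, year magnitude town, state
pat = ['^\w+\s+\d+,\s+(?<year>\d{4})\s+(?<mag>\d+(?:\.\d+)?)\s+', ...
       '(?<town>.+),\s*(?<state>[A-Z]{2})\s*$'];

tok = regexp(rec, pat, 'names', 'once');

year  = str2double(tok.year);
mag   = str2double(tok.mag);
town  = tok.town;    % may contain blanks, e.g. 'New Madrid'
state = tok.state;

fprintf('%d\t%.1f\t%s, %s\n', year, mag, town, state);
```

Because the town is matched as "everything between the magnitude and the last comma", a blank inside the town name no longer matters.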


Answers (2)

Jan on 28 Feb 2019
Edited: Jan on 28 Feb 2019
One way - old-fashioned without textscan:
% UNTESTED!!! Written in the forum's interface without testing
[fid, msg] = fopen('earthquake_data.txt', 'r');
if fid == -1
    error('Cannot open file: %s', msg);
end
fprintf('Year\t\tMoment Magnitude\t\tLocation\t\tMercalli Intensity\n');
% Bin edges follow the thresholds in the question: <2, <3, ..., <7, else
IntensityValue = [0, 2, 3, 4, 5, 6, 7, Inf];
IntensityList = {'1', '1-2', '2-4', '4-6', '6-8', '7-10', '>8'};
while ~feof(fid)
    S = fgetl(fid);
    if ~ischar(S) || isempty(S) % Care about trailing new line
        continue;
    end
    % 'Month day, year mag town, state' splits into 3 parts at the commas
    C = strtrim(strsplit(S, ','));
    % The 4th output of sscanf is the position after the last scanned character
    [Num, ~, ~, next] = sscanf(C{2}, '%d %g', 2);
    Year = Num(1);
    Mag = Num(2);
    Town = strtrim(C{2}(next:end));
    State = C{3};
    index = discretize(Mag, IntensityValue);
    Intensity = IntensityList{index};
    fprintf('%d\t\t %.2f \t\t\t%s \t\t%s\n', Year, Mag, Town, Intensity);
end
fclose(fid);
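For what it is worth, the lookup from magnitude to intensity label can also be sketched without discretize. The upper bounds and labels below are copied from the if/elseif chain in the question; the variable names (upperBound, labels, mags, result) are my own, and the magnitudes are the four values from the sample file.

```matlab
% Sketch only: threshold lookup equivalent to the if/elseif chain.
upperBound = [2, 3, 4, 5, 6, 7, Inf];
labels     = {'1', '1-2', '2-4', '4-6', '6-8', '7-10', '>8'};

mags   = [4.8, 7.5, 2.8, 5.8];   % magnitudes from the sample file
result = cell(size(mags));
for k = 1:numel(mags)
    % Index of the first upper bound the magnitude is strictly below
    idx = find(mags(k) < upperBound, 1);
    result{k} = labels{idx};
end
```

There is one label per upper bound, so the two lists always have the same length.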

Akira Agata on 14 Mar 2019
Another possible solution:
fid = fopen('earthquake_data.txt','r');
s = textscan(fid,'%s','Delimiter','\n');
s = s{1};
fclose(fid);
% Turn each blank that follows a digit into a comma, then split on commas
s = regexprep(s,'(?<=[0-9]) ',',');
s = split(s,',');
Year = str2double(s(:,2));
Mag = str2double(s(:,3));
Town = s(:,4);
% Assign from the highest threshold down, so that each narrower range
% overwrites the wider one assigned before it
Intensity = cell(size(Mag));
Intensity(Mag>=7) = {'>8'};
Intensity(Mag<7) = {'7-10'};
Intensity(Mag<6) = {'6-8'};
Intensity(Mag<5) = {'4-6'};
Intensity(Mag<4) = {'2-4'};
Intensity(Mag<3) = {'1-2'};
Intensity(Mag<2) = {'1'};
Result = table(Year,Mag,Town,Intensity);
The result is as follows:
>> Result
Result =
  4×4 table
    Year    Mag        Town        Intensity
    ____    ___    ____________    _________
    2016    4.8    'Fairview'      '4-6'
    1812    7.5    'New Madrid'    '>8'
    2016    2.8    'Avenal'        '1-2'
    2011    5.8    'Mineral'       '6-8'
