Importing line-by-line from a text file

Question

0 votes

Hi I have a problem choosing what kind of tool to import data from a text file. The data looks something like this although the lines are like 4000 and

35,       0401-BA1,1,6,8,00000,0,01,053,50530,'Channel                 ',,
35,       0401-CB1,1,6,8,00000,0,01,051,50510,'Spillway                ',,
35,       0401-EB1,1,3,2,00052,0,01,051,50510,'RioGrande               ',,
35,       0401-FS1,1,6,8,00203,0,01,011,27340,'Spillway                ',,
35,       0401-MB1,1,6,8,00000,0,01,037,0    ,'Canal                   ',,

I used the fscanf , textscan but I don't understand how to tackle the spaces. for the second space is it like %15c ?? Also How do I switch to another line should I use /n? Please help me here guys. Thank you

1 Comment
Show -1 older comments Hide -1 older comments

Star Strider on 3 Apr 2015

Attaching the text file (or a representative part of it) would be nice. (Use the ‘paperclip’ icon.)

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

per isakson on 3 Apr 2015

Edited: per isakson on 4 Apr 2015

Open in MATLAB Online

0 votes

Hint

    str = '35,  0401-BA1,1,6,8,00000,0,01,053,50530,''Channel     '',,';
    fmt = '%d%s%d%d%d%d%d%d%d%d%s%s';
    cac = textscan( str, fmt               ...
                ,   'Delimiter'    , ','   ...
                ,   'CollectOutput', true  );

" how to tackle the spaces" &nbsp depends on how you want the spaces in the resulting string variable values
"line-by-line" &nbsp what exactly do you mean?

&nbsp

Addendum I

This illustrates an approach. However, due to various mismatches between the header row and the data rows the result is not useful.

sas = cssm()

returns

    1x4000 struct array with fields:
        STATE_CODE_001
        STRUCTURE_NUMBER_008
        RECORD_TYPE_005A
        ROUTE_PREFIX_005B
        SERVICE_LEVEL_005C
        ROUTE_NUMBER_005D
        DIRECTION_005E
        HIGHWAY_DISTRICT_002
        COUNTY_CODE_003
        PLACE_CODE_004
        ....

where

    function    sas = cssm 
    ffs = 'c:\m\cssm\NM14.txt';
    fid = fopen( ffs );
    str = fgetl( fid );
    col_head_list = parse_header( str );
    sas = repmat( cell2struct( repmat({''},1,112), col_head_list(1:112), 2 ), 1,4000 ); 
    jj  = 0;
    while not( feof(fid) )
        str = fgetl( fid );
        buf = parse_data_row( str );
        jj  = jj + 1;
        sas(1,jj) = cell2struct( buf(1:112), col_head_list(1:112), 2 );
    end
    fclose( fid );
    end
    function    cac = parse_header( str )
        cac = strsplit( str, ',' );
        cac = strtrim( cac );
    end
    function    cac = parse_data_row( str )
        cac = strsplit( str, ',' );
        cac = strtrim( cac );
    end

&nbsp

Addendum II

I was fooled by the function, strsplit, which as default treats multiple delimiters as one. In my world that only applies to space. Anyhow, now I have fixed the function cssm.

Read the data to a structure

>> sas=cssm;

find all bridges, which have STATUS_WITH_10YR_RULE equal to '2'.

    >> is_sw10yrEq2 = strcmp('2',{sas.STATUS_WITH_10YR_RULE});
    >> sum(is_sw10yrEq2)
    ans =
       324

there are 324 of them. List their building year

    >> {sas(is_sw10yrEq2).YEAR_BUILT_027}
    ans = 
      Columns 1 through 9
        '1915'   '1936'   '1928'   '1931'   '1920'   '1951'   '1933'   '1939'   '1939'
      Columns 10 through 18
        '1924'   '1931'   '1931'   '1931'   '1932'   '1932'   '1936'   '1937'   '1936'
      ....

where

    function    sas = cssm 
        ffs = 'c:\m\cssm\NM14.txt';
        fid = fopen( ffs );
        str = fgetl( fid );
        colhead_list = parse_header( str );
        n_colhead = length( colhead_list );
        sas = repmat( cell2struct                                   ...
                    (   repmat({''},1,n_colhead), colhead_list, 2 ) ...
                    ,   1,4000                                      ); 
        jj  = 0;
        while not( feof(fid) )
            str = fgetl( fid );
            buf = parse_data_row( str );
            jj  = jj + 1;
            sas(1,jj) = cell2struct( buf, colhead_list, 2 );
        end
        fclose( fid );
        sas(1,jj+1:end) = [];
    end
    function    cac = parse_header( str )
        cac = strsplit( str, ',', 'CollapseDelimiters',false );
        cac = strtrim( cac );
    end
    function    cac = parse_data_row( str )
        cac = strsplit( str, ',', 'CollapseDelimiters',false );
        cac = strtrim( cac );
    end

&nbsp

Addendum III

Caveat: I hardly used the new (couple of years old), data type table. Nevertheless, table is targeted at problems like yours and I think you should

use Import Data interactively to read from a file of format B (one header line, comma separated, varying width) to a table. Make sure to uncheck Multiple Delimiters as One and select table. I've tested; it worked fine.
spend time with the documentation of table instead of fighting file formats.

&nbsp

Addendum IV

It is straightforward to read the file of format B (one header line, comma separated, varying width) with textscan

    >> cac = cssm
    cac = 
        {3951x134 cell}

where

    function    cac = cssm 
        ffs = 'c:\m\cssm\NM14.txt'; % one header line, comma separated, varying width 
        fid = fopen( ffs );
        str = fgetl( fid );
        chl = strtrim( strsplit( str, ',', 'CollapseDelimiters',false ) );
        len = length( chl );
        frm = repmat( '%s', 1, len );
        cac = textscan( fid, frm, 'CollectOutput',true ); 
        fclose( fid );
    end

7 Comments
Show 5 older comments Hide 5 older comments

adrooney on 3 Apr 2015

Open in MATLAB Online

NM14.txt

line-by-line means, scanning the text file row after row. As I put an example in the question. I tried coming up with this code.

NBIFile = 'D:\Re\nbi 2014\NM14.txt';
fid = fopen(NBIFile);
while ~feof(fid)
NBIData = fgetl(fid);
if feof(fid) break
end
% selects only the items of interest for a bridge
nos=['1' '5':'8'];  % define the numbers in the keep list
if (NBIRecord(19)=='1') & (NBIRecord(374)=='Y') & ismember(NBIRecord(200),nos)

What I did above was for the item at posiotn 19, 374 and 200 I took the value and if it is 1 and Y and values 1,5thru8 I need to extract them into another matrix in MATLAB. But I don't know how to do it.

My actual scenario is the text file please find it in the attached, has different items(120 of them) recorded for each bridge and stored row wise for each of the bridge. What I need to do is extract some of the items which I need to compare later and do some analysis. What I did was as the text file had fixed with for each item, I took the physical positions and put that in a if condition, but I am not able to go further. Please could suggest me on this.

adrooney on 3 Apr 2015

recordcount_layout_2014.xlsx

The C option is much more easy. It has got fixed widths. I tried to uses textscan but not successful. Could you give your ideas on the C option. I am attaching the field widths in an xls sheet

per isakson on 3 Apr 2015

Edited: per isakson on 4 Apr 2015

Open in MATLAB Online

"[...] ideas on the C option" &nbsp With Matlab, fixed widths should be avoided and when one cannot:

cac = textscan( fid, '%w1s%w2s%w3s%w4s%w5s...', .... );

where w1,w2,w3, etc. are the widths of the columns, e.g. '%8s%12s%30s'. Convert the "numeric columns" to numerical variables in a second step.

Sign in to comment.

Importing line-by-line from a text file

1 Comment
Show -1 older comments Hide -1 older comments

Accepted Answer

7 Comments
Show 5 older comments Hide 5 older comments

More Answers (0)

Categories

Tags

Community Treasure Hunt

Importing line-by-line from a text file

1 Comment Show -1 older comments Hide -1 older comments

Accepted Answer

7 Comments Show 5 older comments Hide 5 older comments

More Answers (0)

Categories

Tags

See Also

Community Treasure Hunt

1 Comment
Show -1 older comments Hide -1 older comments

7 Comments
Show 5 older comments Hide 5 older comments