Selecting rows from a file starting with a letter followed by 3 numbers

Hi there, I have a function that selects information from two files and combines it into one new file. Importantly, I only want certain rows from file 2 to be included in the new file, namely rows for which the second column starts with 'S1' followed by two more digits (e.g. S124, S132, S112). Below is my code. I don't get an error message, but it also doesn't give me the file I want: it writes only the header of the new file, not the newly compiled lines. What am I doing wrong?
function []=CreateNewMarkersEEGlab(pNumber)
dataFileName=strcat(int2str(pNumber),'_logfile.txt');
fid = fopen(dataFileName);
C = textscan(fid, '%s%s%s%s%s%s%s%s%s%s%s%s%s%s', ...
    'headerlines', 1);%14 columns
rScore=C{12};
sNumber=C{5};
cNumber=C{6};
subNumber=C{7};
tcode=C{10};
fclose(fid);
% compute the new marker
for i=1:156
    if strcmp(rScore{i}, '1')
        respMarker='corrresp';
    else
        respMarker='incorrresp';
    end
    if strcmp(tcode{i},'1')
        corrMarker='corr';
    else
        corrMarker='incorr';
    end
    newMarker{i} = sprintf('S1_con%s_sub%s_%s_%s_Snr_%s', ...
        cNumber{i}, subNumber{i}, corrMarker, ...
        respMarker, sNumber{i});
end
% read the old marker file
dataFileName=strcat('EEG_Anne_',int2str(pNumber),'.vmrk');
fid = fopen(dataFileName);
headline1=fgets(fid);
headline2=fgets(fid);
headline3=fgets(fid);
...
headline13=fgets(fid);
C = textscan(fid, '%s%s%d%d%d','Delimiter',',');
Type=C{1};
Stimulus=C{2};
Position=C{3};
Length=C{4};
Channel=C{5};
fclose(fid);
% rewrite the new marker file
outFileName=strcat('EEG_Anne_',int2str(pNumber),'_new.vmrk');
fid = fopen(outFileName,'w+');
fprintf(fid,headline1);
fprintf(fid,headline2);
fprintf(fid,headline3);
...
fprintf(fid,headline13);
for i=1:156
    if strcmp(Stimulus,'S1\d*')
        fprintf(fid, '%s,%s,%d,%d,%d\r\n', ...
            Type{i}, newMarker{i}, Position(i), Length(i), Channel(i));
    end
end
fclose(fid);

Accepted Answer

dpb on 12 Oct 2015
Not much we can do w/o the data file, but you can do what we can't...
doc debug
Set a breakpoint and work thru the function to see where it fails.
  1 Comment
Anne Mickan on 12 Oct 2015
Ok, so where the function seems to run into trouble is this line
if strcmp(Stimulus,'S1\d*')
I assume it's something to do with the fact that I'm using regex to describe which lines to extract and it doesn't recognize it as regex?
I also attached two example files so you can run the script: the logfile (which is read first) and the marker file (which is read second). The marker file, EEG_Anne_117, should actually have the extension .vmrk, but since I cannot upload that file format I changed it to .txt. Both files need to be in the same folder.


More Answers (1)

dpb on 12 Oct 2015
Edited: dpb on 12 Oct 2015
Indeed, per the doc for strcmp and friends, they "know nuthink!" about regular expressions. Use regexp instead.
if regexp(Stimulus{i},'S1\d*')
NB: You can test this at the command line simply by pasting in a line of typical text to work out the kinks...or, of course, at the debugger prompt.
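A minimal sketch of that command-line check, assuming a marker description that looks like 'S124' (the sample string is made up for illustration, not taken from the attached files). It only shows that strcmp compares the two strings literally, whereas regexp treats the second argument as a pattern:

```matlab
% Sample marker text; assumed to resemble the second column of the .vmrk file
s = 'S124';

% strcmp compares character by character, so '\d*' is just four literal characters
tfLiteral = strcmp(s, 'S1\d*');

% regexp interprets the pattern; with 'once' it returns the start index of the
% first match, or [] when there is none
tfRegex = ~isempty(regexp(s, 'S1\d*', 'once'));

fprintf('strcmp: %d, regexp: %d\n', tfLiteral, tfRegex);
```

Wrapping the regexp call in ~isempty(...) makes the intent of the if test explicit instead of relying on how if treats an empty result.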
ADDENDUM It may not be critical owing to the nature of the file generation process, but to guarantee there are three and only three digits after the S, use the count field instead of '*' ...
if ~isempty(regexp(Stimulus{i},'^S1\d{2}$','once')) % find S1nn only, not S1n or S1nnn
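To see how the count field behaves, here is a small sketch run over a few made-up codes (the list is hypothetical). It also tries the pattern with the anchors ^ and $ added, on the assumption that the field holds nothing but the code itself; without anchors, regexp is satisfied by a match anywhere in the string:

```matlab
% Hypothetical marker codes, chosen to probe the edge cases
codes = {'S1', 'S12', 'S124', 'S1245', 'XS124'};

% Helper: true where the pattern matches somewhere in the string
hasMatch = @(pat) cellfun(@(c) ~isempty(regexp(c, pat, 'once')), codes);

loose    = hasMatch('S1\d*');      % \d* also allows zero digits after S1
counted  = hasMatch('S1\d{2}');    % two digits required, but extra text is tolerated
anchored = hasMatch('^S1\d{2}$');  % whole string must be S1 plus exactly two digits

disp([loose; counted; anchored]);
```

Only the anchored pattern rejects both the shorter and the longer codes, so that is the one to use if the "exactly S1nn" guarantee matters.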
  5 Comments
Anne Mickan on 13 Oct 2015
Hi, I actually just realized that it didn't work properly. The problem is that the lines from the second file (the one filtered with the regex) are combined with the wrong lines from the first file. Just as some lines are skipped in the second file, the same line numbers are also skipped in the first file, which is not supposed to happen. File 1 has exactly 156 rows, all of which are relevant for the output file (which should also have 156 rows). In other words, I want to combine, for example, rows 7, 13, 23 from file 2 with rows 1, 2, 3 from file 1. How do I tell MATLAB not to skip rows in file 1 as it does for file 2?
PS: I thought that maybe it would be easier to only read in the relevant lines from file 2.
dpb on 13 Oct 2015
Edited: dpb on 13 Oct 2015
As you've written the loop, you could get a maximum of 156 lines, and only if the file contained no lines other than those that satisfy the test.
I don't know the file structure well enough off the top of my head to be certain, but perhaps if you just wrote that loop as
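j=0; % secondary counter: index into newMarker (one entry per logfile row)
for i=1:length(Type)
    % regexp, not strcmp, since the test is a pattern; anchors force S1 + exactly two digits
    if ~isempty(regexp(Stimulus{i},'^S1\d{2}$','once'))
        j=j+1;
        fprintf(fid, '%s,%s,%d,%d,%d\r\n', ...
            Type{i},newMarker{j},Position(i), Length(i), Channel(i));
    end
end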
nirvana just might appear. No promises; but if stuff is in the same order between the two, that at least should produce the right number of outputs as it will go thru the whole list which it would appear the counted loop wouldn't.
If that doesn't work, I suspect the way to solve the problem will revolve about ismember or similar logic to, as you mention, produce the matched pairs by direct lookup/comparison. But, try the above first; it just may suffice.
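For illustration only, one way the direct-lookup idea could be sketched is with a logical mask over the marker column. The data below are small made-up stand-ins for what textscan returns (three logfile rows instead of 156), and the sketch assumes the matching marker rows occur in the same order as the logfile rows; that assumption is not verified against the attached files:

```matlab
% Hypothetical stand-ins for the columns read from the two files
Stimulus  = {'S  1'; 'S124'; 'R  5'; 'S132'; 'S255'; 'S112'};   % marker file, column 2
Position  = [100; 250; 300; 410; 520; 630];                     % marker file, column 3
newMarker = {'S1_con1_sub1_corr_corrresp_Snr_1'; ...
             'S1_con2_sub1_incorr_corrresp_Snr_2'; ...
             'S1_con1_sub2_corr_incorrresp_Snr_3'};             % one per logfile row

% Mark the rows whose code is S1 followed by exactly two digits
isTarget = cellfun(@(c) ~isempty(regexp(c, '^S1\d{2}$', 'once')), Stimulus);
rows = find(isTarget);

% Refuse to pair the files if the counts disagree, rather than misalign silently
if numel(rows) ~= numel(newMarker)
    error('Expected %d matching marker rows, found %d.', numel(newMarker), numel(rows));
end

% The k-th matching marker row is paired with the k-th logfile row
out = cell(numel(rows), 1);
for k = 1:numel(rows)
    out{k} = sprintf('%s,%d', newMarker{k}, Position(rows(k)));
end
disp(out);
```

The count check is the useful part: if the marker file ever holds more or fewer S1nn rows than the logfile has trials, it fails loudly instead of writing mismatched lines.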

