Extract rows from a cell array based on keywords

3 views (last 30 days)
Hello,
I have created the following cell array:
'6440.016' 'hold'
'7378.016' 'max-pressure-evac 60.0'
'7439.016' 'evac-end'
'8372.016' 'max-pressure-evac 60.0'
'8433.016' 'evac-end'
'9365.016' 'max-pressure-evac 60.0'
'9426.016' 'evac-end'
'10357.016' 'max-pressure-evac 60.0'
'10418.016' 'evac-end'
'11338.016' 'max-pressure-evac 60.0'
'11399.016' 'evac-end'
'12298.016' 'max-pressure-evac 60.0'
'12359.016' 'evac-end'
'13259.016' 'max-pressure-evac 60.0'
'13320.016' 'evac-end'
'14221.016' 'max-pressure-evac 60.0'
'14282.016' 'evac-end'
'15184.016' 'max-pressure-evac 60.0'
'15245.016' 'evac-end'
'16149.016' 'max-pressure-evac 60.0'
'16210.016' 'evac-end'
'17115.016' 'max-pressure-evac 60.0'
'17176.016' 'evac-end'
'18082.016' 'max-pressure-evac 60.0'
'18143.016' 'evac-end'
'19050.016' 'max-pressure-evac 60.0'
'19111.016' 'evac-end'
'19424.016' 'hold-end'
'19553.016' 'hold'
'20272.016' 'max-pressure-evac 60.0'
'20333.016' 'evac-end'
'21109.016' 'max-pressure-evac 60.0'
'21170.016' 'evac-end'
'21919.016' 'hold-end'
'22048.016' 'hold'
'22651.016' 'max-pressure-evac 60.0'
'22712.016' 'evac-end'
'23374.016' 'max-pressure-evac 60.0'
'23435.016' 'evac-end'
'23489.016' 'hold-end'
'23619.016' 'hold'
'24131.016' 'max-pressure-evac 60.0'
'24192.016' 'evac-end'
'24762.016' 'max-pressure-evac 60.0'
'24823.016' 'evac-end'
'25060.016' 'hold-end'
'25190.016' 'hold'
'25640.016' 'max-pressure-evac 60.0'
'25701.016' 'evac-end'
'26212.016' 'max-pressure-evac 60.0'
'26273.016' 'evac-end'
'26631.016' 'hold-end'
Clearly it can be seen that there are multiple occaisons (5 to be exact) where the lines 'max-pressure-evac 60.0' and 'evac-end' alternate with eachother more than once. When this happens I would ideally like to extract the numbers corresponding to the second to last 'evac-end'/'max-pressure-evac 60.0' combo. For example, in:
'25190.016' 'hold'
'25640.016' 'max-pressure-evac 60.0'
'25701.016' 'evac-end'
'26212.016' 'max-pressure-evac 60.0'
'26273.016' 'evac-end'
'26631.016' 'hold-end'
I would want to keep the numbers 25701.016 and 26212.016 which corresond to 'evac-end' and 'max-pressure-evac 60.0' respectively. Any help in solving this problem would be greatly appreciated.

Accepted Answer

Jon
Jon on 28 Jan 2022
Edited: Jon on 28 Jan 2022
Let say you call your cell array A, then you could do this:
% make first column into numerical values rather than strings
x = str2double(A(:,1));
% make another vector with the event names, just to make easier to work
% with
event = string(A(:,2));
% find instances of first keyword
idl1 = strcmp(event,['max-pressure-evac 60.0']);
% find instance of second keyword
idl2 = strcmp(event,'evac-end');
% mark all of the locations where first keyword is directly followed by
% second keyword
idl = idl1 & [idl2(2:end);0];
idx = find(idl); % get the corresponding indices
% also mark the locations of the second keyword in the group
idl(idx+1) = true;
% find the end of each group
idx = find(diff(idl) == -1);
% save the second to the last values
evacEndVals = x(idx-2)
maxPressVals = x(idx-1)
with your example data this gives, the following, which I think is what you are asking for.
evacEndVals =
1.0e+04 *
1.8143
2.0333
2.2712
2.4192
2.5701
maxPressVals =
1.0e+04 *
1.9050
2.1109
2.3374
2.4762
2.6212
  6 Comments
Jon
Jon on 28 Jan 2022
I think this does what you ask:
% make first column into numerical values rather than strings
x = str2double(A(:,1));
% make another vector with the event names, just to make easier to work
% with
event = string(A(:,2));
% find instances of first keyword
idl1 = strcmp(event,['max-pressure-evac 60.0']);
% find instance of second keyword
idl2 = strcmp(event,'evac-end');
% mark all of the locations where first keyword is directly followed by
% second keyword
idl = idl1 & [idl2(2:end);0];
idx = find(idl); % get the corresponding indices
% also mark the locations of the second keyword in the group
idl(idx+1) = true;
% find the start end of each group
idxStart = find(diff([0;idl]) == 1);
idxEnd = find(diff(idl) == -1);
% mark the number of repeats at the end of each group
rep = zeros(size(idl));
rep(idxEnd) = (idxEnd - idxStart + 1)/2; % each two elements in a group is one repeat
% store the x values that are to be kept in a location corresponding to the
% end of each group
% initialize with nan so we can tell later which have be reassigned
%
evacEndVals = nan(size(x));
maxPressVals = nan(size(x));
% save the values corresponding to single occurences
idx = find(rep==1); % index of end of each single
evacEndVals(idx) = x(idx);
maxPressVals(idx) = x(idx-1);
% save the second to the last values for multiple occurences
idx = find(rep > 1)
evacEndVals(idx) = x(idx-2);
maxPressVals(idx) = x(idx-1);
% get rid of the unassigned values (nan's)
evacEndVals = evacEndVals(~isnan(evacEndVals))
maxPressVals = maxPressVals(~isnan(maxPressVals))
For your new input data set this gives:
evacEndVals =
1.0e+05 *
1.2791
1.4540
1.6551
1.8611
2.0690
maxPressVals =
1.0e+05 *
1.2785
1.4534
1.7556
1.9503
2.1491
If you don't want to retain the evacEndVals when there aren't repeats you can modify these lines
% save the values corresponding to single occurences
idx = find(rep==1); % index of end of each single
evacEndVals(idx) = x(idx);
maxPressVals(idx) = x(idx-1);
to
% save the values corresponding to single occurences
idx = find(rep==1); % index of end of each single
maxPressVals(idx) = x(idx-1);
Justin Rosenthal
Justin Rosenthal on 31 Jan 2022
Thank you again so much Jon. This is again exactly what I was looking to accomplish.

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!