How could I find the start index of an "approximate" pattern in a binary vector

I have a binary time series which represents the high/low status of a sensor's output over time. I'm looking for a method to efficiently find the start and end indices of a specific pattern of on/off pulses, which looks something like this:
case1 = [0 0 0 0 0 0 0 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1 0 0 0 0 0 0 0 0]
In this case there's a pattern (three high pulses, separated by low pulses of equivalent length) starting at case1(8) and ending at case1(22). I've been able to bodge a partial solution for finding this pattern out of an answer to a similar question which looks for zero islands (below) by looking for three 1 islands of length 3 with start indices quite close together in time.
https://stackoverflow.com/questions/3274043/finding-islands-of-zeros-in-a-sequence
However, in my case the length of the 1 pulses and the number of zeros between them vary slightly due to mechanical issues with the sensor, so we may get variations on the pattern, for example:
case2 = [ 0 1 1 0 0 0 0 1 1 1 0 0 0 1 1 1 1 0 ]
Alternatively, two patterns can occur so close to one another that they merge:
case3 = [0 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1 0 0 0]
The rules for finding the pattern in case1 would therefore not work for case2 or case3. Could anyone therefore offer any advice on how to approach locating a pattern which 'approximately' matches a defined template with some degree of tolerance on the length of pulses within the pattern?

Answers (1)

Grzegorz Knor on 11 Jul 2017
Edited: Grzegorz Knor on 11 Jul 2017
I would try regular expressions:
case1 = [0 0 0 0 0 0 0 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1 0 0 0 0 0 0 0 0]
case2 = [ 0 1 1 0 0 0 0 1 1 1 0 0 0 1 1 1 1 0 ]
case3 = [0 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1 0 0 0]
% transform it to strings
case1str = strrep(num2str(case1), ' ', '')
case2str = strrep(num2str(case2), ' ', '')
case3str = strrep(num2str(case3), ' ', '')
% expression for regexp: each [01]? tolerates one extra sample (0 or 1)
expression = '11[01]?00[01]?11[01]?00[01]?11[01]?';
% find start indices
regexp(case1str,expression)
regexp(case2str,expression)
regexp(case3str,expression)
% matched substrings & their start indices
[expr, ind] = regexp(case3str,expression,'match','start')
Based on expr and ind, you can calculate the last index of each match: it is the start index plus the length of the matched substring, minus 1.
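As a minimal sketch of that last step (not a definitive implementation), the end indices can be obtained in two ways. The first relies on regexp returning start and end indices as its first two outputs when no output keywords are given. The second derives them from the match lengths. The variable names startIdx, endIdx, matches, startIdx2 and endIdx2 are illustrative only, and char(case3 + '0') is used here as an alternative to the num2str/strrep conversion.

```matlab
% Sketch only: variable names below are illustrative, not from the original answer.
case3 = [0 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1 0 0 0];

% Convert the binary vector to a character vector of '0' and '1'
case3str = char(case3 + '0');

% Each [01]? tolerates one extra sample on a pulse or a gap
expression = '11[01]?00[01]?11[01]?00[01]?11[01]?';

% Option 1: start and end indices are the first two default outputs of regexp
[startIdx, endIdx] = regexp(case3str, expression);

% Option 2: request matches and starts explicitly, then add the match lengths
[matches, startIdx2] = regexp(case3str, expression, 'match', 'start');
endIdx2 = startIdx2 + cellfun(@numel, matches) - 1;
```

For case3, both options give start indices [2 17] and end indices [16 31], so the merged pulse train is reported as two back-to-back patterns. Note that the trailing [01]? is greedy, so when the final pulse is only two samples long the match also includes the sample that follows it.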
