I want to implement an end point detection for a tall spike in a sound signal

Hello, I need help extracting the two indices marked with red crosses on the figure below. I need those two indices to extract the first tall wave of the signal, outlined in red. The two red crosses are the points where the signal crosses zero, but I can't use find(y == 0) since no sample is exactly 0.
I need these two points because they bound one period of the tall wave. I have multiple signals like this, all with different peak locations and amplitudes. In all of them, however, the tallest wave is also the first wave.
So far, I've tried this method:
[~, loc] = findpeaks(signal, 'MinPeakHeight', 0.2); % Get index of max amplitude above 0.2 (this threshold is arbitrary).
firstpeak = loc(1);
signal_inverse = -signal; % Invert the signal so that troughs become peaks.
[~, loc] = findpeaks(signal_inverse, 'MinPeakHeight', 0.2); % Get index of max amplitude of the inverted signal.
secondpeak = loc(1);
pad = round((secondpeak - firstpeak) / 2); % Peak to trough is half a period, so this is a quarter period (rounded to a whole index).
signal = signal(firstpeak-pad:secondpeak+pad); % Subtract pad from first peak and add pad to second peak.
This, however, assumes that every wave is clean and symmetric, with no noise.
A better approach I'm considering is zero-crossing detection. I've seen many implementations on the MATLAB community site, but I can't seem to apply them to my signals properly.
Any help would be appreciated. Thank you!!
  3 Comments
Adam Danz on 22 Sep 2020
With these kinds of problems you need to define a set of rules before getting into algorithms and those rules should apply to all possible circumstances.
Judging from the single example in the image, one set of rules may be,
  1. The segment starts the first time the signal passes 0.1% of the signal's height in the positive direction, where the height is defined by the range of y values in your signal.
  2. The segment ends at the first element greater than y=0 after the signal passes into negative y values.
  3. The length of the signal must be greater than 20 samples. If it's less than 20 samples, the segmented part is ignored and the algorithm starts again after the false end-point.
Write down these rules in plain language before developing the algorithm. It's likely that the rules will need to be changed in the process, which is a sign that you're getting closer to the ideal set of rules needed to solve your problem.
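As an illustration of the three rules above, here is a minimal sketch. The signal is synthetic (a short glitch followed by a damped sine), and the names startLevel, minLength, segStart and segEnd are made up for this example. Rule 1 is interpreted here as "the first sample above 0.1% of the signal's range", which is only one possible reading.

```matlab
% Synthetic test signal (an assumption, not the poster's data):
% 50 baseline samples, then a damped sine with a 100-sample period.
t = (0:399)';
y = [zeros(50,1); exp(-t/150) .* sin(2*pi*(t + 0.5)/100)];
y(10:12) = [0.05; -0.05; 0.05];          % short glitch that rule 3 should reject

startLevel = 0.001 * (max(y) - min(y));  % rule 1: 0.1% of the signal's height
minLength  = 20;                         % rule 3: minimum segment length (samples)

segStart = [];
segEnd   = [];
searchFrom = 1;
while searchFrom < numel(y)
    % Rule 1: first sample above the start level
    s = find(y(searchFrom:end) > startLevel, 1) + searchFrom - 1;
    if isempty(s), break; end
    % Rule 2: wait until the signal goes negative, then take the first sample > 0
    n = find(y(s:end) < 0, 1) + s - 1;
    if isempty(n), break; end
    e = find(y(n:end) > 0, 1) + n - 1;
    if isempty(e), break; end
    % Rule 3: accept only segments longer than minLength, otherwise restart
    if e - s + 1 > minLength
        segStart = s;
        segEnd   = e;
        break;
    end
    searchFrom = e + 1;                  % continue after the false end point
end
```

On this synthetic signal the glitch at samples 10 to 12 is discarded by rule 3 and the loop then locks onto the tall wave. Real data will likely need the thresholds tuned.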
Andrew Park on 24 Sep 2020
Edited: Andrew Park on 24 Sep 2020
Thank you both for your input. I cannot figure out how to reply to your comments separately, so I'll address both of your questions in the same box.
****
In my reply, I used the word 'data' a lot, and here I'm referring to the array of y values shown in the figure above.
So,
"original data" -> array of y values as they were detected
"inverted data" -> array of -y values, i.e. the detected values with their sign flipped
****
@Image Analyst, thank you for letting me know about that! I'll read it over. Unfortunately, there are several peaks that could be considered "tall" in the data but only one should be the true "tall" peak (the first peak read in the data). Maybe that makes the problem a bit easier, since I always need to extract the end points of the first wave detected in the data (the red outlined one above).
I don't recall exactly which data set is shown above, so I'm attaching another one of the same form as a text file. Thank you!
@Adam Danz, thank you for the tip! I'll keep that in mind when I develop something in the future. As I was figuring out how the algorithm should work, I think I naturally formulated a set of rules that apply to the method I used above:
  1. The threshold TH for what is considered "tall" in the original data is TH = max(original_data) / 1.5 (chosen purely from my observations of a lot of similar data). So, the first peak of the original data that exceeds this threshold is considered the positive peak of the wave that needs to be extracted.
  2. The threshold TH_inverse for what is considered "tall" in the inverted data is TH_inverse = TH / 2 (again chosen purely from my observations of a lot of similar data).
The negative peak is a bit tricky, since the first peak that exceeds the threshold isn't necessarily the global minimum. For instance, the figure above has a local minimum around x=4940 that could be incorrectly detected as the first negative peak, as long as its magnitude exceeds the threshold. So, I wrote some code that works around this edge case (not perfectly):
(Using MATLAB 2018b)
% finding indices of possible positive and negative peaks that need to be extracted
new_TH = max(data)/1.5; % Set threshold value for "tall" peaks
[~, loc] = findpeaks(data, 'MinPeakHeight', new_TH);
fpeak = loc(1);
data_inverse = -data;
[~, loc] = findpeaks(data_inverse, 'MinPeakHeight', new_TH/2);
% this is the code that takes care of the negative peak bit.
% if findpeaks returns multiple indices of possible negative peaks, then check if the first peak
% and second peak are less than 20 away (chose 20 from observation). If so, consider
% first peak as an outlier peak and consider second peak as the global minimum.
% Otherwise, just consider first peak as the global minimum. If findpeaks only returns
% one index of a negative peak, then just consider that peak as the global minimum.
if numel(loc) > 1
    if (loc(2) - loc(1)) < 20
        speak = loc(2);
    else
        speak = loc(1);
    end
else
    speak = loc(1);
end
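A possible alternative to the hand-written distance check, offered only as a sketch: findpeaks has a 'MinPeakDistance' option that keeps the tallest peak and drops smaller peaks within the given distance of it. The example below uses synthetic data with an artificial notch near the trough, searches only after the positive peak, and reuses the 20-sample distance from the code above. It requires the Signal Processing Toolbox, like the original code.

```matlab
% Synthetic data (an assumption): damped sine with a small notch just
% before the true trough, mimicking the local minimum described above.
t = (0:399)';
data = exp(-t/150) .* sin(2*pi*(t + 0.5)/100);
data(66) = data(66) - 0.05;              % artificial shallow local minimum

new_TH = max(data)/1.5;                  % same threshold rule as above
[~, loc] = findpeaks(data, 'MinPeakHeight', new_TH);
fpeak = loc(1);

% Search for troughs only after the positive peak. 'MinPeakDistance'
% discards smaller peaks closer than 20 samples to a taller one.
[~, locNeg] = findpeaks(-data(fpeak:end), ...
    'MinPeakHeight', new_TH/2, 'MinPeakDistance', 20);
speak = locNeg(1) + fpeak - 1;           % convert back to an index into data
```

This is not identical to the if/else logic above (it compares peak heights, not just positions), so it should be checked against the real recordings before replacing anything.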


Accepted Answer

Adam Danz on 24 Sep 2020
Edited: Adam Danz on 24 Sep 2020
"only one should be the true "tall" peak (the first peak read in the data)"
Are you sure about that? That's this peak in the data you shared. I like your other definition better, "the first peak of the original data that exceeds this threshold value is considered the positive peak".
I have a proposal.
  1. Find the main peak using the rule you already defined. That will give you point "A" in the image below.
  2. Assuming the baseline signal is centered at y=0, points 1, 2, and 3 in the image below are the first 0-crossing before A, the first 0-crossing after A, and second 0-crossing after A. What you need, if I understand correctly, are points #1 and #3.
0-crossings can be approximated with,
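zeroCrossIdx = [false; diff(sign(x))~=0]; % x must be a column vector; true where the sign differs from the previous sample
hold on
plot(find(zeroCrossIdx)-.5, 0, 'mo', 'MarkerSize', 4) % mark each crossing halfway between the two samples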
There are more precise methods, though (e.g., interpolation; also see the File Exchange for intersection-point functions).
If the baseline signal is not centered at y=0, you could center it by averaging the signal prior to the first peak, or by removing the peaks and then averaging what remains, and then shifting the signal vertically by that average.
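To make the proposal concrete, here is a minimal sketch that picks points #1 and #3 around A and refines them by linear interpolation. The signal is synthetic and already centered at y=0, and the names crossings, p1, p2, p3 and crossExact are made up for this example. For simplicity A is taken as the first sample above max(x)/1.5, which lies on the tall positive lobe but is not necessarily its exact peak.

```matlab
% Synthetic, already centered column vector (an assumption):
% slightly negative baseline followed by a damped sine.
t = (0:399)';
x = [-0.01*ones(50,1); exp(-t/150) .* sin(2*pi*(t + 0.5)/100)];

% Any index on the tall positive lobe is enough to locate the crossings.
A = find(x > max(x)/1.5, 1);

% Index i is flagged when x(i) and x(i-1) have different signs.
crossings = find([false; diff(sign(x)) ~= 0]);

p1 = crossings(find(crossings <= A, 1, 'last'));  % point #1: last crossing before A
after = crossings(crossings > A);
p2 = after(1);                                    % point #2: first crossing after A
p3 = after(2);                                    % point #3: second crossing after A

% Sub-sample crossing positions by linear interpolation between the
% samples on either side of each sign change.
crossExact = (crossings - 1) + x(crossings - 1) ./ (x(crossings - 1) - x(crossings));
p1Exact = crossExact(crossings == p1);
p3Exact = crossExact(crossings == p3);

tallWave = x(p1:p3);                              % one period of the tall wave
```

Samples that are exactly zero make sign(x) return 0 and can flag a crossing twice, so real data may need those samples handled separately.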
  2 Comments
Andrew Park on 25 Sep 2020
Edited: Andrew Park on 25 Sep 2020
Thank you for your elaborate answer! I agree with you on the definition of a tall peak.
I tried this method and it worked very well for most of the data. A few signals still looked like this, even after centering them with the average of all the points before the tall peak:
So instead of averaging all the points before the tall peak, I averaged only a few points right before it. Here the tall peak is at x = 101, and I used the points around x = 89 ~ 93.
That way, the signal crosses 0 not too far before the first peak, and that crossing can be assigned as point #1 of this signal.
Other than that, it seemed to work very well. Thank you for your time :)
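The local-baseline idea can be sketched as follows. The data are synthetic (a drifting baseline with the tall peak placed at sample 101), and the window of 12 to 8 samples before the peak simply mirrors the x = 89 ~ 93 range mentioned above. It is an assumption that would need tuning for other recordings.

```matlab
% Synthetic data (an assumption): baseline drifting from -0.1 to 0.1,
% then one 40-sample wave riding on a 0.1 offset, then a flat tail.
x = [linspace(-0.1, 0.1, 90)'; ...
     0.1 + sin(2*pi*((0:39)' + 0.25)/40); ...
     0.1*ones(60,1)];

[~, A] = max(x);                 % tall peak location (sample 101 here)

win = (A - 12):(A - 8);          % a few samples right before the peak
xc  = x - mean(x(win));          % center on the local baseline only

% Zero crossings of the locally centered signal.
zc = find([false; diff(sign(xc)) ~= 0]);
p1 = zc(find(zc <= A, 1, 'last'));   % point #1: last crossing before the peak
```

Unless all samples in the window are equal, their mean lies strictly between the window's smallest and largest value, so the centered signal changes sign somewhere inside the window. That keeps point #1 close to the peak.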


More Answers (0)
