Derivation of noisy signal through Savitzky-Golay

28 views (last 30 days)
Thomas Debaucheron on 14 Jul 2024 at 13:55
I want to get an approximation of a noisy signal's second derivative through the use of a Savitzky-Golay differentation matrix. The final goal is to (1) smooth out and (2) drastically downsample the initial signal without losing information. I was inspired for this idea by Jan's post on https://nl.mathworks.com/matlabcentral/answers/1454924-downsample-data-adapively-intelligently where they used the gradient function to determine where the represented data has the strongest curvature and hence where it needs the most points.
The result is pretty much what I hoped for except for the edges of my domain where absent data is assumed to be zero which creates strong gradients across the width of the filter. This issue is not as impactful on the smoothing as it is on the derivation. In fact, function sgolayfilt satisfyingly corrects the edge effect from the straight up use of sgolay as explained in this https://nl.mathworks.com/help/signal/ref/sgolayfilt.html thread.
Then my problem is: how can I similarly resolve this effect on the approximated second derivative? I tried looking into the theory of this but it confuses me a little.
My current solution to the limitations of this filter was to equate the estimated derivative on these data points close to the edges to the closest value I was able to estimate properly. Here is the code I came up with so far:
%sgFilter - Adaptative downsampling using a Savitzky-Golay filter
%
% This MATLAB function uses a Savitzky-Golay differentiation filter in
% order to successively smooth out a signal x(t) and reduce its sample
% size down to n datapoints. An estimate of the 2nd derivative through
% the lens of the filter allows to map out the curvature of the signal
% without concern for the noise and helps identify hotspots needed for
% the accurate representation of the signal after downsampling.
%
% y filtered signal
% y(idx) filtered and sampled signal
% . . . .
% dt uniform step size
% order polynomial regression order
% window filter band width
%
% [y, idx] = sgFilter(x, dt, order, window, n)
function [y, idx] = sgFilter(x, dt, order, window, n)
% -----------------------------------------------------------------------
% Savitzky-Golay differentiation matrices
[b, g] = sgolay(order, window);
m = (window - 1)/2;
% -----------------------------------------------------------------------
% p-th order derivative estimates, i.e. 0: smooth and 2: 2nd derivative
dx = zeros(length(x), 2);
i = 0;
for p = [0, 2]
i = i + 1;
dx(:,i) = conv(x, factorial(p)/(-dt)^p * g(:, p + 1), 'same');
end
% -----------------------------------------------------------------------
% Correction of the edge effect on smoothing
y = dx(:, 1); % Smoothed signal
y(1:m) = b(1:m, :)*x(1:window);
y(end - m + 1:end) = b(window - m + 1:window, :)*x(end - window + 1:end);
dy = dx(:, 2); % Smoothed signal's derivative
dy(1:m) = dy(m + 1);
dy(end - m + 1:end) = dy(end - m);
% -----------------------------------------------------------------------
% Adaptative downsampling, inspired from Jan at <https://nl.mathworks.com
% (17th May 2024).
if nargout > 1
sy = cumsum(abs(dy));
sy = sy + linspace(0, max(sy)/100, numel(sy)).'; % Monotonic increasing
idx = interp1(sy, 1:numel(sy), linspace(sy(1), sy(end), n), 'nearest');
end
And following is a test script I created with this function and the associated result:
% -------------------------------------------------------------------------
% One should note that the success of this method is very dependent on the
% choice of polynomial regression order and window band width. In the
% particular case of periodic signals, the window should be wide enough to
% encompass periods with corresponding polynomial behaviour---e.g. a full
% sin(x) rotation from x to x + 2*pi cannot be represented accurately by a
% polynome of lower degree than 3, nor will it always be advantageous to
% have a degree higher than 3 due to Runge's phenomenon. On this note, one
% should be conscious of the size of the window relative to the overall
% behaviour of the signal.
% -------------------------------------------------------------------------
dt = 0.05;
t = (0:dt:20-1)';
order = 3;
window = 51;
x = 5*sin(2*pi*0.2*t)+0.5*randn(size(t));
[y, idx] = sgFilter(x, dt, order, window, 60);
plot(t, x, '-k')
hold on
plot(t, y, '-r')
plot(t(idx), y(idx), 'ro')
hold off
legend('x','x (smoothed)','x (sampled)')
As you can see, the current band width of the filter is approximately 2.5 long on the horizontal axis and half of that window is impacted by the edge effect with an overrepresentation close to t = 0 and an underrepresentation on the other end.