Polyfit isn't adjusting to decreasing sinusoidal data

Data is uploaded via Excel and plots properly, and the peaks are found properly. What am I doing wrong that makes polyfit produce such nonsense? I already tried inputting a dummy sinusoidal wave in the command window, and polyfit handles it easily. I've tried degrees up to 100 and nothing even comes close.
% Read the file
data = readtable("File Path");
% Extract time and gyroscope y columns
timeData = data.Time;
gyroYData = data.GyroscopeY;
% Filter data no longer than 16 seconds
idx = timeData <= 16;
timeData = timeData(idx);
gyroYData = gyroYData(idx);
% Plot the relevant data
figure;
plot(timeData, gyroYData, 'b-', 'LineWidth', 1.5);
xlabel('Time (s)');
ylabel('Gyroscope Y (rad/s)');
title('Gyroscope Y vs Time');
grid on;
% Find the peaks for the data
% Identify the peaks in the gyroscope data
[peaks, locs] = findpeaks(gyroYData, 'MinPeakHeight', 0.5, 'MinPeakDistance', 10);
peakTimes = timeData(locs);
% Plot the peaks alongside the data
hold on;
plot(peakTimes, peaks, 'ro', 'MarkerSize', 8, 'DisplayName', 'Detected Peaks');
% Find a polynomial approximation and evaluate it
% Fit a polynomial to the gyroscope data
[p, S] = polyfit(timeData, gyroYData, 50);
fittedValues = polyval(p, timeData);
% Plot the polynomial alongside the original data and the peaks
plot(timeData, fittedValues, 'g--', 'LineWidth', 1.2, 'DisplayName', 'Polynomial Fit');
legend('Gyroscope Y Data', 'Detected Peaks', 'Polynomial Fit');
hold off;
This is the output I get:

Accepted Answer

John D'Errico
John D'Errico on 16 Sep 2025
Edited: John D'Errico on 16 Sep 2025
Do you appreciate just how NON-polynomial-like this data is? Your data appears to be a variation of a decaying sinusoid.
While, in theory, you can use a polynomial to fit a huge variety of curves, in practice, it does not work that way, and certainly not using double precision arithmetic. Polynomial models have their limits, and this data is far beyond that limit. I would not expect a polynomial model to offer anything at all of value here.
What should you use to fit that data? The obvious (to me) is I would start with a decaying sinusoidal model. So a sinusoidal function, with an amplitude that varies with time, a phase angle, and a period, where all of those things are estimated by an optimization tool.
If I found that to work ok, but that maybe the period itself seems to vary a little with time, I might put a term in there to handle that.
Such a model would be reasonably parsimonious, needing only 4 or 5 parameters, if you choose the model wisely after looking at your data, and you supply decent starting values for those parameters. That should not be difficult to do.
If you were still unhappy with that, you could employ a spline interpolant. The problem with a spline is it gives you nothing you can use to try to understand what is happening. All it gives you is a pretty curve.
Other things you might use are series approximations, but ones where the basic series is more likely to fit your data. For example, a Fourier series might offer some information here, or even a Bessel series might be of interest. That is, pick some family that looks like what you have.
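The decaying sinusoidal model described above can be sketched in base MATLAB as follows. This is only an illustration under stated assumptions: the data are synthetic, the model form A*exp(-alpha*t).*sin(omega*t + phi) and the starting values are made up for the example, and fminsearch is just one of several optimizers that could be used.

```matlab
% Synthetic stand-in for the gyroscope signal (all numbers are hypothetical)
t = linspace(0, 16, 800)';
y = 1.2*exp(-0.15*t).*sin(2.5*t + 0.4);

% Model with parameter vector q = [A, alpha, omega, phi]
model = @(q, t) q(1)*exp(-q(2)*t).*sin(q(3)*t + q(4));
sse   = @(q) sum((y - model(q, t)).^2);   % sum of squared errors to minimize

% Rough starting values, as one might read them off a plot of the data
q0   = [1, 0.1, 2.45, 0.2];
opts = optimset('MaxFunEvals', 5000, 'MaxIter', 5000);
qhat = fminsearch(sse, q0, opts);

% Compare data and fitted model
plot(t, y, 'b-', t, model(qhat, t), 'r--');
legend('Data', 'Fitted decaying sinusoid');
```

The starting value for omega matters most: if it is too far from the true frequency, the optimizer can settle into a poor local minimum, which is why decent starting values are stressed above.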
  4 Comments
Ander
Ander on 17 Sep 2025
Edited: Ander on 17 Sep 2025
I promise I do realize this is very much a decaying sinusoidal function, as stated in the title. I simply asked Copilot what would be the best tool to model this data, explicitly mentioning it's a decaying sinusoid, and polynomial fit is what it recommended. I am not entirely familiar with the rest of the modelling tools, so that's why I chose to trust it.
And yes, there is variation between periods, which is exactly what I want to analyse after showing the theoretical model of the data.
John D'Errico
John D'Errico on 17 Sep 2025
Edited: John D'Errico on 18 Sep 2025
The number of times I've gotten a good, accurate answer about mathematics (that I could trust) from an AI tool? Very, very rare, far more rare than I would have believed given the hype. I always have needed to verify everything it told me. Honestly, that is a good thing. And sometimes these tools can push you in the right direction, as long as they don't push you in totally the wrong direction.
The number of times I got an answer that LOOKED reasonable at a glance? Very often. Again though, when you look more deeply, too often there is a flaw in what I have been given. But it looks good! And since a computer told me, it must be accurate, right? And sometimes, when that AI tool gets stuck, it makes something up. Not only have I seen that happen personally, but I can cite the example of a LAW AI that made up legal citations out of thin air. Sadly, those were then used by an unfortunate lawyer, without checking the citations. Again, NEVER trust what you see from an AI tool without doing your own research.
And part of the problem is a subtle change in how I ask a question can often yield completely different answers. It is terribly frustrating. But you need to take what it says here with a large grain of salt. A polynomial model is completely worthless for that problem. Absolute, complete BS.
If your goal is to investigate the changing period, I would look for zero crossings. Use interpolation, possibly involving a spline! In my case, I've written tools that will return zero crossings for a spline model. You can find such a tool in my SLM toolbox on the FEX. That would be SLMSOLVE, which will work on a PP-form spline, as a tool like spline returns.
For example...
x = linspace(0,100,1000);
y = exp(-x/50).*sin(x.^1.3);
plot(x,y)
So a decaying sinusoid. But the period varies.
spl = spline(x,y);
Per = slmsolve(spl,0)
Per =
Columns 1 through 10
0 2.4123 4.1114 5.6162 7.0072 8.3194 9.572 10.777 11.943 13.075
Columns 11 through 20
14.179 15.258 16.314 17.35 18.368 19.369 20.355 21.327 22.285 23.232
Columns 21 through 30
24.167 25.091 26.005 26.91 27.805 28.692 29.571 30.442 31.306 32.162
Columns 31 through 40
33.012 33.855 34.692 35.523 36.348 37.168 37.982 38.791 39.595 40.394
Columns 41 through 50
41.189 41.978 42.764 43.545 44.322 45.095 45.864 46.629 47.39 48.148
Columns 51 through 60
48.902 49.652 50.399 51.143 51.884 52.622 53.356 54.087 54.816 55.541
Columns 61 through 70
56.264 56.984 57.701 58.416 59.128 59.837 60.544 61.249 61.951 62.65
Columns 71 through 80
63.347 64.042 64.735 65.426 66.114 66.8 67.484 68.166 68.846 69.524
Columns 81 through 90
70.2 70.874 71.546 72.217 72.885 73.551 74.216 74.879 75.54 76.2
Columns 91 through 100
76.858 77.514 78.168 78.821 79.472 80.121 80.769 81.416 82.061 82.704
Columns 101 through 110
83.346 83.986 84.625 85.263 85.899 86.533 87.167 87.799 88.429 89.058
Columns 111 through 120
89.686 90.313 90.938 91.562 92.184 92.806 93.426 94.045 94.663 95.279
Columns 121 through 127
95.894 96.509 97.121 97.733 98.344 98.953 99.562
As you can see, a clearly time varying period. You could as easily have used a tool like fzero to identify those zero crossings. Well, slightly more work, but not by a lot.
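A minimal sketch of the fzero route mentioned above, using only base MATLAB. It assumes the same test signal as in the example; the names f, k and zc are introduced here for illustration. Each zero is first bracketed by a sign change between neighbouring samples and then refined on the spline.

```matlab
x = linspace(0, 100, 1000);
y = exp(-x/50).*sin(x.^1.3);
spl = spline(x, y);              % PP-form cubic spline interpolant
f = @(t) ppval(spl, t);          % evaluate the spline at arbitrary t

% Indices i where the data change sign between x(i) and x(i+1).
% The exact zero at x = 0 is not a sign change, so it is not reported here.
k = find(y(1:end-1).*y(2:end) < 0);

% Refine each bracketed zero with fzero
zc = arrayfun(@(i) fzero(f, [x(i), x(i+1)]), k);

halfPeriods = diff(zc);          % spacing between successive zero crossings
```

This relies on the samples being dense enough that no two zeros fall between the same pair of neighbouring samples.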
Perdiff = diff(Per)
Perdiff =
Columns 1 through 10
2.4123 1.6991 1.5048 1.3911 1.3122 1.2526 1.2051 1.1658 1.1326 1.1038
Columns 11 through 20
1.0786 1.0562 1.036 1.0178 1.0011 0.98585 0.97172 0.95861 0.94639 0.93496
Columns 21 through 30
0.92423 0.91412 0.90458 0.89555 0.88698 0.87882 0.87106 0.86364 0.85655 0.84976
Columns 31 through 40
0.84325 0.83699 0.83098 0.82519 0.8196 0.81421 0.80901 0.80398 0.79911 0.7944
Columns 41 through 50
0.78983 0.78539 0.78109 0.77691 0.77285 0.76889 0.76504 0.7613 0.75764 0.75408
Columns 51 through 60
0.75061 0.74722 0.74391 0.74068 0.73752 0.73444 0.73141 0.72845 0.72556 0.72274
Columns 61 through 70
0.71996 0.71724 0.71457 0.71196 0.7094 0.70688 0.70442 0.70199 0.69962 0.69728
Columns 71 through 80
0.69498 0.69273 0.69051 0.68834 0.68619 0.68408 0.682 0.67996 0.67796 0.67599
Columns 81 through 90
0.67403 0.67211 0.67023 0.66837 0.66653 0.66473 0.66294 0.66118 0.65946 0.65774
Columns 91 through 100
0.65607 0.6544 0.65277 0.65114 0.64955 0.64797 0.64642 0.64487 0.64337 0.64186
Columns 101 through 110
0.64039 0.63893 0.63747 0.63606 0.63464 0.63324 0.63188 0.63051 0.62916 0.62784
Columns 111 through 120
0.62653 0.62521 0.62393 0.62266 0.62141 0.62015 0.61892 0.6177 0.61649 0.6153
Columns 121 through 126
0.61412 0.61295 0.61179 0.61064 0.6095 0.60838
The result here will be approximately half the local period. Even better would be to use a moving sum, with a window of length 2 to compute that local period. Conv will suffice.
conv(Perdiff,[1 1],'valid')
ans =
Columns 1 through 10
4.1114 3.2039 2.8959 2.7032 2.5647 2.4576 2.3709 2.2984 2.2364 2.1825
Columns 11 through 20
2.1348 2.0922 2.0538 2.0189 1.987 1.9576 1.9303 1.905 1.8814 1.8592
Columns 21 through 30
1.8384 1.8187 1.8001 1.7825 1.7658 1.7499 1.7347 1.7202 1.7063 1.693
Columns 31 through 40
1.6802 1.668 1.6562 1.6448 1.6338 1.6232 1.613 1.6031 1.5935 1.5842
Columns 41 through 50
1.5752 1.5665 1.558 1.5498 1.5417 1.5339 1.5263 1.5189 1.5117 1.5047
Columns 51 through 60
1.4978 1.4911 1.4846 1.4782 1.472 1.4659 1.4599 1.454 1.4483 1.4427
Columns 61 through 70
1.4372 1.4318 1.4265 1.4214 1.4163 1.4113 1.4064 1.4016 1.3969 1.3923
Columns 71 through 80
1.3877 1.3832 1.3789 1.3745 1.3703 1.3661 1.362 1.3579 1.3539 1.35
Columns 81 through 90
1.3461 1.3423 1.3386 1.3349 1.3313 1.3277 1.3241 1.3206 1.3172 1.3138
Columns 91 through 100
1.3105 1.3072 1.3039 1.3007 1.2975 1.2944 1.2913 1.2882 1.2852 1.2822
Columns 101 through 110
1.2793 1.2764 1.2735 1.2707 1.2679 1.2651 1.2624 1.2597 1.257 1.2544
Columns 111 through 120
1.2517 1.2491 1.2466 1.2441 1.2416 1.2391 1.2366 1.2342 1.2318 1.2294
Columns 121 through 125
1.2271 1.2247 1.2224 1.2201 1.2179
What you should see is I never needed to do any kind of a polynomial model. A spline interpolant was perfect here. I had no need to worry about the lack of fit of some arbitrary model form, like a polynomial.


More Answers (2)

Matt J
Matt J on 16 Sep 2025
Edited: Matt J on 17 Sep 2025
Polynomial fitting above degree 10 or so is a famously ill-conditioned numerical task.
It is not clear why you would be modeling as a polynomial when you already know it is a sinusoid.
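To get a feel for the ill-conditioning, one can look at the condition number of the Vandermonde matrix behind a least-squares polynomial fit. The sketch below assumes a uniform time grid on 0 to 16 s (a made-up stand-in for the real sampling) and a few illustrative degrees; it needs R2016b or later for implicit expansion.

```matlab
t = linspace(0, 16, 200)';       % hypothetical time grid, 0..16 s
degrees = [5 10 20];
c = zeros(size(degrees));
for i = 1:numel(degrees)
    d = degrees(i);
    V = t.^(d:-1:0);             % Vandermonde matrix, one column per power of t
    c(i) = cond(V);              % 2-norm condition number
end
disp([degrees; c]);              % first row: degree, second row: condition number
```

Once the condition number approaches 1/eps (roughly 4.5e15 in double precision), the computed coefficients can no longer be trusted.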
  4 Comments
Sam Chak
Sam Chak on 17 Sep 2025
You mentioned "modeling". However, it is necessary to distinguish between modeling the gyroscope system and fitting the gyroscope response. Fitting the curve involves finding the mathematical function that describes the input-output relationship. If you intend to use the fitted function to predict other time responses, that is not modeling. Modeling the system is about determining the governing differential equation that produces the time response of the system from arbitrary initial conditions, such as initial position and initial velocity.
Do not rely 100% on Copilot. While Copilot may interpret your intention as fitting the data, your objective is probably to identify the governing model of the mechanical gyroscope, which may be a 2nd-order differential equation.
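As a sketch of how a fitted response might connect to such a model: if one assumes a linear, underdamped 2nd-order system x'' + 2*zeta*wn*x' + wn^2*x = 0, its free response has the form A*exp(-alpha*t).*sin(omega*t + phi) with alpha = zeta*wn and omega = wn*sqrt(1 - zeta^2). The values of alpha and omega below are hypothetical, and whether the real gyroscope obeys this equation is itself an assumption.

```matlab
alpha = 0.15;                    % hypothetical fitted decay rate (1/s)
omega = 2.5;                     % hypothetical fitted damped frequency (rad/s)

wn   = hypot(alpha, omega);      % natural frequency, sqrt(alpha^2 + omega^2)
zeta = alpha / wn;               % damping ratio

% Simulate the ODE from x(0) = 0, x'(0) = 1 and compare with the closed form
odefun = @(t, s) [s(2); -2*zeta*wn*s(2) - wn^2*s(1)];
[tt, ss] = ode45(odefun, [0 16], [0; 1]);
xExact = exp(-alpha*tt).*sin(omega*tt)/omega;

plot(tt, ss(:,1), 'b-', tt, xExact, 'r--');
legend('ode45 solution', 'Closed-form decaying sinusoid');
```

With the differential equation in hand, the response to other initial conditions can be simulated, which a fitted curve alone does not provide.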
Matt J
Matt J on 18 Sep 2025
Edited: Matt J on 18 Sep 2025
@Ander if you have the Curve Fitting Toolbox, one approach would be:
STEP 1: Fit an envelope function to the (peakTimes, peaks) data extracted in your code. For example, an exponential function, if that's what the decay is supposed to be, would be:
x = peakTimes(:); % peak times in seconds, as a column vector
y = peaks(:); % ensure column vector
% Fit the exponential model y = a*exp(b*x)
fenv = fit(x, y, 'exp1');
STEP 2: Use step 1 to undo the decay of gyroYData, converting it to a pure sinusoid. Then fit that:
x = timeData(:);
y = gyroYData(:)./fenv(x); % Flatten the decay by dividing out the envelope
% Fit the sinusoidal model y = a1*sin(b1*x + c1)
fsin = fit(x, y, 'sin1');
STEP 3: Refine the above parameter estimates by fitting a single combined custom model. Use the parameters from fenv and fsin to get a good start point for the iterative solution process:
ft = fittype('A*exp(-alpha*x).*sin(omega*x + phi)', ...
'independent', 'x', ...
'coefficients', {'A','alpha','omega','phi'});
% The 'sin1' library model names its coefficients a1, b1, c1
startPoints = [fenv.a*fsin.a1, -fenv.b, fsin.b1, fsin.c1];
% Redo fit with a comprehensive model
fTotal = fit(timeData(:), gyroYData(:), ft, 'StartPoint', startPoints);
plot(fTotal, timeData(:), gyroYData(:))



Walter Roberson
Walter Roberson on 16 Sep 2025
[p, S] = polyfit(timeData, gyroYData, 50);
Your time data ranges from 0 to 16.
With a degree-50 polynomial, and without using centering and scaling, the x^50 term would range from 0^50 to 16^50, which is a range from 0 to 1.607e60.
Values in the range 1e60 completely wash out values in the range 0 to 8^50. You are using a garbage fit. The maximum degree that can be used with your data without washing away coefficients is degree 13.
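For reference, the centering and scaling mentioned above is available through the three-output form of polyfit. The sketch below uses synthetic data as a stand-in for the real signal. One plausible reading of the degree-13 figure is that 16^13 = 2^52, about 4.5e15, which is right at the precision limit of a double; that interpretation is an assumption, not something stated in the answer.

```matlab
t = linspace(0, 16, 500)';              % hypothetical time vector, 0..16 s
y = exp(-0.15*t).*sin(2.5*t);           % synthetic decaying sinusoid

n = 13;
[p, S, mu] = polyfit(t, y, n);          % mu = [mean(t); std(t)]
yfit = polyval(p, t, [], mu);           % evaluate using the same centering/scaling

% The polynomial is fitted in the variable (t - mu(1))/mu(2), whose powers
% stay far smaller than the raw powers of t
tScaled = (t - mu(1))/mu(2);
fprintf('max |t|^%d = %.3g, max |tScaled|^%d = %.3g\n', ...
    n, max(abs(t))^n, n, max(abs(tScaled))^n);
```

Centering and scaling improves the numerical conditioning, but it does not turn a polynomial into a sensible model for a decaying oscillation.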
  1 Comment
Walter Roberson
Walter Roberson on 17 Sep 2025
On the right hand side of the graph, you have a section that is clearly going to zero.
Polynomial fits of data will go to +infinity or -infinity, because the leading coefficient a*x^n will eventually overwhelm everything else as x goes to infinity.
Therefore the only fit that makes sense for this graph is the everywhere-zero polynomial.
