- the x data needed to be shifted so that the y minimum is relocated at x = 0 , otherwise the model provided cannot give any good fit. Simply look at the formula : the first (and dominant) harmonic is (1/2)*(ks2*(1+cos(x)) , so to have y = ks1 + (1/2)*(ks2*(1+cos(2*pi*f*x)) minimal (upper harmonics neglected here) , it requires ks2 to be negative and x be 0.
- the "frequency" in the cos(x) terms is incorrect as the data show a periodicity of 180 (degrees ?) so we have to modify the model and introduce that angular period : therefore cos(x) is modified into cos(2*pi*f*x) and idem for the upper harmonics
- at the end we can get a reasonnable good fit as shown below
- you can probably use these info's for your own code

# Improving fit of data to sum of cosines model

10 views (last 30 days)

Show older comments

I am trying to use fitnlm to fit a potential energy versus dihedral rotation angle dataset. The fit is rather poor. My guess is that this is due to the solver thinking that it is close to a solution due to the very small SSE when really it is quite far from a solution because the peaks in the data are so miniscule. I have tried specifying TolFun and TolX via statset as options for fitnlm, but the fit remains poor as shown below.

>> opts = statset('Display','iter','TolFun',1e-20,'TolX',1e-20);

>> mdl = fitnlm(dihedral,energy2,expr,ks,'options',opts)

Norm of Norm of

Iteration SSE Gradient Step

-----------------------------------------------------------

0 2.37101e+07

1 18739 539.542 737.749

2 3.1713 5.35842 132.581

3 0.000750361 0.00704237 1.77563

4 0.000744824 8.02138e-07 0.00234913

5 0.000744824 3.01561e-06 7.92359e-07

6 0.000744824 5.24565e-07 1.49189e-07

7 0.000744824 1.03256e-07 5.7797e-07

8 0.000744824 1.20236e-06 5.45475e-09

9 0.000744824 1.57206e-06 9.88352e-12

10 0.000744824 1.11136e-06 1.39401e-14

Iterations terminated: relative change in SSE less than OPTIONS.TolFun

mdl =

Nonlinear regression model:

y ~ ks1 + (1/2)*(ks2*(1 + cos(dihedral)) + (1/2)*ks3*(1 - cos(2*dihedral)) + (1/2)*ks4*(1 + cos(3*dihedral)) + (1/2)*ks5*(1 - cos(4*dihedral)))

Estimated Coefficients:

Estimate SE tStat pValue

___________ _________ ___________ ___________

ks1 -800.51 0.0025226 -3.1733e+05 1.5132e-153

ks2 0.0001656 0.0023234 0.071275 0.94362

ks3 0.00074661 0.004499 0.16595 0.86924

ks4 -0.00094231 0.0045123 -0.20883 0.8359

ks5 -0.00069346 0.0045822 -0.15134 0.88066

Number of observations: 37, Error degrees of freedom: 32

Root Mean Squared Error: 0.00482

R-Squared: 0.00333, Adjusted R-Squared -0.121

F-statistic vs. constant model: 0.0268, p-value = 0.999

Any help in how I can refine this fit would be appreciated.

The attached workspace contains the dihedral vs energy data (dihedral = X, energy2 = Y), the expression to be fit to (expr), and the starting guess vector needed for fitnlm (i.e. beta0 = ks). An example of a failed model (mdl) and options for fitnlm (opts) is also included.

Update

I have an answer below, but I would still like to learn what I'm doing wrong with fitnlm -- if anybody has insights, would be appreciated.

##### 0 Comments

### Accepted Answer

Mathieu NOE
on 25 Aug 2022

hello

some findings that may interest you ...

as I don't have the Curve Fitting Toolbox , this work has been done with the poor's man solution (with fminsearch)

so my findings :

load('Workspace_dihedral_vs_energy_fitting.mat');

x = dihedral;

y = energy2;

%% important !!

x = x - 42; % needed to shift x data to center the y minimum at x = 0 and make the fitting work !

%% curve fit using fminsearch

f_est = 1/180; % angular frequency

[y_fit,sol] = myfit(x,y,f_est); % NB : sol contains best estimates of ks1,ks2,ks3,ks4,ks5,f

Rsquared = my_Rsquared_coeff(y,y_fit); % correlation coefficient

figure(1)

plot(x,y, 'b .-',x, y_fit, 'r-', 'MarkerSize', 15)

title(['Model Fit : R² = ' num2str(Rsquared)], 'FontSize', 15)

xlabel('dihedral', 'FontSize', 14)

ylabel('energy²', 'FontSize', 14)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

function [y_fit,sol] = myfit(x,y,f_est)

% Nonlinear regression model:

% y = ks1 + (1/2)*(ks2*(1 + cos(dihedral)) + (1/2)*ks3*(1 - cos(2*dihedral)) + (1/2)*ks4*(1 + cos(3*dihedral)) + (1/2)*ks5*(1 - cos(4*dihedral)))

func = @(ks1,ks2,ks3,ks4,ks5,f,x) ks1 + (1/2)*(ks2*(1+cos(2*pi*f*x)) + (1/2)*ks3*(1-cos(2*pi*f*2*x)) + (1/2)*ks4*(1+cos(2*pi*f*3*x)) + (1/2)*ks5*(1-cos(2*pi*f*4*x)));

obj_fun = @(params) norm(func(params(1), params(2), params(3), params(4), params(5), params(6),x)-y);

sol = fminsearch(obj_fun, [mean(y),0,0,0,0,f_est]); % to be updatd

a_sol = sol(1);

b_sol = sol(2);

c_sol = sol(3);

d_sol = sol(4);

e_sol = sol(5);

f_sol = sol(6);

y_fit = func(a_sol, b_sol, c_sol, d_sol, e_sol, f_sol,x);

end

function Rsquared = my_Rsquared_coeff(data,data_fit)

% R2 correlation coefficient computation

% The total sum of squares

sum_of_squares = sum((data-mean(data)).^2);

% The sum of squares of residuals, also called the residual sum of squares:

sum_of_squares_of_residuals = sum((data-data_fit).^2);

% definition of the coefficient of correlation is

Rsquared = 1 - sum_of_squares_of_residuals/sum_of_squares;

end

### More Answers (1)

Alan Stevens
on 25 Aug 2022

Try changing the dihedral angles to radians (or change cos to cosd in expr).

##### 3 Comments

### See Also

### Categories

### Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!