Fit of multiple data set with variable parameters

Hello everyone,
I have a problem in fitting my experimental data. I have a theoretical function of this kind: where A, b, c, d, e are my fitting parameters and R is the gas constant and T is the temperature value. The peculiarity of this function is that A and b varies with temperature.
Now i want ot fit this function to several set of data taken at different temperatures.
The fit i want to make has to be this output: A and b for each temperature and c,d,e global.
I have a code that make the global fit, namely A,b,c,d,e unique fot each temperature and i want to improve it.
Do you have any suggestion?
Below i report the code i am using:
clear
R=8.314;
yfcn= @(b,x) b(1)*exp(-x(:,2).^2.*b(2)).*(1-2*exp(b(3)./(R.*x(:,1))-b(5)/R).*(1-sin(x(:,2).*b(4))./(x(:,2)*b(4))));
x=[0.5215 0.7756 1.2679 1.4701 1.6702 1.8680 2.0633 2.2693 2.4584 2.6442 2.8264 3.0046 3.0890 3.2611 3.4287 3.5917 3.7497 3.9309 4.0774 4.2183 4.3535 4.4827 4.5427 4.6628];
y1=[0.9936 0.9375 0.9081 0.8648 0.8568 0.8114 0.7711 0.8010 0.7884 0.7389 0.7901 0.7825 0.7903 0.7501 0.7070 0.7489 0.6441 0.7105 0.6735 0.6385 0.6357 0.6962 0.5946 0.6783];
y1_err= [ 0.0637 0.0526 0.0330 0.0235 0.0298 0.0223 0.0388 0.0223 0.0333 0.0326 0.0410 0.0282 0.0561 0.0235 0.0271 0.0218 0.0333 0.0252 0.0344 0.0261 0.0499 0.0396 0.0655 0.0901];
y2=[0.8748 0.8726 0.7922 0.7782 0.7396 0.6958 0.6603 0.6503 0.6556 0.6461 0.6021 0.5820 0.6220 0.5768 0.4950 0.5300 0.5234 0.5170 0.4369 0.4508 0.4409 0.4392 0.4100 0.6699];
y2_err=[ 0.0562 0.0480 0.0287 0.0211 0.0260 0.0194 0.0339 0.0188 0.0287 0.0289 0.0332 0.0225 0.0460 0.0191 0.0211 0.0169 0.0280 0.0198 0.0256 0.0204 0.0392 0.0283 0.0504 0.0856];
T1=220; %temperature reffered to y1
T2=300; %temperature reffered to y2
%T3=320; %temperature reffered to y3
T1v = T1*ones(size(x));
T2v = T2*ones(size(x));
%T3v = T3*ones(size(x));
xm = x(:)*ones(1,2);
ym = [y1(:) y2(:)];%, y3(:)];
Tm = [T1v(:) T2v(:)];% T3v(:) ];
yerr=[y1_err(:) y2_err(:)];% y3_err(:)];
xv = xm(:);
yv = ym(:);
Tv = Tm(:);
yerrv=yerr(:);
weights=1./yerrv;
xTm = [Tv xv];
B0 =[0.2941 0.5306 0.8324 0.5975 0.3353];%rand(5,1); % Use Appropriate Initial Estimates 0.6596
%B1 = fminsearch(@(b) norm(sqrt(weights).*(yv - yfcn(b,xTm))), B0); % Estimate Parameters
[B,R,J,CovB] = nlinfit(xTm,yv,yfcn,B0,'Weights',weights);
% X=B(1);
% Y=B(2);
% Z=B(3); 0.7640 0.8182 0.1002 0.1781 0.3596
% H=B(4); 0.0567 0.5219 0.3358 0.1757 0.2089
figure(1)
for k = 1:2
idx = (1:numel(x))+numel(x)*(k-1);
subplot(2,1,k)
errorbar(x.^2, ym(:,k),yerr(:,k), '.')
hold on
plot(x.^2, yfcn(B,xTm(idx,:)), '-r')
hold off
grid
ylabel('Substance [Units]')
title(sprintf('y_{%d}, T = %d', k,xTm(idx(1),1)))
ylim([min(yv) max(yv)])
end
xlabel('Q^2')
%sgtitle(sprintf('$y=e^{-(\\frac{3xK_BT}{m})^2} (1-2%.3f\\ %.3f (1-\\frac{sin(x\\ %.3f)}{x\\ %.3f}))$',B,B(3)), 'Interpreter','latex')

 Accepted Answer

(I was in the middle of posting an Answer here when my desktop crashed. Again. I really hate Windows!)
That aside, the problem needs a better set of initial parameter estimates, and the GlobalSearch funciton (in the Global Optimization Toolbox) provides it:
fitfcnw = @(b) norm(weights.*(yv - yfcn(b,xTm)));
problem = createOptimProblem('fmincon', 'x0',B0, 'objective',fitfcnw);
gs = GlobalSearch('PlotFcns',@gsplotbestf);
[B,fval] = run(gs,problem)
with the result:
GlobalSearch stopped because it analyzed all the trial points.
All 26 local solver runs converged with a positive local solver exit flag.
B =
8.862485547533210e-01
1.509391962458558e-02
-1.870848919343233e+04
1.088251234643560e+00
-4.504651478455145e+01
fval =
7.223670210423392e+00
(The code before and after this remains the same.)
that with:
mdl = fitnlm(xTm,yv,yfcn,B,'Weights',weights)
(that I added simply to get some statistics), produces:
Nonlinear regression model:
y ~ b1*exp( - x2^2*b2)*(1 - 2*exp(b3/(R*x1) - b5/R)*(1 - sin(x2*b4)/(x2*b4)))
Estimated Coefficients:
Estimate SE tStat pValue
________ _________ _______ __________
b1 0.90447 0.018777 48.169 4.7384e-39
b2 0.013486 0.0039251 3.436 0.0013206
b3 -11214 5454.7 -2.0558 0.045905
b4 1.1185 0.094728 11.807 4.4142e-15
b5 -20.751 17.026 -1.2188 0.22956
Number of observations: 48, Error degrees of freedom: 43
Root Mean Squared Error: 0.215
R-Squared: 0.932, Adjusted R-Squared 0.926
F-statistic vs. zero model: 3.28e+03, p-value = 2.38e-54
demonstrating an excellent fit, and this plot:
This appears to be an improvement over the previous result using only nonlinfit.
Plot the results of the fitnlm with:
[ynew,ynewci] = predict(mdl,xTm);
figure(2)
for k = 1:2
idx = (1:numel(x))+numel(x)*(k-1);
subplot(2,1,k)
errorbar(x.^2, ym(:,k),yerr(:,k), '.')
hold on
plot(x.^2, ynew(idx), '-r')
plot(x.^2, ynewci(idx,:), '--r')
hold off
grid
ylabel('Substance [Units]')
title(sprintf('y_{%d}, T = %d', k,xTm(idx(1),1)))
ylim([min(yv) max(yv)])
end
xlabel('Q^2')
if you so desire.

6 Comments

Hello,
maybe i exposed bad my problem. The parameters b1 and b2 has to be specific for each temperature while the other are in common. As you can see in the code, i have switched to fmincon but it does not fits well. In particular the last parameter, the one you have called b4, is negative and it has to be positive and of the order of the unity.
I am trying with a slight different routine but it does not appear to produce good results. I post my code below:
function yfit = subfun2(params,xTm)
x1=xTm(:,1);
x2=xTm(:,2);
dsid = xTm(:,3);
R=8.314;
A0 = params(1:2); % different A2 for each dataset
A1= params(3:4); % different A1 for each dataset
A3=params(5); %common for each data set
A4=params(6); %common for each data set
A5=params(7); %common for each data set
%b(1)*exp(-x(:,2).^2.*b(2)).*(1-2*exp(b(3)./(R.*x(:,1))-b(5)/R).*(1-sin(x(:,2).*b(4))./(x(:,2)*b(4))));
yfit=A0(dsid).*exp(-x2.^2.*A1(dsid)).*( 1-2*exp(A3./(R*x1)-A4/R ).*(1-sin(x2.*A5)./(x2.*A5)));
end
clear
x=[0.5215 0.7756 1.2679 1.4701 1.6702 1.8680 2.0633 2.2693 2.4584 2.6442 2.8264 3.0046 3.0890 3.2611 3.4287 3.5917 3.7497 3.9309 4.0774 4.2183 4.3535 4.4827 4.5427 4.6628]';
y1=[0.9936 0.9375 0.9081 0.8648 0.8568 0.8114 0.7711 0.8010 0.7884 0.7389 0.7901 0.7825 0.7903 0.7501 0.7070 0.7489 0.6441 0.7105 0.6735 0.6385 0.6357 0.6962 0.5946 0.6783]';
y1_err= [ 0.0637 0.0526 0.0330 0.0235 0.0298 0.0223 0.0388 0.0223 0.0333 0.0326 0.0410 0.0282 0.0561 0.0235 0.0271 0.0218 0.0333 0.0252 0.0344 0.0261 0.0499 0.0396 0.0655 0.0901]';
y2=[0.8748 0.8726 0.7922 0.7782 0.7396 0.6958 0.6603 0.6503 0.6556 0.6461 0.6021 0.5820 0.6220 0.5768 0.4950 0.5300 0.5234 0.5170 0.4369 0.4508 0.4409 0.4392 0.4100 0.6699]';
y2_err=[ 0.0562 0.0480 0.0287 0.0211 0.0260 0.0194 0.0339 0.0188 0.0287 0.0289 0.0332 0.0225 0.0460 0.0191 0.0211 0.0169 0.0280 0.0198 0.0256 0.0204 0.0392 0.0283 0.0504 0.0856]';
dsid=[ones(size(x)); 2.*ones(size(x)) ]; %data identifier
T1=220; %temperature reffered to y1
T2=300; %temperature reffered to y2
T1v = T1*ones(size(x));
T2v = T2*ones(size(x));
xm = x(:)*ones(1,2);
ym = [y1(:) y2(:)];
Tm = [T1v(:) T2v(:)];
yerr=[y1_err(:) y2_err(:)];
xv = xm(:);
yv = ym(:);
Tv = Tm(:);
yerrv=yerr(:);
weights=1./yerrv;
xTm = [Tv xv dsid];
R=8.314;
B0=rand(7,1);%0.7350 0.9706 0.8669 0.0862 0.3664 0.3692 0.6850
%[B,R1,J,CovB] = nlinfit(xTm,yv,@subfun2,B0,'Weights',weights);
f=@(B0) subfun2(B0,xTm);
z=@(B0) norm((yv-f(B0)).*weights);
lb=[0.5,0,0.5,0,-1e6,-1e6,0];
ub=[1.5,10,0.5,0,1e6,1e6,10];
B=fmincon(z,B0,[],[],[],[],lb,ub)
figure(1)
gscatter(xv,yv,dsid)
line(x,B(1)*exp(-x.^2*B(3)).*( 1-2*exp(B(5)./(R.*T1v)-B(6)./R ).*(1-sin(x*B(7))./(x*B(7)))),'color','r')
line(x,B(2)*exp(-x.^2*B(4)).*( 1-2*exp(B(5)./(R.*T2v)-B(6)./R ).*(1-sin(x*B(7))./(x*B(7)))),'color','g')
It took me the entire day to figure it out, however this seems to me to be correct.
The changes were to add:
yfcn1 = @(b,x) b(1).*exp(-x(:,2).^2.*b(2)).*(1-2*exp(b(3)./(R.*x(:,1))-b(5)/R).*(1-sin(x(:,2).*b(4))./(x(:,2)*b(4))));
yfcn2 = @(b,x) b(6).*exp(-x(:,2).^2.*b(7)).*(1-2*exp(b(3)./(R.*x(:,1))-b(5)/R).*(1-sin(x(:,2).*b(4))./(x(:,2)*b(4))));
with:
yfcn = @(b,x) yfcn1(b,x).*(x(:,1)==T1) + yfcn2(b,x).*(x(:,1)==T2);
so the entire code is now:
R=8.314;
% yfcn= @(b,x) b(1)*exp(-x(:,2).^2.*b(2)).*(1-2*exp(b(3)./(R.*x(:,1))-b(5)/R).*(1-sin(x(:,2).*b(4))./(x(:,2)*b(4))));
T1=220; %temperature reffered to y1
T2=300; %temperature reffered to y2
yfcn1 = @(b,x) b(1).*exp(-x(:,2).^2.*b(2)).*(1-2*exp(b(3)./(R.*x(:,1))-b(5)/R).*(1-sin(x(:,2).*b(4))./(x(:,2)*b(4))));
yfcn2 = @(b,x) b(6).*exp(-x(:,2).^2.*b(7)).*(1-2*exp(b(3)./(R.*x(:,1))-b(5)/R).*(1-sin(x(:,2).*b(4))./(x(:,2)*b(4))));
x=[0.5215 0.7756 1.2679 1.4701 1.6702 1.8680 2.0633 2.2693 2.4584 2.6442 2.8264 3.0046 3.0890 3.2611 3.4287 3.5917 3.7497 3.9309 4.0774 4.2183 4.3535 4.4827 4.5427 4.6628];
y1=[0.9936 0.9375 0.9081 0.8648 0.8568 0.8114 0.7711 0.8010 0.7884 0.7389 0.7901 0.7825 0.7903 0.7501 0.7070 0.7489 0.6441 0.7105 0.6735 0.6385 0.6357 0.6962 0.5946 0.6783];
y1_err= [ 0.0637 0.0526 0.0330 0.0235 0.0298 0.0223 0.0388 0.0223 0.0333 0.0326 0.0410 0.0282 0.0561 0.0235 0.0271 0.0218 0.0333 0.0252 0.0344 0.0261 0.0499 0.0396 0.0655 0.0901];
y2=[0.8748 0.8726 0.7922 0.7782 0.7396 0.6958 0.6603 0.6503 0.6556 0.6461 0.6021 0.5820 0.6220 0.5768 0.4950 0.5300 0.5234 0.5170 0.4369 0.4508 0.4409 0.4392 0.4100 0.6699];
y2_err=[ 0.0562 0.0480 0.0287 0.0211 0.0260 0.0194 0.0339 0.0188 0.0287 0.0289 0.0332 0.0225 0.0460 0.0191 0.0211 0.0169 0.0280 0.0198 0.0256 0.0204 0.0392 0.0283 0.0504 0.0856];
format long E
T1=220; %temperature reffered to y1
T2=300; %temperature reffered to y2
%T3=320; %temperature reffered to y3
T1v = T1*ones(size(x));
T2v = T2*ones(size(x));
%T3v = T3*ones(size(x));
xm = x(:)*ones(1,2);
ym = [y1(:) y2(:)];%, y3(:)];
Tm = [T1v(:) T2v(:)];% T3v(:) ];
yerr=[y1_err(:) y2_err(:)];% y3_err(:)];
xv = xm(:);
yv = ym(:);
Tv = Tm(:);
yerrv=yerr(:);
weights=1./yerrv;
xTm = [Tv xv];
B0 =[0.2941 0.5306 0.8324 0.5975 0.3353];%rand(5,1); % Use Appropriate Initial Estimates 0.6596
B0 = randn(7,1)*0.1;
%B1 = fminsearch(@(b) norm(sqrt(weights).*(yv - yfcn(b,xTm))), B0); % Estimate Parameters
% [B,R,J,CovB] = nlinfit(xTm,yv,yfcn,B0,'Weights',weights);
yfcn = @(b,x) yfcn1(b,x).*(x(:,1)==T1) + yfcn2(b,x).*(x(:,1)==T2);
fitfcnw = @(b) norm(weights.*(yv - yfcn(b,xTm)));
problem = createOptimProblem('fmincon', 'x0',B0, 'objective',fitfcnw);
gs = GlobalSearch('PlotFcns',@gsplotbestf);
[B,fval] = run(gs,problem)
B(:)
format short eng
% X=B(1);
% % Y=B(2);
% % Z=B(3); 0.7640 0.8182 0.1002 0.1781 0.3596
% % H=B(4); 0.0567 0.5219 0.3358 0.1757 0.2089
%
% [B,R,J,CovB] = nlinfit(xTm,yv,yfcn,B,'Weights',weights);
mdl = fitnlm(xTm,yv,yfcn,B,'Weights',weights)
figure
for k = 1:2
idx = (1:numel(x))+numel(x)*(k-1);
subplot(2,1,k)
errorbar(x.^2, ym(:,k),yerr(:,k), '.')
hold on
plot(x.^2, yfcn(B,xTm(idx,:)), '-r')
hold off
grid
ylabel('Substance [Units]')
title(sprintf('y_{%d}, T = %d', k,xTm(idx(1),1)))
ylim([min(yv) max(yv)+0.2])
if k == 1
text(5, max(yv)+0.1, sprintf('$y = %.3f\\times e^{-x^2\\times %.3f}(1-2e^{\\frac{%.3f}{%3d\\times R}-\\frac{%.3f}{R}}(1-\\frac{sin(%.3fx}{%.3fx}))$',B(1:3),T1,B(4:5),B(5)), 'Interpreter','latex', 'FontSize',12)
elseif k == 2
text(5, max(yv)+0.1, sprintf('$y = %.3f\\times e^{-x^2\\times %.3f}(1-2e^{\\frac{%.3f}{%3d\\times R}-\\frac{%.3f}{R}}(1-\\frac{sin(%.3fx}{%0.3fx}))$',B(6:7),B(3),T2,B(4:5),B(5)), 'Interpreter','latex', 'FontSize',12)
end
end
xlabel('Q^2')
%sgtitle(sprintf('$y=e^{-(\\frac{3xK_BT}{m})^2} (1-2%.3f\\ %.3f (1-\\frac{sin(x\\ %.3f)}{x\\ %.3f}))$',B,B(3)), 'Interpreter','latex')
pos = get(gcf, 'Position');
set(gcf, 'Position',pos+[-150 -150 +150 +150])
[ynew,ynewci] = predict(mdl,xTm);
figure
for k = 1:2
idx = (1:numel(x))+numel(x)*(k-1);
subplot(2,1,k)
errorbar(x.^2, ym(:,k),yerr(:,k), '.')
hold on
plot(x.^2, ynew(idx), '-r')
plot(x.^2, ynewci(idx,:), '--r')
hold off
grid
ylabel('Substance [Units]')
title(sprintf('y_{%d}, T = %d', k,xTm(idx(1),1)))
ylim([min(yv) max(yv)])
end
xlabel('Q^2')
producing these results:
GlobalSearch stopped because it analyzed all the trial points.
37 out of 46 local solver runs converged with a positive local solver exit flag.
B =
9.949241168887308e-01
1.665174089666267e-02
-1.114738004495943e+03
1.870279356786803e+00
1.913814733313489e+01
9.194994459445777e-01
3.258539376862615e-02
fval =
6.397984588108914e+00
and:
mdl =
Nonlinear regression model:
y ~ F(b,x)
Estimated Coefficients:
Estimate SE tStat pValue
________ _________ _______ __________
b1 0.98515 0.037806 26.058 3.9047e-27
b2 0.016164 0.0019881 8.1304 4.393e-10
b3 -1529.1 2927 -0.5224 0.60421
b4 1.75 0.18596 9.4109 8.4543e-12
b5 17.655 11.35 1.5554 0.12753
b6 0.90854 0.03694 24.595 3.6304e-26
b7 0.030408 0.0025114 12.108 4.0213e-15
Number of observations: 48, Error degrees of freedom: 41
Root Mean Squared Error: 0.206
R-Squared: 0.941, Adjusted R-Squared 0.932
F-statistic vs. zero model: 2.56e+03, p-value = 5.11e-52
and producing this plot:
That is likely as good as it is possible to estimate the parameters, considering everything.
@Star Strider thank you very much! You have helped a lot with this code in these last three days.
I am very grateful to you!
As always, my pleasure!
This was an extremely interesting problem!
@Star Strider Thanks for this answer. I had a similar problem, and your approach worked like a charm. Would you know a solution on how to set up something similar, if one has like -to stay with the current question by @Daniele Sonaglioni- a data set with 20 or more temperatures and again one would like to use fmincon with some boundaries for the fitting parameters and with some parameters being fitted for each temperaure and others optimised for all temperarues.
Thanks for any help or tipps,
best, Thorsten
My pleasure!
I am quite pleased that my code is useful in similar situations, and is apparently robust!
For 20 temperatures, my initial suggestion would be to set parts of it up in a loop, however the way I wrote this does not easily lend itself to that approach, since there are only two sets of data vectors and hard-coding it was appropriate.
A for loop might be appropriate for the initial set-up, for example creating ‘T1v’, ‘T2v’, ‘ym’, ‘Tm’ and ‘y_err’ (that could be made even easier if all of them were matrices), and the other loops could be expanded to accommodate the additional temperatures, instead of only two. Otherwise, the only option is to hard-code the initial set-up steps.
I hope it will be sufficiently robust to deal with 20 temperatures.
If you encounter problems with those changes, please post them as a new Question (with your code and all the necessary data, preferably as text or spreadsheet files) and provide a link to it here. I will follow up as best I can to help you get it to work.

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!