Is it possible to use Simulated Annealing or Genetic Algorithm for parameter estimation of a parametric System of ODEs?
Hi, I would like to know whether it is possible to apply a Genetic Algorithm (GA) or Simulated Annealing (SA) optimization method to a parametric system of ODEs in order to estimate the best set of parameters for fitting the model to an experimental data set.
I've written code to do that using nlinfit and lsqcurvefit. With the first one I obtained a good fit, but some parameters came out negative, so I switched to lsqcurvefit (with the trust-region-reflective algorithm) and lower bounds of 0. However, the system has a lot of local minima and is very sensitive to the initial values and to the upper bounds, so it always gives different results.
The ODE system consists of 5 equations and 6 parameters, and for lsqcurvefit I've used this line:
[parameters] = lsqcurvefit(@odesystem,parr0,time,experimentaldata,zeros(size(parr0)),[ ])
parr0 is the vector of the parameters' initial values; I've tried changing it several times, always obtaining different solutions.
For these reasons I've tried to replace lsqcurvefit with GA or SA, but it doesn't work because they don't accept the @odesystem function used in the lsqcurvefit call.
In the lsqcurvefit script, @odesystem is a function that uses ode23s to solve the parametric system.
Maybe I have to rewrite the code so that the function is in a form GA or SA can use; do you have an idea how to solve this problem?
Or do you think that GA or SA is not the right way to obtain the best set of parameters, given the system's complexity?
Thank you in advance.
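For reference, a minimal sketch of how the same problem could be handed to simulannealbnd from the Global Optimization Toolbox (this wrapper is an assumption, not code from the thread; odesystem, parr0, time, and experimentaldata are the names used above):
% Sketch (assumption): ga and simulannealbnd minimize a single scalar, so wrap
% the ODE solution in a sum-of-squared-residuals objective.  'odesystem' is the
% same function handle used with lsqcurvefit above (it calls ode23s internally).
sse = @(p) sum(sum((experimentaldata - odesystem(p, time)).^2));
lb  = zeros(size(parr0));                  % keep all parameters non-negative
ub  = Inf(size(parr0));
[p_sa, sse_sa] = simulannealbnd(sse, parr0, lb, ub);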
Accepted Answer
Star Strider
on 27 Feb 2019
It is definitely possible to use the ga function to estimate the parameters of a system of differential equations. I experimented with this on my own.
An example using ga (based on a previous compartmental kinetics modeling problem that used lsqcurvefit):
% NOTES:
%
% 1. The ‘theta’ (parameter) argument has to be first in your
%    ‘kinetics’ function,
% 2. You need to return ALL the values from ‘DifEq’, since you are fitting
%    all the values.
function C=kinetics(theta,t)
c0=[1;0;0;0];                    % initial concentrations of the four species
[T,Cv]=ode45(@DifEq,t,c0);       % integrate the model at the measurement times
%
function dC=DifEq(t,c)
dcdt=zeros(4,1);
dcdt(1)=-theta(1).*c(1)-theta(2).*c(1);
dcdt(2)= theta(1).*c(1)+theta(4).*c(3)-theta(3).*c(2)-theta(5).*c(2);
dcdt(3)= theta(2).*c(1)+theta(3).*c(2)-theta(4).*c(3)+theta(6).*c(4);
dcdt(4)= theta(5).*c(2)-theta(6).*c(4);
dC=dcdt;
end
C=Cv;
end
t=[0.1
0.2
0.4
0.6
0.8
1
1.5
2
3
4
5
6];
c=[0.902 0.06997 0.02463 0.00218
0.8072 0.1353 0.0482 0.008192
0.6757 0.2123 0.0864 0.0289
0.5569 0.2789 0.1063 0.06233
0.4297 0.3292 0.1476 0.09756
0.3774 0.3457 0.1485 0.1255
0.2149 0.3486 0.1821 0.2526
0.141 0.3254 0.194 0.3401
0.04921 0.2445 0.1742 0.5277
0.0178 0.1728 0.1732 0.6323
0.006431 0.1091 0.1137 0.7702
0.002595 0.08301 0.08224 0.835];
theta0=[1;1;1;1;1;1];
% [theta,Rsdnrm,Rsd,ExFlg,OptmInfo,Lmda,Jmat]=lsqcurvefit(@kinetics,theta0,t,c);
ftns = @(theta) norm(c-kinetics(theta,t));
PopSz = 500;
Parms = 6;
opts = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-3, 'MaxGenerations',2E3, 'PlotFcn','gaplotbestf');
t0 = clock;
fprintf('\nStart Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t0)
% [B,fval,exitflag,output,population,scores] = ga(ftns, 53, [],[],[],[],zeros(53,1),Inf(53,1),[],[],opts)
[theta,fval,exitflag,output] = ga(ftns, Parms, [],[],[],[],zeros(Parms,1),Inf(Parms,1),[],[],opts)
t1 = clock;
fprintf('\nStop Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t1)
GA_Time = etime(t1,t0)
fprintf('\nElapsed Time: %23.15E\t\t%02d:%02d:%02d.%03d\n', GA_Time, hmsdv(GA_Time))
fprintf(1,'\tRate Constants:\n')
for k1 = 1:length(theta)
fprintf(1, '\t\tTheta(%d) = %8.5f\n', k1, theta(k1))
end
tv = linspace(min(t), max(t));
Cfit = kinetics(theta, tv);
figure(1)
hd = plot(t, c, 'p');
for k1 = 1:size(c,2)
CV(k1,:) = hd(k1).Color;
hd(k1).MarkerFaceColor = CV(k1,:);
end
hold on
hlp = plot(tv, Cfit);
for k1 = 1:size(c,2)
hlp(k1).Color = CV(k1,:);
end
hold off
grid
xlabel('Time')
ylabel('Concentration')
legend(hlp, 'C_1(t)', 'C_2(t)', 'C_3(t)', 'C_4(t)', 'Location','N')
Note the coding of the ‘ftns’ fitness function, and its relation to the objective function I originally wrote to use with lsqcurvefit. I wrote this as my own experiment, so it is not as thoroughly comment-documented as the code I usually post here. I will provide as much of an explanation as I can for anything that is not clear.
Genetic algorithms take a while to converge; however, in my experience they are generally successful if you give them the opportunity.
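To make that relation explicit, here is a short sketch (my own addition, assuming the kinetics, c, t, and theta definitions above): lsqcurvefit minimizes the sum of squared entries of the residual matrix c - kinetics(theta,t), while ga needs a single scalar, which is why ftns collapses the residuals with norm. An explicit sum-of-squares fitness, and an optional local refinement of the ga result, could look like this:
% Sketch (assumption): an explicit sum-of-squares fitness, the quantity that
% lsqcurvefit itself minimizes; 'ftns' above uses a matrix norm instead, but
% both are reasonable scalar measures of misfit for ga.
ssefcn = @(theta) sum(sum((c - kinetics(theta,t)).^2));
% Optional (assumption): polish the ga estimate with a local least-squares fit,
% reusing the non-negativity bounds.
[theta_ref, Rsdnrm] = lsqcurvefit(@kinetics, theta, t, c, ...
    zeros(size(theta)), Inf(size(theta)));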
2 Comments
Star Strider
on 28 Feb 2019
As always, my pleasure.
Note that in my code, I constrained the parameters to be greater than zero, since the kinetic parameters are all positive by definition.
I doubt that the replicates would cause problems, provided your differential equations and data account for them (the same replicates for the independent and dependent variables, for example). The only effect they would have is to give the replicated points more ‘weight’, in the sense that they would have more influence on the fitness (and therefore on the estimated parameters) than if they were not replicated. If all the data were replicated, it would simply slow down the ga function.
If you are having problems getting ga to fit your data, I would use the original ‘opts’ structure in my code, where ‘Parms’ is the number of parameters you want to fit. I scaled the initial population as:
'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-3
so scale it to provide appropriate initial parameter magnitudes. (Here they are all on the order of 1; however, you can scale the magnitudes individually by multiplying the matrix by a vector of magnitudes with the bsxfun function, most efficiently done in a separate line of code, as in the sketch below. I chose randi because it makes it easy to define the initial range of the parameters.) I designed it to begin with a large and very wide-ranging initial population, specifically so that it can easily find the best parameter set in its parameter space. It does this, although at the expense of speed, since larger populations converge more slowly.
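A minimal sketch of that per-parameter scaling (the magnitudes here are hypothetical, purely for illustration; PopSz and Parms are defined as in the answer above):
% Sketch (assumption): give each of the 6 parameters its own initial magnitude.
magnitudes = [1e-2 1e-1 1 1 10 100];                 % hypothetical scale factors
initPop    = bsxfun(@times, randi(1E+4,PopSz,Parms)*1E-3, magnitudes);
opts = optimoptions('ga', 'PopulationSize',PopSz, ...
    'InitialPopulationMatrix',initPop, ...
    'MaxGenerations',2E3, 'PlotFcn','gaplotbestf');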
I have occasionally had to run ga a few times on the same problem when it went wildly off course the first time. A large initial population (I chose 500) usually prevents this. Also, if you are having problems getting a good fit of your model to your data, go over your model carefully to be certain you coded it correctly. In my experience, small errors can cause significant problems.
More Answers (2)
alesmaz
on 18 Jun 2019
1 Comment
Star Strider
on 18 Jun 2019
I would pass the estimated parameters to nlinfit as the initial parameter estimates, and then use nlparci or nlpredci (or fitnlm and its associated functions). The nlinfit (or fitnlm) function will likely not adjust the parameter estimates much (if at all), and it will provide the confidence intervals (or the information necessary to calculate them).
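A minimal sketch of that step (my own addition, assuming the kinetics function and the t, c, and theta variables from the accepted answer; the reshape wrapper is my assumption, needed because nlinfit expects a vector response):
% Sketch (assumption): stack the response columns into one vector, refine the ga
% estimate with nlinfit, then compute 95% confidence intervals with nlparci.
vecmdl = @(b, tt) reshape(kinetics(b, tt), [], 1);   % vector-valued model wrapper
[thetaHat, R, J] = nlinfit(t, c(:), vecmdl, theta);
ci = nlparci(thetaHat, R, 'Jacobian', J);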
crt56 ak
on 8 Aug 2021
Edited: crt56 ak
on 8 Aug 2021
Hi Star, thanks for the useful code. I am a follower. I have a set of 8 differential equations with 2 parameters to estimate, say theta(1) and theta(2). I have only one set of experimental data c (for differential equation 5) and the corresponding time vector t. I am trying to use your ga code to estimate the parameters by changing C = Cv to C = Cv(:,5) and keeping the rest intact, but the fitness function throws errors: the code complains about not enough input arguments at the line with the ga inputs. Is there any way I could modify my fitness function in such a case?
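One possible shape for that fitness (my guess, not from the thread; it assumes kinetics has been modified to return only the fifth state, Cv(:,5), and that opts is defined as in the accepted answer): a common cause of a "not enough input arguments" error is passing the model function to ga directly instead of through an anonymous handle that captures t and c, since ga calls its fitness function with the parameter vector alone.
% Sketch (assumption): the fitness handle passed to ga takes only the parameter
% vector; t and c are captured from the workspace, and 'kinetics' here returns
% only the measured state, Cv(:,5).
ftns  = @(theta) norm(c - kinetics(theta, t));
Parms = 2;                                   % theta(1) and theta(2)
[theta, fval] = ga(ftns, Parms, [],[],[],[], ...
    zeros(Parms,1), Inf(Parms,1), [], [], opts);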