lsqnonlin

Solve nonlinear least-squares (nonlinear data-fitting) problems

Description

Nonlinear least-squares solver

Solves nonlinear least-squares curve fitting problems of the form

$\underset{x}{\mathrm{min}}{‖f\left(x\right)‖}_{2}^{2}=\underset{x}{\mathrm{min}}\left({f}_{1}{\left(x\right)}^{2}+{f}_{2}{\left(x\right)}^{2}+...+{f}_{n}{\left(x\right)}^{2}\right)$

with optional lower and upper bounds lb and ub on the components of x.

x, lb, and ub can be vectors or matrices; see Matrix Arguments.

Rather than compute the value ${‖f\left(x\right)‖}_{2}^{2}$ (the sum of squares), lsqnonlin requires the user-defined function to compute the vector-valued function

$f\left(x\right)=\left[\begin{array}{c}{f}_{1}\left(x\right)\\ {f}_{2}\left(x\right)\\ ⋮\\ {f}_{n}\left(x\right)\end{array}\right].$

x = lsqnonlin(fun,x0) starts at the point x0 and finds a minimum of the sum of squares of the functions described in fun. The function fun should return a vector (or array) of values and not the sum of squares of the values. (The algorithm implicitly computes the sum of squares of the components of fun(x).)

Note

Passing Extra Parameters explains how to pass extra parameters to the vector function fun(x), if necessary.
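One common way to pass an extra parameter is to capture it in an anonymous function. The following is a minimal sketch under that assumption; the data and the names t, y, c, and residual are hypothetical and chosen only for illustration.

```matlab
% Hypothetical, noise-free data generated with a known amplitude c
t = linspace(0,3,50);            % sample times
c = 2;                           % extra parameter (not optimized)
y = c*exp(-1.3*t);               % synthetic observations

% Two-argument residual: r is the unknown, c is the extra parameter
residual = @(r,c) c*exp(-r*t) - y;

% Capture the current value of c so that lsqnonlin sees a function of r only
fun = @(r) residual(r,c);

opts = optimoptions('lsqnonlin','Display','off');
r = lsqnonlin(fun,1,[],[],opts); % r should be close to 1.3
```

The anonymous function stores the value of c at the time fun is created, so changing c afterward requires creating fun again.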

x = lsqnonlin(fun,x0,lb,ub) defines a set of lower and upper bounds on the design variables in x, so that the solution is always in the range lb ≤ x ≤ ub. You can fix the solution component x(i) by specifying lb(i) = ub(i).

Note

If the specified input bounds for a problem are inconsistent, the output x is x0 and the outputs resnorm and residual are [].

Components of x0 that violate the bounds lb ≤ x ≤ ub are reset to the interior of the box defined by the bounds. Components that respect the bounds are not changed.

x = lsqnonlin(fun,x0,lb,ub,options) minimizes with the optimization options specified in options. Use optimoptions to set these options. Pass empty matrices for lb and ub if no bounds exist.

x = lsqnonlin(problem) finds the minimum for problem, a structure described in problem.

[x,resnorm] = lsqnonlin(___), for any input arguments, returns the value of the squared 2-norm of the residual at x: sum(fun(x).^2).
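As a quick check of this relationship, the following sketch compares resnorm with the sum of squares computed directly from fun. The data are hypothetical, and a small deterministic perturbation stands in for noise so that the residual at the solution is nonzero.

```matlab
% Hypothetical data: exponential decay plus a small deterministic perturbation
t = linspace(0,3,50);
y = exp(-1.3*t) + 0.01*sin(7*t);
fun = @(r) exp(-r*t) - y;        % vector of residuals, not their sum of squares

opts = optimoptions('lsqnonlin','Display','off');
[x,resnorm] = lsqnonlin(fun,1,[],[],opts);

% Recompute the squared 2-norm of the residual at the solution
directSum = sum(fun(x).^2);
relGap = abs(resnorm - directSum)/max(directSum,eps);
```

Here relGap should be zero up to rounding error.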

[x,resnorm,residual,exitflag,output] = lsqnonlin(___) additionally returns the value of the residual fun(x) at the solution x, a value exitflag that describes the exit condition, and a structure output that contains information about the optimization process.

[x,resnorm,residual,exitflag,output,lambda,jacobian] = lsqnonlin(___) additionally returns a structure lambda whose fields contain the Lagrange multipliers at the solution x, and the Jacobian of fun at the solution x.
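The following sketch requests all seven outputs for a bounded fit and inspects their sizes. It reuses the observation data from the examples on this page; the bounds and the starting point are assumptions chosen for illustration.

```matlab
xdata = [0.9 1.5 13.8 19.8 24.1 28.2 35.2 60.3 74.6 81.3];
ydata = [455.2 428.6 124.1 67.3 43.2 28.1 13.1 -0.4 -1.3 -1.5];
fun = @(x) x(1)*exp(x(2)*xdata) - ydata;

lb = [0,-1];                     % assumed bounds, for illustration only
ub = [1000,0];
x0 = [400,-0.1];                 % assumed starting point inside the bounds
opts = optimoptions('lsqnonlin','Display','off');

[x,resnorm,residual,exitflag,output,lambda,jacobian] = ...
    lsqnonlin(fun,x0,lb,ub,opts);

jacSize = size(jacobian);        % one row per residual, one column per variable
multLower = lambda.lower;        % multipliers associated with lb
multUpper = lambda.upper;        % multipliers associated with ub
```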

Examples

Fit a simple exponential decay curve to data.

Generate data from an exponential decay model plus noise. The model is

$y=\mathrm{exp}\left(-1.3t\right)+\epsilon ,$

with $t$ ranging from 0 through 3, and $\epsilon$ normally distributed noise with mean 0 and standard deviation 0.05.

rng default % for reproducibility
d = linspace(0,3);
y = exp(-1.3*d) + 0.05*randn(size(d));

The problem is: given the data (d, y), find the exponential decay rate that best fits the data.

Create an anonymous function that takes a value of the exponential decay rate $r$ and returns a vector of differences between the model with that decay rate and the data.

fun = @(r)exp(-d*r)-y;

Find the value of the optimal decay rate. Arbitrarily choose an initial guess x0 = 4.

x0 = 4;
x = lsqnonlin(fun,x0)
Local minimum possible.

lsqnonlin stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
x = 1.2645

Plot the data and the best-fitting exponential curve.

plot(d,y,'ko',d,exp(-x*d),'b-')
legend('Data','Best fit')
xlabel('t')
ylabel('exp(-tx)')

Find the best-fitting model when some of the fitting parameters have bounds.

Find a centering $b$ and scaling $a$ that best fit the function

$a\mathrm{exp}\left(-t\right)\mathrm{exp}\left(-\mathrm{exp}\left(-\left(t-b\right)\right)\right)$

to the standard normal density,

$\frac{1}{\sqrt{2\pi }}\mathrm{exp}\left(-{t}^{2}/2\right).$

Create a vector t of data points, and the corresponding normal density at those points.

t = linspace(-4,4);
y = 1/sqrt(2*pi)*exp(-t.^2/2);

Create a function that evaluates the difference between the centered and scaled function and the normal density y, with x(1) as the scaling $a$ and x(2) as the centering $b$.

fun = @(x)x(1)*exp(-t).*exp(-exp(-(t-x(2)))) - y;

Find the optimal fit starting from x0 = [1/2,0], with the scaling $a$ between 1/2 and 3/2, and the centering $b$ between -1 and 3.

lb = [1/2,-1];
ub = [3/2,3];
x0 = [1/2,0];
x = lsqnonlin(fun,x0,lb,ub)
Local minimum possible.

lsqnonlin stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
x = 1×2

0.8231   -0.2444

Plot the two functions to see the quality of the fit.

plot(t,y,'r-',t,fun(x)+y,'b-')
xlabel('t')
legend('Normal density','Fitted function')

Compare the results of a data-fitting problem when using different lsqnonlin algorithms.

Suppose that you have observation time data xdata and observed response data ydata, and you want to find parameters $x\left(1\right)$ and $x\left(2\right)$ to fit a model of the form

$\text{ydata}=x\left(1\right)\mathrm{exp}\left(x\left(2\right)\text{xdata}\right).$

Input the observation times and responses.

xdata = ...
[0.9 1.5 13.8 19.8 24.1 28.2 35.2 60.3 74.6 81.3];
ydata = ...
[455.2 428.6 124.1 67.3 43.2 28.1 13.1 -0.4 -1.3 -1.5];

Create a simple exponential decay model. The model computes a vector of differences between predicted values and observed values.

fun = @(x)x(1)*exp(x(2)*xdata)-ydata;

Fit the model using the starting point x0 = [100,-1]. First, use the default 'trust-region-reflective' algorithm.

x0 = [100,-1];
options = optimoptions(@lsqnonlin,'Algorithm','trust-region-reflective');
x = lsqnonlin(fun,x0,[],[],options)
Local minimum possible.

lsqnonlin stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.
x = 1×2

498.8309   -0.1013

See if there is any difference using the 'levenberg-marquardt' algorithm.

options.Algorithm = 'levenberg-marquardt';
x = lsqnonlin(fun,x0,[],[],options)
Local minimum possible.
lsqnonlin stopped because the relative size of the current step is less than
the value of the step size tolerance.
x = 1×2

498.8309   -0.1013

The two algorithms found the same solution. Plot the solution and the data.

plot(xdata,ydata,'ko')
hold on
tlist = linspace(xdata(1),xdata(end));
plot(tlist,x(1)*exp(x(2)*tlist),'b-')
xlabel xdata
ylabel ydata
title('Exponential Fit to Data')
legend('Data','Exponential Fit')
hold off

Find the $x$ that minimizes

$\sum _{k=1}^{10}{\left(2+2k-{e}^{k{x}_{1}}-{e}^{k{x}_{2}}\right)}^{2}$,

and find the value of the minimal sum of squares.

Because lsqnonlin assumes that the sum of squares is not explicitly formed in the user-defined function, the function passed to lsqnonlin should instead compute the vector-valued function

${F}_{k}\left(x\right)=2+2k-{e}^{k{x}_{1}}-{e}^{k{x}_{2}}$,

for $k=1$ to $10$ (that is, $F$ should have $10$ components).

The myfun function, which computes the 10-component vector F, appears at the end of this example.

Find the minimizing point and the minimum value, starting at the point x0 = [0.3,0.4].

x0 = [0.3,0.4];
[x,resnorm] = lsqnonlin(@myfun,x0)
Local minimum possible.
lsqnonlin stopped because the size of the current step is less than
the value of the step size tolerance.
x = 1×2

0.2578    0.2578

resnorm = 124.3622

The resnorm output is the squared residual norm, or the sum of squares of the function values.

The following function computes the vector-valued objective function.

function F = myfun(x)
k = 1:10;
F = 2 + 2*k-exp(k*x(1))-exp(k*x(2));
end

Examine the solution process both as it occurs (by setting the Display option to 'iter') and afterward (by examining the output structure).

Suppose that you have observation time data xdata and observed response data ydata, and you want to find parameters $x\left(1\right)$ and $x\left(2\right)$ to fit a model of the form

$\text{ydata}=x\left(1\right)\mathrm{exp}\left(x\left(2\right)\text{xdata}\right).$

Input the observation times and responses.

xdata = ...
[0.9 1.5 13.8 19.8 24.1 28.2 35.2 60.3 74.6 81.3];
ydata = ...
[455.2 428.6 124.1 67.3 43.2 28.1 13.1 -0.4 -1.3 -1.5];

Create a simple exponential decay model. The model computes a vector of differences between predicted values and observed values.

fun = @(x)x(1)*exp(x(2)*xdata)-ydata;

Fit the model using the starting point x0 = [100,-1]. Examine the solution process by setting the Display option to 'iter'. Request the output structure to get more information about the solution process.

x0 = [100,-1];
options = optimoptions('lsqnonlin','Display','iter');
[x,resnorm,residual,exitflag,output] = lsqnonlin(fun,x0,[],[],options);
                                         Norm of      First-order
Iteration  Func-count     f(x)          step          optimality
0          3          359677                      2.88e+04
Objective function returned Inf; trying a new point...
1          6          359677        11.6976       2.88e+04
2          9          321395            0.5       4.97e+04
3         12          321395              1       4.97e+04
4         15          292253           0.25       7.06e+04
5         18          292253            0.5       7.06e+04
6         21          270350          0.125       1.15e+05
7         24          270350           0.25       1.15e+05
8         27          252777         0.0625       1.63e+05
9         30          252777          0.125       1.63e+05
10         33          243877        0.03125       7.48e+04
11         36          243660         0.0625        8.7e+04
12         39          243276         0.0625          2e+04
13         42          243174         0.0625       1.14e+04
14         45          242999          0.125        5.1e+03
15         48          242661           0.25       2.04e+03
16         51          241987            0.5       1.91e+03
17         54          240643              1       1.04e+03
18         57          237971              2       3.36e+03
19         60          232686              4       6.04e+03
20         63          222354              8        1.2e+04
21         66          202592             16       2.25e+04
22         69          166443             32       4.05e+04
23         72          106320             64       6.68e+04
24         75         28704.7            128       8.31e+04
25         78         89.7947        140.674       2.22e+04
26         81         9.57381        2.02599            684
27         84         9.50489      0.0619927           2.27
28         87         9.50489    0.000462261         0.0114

Local minimum possible.

lsqnonlin stopped because the final change in the sum of squares relative to
its initial value is less than the value of the function tolerance.

output
output = struct with fields:
firstorderopt: 0.0114
iterations: 28
funcCount: 87
cgiterations: 0
algorithm: 'trust-region-reflective'
stepsize: 4.6226e-04
message: '...'

For comparison, set the Algorithm option to 'levenberg-marquardt'.

options.Algorithm = 'levenberg-marquardt';
[x,resnorm,residual,exitflag,output] = lsqnonlin(fun,x0,[],[],options);
                                        First-Order                    Norm of
Iteration  Func-count    Residual       optimality      Lambda           step
0           3          359677        2.88e+04         0.01
Objective function returned Inf; trying a new point...
1          13          340761        3.91e+04       100000       0.280777
2          16          304661        5.97e+04        10000       0.373146
3          21          297292        6.55e+04        1e+06      0.0589933
4          24          288240        7.57e+04       100000      0.0645444
5          28          275407        1.01e+05        1e+06      0.0741266
6          31          249954        1.62e+05       100000       0.094571
7          36          245896        1.35e+05        1e+07      0.0133606
8          39          243846        7.26e+04        1e+06     0.00944311
9          42          243568        5.66e+04       100000     0.00821622
10          45          243424        1.61e+04        10000     0.00777936
11          48          243322         8.8e+03         1000      0.0673933
12          51          242408         5.1e+03          100       0.675209
13          54          233628        1.05e+04           10        6.59804
14          57          169089        8.51e+04            1        54.6992
15          60         30814.7        1.54e+05          0.1        196.939
16          63         147.496           8e+03         0.01        129.795
17          66         9.51503             117        0.001        9.96069
18          69         9.50489          0.0714       0.0001      0.0804859
19          72         9.50489        5.12e-05        1e-05    5.07049e-05

Local minimum possible.
lsqnonlin stopped because the relative size of the current step is less than
the value of the step size tolerance.

The 'levenberg-marquardt' algorithm converged with fewer iterations, but almost as many function evaluations:

output
output = struct with fields:
iterations: 19
funcCount: 72
stepsize: 5.0705e-05
cgiterations: []
firstorderopt: 5.1213e-05
algorithm: 'levenberg-marquardt'
message: '...'

Input Arguments

Function whose sum of squares is minimized, specified as a function handle or the name of a function. fun is a function that accepts an array x and returns an array F, the objective functions evaluated at x. The function fun can be specified as a function handle to a file:

x = lsqnonlin(@myfun,x0)

where myfun is a MATLAB® function such as

function F = myfun(x)
F = ...            % Compute function values at x

fun can also be a function handle for an anonymous function.

x = lsqnonlin(@(x)sin(x.*x),x0);

lsqnonlin passes x to your objective function in the shape of the x0 argument. For example, if x0 is a 5-by-3 array, then lsqnonlin passes x to fun as a 5-by-3 array.
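The following sketch illustrates this shape-preserving behavior with a matrix-valued unknown. The target matrix is hypothetical, and the problem is deliberately trivial (the minimizer is the target itself).

```matlab
% Hypothetical 3-by-2 target; the residual is zero when X equals target
target = [1 2; 3 4; 5 6];
fun = @(X) X - target;           % X arrives as a 3-by-2 array, like X0

X0 = zeros(3,2);
opts = optimoptions('lsqnonlin','Display','off');
X = lsqnonlin(fun,X0,[],[],opts);

solSize = size(X);               % same size as X0
```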

Note

The sum of squares should not be formed explicitly. Instead, your function should return a vector of function values. See Examples.

If the Jacobian can also be computed and the 'SpecifyObjectiveGradient' option is true, set by

options = optimoptions('lsqnonlin','SpecifyObjectiveGradient',true)

then the function fun must return a second output argument with the Jacobian value J (a matrix) at x. By checking the value of nargout, the function can avoid computing J when fun is called with only one output argument (in the case where the optimization algorithm only needs the value of F but not J).

function [F,J] = myfun(x)
F = ...          % Objective function values at x
if nargout > 1   % Two output arguments
J = ...   % Jacobian of the function evaluated at x
end

If fun returns an array of m components and x has n elements, where n is the number of elements of x0, the Jacobian J is an m-by-n matrix where J(i,j) is the partial derivative of F(i) with respect to x(j). (The Jacobian J is the transpose of the gradient of F.)
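As an illustration of this layout, the following sketch supplies an analytic Jacobian for the two-parameter model x(1)*exp(x(2)*t). The data and the helper name expResidual are hypothetical; the helper appears at the end of the script, following the convention used in the examples on this page.

```matlab
% Hypothetical, noise-free data from the model 2*exp(-1.3*t)
t = linspace(0,3,50);
y = 2*exp(-1.3*t);

opts = optimoptions('lsqnonlin', ...
    'SpecifyObjectiveGradient',true,'Display','off');
x = lsqnonlin(@(x) expResidual(x,t,y),[1,-1],[],[],opts);

function [F,J] = expResidual(x,t,y)
% Residual of the model x(1)*exp(x(2)*t) and, on request, its Jacobian.
e = exp(x(2)*t(:));              % m-by-1 column
F = x(1)*e - y(:);               % m-by-1 residual vector
if nargout > 1                   % compute J only when it is requested
    J = [e, x(1)*t(:).*e];       % m-by-2: [dF/dx(1), dF/dx(2)]
end
end
```

Each row of J corresponds to one residual and each column to one variable, so J here is 50-by-2.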

Example: @(x)cos(x).*exp(-x)

Data Types: char | function_handle | string

Initial point, specified as a real vector or real array. Solvers use the number of elements in x0 and the size of x0 to determine the number and size of variables that fun accepts.

Example: x0 = [1,2,3,4]

Data Types: double

Lower bounds, specified as a real vector or real array. If the number of elements in x0 is equal to the number of elements in lb, then lb specifies that

x(i) >= lb(i) for all i.

If numel(lb) < numel(x0), then lb specifies that

x(i) >= lb(i) for 1 <= i <= numel(lb).

If lb has fewer elements than x0, solvers issue a warning.

Example: To specify that all x components are positive, use lb = zeros(size(x0)).

Data Types: double

Upper bounds, specified as a real vector or real array. If the number of elements in x0 is equal to the number of elements in ub, then ub specifies that

x(i) <= ub(i) for all i.

If numel(ub) < numel(x0), then ub specifies that

x(i) <= ub(i) for 1 <= i <= numel(ub).

If ub has fewer elements than x0, solvers issue a warning.

Example: To specify that all x components are less than 1, use ub = ones(size(x0)).

Data Types: double

Optimization options, specified as the output of optimoptions or a structure as optimset returns.

Some options apply to all algorithms, and others are relevant for particular algorithms. See Optimization Options Reference for detailed information.

Some options are absent from the optimoptions display. These options appear in italics in the following table. For details, see View Options.

Example: options = optimoptions('lsqnonlin','FiniteDifferenceType','central')
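The following sketch sets several options at once and then changes one with dot notation. The tolerance and limit values are arbitrary choices for illustration, and the test problem (residuals whose sum of squares is the Rosenbrock function) is an assumption, not part of this page.

```matlab
% Set several options in one call; the values are illustrative only
opts = optimoptions('lsqnonlin', ...
    'Algorithm','levenberg-marquardt', ...
    'FunctionTolerance',1e-10, ...
    'StepTolerance',1e-10, ...
    'MaxIterations',200, ...
    'Display','off');

% Change a single option afterward with dot notation
opts.MaxFunctionEvaluations = 1000;

% Residual form of the Rosenbrock function; the minimizer is [1,1]
fun = @(x) [10*(x(2) - x(1)^2); 1 - x(1)];
x = lsqnonlin(fun,[-1.2,1],[],[],opts);
```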

Problem structure, specified as a structure with the following fields:

Field Name    Entry
objective     Objective function
x0            Initial point for x
lb            Vector of lower bounds
ub            Vector of upper bounds
solver        'lsqnonlin'
options       Options created with optimoptions

You must supply at least the objective, x0, solver, and options fields in the problem structure.

Data Types: struct
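The following sketch builds a problem structure field by field and passes it to lsqnonlin. The data and bounds are hypothetical.

```matlab
% Hypothetical, noise-free decay data
t = linspace(0,3,50);
y = exp(-1.3*t);

problem.objective = @(r) exp(-r*t) - y;   % residual vector
problem.x0 = 1;                           % initial point
problem.lb = 0;                           % assumed lower bound
problem.ub = 5;                           % assumed upper bound
problem.solver = 'lsqnonlin';
problem.options = optimoptions('lsqnonlin','Display','off');

r = lsqnonlin(problem);                   % r should be close to 1.3
```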

Output Arguments

Solution, returned as a real vector or real array. The size of x is the same as the size of x0. Typically, x is a local solution to the problem when exitflag is positive. For information on the quality of the solution, see When the Solver Succeeds.

Squared norm of the residual, returned as a nonnegative real. resnorm is the squared 2-norm of the residual at x: sum(fun(x).^2).

Value of objective function at solution, returned as an array. In general, residual = fun(x).

Reason the solver stopped, returned as an integer.

 1    Function converged to a solution x.
 2    Change in x is less than the specified tolerance, or Jacobian at x is undefined.
 3    Change in the residual is less than the specified tolerance.
 4    Relative magnitude of search direction is smaller than the step tolerance.
 0    Number of iterations exceeds options.MaxIterations or number of function evaluations exceeded options.MaxFunctionEvaluations.
-1    A plot function or output function stopped the solver.
-2    Problem is infeasible: the bounds lb and ub are inconsistent.

Information about the optimization process, returned as a structure with fields:

firstorderopt    Measure of first-order optimality
iterations       Number of iterations taken
funcCount        The number of function evaluations
cgiterations     Total number of PCG iterations (trust-region-reflective algorithm only)
stepsize         Final displacement in x
algorithm        Optimization algorithm used
message          Exit message
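The following sketch shows one way a script might react to exitflag and report fields of output. It reuses the observation data from the examples on this page and deliberately sets a small MaxIterations value (an assumption for illustration) so that the solver stops at the iteration limit.

```matlab
xdata = [0.9 1.5 13.8 19.8 24.1 28.2 35.2 60.3 74.6 81.3];
ydata = [455.2 428.6 124.1 67.3 43.2 28.1 13.1 -0.4 -1.3 -1.5];
fun = @(x) x(1)*exp(x(2)*xdata) - ydata;

% Deliberately small iteration limit to trigger exitflag = 0
opts = optimoptions('lsqnonlin','Display','off','MaxIterations',5);
[x,resnorm,~,exitflag,output] = lsqnonlin(fun,[100,-1],[],[],opts);

if exitflag > 0
    status = 'converged';
elseif exitflag == 0
    status = 'stopped at the iteration or function evaluation limit';
else
    status = 'stopped by an output function or by inconsistent bounds';
end
fprintf('%s after %d iterations and %d function evaluations\n', ...
    status,output.iterations,output.funcCount);
```

When exitflag is 0, a typical response is to raise MaxIterations or MaxFunctionEvaluations, or to restart from the returned x.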

Lagrange multipliers at the solution, returned as a structure with fields:

lower    Lower bounds lb
upper    Upper bounds ub

Jacobian at the solution, returned as a real matrix. jacobian(i,j) is the partial derivative of fun(i) with respect to x(j) at the solution x.

Limitations

• The trust-region-reflective algorithm does not solve underdetermined systems; it requires that the number of equations, i.e., the row dimension of F, be at least as great as the number of variables. In the underdetermined case, lsqnonlin uses the Levenberg-Marquardt algorithm.

• lsqnonlin can solve complex-valued problems directly. Note that bound constraints do not make sense for complex values. For a complex problem with bound constraints, split the variables into real and imaginary parts. See Fit a Model to Complex-Valued Data.

• The preconditioner computation used in the preconditioned conjugate gradient part of the trust-region-reflective method forms $J^{T}J$ (where $J$ is the Jacobian matrix) before computing the preconditioner. Therefore, a row of $J$ with many nonzeros, which results in a nearly dense product $J^{T}J$, can lead to a costly solution process for large problems.

• If components of x have no upper (or lower) bounds, lsqnonlin prefers that the corresponding components of ub (or lb) be set to inf (or -inf for lower bounds) as opposed to an arbitrary but very large positive (or negative for lower bounds) number.
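The following sketch illustrates two of the points above: representing a complex unknown by its real and imaginary parts so that bounds are meaningful, and using Inf and -Inf for components without bounds. The model, the data, and all variable names are hypothetical.

```matlab
% Hypothetical complex-valued, noise-free data: z*exp(-rate*t)
t = linspace(0,2,40).';
ztrue = 1.5 - 0.5i;
ydata = ztrue*exp(-0.8*t);

% Real unknowns: x = [real(z), imag(z), rate]
model = @(x) (x(1) + 1i*x(2))*exp(-x(3)*t);

% Stack real and imaginary parts so that the residual vector is real
fun = @(x) [real(model(x) - ydata); imag(model(x) - ydata)];

lb = [0,-Inf,0];                 % -Inf marks a component with no lower bound
ub = [Inf,Inf,5];                % Inf marks a component with no upper bound
opts = optimoptions('lsqnonlin','Display','off');
x = lsqnonlin(fun,[1,0,1],lb,ub,opts);

z = x(1) + 1i*x(2);              % reassemble the complex coefficient
```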

You can use the trust-region-reflective algorithm in lsqnonlin, lsqcurvefit, and fsolve with small- to medium-scale problems without computing the Jacobian in fun or providing the Jacobian sparsity pattern. (This also applies to using fmincon or fminunc without computing the Hessian or supplying the Hessian sparsity pattern.) How small is small- to medium-scale? No absolute answer is available, as it depends on the amount of virtual memory in your computer system configuration.

Suppose your problem has m equations and n unknowns. If the command J = sparse(ones(m,n)) causes an Out of memory error on your machine, then this is certainly too large a problem. If it does not result in an error, the problem might still be too large. You can find out only by running it and seeing if MATLAB runs within the amount of virtual memory available on your system.

Algorithms

The Levenberg-Marquardt and trust-region-reflective methods are based on the nonlinear least-squares algorithms also used in fsolve.

• The default trust-region-reflective algorithm is a subspace trust-region method and is based on the interior-reflective Newton method described in [1] and [2]. Each iteration involves the approximate solution of a large linear system using the method of preconditioned conjugate gradients (PCG). See Trust-Region-Reflective Least Squares.

• The Levenberg-Marquardt method is described in references [4], [5], and [6]. See Levenberg-Marquardt Method.

Alternative Functionality

App

The Optimize Live Editor task provides a visual interface for lsqnonlin.

References

[1] Coleman, T.F. and Y. Li. “An Interior, Trust Region Approach for Nonlinear Minimization Subject to Bounds.” SIAM Journal on Optimization, Vol. 6, 1996, pp. 418–445.

[2] Coleman, T.F. and Y. Li. “On the Convergence of Reflective Newton Methods for Large-Scale Nonlinear Minimization Subject to Bounds.” Mathematical Programming, Vol. 67, Number 2, 1994, pp. 189–224.

[3] Dennis, J. E. Jr. “Nonlinear Least-Squares.” State of the Art in Numerical Analysis, ed. D. Jacobs, Academic Press, pp. 269–312.

[4] Levenberg, K. “A Method for the Solution of Certain Problems in Least-Squares.” Quarterly Applied Mathematics 2, 1944, pp. 164–168.

[5] Marquardt, D. “An Algorithm for Least-squares Estimation of Nonlinear Parameters.” SIAM Journal Applied Mathematics, Vol. 11, 1963, pp. 431–441.

[6] Moré, J. J. “The Levenberg-Marquardt Algorithm: Implementation and Theory.” Numerical Analysis, ed. G. A. Watson, Lecture Notes in Mathematics 630, Springer Verlag, 1977, pp. 105–116.

[7] Moré, J. J., B. S. Garbow, and K. E. Hillstrom. User Guide for MINPACK 1. Argonne National Laboratory, Rept. ANL–80–74, 1980.

[8] Powell, M. J. D. “A Fortran Subroutine for Solving Systems of Nonlinear Algebraic Equations.” Numerical Methods for Nonlinear Algebraic Equations, P. Rabinowitz, ed., Ch.7, 1970.