*System identification* is a methodology for building mathematical
models of dynamic systems using measurements of the input and output signals of the
system.

The process of system identification requires that you:

Measure the input and output signals from your system in the time or frequency domain.

Select a model structure.

Apply an estimation method to estimate values for the adjustable parameters in the candidate model structure.

Evaluate the estimated model to see if the model is adequate for your application needs.

In a dynamic system, the values of the output signals depend on both the instantaneous values of the input signals and also on the past behavior of the system. For example, a car seat is a dynamic system—the seat shape (settling position) depends on both the current weight of the passenger (instantaneous value) and how long the passenger has been riding in the car (past behavior).

A *model* is a mathematical relationship between the input and output
variables of the system. Models of dynamic systems are typically described by differential or
difference equations, transfer functions, state-space equations, and pole-zero-gain
models.

You can represent dynamic models in both continuous-time and discrete-time form.

An often-used example of a dynamic model is the equation of motion of a spring-mass-damper
system. As the following figure shows, the mass moves in response to the force
*F*(*t*) applied on the base to which the mass is attached.
The input and output of this system are the force *F*(*t*) and
displacement *y*(*t*), respectively.

You can represent the same physical system as several equivalent models. For example, you can represent the mass-spring-damper system in continuous time as a second-order differential equation:

$$m\frac{{d}^{2}y}{d{t}^{2}}+c\frac{dy}{dt}+ky(t)=F(t)$$

Here, *m* is the mass, *k* is the stiffness constant of
the spring, and *c* is the damping coefficient. The solution to this
differential equation lets you determine the displacement of the mass,
*y*(*t*), as a function of the external force
*F*(*t*) at any time *t* for known values
of the constants *m*, *c*, and *k*.

Consider the displacement *y*(*t*) and velocity $$v(t)=\frac{dy(t)}{dt}$$ as state variables:

$$x(t)=\left[\begin{array}{c}y(t)\\ v(t)\end{array}\right]$$

You can express the previous equation of motion as a state-space model of the system:

$$\begin{array}{l}\frac{dx}{dt}=Ax(t)+BF(t)\\ y(t)=Cx(t)\end{array}$$

The matrices *A*, *B*, and *C* are
related to the constants *m*, *c*, and *k*
as follows:

$$\begin{array}{c}A=\left[\begin{array}{cc}0& 1\\ -\frac{k}{m}& -\frac{c}{m}\end{array}\right]\\ B=\left[\begin{array}{c}0\\ \frac{1}{m}\end{array}\right]\\ C=\left[\begin{array}{cc}1& 0\end{array}\right]\end{array}$$

You can also obtain a *transfer function model* of the
spring-mass-damper system by taking the Laplace transform of the differential equation:

$$G(s)=\frac{Y(s)}{F(s)}=\frac{1}{(m{s}^{2}+cs+k)}$$

Here, *s* is the Laplace variable.
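
As a consistency check, you can recover this transfer function from the state-space matrices by using the standard relation *G*(*s*) = *C*(*sI* – *A*)^{–1}*B*, which assumes zero initial conditions and no direct feedthrough. The following short derivation is a sketch that treats *B* as the column vector [0; 1/*m*]:

$$\begin{array}{l}{(sI-A)}^{-1}=\frac{1}{{s}^{2}+\frac{c}{m}s+\frac{k}{m}}\left[\begin{array}{cc}s+\frac{c}{m}& 1\\ -\frac{k}{m}& s\end{array}\right]\\ G(s)=C{(sI-A)}^{-1}B=\frac{\frac{1}{m}}{{s}^{2}+\frac{c}{m}s+\frac{k}{m}}=\frac{1}{m{s}^{2}+cs+k}\end{array}$$

Both representations therefore describe the same input-output behavior.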

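Suppose you can observe only the input and output variables
*F*(*t*) and *y*(*t*) of
the mass-spring-damper system at discrete time instants *t* =
*nT*_{s}, where *T*_{s} is a fixed sampling interval and *n* = 0, 1, 2, .... You can then represent the relationship between the sampled variables as a second-order difference equation, such as: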

$$y(t)+{a}_{1}y(t-{T}_{s})+{a}_{2}y(t-2{T}_{s})=bF(t-{T}_{s})$$

Often, for simplicity, *T*_{s} is taken as one time
unit, and the equation can be written as

$$y(t)+{a}_{1}y(t-1)+{a}_{2}y(t-2)=bF(t-1)$$

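Here, *a*_{1}, *a*_{2}, and *b* are the model parameters. They depend on the system constants *m*, *c*, and *k* and on the sample time *T*_{s}.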

This difference equation shows the dynamic nature of the model. The displacement value at
the time instant *t* depends not only on the value of force
*F* at a previous time instant, but also on the displacement values at the
previous two time instants *y*(*t*–1) and *y*(*t*–2).

You can use this equation to compute the displacement at a specific time. The displacement is represented as a weighted sum of the past input and output values:

$$y(t)=bF(t-1)-{a}_{1}y(t-1)-{a}_{2}y(t-2)$$

This equation shows an iterative way of generating values of the output
*y*(*t*) starting from initial conditions
*y*(*0*) and *y*(*1*) and
measurements of input *F*(*t*). This computation is called
*simulation*.

Alternatively, the output value at a given time *t* can be computed using
the *measured* values of output at the previous two time instants and the
input value at a previous time instant. This computation is called
*prediction*. For more information on simulation and prediction using a
model, see topics on the Simulation and Prediction
page.
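
The difference between simulation and prediction can be sketched in a few lines of base MATLAB. The parameter values `a1`, `a2`, and `b`, the step input, and the perturbation that stands in for measurement noise are all made-up assumptions for illustration; they do not come from a real mass-spring-damper system. MATLAB indices start at 1, so the initial conditions *y*(0) and *y*(1) occupy indices 1 and 2.

```matlab
% Assumed parameters of y(t) + a1*y(t-1) + a2*y(t-2) = b*F(t-1)
a1 = -1.5;  a2 = 0.7;  b = 0.5;

N = 50;
F = ones(N,1);                       % unit-step force (illustrative input)

% Stand-in for measured data: exact response plus a small perturbation.
ytrue = zeros(N,1);
for t = 3:N
    ytrue(t) = b*F(t-1) - a1*ytrue(t-1) - a2*ytrue(t-2);
end
ymeas = ytrue + 0.01*sin((1:N)');

ysim  = zeros(N,1);                  % simulation: uses its own past outputs
ypred = zeros(N,1);                  % prediction: uses measured past outputs
ysim(1:2)  = ymeas(1:2);             % initial conditions
ypred(1:2) = ymeas(1:2);
for t = 3:N
    ysim(t)  = b*F(t-1) - a1*ysim(t-1)  - a2*ysim(t-2);
    ypred(t) = b*F(t-1) - a1*ymeas(t-1) - a2*ymeas(t-2);
end
```

In the simulation loop, errors in earlier computed outputs propagate forward, whereas the one-step prediction is corrected at every sample by the measured outputs.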

You can also represent a discrete-time equation of motion in state-space and transfer-function forms by performing transformations similar to those described in Continuous-Time Dynamic Model Example.

System identification uses the input and output signals you measure from a system to estimate the values of adjustable parameters in a given model structure. You can build models using time-domain input-output signals, frequency response data, time-series signals, and time-series spectra.

To obtain a good model of your system, you must have measured data that reflects the dynamic behavior of the system. The accuracy of your model depends on the quality of your measurement data, which in turn depends on your experimental design.

Time-domain data consists of the input and output variables of the system that you record at a uniform sampling interval over a period of time.

For example, if you measure the input force *F*(*t*) and
mass displacement *y*(*t*) of the spring-mass-damper system
illustrated in Dynamic Systems and Models
at a uniform sampling frequency of 10 Hz, you obtain the following vectors of measured
values:

$$\begin{array}{l}{u}_{meas}=[F({T}_{s}),F(2{T}_{s}),F(3{T}_{s}),\mathrm{...},F(N{T}_{s})]\\ {y}_{meas}=[y({T}_{s}),y(2{T}_{s}),y(3{T}_{s}),\mathrm{...},y(N{T}_{s})]\end{array}$$

Here, *T*_{s} = 0.1 seconds and *NT*_{s} is the time of the last measurement.

If you want to build a discrete-time model from this data, the data vectors
*u*_{meas} and *y*_{meas} and the sample time *T*_{s} provide sufficient information for creating such a model.

If you want to build a continuous-time model, you must also know the intersample behavior of the input signals during the experiment. For example, the input can be piecewise constant (zero-order hold) or piecewise linear (first-order hold) between samples.
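
As a minimal sketch, time-domain measurements can be packaged in an `iddata` object together with the sample time and the intersample behavior. The signals below are placeholders (a step in the force and an all-zero output vector), not real measurements, and the channel names are assumptions chosen for readability.

```matlab
% Requires System Identification Toolbox.
Ts = 0.1;                          % sample time in seconds (10 Hz sampling)
N  = 100;                          % number of samples (placeholder value)
t  = (1:N)' * Ts;                  % sample instants Ts, 2*Ts, ..., N*Ts

% Placeholder signals standing in for real measurements.
umeas = double(t >= 1);            % step in the force at t = 1 s
ymeas = zeros(N,1);                % replace with the measured displacement

data = iddata(ymeas, umeas, Ts);   % output first, then input
data.InputName   = 'Force';
data.OutputName  = 'Displacement';
data.InterSample = 'zoh';          % use 'foh' for piecewise-linear inputs
```

Setting `InterSample` matters mainly when you later estimate a continuous-time model from the sampled data.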

Frequency-domain data represents measurements of the system input and output variables that you record or store in the frequency domain. The frequency-domain signals are Fourier transforms of the corresponding time-domain signals.

Frequency-domain data can also represent the frequency response of the system, represented
by the set of complex response values over a given frequency range. The *frequency
response* describes the steady-state outputs of a linear system to sinusoidal inputs. If the input is a sine wave
with frequency *ω*, then the output is also a sine wave of the same frequency,
with an amplitude that is *A*(*ω*) times the input signal amplitude
and a phase shift of *Φ*(*ω*) with respect to the input signal. The frequency response is *A*(*ω*)e^{(iΦ(ω))}.

In the case of the mass-spring-damper system, you can obtain the frequency response data by using a sinusoidal input force and measuring the corresponding amplitude gain and phase shift of the response over a range of input frequencies.
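
When the constants are known, the same amplitude gain and phase shift follow from evaluating *G*(*s*) = 1/(*ms*² + *cs* + *k*) at *s* = *iω*. The following base MATLAB sketch uses made-up values for `m`, `c`, and `k` and an arbitrary frequency grid.

```matlab
m = 1;  c = 0.5;  k = 4;             % assumed constants (illustrative only)
w = logspace(-1, 2, 200)';           % frequency grid in rad/s

% Frequency response G(i*w) of the mass-spring-damper transfer function
G = 1 ./ (m*(1i*w).^2 + c*(1i*w) + k);

A   = abs(G);                        % amplitude gain A(w)
Phi = angle(G);                      % phase shift Phi(w) in radians

[~, idx] = max(A);
wPeak = w(idx);                      % grid frequency with the largest gain

% With System Identification Toolbox, idfrd(reshape(G,1,1,[]), w, 0)
% packages such values as continuous-time frequency response data.
```

For lightly damped systems, the gain peaks near the natural frequency sqrt(*k*/*m*), which is one reason the input frequencies in an experiment should cover that region.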

You can use frequency-domain data to build both discrete-time and continuous-time models of your system.

System identification requires that your data capture the important dynamics of your system. Good experimental design ensures that you measure the right variables with sufficient accuracy and duration to capture the dynamics you want to model. In general, your experiment must:

Use inputs that excite the system dynamics adequately. For example, a single step is seldom enough excitation.

Measure data long enough to capture the important time constants.

Set up a data acquisition system that has a good signal-to-noise ratio.

Measure data at appropriate sampling intervals or frequency resolution.

You can analyze the data quality before building the model using the functions and
techniques described in Analyze Data. For example, you can analyze the input spectra to
determine if the input signals have sufficient power over the bandwidth of the
system.
To get analysis and processing recommendations for your specific data, use `advice`.

You can also analyze your data to determine peak frequencies, input delays, important time constants, and indications of nonlinearities using nonparametric analysis tools in this toolbox. You can use this information to configure model structures for building models from data.

A *model structure* is a mathematical relationship between input and
output variables that contains unknown parameters. Examples of model structures are transfer
functions with adjustable poles and zeros, state-space equations with unknown system matrices,
and nonlinear parameterized functions.

The following difference equation represents a simple model structure:

$$y(k)+ay(k-1)=bu(k)$$

Here, *a* and *b* are adjustable parameters.

The system identification process requires that you choose a model structure and apply the estimation methods to determine the numerical values of the model parameters.
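
To make the idea of estimating parameters concrete, the following base MATLAB sketch fits *a* and *b* in the structure *y*(*k*) + *ay*(*k*–1) = *bu*(*k*) by linear least squares. The "true" values, the input signal, and the noise-free data are assumptions made for illustration. Least squares is one possible estimation method for this particular structure, which corresponds to an ARX model with orders [`na nb nk`] = [1 1 0].

```matlab
aTrue = -0.8;  bTrue = 0.5;          % assumed values used to generate data
N = 200;
k = (1:N)';
u = sin(0.2*k) + sin(0.9*k);         % made-up input with two frequencies

% Generate noise-free data from y(k) + a*y(k-1) = b*u(k)
y = zeros(N,1);
y(1) = bTrue*u(1);
for n = 2:N
    y(n) = -aTrue*y(n-1) + bTrue*u(n);
end

% Regression form: y(k) = [-y(k-1), u(k)] * [a; b]
Phi   = [-y(1:N-1), u(2:N)];
theta = Phi \ y(2:N);                % least-squares solution
aHat  = theta(1);
bHat  = theta(2);
```

With noisy measurements the estimates no longer match the true values exactly, and their accuracy depends on the noise level and on how well the input excites the system.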

You can use one of the following approaches to choose the model structure:

You want a model that is able to reproduce your measured data and is as simple as possible. You can try various mathematical structures available in the toolbox. This modeling approach is called *black-box* modeling.

You want a specific structure for your model, which you might have derived from first principles, but do not know numerical values of its parameters. You can represent the model structure as a set of equations or as a state-space system in MATLAB® and estimate the values of its parameters from data. This approach is known as *grey-box* modeling.

The System Identification Toolbox™ software estimates model parameters by minimizing the error between the model
output and the measured response. The output *y _{model}*
of the linear model is given by

*y*_{model}(*t*) = *G**u*(*t*)

Here, *G* is the transfer function.

To determine *G*, the toolbox minimizes the difference between the model
output *y*_{model}(*t*) and the measured
output *y*_{meas}(*t*). The
*minimization criterion* is a weighted norm of the error,
*v*(*t*), where

*v*(*t*) = *y*_{meas}(*t*) – *y*_{model}(*t*)

*y*_{model}(*t*) is one of the following:

Simulated response (*G**u*(*t*)) of the model for a given input *u*(*t*)

Predicted response of the model for a given input *u*(*t*) and past measurements of the output (*y*_{meas}(*t*–1), *y*_{meas}(*t*–2),...)

Accordingly, the error *v*(*t*) is called the
*simulation error* or *prediction error*. The
estimation algorithms
adjust parameters in the model structure *G* such that the norm of this error
is as small as possible.

You can configure the estimation algorithm by:

Configuring the minimization criterion to focus the estimation in a desired frequency range, for example, to put more emphasis at lower frequencies and deemphasize higher frequency noise contributions. You can also configure the criterion to target the intended application needs for the model, such as simulation or prediction.

Specifying optimization options for iterative estimation algorithms.

The majority of estimation algorithms in this toolbox are iterative. You can configure an iterative estimation algorithm by specifying options, such as the optimization method and the maximum number of iterations.

For more information about configuring the estimation algorithm, see Options to Configure the Loss Function and the topics for estimating specific model structures.
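
As a hedged illustration, the following sketch configures a state-space estimation with `ssestOptions`. The specific choices (simulation focus, a 0 to 10 rad/s passband, the `'lm'` search method, and 50 iterations) are arbitrary examples, and the property names follow recent toolbox releases, so older releases might use different names.

```matlab
% Requires System Identification Toolbox.
opt = ssestOptions;

% Minimization criterion: target simulation use and emphasize low frequencies.
opt.Focus           = 'simulation';
opt.WeightingFilter = [0 10];            % passband in rad/s (example value)

% Iterative search settings.
opt.SearchMethod                = 'lm';  % Levenberg-Marquardt search
opt.SearchOptions.MaxIterations = 50;

% The option set is passed as the last argument of the estimation command:
% model = ssest(data, 2, opt);           % data is an iddata object
```

Other estimation commands have their own option sets, and not every option applies to every model structure.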

Black-box modeling is useful when your primary interest is in fitting the data regardless of a particular mathematical structure of the model. The toolbox provides several linear and nonlinear black-box model structures, which have traditionally been useful for representing dynamic systems. These model structures vary in complexity depending on the flexibility you need to account for the dynamics and noise in your system. You can choose one of these structures and compute its parameters to fit the measured response data.

Black-box modeling is usually a trial-and-error process, where you estimate the parameters of various structures and compare the results. Typically, you start with the simple linear model structure and progress to more complex structures. You might also choose a model structure because you are more familiar with this structure or because you have specific application needs.

The simplest linear black-box structures require the fewest options to configure:

Transfer function, with a given number of poles and zeros

Linear ARX model, which is the simplest input-output polynomial model

State-space model, which you can estimate by specifying the number of model states

Estimation of some of these structures also uses noniterative estimation algorithms, which further reduces complexity.

You can configure a model structure using the *model
order*. The definition of model order varies depending
on the type of model you select. For example, if you choose a transfer
function representation, the model order is related to the number
of poles and zeros. For state-space representation, the model order
corresponds to the number of states. In some cases, such as for linear
ARX and state-space model structures, you can estimate the model order
from the data.

If the simple model structures do not produce good models, you can select more complex model structures by:

Specifying a higher model order for the same linear model structure. A higher model order increases the model flexibility for capturing complex phenomena. However, an unnecessarily high order can make the model less reliable.

Explicitly modeling the noise by including the *H**e*(*t*) term, as shown in the following equation.

*y*(*t*) = *G**u*(*t*) + *H**e*(*t*)

Here, *H* models the additive disturbance by treating the disturbance as the output of a linear system driven by a white noise source *e*(*t*).

Using a model structure that explicitly models the additive disturbance can help to improve the accuracy of the measured component *G*. Furthermore, such a model structure is useful when your main interest is using the model for predicting future response values.

Using a different linear model structure.

Using a nonlinear model structure.

Nonlinear models have more flexibility in capturing complex phenomena than linear models of similar orders. See Nonlinear Model Structures.

Ultimately, you choose the simplest model structure that provides the best fit to your measured data. For more information, see Estimating Linear Models Using Quick Start.

Regardless of the structure you choose for estimation, you can
simplify the model for your application needs. For example, you can
separate out the measured dynamics (*G*) from the
noise dynamics (*H*) to obtain a simpler model that
represents just the relationship between *y* and *u*.
You can also linearize a nonlinear model about an operating point.

A linear model is often sufficient to accurately describe the system dynamics and, in most cases, a best practice is to first try to fit linear models. If the linear model output does not adequately reproduce the measured output, you might need to use a nonlinear model.

You can assess the need to use a nonlinear model structure by plotting the response of the system to an input. If you notice that the responses differ depending on the input level or input sign, try using a nonlinear model. For example, if the output response to an input step up is faster than the response to a step down, you might need a nonlinear model.

Before building a nonlinear model of a system that you know is nonlinear, try transforming the input and output variables such that the relationship between the transformed variables is linear. For example, consider a system that has current and voltage as inputs to an immersion heater, and the temperature of the heated liquid as an output. The output depends on the inputs through the power of the heater, which is equal to the product of current and voltage. Instead of building a nonlinear model for this two-input and one-output system, you can create a new input variable by taking the product of the current and voltage and building a linear model that describes the relationship between power and temperature.

If you cannot determine variable transformations that yield a linear relationship between input and output variables, you can use nonlinear structures such as nonlinear ARX or Hammerstein-Wiener models. For a list of supported nonlinear model structures and when to use them, see Nonlinear Model Structures.

You can use the System Identification app or commands to estimate linear and nonlinear models of various structures. In most cases, you choose a model structure and estimate the model parameters using a single command.

Consider the mass-spring-damper system described in Dynamic Systems and Models. If you do not know the equation of motion of this system, you can use a black-box modeling approach to build a model. For example, you can estimate transfer functions or state-space models by specifying the orders of these model structures.

A transfer function is a ratio of polynomials:

$$G(s)=\frac{\left({b}_{0}+{b}_{1}s+{b}_{2}{s}^{2}+\mathrm{...}\right)}{\left(1+{f}_{1}s+{f}_{2}{s}^{2}+\mathrm{...}\right)}$$

For the mass-spring-damper system, this transfer function is

$$G(s)=\frac{1}{\left(m{s}^{2}+cs+k\right)}$$

which is a system with no zeros and 2 poles.

In discrete time, the transfer function of the mass-spring-damper system can be

$$G\left({z}^{-1}\right)=\frac{b{z}^{-1}}{\left(1+{f}_{1}{z}^{-1}+{f}_{2}{z}^{-2}\right)}$$

where the model orders correspond to the number of coefficients of the numerator and the denominator (`nb` = 1 and `nf` = 2) and the input-output delay equals the lowest order exponent of *z*^{–1} in the numerator (`nk` = 1).

In continuous time, you can build a linear transfer function model using the `tfest` command.

m = tfest(data,2,0)

Here, `data` is your measured input-output data, represented as an `iddata` object, and the model order is specified as the number of poles (2) and the number of zeros (0).

Similarly, you can build a discrete-time model with an Output-Error structure using the `oe` command.

m = oe(data,[1 2 1])

The model order is [`nb nf nk`] = [`1 2 1`]. Usually, you do not know the model orders in advance. Try several model order values until you find the orders that produce an acceptable model.

Alternatively, you can choose a state-space structure to represent the mass-spring-damper system and estimate the model parameters using the `ssest` or the `n4sid` command.

m = ssest(data,2)

Here, the second argument `2` represents the order, or the number of states in the model.

In black-box modeling, you do not need the equation of motion for the system — only a guess of the model orders.

For more information about building models, see Steps for Using the System Identification App and Model Estimation Commands.
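
The commands above can be tried end to end on synthetic data. The following sketch is an illustration under stated assumptions: the constants `m`, `c`, and `k`, the sample time, the random binary input, and the noise level are made up, and the data comes from a simulated model rather than from an experiment.

```matlab
% Requires System Identification Toolbox.
m = 1;  c = 0.5;  k = 4;              % assumed "true" constants
Ts = 0.1;  N = 600;

sysTrue = idtf(1, [m c k]);           % G(s) = 1/(m*s^2 + c*s + k)
sysd    = c2d(sysTrue, Ts, 'zoh');    % sampled model used to generate data

rng(0);                               % repeatable random numbers
u = idinput(N, 'rbs');                % random binary excitation signal
y = sim(sysd, u) + 0.005*randn(N,1);  % simulated output plus noise
data = iddata(y, u, Ts);

% Black-box estimation: only the model orders are specified.
mtf = tfest(data, 2, 0);              % continuous time: 2 poles, 0 zeros
moe = oe(data, [1 2 1]);              % discrete time: [nb nf nk]
mss = ssest(data, 2);                 % state space with 2 states

compare(data, mtf, moe, mss);         % plot measured vs. model responses
```

In practice you would replace the synthetic `data` object with your own measurements and treat the orders as initial guesses to revise.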

In some situations, you can deduce the model structure from physical principles. For example, the mathematical relationship between the input force and the resulting mass displacement in the spring-mass-damper system illustrated in Dynamic Systems and Models is well known. In state-space form, the model is given by

$$\begin{array}{l}\frac{dx}{dt}=Ax(t)+BF(t)\\ y(t)=Cx(t)\end{array}$$

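where *x*(*t*) = [*y*(*t*); *v*(*t*)] is the state vector. The coefficients *A*, *B*, and *C* are functions of the model parameters:

A = [0 1; –*k*/*m* –*c*/*m*]

B = [0; 1/*m*]

C = [1 0]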

Here, you fully know the model structure but do not know the values of its
parameters—*m*, *c*, and *k*.

In the grey-box approach, you use the data to estimate the values of the unknown parameters of your model structure. You specify the model structure by a set of differential or difference equations in MATLAB and provide some initial guess for the unknown parameters specified.

In general, you build grey-box models by:

Creating a template model structure.

Configuring the model parameters with initial values and constraints (if any).

Applying an estimation method to the model structure and computing the model parameter values.

The following table summarizes the ways you can specify a grey-box model structure.

Grey-Box Structure Representation | Learn More |
---|---|
Represent the state-space model structure as a structured `idss` model object and estimate the state-space matrices *A*, *B*, and *C*. You can compute the parameter values, such as *m*, *c*, and *k*, from the estimated matrices *A* and *B*. | |
Represent the state-space model structure as an `idgrey` model object. You can directly estimate the values of parameters *m*, *c*, and *k*. | Grey-Box Model Estimation |
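
The `idgrey` route can be sketched as follows for the mass-spring-damper system. Everything specific here is an assumption for illustration: the function name `msdModel`, the "true" constants used to generate synthetic data, the initial guesses, and the lower bounds. The local function must appear at the end of a script file (supported in R2016b and later); alternatively, place it in its own file `msdModel.m`.

```matlab
% Requires System Identification Toolbox.
mTrue = 1;  cTrue = 0.5;  kTrue = 4;      % assumed values used to make data
Ts = 0.1;  N = 600;

rng(0);
u = idinput(N, 'rbs');
sysd = c2d(idtf(1, [mTrue cTrue kTrue]), Ts, 'zoh');
data = iddata(sim(sysd, u) + 0.005*randn(N,1), u, Ts);

% 1. Template structure with initial parameter guesses ('c' = continuous time).
init = idgrey(@msdModel, {'m', 2; 'c', 1; 'k', 2}, 'c');

% 2. Constraints: mass, damping, and stiffness cannot be negative.
for i = 1:3
    init.Structure.Parameters(i).Minimum = 0;
end

% 3. Estimate the parameter values from the data.
est = greyest(data, init);
p   = getpvec(est);                       % estimated [m; c; k]

function [A, B, C, D] = msdModel(m, c, k, Ts)
    % State-space matrices as functions of the physical parameters.
    % Ts is required by the idgrey interface and is unused here.
    A = [0 1; -k/m -c/m];
    B = [0; 1/m];
    C = [1 0];
    D = 0;
end
```

Whether the estimates land near the values used to generate the data depends on the initial guesses, the excitation, and the noise level.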

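After you estimate the model, you can evaluate the model quality by:

Comparing the model response to the measured response

Analyzing residuals

Analyzing model uncertainty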

Ultimately, you must assess the quality of your model based on whether the model adequately addresses the needs of your application. For information about other available model analysis techniques, see Model Analysis.

If you do not get a satisfactory model, you can iteratively improve your results by trying a different model structure, changing the estimation algorithm settings, or performing additional data processing. If these changes do not improve your results, you might need to revisit your experimental design and data gathering procedures.

Typically, you evaluate the quality of a model by comparing the model response to the measured output for the same input signal.

Suppose you use a black-box modeling approach to create dynamic models of the spring-mass-damper system. You try various model structures and orders, such as:

model1 = arx(data, [2 1 1]);

model2 = n4sid(data, 3);

You can simulate these models with a particular input and compare their responses against the measured values of the displacement for the same input applied to the real system. The following figure compares the simulated and measured responses for a step input.

The figure indicates that `model2` is better than `model1` because `model2` better fits the data (83% vs. 65%).

The fit percentage indicates the agreement between the model response and the measured output: 100 means a perfect fit, and 0 indicates a poor fit (that is, the model output has the same fit to the measured output as the mean of the measured output).

For more information, see topics on the Compare Output with Measured Data page.
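
The fit percentage is based on the normalized root mean squared error between the measured and model outputs. The following base MATLAB sketch computes it for a single output channel; the data values are made up, and `fitPercent` is a hypothetical helper name, not a toolbox function.

```matlab
% Fit percentage: 100 * (1 - ||ymeas - ymodel|| / ||ymeas - mean(ymeas)||)
fitPercent = @(ymeas, ymodel) ...
    100 * (1 - norm(ymeas - ymodel) / norm(ymeas - mean(ymeas)));

ymeas = [1; 2; 3; 4];                                 % made-up measurements

fitPerfect = fitPercent(ymeas, ymeas);                % model equals data
fitMean    = fitPercent(ymeas, mean(ymeas)*ones(4,1)); % model equals the mean
fitWorse   = fitPercent(ymeas, [4; 3; 2; 1]);         % model worse than the mean
```

Here `fitPerfect` is 100 and `fitMean` is 0. The third case shows that the value can become negative when the model output is further from the data than the mean of the data is.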

The System Identification Toolbox software lets you perform residual analysis to assess the model quality. Residuals represent the portion of the output data not explained by the estimated model. A good model has residuals uncorrelated with past inputs.

For more information, see the topics on the Residual Analysis page.
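
The idea behind this check can be sketched in base MATLAB by computing the sample cross-correlation between residuals and past inputs. The signals are synthetic assumptions: `resGood` imitates residuals of an adequate model, and `resBad` imitates residuals that still contain a contribution from *u*(*t*–1). The bound 2.58/sqrt(*N*) is a rough 99% band that assumes white, independent sequences; the toolbox computes its own confidence regions.

```matlab
rng(1);                                  % fixed seed for repeatability
N = 2000;
u = randn(N,1);                          % made-up input
e = 0.1*randn(N,1);                      % white noise

resGood = e;                             % unrelated to the input
resBad  = e + 0.3*[0; u(1:end-1)];       % still depends on u(t-1)

maxLag = 10;
bound  = 2.58/sqrt(N);                   % rough 99% band (assumption)

z  = @(x) (x - mean(x)) / std(x, 1);     % zero mean, unit variance
xc = @(r) arrayfun(@(L) mean(z(r(L+1:N)) .* z(u(1:N-L))), 0:maxLag);

xcGood = xc(resGood);                    % correlations at lags 0..maxLag
xcBad  = xc(resBad);

lagsOutside = find(abs(xcBad) > bound) - 1;   % lags exceeding the band
```

For `resBad`, lag 1 stands out clearly, which signals that the model has not captured how the input one sample earlier affects the output.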

When you estimate the model parameters from data, you obtain their nominal values that are accurate within a confidence region. The size of this region is determined by the values of the parameter uncertainties computed during estimation. The magnitude of the uncertainties provides a measure of the reliability of the model. Large uncertainties in parameters can result from unnecessarily high model orders, inadequate excitation levels in the input data, and a poor signal-to-noise ratio in measured data.

You can compute and visualize the effect of parameter uncertainties on the model response in the time and frequency domains using pole-zero maps, Bode response plots, and step response plots. For example, in the following Bode plot of an estimated model, the shaded regions represent the uncertainty in amplitude and phase of the frequency response of the model, computed using the uncertainty in the parameters. The plot shows that the uncertainty is low only in the 5 to 50 rad/s frequency range, which indicates that the model is reliable only in this frequency range.

For more information, see Compute Model Uncertainty.
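
A minimal sketch of inspecting parameter uncertainty follows. It estimates a model from synthetic data, so the system constants, input, and noise level are assumptions for illustration, and the choice of 3 standard deviations for the confidence region is arbitrary.

```matlab
% Requires System Identification Toolbox.
rng(0);
Ts = 0.1;  N = 600;
u = idinput(N, 'rbs');
sysd = c2d(idtf(1, [1 0.5 4]), Ts, 'zoh');     % assumed "true" system
y = sim(sysd, u) + 0.005*randn(N,1);
sys = tfest(iddata(y, u, Ts), 2, 0);

% Parameter values and their 1-standard-deviation uncertainties
[p, pSd] = getpvec(sys);

% Bode plot with a shaded confidence region of 3 standard deviations
h = bodeplot(sys);
showConfidence(h, 3);
```

Wide shaded regions at some frequencies suggest that the data carried little information there, for example because the input had little power in that range.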

The System Identification Toolbox documentation provides you with the necessary information to use this product. Additional resources are available to help you learn more about specific aspects of system identification theory and applications.

The following book describes methods for system identification and physical modeling:

Ljung, Lennart, and Torkel Glad. *Modeling of Dynamic Systems*. Prentice Hall Information and System Sciences Series. Englewood Cliffs, NJ: PTR Prentice Hall, 1994.

These books provide detailed information about system identification theory and algorithms:

Ljung, Lennart. *System Identification: Theory for the User*. Second edition. Prentice Hall Information and System Sciences Series. Upper Saddle River, NJ: PTR Prentice Hall, 1999.

Söderström, Torsten, and Petre Stoica. *System Identification*. Prentice Hall International Series in Systems and Control Engineering. New York: Prentice Hall, 1989.

For information about working with frequency-domain data, see the following book:

Pintelon, Rik, and Johan Schoukens. *System Identification: A Frequency Domain Approach*. Hoboken, NJ: John Wiley & Sons, 2001. https://doi.org/10.1002/0471723134.

For information on nonlinear identification, see the following references:

Sjöberg, Jonas, Qinghua Zhang, Lennart Ljung, Albert Benveniste, Bernard Delyon, Pierre-Yves Glorennec, Håkan Hjalmarsson, and Anatoli Juditsky. “Nonlinear Black-Box Modeling in System Identification: A Unified Overview.” *Automatica* 31, no. 12 (December 1995): 1691–1724. https://doi.org/10.1016/0005-1098(95)00120-8.

Juditsky, Anatoli, Håkan Hjalmarsson, Albert Benveniste, Bernard Delyon, Lennart Ljung, Jonas Sjöberg, and Qinghua Zhang. “Nonlinear Black-Box Models in System Identification: Mathematical Foundations.” *Automatica* 31, no. 12 (December 1995): 1725–50. https://doi.org/10.1016/0005-1098(95)00119-1.

Zhang, Qinghua, and Albert Benveniste. “Wavelet Networks.” *IEEE Transactions on Neural Networks* 3, no. 6 (November 1992): 889–98. https://doi.org/10.1109/72.165591.

Zhang, Qinghua. “Using Wavelet Network in Nonparametric Estimation.” *IEEE Transactions on Neural Networks* 8, no. 2 (March 1997): 227–36. https://doi.org/10.1109/72.557660.

For more information about systems and signals, see the following book:

Oppenheim, Alan V., and Alan S. Willsky. *Signals and Systems*. Upper Saddle River, NJ: PTR Prentice Hall, 1985.

The following textbook describes numerical techniques for parameter estimation using criterion minimization:

Dennis, J. E., Jr., and Robert B. Schnabel. *Numerical Methods for Unconstrained Optimization and Nonlinear Equations*. Upper Saddle River, NJ: PTR Prentice Hall, 1983.