# regARIMA

Create regression model with ARIMA time series errors

## Description

The `regARIMA` function returns a `regARIMA` object specifying the functional form and storing the parameter values of a regression model with ARIMA time series errors for a univariate response process *y*_{t}.

$$\begin{array}{c}{y}_{t}=c+{X}_{t}\beta +{u}_{t}\\ a\left(L\right)A\left(L\right){\left(1-L\right)}^{D}\left(1-{L}^{s}\right){u}_{t}=b\left(L\right)B\left(L\right){\epsilon}_{t},\end{array}$$

Because they completely specify the model structure, the key components of a `regARIMA` object are the:

- Regression model coefficients *c* and *β*
- Polynomial degrees of the ARIMA disturbances *u*_{t}, for example, the AR polynomial degree *p* and the degree of integration *D*

Given only polynomial degrees, the regression model contains only a constant. All parameters, such as the model constant, the error model coefficients, and the innovation-distribution parameters, are unknown and estimable unless you specify their values. `regARIMA` determines the number of coefficients in the regression model by the number of variables in the supplied predictor data or by other specifications.

To estimate a model containing unknown parameter values, pass the model and data to the `estimate` object function. To work with an estimated or fully specified `regARIMA` object, pass it to an object function.

Alternatively, you can:

- Create and work with `regARIMA` model objects interactively by using Econometric Modeler.
- Create a standard ARIMA model containing exogenous predictors (ARIMAX). For more details, see the `arima` function and Alternative ARIMA Model Representations.
- Create a Bayesian linear regression model by using the `bayeslm` function.
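The create-then-estimate workflow described above can be sketched roughly as follows. The coefficient values, the sample size, and the variable names (`TrueMdl`, `X`, `y`, `EstMdl`) are assumptions chosen only for illustration; the data are simulated, not real.

```matlab
rng(1)  % fix the random seed so the sketch is reproducible

% Fully specified model with illustrative values: ARMA(2,1) errors, two predictors
TrueMdl = regARIMA(Intercept=2,Beta=[1.5 0.2],AR={0.2 0.3},MA={0.1},Variance=0.5);

% Hypothetical predictor data and one simulated response path
T = 200;
X = randn(T,2);
y = simulate(TrueMdl,T,X=X);

% Template with unknown (NaN) parameters; estimate sizes Beta from the columns of X
Mdl = regARIMA(2,0,1);
EstMdl = estimate(Mdl,y,X=X,Display="off");

summarize(EstMdl)
```

The template carries only the polynomial degrees, so every coefficient in `EstMdl` comes from the fit.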

## Creation

### Description

`Mdl = regARIMA` creates a regression model containing degree 0 ARIMA disturbances. The regression model contains an intercept; the software determines the number of regression coefficients when you fit the model to data by using `estimate`. The innovations are iid Gaussian random variables with a mean of 0 and unknown variance.

`Mdl = regARIMA(p,D,q)` creates a regression model with ARIMA(`p`,`D`,`q`) disturbances. The disturbance model contains nonseasonal AR polynomial lags from 1 through `p`, a degree `D` nonseasonal integration polynomial, and nonseasonal MA polynomial lags from 1 through `q`. The regression model contains an intercept; the software determines the number of regression coefficients when you fit the model to data by using `estimate`. The innovations are iid Gaussian random variables with a mean of 0 and unknown variance.

This shorthand syntax provides an easy way to create a model template in which you specify the degrees of the nonseasonal polynomials explicitly. The model template is suited for unrestricted parameter estimation. After you create a model, you can alter property values using dot notation.

`Mdl = regARIMA(Name=Value)` sets properties and polynomial lags using name-value arguments. For example, `regARIMA(ARLags=[1 4],AR={0.5 -0.1})` creates a regression model containing an unknown model intercept and innovations variance, and AR(4) disturbances, where the lag 1 nonseasonal AR coefficient is `0.5` and the lag 4 nonseasonal AR coefficient is `-0.1`.

This longhand syntax allows you to create more flexible models. For example, you can create a regression model with seasonal errors by using only longhand syntax. `regARIMA` infers all disturbance model polynomial degrees from the properties that you set. Therefore, property values that correspond to polynomial degrees must be consistent with each other.
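As a small sketch of how the longhand syntax infers polynomial degrees, the following creates the model from the example above and inspects what is stored. The variable names `p` and `arCoeffs` are illustrative only.

```matlab
% Longhand syntax: AR coefficients at lags 1 and 4 only
Mdl = regARIMA(ARLags=[1 4],AR={0.5 -0.1});

p = Mdl.P            % compound AR degree, inferred from the largest lag
arCoeffs = Mdl.AR    % length-4 cell vector; the unspecified lags 2 and 3 hold 0
```

Because the degree follows from the largest lag, a mismatch between the lengths of `AR` and `ARLags` is an inconsistency that `regARIMA` cannot resolve.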

### Input Arguments

The shorthand syntax provides an easy way for you to create model templates of regression models with nonseasonal ARIMA errors. Model templates are suitable for unrestricted parameter estimation. For example, to create a regression model with ARMA(2,1) errors containing an unknown model intercept and innovations variance, enter:

```
Mdl = regARIMA(2,0,1);
```

`p`

— Nonseasonal autoregressive polynomial degree

nonnegative integer

Nonseasonal autoregressive polynomial degree for the error model, specified as a nonnegative integer.

**Data Types: **`double`

`D`

— Degree of nonseasonal integration

nonnegative integer

Degree of nonseasonal integration (the degree of the nonseasonal differencing polynomial) for the error model, specified as a nonnegative integer. The `D` input argument sets the property D.

**Data Types: **`double`

`q`

— Nonseasonal moving average polynomial degree

nonnegative integer

Nonseasonal moving average polynomial degree for the error model, specified as a nonnegative integer.

**Data Types: **`double`

**Name-Value Arguments**

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

*Before R2021a, use commas to separate each name and value, and enclose* `Name` *in quotes.*

The longhand syntax enables you to create seasonal error models or models in which some or all coefficients are known. During estimation, `estimate` imposes equality constraints on any known parameters.

**Example: **`regARIMA(ARLags=[1 4],AR={0.5 -0.1})` creates a regression model containing an unknown model intercept and innovations variance, and AR(4) disturbances, where the lag 1 nonseasonal AR coefficient is `0.5` and the lag 4 nonseasonal AR coefficient is `-0.1`. Symbolically, the nonseasonal AR polynomial is $$1-0.5{L}^{1}+0.1{L}^{4}$$.

`ARLags`

— Lags associated with nonseasonal AR polynomial coefficients

`1:numel(AR)` (default) | numeric vector of unique positive integers

Lags associated with the nonseasonal AR polynomial coefficients for the error model *u*_{t}, specified as a numeric vector of unique positive integers. The maximum lag is *p*.

`AR{j}` is the coefficient of lag `ARLags(j)`, where `AR` is the value of the property AR.

**Example: **`ARLags=4` specifies the nonseasonal AR polynomial $$1-{\varphi}_{4}{L}^{4}$$.

**Example: **`ARLags=1:4` specifies the nonseasonal AR polynomial $$1-{\varphi}_{1}{L}^{1}-{\varphi}_{2}{L}^{2}-{\varphi}_{3}{L}^{3}-{\varphi}_{4}{L}^{4}$$.

**Example: **`ARLags=[1 4]` specifies the nonseasonal AR polynomial $$1-{\varphi}_{1}{L}^{1}-{\varphi}_{4}{L}^{4}.$$

**Data Types: **`double`

`MALags`

— Lags associated with nonseasonal MA polynomial coefficients

`1:numel(MA)` (default) | numeric vector of unique positive integers

Lags associated with the nonseasonal MA polynomial coefficients for the error model *u*_{t}, specified as a numeric vector of unique positive integers. The maximum lag is *q*.

`MA{j}` is the coefficient of lag `MALags(j)`, where `MA` is the value of the property MA.

**Example: **`MALags=3` specifies the nonseasonal MA polynomial $$1+{\theta}_{3}{L}^{3}$$.

**Example: **`MALags=1:3` specifies the nonseasonal MA polynomial $$1+{\theta}_{1}{L}^{1}+{\theta}_{2}{L}^{2}+{\theta}_{3}{L}^{3}.$$

**Example: **`MALags=[1 3]` specifies the nonseasonal MA polynomial $$1+{\theta}_{1}{L}^{1}+{\theta}_{3}{L}^{3}$$.

**Data Types: **`double`

`SARLags`

— Lags associated with seasonal AR polynomial coefficients

`1:numel(SAR)` (default) | numeric vector of unique positive integers

Lags associated with the seasonal AR polynomial coefficients for the error model *u*_{t}, specified as a numeric vector of unique positive integers. The maximum lag is *p*_{s}.

`SAR{j}` is the coefficient of lag `SARLags(j)`, where `SAR` is the value of the property SAR.

Specify `SARLags` as the periodicity of the observed data, not as multiples of the Seasonality property. This convention does not conform to standard Box and Jenkins [1] notation, but it is more flexible for incorporating multiplicative seasonality.

**Example: **`SARLags=[4 8]` specifies the seasonal AR polynomial $$1-{\Phi}_{4}{L}^{4}-{\Phi}_{8}{L}^{8}.$$

**Data Types: **`double`

`SMALags`

— Lags associated with seasonal MA polynomial coefficients

`1:numel(SMA)` (default) | numeric vector of unique positive integers

Lags associated with the seasonal MA polynomial coefficients for the error model *u*_{t}, specified as a numeric vector of unique positive integers. The maximum lag is *q*_{s}.

`SMA{j}` is the coefficient of lag `SMALags(j)`, where `SMA` is the value of the property SMA.

Specify `SMALags` as the periodicity of the observed data, not as multiples of the Seasonality property. This convention does not conform to standard Box and Jenkins [1] notation, but it is more flexible for incorporating multiplicative seasonality.

**Example: **`SMALags=4` specifies the seasonal MA polynomial $$1+{\Theta}_{4}{L}^{4}.$$

**Data Types: **`double`

**Note**

Polynomial degrees are not estimable. If you do not specify a polynomial degree, or `regARIMA` cannot infer it from other specifications, `regARIMA` does not include the polynomial in the model.
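A brief sketch of specifying seasonal lags directly, assuming quarterly data (the periodicity of 4 is an assumption made for illustration). Only the lags are given, so the corresponding coefficients are unknown.

```matlab
% Seasonal AR terms at lags 4 and 8, seasonal MA term at lag 4
Mdl = regARIMA(SARLags=[4 8],SMALags=4);

Mdl.SAR   % NaN at positions 4 and 8, zeros elsewhere
Mdl.P     % p + D + p_s + s = 0 + 0 + 8 + 0
Mdl.Q     % q + q_s = 0 + 4
```

No nonseasonal polynomial is specified here, so, in line with the note above, the model does not include one.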

## Properties

You can set writable property values when you create the model object by using name-value argument syntax, or after you create the model object by using dot notation. For example, to create the fully specified regression model with ARMA(2,1) disturbances

$$\begin{array}{l}{y}_{t}=1+3{x}_{1}+5{x}_{2}+{u}_{t}\\ {u}_{t}=0.3{u}_{t-1}-0.15{u}_{t-2}+{\epsilon}_{t}+0.2{\epsilon}_{t-1},\end{array}$$

enter:

```
Mdl = regARIMA(Intercept=1,Beta=[3; 5],AR={0.3 -0.15},MA=0.2);
Mdl.Variance = 1;
```

**Note**

- `NaN`-valued properties indicate estimable parameters. Numeric properties indicate equality constraints on parameters during model estimation. Coefficient vectors can contain both numeric and `NaN`-valued elements.
- You can specify polynomial coefficients as vectors in any orientation, but `regARIMA` stores them as row vectors.
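To make the distinction between estimable and constrained parameters concrete, here is a minimal sketch that mixes the two in one model. The numeric values and the helper variable `isFixedAR` are illustrative assumptions.

```matlab
Mdl = regARIMA(2,0,1);    % template: every parameter starts as NaN (estimable)

Mdl.AR = {0.3 NaN};       % hold the lag 1 AR coefficient at 0.3 during estimation
Mdl.Variance = 1;         % hold the innovations variance at 1

% Logical mask of the AR coefficients that estimate would treat as known
isFixedAR = ~isnan(cell2mat(Mdl.AR))
```

The intercept and the MA coefficient remain `NaN`, so `estimate` would still fit them to data.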

## Regression Model Properties

`Intercept`

— Regression model intercept *c*

`NaN` (default) | numeric scalar

Regression model intercept *c*, specified as a numeric scalar.

**Example: **`Intercept=1`

**Data Types: **`double`

`Beta`

— Regression model coefficients *β*

empty row vector `[]` (default) | numeric vector

Regression component coefficients *β* associated with predictor variables *x*_{t}, specified as a numeric vector.

The default indicates one of the following conditions:

- `estimate` infers the size of `Beta` from the number of columns of the specified predictor data `X`. Therefore, if you plan to fit all regression coefficients to data, you do not need to specify `Beta`.
- The model does not include regression coefficients.

**Example: **`Beta=[0.5 NaN 3]` specifies three regression coefficients. During estimation, `estimate` fixes *β*_{1} to 0.5 and *β*_{3} to 3, and it fits *β*_{2} to the data associated with the second predictor variable.

**Data Types: **`double`
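The partially constrained `Beta` from the example above can be tried out along these lines. The simulated predictors and response, and the true value of -2 for the second coefficient, are assumptions made purely for illustration.

```matlab
rng(1)  % reproducible illustration

% Fix beta_1 = 0.5 and beta_3 = 3; leave beta_2 free
Mdl = regARIMA(Beta=[0.5 NaN 3]);
isFree = isnan(Mdl.Beta)

% Hypothetical data with three predictors
X = randn(150,3);
y = 1 + X*[0.5; -2; 3] + randn(150,1);

EstMdl = estimate(Mdl,y,X=X,Display="off");
EstMdl.Beta   % first and third entries keep their fixed values
```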

## Error Model Properties

`P`

— Compound AR polynomial degree

nonnegative integer

This property is read-only.

Compound AR polynomial degree of the error model, specified as a nonnegative integer.

`P` does not necessarily conform to standard Box and Jenkins notation [1] because `P` captures the degrees of the nonseasonal and seasonal AR polynomials (properties `AR` and `SAR`, respectively), nonseasonal integration (property `D`), and seasonality (property `Seasonality`). Explicitly, `P` = *p* + *D* + *p*_{s} + *s*. `P` conforms to Box and Jenkins notation for models without integration or a seasonal AR component (`D` = `0` and `SAR` = `{}`).

`P` specifies the number of lagged observations required to initialize the AR components of the model.

**Data Types: **`double`
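The formula for `P` can be checked on a small model. This sketch reuses the AR, integration, and seasonal settings from the SARIMA example on this page; the scalar variables `p`, `D`, `ps`, and `s` are illustrative names for the terms in the formula.

```matlab
Mdl = regARIMA(AR=0.2,D=1,SAR={0.5 0.2},SARLags=[4 8],Seasonality=4);

p  = 1;   % nonseasonal AR degree
D  = 1;   % degree of nonseasonal integration
ps = 8;   % seasonal AR degree (largest seasonal AR lag)
s  = 4;   % seasonality

isequal(Mdl.P, p + D + ps + s)   % compares the stored degree with the formula
```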

`Q`

— Compound MA polynomial degree

nonnegative integer

This property is read-only.

Compound MA polynomial degree of the error model, specified as a nonnegative integer.

`Q` does not necessarily conform to standard Box and Jenkins notation [1] because `Q` captures the degrees of the nonseasonal and seasonal MA polynomials (properties `MA` and `SMA`, respectively). Explicitly, `Q` = *q* + *q*_{s}. `Q` conforms to Box and Jenkins notation for models without a seasonal MA component (`SMA` = `{}`).

`Q` specifies the number of lagged innovations required to initialize the MA components of the model.

**Data Types: **`double`
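A similar check works for `Q`. The MA and seasonal MA settings below are borrowed from the SARIMA example on this page and are used here only as an illustration.

```matlab
Mdl = regARIMA(MA=0.1,SMA={0.05 0.01},SMALags=[4 8]);

q  = 1;   % nonseasonal MA degree
qs = 8;   % seasonal MA degree (largest seasonal MA lag)

isequal(Mdl.Q, q + qs)   % compares the stored degree with the formula
```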

`AR`

— Nonseasonal AR polynomial coefficients *ϕ*

cell vector | empty cell vector `{}`

Nonseasonal AR polynomial coefficients *ϕ* for the error model *u*_{t}, specified as a cell vector. Cells contain numeric scalars or `NaN` values. A fully specified nonseasonal AR polynomial must be stable.

Coefficient signs correspond to the model expressed in difference-equation notation. For example, for the nonseasonal AR polynomial $$\varphi \left(L\right)=1-0.5L+0.1{L}^{2},$$ specify `AR={0.5 -0.1}`.

If you do not set the `ARLags` name-value argument, `AR{j}` is the coefficient of lag `j`, `j` = 1,…,*p*, where *p* = `numel(AR)`.

Otherwise, if `ARLags` = `arlags`, with *p* = `max(arlags)`, the following conditions apply:

- The lengths of `AR` and `arlags` must be equal.
- `AR{j}` is the coefficient of lag `arlags(j)`, for each `j`.
- `regARIMA` stores `AR` as a length *p* cell vector. All cells that do not correspond to lags in `arlags` contain `0`.

The default value of `AR` depends on other specifications:

- If you use the shorthand syntax to specify `p` > 0, `AR` is a length `p` cell vector, where each cell contains a `NaN` value.
- If you specify `ARLags`, `AR` is a length *p* cell vector. `AR{j}` = `NaN` for each lag `arlags(j)`. All other cells contain `0`.
- Otherwise, `AR` is an empty cell vector `{}`, meaning the model does not contain a nonseasonal AR polynomial.

The coefficients in `AR` correspond to coefficients in an underlying `LagOp` lag operator polynomial, and they are subject to a near-zero tolerance exclusion test. If a coefficient is `1e-12` or below, `regARIMA` excludes that coefficient and its corresponding lag in `ARLags` from the model.

**Example: **`AR={0.8}` sets the only AR lag coefficient associated with lag `ARLags(1)` to `0.8`.

**Example: **`regARIMA(AR={0.2 0 0.1})` sets the error model, in difference-equation form, to $${u}_{t}=0.2{u}_{t-1}+0.1{u}_{t-3}+{\epsilon}_{t}$$.

**Example: **`regARIMA(AR={NaN -0.1},ARLags=[4 8])` sets the AR lag polynomial to $$1-{\varphi}_{4}{L}^{4}+0.1{L}^{8}$$, where *ϕ*_{4} is unknown and estimable.

**Data Types: **`cell`
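The sign convention and the stability requirement can be explored with a `LagOp` polynomial built by hand. This is only a sketch: it constructs a separate `LagOp` object from the stored coefficients rather than accessing whatever polynomial `regARIMA` uses internally, and the names `arPoly` and `tf` are illustrative.

```matlab
% AR={0.5 -0.1} in difference-equation form corresponds to 1 - 0.5L + 0.1L^2
Mdl = regARIMA(AR={0.5 -0.1});

% Build the same lag operator polynomial explicitly (coefficients at lags 0, 1, 2)
arPoly = LagOp({1, -Mdl.AR{1}, -Mdl.AR{2}});

tf = isStable(arPoly)   % logical indicator of whether the polynomial is stable
```

Note the sign flip: the cell values in `AR` enter the lag polynomial with opposite signs.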

`MA`

— Nonseasonal MA polynomial coefficients *θ*

cell vector | empty cell vector `{}`

Nonseasonal MA polynomial coefficients *θ* for the error model *u*_{t}, specified as a cell vector. Cells contain numeric scalars or `NaN` values. A fully specified nonseasonal MA polynomial must be invertible.

If you do not set the `MALags` name-value argument, `MA{j}` is the coefficient of lag `j`, `j` = 1,…,*q*, where *q* = `numel(MA)`.

Otherwise, if `MALags` = `malags`, with *q* = `max(malags)`, the following conditions apply:

- The lengths of `MA` and `malags` must be equal.
- `MA{j}` is the coefficient of lag `malags(j)`, for each `j`.
- `regARIMA` stores `MA` as a length *q* cell vector. All cells that do not correspond to lags in `malags` contain `0`.

The default value of `MA` depends on other specifications:

- If you use the shorthand syntax to specify `q` > 0, `MA` is a length `q` cell vector, where each cell contains a `NaN` value.
- If you specify `MALags`, `MA` is a length *q* cell vector. `MA{j}` = `NaN` for each lag `malags(j)`. All other cells contain `0`.
- Otherwise, `MA` is an empty cell vector `{}`, meaning the error model does not contain a nonseasonal MA polynomial.

The coefficients in `MA` correspond to coefficients in an underlying `LagOp` lag operator polynomial, and they are subject to a near-zero tolerance exclusion test. If a coefficient is `1e-12` or below, `regARIMA` excludes that coefficient and its corresponding lag in `MALags` from the model.

**Example: **`MA=0.8` sets the only MA lag coefficient associated with lag `MALags(1)` to `0.8`.

**Example: **`regARIMA(MA={0.2 0.1})` sets the error model to $${u}_{t}={\epsilon}_{t}+0.2{\epsilon}_{t-1}+0.1{\epsilon}_{t-2}.$$

**Example: **`regARIMA(MA={NaN -0.1},MALags=[4 8])` sets the MA lag polynomial to $$1+{\theta}_{4}{L}^{4}-0.1{L}^{8}$$, where *θ*_{4} is unknown and estimable.

**Data Types: **`cell`
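The storage rules for `MA` can be inspected on the last example above. This is a minimal sketch; it only reads back properties after construction.

```matlab
Mdl = regARIMA(MA={NaN -0.1},MALags=[4 8]);

Mdl.Q        % largest MA lag
Mdl.MA{4}    % NaN: unknown and estimable
Mdl.MA{8}    % -0.1: held fixed during estimation
Mdl.MA{1}    % 0: lag 1 is not part of the polynomial
```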

`SAR`

— Seasonal AR polynomial coefficients Φ

cell vector | empty cell vector `{}`

Seasonal AR polynomial coefficients Φ for the error model *u*_{t}, specified as a cell vector. Cells contain numeric scalars or `NaN` values. A fully specified seasonal AR polynomial must be stable.

Coefficient signs correspond to the model expressed in difference-equation notation. For example, for the seasonal AR polynomial $$\Phi \left(L\right)=1-0.5{L}^{4}+0.1{L}^{8},$$ specify `SAR={0.5 -0.1}` together with `SARLags=[4 8]`.

If you do not set the `SARLags` name-value argument, `SAR{j}` is the coefficient of lag `j`, `j` = 1,…,*p*_{s}, where *p*_{s} = `numel(SAR)`.

Otherwise, if `SARLags` = `sarlags`, with *p*_{s} = `max(sarlags)`, the following conditions apply:

- The lengths of `SAR` and `sarlags` must be equal.
- `SAR{j}` is the coefficient of lag `sarlags(j)`, for each `j`.
- `regARIMA` stores `SAR` as a length *p*_{s} cell vector. All cells that do not correspond to lags in `sarlags` contain `0`.

The default value of `SAR` depends on the value of `SARLags`:

- If you specify `SARLags`, `SAR` is a length *p*_{s} cell vector. `SAR{j}` = `NaN` for each lag `SARLags(j)`. All other cells contain `0`.
- Otherwise, `SAR` is an empty cell vector `{}`, meaning the error model does not contain a seasonal AR polynomial.

The coefficients in `SAR` correspond to coefficients in an underlying `LagOp` lag operator polynomial, and they are subject to a near-zero tolerance exclusion test. If a coefficient is `1e-12` or below, `regARIMA` excludes that coefficient and its corresponding lag in `SARLags` from the model.

**Example: **`SAR=0.8` sets the only SAR lag coefficient associated with lag `SARLags(1)` to `0.8`.

**Example: **`regARIMA(SAR={0.2 0.1},Seasonality=4)` sets the error model to $$\left(1-0.2{L}^{1}-0.1{L}^{2}\right)\left(1-{L}^{4}\right){u}_{t}={\epsilon}_{t}$$.

**Example: **`regARIMA(SAR={NaN -0.1},SARLags=[4 8],Seasonality=4)` sets the product of the seasonal AR and seasonal differencing polynomials to $$\left(1-{\Phi}_{4}{L}^{4}+0.1{L}^{8}\right)\left(1-{L}^{4}\right)$$, where Φ_{4} is unknown and estimable.

**Data Types: **`cell`

`SMA`

— Seasonal MA polynomial coefficients

cell vector | empty cell vector `{}`

Seasonal MA polynomial coefficients for the error model, specified as a cell vector. Cells contain numeric scalars or `NaN` values. A fully specified seasonal MA polynomial must be invertible.

If you do not set the `SMALags` name-value argument, `SMA{j}` is the coefficient of lag `j`, `j` = 1,…,*q*_{s}, where *q*_{s} = `numel(SMA)`.

Otherwise, if `SMALags` = `smalags`, with *q*_{s} = `max(smalags)`, the following conditions apply:

- The lengths of `SMA` and `SMALags` must be equal.
- `SMA{j}` is the coefficient of lag `smalags(j)`, for each `j`.
- `regARIMA` stores `SMA` as a length *q*_{s} cell vector. All cells that do not correspond to lags in `smalags` contain `0`.

The default value of `SMA` depends on other specifications:

- If you specify `SMALags`, `SMA` is a length *q*_{s} cell vector. `SMA{j}` = `NaN` for each lag `SMALags(j)`. All other cells contain `0`.
- Otherwise, `SMA` is an empty cell vector `{}`, meaning the error model does not contain a seasonal MA polynomial.

The coefficients in `SMA` correspond to coefficients in an underlying `LagOp` lag operator polynomial, and they are subject to a near-zero tolerance exclusion test. If a coefficient is `1e-12` or below, `regARIMA` excludes that coefficient and its corresponding lag in `SMALags` from the model.

**Example: **`SMA=0.8` sets the only SMA lag coefficient associated with lag `SMALags(1)` to `0.8`.

**Example: **`regARIMA(SMA={0.2 0.1},Seasonality=4)` specifies the error model $$\left(1-{L}^{4}\right){u}_{t}=\left(1+0.2L+0.1{L}^{2}\right){\epsilon}_{t}.$$

**Example: **`regARIMA(SMALags=[1 4],SMA={0.2 0.1},Seasonality=4)` specifies the error model $$\left(1-{L}^{4}\right){u}_{t}=\left(1+0.2L+0.1{L}^{4}\right){\epsilon}_{t}.$$

**Data Types: **`cell`

`D`

— Degree of nonseasonal integration

`0`

(default) | nonnegative integer

Degree of nonseasonal integration, or the degree of the nonseasonal differencing polynomial, for the error model, specified as a nonnegative integer.

If you use shorthand syntax to create `Mdl`, the input `D` sets `D`.

**Example: **`D=1`

**Example: **`regARIMA(0,1,2)` sets `D` to `1`.

**Data Types: **`double`

`Seasonality`

— Degree of seasonal differencing polynomial

`0` (default) | nonnegative integer

Degree of the seasonal differencing polynomial *s* for the error model, specified as a nonnegative integer.

**Example: **`Seasonality=12` specifies monthly periodicity.

**Data Types: **`double`

`Variance`

— Variance *σ*^{2} of model innovations process *ε*_{t}

`NaN` (default) | positive scalar

Variance *σ*^{2} of the model innovations process *ε*_{t}, specified as a positive scalar.

`NaN` specifies an unknown and estimable variance, which `estimate` fits to data.

**Example: **`Variance=1`

**Data Types: **`double`

## Other Properties

`Description`

— Model description

string scalar | character vector

Model description, specified as a string scalar or character vector. `regARIMA` stores the value as a string scalar. The default value describes the parametric form of the model, for example, `"Regression with ARMA(2,1) Error Model (Gaussian Distribution)"`.

**Example: **`"Model 1"`

**Data Types: **`string` | `char`

`Distribution`

— Conditional probability distribution of innovation process *ε*_{t}

`"Gaussian"` (default) | `"t"` | structure array

Conditional probability distribution of the innovation process *ε*_{t}, specified as a string or structure array. `regARIMA` stores the value as a structure array.

Distribution | String | Structure Array |
---|---|---|
Gaussian | `"Gaussian"` | `struct('Name',"Gaussian")` |
Student’s t | `"t"` | `struct('Name',"t",'DoF',DoF)` |

The `'DoF'` field specifies the *t* distribution degrees of freedom parameter.

- `DoF` > 2 or `DoF` = `NaN`.
- `DoF` is estimable.
- If you specify `"t"`, `DoF` is `NaN` by default. You can change its value by using dot notation after you create the model. For example, `Mdl.Distribution.DoF = 3`.
- If you supply a structure array to specify the Student's *t* distribution, then you must specify both the `'Name'` and the `'DoF'` fields.

**Example: **`Distribution=struct('Name',"t",'DoF',10)`
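The two ways of setting a Student's *t* distribution can be sketched as follows. The model degrees and the degrees-of-freedom value are arbitrary choices for illustration, and `dofBefore` is an illustrative variable name.

```matlab
Mdl = regARIMA(1,0,0);

% String form: DoF starts as NaN, so estimate would fit it to data
Mdl.Distribution = "t";
dofBefore = Mdl.Distribution.DoF;

% Dot notation: hold the degrees of freedom at a known value instead
Mdl.Distribution.DoF = 3;
Mdl.Distribution
```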

`SeriesName`

— Response series name

`"Y"` (default) | string scalar | character vector

*Since R2023b*

Response series name, specified as a string scalar or character vector. `regARIMA` stores the value as a string scalar.

**Example: **`"StockReturn"`

**Data Types: **`string` | `char`

## Object Functions

Function | Description |
---|---|
`estimate` | Fit univariate regression model with ARIMA errors to data |
`infer` | Infer residuals of univariate regression model with ARIMA time series errors |
`summarize` | Display estimation results of regression model with ARIMA errors |
`simulate` | Monte Carlo simulation of univariate regression model with ARIMA time series errors |
`filter` | Filter disturbances through regression model with ARIMA errors |
`impulse` | Generate regression model with ARIMA errors impulse response function |
`forecast` | Forecast responses of univariate regression model with ARIMA time series errors |
`arima` | Convert regression model with ARIMA errors to ARIMAX model |
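As a rough sketch of how a fully specified model flows through several of these object functions, the following simulates a response, infers residuals, and forecasts. The model values come from an example on this page; the predictor data, the sample sizes, and the variable names are assumptions made for illustration.

```matlab
rng(1)  % reproducible illustration

Mdl = regARIMA(Intercept=2,Beta=[1.5 0.2],AR={0.2 0.3},MA={0.1},Variance=0.5);

% Hypothetical predictor data: 100 in-sample rows plus 10 future rows
X = randn(110,2);
XIn = X(1:100,:);
XF  = X(101:110,:);

y = simulate(Mdl,100,X=XIn);                      % one simulated response path
[e,u] = infer(Mdl,y,X=XIn);                       % residuals and unconditional disturbances
[yF,yMSE] = forecast(Mdl,10,Y0=y,X0=XIn,XF=XF);   % 10-step-ahead forecasts and their MSEs
```

Supplying `X0` alongside `Y0` lets `forecast` recover presample disturbances from the in-sample data before projecting forward with `XF`.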

## Examples

### Specify Regression Model with Nonseasonal ARIMA Errors

Specify the following regression model with ARIMA(2,1,3) errors:

$$\begin{array}{c}{y}_{t}={u}_{t}\\ (1-{\varphi}_{1}L-{\varphi}_{2}{L}^{2})(1-L){u}_{t}=(1+{\theta}_{1}L+{\theta}_{2}{L}^{2}+{\theta}_{3}{L}^{3}){\epsilon}_{t}.\end{array}$$

```
Mdl = regARIMA(2,1,3)
```

```
Mdl = 
  regARIMA with properties:

     Description: "ARIMA(2,1,3) Error Model (Gaussian Distribution)"
      SeriesName: "Y"
    Distribution: Name = "Gaussian"
       Intercept: NaN
            Beta: [1×0]
               P: 3
               D: 1
               Q: 3
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {NaN NaN NaN} at lags [1 2 3]
             SMA: {}
        Variance: NaN
```

The output displays the values of the properties `P`, `D`, and `Q` of `Mdl`. The corresponding autoregressive and moving average coefficients (contained in `AR` and `MA`) are cell arrays containing the correct number of `NaN` values. Because `P` = `p` + `D` = 3, you need three presample observations to initialize the model for estimation.

### Modify Regression Model with ARIMA Errors

Define the regression model with ARIMA errors:

$$\begin{array}{l}\begin{array}{c}{y}_{t}=2+{X}_{t}\left[\begin{array}{c}1.5\\ 0.2\end{array}\right]+{u}_{t}\\ (1-0.2L-0.3{L}^{2}){u}_{t}=(1+0.1L){\epsilon}_{t},\end{array}\end{array}$$

where $${\epsilon}_{t}$$ is Gaussian with variance 0.5.

```
Mdl = regARIMA(Intercept=2,AR={0.2 0.3},MA={0.1}, ...
Variance=0.5,Beta=[1.5 0.2])
```

```
Mdl = 
  regARIMA with properties:

     Description: "Regression with ARMA(2,1) Error Model (Gaussian Distribution)"
      SeriesName: "Y"
    Distribution: Name = "Gaussian"
       Intercept: 2
            Beta: [1.5 0.2]
               P: 2
               Q: 1
              AR: {0.2 0.3} at lags [1 2]
             SAR: {}
              MA: {0.1} at lag [1]
             SMA: {}
        Variance: 0.5
```

`Mdl` is fully specified, so you can, for example, simulate a series of responses given the predictor data matrix $${X}_{t}$$.

Modify the model to estimate the regression coefficients, the AR terms, and the variance of the innovations.

```
Mdl.Beta = [NaN NaN];
Mdl.AR = {NaN NaN};
Mdl.Variance = NaN;
```

Change the innovations distribution to a $$t$$ distribution with 15 degrees of freedom.

```
Mdl.Distribution = struct("Name","t","DoF",15)
```

```
Mdl = 
  regARIMA with properties:

     Description: "Regression with ARMA(2,1) Error Model (t Distribution)"
      SeriesName: "Y"
    Distribution: Name = "t", DoF = 15
       Intercept: 2
            Beta: [NaN NaN]
               P: 2
               Q: 1
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {0.1} at lag [1]
             SMA: {}
        Variance: NaN
```

### Specify Regression Model with SARIMA Errors

Specify the following model:

$$\begin{array}{l}\begin{array}{c}{y}_{t}=1+6{X}_{t}+{u}_{t}\\ (1-0.2L)(1-L)(1-0.5{L}^{4}-0.2{L}^{8})(1-{L}^{4}){u}_{t}=(1+0.1L)(1+0.05{L}^{4}+0.01{L}^{8}){\epsilon}_{t},\end{array}\end{array}$$

where $${\epsilon}_{t}$$ is Gaussian with variance 1.

```
Mdl = regARIMA(Intercept=1,Beta=6,AR=0.2,MA=0.1,D=1, ...
    SAR={0.5,0.2},SARLags=[4, 8],SMA={0.05,0.01},SMALags=[4 8], ...
    Seasonality=4,Variance=1)
```

```
Mdl = 
  regARIMA with properties:

     Description: "Regression with ARIMA(1,1,1) Error Model Seasonally Integrated with Seasonal AR(8) and MA(8) (Gaussian Distribution)"
      SeriesName: "Y"
    Distribution: Name = "Gaussian"
       Intercept: 1
            Beta: [6]
               P: 14
               D: 1
               Q: 9
              AR: {0.2} at lag [1]
             SAR: {0.5 0.2} at lags [4 8]
              MA: {0.1} at lag [1]
             SMA: {0.05 0.01} at lags [4 8]
     Seasonality: 4
        Variance: 1
```

If you do not specify `SARLags` or `SMALags`, then the coefficients in `SAR` and `SMA` correspond to lags 1 and 2 by default.

The following model sets `SARLags` explicitly and omits the seasonal MA polynomial, so `SMA` is empty and `Q` reduces to the nonseasonal MA degree.

```
Mdl = regARIMA(Intercept=1,Beta=6,AR=0.2,MA=0.1,D=1, ...
    SAR={0.5,0.2},SARLags=[4, 8], ...
    Seasonality=4,Variance=1)
```

```
Mdl = 
  regARIMA with properties:

     Description: "Regression with ARIMA(1,1,1) Error Model Seasonally Integrated with Seasonal AR(8) (Gaussian Distribution)"
      SeriesName: "Y"
    Distribution: Name = "Gaussian"
       Intercept: 1
            Beta: [6]
               P: 14
               D: 1
               Q: 1
              AR: {0.2} at lag [1]
             SAR: {0.5 0.2} at lags [4 8]
              MA: {0.1} at lag [1]
             SMA: {}
     Seasonality: 4
        Variance: 1
```

## More About

### Regression Model with ARIMA Time Series Errors

A *regression model with ARIMA time series errors*
explains the behavior of a response series by applying linear regression with predictor
data, though the errors have autocorrelation indicative of an ARIMA process.

The model has the form (in lag operator notation)

$$\begin{array}{c}{y}_{t}=c+{X}_{t}\beta +{u}_{t}\\ a\left(L\right)A\left(L\right){\left(1-L\right)}^{D}\left(1-{L}^{s}\right){u}_{t}=b\left(L\right)B\left(L\right){\epsilon}_{t},\end{array}$$

where:

- *t* = 1,...,*T*.
- *y*_{t} is the response series.
- *X*_{t} is row *t* of *X*, which is the matrix of concatenated predictor data vectors. That is, *X*_{t} is observation *t* of each predictor series.
- *c* is the regression model intercept.
- *β* is the regression coefficient.
- *u*_{t} is the disturbance series.
- *ε*_{t} is the innovations series.
- $${L}^{j}{y}_{t}={y}_{t-j}.$$
- $$a\left(L\right)=\left(1-{a}_{1}L-\mathrm{...}-{a}_{p}{L}^{p}\right),$$ which is the degree *p*, nonseasonal autoregressive polynomial.
- $$A\left(L\right)=\left(1-{A}_{1}L-\mathrm{...}-{A}_{{p}_{s}}{L}^{{p}_{s}}\right),$$ which is the degree *p*_{s}, seasonal autoregressive polynomial.
- $${\left(1-L\right)}^{D},$$ which is the degree *D*, nonseasonal integration polynomial.
- $$\left(1-{L}^{s}\right),$$ which is the degree *s*, seasonal integration polynomial.
- $$b\left(L\right)=\left(1+{b}_{1}L+\mathrm{...}+{b}_{q}{L}^{q}\right),$$ which is the degree *q*, nonseasonal moving average polynomial.
- $$B\left(L\right)=\left(1+{B}_{1}L+\mathrm{...}+{B}_{{q}_{s}}{L}^{{q}_{s}}\right),$$ which is the degree *q*_{s}, seasonal moving average polynomial.
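To see how the two equations of the model fit together, the following sketch builds a response series by hand with base MATLAB only, without a `regARIMA` object. It assumes ARMA(2,1) disturbances with the coefficients from an example on this page; the predictor data and all variable names are illustrative.

```matlab
rng(1)  % reproducible illustration

T = 100;
c = 2;                 % regression intercept
beta = [1.5; 0.2];     % regression coefficients
X = randn(T,2);        % hypothetical predictor data

% Innovations with variance 0.5
e = sqrt(0.5)*randn(T,1);

% Disturbances: (1 - 0.2L - 0.3L^2) u_t = (1 + 0.1L) e_t, zero initial conditions
u = filter([1 0.1],[1 -0.2 -0.3],e);

% Response: regression component plus ARMA disturbances
y = c + X*beta + u;
```

The numerator and denominator passed to `filter` are the MA and AR lag polynomials written out as coefficient vectors, which is why the AR coefficients appear with negative signs.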

Regression models with ARIMA errors contain a hierarchy of error series. The unconditional disturbance, *u*_{t}, or structural disturbance, is based on the structural regression component. The conditional error (one-step-ahead forecast or prediction error), *ε*_{t}, is the innovation of *u*_{t}.

**Note**

The degrees of the lag operators in the seasonal polynomials *A*(*L*) and *B*(*L*) do not conform to those defined by Box and Jenkins [1]. In other words, Econometrics Toolbox™ does not treat *p*_{1} = *s*, *p*_{2} = 2*s*,..., *p*_{s} = *c*_{p}*s* nor *q*_{1} = *s*, *q*_{2} = 2*s*,..., *q*_{s} = *c*_{q}*s*, where *c*_{p} and *c*_{q} are positive integers. The software is flexible as it lets you specify the lag operator degrees. See Create Multiplicative ARIMA Models.

## References

[1] Box, George E. P., Gwilym M. Jenkins, and Gregory C. Reinsel. *Time Series Analysis: Forecasting and Control*. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.

## Version History

**Introduced in R2013b**

### R2023b: Name the response series of a regression model with ARIMA errors

Name the response series of a regression model with ARIMA errors by setting the `SeriesName` property to a string scalar. When you supply input response data to model object functions in a table or timetable, the functions choose the variable with name `SeriesName` as the response variable by default.

### R2018a: Describe a response series of a regression model with ARIMA errors

Describe a response series of a regression model with ARIMA errors by setting the `Description` property to a string scalar.

### R2018a: Use indices that are consistent with MATLAB cell array indexing

The indices of cell arrays of lag operator polynomial coefficients follow MATLAB® cell array indexing rules. Affected model properties are the `AR`, `MA`, `SAR`, and `SMA` properties.

- You cannot access any lag-zero coefficients by using an index of `0`. For example, `Mdl.AR{0}` issues an error. Remove any instances of such indices of zero from your code. The value of all lag-zero coefficients is `1`, except for the lag operator polynomial corresponding to the `ARCH` property, which has the value `0`.
- You cannot index beyond the maximal lag in the polynomial. For example, if `Mdl.P` is 4, then `Mdl.AR{p}` issues an error when `p` is greater than `4`. For details on the maximal lags of the lag operator polynomials, see the corresponding property descriptions. Remove any instances of such indices beyond the maximal lag from your code. All coefficients beyond the maximal lag are `0`.
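The two indexing rules can be observed with a short sketch. The flag variables `zeroIndexFails` and `beyondMaxFails` are illustrative names, and the exact error messages are not reproduced here.

```matlab
Mdl = regARIMA(2,0,0);   % AR is a cell vector with two elements

% Index 0 is not a valid cell array index
zeroIndexFails = false;
try
    c0 = Mdl.AR{0};
catch
    zeroIndexFails = true;
end

% Index 3 is beyond the maximal lag of 2
beyondMaxFails = false;
try
    c3 = Mdl.AR{3};
catch
    beyondMaxFails = true;
end

[zeroIndexFails beyondMaxFails]
```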

### R2018a: Models store innovation distribution name as a string scalar

The `Name` field of the `Distribution` property of `regARIMA` model objects stores the innovation distribution name as a string scalar, for example, `"Gaussian"` for Gaussian innovations. Before R2018a, MATLAB stored the innovation distribution name as a character vector, for example `'Gaussian'` for Gaussian innovations. Although most text-data operations accept character vectors and string scalars for text-data input, the two data types have some differences. For details, see Text in String and Character Arrays.
