
AR Order Selection with Partial Autocorrelation Sequence

This example shows how to assess the order of an autoregressive model using the partial autocorrelation sequence. For a moving average process, you can use the autocorrelation sequence to assess the order. However, for an autoregressive (AR) or autoregressive moving average (ARMA) process, the autocorrelation sequence does not help in order selection. For AR processes, you can use the partial autocorrelation sequence instead. For a stationary time series with values $X(1),X(2),X(3),\dots,X(k+1)$, the partial autocorrelation sequence at lag $k$ is the correlation between $X(1)$ and $X(k+1)$ that remains after regressing both $X(1)$ and $X(k+1)$ on the intervening observations, $X(2),X(3),X(4),\dots,X(k)$.

Consider the AR(2) process defined by

$$X(n)+1.5X(n-1)+0.75X(n-2)=\varepsilon(n),$$

where $\varepsilon(n)$ is an $N(0,1)$ Gaussian white noise process. The following example:

  • Simulates a realization of the AR(2) process

  • Graphically explores the correlation between lagged values of the time series

  • Examines the sample autocorrelation sequence of the time series

  • Fits an AR(15) model to the time series by solving the Yule-Walker equations (aryule)

  • Uses the reflection coefficients returned by aryule to compute the partial autocorrelation sequence

  • Examines the partial autocorrelation sequence to select the model order

Simulate a 1000-sample time series from the AR(2) process defined by the difference equation. Set the random number generator to the default settings for reproducible results.

A = [1 1.5 0.75];
rng default
x = filter(1,A,randn(1000,1));
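
As an optional sanity check, the following sketch confirms that filter(1,A,...) realizes the difference equation: applying the FIR filter with coefficients A to the simulated output should recover the driving noise. The variable names e, xCheck, recovered, and maxErr are illustrative and are not used elsewhere in this example.

```matlab
% Sketch: verify that filter(1,A,e) realizes
% X(n) + 1.5X(n-1) + 0.75X(n-2) = e(n) with zero initial conditions.
A = [1 1.5 0.75];
rng default
e = randn(1000,1);               % driving white noise (same draw as above)
xCheck = filter(1,A,e);          % AR(2) output
recovered = filter(A,1,xCheck);  % left side of the difference equation
maxErr = max(abs(recovered - e)) % should be at rounding-error level
```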

View the frequency response of the AR(2) process.

freqz(1,A)

The AR(2) process acts like a highpass filter in this case.
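
The highpass behavior follows from the pole locations of the model. As a minimal sketch, you can compute the poles as the roots of the AR polynomial; the variable names p, poleRadius, and poleAngle are illustrative.

```matlab
% Sketch: locate the poles of the AR(2) model 1/A(z).
A = [1 1.5 0.75];
p = roots(A);          % roots of z^2 + 1.5z + 0.75
poleRadius = abs(p)    % both equal sqrt(0.75), about 0.866
poleAngle = angle(p)   % +/- 5*pi/6, about 2.618 rad/sample
```

Both poles lie inside the unit circle, so the process is stationary. Their angle is close to $\pi$ rad/sample, which concentrates the power of the process at high frequencies.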

Graphically examine the correlation in x by producing scatter plots of $X(n)$ vs. $X(1)$ for $n = 2, 3, 4, 5$. Each plot pairs every sample of x with the sample 1, 2, 3, or 4 lags later.

x12 = x(1:end-1);
x21 = x(2:end);
subplot(2,2,1)
plot(x12,x21,'*')
xlabel('X_1')
ylabel('X_2')

x13 = x(1:end-2);
x31 = x(3:end);
subplot(2,2,2)
plot(x13,x31,'*')
xlabel('X_1')
ylabel('X_3')

x14 = x(1:end-3);
x41 = x(4:end);
subplot(2,2,3)
plot(x14,x41,'*')
xlabel('X_1')
ylabel('X_4')

x15 = x(1:end-4);
x51 = x(5:end);
subplot(2,2,4)
plot(x15,x51,'*')
xlabel('X_1')
ylabel('X_5')

In the scatter plots, you see a linear relationship between $X(1)$ and $X(2)$ and between $X(1)$ and $X(3)$, but not between $X(1)$ and either $X(4)$ or $X(5)$.

The points in the top row scatter plots fall approximately on a line with a negative slope in the top left panel and positive slope in the top right panel. The scatter plots in the bottom two panels do not show any apparent linear relationship.

The negative correlation between $X(1)$ and $X(2)$ and the positive correlation between $X(1)$ and $X(3)$ are explained by the highpass-filter behavior of the AR(2) process.

Find the sample autocorrelation sequence out to lag 50 and plot the result.

[xc,lags] = xcorr(x,50,'coeff');

figure
stem(lags(51:end),xc(51:end),'filled')
xlabel('Lag')
ylabel('ACF')
title('Sample Autocorrelation Sequence')

The sample autocorrelation sequence shows a negative value at lag 1 and a positive value at lag 2. Based on the scatter plots, this is the expected result. However, you cannot determine from the sample autocorrelation sequence what order is appropriate for the AR model, because the autocorrelation sequence of an AR process decays gradually instead of cutting off after a particular lag.
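
For comparison, the theoretical autocorrelation of this AR(2) process satisfies the Yule-Walker recursion $\rho(k) = -1.5\rho(k-1) - 0.75\rho(k-2)$ for $k \ge 2$, with $\rho(0) = 1$ and $\rho(1) = -1.5/(1+0.75)$. The following sketch evaluates the recursion; the variable names maxLag and rhoTheory are illustrative.

```matlab
% Sketch: theoretical autocorrelation of the AR(2) process from the
% Yule-Walker recursion rho(k) = -A(2)*rho(k-1) - A(3)*rho(k-2).
A = [1 1.5 0.75];
maxLag = 10;
rhoTheory = zeros(maxLag+1,1);     % rhoTheory(k+1) holds the lag-k value
rhoTheory(1) = 1;                  % lag 0
rhoTheory(2) = -A(2)/(1 + A(3));   % lag 1: -1.5/1.75, about -0.857
for k = 3:maxLag+1
    rhoTheory(k) = -A(2)*rhoTheory(k-1) - A(3)*rhoTheory(k-2);
end
disp([(0:maxLag)' rhoTheory])      % columns: lag, autocorrelation
```

The lag-2 value is about 0.536, and the values remain nonzero well beyond lag 2, so nothing in the sequence marks the model order.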

Fit an AR(15) model using aryule. Return the reflection coefficients. The negative of the reflection coefficients is the partial autocorrelation sequence.

[arcoefs,E,K] = aryule(x,15);
pacf = -K;
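
To see why the reflection coefficients give the partial autocorrelation sequence, note that the lag-$k$ partial autocorrelation equals the negative of the last coefficient of an AR($k$) fit. The following sketch, which assumes the same simulated data as above, fits each order separately with aryule and compares the result with -K. The variable names pacfCheck, ak, and maxDiff are illustrative.

```matlab
% Sketch: recompute the partial autocorrelation one order at a time.
A = [1 1.5 0.75];
rng default
x = filter(1,A,randn(1000,1));
[~,~,K] = aryule(x,15);
pacf = -K;

pacfCheck = zeros(15,1);
for k = 1:15
    ak = aryule(x,k);           % [1 a(1) ... a(k)] for an AR(k) fit
    pacfCheck(k) = -ak(end);    % lag-k partial autocorrelation
end
maxDiff = max(abs(pacfCheck - pacf(:)))   % should be at rounding-error level
```

The two results agree because the Levinson-Durbin recursion that solves the Yule-Walker equations builds the order-15 solution from the solutions of orders 1 through 14.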

Plot the partial autocorrelation sequence along with the large-sample 95% confidence intervals. If the data are generated by an autoregressive process of order $p$, the values of the sample partial autocorrelation sequence for lags greater than $p$ approximately follow an $N(0,1/N)$ distribution, where $N$ is the length of the time series. The 95% confidence bounds are therefore $\pm 1.96/\sqrt{N}$.

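figure
stem(pacf,'filled')
xlabel('Lag')
ylabel('Partial Autocorrelation')
xlim([1 15])
% Large-sample 95% confidence bounds, +/- 1.96/sqrt(N)
uconf = 1.96/sqrt(numel(x));
lconf = -uconf;
hold on
% Draw both bounds as horizontal red lines across lags 1 to 15
plot([1 15],[1 1]'*[lconf uconf],'r')
hold off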

The only values of the partial autocorrelation sequence outside the 95% confidence bounds occur at lags 1 and 2. This indicates that the correct model order for the AR process is 2.
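
You can also read the candidate order from the data programmatically. The following sketch, which assumes the same simulated data as above, lists the lags whose partial autocorrelation falls outside the 95% bounds and takes the largest one. The variable names bound, sigLags, and pEst are illustrative. Treat the result as a heuristic: with 95% bounds, roughly one lag in twenty beyond the true order is expected to fall outside the bounds by chance.

```matlab
% Sketch: pick the largest lag whose partial autocorrelation is significant.
A = [1 1.5 0.75];
rng default
x = filter(1,A,randn(1000,1));
[~,~,K] = aryule(x,15);
pacf = -K;

N = numel(x);
bound = 1.96/sqrt(N);                % large-sample 95% bound
sigLags = find(abs(pacf) > bound);   % lags outside the bounds
pEst = max(sigLags)                  % heuristic estimate of the AR order
```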

In this example, you generated the time series to simulate an AR(2) process. The partial autocorrelation sequence only confirms that result. In practice, you have only the observed time series without any prior information about model order. In a realistic scenario, the partial autocorrelation is an important tool for appropriate model order selection in stationary autoregressive time series.
