Lag order selection in multivariate time series models using recursive programming

Hi. I am trying to write a recursive function that runs a model-comparison procedure and selects the optimal lag order using an information criterion (only BIC, for simplicity) for time-series models with lags (autoregressive distributed lag, or ADL, models). The function should do this for multiple right-hand-side (RHS) variables, which I will call X's; these can also include autoregressive terms.
1. The Single-variable Case: lagordersim_1hp()
To give an example, the simplest case is a single variable X for which I want to select the optimal number of lags (p*). The lags are stored in the matrix Xsel, i.e. Xsel=[L1X L2X]. Any other variables that we wish to include in all the competing models are placed in Xfix. The model selection procedure is then run over the lags in Xsel, adding one lag at a time, e.g.
  • y on Xfix L1X
  • y on Xfix L1X L2X
or using variable Xsel:
  • y on Xfix Xsel(:,1:1)
  • y on Xfix Xsel(:,1:2)
I have created a simple function to select the optimal lag in the single-variable case. Outputs 'ordr' and 'minval' are the optimal order (p*) and the corresponding BIC, respectively.
%Make some data.
y = rand(50,1);
Xsel = lagmatrix(rand(55,1),1:4);
Xsel = Xsel(6:end,:);
Xfix = ones(50,1); %intercept
[ordr, minval] = lagordersim_1hp(y, Xfix, Xsel);
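For reference, here is a minimal sketch of what such a single-variable routine might look like. It is not the actual lagordersim_1hp(): the name lagorder_1var_sketch, the extra output bicAll, and the BIC formula (T*log(RSS/T) + k*log(T), from an OLS fit) are assumptions made for illustration.

```matlab
function [ordr, minval, bicAll] = lagorder_1var_sketch(y, Xfix, Xsel)
% Hypothetical sketch (not the original lagordersim_1hp).
% Fits y on [Xfix, Xsel(:,1:p)] by OLS for p = 1..P and returns the p
% with the smallest BIC. The BIC formula below is an assumption.
T = size(y, 1);
P = size(Xsel, 2);
bicAll = zeros(P, 1);                 % bicAll(p) is the BIC with p lags
for p = 1:P
    X = [Xfix, Xsel(:, 1:p)];         % add one lag at a time
    res = y - X * (X \ y);            % OLS residuals
    k = size(X, 2);                   % number of regressors
    bicAll(p) = T * log((res' * res) / T) + k * log(T);
end
[minval, ordr] = min(bicAll);         % index of the minimum is p*
end
```

It is called in the same way as above, e.g. [ordr, minval] = lagorder_1var_sketch(y, Xfix, Xsel).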
2. The Multivariate Case: lagordersim()
In the recursive routine I break the problem down into the single-variable case. Assuming 3 variables for which I want to find the optimal order, the problem reduces to the 1-variable case by keeping a single variable in Xsel (the last one, i.e. X3) and adding the lags of the remaining variables (X1, X2, etc.) one by one into Xfix.
I have structured the function as follows: Xsel is now a cell array with Nv elements, where each cell holds the lag matrix of one of the RHS variables whose lag order we want to optimize (e.g. Xsel = {X1, X2, X3}).
%Make some data.
clear
y = rand(60,1);
X1 = lagmatrix(y,1:2); %AR terms
X2 = lagmatrix(rand(60,1),1:3);
X3 = lagmatrix(rand(60,1),1:4);
idx = ~any(isnan([y X1 X2 X3]),2);
y = y(idx); X1 = X1(idx,:); X2 = X2(idx,:); X3 = X3(idx,:);
Xsel = {X1, X2, X3};
Xfix = ones(length(y),1);
clear idx
[ordr, minval] = lagordersim(y, Xfix, Xsel) % 3 variables case
[ordr, minval] = lagordersim(y, Xfix, Xsel(1:2)) % 2 variables case
Note: You can find lagordersim_1hp() as a subfunction inside lagordersim(), if you wish to use it. However, calling lagordersim(y, Xfix, Xsel) with Xsel being a T-by-P matrix instead of a cell array would effectively return the same result as calling lagordersim_1hp(y, Xfix, Xsel). For example:
[ordr, minval] = lagordersim(y, Xfix, Xsel(1)) % 1 variable case
Problem to be solved
I am sure that my function checks all the cases, but I am not sure how to keep track of all the iterations, and how to return the 2 final results:
  1. The optimal orders for all Nv variables in Xsel. This should be either an Nv-element cell array of scalars or an Nv-element vector, containing: ordr=[p1* p2* p3*].
  2. The minimum BIC
The final model should then be formed as: y on Xfix X1(:,1:p1*) X2(:,1:p2*) X3(:,1:p3*)
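One possible way to keep track of the orders through the recursion is sketched below. It is not the original lagordersim(): the names lagorder_recursive_sketch and ols_bic are hypothetical, the BIC formula (T*log(RSS/T) + k*log(T)) is an assumption, and the sketch peels off the first variable in Xsel rather than the last. The idea is that each level fixes p lags of its own variable, asks the level below for the best orders of the remaining variables given that choice, and keeps [p, subOrdr] only when the returned BIC improves on the best seen so far.

```matlab
function [ordr, minval] = lagorder_recursive_sketch(y, Xfix, Xsel)
% Hypothetical sketch (not the original lagordersim).
% Returns ordr = [p1* p2* ... pNv*] and the minimum BIC over all
% combinations of lag orders, with at least one lag per variable.
if ~iscell(Xsel)
    Xsel = {Xsel};                    % allow a plain T-by-P matrix
end
P = size(Xsel{1}, 2);                 % maximum lag of the first variable
ordr = [];
minval = Inf;
for p = 1:P
    Xnow = [Xfix, Xsel{1}(:, 1:p)];   % move p lags of this variable into Xfix
    if numel(Xsel) == 1
        subOrdr = [];                 % base case: nothing left to select
        val = ols_bic(y, Xnow);
    else
        % best orders of the remaining variables, given this p
        [subOrdr, val] = lagorder_recursive_sketch(y, Xnow, Xsel(2:end));
    end
    if val < minval                   % keep the best combination so far
        minval = val;
        ordr = [p, subOrdr];
    end
end
end

function val = ols_bic(y, X)
% BIC of an OLS fit; this particular formula is an assumption.
T = size(y, 1);
res = y - X * (X \ y);
val = T * log((res' * res) / T) + size(X, 2) * log(T);
end
```

With the data from section 2, [ordr, minval] = lagorder_recursive_sketch(y, Xfix, Xsel) would return a 1-by-3 vector ordr, and the final model could then be formed with [Xfix, X1(:,1:ordr(1)), X2(:,1:ordr(2)), X3(:,1:ordr(3))]. Because every combination is visited, the number of regressions is the product of the maximum lags, which is fine for small problems like this one.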

Answers (0)

Release

R2020b
