I have a dataset of portfolios (y), each with anywhere between 60 and 180 observations.
These portfolios have to be regressed on their respective factors. The factor series have a fixed length of 180 observations, so a portfolio with, say, 60 observations needs to be regressed on the 60 matching factor observations.
My portfolios can either contain:
180 observations, meaning they have data available for the entire period.
Fewer than 180 observations, but always at least 60:
- They can have observations from the beginning of the sample period but missing data later on (for instance, observations available only for t0-t60).
- They can also start partway through the sample period, for instance t100-t170.
As suggested in an earlier solution, I need to load in the portfolios, and to do so I need to pad them with NaN so that every column of the matrix has the same length.
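A minimal sketch of that padding step, written in Python with NumPy purely for illustration. The helper name `pad_to_full_length`, the `start` index, and the example values are hypothetical, and it assumes each portfolio's first observation date is known:

```python
import numpy as np

T = 180  # full sample length, matching the factor matrix

def pad_to_full_length(values, start, total_length=T):
    """Place `values` at rows start..start+len(values)-1 of a NaN-filled column."""
    values = np.asarray(values, dtype=float)
    if start < 0 or start + len(values) > total_length:
        raise ValueError("observations do not fit inside the sample period")
    column = np.full(total_length, np.nan)
    column[start:start + len(values)] = values
    return column

# Hypothetical data: one portfolio observed over t100-t170 (71 points),
# one observed over the first 60 periods only.
late = pad_to_full_length(np.arange(71), start=100)
early = pad_to_full_length(np.arange(60), start=0)

# Stack the padded columns into one (180, 2) matrix.
Y = np.column_stack([late, early])
```

Every column then has 180 rows, so row t of the portfolio matrix lines up with row t of the factor matrix.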
So how do I make a loop that can capture the dynamics of my dataset?
At the moment I am thinking of an "if" statement that ignores the NaN entries, so that only the available data is regressed on the matching observations from my factor matrix. For example, if portfolio y has observations from t100-t170, it has to be regressed on the factors from t100-t170.
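The idea can be sketched roughly as follows, in Python with NumPy for illustration only. Instead of an explicit "if" on each row, a boolean mask picks out the non-NaN rows of a portfolio, and the same mask selects the matching factor rows. The function name `regress_each_portfolio`, the added intercept, and the synthetic data are assumptions, as is the factor matrix having no missing values:

```python
import numpy as np

def regress_each_portfolio(Y, F, min_obs=60):
    """OLS of each column of Y on F plus an intercept, using only rows where Y is observed.

    Y: (T, N) portfolio matrix, NaN where data is missing.
    F: (T, K) factor matrix, assumed complete.
    Returns an (N, K + 1) array: intercept first, then the factor loadings.
    """
    Y = np.asarray(Y, dtype=float)
    F = np.asarray(F, dtype=float)
    if F.ndim == 1:
        F = F[:, None]
    T, N = Y.shape
    X = np.column_stack([np.ones(T), F])  # add the intercept column
    betas = np.full((N, X.shape[1]), np.nan)
    for j in range(N):
        mask = ~np.isnan(Y[:, j])  # True where portfolio j has data
        if mask.sum() < min_obs:
            continue  # too few observations: leave this row as NaN
        coef, *_ = np.linalg.lstsq(X[mask], Y[mask, j], rcond=None)
        betas[j] = coef
    return betas

# Synthetic, noise-free check: both portfolios follow the same linear model,
# but the second one is only observed over t100-t170.
rng = np.random.default_rng(0)
F = rng.normal(size=(180, 3))
true_beta = np.array([0.5, 1.0, -2.0, 0.3])
y_full = true_beta[0] + F @ true_beta[1:]
Y = np.column_stack([y_full, y_full])
Y[:100, 1] = np.nan
Y[171:, 1] = np.nan

betas = regress_each_portfolio(Y, F)
```

Because the mask is built per portfolio, the same loop handles full-length portfolios, ones that stop early, and ones that start late.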
Do you guys have a smart solution for this?