How does regress deal with NaN?

Hello. I have a question about how the regress function deals with NaN. I know that it treats them as missing values and ignores them, but I am wondering more specifically how this is done. Say I have a dataset containing several variables, for example 4, with about 50 points for each. If only one variable is missing a value in a given row, does regress remove that row for all the variables, so that the columns stay the same length, or does it somehow keep all the information that is in the dataset?
I hope I managed to make my question clear. It was a little hard for me to formulate.

Accepted Answer

Kaushik Lakshminarasimhan on 11 Nov 2017
Type this in your Command Window:
open regress
If you scroll down to around line 65 (the exact line may differ depending on your version of MATLAB), you'll see how regress deals with NaNs:
% Remove missing values, if any
wasnan = (isnan(y) | any(isnan(X),2));
havenans = any(wasnan);
if havenans
    y(wasnan) = [];
    X(wasnan,:) = [];
    n = length(y);
end
You can see that regress removes an entire row of X if one or more entries in that row are NaN, or if the corresponding response in y is NaN. The same rows are removed from y, so X and y keep matching lengths, and the information in the other columns of a removed row is not used. This is the usual way to handle missing values in a least-squares fit: if you do not know the value of one of the predictors, you have to throw away the entire observation.
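To check this behavior on your own machine, here is a minimal sketch that fits the same data twice: once with the NaNs left in, and once after removing the incomplete rows by hand with the same mask that appears in the regress source. The data, coefficients, and the variable names bRaw and bClean are made up for illustration, and the sketch assumes the Statistics and Machine Learning Toolbox is installed (regress lives there).

```matlab
% Hypothetical data: 50 observations, an intercept column plus 3 predictors.
rng(0);                                 % fix the seed so the run is repeatable
n = 50;
X = [ones(n,1), randn(n,3)];            % column of ones gives the intercept term
y = X * [1; 2; -1; 0.5] + 0.1*randn(n,1);

% Introduce missing values in three different rows.
X(5,2)  = NaN;                          % one predictor missing in row 5
X(12,4) = NaN;                          % a different predictor missing in row 12
y(20)   = NaN;                          % response missing in row 20

% Fit on the raw data; regress drops the incomplete rows internally.
bRaw = regress(y, X);

% Fit on manually filtered data, using the same mask as the regress source.
wasnan = isnan(y) | any(isnan(X), 2);
bClean = regress(y(~wasnan), X(~wasnan, :));

% Report how many rows were dropped and how far apart the two fits are.
fprintf('Rows dropped: %d of %d\n', nnz(wasnan), n);
fprintf('Largest coefficient difference: %g\n', max(abs(bRaw - bClean)));
```

Both fits should use the same 47 complete rows, so the two coefficient vectors should agree. If you would rather keep the partial information in the incomplete rows, you would need a different approach (for example, imputing the missing values before fitting), which regress does not do for you.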
