Linear regression with categorical predictor & quadratic term - dataset
Hey!
I am trying to construct a dataset array, as in the example below (from the MATLAB help), in order to fit a regression:
>> load carsmall
>> ds = dataset(MPG, Weight);
>> ds.Year=nominal(Model_Year);
>> mdl = fitlm(ds, 'MPG~Year+Weight^2')
mdl =
Linear regression model:
MPG ~ 1 + Weight + Year + Weight^2
Estimated Coefficients:
                   Estimate      SE            tStat      pValue
    (Intercept)    54.206        4.7117        11.505     2.6648e-19
    Weight         -0.016404     0.0031249     -5.2493    1.0283e-06
    Year_76        2.0887        0.71491       2.9215     0.0044137
    Year_82        8.1864        0.81531       10.041     2.6364e-16
    Weight^2       1.5573e-06    4.9454e-07    3.149      0.0022303
Number of observations: 94, Error degrees of freedom: 89
Root Mean Squared Error: 2.78
R-squared: 0.885, Adjusted R-Squared 0.88
F-statistic vs. constant model: 172, p-value = 5.52e-41
Unfortunately, I get an error message when I try the same thing with my own data:
>> ds=dataset(price, size, weight, speed);
>> ds.postcode=nominal(postcode);
>> mdl = fitlm(ds, 'price~postcode+size+weight+speed')
The error is:
Index of element to remove exceeds matrix dimensions.
Error in classreg.regr.modelutils.designmatrix>dummyVars (line 432)
X0 = eye(ng); X0(:,1) = [];
Error in classreg.regr.modelutils.designmatrix (line 279)
[Xj,dummynames] = dummyVars(dummyCoding{j},Xj,catLevels{j});
Error in classreg.regr.TermsRegression/designMatrix (line 316)
[design,~,~,coefTerm,coefNames] ...
Error in LinearModel/fitter (line 654)
[model.Design,model.CoefTerm,model.CoefficientNames] = designMatrix(model,X);
Error in classreg.regr.FitObject/doFit (line 220)
model = fitter(model);
Error in LinearModel.fit (line 857)
model = doFit(model);
Error in fitlm (line 111)
model = LinearModel.fit(X,varargin{:});
Thank you for your help!
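The statement that fails in the trace, `X0 = eye(ng); X0(:,1) = [];`, only errors when there is no first column to delete, that is, when `ng` is 0. A minimal base-MATLAB sketch that reproduces this; the name `ng` is taken from the trace, and reading it as the number of levels of the categorical predictor is an assumption, not something the trace confirms:

```matlab
% Reproduce the failing statement from dummyVars for different level counts.
% Assumption: ng is the number of levels of the categorical predictor.
levelCounts = [0 1 3];
failed = false(size(levelCounts));
for k = 1:numel(levelCounts)
    ng = levelCounts(k);
    X0 = eye(ng);           % ng-by-ng identity; 0-by-0 when ng is 0
    try
        X0(:,1) = [];       % drop the first column (the reference level)
    catch err
        failed(k) = true;   % only reached when there is no column to delete
        fprintf('ng = %d: %s\n', ng, err.message);
    end
end
disp(failed)
```

If that is what is happening here, inspecting `getlevels(ds.postcode)` and `summary(ds)` before calling `fitlm` might show whether `postcode` ended up with no levels, for example because every entry was empty or undefined.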
10 Comments
dpb on 13 Jun 2014 (edited on 13 Jun 2014)
I don't know; R-squared is only roughly 40% (leaving 60% of the total unexplained), and just glancing through the output, very few of the coefficients appear to be significant out of the "cast of thousands". Superficially, I wouldn't say it looks like a very good model, and it certainly isn't high on the list of candidates for a parsimonious one... :)
I'd guess the model estimated on only
(Intercept) 0.090898
bedroom 1.0747e-22
postcode_NW1 0.072403
postcode_SW1X 0.026157
would perform nearly as well.
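One way to check a guess like this is to fit both the full and the reduced model and compare their adjusted R-squared values. Since the poster's data is not shown, the sketch below uses the carsmall example from the question instead; the reduced formula `MPG~Weight` is a hypothetical choice for illustration, not the model suggested above:

```matlab
% Compare a full and a reduced model on the carsmall data from the question.
% Requires the Statistics Toolbox (fitlm, dataset, nominal).
load carsmall
ds = dataset(MPG, Weight);
ds.Year = nominal(Model_Year);

full    = fitlm(ds, 'MPG~Year+Weight^2');   % model from the question
reduced = fitlm(ds, 'MPG~Weight');          % hypothetical smaller model

% Adjusted R-squared penalizes extra terms, so it is the fairer comparison.
fprintf('full:    adjusted R^2 = %.3f\n', full.Rsquared.Adjusted);
fprintf('reduced: adjusted R^2 = %.3f\n', reduced.Rsquared.Adjusted);
```

If the reduced model's adjusted R-squared is close to that of the full model, the dropped terms were contributing little.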
Answers (0)