Linear regression with categorical predictor & quadratic term - dataset

Hey!
I am trying to construct a dataset array, as in the example below (from the MATLAB help), in order to run a regression:
>> load carsmall
>> ds = dataset(MPG, Weight);
>> ds.Year=nominal(Model_Year);
>> mdl = fitlm(ds, 'MPG~Year+Weight^2')
mdl =
Linear regression model:
MPG ~ 1 + Weight + Year + Weight^2
Estimated Coefficients:
                   Estimate          SE         tStat       pValue
    (Intercept)      54.206        4.7117      11.505    2.6648e-19
    Weight        -0.016404     0.0031249     -5.2493    1.0283e-06
    Year_76          2.0887       0.71491      2.9215     0.0044137
    Year_82          8.1864       0.81531      10.041    2.6364e-16
    Weight^2     1.5573e-06    4.9454e-07       3.149     0.0022303
Number of observations: 94, Error degrees of freedom: 89
Root Mean Squared Error: 2.78
R-squared: 0.885, Adjusted R-Squared 0.88
F-statistic vs. constant model: 172, p-value = 5.52e-41
But unfortunately, I get an error message when I try the same thing with my own data:
>> ds=dataset(price, size, weight, speed);
>> ds.postcode=nominal(postcode);
>> mdl = fitlm(ds, 'price~postcode+size+weight+speed')
The error is:
Index of element to remove exceeds matrix dimensions.
Error in classreg.regr.modelutils.designmatrix>dummyVars (line 432)
X0 = eye(ng); X0(:,1) = [];
Error in classreg.regr.modelutils.designmatrix (line 279)
[Xj,dummynames] = dummyVars(dummyCoding{j},Xj,catLevels{j});
Error in classreg.regr.TermsRegression/designMatrix (line 316)
[design,~,~,coefTerm,coefNames] ...
Error in LinearModel/fitter (line 654)
[model.Design,model.CoefTerm,model.CoefficientNames] = designMatrix(model,X);
Error in classreg.regr.FitObject/doFit (line 220)
model = fitter(model);
Error in LinearModel.fit (line 857)
model = doFit(model);
Error in fitlm (line 111)
model = LinearModel.fit(X,varargin{:});
Thank you for your help!
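One possible reading of the trace, offered as an assumption rather than a confirmed diagnosis: the failing line `X0 = eye(ng); X0(:,1) = [];` appears to build dummy variables by dropping the first column of an identity matrix, and that deletion can only fail if `ng` is zero. If `ng` counts the levels of the categorical predictor, this would mean `postcode` ended up with no levels. The sketch below reproduces just that one line in isolation; it is an illustration, not the toolbox code.

```matlab
% Illustration only: mimics the single line visible in the trace,
% not the actual classreg.regr.modelutils.designmatrix implementation.

% Normal case: 3 levels give a 3-by-2 block (first level is the reference).
ng = 3;
X0 = eye(ng);
X0(:,1) = [];            % drop the column for the reference level
sz = size(X0);           % rows = levels, columns = levels - 1

% Degenerate case (assumed cause): a grouping variable with no levels.
ng = 0;
X0 = eye(ng);            % 0-by-0 matrix
msg = '';
try
    X0(:,1) = [];        % there is no column 1 to delete, so this errors
catch err
    msg = err.message;   % exact wording depends on the MATLAB release
end
disp(msg)
```

If that is what is happening, inspecting the grouping variable before fitting may help: `class(postcode)` shows what was passed to `nominal`, and `getlevels(ds.postcode)` and `levelcounts(ds.postcode)` list the levels and how many observations fall in each.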
  10 Comments
dpb on 13 Jun 2014
Edited: dpb on 13 Jun 2014
I don't know; Rsq is only roughly 40% (leaving 60% of the total variation unexplained), and just glancing thru it, it appears that very few of the coefficients are significant out of the "cast of thousands". Superficially I wouldn't say it looks like a very good model, and it certainly isn't high on the candidates list for a parsimonious one... :)
I'd guess the model estimated on only
(Intercept) 0.090898
bedroom 1.0747e-22
postcode_NW1 0.072403
postcode_SW1X 0.026157
would perform nearly as well.
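The comparison described here can be sketched by fitting a full model and a pared-down one, then looking at the coefficient p-values and the adjusted R-squared side by side. The data behind this comment is not shown in the thread, so the sketch uses `carsmall` from the question as a stand-in, and the reduced formula is an arbitrary choice for illustration.

```matlab
% Stand-in data: carsmall from the question, not the data discussed above.
load carsmall
ds = dataset(MPG, Weight);
ds.Year = nominal(Model_Year);

full    = fitlm(ds, 'MPG ~ Year + Weight^2');   % model from the question
reduced = fitlm(ds, 'MPG ~ Weight');            % hypothetical smaller model

% p-value of each coefficient, to see which terms carry the fit
disp(full.Coefficients(:, 'pValue'))

% Adjusted R-squared penalizes extra terms, so it is the fairer yardstick
fprintf('full:    adj R^2 = %.3f (%d coefficients)\n', ...
        full.Rsquared.Adjusted, full.NumCoefficients);
fprintf('reduced: adj R^2 = %.3f (%d coefficients)\n', ...
        reduced.Rsquared.Adjusted, reduced.NumCoefficients);
```

If the adjusted R-squared barely drops when terms are removed, the smaller model is usually the easier one to defend.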
Tania on 14 Jun 2014
Thanks, this is just the beginning for me. I will try to work with the data using other models afterwards and hopefully get better results. But this helped me a lot to get my first MATLAB analysis working. Thanks :)


Answers (0)
