reference dummy coding with Matlab fitlme
Thanks in advance for the help
I have a data set with multiple categorical predictors and a single numerical response, and I want to use regression to predict the response. MATLAB automatically recognizes categorical data and uses dummy coding, so the categories are not treated as having the rank and magnitude that numeric predictors carry.
Here is my question. Suppose I have a predictor with categories 0, 1, and 2. I want to specify which of the three categories is the reference (I want reference dummy coding, as opposed to effects or full). Is there a way to do this? In particular, I am using fitlme. The docs say that the coefficient for the first category is set to zero when using reference dummy coding. In my case, does this mean that 0 would be the reference category, or is the reference the first category that MATLAB encounters in my data set? In other words, what does 'the first category' mean?
Accepted Answer
Gautam Pendse
on 11 Aug 2014
Hi Ryan,
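'The first category' means the first level in the category order of the variable, not the first value that appears in your data. By default that order is the sorted unique values, so 0 would be the reference. You can use categorical or nominal to specify a different first category. Here's an example: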
% 0. Dummy data.
rng(0,'twister');
y = rand(30,1);
g = [zeros(10,1);ones(10,1);2*ones(10,1)];
T = table(y);
% 1. First category is automatically set based on sort order. In this
% case it will be 0.
T.g = categorical(g);
lme = fitlme(T,'y ~ g')
% 2. Make 2 the first category.
T.g = categorical(g,[2,0,1]);
lme = fitlme(T,'y ~ g')
% 3. Same as 1 but using nominal.
T.g = nominal(g);
lme = fitlme(T,'y ~ g')
% 4. Same as 2 but using nominal.
gn = nominal(g);
getlabels(gn)
gn = reorderlevels(gn,{'2','0','1'});
T.g = gn;
lme = fitlme(T,'y ~ g')
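If the variable is already a categorical, reordering its categories in place is another option. The following is only a sketch on the same dummy data: it assumes reordercats and the fitlme 'DummyVarCoding' name-value pair are available in your release, and the choice of '1' as the reference is arbitrary.

```matlab
% Sketch: pick the reference level of an existing categorical and check it.
rng(0,'twister');
y = rand(30,1);
g = [zeros(10,1); ones(10,1); 2*ones(10,1)];
T = table(y);
T.g = categorical(g);

% The default category order is the sorted unique values, regardless of
% the order in which the values appear in the data.
defaultOrder = categories(T.g);

% Move '1' to the front so that it becomes the reference level.
T.g = reordercats(T.g, {'1','0','2'});
newOrder = categories(T.g);
ref = newOrder{1};

% 'reference' is the default coding; it is spelled out here for clarity.
lme = fitlme(T, 'y ~ g', 'DummyVarCoding', 'reference');

% The reference level gets no coefficient of its own; it is absorbed
% into the intercept.
names = lme.CoefficientNames;
disp(ref)
disp(names)
```

Inspecting categories(T.g) before calling fitlme is a quick way to confirm which level will be treated as the reference.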
Hope this helps,
Gautam