SVM with Dummy Variables
Context: I have a cell array with 19 categorical (nominal) features as columns and ~1500 data entries as rows. I looped through all the columns and used double(dummyvar(nominal(featureVector))) to convert each feature into dummy variables (vectors of 1s and 0s), and everything looks right.
Problem: When I try to pass this cell array as the input data X to fitcsvm(), I get an error because it expects X to be a floating-point matrix:
Error using ClassificationSVM.prepareData (line 602)
You can pass only floating-point data for X to SVM.
If I convert the cell array into a matrix, each dummy-variable vector becomes its own column, so the vectors lose their identity as dummy variables. fitcsvm() treats every column as a separate predictor, so it now sees as many predictors as there are categories in total across all features, instead of 19. So I don't see how I can use dummy variables with an SVM in Matlab, which is mind-boggling, and I know this is a basic problem many will have.
Thanks so much for your help!
Ilya on 29 Jul 2015
Just convert your cell array into a matrix. Yes, dummy variables will lose their identity in the sense that different levels of a categorical predictor will be treated as different predictors. This is common practice though.
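For illustration, here is a minimal sketch of that conversion. The toy data, the labels y, and the variable names (features, blocks, model) are all made up for this example and are not from the original question. It also uses categorical() where the question uses nominal(), on the assumption that your release's dummyvar() accepts categorical input; if you prefer, keep nominal() as in your own loop.

```matlab
% Hypothetical toy data: 6 observations, 2 categorical features.
features = {'red','small'; 'blue','large'; 'red','large'; ...
            'green','small'; 'blue','small'; 'green','large'};
y = [1; -1; 1; -1; -1; 1];   % made-up binary class labels

% Build one block of dummy variables per feature.
numFeatures = size(features, 2);
blocks = cell(1, numFeatures);
for k = 1:numFeatures
    % categorical() is swapped in here for nominal() from the question.
    blocks{k} = double(dummyvar(categorical(features(:, k))));
end

% Concatenate the blocks side by side into one floating-point matrix.
% Here that gives 6-by-5: 3 colour levels plus 2 size levels.
X = [blocks{:}];

% Each column (one level of one feature) is treated as its own predictor.
model = fitcsvm(X, y);
```

With the real data the same loop would run over all 19 columns, and X would have one column per category level across all features.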