How do I use the new Support Vector Machine Regression model to simulate the response of new predictors?

I'm trying out the new SVM regression capabilities that came with R2015b, following the example from the documentation as closely as possible, but I can't fully get it to work. I want to train an SVM regression model on historical data, then feed it new predictors and simulate the response of the target variable. What I have tried is:
% Fit an SVM regression model to the data in tbl, where all columns are predictors except 'Target', which is the response variable.
mdl = fitrsvm(tbl,'Target','KernelFunction','gaussian','KernelScale','auto','Standardize',true);
% Check that the model converged:
conv = mdl.ConvergenceInfo.Converged
% Use the trained model to predict the response of given predictor data:
YFit = predict(mdl, tbl);
So far everything works fine, and YFit matches the target data fairly well. However, producing this response is of course pointless, since I already have the target data for this data set. What I want to do is give the model new values for the predictors and simulate the response. But the problem appears when I use the same command with a table that contains more predictor data points than the training table:
YFit = predict(mdl, NEWtbl); %(NEWtbl is a time extension of the original tbl)
The fit only works for the part of the table that was used during fitting. As soon as it reaches predictors that the model hasn't already seen, the predicted response becomes a horizontal line.
Which commands am I supposed to use to predict the response of unseen data?

Accepted Answer

Ilya on 6 Sep 2015
This likely means that some variables in the new data have values well outside their ranges in the training data. Think about what the Gaussian kernel means and what response you get from an SVM model when a test point is far away from all points in the training set.
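To make the hint above concrete: a Gaussian-kernel SVM prediction is a bias term plus a weighted sum of kernel values, and each kernel value decays toward zero as the test point moves away from the support vectors. Far from the training data only the bias remains, which shows up as a horizontal line. The following is a minimal sketch on made-up 1-D data (not the asker's table), assuming the Statistics and Machine Learning Toolbox is available; the variable names are hypothetical.

```matlab
% Synthetic 1-D example (made-up data, not the table from the question).
rng(0);                                  % reproducible noise
x = linspace(0, 10, 200)';               % training predictor range: [0, 10]
y = sin(x) + 0.1*randn(size(x));         % noisy response

mdl = fitrsvm(x, y, 'KernelFunction', 'gaussian', ...
    'KernelScale', 'auto', 'Standardize', true);

xInside  = [2; 5; 8];                    % within the training range
xOutside = [50; 100; 200];               % far outside the training range

yInside  = predict(mdl, xInside);        % varies with x
yOutside = predict(mdl, xOutside);       % kernel terms are negligible here

% Compare the far-away predictions with the model's bias term.
disp([yOutside, repmat(mdl.Bias, numel(xOutside), 1)]);
```

If the two displayed columns agree, the flat line in the original plot is simply the bias of the trained model.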
You would have more luck with other models, such as, say, a linear SVM, in the sense that you wouldn't get a constant prediction for points outside the support of the training set. Yet you'd have to be very careful interpreting such predictions. SVM and similar models generally require that new data have the same support as the training data.
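As a rough illustration of the linear alternative, here is a sketch on synthetic data with a roughly linear response (made up for this example, Statistics and Machine Learning Toolbox assumed). The point is only that the predictions keep changing outside the training range; it says nothing about whether those extrapolated values are trustworthy.

```matlab
% Synthetic, roughly linear data (hypothetical, for illustration only).
rng(0);
x = linspace(0, 10, 200)';               % training predictor range: [0, 10]
y = 2*x + 1 + 0.1*randn(size(x));

mdlLin = fitrsvm(x, y, 'KernelFunction', 'linear', 'Standardize', true);

xOutside = [20; 40; 80];                 % well outside the training range
yOutside = predict(mdlLin, xOutside);    % follows the fitted line, not a constant

disp([xOutside, yOutside]);
```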
Ilya on 7 Sep 2015
Normalizing the data could help. Increasing the respective tolerance would make the algorithm run for fewer iterations. The lack of convergence indicates that your data are not described well by a linear model.
The most productive thing you could do is find variables that do not have the same support in the new data as in the training data and figure out what to do about them. This depends on the specifics of your problem and has nothing to do with SVM. Trying various models and hoping that extrapolation outside the training set would give a sensible answer can easily steer you to wrong conclusions.
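One simple way to look for such variables is to compare each predictor's range in the training table with its values in the new table. The sketch below uses small made-up tables standing in for tbl and NEWtbl, with hypothetical predictor names P1 and P2; a min/max check is only a first pass and will not catch every kind of mismatch.

```matlab
% Hypothetical tables standing in for tbl and NEWtbl.
rng(0);
tbl    = table(rand(100,1), 10*rand(100,1), randn(100,1), ...
    'VariableNames', {'P1', 'P2', 'Target'});
NEWtbl = table(rand(50,1), 10 + 10*rand(50,1), ...
    'VariableNames', {'P1', 'P2'});      % P2 has drifted outside [0, 10]

% All columns except the response are predictors.
predictors  = setdiff(tbl.Properties.VariableNames, {'Target'}, 'stable');
fracOutside = zeros(1, numel(predictors));

for k = 1:numel(predictors)
    name = predictors{k};
    lo = min(tbl.(name));                % training range of this predictor
    hi = max(tbl.(name));
    outside = NEWtbl.(name) < lo | NEWtbl.(name) > hi;
    fracOutside(k) = mean(outside);      % share of new rows outside that range
    fprintf('%s: training range [%.3g, %.3g], %.1f%% of new rows outside\n', ...
        name, lo, hi, 100*fracOutside(k));
end
```

Predictors with a large share of new rows outside the training range are the first candidates to examine, for example a time index or a trending quantity in a time-extended table.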
