# resubPredict

Classify the training observations stored in a support vector machine (SVM) classifier

## Syntax

`label = resubPredict(SVMModel)`

`[label,Score] = resubPredict(SVMModel)`

## Description

`label = resubPredict(SVMModel)` returns a vector of predicted class labels (`label`) for the trained support vector machine (SVM) classifier `SVMModel` using the predictor data `SVMModel.X`.

`[label,Score] = resubPredict(SVMModel)` additionally returns class likelihood measures, either scores or posterior probabilities.

## Examples

Load the `ionosphere` data set.

`load ionosphere`

Train an SVM classifier. Standardize the data and specify that `'g'` is the positive class.

`SVMModel = fitcsvm(X,Y,'ClassNames',{'b','g'},'Standardize',true);`

`SVMModel` is a `ClassificationSVM` classifier.

Predict the training sample labels and scores. Display the results for the first 10 observations.

```
[label,score] = resubPredict(SVMModel);
table(Y(1:10),label(1:10),score(1:10,2),'VariableNames',...
    {'TrueLabel','PredictedLabel','Score'})
```

```
ans=10×3 table
    TrueLabel    PredictedLabel     Score 
    _________    ______________    _______

      {'g'}          {'g'}          1.4861
      {'b'}          {'b'}         -1.0004
      {'g'}          {'g'}          1.8685
      {'b'}          {'b'}         -2.6458
      {'g'}          {'g'}          1.2805
      {'b'}          {'b'}         -1.4617
      {'g'}          {'g'}          2.1672
      {'b'}          {'b'}         -5.7085
      {'g'}          {'g'}          2.4797
      {'b'}          {'b'}         -2.7811
```

Fit the optimal score-to-posterior-probability transformation function.

```
rng(1); % For reproducibility
ScoreSVMModel = fitPosterior(SVMModel)
```

```
ScoreSVMModel = 
  ClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b'  'g'}
           ScoreTransform: '@(S)sigmoid(S,-9.479889e-01,-1.220433e-01)'
          NumObservations: 351
                    Alpha: [90x1 double]
                     Bias: -0.1342
         KernelParameters: [1x1 struct]
                       Mu: [1x34 double]
                    Sigma: [1x34 double]
           BoxConstraints: [351x1 double]
          ConvergenceInfo: [1x1 struct]
          IsSupportVector: [351x1 logical]
                   Solver: 'SMO'

  Properties, Methods
```

Because the classes are inseparable, the score transformation function (`ScoreSVMModel.ScoreTransform`) is the sigmoid function.

Estimate scores and positive class posterior probabilities for the training data. Display the results for the first 10 observations.

```
[label,scores] = resubPredict(SVMModel);
[~,postProbs] = resubPredict(ScoreSVMModel);
table(Y(1:10),label(1:10),scores(1:10,2),postProbs(1:10,2),'VariableNames',...
    {'TrueLabel','PredictedLabel','Score','PosteriorProbability'})
```

```
ans=10×4 table
    TrueLabel    PredictedLabel     Score     PosteriorProbability
    _________    ______________    _______    ____________________

      {'g'}          {'g'}          1.4861           0.82213      
      {'b'}          {'b'}         -1.0002           0.30446      
      {'g'}          {'g'}          1.8684           0.86913      
      {'b'}          {'b'}         -2.6459          0.084225      
      {'g'}          {'g'}          1.2805           0.79183      
      {'b'}          {'b'}         -1.4615           0.22039      
      {'g'}          {'g'}          2.1672           0.89812      
      {'b'}          {'b'}         -5.7081         0.0050204      
      {'g'}          {'g'}          2.4796            0.9222      
      {'b'}          {'b'}         -2.7809          0.074869      
```

## Input Arguments

`SVMModel`: Full, trained SVM classifier, specified as a `ClassificationSVM` model trained with `fitcsvm`.

## Output Arguments

`label`: Predicted class labels, returned as a categorical or character array, logical or numeric vector, or cell array of character vectors.

The predicted class labels have the following:

• Same data type as the observed class labels (`SVMModel.Y`)

• Length equal to the number of rows in `SVMModel.X`

For one-class learning, `label` contains the one class represented in `SVMModel.Y`.

`Score`: Predicted class scores or posterior probabilities, returned as a numeric column vector or numeric matrix.

• For one-class learning, `Score` is a column vector with the same number of rows as `SVMModel.X`. The elements are the positive class scores for the corresponding observations. You cannot obtain posterior probabilities for one-class learning.

• For two-class learning, `Score` is a two-column matrix with the same number of rows as `SVMModel.X`.

  • If you fit the optimal score-to-posterior-probability transformation function using `fitPosterior` or `fitSVMPosterior`, then `Score` contains class posterior probabilities. That is, if the value of `SVMModel.ScoreTransform` is not `none`, then the first and second columns of `Score` contain the negative class (`SVMModel.ClassNames{1}`) and positive class (`SVMModel.ClassNames{2}`) posterior probabilities for the corresponding observations, respectively.

  • Otherwise, the first column contains the negative class scores and the second column contains the positive class scores for the corresponding observations.

If `SVMModel.KernelParameters.Function` is `'linear'`, then the classification score for the observation x is

`$f\left(x\right)={\left(x/s\right)}^{\prime }\beta +b.$`

`SVMModel` stores β, b, and s in the properties `Beta`, `Bias`, and `KernelParameters.Scale`, respectively.

To estimate classification scores manually, you must first apply any transformations to the predictor data that were applied during training. Specifically, if you specify `'Standardize',true` when using `fitcsvm`, then you must standardize the predictor data manually by using the mean `SVMModel.Mu` and standard deviation `SVMModel.Sigma`, and then divide the result by the kernel scale in `SVMModel.KernelParameters.Scale`.

All SVM functions, such as `resubPredict` and `predict`, apply any required transformation before estimation.

If `SVMModel.KernelParameters.Function` is not `'linear'`, then `Beta` is empty (`[]`).
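The manual score computation described above can be sketched as follows. This is a minimal illustration, not part of the original reference page. It assumes the `fisheriris` sample data set (variables `meas` and `species`) is available, and the names `IrisModel`, `Xstd`, and `fManual` are chosen here for illustration only.

```matlab
% Assumption: the fisheriris sample data set ships with the toolbox.
load fisheriris
inds = ~strcmp(species,'setosa');   % keep two classes only
X = meas(inds,:);
Y = species(inds);

% The linear kernel is the default for two-class learning.
IrisModel = fitcsvm(X,Y,'Standardize',true);
[~,score] = resubPredict(IrisModel);

% Standardize with the stored mean and standard deviation, divide by the
% kernel scale, and then apply f(x) = (x/s)'*beta + b.
% Implicit expansion (R2016b or later) broadcasts Mu and Sigma across rows.
Xstd = (X - IrisModel.Mu)./IrisModel.Sigma;
s = IrisModel.KernelParameters.Scale;
fManual = (Xstd/s)*IrisModel.Beta + IrisModel.Bias;

% Largest discrepancy against the positive class scores from resubPredict;
% it should be close to zero, up to floating-point rounding.
maxAbsDiff = max(abs(fManual - score(:,2)))
```

In practice, calling `resubPredict` or `predict` directly is the safer route, because those functions apply the stored transformations for you.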

## More About

### Classification Score

The SVM classification score for classifying observation x is the signed distance from x to the decision boundary ranging from -∞ to +∞. A positive score for a class indicates that x is predicted to be in that class. A negative score indicates otherwise.

The positive class classification score $f\left(x\right)$ is the trained SVM classification function. $f\left(x\right)$ is also the numerical, predicted response for x, or the score for predicting x into the positive class.

`$f\left(x\right)=\sum _{j=1}^{n}{\alpha }_{j}{y}_{j}G\left({x}_{j},x\right)+b,$`

where $\left({\alpha }_{1},...,{\alpha }_{n},b\right)$ are the estimated SVM parameters, $G\left({x}_{j},x\right)$ is the dot product in the predictor space between x and the support vectors, and the sum includes the training set observations. The negative class classification score for x, or the score for predicting x into the negative class, is –f(x).

If $G\left({x}_{j},x\right)={x}_{j}^{\prime }x$ (the linear kernel), then the score function reduces to

`$f\left(x\right)={\left(x/s\right)}^{\prime }\beta +b.$`

s is the kernel scale and β is the vector of fitted linear coefficients.
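As a rough check of the two forms of the score function, the sketch below evaluates the sum over support vectors and the reduced linear form on the same data and compares both with the scores from `resubPredict`. It assumes the `fisheriris` sample data set and a linear kernel without standardization. The variable names (`IrisModel`, `fDual`, `fPrimal`) are illustrative. Only support vectors have nonzero α, so the sum runs over the rows of `SupportVectors`.

```matlab
% Assumption: the fisheriris sample data set ships with the toolbox.
load fisheriris
inds = ~strcmp(species,'setosa');   % keep two classes only
X = meas(inds,:);
Y = species(inds);

% Linear kernel, no standardization; the kernel scale defaults to 1.
IrisModel = fitcsvm(X,Y);
[~,score] = resubPredict(IrisModel);
s = IrisModel.KernelParameters.Scale;

% Dual form: sum over support vectors of alpha_j*y_j*G(x_j,x), plus the bias.
% For the linear kernel, G is the dot product of the scaled predictors.
G = (X/s)*(IrisModel.SupportVectors/s)';
fDual = G*(IrisModel.Alpha.*IrisModel.SupportVectorLabels) + IrisModel.Bias;

% Reduced linear form: f(x) = (x/s)'*beta + b.
fPrimal = (X/s)*IrisModel.Beta + IrisModel.Bias;

% Both discrepancies should be close to zero, up to rounding.
dualVsPredict = max(abs(fDual - score(:,2)))
dualVsPrimal  = max(abs(fDual - fPrimal))
```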

For more details, see Understanding Support Vector Machines.

### Posterior Probability

The posterior probability is the probability that an observation belongs in a particular class, given the data.

For SVM, the posterior probability that observation j is in class k ∈ {–1,1} is a function $P\left({s}_{j}\right)$ of the observation's score ${s}_{j}$.

• For separable classes, the posterior probability is the step function

`$P\left({s}_{j}\right)=\left\{\begin{array}{ll}0;& {s}_{j}<\underset{{y}_{k}=-1}{\mathrm{max}}{s}_{k}\\ \pi ;& \underset{{y}_{k}=-1}{\mathrm{max}}{s}_{k}\le {s}_{j}\le \underset{{y}_{k}=+1}{\mathrm{min}}{s}_{k}\\ 1;& {s}_{j}>\underset{{y}_{k}=+1}{\mathrm{min}}{s}_{k}\end{array}\right.,$`

  where:

  • sj is the score of observation j.

  • +1 and –1 denote the positive and negative classes, respectively.

  • π is the prior probability that an observation is in the positive class.

• For inseparable classes, the posterior probability is the sigmoid function

`$P\left({s}_{j}\right)=\frac{1}{1+\mathrm{exp}\left(A{s}_{j}+B\right)},$`

where the parameters A and B are the slope and intercept parameters, respectively.
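To make the sigmoid concrete, the following sketch plugs in the slope and intercept shown in the `ScoreTransform` display of the example above (A = -9.479889e-01, B = -1.220433e-01) together with the first three scores from the four-column results table. The variable names are illustrative. The scores are copied from the table at four-decimal precision, so the results agree with the `PosteriorProbability` column only up to rounding.

```matlab
% Sigmoid parameters copied from the ScoreTransform display in the example.
A = -9.479889e-01;   % slope
B = -1.220433e-01;   % intercept

% First three positive class scores from the four-column results table.
s = [1.4861; -1.0002; 1.8684];

% P(s_j) = 1/(1 + exp(A*s_j + B)), evaluated elementwise.
P = 1./(1 + exp(A*s + B))
```

Because A is negative, larger positive class scores map to posterior probabilities closer to 1.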

### Prior Probability

The prior probability of a class is the believed relative frequency with which observations from that class occur in a population.

## Tips

• If you are using a linear SVM model for classification and the model has many support vectors, then using `resubPredict` for the prediction method can be slow. To efficiently classify observations based on a linear SVM model, remove the support vectors from the model object by using `discardSupportVectors`.
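A minimal sketch of this tip, assuming the `fisheriris` sample data set and illustrative variable names (`IrisModel`, `LeanModel`): after the support vectors are discarded, prediction relies on the linear coefficients in `Beta`, and the predicted labels are expected to match those from the original model.

```matlab
% Assumption: the fisheriris sample data set ships with the toolbox.
load fisheriris
inds = ~strcmp(species,'setosa');   % keep two classes only
X = meas(inds,:);
Y = species(inds);

% Train a linear SVM and record its resubstitution labels.
IrisModel = fitcsvm(X,Y,'KernelFunction','linear');
labelFull = resubPredict(IrisModel);

% Remove the support vectors and related properties from the model.
LeanModel = discardSupportVectors(IrisModel);

% Classify the same observations with the reduced model.
labelLean = predict(LeanModel,X);
sameLabels = isequal(labelFull,labelLean)
```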

## Algorithms

• By default and irrespective of the model kernel function, MATLAB® uses the dual representation of the score function to classify observations based on trained SVM models, specifically

`$\stackrel{^}{f}\left(x\right)=\sum _{j=1}^{n}{\stackrel{^}{\alpha }}_{j}{y}_{j}G\left(x,{x}_{j}\right)+\stackrel{^}{b}.$`

This prediction method requires the trained support vectors and α coefficients (see the `SupportVectors` and `Alpha` properties of the SVM model).

• By default, the software computes optimal posterior probabilities using Platt’s method [1]:

1. Perform 10-fold cross-validation.

2. Fit the sigmoid function parameters to the scores returned from the cross-validation.

3. Estimate the posterior probabilities by entering the cross-validation scores into the fitted sigmoid function.

• The software incorporates prior probabilities in the SVM objective function during training.

• For SVM, `predict` and `resubPredict` classify observations into the class yielding the largest score (the largest posterior probability). The software accounts for misclassification costs by applying the average-cost correction before training the classifier. That is, given the class prior vector P, misclassification cost matrix C, and observation weight vector w, the software defines a new vector of observation weights (W) such that

`${W}_{j}={w}_{j}{P}_{j}\sum _{k=1}^{K}{C}_{jk}.$`
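The three posterior-probability steps listed earlier in this section (cross-validate, fit the sigmoid, evaluate it) can be sketched by hand as follows. This is an illustration under stated assumptions and not the routine that `fitPosterior` runs internally: it uses plain maximum likelihood with `fminsearch`, whereas Platt's paper uses regularized targets and a dedicated optimizer. The `fisheriris` data set and all variable names are assumptions made for the example.

```matlab
% Assumption: the fisheriris sample data set ships with the toolbox.
load fisheriris
inds = ~strcmp(species,'setosa');   % keep two classes only
X = meas(inds,:);
Y = species(inds);

rng(1);                             % for reproducible fold assignment
IrisModel = fitcsvm(X,Y,'Standardize',true);

% Step 1: 10-fold cross-validation (the crossval default).
CVModel = crossval(IrisModel);
[~,cvScore] = kfoldPredict(CVModel);
s = cvScore(:,2);                                 % out-of-fold positive class scores
y = double(strcmp(Y,IrisModel.ClassNames{2}));    % 1 = positive class, 0 = negative

% Step 2: fit slope theta(1) and intercept theta(2) of
% P(s) = 1/(1 + exp(A*s + B)) by minimizing the negative log-likelihood.
sigmoid = @(theta,s) 1./(1 + exp(theta(1)*s + theta(2)));
clampP  = @(p) min(max(p,eps),1 - eps);           % guard against log(0)
nll = @(theta) -sum(y.*log(clampP(sigmoid(theta,s))) + ...
    (1 - y).*log(1 - clampP(sigmoid(theta,s))));
theta = fminsearch(nll,[-1; 0]);

% Step 3: posterior probabilities from the cross-validation scores.
postProb = sigmoid(theta,s);
```

The fitted values in `theta` will generally differ from the parameters that `fitPosterior` reports, because the fitting criterion and the fold assignment are not the same.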

## References

[1] Platt, J. “Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods.” Advances in Large Margin Classifiers. MIT Press, 1999, pp. 61–74.

