predict

Predict responses for new observations from ECOC incremental learning classification model

Since R2022a

Syntax

label = predict(Mdl,X)

label = predict(Mdl,X,Name=Value)

[label,NegLoss,PBScore] = predict(___)

Description

label = predict(Mdl,X) returns the predicted responses (or labels) label of the observations in the predictor data X from the multiclass error-correcting output codes (ECOC) classification model for incremental learning Mdl.

label = predict(Mdl,X,Name=Value) specifies additional options using one or more name-value arguments. For example, specify ObservationsIn="columns" to indicate that observations in the predictor data are oriented along the columns of X.

[label,NegLoss,PBScore] = predict(___) uses any of the input argument combinations in the previous syntaxes and additionally returns:

- An array of negated average binary losses (NegLoss). For each observation in X, predict assigns the label of the class yielding the largest negated average binary loss (or, equivalently, the smallest average binary loss).
- An array of positive-class scores (PBScore) for the observations classified by each binary learner.

Examples

Predict Class Labels

Create an incremental learning model by converting a traditionally trained ECOC model, and predict class labels using both models.

Load the human activity data set.

load humanactivity

For details on the data set, enter Description at the command line.

Fit a multiclass ECOC classification model to the entire data set.

Mdl = fitcecoc(feat,actid);

Mdl is a ClassificationECOC model object representing a traditionally trained ECOC classification model.

Convert the traditionally trained ECOC classification model to a model for incremental learning.

IncrementalMdl = incrementalLearner(Mdl)

IncrementalMdl = 
  incrementalClassificationECOC

            IsWarm: 1
           Metrics: [1x2 table]
        ClassNames: [1 2 3 4 5]
    ScoreTransform: 'none'
    BinaryLearners: {10x1 cell}
        CodingName: 'onevsone'
          Decoding: 'lossweighted'

IncrementalMdl is an incrementalClassificationECOC model object prepared for incremental learning.

The incrementalLearner function initializes the incremental learner by passing the coding design and model parameters for binary learners to it, along with other information Mdl extracts from the training data. IncrementalMdl is warm (IsWarm is 1), which means that incremental learning functions can track performance metrics and make predictions.

An incremental learner created by converting a traditionally trained model can generate predictions without further processing.

Predict class labels for all observations using both models.

ttlabels = predict(Mdl,feat);            % traditionally trained model
illabels = predict(IncrementalMdl,feat); % incremental learning model
isequal(ttlabels,illabels)

ans = logical
   1

Both models predict the same labels for each observation.

Compute Negated Average Binary Losses

Prepare an incremental ECOC model for predict by fitting the model to a chunk of observations. Compute negated average binary losses for streaming data by using the predict function, and evaluate the model performance using the area under the receiver operating characteristic (ROC) curve, or AUC.

Load the human activity data set. Randomly shuffle the data.

load humanactivity
n = numel(actid);
rng(10) % For reproducibility
idx = randsample(n,n);
X = feat(idx,:);
Y = actid(idx);

For details on the data set, enter Description at the command line.

Create an ECOC model for incremental learning. Specify the class names. Prepare the model for predict by fitting the model to the first 10 observations.

Mdl = incrementalClassificationECOC(ClassNames=unique(Y));
initobs = 10;
Mdl = fit(Mdl,X(1:initobs,:),Y(1:initobs));

Mdl is an incrementalClassificationECOC model. All its properties are read-only. The model is configured to generate predictions.

Simulate a data stream, and perform the following actions on each incoming chunk of 100 observations.

- Call predict to compute negated average binary losses for each observation in the incoming chunk of data. Specify the "lossbased" decoding scheme.
- Call rocmetrics to compute the AUC using the negated average binary losses, and store the AUC value, averaged over all classes. This AUC is an incremental measure of how well the model predicts the activities on average.
- Call fit to fit the model to the incoming chunk. Overwrite the previous incremental model with a new one fitted to the incoming observations.

numObsPerChunk = 100;
nchunk = floor((n - initobs)/numObsPerChunk);
auc = zeros(nchunk,1);

% Incremental learning
for j = 1:nchunk
    ibegin = min(n,numObsPerChunk*(j-1) + 1 + initobs);
    iend   = min(n,numObsPerChunk*j + initobs);
    idx = ibegin:iend;    
    [~,NegLoss] = predict(Mdl,X(idx,:),Decoding="lossbased");  
    mdlROC = rocmetrics(Y(idx),NegLoss,Mdl.ClassNames);
    [~,~,~,auc(j)] = average(mdlROC,"micro");
    Mdl = fit(Mdl,X(idx,:),Y(idx));
end

Mdl is an incrementalClassificationECOC model object trained on all the data in the stream.

Plot the AUC values for each incoming chunk of data.

plot(auc)
xlim([0 nchunk])
ylabel("AUC")
xlabel("Iteration")

[Figure: line plot of AUC (y-axis) versus Iteration (x-axis).]

The plot suggests that the classifier predicts the activities well during incremental learning.

Input Arguments

`Mdl` — ECOC classification model for incremental learning
`incrementalClassificationECOC` model object

ECOC classification model for incremental learning, specified as an incrementalClassificationECOC model object. You can create Mdl by calling incrementalClassificationECOC directly, or by converting a supported, traditionally trained machine learning model using the incrementalLearner function.

You must configure Mdl to predict labels for a batch of observations.

- If Mdl is a converted, traditionally trained model, you can predict labels without any modifications.
- Otherwise, you must fit Mdl to data using fit or updateMetricsAndFit.

`X` — Batch of predictor data
floating-point matrix

Batch of predictor data, specified as a floating-point matrix of n observations and Mdl.NumPredictors predictor variables. The value of the ObservationsIn name-value argument determines the orientation of the variables and observations. The default ObservationsIn value is "rows", which indicates that observations in the predictor data are oriented along the rows of X.

Note

predict supports only floating-point input predictor data. If your input data includes categorical data, you must prepare an encoded version of the categorical data. Use dummyvar to convert each categorical variable to a numeric matrix of dummy variables. Then, concatenate all dummy variable matrices and any other numeric predictors. For more details, see Dummy Variables.

Data Types: single | double
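
As a rough sketch of the encoding step described in the note, the following uses a small made-up data set with one categorical and one numeric predictor. The variable names color, width, D, and Xenc are illustrative assumptions and do not come from the data sets used on this page.

```matlab
% Hypothetical raw predictors: one categorical and one numeric variable.
color = categorical(["red";"green";"blue";"green"]);
width = [1.2; 0.7; 3.1; 2.4];

% dummyvar returns one indicator column per category of color.
D = dummyvar(color);

% Concatenate the dummy variables with the remaining numeric predictors.
Xenc = [D width];
```

Xenc is a floating-point matrix with one row per observation, so it can be passed to fit and predict in place of the raw data, provided that every chunk is encoded with the same categories in the same column order.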

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: BinaryLoss="quadratic",Decoding="lossbased" specifies the quadratic binary learner loss function and the loss-based decoding scheme for aggregating the binary losses.

`BinaryLoss` — Binary learner loss function
`Mdl.BinaryLoss` (default) | `"hamming"` | `"linear"` | `"logit"` | `"exponential"` | `"binodeviance"` | `"hinge"` | `"quadratic"` | function handle

Binary learner loss function, specified as a built-in loss function name or function handle.

This table describes the built-in functions, where y_j is the class label for a particular binary learner (in the set {–1,1,0}), s_j is the score for observation j, and g(y_j,s_j) is the binary loss formula.

| Value | Description | Score Domain | g(y_j,s_j) |
| --- | --- | --- | --- |
| `"binodeviance"` | Binomial deviance | (–∞,∞) | log[1 + exp(–2 y_j s_j)]/[2log(2)] |
| `"exponential"` | Exponential | (–∞,∞) | exp(–y_j s_j)/2 |
| `"hamming"` | Hamming | [0,1] or (–∞,∞) | [1 – sign(y_j s_j)]/2 |
| `"hinge"` | Hinge | (–∞,∞) | max(0,1 – y_j s_j)/2 |
| `"linear"` | Linear | (–∞,∞) | (1 – y_j s_j)/2 |
| `"logit"` | Logistic | (–∞,∞) | log[1 + exp(–y_j s_j)]/[2log(2)] |
| `"quadratic"` | Quadratic | [0,1] | [1 – y_j(2s_j – 1)]²/2 |

The software normalizes binary losses so that the loss is 0.5 when y_j = 0. Also, the software calculates the mean binary loss for each class [1].
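
To make the formulas in the table concrete, the sketch below writes each built-in loss as an anonymous function of the code y and the score s, and then evaluates all of them at y = 0. The struct g and the test score s = 0.3 are illustrative assumptions. The code only mirrors the table and is not the software's internal implementation.

```matlab
% Binary losses from the table, as functions of code y and score s.
g.binodeviance = @(y,s) log(1 + exp(-2*y.*s))/(2*log(2));
g.exponential  = @(y,s) exp(-y.*s)/2;
g.hamming      = @(y,s) (1 - sign(y.*s))/2;
g.hinge        = @(y,s) max(0,1 - y.*s)/2;
g.linear       = @(y,s) (1 - y.*s)/2;
g.logit        = @(y,s) log(1 + exp(-y.*s))/(2*log(2));
g.quadratic    = @(y,s) (1 - y.*(2*s - 1)).^2/2;

% Evaluate every loss at y = 0 for an arbitrary score in [0,1].
s = 0.3;
names = fieldnames(g);
lossAtZero = zeros(numel(names),1);
for i = 1:numel(names)
    f = g.(names{i});
    lossAtZero(i) = f(0,s);
end
```

Every entry of lossAtZero equals 0.5 (up to rounding), which is the normalization stated above.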

For a custom binary loss function, for example customFunction, specify its function handle BinaryLoss=@customFunction.
customFunction has this form:
```
bLoss = customFunction(M,s)
```
- M is the K-by-B coding matrix stored in Mdl.CodingMatrix.
- s is the 1-by-B row vector of classification scores.
- bLoss is the classification loss. This scalar aggregates the binary losses for every learner in a particular class. For example, you can use the mean binary loss to aggregate the loss over the learners for each class.
- K is the number of classes.
- B is the number of binary learners.
For an example of a custom binary loss function, see Predict Test-Sample Labels of ECOC Model Using Custom Binary Loss Function. This example is for a traditionally trained model. You can define a custom loss function for incremental learning as shown in the example.
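
The following is a minimal sketch of a function handle with the form described above. The loss itself (half the mean absolute difference between scores and codes), the coding matrix M, and the scores s are illustrative assumptions. The handle aggregates along each row of M, so it returns a scalar for a single class row and one value per class for the full coding matrix.

```matlab
% Illustrative custom binary loss (an assumption, not a built-in option):
% half the mean absolute difference between scores and codes, per class row.
customBL = @(M,s) mean(abs(s - M),2,"omitnan")/2;

% Hypothetical one-versus-one coding matrix: K = 3 classes, B = 3 learners.
M = [ 1  1  0
     -1  0  1
      0 -1 -1];
s = [0.8 -0.2 0.5];      % hypothetical 1-by-B scores for one observation

bLoss = customBL(M,s);   % one aggregated loss per row (class) of M

% With a fitted incremental model Mdl and predictor data X, the handle
% would be passed as: label = predict(Mdl,X,BinaryLoss=customBL);
```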

For more information, see Binary Loss.

Data Types: char | string | function_handle

`Decoding` — Decoding scheme
`Mdl.Decoding` (default) | `"lossweighted"` | `"lossbased"`

Decoding scheme, specified as "lossweighted" or "lossbased".

The decoding scheme of an ECOC model specifies how the software aggregates the binary losses and determines the predicted class for each observation. The software supports two decoding schemes:

- "lossweighted": The predicted class of an observation corresponds to the class that produces the minimum average of the binary losses over the binary learners for that class (the learners with a nonzero code for the class).
- "lossbased": The predicted class of an observation corresponds to the class that produces the minimum average of the binary losses over all binary learners.

For more information, see Binary Loss.

Example: Decoding="lossbased"

Data Types: char | string

`ObservationsIn` — Predictor data observation dimension
`"rows"` (default) | `"columns"`

Predictor data observation dimension, specified as "rows" or "columns".

Example: ObservationsIn="columns"

Data Types: char | string
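
As a small sketch of how the orientation option might be used, the code below fits an incremental ECOC model to a shuffled subset of the human activity data and predicts the same batch twice, once with observations in rows and once transposed with ObservationsIn="columns". The subset sizes and variable names are illustrative assumptions.

```matlab
load humanactivity                 % provides feat (predictors) and actid (labels)
rng(10)                            % for reproducibility
idx = randsample(numel(actid),600);
X = feat(idx,:);
Y = actid(idx);

% Prepare an incremental ECOC model on the first 500 shuffled observations.
Mdl = incrementalClassificationECOC(ClassNames=unique(actid));
Mdl = fit(Mdl,X(1:500,:),Y(1:500));

% Predict with observations in rows (default) and in columns.
Xnew = X(501:600,:);
labelRows = predict(Mdl,Xnew);
labelCols = predict(Mdl,Xnew',ObservationsIn="columns");

sameLabels = isequal(labelRows,labelCols);
```

Because both calls receive the same observations, sameLabels is expected to be true. Only the layout of the input differs.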

Output Arguments

`label` — Predicted responses (labels)
categorical array | floating-point vector | character array | logical vector | string vector | cell array of character vectors

Predicted responses (labels), returned as a categorical or character array; floating-point, logical, or string vector; or cell array of character vectors with n rows. n is the number of observations in X, and label(j) is the predicted response for observation j.

label has the same data type as the class names stored in Mdl.ClassNames. (The software treats string arrays as cell arrays of character vectors.)

The predict function predicts the classification of an observation by assigning the observation to the class yielding the largest negated average binary loss (or, equivalently, the smallest average binary loss). For an observation with NaN loss values, the function classifies the observation into the majority class, which makes up the largest proportion of the training labels.

`NegLoss` — Negated average binary losses
numeric matrix

Negated average binary losses, returned as an n-by-K numeric matrix. n is the number of observations in X, and K is the number of distinct classes in the training data (numel(Mdl.ClassNames)).

NegLoss(i,k) is the negated average binary loss for classifying observation i into the kth class.

- If Decoding is "lossbased", then NegLoss(i,k) is the negated sum of the binary losses divided by the total number of binary learners.
- If Decoding is "lossweighted", then NegLoss(i,k) is the negated sum of the binary losses divided by the number of binary learners for the kth class.

For more details, see Binary Loss.

`PBScore` — Positive-class scores
numeric matrix

Positive-class scores for each binary learner, returned as an n-by-B numeric matrix. n is the number of observations in X, and B is the number of binary learners (numel(Mdl.BinaryLearners)).
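
The sketch below illustrates how the three outputs relate to one another, using a model fitted to a shuffled subset of the human activity data. The subset sizes and the helper variables sizesMatch, kmax, and agreement are illustrative assumptions.

```matlab
load humanactivity                 % provides feat (predictors) and actid (labels)
rng(10)                            % for reproducibility
idx = randsample(numel(actid),600);
X = feat(idx,:);
Y = actid(idx);

Mdl = incrementalClassificationECOC(ClassNames=unique(actid));
Mdl = fit(Mdl,X(1:500,:),Y(1:500));

[label,NegLoss,PBScore] = predict(Mdl,X(501:600,:));

% Check the documented sizes: n-by-K for NegLoss and n-by-B for PBScore.
n = numel(label);
K = numel(Mdl.ClassNames);
B = numel(Mdl.BinaryLearners);
sizesMatch = isequal(size(NegLoss),[n K]) && isequal(size(PBScore),[n B]);

% Recover labels by taking the class with the largest negated loss.
[~,kmax] = max(NegLoss,[],2);
classNames = Mdl.ClassNames(:);
agreement = mean(label == classNames(kmax));
```

When no observation has NaN losses or tied classes, agreement is expected to be 1, because predict assigns each observation to the class with the largest negated average binary loss.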

More About

Binary Loss

The binary loss is a function of the class and classification score that determines how well a binary learner classifies an observation into the class. The decoding scheme of an ECOC model specifies how the software aggregates the binary losses and determines the predicted class for each observation.

Assume the following:

- m_kj is element (k,j) of the coding design matrix M, that is, the code corresponding to class k of binary learner j. M is a K-by-B matrix, where K is the number of classes, and B is the number of binary learners.
- s_j is the score of binary learner j for an observation.
- g is the binary loss function.
- $\hat{k}$ is the predicted class for the observation.

The software supports two decoding schemes:

- Loss-based decoding [2] (Decoding is "lossbased"): The predicted class of an observation corresponds to the class that produces the minimum average of the binary losses over all binary learners.

  $\hat{k} = \underset{k}{\operatorname{argmin}} \frac{1}{B} \sum_{j=1}^{B} |m_{kj}| \, g(m_{kj}, s_{j}).$

- Loss-weighted decoding [3] (Decoding is "lossweighted"): The predicted class of an observation corresponds to the class that produces the minimum average of the binary losses over the binary learners for the corresponding class.

  $\hat{k} = \underset{k}{\operatorname{argmin}} \frac{\sum_{j=1}^{B} |m_{kj}| \, g(m_{kj}, s_{j})}{\sum_{j=1}^{B} |m_{kj}|}.$

The denominator corresponds to the number of binary learners for class k. [1] suggests that loss-weighted decoding improves classification accuracy by keeping loss values for all classes in the same dynamic range.

The predict, resubPredict, and kfoldPredict functions return the negated value of the objective function of argmin as the second output argument (NegLoss) for each observation and class.
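
The two decoding formulas can be traced by hand on a small case. The sketch below assumes a hypothetical one-versus-one coding matrix for three classes, made-up scores for one observation, and the hinge binary loss. It illustrates the formulas only and is not the software's internal code.

```matlab
% Hypothetical one-versus-one coding matrix: K = 3 classes, B = 3 learners.
M = [ 1  1  0
     -1  0  1
      0 -1 -1];
s = [0.8 -0.2 0.5];                 % hypothetical scores of the B learners
g = @(y,s) max(0,1 - y.*s)/2;       % hinge binary loss

% K-by-B matrix of weighted binary losses |m_kj| * g(m_kj,s_j).
L = abs(M).*g(M,s);

% Negated objective values: these play the role of one row of NegLoss.
negLossBased    = -sum(L,2)/size(M,2);        % divide by B
negLossWeighted = -sum(L,2)./sum(abs(M),2);   % divide by learners for class k

% The predicted class maximizes the negated loss.
[~,kBased]    = max(negLossBased);
[~,kWeighted] = max(negLossWeighted);
```

Here both schemes select class 1. In a one-versus-one design every class takes part in the same number of binary learners, so the two schemes differ only by a constant factor and rank the classes identically. They can disagree for designs in which the number of nonzero codes varies across classes.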

This table summarizes the supported binary loss functions, where y_j is a class label for a particular binary learner (in the set {–1,1,0}), s_j is the score for observation j, and g(y_j,s_j) is the binary loss function.

| Value | Description | Score Domain | g(y_j,s_j) |
| --- | --- | --- | --- |
| `"binodeviance"` | Binomial deviance | (–∞,∞) | log[1 + exp(–2 y_j s_j)]/[2log(2)] |
| `"exponential"` | Exponential | (–∞,∞) | exp(–y_j s_j)/2 |
| `"hamming"` | Hamming | [0,1] or (–∞,∞) | [1 – sign(y_j s_j)]/2 |
| `"hinge"` | Hinge | (–∞,∞) | max(0,1 – y_j s_j)/2 |
| `"linear"` | Linear | (–∞,∞) | (1 – y_j s_j)/2 |
| `"logit"` | Logistic | (–∞,∞) | log[1 + exp(–y_j s_j)]/[2log(2)] |
| `"quadratic"` | Quadratic | [0,1] | [1 – y_j(2s_j – 1)]²/2 |

The software normalizes binary losses so that the loss is 0.5 when y_j = 0, and aggregates using the average of the binary learners [1].

Do not confuse the binary loss with the overall classification loss (specified by the LossFun name-value argument of the loss and predict object functions), which measures how well an ECOC classifier performs as a whole.

Algorithms

Observation Weights

If the prior class probability distribution is known (in other words, the prior distribution is not empirical), predict normalizes observation weights to sum to the prior class probabilities in the respective classes. This action implies that the default observation weights are the respective prior class probabilities.

If the prior class probability distribution is empirical, the software normalizes the specified observation weights to sum to 1 each time you call predict.

References

[1] Allwein, E., R. Schapire, and Y. Singer. “Reducing multiclass to binary: A unifying approach for margin classifiers.” Journal of Machine Learning Research. Vol. 1, 2000, pp. 113–141.

[2] Escalera, S., O. Pujol, and P. Radeva. “Separability of ternary codes for sparse designs of error-correcting output codes.” Pattern Recog. Lett. Vol. 30, Issue 3, 2009, pp. 285–297.

[3] Escalera, S., O. Pujol, and P. Radeva. “On the decoding process in ternary error-correcting output codes.” IEEE Transactions on Pattern Analysis and Machine Intelligence. Vol. 32, Issue 7, 2010, pp. 120–134.

Version History

Introduced in R2022a
