perObservationLoss
Per observation classification error of model for incremental learning
Since R2022a
Description
Err = perObservationLoss(Mdl,X,Y,Name=Value) specifies additional options using one or more Name=Value arguments.
Examples
Compute per Observation Loss for Incremental Classification Model
Load the human activity data set. Randomly shuffle the data.
load humanactivity
n = numel(actid);
rng(1); % For reproducibility
idx = randsample(n,n);
X = feat(idx,:);
Y = actid(idx);
For details on the data set, enter Description at the command line.
Responses can be one of five classes: Sitting, Standing, Walking, Running, or Dancing. Dichotomize the response by identifying whether the subject is moving (actid > 2).
Y = Y > 2;
Create an incremental linear SVM model for binary classification. Configure it for loss by specifying the class names, prior class distribution (uniform), and arbitrary coefficient and bias values. Specify a metrics window size of 1000 observations.
p = size(X,2);
Beta = randn(p,1);
Bias = randn(1);
Mdl = incrementalClassificationLinear('Beta',Beta,'Bias',Bias,...
    'ClassNames',unique(Y),'Prior','uniform','MetricsWindowSize',1000,'Metrics','classiferror');
Mdl is an incrementalClassificationLinear model. All its properties are read-only.
Set the number of observations in each chunk of the data stream, and preallocate variables to store the performance metrics.
numObsPerChunk = 50;
nchunk = floor(n/numObsPerChunk);
L = zeros(nchunk,1); % To store loss values
PoL = zeros(nchunk,numObsPerChunk); % To store per observation loss values
Simulate a data stream with incoming chunks of 50 observations each. At each iteration:
- Call updateMetricsAndFit to update the performance metrics and fit the model to the incoming data. Overwrite the previous incremental model with the new one.
- Call loss to measure the model performance on the incoming data and perObservationLoss to compute the classification error for each observation in the chunk of data, and store the performance metrics.
for j = 1:nchunk
    ibegin = min(n,numObsPerChunk*(j-1) + 1);
    iend = min(n,numObsPerChunk*j);
    idx = ibegin:iend;
    Mdl = updateMetricsAndFit(Mdl,X(idx,:),Y(idx));
    L(j) = loss(Mdl,X(idx,:),Y(idx));
    PoL(j,:) = perObservationLoss(Mdl,X(idx,:),Y(idx));
end
perObservationLoss computes the loss for each observation in each chunk of data only after the warmup period, that is, after the IsWarm property is 1 (or true). PoL is an nchunk-by-numObsPerChunk matrix, which in this example is a 481-by-50 matrix. Each row corresponds to a chunk of observations in the stream, and each column corresponds to an observation in that chunk. The default warmup period is 1000 observations, which corresponds to 20 chunks of incoming data. Hence, the first 20 rows of PoL contain only NaN values. In contrast, loss computes the classification error for each chunk of data whether the model is warm or not, so L has a loss value for the first 20 chunks as well.
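As a minimal sketch of how a matrix shaped like PoL might be inspected, the following uses a small hypothetical stand-in (4 chunks of 3 observations, with the first two rows mimicking chunks processed before the model is warm) rather than the actual output of the example. The variable names isComputed, firstComputedChunk, and chunkError are illustrative assumptions.

```matlab
% Hypothetical stand-in for PoL: 4 chunks (rows) of 3 observations (columns).
% The first two rows mimic chunks processed during the warmup period.
PoL = [NaN NaN NaN
       NaN NaN NaN
       0   1   0
       1   1   0];

% Rows that contain at least one computed (non-NaN) loss value
isComputed = ~all(isnan(PoL),2);

% Index of the first chunk for which per observation losses were computed
firstComputedChunk = find(isComputed,1);

% Fraction of misclassified observations in each chunk
% (NaN for chunks with no computed losses)
chunkError = mean(PoL,2);
```

With the PoL from the example above in place of the stand-in, the same expressions would show where the NaN rows end and how the unweighted per-chunk misclassification fraction evolves.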
Input Arguments
Mdl
— Incremental learning model
incrementalClassificationKernel model object | incrementalClassificationLinear model object | incrementalClassificationECOC model object | incrementalClassificationNaiveBayes model object
Incremental learning model, specified as an incrementalClassificationKernel, incrementalClassificationLinear, incrementalClassificationECOC, or incrementalClassificationNaiveBayes model object. You can create Mdl directly or by converting a supported, traditionally trained machine learning model using the incrementalLearner function. For more details, see the corresponding reference page.
X
— Chunk of predictor data
floating-point matrix
Chunk of predictor data with which to compute the per observation loss, specified as a floating-point matrix of n observations and Mdl.NumPredictors predictor variables. The value of the ObservationsIn name-value argument determines the orientation of the variables and observations.
The length of the observation labels Y and the number of observations in X must be equal; Y(j) is the label of observation j (row or column) in X.
Note
perObservationLoss supports only floating-point input predictor data. If your input data includes categorical data, you must prepare an encoded version of the categorical data. Use dummyvar to convert each categorical variable to a numeric matrix of dummy variables. Then, concatenate all dummy variable matrices and any other numeric predictors. For more details, see Dummy Variables.
Data Types: single | double
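The encoding step described in the note can be sketched as follows. The predictor values and the variable names numericPredictors, color, and D are hypothetical and chosen only for illustration; dummyvar is called on a categorical vector.

```matlab
% Hypothetical predictors: one numeric matrix and one categorical variable
numericPredictors = [5.1 3.5; 4.9 3.0; 6.2 2.9; 5.8 2.7];
color = categorical(["red"; "green"; "blue"; "green"]);

% One column of 0s and 1s per category of color
D = dummyvar(color);

% Concatenate the numeric predictors and the dummy variables into a single
% floating-point matrix with one row per observation
X = [numericPredictors D];
```

The resulting X has one column for each numeric predictor plus one column for each category, so Mdl.NumPredictors must match that total.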
Y
— Chunk of labels
categorical array | character array | string array | logical vector | cell array of character vectors
Chunk of labels with which to compute the per observation loss, specified as a categorical, character, or string array, logical vector, or cell array of character vectors.
The length of the observation labels Y and the number of observations in X must be equal; Y(j) is the label of observation j (row or column) in X.
For classification problems:
- If Y contains a label that is not a member of Mdl.ClassNames, perObservationLoss issues an error.
- The data type of Y and Mdl.ClassNames must be the same.
Data Types: char | string | cell | categorical | logical
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: ObservationsIn="columns",LossFun="hinge" specifies that the observations are in columns and the loss function is the built-in hinge loss.
ObservationsIn
— Orientation of data in X
"rows" (default) | "columns"
Orientation of data in X, specified as either "rows" or "columns".
Example: ObservationsIn="columns"
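A minimal sketch of the two orientations, assuming a model configured with arbitrary coefficients in the same way as in the example above. The data sizes and the names errRows and errCols are hypothetical. As noted in the example, the returned values can be NaN while the model is not warm, so no specific output is shown.

```matlab
rng(1); % For reproducibility
n = 20;                 % Hypothetical number of observations
p = 3;                  % Hypothetical number of predictors
X = randn(n,p);         % Observations in rows
Y = X(:,1) > 0;         % Logical labels

% Arbitrary coefficient and bias values, mirroring the example above
Mdl = incrementalClassificationLinear('Beta',randn(p,1),'Bias',randn(1),...
    'ClassNames',unique(Y),'Prior','uniform');

% Default orientation: each row of X is an observation
errRows = perObservationLoss(Mdl,X,Y);

% Same data transposed: each column of X' is an observation
errCols = perObservationLoss(Mdl,X',Y,ObservationsIn="columns");
```

Both calls describe the same observations, so each result should contain one value per observation.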
LossFun
— Loss function
"classiferror" | "binodeviance" | "exponential" | "hinge" | "logit" | "quadratic" | "mincost" | function handle
Loss function, specified as a built-in loss function name or function handle.
The following table lists the built-in loss function names.
Name | Description |
---|---|
"binodeviance" | Binomial deviance |
"classiferror" | Misclassification error rate |
"exponential" | Exponential |
"hinge" | Hinge |
"logit" | Logistic |
"quadratic" | Quadratic |
"mincost" | Minimal expected misclassification cost (for incrementalClassificationNaiveBayes only) |
The default is "mincost" for an incrementalClassificationNaiveBayes model object and "classiferror" for other objects.
Note
You can specify only "classiferror" for incrementalClassificationECOC.
To specify a custom loss function, use function handle notation. The function must have this form:
lossval = lossfcn(C,S)
- The output argument lossval is an n-by-1 floating-point vector, where n is the number of observations in X. The value in lossval(j) is the classification loss of observation j.
- You specify the function name (lossfcn).
- C is an n-by-K logical matrix with rows indicating the class to which the corresponding observation belongs. K is the number of distinct classes (numel(Mdl.ClassNames)), and the column order corresponds to the class order in the ClassNames property. Create C by setting C(p,q) = 1, if observation p is in class q, for each observation in the specified data. Set the other elements in row p to 0.
- S is an n-by-K numeric matrix of predicted classification scores. S is similar to the Score output of predict, where rows correspond to observations in the data and the column order corresponds to the class order in the ClassNames property. S(p,q) is the classification score of observation p being classified in class q.
Example: LossFun="logit"
Example: LossFun=@lossfcn
Data Types: char | string | function_handle
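The custom loss form described above can be illustrated with a small sketch. The handle below is a hypothetical per observation misclassification loss, written as an anonymous function and applied to made-up C and S matrices for n = 3 observations and K = 2 classes. It is one possible function of the required form, not the built-in implementation.

```matlab
% Hypothetical custom loss: 1 if the true class does not have the
% largest score in its row, 0 otherwise. Returns an n-by-1 double vector.
lossfcn = @(C,S) double(~any(C & (S == max(S,[],2)),2));

% Made-up inputs: C marks the true class of each observation,
% S holds the predicted scores for each class.
C = logical([1 0; 0 1; 1 0]);
S = [0.8 0.2; 0.6 0.4; 0.1 0.9];

lossval = lossfcn(C,S);   % lossval is [0; 1; 1]
```

A handle of this form can then be supplied as LossFun=lossfcn in a call to perObservationLoss.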
Version History
Introduced in R2022a