kfoldEdge
Classification edge for cross-validated classification model
Description
E = kfoldEdge(CVMdl) returns the classification edge obtained by the cross-validated classification model CVMdl. For every fold, kfoldEdge computes the classification edge for validation-fold observations using a classifier trained on training-fold observations. CVMdl.X and CVMdl.Y contain both sets of observations.
E = kfoldEdge(CVMdl,Name,Value) returns the classification edge with additional options specified by one or more name-value arguments. For example, specify the folds to use or specify to compute the classification edge for each individual fold.
Examples
Estimate k-Fold Edge of Classifier
Compute the k-fold edge for a model trained on Fisher's iris data.
Load Fisher's iris data set.
load fisheriris
Train a classification tree classifier.
tree = fitctree(meas,species);
Cross-validate the classifier using 10-fold cross-validation.
cvtree = crossval(tree);
Compute the k-fold edge.
edge = kfoldEdge(cvtree)
edge = 0.8578
Compute K-Fold Edge of Held-Out Observations
Compute the k-fold edge for an ensemble trained on the Fisher iris data.
Load the sample data set.
load fisheriris
Train an ensemble of 100 boosted classification trees.
t = templateTree('MaxNumSplits',1); % Weak learner template tree object
ens = fitcensemble(meas,species,'Learners',t);
Create a cross-validated ensemble from ens and find the classification edge.
rng(10,'twister') % For reproducibility
cvens = crossval(ens);
E = kfoldEdge(cvens)
E = 3.2033
Input Arguments
CVMdl
— Cross-validated partitioned classifier
ClassificationPartitionedModel object | ClassificationPartitionedEnsemble object | ClassificationPartitionedGAM object
Cross-validated partitioned classifier, specified as a ClassificationPartitionedModel, ClassificationPartitionedEnsemble, or ClassificationPartitionedGAM object. You can create the object in two ways:
Pass a trained classification model listed in the following table to its crossval object function.
Train a classification model using a function listed in the following table and specify one of the cross-validation name-value arguments for the function.
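The two ways of creating the cross-validated object can be sketched as follows. This is a minimal illustration that assumes a classification tree and the Fisher iris data; the variable names cvtree1 and cvtree2 are chosen here for illustration only.

```matlab
load fisheriris

% Way 1: train a model first, then pass it to its crossval object function.
tree = fitctree(meas,species);
rng(1) % For reproducibility of the fold assignment
cvtree1 = crossval(tree);

% Way 2: request cross-validation directly in the training call.
rng(1)
cvtree2 = fitctree(meas,species,'CrossVal','on');

% Either object can be passed to kfoldEdge.
E1 = kfoldEdge(cvtree1)
E2 = kfoldEdge(cvtree2)
```

Both calls use 10 folds unless you specify otherwise.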
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: kfoldEdge(CVMdl,'Folds',[1 2 3 5]) specifies to use the first, second, third, and fifth folds to compute the classification edge, but to exclude the fourth fold.
Folds
— Fold indices to use
1:CVMdl.KFold (default) | positive integer vector
Fold indices to use, specified as a positive integer vector. The elements of Folds must be within the range from 1 to CVMdl.KFold.
The software uses only the folds specified in Folds.
Example: 'Folds',[1 4 10]
Data Types: single | double
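As a rough illustration of the Folds argument, the following sketch compares the edge over all folds with the edge over a subset of folds. The data set, the classifier type, and the variable names Eall and Esub are assumptions made for this example.

```matlab
load fisheriris
rng(1) % For reproducibility
cvtree = fitctree(meas,species,'CrossVal','on');

% Edge averaged over all 10 folds (the default).
Eall = kfoldEdge(cvtree)

% Edge averaged over folds 1, 2, 3, and 5 only.
Esub = kfoldEdge(cvtree,'Folds',[1 2 3 5])
```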
IncludeInteractions
— Flag to include interaction terms
true | false
Flag to include interaction terms of the model, specified as true or false. This argument is valid only for a generalized additive model (GAM). That is, you can specify this argument only when CVMdl is ClassificationPartitionedGAM.
The default value is true if the models in CVMdl (CVMdl.Trained) contain interaction terms. The value must be false if the models do not contain interaction terms.
Example: 'IncludeInteractions',false
Data Types: logical
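A possible use of IncludeInteractions is sketched below. The example assumes the ionosphere sample data set (a binary problem, which GAM classification requires) and an arbitrary choice of five interaction terms; neither comes from this page.

```matlab
load ionosphere % Predictors X and binary labels Y
rng(1) % For reproducibility

% Cross-validated GAM with linear terms and up to 5 interaction terms.
CVMdl = fitcgam(X,Y,'Interactions',5,'CrossVal','on');

% Edge using both linear and interaction terms (default when interactions exist).
Efull = kfoldEdge(CVMdl)

% Edge using only the linear terms.
Elinear = kfoldEdge(CVMdl,'IncludeInteractions',false)
```

Comparing the two values gives a rough indication of how much the interaction terms contribute on held-out data.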
Mode
— Aggregation level for output
'average' (default) | 'individual' | 'cumulative'
Aggregation level for the output, specified as 'average', 'individual', or 'cumulative'.
Value  Description 

'average'  The output is a scalar average over all folds. 
'individual'  The output is a vector of length k containing one value per fold, where k is the number of folds. 
'cumulative'  Note If you want to specify this value,

Example: 'Mode','individual'
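The difference between the 'average' and 'individual' modes can be seen in a short sketch such as the following, which assumes a cross-validated tree on the Fisher iris data. The variable names are illustrative.

```matlab
load fisheriris
rng(1) % For reproducibility
cvtree = fitctree(meas,species,'CrossVal','on');

% One edge value per fold, returned as a column vector.
Efold = kfoldEdge(cvtree,'Mode','individual')

% Single scalar summarizing all folds (the default mode).
Eavg = kfoldEdge(cvtree)
```

Inspecting the per-fold values shows how much the edge varies from one validation fold to another.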
Output Arguments
E
— Classification edge
numeric scalar | numeric column vector
Classification edge, returned as a numeric scalar or numeric column vector.
If Mode is 'average', then E is the average classification edge over all folds.
If Mode is 'individual', then E is a k-by-1 numeric column vector containing the classification edge for each fold, where k is the number of folds.
If Mode is 'cumulative' and CVMdl is ClassificationPartitionedEnsemble, then E is a min(CVMdl.NumTrainedPerFold)-by-1 numeric column vector. Each element j is the average classification edge over all folds that the function obtains by using ensembles trained with weak learners 1:j.
If Mode is 'cumulative' and CVMdl is ClassificationPartitionedGAM, then the output value depends on the IncludeInteractions value.
If IncludeInteractions is false, then E is a (1 + min(NumTrainedPerFold.PredictorTrees))-by-1 numeric column vector. The first element of E is the average classification edge over all folds that is obtained using only the intercept (constant) term. The (j + 1)th element of E is the average edge obtained using the intercept term and the first j predictor trees per linear term.
If IncludeInteractions is true, then E is a (1 + min(NumTrainedPerFold.InteractionTrees))-by-1 numeric column vector. The first element of E is the average classification edge over all folds that is obtained using the intercept (constant) term and all predictor trees per linear term. The (j + 1)th element of E is the average edge obtained using the intercept term, all predictor trees per linear term, and the first j interaction trees per interaction term.
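For an ensemble, the 'cumulative' output can be used to see how the edge changes as weak learners are added. The sketch below reuses the boosted-stump ensemble from the Examples section; the plot labels are choices made for this illustration.

```matlab
load fisheriris
t = templateTree('MaxNumSplits',1); % Weak learner template
ens = fitcensemble(meas,species,'Learners',t);
rng(10,'twister') % For reproducibility
cvens = crossval(ens);

% Element j is the average edge using weak learners 1:j in every fold.
Ecum = kfoldEdge(cvens,'Mode','cumulative');

plot(Ecum)
xlabel('Number of weak learners')
ylabel('Cumulative k-fold edge')
```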
More About
Classification Edge
The classification edge is the weighted mean of the classification margins.
One way to choose among multiple classifiers, for example to perform feature selection, is to choose the classifier that yields the greatest edge.
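A simple form of this comparison is sketched below. It assumes two candidate predictor sets from the Fisher iris data (all four measurements versus the two petal measurements) and uses one shared partition so that both classifiers are evaluated on the same folds. The variable names are illustrative.

```matlab
load fisheriris
rng(1) % For reproducibility
c = cvpartition(species,'KFold',10); % Shared partition for a fair comparison

% Candidate 1: all four predictors.
cvAll = fitctree(meas,species,'CVPartition',c);

% Candidate 2: petal length and petal width only.
cvPetal = fitctree(meas(:,3:4),species,'CVPartition',c);

EAll = kfoldEdge(cvAll)
EPetal = kfoldEdge(cvPetal)
```

If the smaller predictor set yields a similar or greater edge, it may be a reasonable choice, although the edge is only one of several criteria you can use.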
Classification Margin
The classification margin for binary classification is, for each observation, the difference between the classification score for the true class and the classification score for the false class. The classification margin for multiclass classification is the difference between the classification score for the true class and the maximal score for the false classes.
If the margins are on the same scale (that is, the score values are based on the same score transformation), then they serve as a classification confidence measure. Among multiple classifiers, those that yield greater margins are better.
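The definitions of margin and edge can be worked through by hand on a small example. The score matrix, true class indices, and weights below are made-up values for illustration, and the code does not call any toolbox function.

```matlab
% Hypothetical classification scores: 4 observations (rows), 3 classes (columns).
scores = [0.7 0.2 0.1;
          0.1 0.6 0.3;
          0.3 0.3 0.4;
          0.5 0.4 0.1];
trueClass = [1; 2; 1; 2];      % Column index of the true class per observation
w = [0.25; 0.25; 0.25; 0.25];  % Observation weights

n = size(scores,1);
idx = sub2ind(size(scores),(1:n)',trueClass);
trueScore = scores(idx);       % Score for the true class

falseScores = scores;
falseScores(idx) = -Inf;       % Mask the true class
maxFalse = max(falseScores,[],2); % Maximal score among the false classes

margin = trueScore - maxFalse  % [0.5; 0.3; -0.1; -0.1]
E = sum(w.*margin)/sum(w)      % Weighted mean of the margins: 0.15
```

The two negative margins correspond to observations for which a false class receives a higher score than the true class.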
Algorithms
kfoldEdge computes the classification edge as described in the corresponding edge object function. For a model-specific description, see the appropriate edge function reference page in the following table.
Model Type  edge Function 

Discriminant analysis classifier  edge 
Ensemble classifier  edge 
Generalized additive model classifier  edge 
k-nearest neighbor classifier  edge 
Naive Bayes classifier  edge 
Neural network classifier  edge 
Support vector machine classifier  edge 
Binary decision tree for multiclass classification  edge 
Extended Capabilities
GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.
Usage notes and limitations:
This function fully supports GPU arrays for the following cross-validated model objects:
Ensemble classifier trained with fitcensemble
k-nearest neighbor classifier trained with fitcknn
Support vector machine classifier trained with fitcsvm
Binary decision tree for multiclass classification trained with fitctree
Neural network for classification trained with fitcnet
For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
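A GPU workflow might look like the following sketch. It assumes that Parallel Computing Toolbox and a supported GPU are available and that your release accepts GPU array predictors in fitctree and crossval. The canUseGPU guard lets the script run on machines without a GPU.

```matlab
load fisheriris
if canUseGPU
    Xgpu = gpuArray(meas);          % Move the predictor data to the GPU
    tree = fitctree(Xgpu,species);  % Train on GPU array predictors
    rng(1) % For reproducibility
    cvtree = crossval(tree);
    E = gather(kfoldEdge(cvtree))   % Bring the result back to host memory
else
    disp('No supported GPU found; skipping the GPU example.')
end
```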
Version History
Introduced in R2011a
R2024b: Specify GPU arrays for neural network models (requires Parallel Computing Toolbox)
kfoldEdge fully supports GPU arrays for ClassificationPartitionedModel models trained using fitcnet.
R2023b: Observations with missing predictor values are used in resubstitution and cross-validation computations
Starting in R2023b, the following classification model object functions use observations with missing predictor values as part of resubstitution ("resub") and cross-validation ("kfold") computations for classification edges, losses, margins, and predictions.
In previous releases, the software omitted observations with missing predictor values from the resubstitution and cross-validation computations.
R2022a: kfoldEdge returns a different value for cross-validated SVM and ensemble classifiers with a nondefault cost matrix
If you specify a nondefault cost matrix when you cross-validate the input model object for an SVM or ensemble classification model, the kfoldEdge function returns a different value compared to previous releases.
The kfoldEdge function uses the observation weights stored in the W property. The way the function uses the W property value has not changed. However, the property value stored in the input model object has changed for cross-validated SVM and ensemble model objects with a nondefault cost matrix, so the function can return a different value.
For details about the property value change, see Cost property stores the user-specified cost matrix (cross-validated SVM classifier) or Cost property stores the user-specified cost matrix (cross-validated ensemble classifier).
If you want the software to handle the cost matrix, prior probabilities, and observation weights in the same way as in previous releases, adjust the prior probabilities and observation weights for the nondefault cost matrix, as described in Adjust Prior Probabilities and Observation Weights for Misclassification Cost Matrix. Then, when you train a classification model, specify the adjusted prior probabilities and observation weights by using the Prior and Weights name-value arguments, respectively, and use the default cost matrix.
See Also
kfoldPredict | kfoldMargin | kfoldLoss | kfoldfun | ClassificationPartitionedModel