crossval

Cross-validated decision tree

Description

cvmodel = crossval(model) creates a partitioned model from model, a fitted regression tree. By default, crossval uses 10-fold cross-validation on the training data to create cvmodel.

cvmodel = crossval(model,Name=Value) creates a partitioned model with additional options specified by a single name-value argument.

Examples

Load the carsmall data set. Consider Acceleration, Displacement, Horsepower, and Weight as predictor variables.

load carsmall
X = [Acceleration Displacement Horsepower Weight];

Grow a regression tree using the entire data set.

Mdl = fitrtree(X,MPG);

Mdl is a RegressionTree model.

Cross-validate the regression tree using 10-fold cross-validation.

CVMdl = crossval(Mdl);

CVMdl is a RegressionPartitionedModel cross-validated model. crossval stores the ten trained, compact regression trees in the Trained property of CVMdl.

Display the compact regression tree that crossval trained using all observations except those in the first fold.

CVMdl.Trained{1}
ans = 
  CompactRegressionTree
           PredictorNames: {'x1'  'x2'  'x3'  'x4'}
             ResponseName: 'Y'
    CategoricalPredictors: []
        ResponseTransform: 'none'


Estimate the generalization error of Mdl by computing the 10-fold cross-validated mean squared error.

L = kfoldLoss(CVMdl)
L = 23.5706

Input Arguments

model: Regression tree, specified as a RegressionTree object created by the fitrtree function.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: cvmodel = crossval(model,Holdout=0.5) performs holdout validation using 50% of the data.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: cvmodel = crossval(model,"Holdout",0.5) performs holdout validation using 50% of the data.

CVPartition: Partition for the cross-validated tree, specified as a cvpartition object.

Use only one of these four name-value arguments at a time: CVPartition, Holdout, KFold, or Leaveout.
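As a sketch of how a custom partition might be passed in, the following builds a 5-fold cvpartition and hands it to crossval. It assumes the carsmall workflow from the Examples section; the variable names Mdl, c, and CVMdl and the choice of five folds are illustrative. It also assumes that the partition should cover the same number of observations the tree was trained on, which is why it is sized from Mdl.NumObservations.

```matlab
load carsmall
X = [Acceleration Displacement Horsepower Weight];
Mdl = fitrtree(X,MPG);

rng(1)  % fix the random seed so the fold assignment is repeatable

% Size the partition from the fitted model rather than from numel(MPG):
% MPG contains missing values, and fitrtree is expected to exclude
% those rows from training.
c = cvpartition(Mdl.NumObservations,"KFold",5);

CVMdl = crossval(Mdl,CVPartition=c);
L = kfoldLoss(CVMdl)   % cross-validated mean squared error
```

Reusing the same cvpartition object across several models is one way to compare them on identical folds.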

Holdout: Fraction of data used for holdout validation, specified as a numeric scalar in the range (0,1). Holdout validation tests the specified fraction of the data, and uses the rest of the data for training.

Use only one of these four name-value arguments at a time: CVPartition, Holdout, KFold, or Leaveout.

Example: Holdout=0.1

Data Types: single | double
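A minimal sketch of holdout validation, assuming the carsmall workflow from the Examples section (the 20% fraction and the variable names are illustrative). With holdout, crossval trains a single compact tree on the retained data, so the loss is computed on the held-out observations only.

```matlab
load carsmall
X = [Acceleration Displacement Horsepower Weight];
Mdl = fitrtree(X,MPG);

rng(1)  % the held-out observations are chosen at random
CVMdl = crossval(Mdl,Holdout=0.2);

numTrees = numel(CVMdl.Trained)   % one tree, trained on roughly 80% of the data
L = kfoldLoss(CVMdl)              % mean squared error on the held-out 20%
```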

KFold: Number of folds to use in a cross-validated tree, specified as a positive integer.

Use only one of these four name-value arguments at a time: CVPartition, Holdout, KFold, or Leaveout.

Example: KFold=8
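For illustration, the following sketch overrides the default of 10 folds, again assuming the carsmall workflow from the Examples section (variable names are illustrative). Each element of Trained is a compact tree fit with one fold left out.

```matlab
load carsmall
X = [Acceleration Displacement Horsepower Weight];
Mdl = fitrtree(X,MPG);

rng(1)  % fold assignment is random
CVMdl = crossval(Mdl,KFold=8);

numTrees = numel(CVMdl.Trained)   % one compact tree per fold
L = kfoldLoss(CVMdl)              % loss averaged over the 8 folds
```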

Leaveout: Leave-one-out cross-validation flag, specified as "on" or "off". Use leave-one-out cross-validation by specifying the value "on".

Use only one of these four name-value arguments at a time: CVPartition, Holdout, KFold, or Leaveout.

Example: Leaveout="on"

Data Types: char | string
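A short sketch of leave-one-out cross-validation, assuming the carsmall workflow from the Examples section (variable names are illustrative). Because one tree is trained per observation, this option can be noticeably slower than k-fold on larger data sets.

```matlab
load carsmall
X = [Acceleration Displacement Horsepower Weight];
Mdl = fitrtree(X,MPG);

CVMdl = crossval(Mdl,Leaveout="on");

% Each tree is trained on all observations except one, so the number of
% trained trees is expected to match the number of training observations.
numTrees = numel(CVMdl.Trained)
L = kfoldLoss(CVMdl)
```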

Output Arguments

cvmodel: Partitioned regression model, returned as a RegressionPartitionedModel object.

Tips

  • Assess the predictive performance of model on cross-validated data using the "kfold" functions and properties of cvmodel, such as kfoldLoss.
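As one possible way to apply this tip, the sketch below uses kfoldPredict and kfoldLoss on a cross-validated model. It assumes the carsmall workflow from the Examples section; the variable names and the use of Mode="individual" to inspect per-fold losses are illustrative choices.

```matlab
load carsmall
X = [Acceleration Displacement Horsepower Weight];
Mdl = fitrtree(X,MPG);

rng(1)  % fold assignment is random
CVMdl = crossval(Mdl);

% Out-of-fold predictions: each response is predicted by the tree that
% did not see that observation during training.
yhat = kfoldPredict(CVMdl);

% Per-fold mean squared errors, and their average over all folds.
lossPerFold = kfoldLoss(CVMdl,Mode="individual");
lossAverage = kfoldLoss(CVMdl);
```

A large spread in lossPerFold can indicate that the error estimate is sensitive to how the data happens to be split.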

Alternatives

You can create a cross-validated tree directly from the data, instead of first fitting a regression tree and then cross-validating it with crossval. To do so, include one of these five name-value arguments in the call to fitrtree: CrossVal, KFold, Holdout, Leaveout, or CVPartition.
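The following sketch illustrates this alternative, assuming the carsmall predictors from the Examples section (the variable names and the choice of five folds are illustrative). In both calls, fitrtree returns a partitioned model directly, so no separate call to crossval is needed.

```matlab
load carsmall
X = [Acceleration Displacement Horsepower Weight];

rng(1)  % fold assignment is random

% Request 5-fold cross-validation while fitting.
CVMdl5 = fitrtree(X,MPG,KFold=5);

% Request cross-validation with the default number of folds.
CVMdlDefault = fitrtree(X,MPG,CrossVal="on");

L5 = kfoldLoss(CVMdl5)
LDefault = kfoldLoss(CVMdlDefault)
```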

Version History

Introduced in R2011a
