Growing Decision Trees
Start with all input data, and examine all possible binary splits on every predictor.
Select a split with best optimization criterion.
A split might lead to a child node having too few observations (less than the
MinLeafSizeparameter). To avoid this, the software chooses a split that yields the best optimization criterion subject to the
Impose the split.
Repeat recursively for the two child nodes.
The explanation requires two more items: description of the optimization criterion and stopping rule.
Stopping rule: Stop splitting when any of the following hold:
The node is pure.
For classification, a node is pure if it contains only observations of one class.
For regression, a node is pure if the mean squared error (MSE) for the observed response in this node drops below the MSE for the observed response in the entire data multiplied by the tolerance on quadratic error per node (
There are fewer than
MinParentSizeobservations in this node.
Any split imposed on this node produces children with fewer than
The algorithm splits
Regression: mean-squared error (MSE). Choose a split to minimize the MSE of predictions compared to the training data.
Classification: One of three measures, depending on the setting of the
'gdi'(Gini's diversity index, the default)
For alternative split predictor selection techniques, see Choose Split Predictor Selection Technique.
For a continuous predictor, a tree can split halfway between any two adjacent unique values found for this predictor. For a categorical predictor with L levels, a classification tree needs to consider 2L–1–1 splits to find the optimal split. Alternatively, you can choose a heuristic algorithm to find a good split, as described in Splitting Categorical Predictors in Classification Trees.
For dual-core systems and above,
fitrtree parallelize training decision
trees using Intel® Threading Building Blocks (TBB). For details on Intel TBB, see https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onetbb.html.
 Breiman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Boca Raton, FL: Chapman & Hall, 1984.