Optimize a Boosted Regression Ensemble
This example shows how to optimize hyperparameters of a boosted regression ensemble. The optimization minimizes the cross-validation loss of the model.
The goal is to model the fuel efficiency of an automobile, in miles per gallon, based on its acceleration, engine displacement, horsepower, and weight. Load the carsmall data set, which contains these and other predictors.
load carsmall
X = [Acceleration Displacement Horsepower Weight];
Y = MPG;
Fit a regression ensemble to the data using the LSBoost algorithm and surrogate splits. Optimize the model over the number of learning cycles (NumLearningCycles), the maximum number of splits per tree (MaxNumSplits), and the learning rate (LearnRate). In addition, allow the optimization to repartition the cross-validation folds at every iteration.
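Before running the optimization, it can help to see which hyperparameters fitrensemble exposes for tree learners and which of them are searched by default. The following is a minimal sketch, assuming X and Y are already in the workspace from the load step above; the variable name params is only illustrative.

```matlab
% Assumes X and Y were created from carsmall as shown above.
% List the hyperparameters that fitrensemble can optimize for tree learners.
params = hyperparameters('fitrensemble',X,Y,'Tree');

% Print each hyperparameter name and whether it is optimized by default.
for k = 1:numel(params)
    fprintf('%-22s optimized by default: %d\n', params(k).Name, params(k).Optimize);
end
```

Passing an explicit cell array of names to 'OptimizeHyperparameters', as this example does, overrides the default selection.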
For reproducibility, set the random seed and use the 'expected-improvement-plus' acquisition function.
rng('default')
Mdl = fitrensemble(X,Y, ...
    'Method','LSBoost', ...
    'Learner',templateTree('Surrogate','on'), ...
    'OptimizeHyperparameters',{'NumLearningCycles','MaxNumSplits','LearnRate'}, ...
    'HyperparameterOptimizationOptions',struct('Repartition',true, ...
    'AcquisitionFunctionName','expected-improvement-plus'))
|====================================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   | NumLearningC-|    LearnRate | MaxNumSplits |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    | ycles        |              |              |
|====================================================================================================================|
|    1 | Best   |      3.5219 |      18.426 |      3.5219 |      3.5219 |          383 |      0.51519 |            4 |
|    2 | Best   |      3.4752 |     0.83184 |      3.4752 |      3.4777 |           16 |      0.66503 |            7 |
|    3 | Best   |      3.1575 |      1.1016 |      3.1575 |      3.1575 |           33 |       0.2556 |           92 |
|    4 | Accept |      6.3076 |     0.66399 |      3.1575 |      3.1579 |           13 |    0.0053227 |            5 |
|    5 | Accept |      3.4449 |      9.8747 |      3.1575 |      3.1579 |          277 |      0.45891 |           99 |
|    6 | Accept |      3.9806 |      0.4488 |      3.1575 |      3.1584 |           10 |      0.13017 |           33 |
|    7 | Best   |       3.059 |      0.6018 |       3.059 |        3.06 |           10 |      0.30126 |            3 |
|    8 | Accept |      3.1707 |     0.57633 |       3.059 |      3.1144 |           10 |      0.28991 |           16 |
|    9 | Accept |      3.0937 |      1.5229 |       3.059 |      3.1046 |           10 |      0.31488 |           13 |
|   10 | Accept |       3.196 |     0.42912 |       3.059 |      3.1233 |           10 |      0.32005 |           11 |
|   11 | Best   |      3.0495 |     0.52104 |      3.0495 |      3.1083 |           10 |      0.27882 |           85 |
|   12 | Best   |       2.946 |     0.95518 |       2.946 |      3.0774 |           10 |      0.27157 |            7 |
|   13 | Accept |      3.2026 |     0.49933 |       2.946 |      3.0995 |           10 |      0.25734 |           20 |
|   14 | Accept |      5.7151 |       11.78 |       2.946 |      3.0996 |          376 |     0.001001 |           43 |
|   15 | Accept |       3.207 |      16.022 |       2.946 |      3.0937 |          499 |     0.027394 |           18 |
|   16 | Accept |      3.8606 |      1.9035 |       2.946 |      3.0937 |           36 |     0.041427 |           12 |
|   17 | Accept |      3.2026 |      13.085 |       2.946 |       3.095 |          443 |     0.019836 |           76 |
|   18 | Accept |      3.4832 |      6.5873 |       2.946 |      3.0956 |          205 |      0.99989 |            8 |
|   19 | Accept |      5.6285 |      5.3196 |       2.946 |      3.0942 |          192 |    0.0022197 |            2 |
|   20 | Accept |      3.0896 |      9.2928 |       2.946 |      3.0938 |          188 |     0.023227 |           93 |
|====================================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   | NumLearningC-|    LearnRate | MaxNumSplits |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    | ycles        |              |              |
|====================================================================================================================|
|   21 | Accept |      3.2654 |      5.1998 |       2.946 |      3.0951 |          167 |     0.023242 |           86 |
|   22 | Accept |      6.1202 |     0.72457 |       2.946 |      3.0904 |           16 |     0.010203 |           42 |
|   23 | Accept |      2.9963 |      10.806 |       2.946 |      3.0985 |          440 |     0.076162 |            1 |
|   24 | Accept |      3.2801 |      5.0268 |       2.946 |       3.097 |          171 |     0.067074 |           69 |
|   25 | Accept |        3.47 |      13.457 |       2.946 |      3.0968 |          497 |      0.13969 |            6 |
|   26 | Accept |      3.4413 |      14.546 |       2.946 |      3.0945 |          497 |     0.051993 |           50 |
|   27 | Best   |      2.9095 |      5.4036 |      2.9095 |      2.9126 |          216 |     0.036052 |            1 |
|   28 | Accept |      3.0866 |      2.5779 |      2.9095 |      2.9153 |           78 |      0.24579 |            1 |
|   29 | Accept |      3.0473 |      8.0392 |      2.9095 |      2.9713 |          239 |     0.032173 |            1 |
|   30 | Accept |      3.0383 |     0.97298 |      2.9095 |       2.972 |           25 |      0.31894 |            1 |
__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 196.6107 seconds
Total objective function evaluation time: 167.1972

Best observed feasible point:
    NumLearningCycles    LearnRate    MaxNumSplits
    _________________    _________    ____________

           216           0.036052          1

Observed objective function value = 2.9095
Estimated objective function value = 2.972
Function evaluation time = 5.4036

Best estimated feasible point (according to models):
    NumLearningCycles    LearnRate    MaxNumSplits
    _________________    _________    ____________

           216           0.036052          1

Estimated objective function value = 2.972
Estimated function evaluation time = 7.1005
Mdl = 
  RegressionEnsemble
                         ResponseName: 'Y'
                CategoricalPredictors: []
                    ResponseTransform: 'none'
                      NumObservations: 94
    HyperparameterOptimizationResults: [1x1 BayesianOptimization]
                           NumTrained: 216
                               Method: 'LSBoost'
                         LearnerNames: {'Tree'}
                 ReasonForTermination: 'Terminated normally after completing the requested number of training cycles.'
                              FitInfo: [216x1 double]
                   FitInfoDescription: {2x1 cell}
                       Regularization: []

  Properties, Methods
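The optimization details are stored in the HyperparameterOptimizationResults property of the model, which the display above shows to be a BayesianOptimization object. The following is a minimal sketch of how to read the selected hyperparameters programmatically, assuming Mdl is the optimized ensemble from the call above; the variable names are illustrative. Because the objective is reported as log(1+loss), the last line inverts that transformation to recover the cross-validation loss itself.

```matlab
% Assumes Mdl is the optimized ensemble returned by the fitrensemble call above.
results = Mdl.HyperparameterOptimizationResults;   % BayesianOptimization object

% Hyperparameter values at the best observed and best estimated points (tables)
bestObserved  = results.XAtMinObjective
bestEstimated = results.XAtMinEstimatedObjective

% The objective is log(1 + loss), so invert it to get the cross-validation loss
observedLoss = exp(results.MinObjective) - 1
```

For the run shown here, the best observed objective of 2.9095 corresponds to a loss of about 17.3. Loss values computed later with a fresh cross-validation partition can differ from this figure, since each partition gives a somewhat different estimate.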
Compare the cross-validated loss of the optimized model with that of a boosted, unoptimized model and with that of the default ensemble.
loss = kfoldLoss(crossval(Mdl,'kfold',10))
loss = 18.2889
Mdl2 = fitrensemble(X,Y, ...
    'Method','LSBoost', ...
    'Learner',templateTree('Surrogate','on'));
loss2 = kfoldLoss(crossval(Mdl2,'kfold',10))
loss2 = 29.4663
Mdl3 = fitrensemble(X,Y);
loss3 = kfoldLoss(crossval(Mdl3,'kfold',10))
loss3 = 37.7424
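To put the three losses on a common scale, one option is to express the optimized loss as a fractional reduction relative to each baseline. The following is a small sketch that uses the values printed above, copied by hand; the variable names are illustrative, and rerunning the example with a different seed or partition would give different numbers.

```matlab
% Loss values copied from the output above: optimized, boosted unoptimized, default
losses = [18.2889 29.4663 37.7424];

% Fractional reduction in loss of the optimized model relative to each baseline
reduction = 1 - losses(1) ./ losses(2:3)
```

With these values, the optimized ensemble reduces the cross-validated loss by roughly 38% relative to the unoptimized boosted model and by roughly 52% relative to the default ensemble.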
For a different way of optimizing this ensemble, see Optimize Regression Ensemble Using Cross-Validation.