To determine a good lasso-penalty strength for a linear classification model that uses a logistic regression learner, compare k-fold edges.
Load the NLP data set. Preprocess the data as in Estimate k-Fold Cross-Validation Edge.
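The following sketch shows one way to load and preprocess the data, assuming the standard nlpdata workflow; the choice of the 'stats' label as the positive class and the transpose of X are assumptions carried over from that example.

load nlpdata
Ystats = Y == 'stats';   % Binary response: pages that document the assumed class of interest
X = X';                  % Orient the predictor data so that observations correspond to columns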
Create a set of 11 logarithmically spaced regularization strengths.
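One way to construct such a grid; the endpoints (10^-8 through 10^1) are an assumption, since the original range is not given above.

Lambda = logspace(-8,1,11);   % 11 logarithmically spaced regularization strengths (assumed range)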
Cross-validate binary, linear classification models using 5-fold cross-validation and each of the regularization strengths. Optimize the objective function using SpaRSA. Lower the tolerance on the gradient of the objective function to 1e-8.
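A sketch of the cross-validation call, assuming the preprocessing and Lambda grid above.

CVMdl = fitclinear(X,Ystats,'ObservationsIn','columns','KFold',5, ...
    'Learner','logistic','Solver','sparsa','Regularization','lasso', ...
    'Lambda',Lambda,'GradientTolerance',1e-8)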
CVMdl = 
  ClassificationPartitionedLinear
    CrossValidatedModel: 'Linear'
           ResponseName: 'Y'
        NumObservations: 31572
                  KFold: 5
              Partition: [1x1 cvpartition]
             ClassNames: [0 1]
         ScoreTransform: 'none'

  Properties, Methods
CVMdl is a ClassificationPartitionedLinear model. Because fitclinear implements 5-fold cross-validation, CVMdl contains 5 ClassificationLinear models that the software trains on each fold.
Estimate the edges for each fold and regularization strength.
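For example, assuming the cross-validated model above, request per-fold edges by setting 'Mode' to 'individual'.

eFolds = kfoldEdge(CVMdl,'Mode','individual')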
eFolds = 5×11
0.9958 0.9958 0.9958 0.9958 0.9958 0.9924 0.9770 0.9178 0.8452 0.8127 0.8127
0.9991 0.9991 0.9991 0.9991 0.9991 0.9938 0.9780 0.9201 0.8262 0.8128 0.8128
0.9992 0.9992 0.9992 0.9992 0.9992 0.9942 0.9781 0.9135 0.8253 0.8128 0.8128
0.9974 0.9974 0.9974 0.9974 0.9974 0.9931 0.9773 0.9121 0.8410 0.8130 0.8130
0.9976 0.9976 0.9976 0.9976 0.9976 0.9942 0.9782 0.9157 0.8368 0.8127 0.8127
eFolds is a 5-by-11 matrix of edges. Rows correspond to folds and columns correspond to regularization strengths in Lambda. You can use eFolds to identify ill-performing folds, that is, unusually low edges.
Estimate the average edge over all folds for each regularization strength.
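For example, the default mode of kfoldEdge averages the edge over the folds.

e = kfoldEdge(CVMdl)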
e = 1×11
0.9978 0.9978 0.9978 0.9978 0.9978 0.9936 0.9777 0.9158 0.8349 0.8128 0.8128
Determine how well the models generalize by plotting the averages of the 5-fold edge for each regularization strength. Identify the regularization strength that maximizes the 5-fold edge over the grid.
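A sketch of one way to produce such a plot; the base-10 log scaling of both axes is a presentation choice, not prescribed above.

figure
plot(log10(Lambda),log10(e),'-o')
[~,maxEIdx] = max(e);                          % Grid index with the largest average edge
maxLambda = Lambda(maxEIdx);
hold on
plot(log10(maxLambda),log10(e(maxEIdx)),'ro')  % Mark the maximizing regularization strength
ylabel('log_{10} 5-fold edge')
xlabel('log_{10} Lambda')
legend('Edge','Max edge')
hold off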
Several values of Lambda yield similarly high edges. Higher values of Lambda lead to predictor variable sparsity, which is a good quality of a classifier.
Choose the regularization strength that occurs just before the edge starts decreasing.
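For example, assuming the edge begins to decrease after the fifth grid point; the index is illustrative and depends on your results.

idxFinal = 5;                    % Hypothetical index just before the edge starts decreasing
LambdaFinal = Lambda(idxFinal);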
Train a linear classification model using the entire data set and specify the regularization strength LambdaFinal.
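A sketch, assuming the same learner and solver settings as in the cross-validation step above.

MdlFinal = fitclinear(X,Ystats,'ObservationsIn','columns', ...
    'Learner','logistic','Solver','sparsa','Regularization','lasso', ...
    'Lambda',LambdaFinal);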
To estimate labels for new observations, pass MdlFinal and the new data to predict.
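For example, with XNew denoting hypothetical new predictor data oriented the same way as X.

labels = predict(MdlFinal,XNew,'ObservationsIn','columns');   % XNew is a placeholder for new data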