Why is loss() different from calculating misclassification error using predict()?
I am trying to fit an ECOC model to my data, but the misclassification error returned by loss() differs from the one I get by comparing the labels from predict() with the true labels. The same thing happens with a different model, e.g. KNN.
The test dataset has 10 observations, so as far as I know the misclassification error should be a multiple of 0.1, yet loss() outputs 0.8293.
Could someone please help me understand why these are different, i.e. what is going on inside the loss() function? And which of the two is more appropriate for evaluating/reporting test set accuracy?
rng(1234)
% define variables
xtrain = rand(100,4); % random numbers, n = 100
xtest = rand(10,4); % random numbers, n = 10
ytrain = ceil(4*rand(100,1)); % 4 classes, n = 100
ytest = ceil(4*rand(10,1)); % 4 classes, n = 10
% train model
mdl1 = fitcecoc(xtrain,ytrain,'Coding','onevsall','Learners','svm');
mdl2 = fitcknn(xtrain,ytrain);
% calculate loss from loss()
loss1mdl1 = loss(mdl1,xtest,ytest);
loss1mdl2 = loss(mdl2,xtest,ytest);
% calculate loss from predict()
loss2mdl1 = 1-mean(predict(mdl1,xtest)==ytest);
loss2mdl2 = 1-mean(predict(mdl2,xtest)==ytest);
Answers (2)
Sulaymon Eshkabilov
on 29 Jun 2023
There is a small difference between what loss() and the predict() comparison compute. loss() returns a weighted classification error that takes the observation weights into account, whereas comparing predict() output with the true labels counts every test observation equally. Otherwise, everything is working as expected:
rng(1234)
% define variables
xtrain = rand(100,4); % random numbers, n = 100
xtest = rand(10,4); % random numbers, n = 10
ytrain = ceil(4*rand(100,1)); % 4 classes, n = 100
ytest = ceil(4*rand(10,1)); % 4 classes, n = 10
% train model
mdl1 = fitcecoc(xtrain,ytrain,'Coding','onevsall','Learners','svm');
mdl2 = fitcknn(xtrain,ytrain);
% calculate loss from loss()
loss1mdl1 = loss(mdl1,xtest,ytest)
loss1mdl2 = loss(mdl2,xtest,ytest)
Y1 = predict(mdl1,xtest);
Y2 = predict(mdl2,xtest);
YC1 = [ytest,Y1] % Two correct answers out of 10, i.e., accuracy is 20%
YC2 = [ytest,Y2] % Three correct answers out of 10, i.e., accuracy 30%
% calculate loss from the predicted labels (reusing Y1 and Y2 from above)
loss2mdl1 = 1-mean(Y1==ytest)
loss2mdl2 = 1-mean(Y2==ytest)
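To make the effect of the weights concrete, here is a minimal sketch of the arithmetic. It assumes the normalization works as described in this thread, i.e. the observation weights are rescaled so that within each class they sum to that class's prior. The labels and the priors below are made up for illustration and are not the data from the question; the names ytrue, ypred, prior and w are hypothetical. It needs no toolbox.

```matlab
% Hypothetical labels, chosen only for illustration (not the question's data)
ytrue = [1;1;1;1;1;1;2;2;3;4];      % true classes of 10 test observations
ypred = [1;1;1;2;2;2;2;2;3;1];      % predicted classes
prior = [0.25;0.25;0.25;0.25];      % assumed class priors (e.g. balanced training set)

% Start from uniform observation weights
w = ones(numel(ytrue),1);

% Rescale so that the weights of each class sum to that class's prior
for k = 1:numel(prior)
    idx = (ytrue == k);
    w(idx) = w(idx) * prior(k) / sum(w(idx));
end

wrong = (ypred ~= ytrue);

% Plain error: every observation counts equally
plainErr = mean(wrong);

% Weighted error: every class counts according to its prior
weightedErr = sum(w .* wrong) / sum(w);
```

Here plainErr is 0.4 (4 of 10 wrong), while weightedErr is 0.375: class 1 contributes 0.25*3/6 and class 4 contributes 0.25*1/1. Because the test class counts do not match the priors, the weighted value is not a multiple of 0.1.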
Drew
on 7 May 2025
This is because the classreg loss function normalizes the observation weights so that, within each class, they sum to that class's prior probability. The result is therefore a prior-weighted error rather than a simple fraction of misclassified test observations. This can be avoided by providing a custom loss function, as seen in this answer: https://www.mathworks.com/matlabcentral/answers/492062-loss-the-classification-error
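As a rough sketch of what such a custom loss function could look like (this is my own illustration, not the code from the linked answer): to my understanding a function handle passed via 'LossFun' receives (C,S,W,Cost), where C is a logical matrix marking the true class of each observation and S holds the per-class scores. The handle below ignores W and Cost and just counts how often the highest-scoring class is not the true class. The name unweightedErr and the small C and S matrices are hypothetical, used only to check the handle without any toolbox.

```matlab
% Unweighted misclassification rate; W and Cost are deliberately ignored.
% An observation counts as correct if its true class attains the row maximum of S.
unweightedErr = @(C,S,W,Cost) mean(~any(C & (S == max(S,[],2)), 2));

% Hand-made inputs for a quick check (4 observations, 3 classes)
C = logical([1 0 0; 0 1 0; 0 0 1; 1 0 0]);    % true class indicators
S = [0.7 0.2 0.1; 0.1 0.3 0.6; 0.2 0.2 0.6; 0.3 0.4 0.3];   % scores
W = [0.4; 0.1; 0.1; 0.4];                     % observation weights (unused)
Cost = ones(3) - eye(3);                      % misclassification costs (unused)

demoErr = unweightedErr(C,S,W,Cost);          % rows 2 and 4 are wrong
```

With the models from the question this would presumably be called as loss(mdl2,xtest,ytest,'LossFun',unweightedErr). I have not run that here, and when scores are tied the result may differ from comparing predict() output with ytest, since this handle treats a tie that includes the true class as correct.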