Using fitcsvm for binary linear classification of unbalanced data

Question

rogueMedStudent7 on 14 Jun 2017

Direct link to this question

https://ch.mathworks.com/matlabcentral/answers/344774-using-fitcsvm-for-binary-linear-classification-of-unbalanced-data

Answered: Ankita Nargundkar on 21 Jun 2017

Here is a simple example of the issue I'm running into:

tt=[1 8;2 7;3 6;4 5;5 4;6 3]; % these 6 points all lie on a line with slope -1

labels=[1 1 1 1 -1 1];

c=[0 1;2 0];

mod=fitcsvm(tt,labels,'KernelFunction','linear','Cost',c);

This returns mod.Beta=[0 0] with a mod.Bias of 1, so the output of predict() for any point x is 1. That is, it ignores the minority class, which is a common problem with unbalanced classes. However, the cost matrix is supposed to fix that. I impose a cost that is twice as high for misclassifying the minority class, so it should draw a dividing line with a slope of 1 (and a Beta with slope -1) that correctly classifies the minority point and misclassifies only a single majority point (rather than misclassifying the single minority point, as it currently does).

I've tried switching the cost matrix to [0 2;1 0], to no avail. I also notice that the returned model has mod.Cost=[0 1;1 0]. It's as if it's completely ignoring my cost matrix input. What is going on here? Any help is greatly appreciated.

Answer 1

Ankita Nargundkar on 21 Jun 2017

Direct link to this answer

https://ch.mathworks.com/matlabcentral/answers/344774-using-fitcsvm-for-binary-linear-classification-of-unbalanced-data#answer_271448

>> c=[0 2.2;1 0];
>> mod=fitcsvm(tt,labels,'KernelFunction','linear','Cost',c);
>> mod.predict(tt)
ans =
     1
     1
     1
     1
    -1
    -1

Is this what you expect? Note that one majority point is misclassified and the minority point is classified correctly.
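A minimal sketch of how the cost matrix lines up with the class order, assuming the documented convention that Cost(i,j) is the cost of classifying a point into class j when its true class is i, with rows and columns following ClassNames. The 'ClassNames' argument is added here only to make that order explicit, and the names mdl, classOrder, and pred are illustrative (mdl also avoids shadowing the built-in mod function). It needs Statistics and Machine Learning Toolbox.

```matlab
% Same data as in the question: six points on a line with slope -1
tt = [1 8; 2 7; 3 6; 4 5; 5 4; 6 3];
labels = [1 1 1 1 -1 1];      % the single -1 point is the minority class

% Fix the class order so rows/columns of the cost matrix are unambiguous:
% index 1 -> class -1 (minority), index 2 -> class 1 (majority)
classOrder = [-1 1];

% c(i,j) = cost of predicting classOrder(j) when the truth is classOrder(i)
% c(1,2) = 2.2: minority point predicted as majority (the expensive error)
% c(2,1) = 1  : majority point predicted as minority
c = [0 2.2; 1 0];

mdl = fitcsvm(tt, labels, 'KernelFunction', 'linear', ...
    'ClassNames', classOrder, 'Cost', c);

pred = predict(mdl, tt);      % one predicted label per row of tt
disp([labels(:) pred])        % true labels next to predictions
```

If the default class order is [-1 1] (sorted label values), then under this convention the matrix [0 1;2 0] from the question places the larger cost on misclassifying the majority class 1, not the minority class.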

The documentation says:

https://www.mathworks.com/help/releases/R2017a/stats/fitcsvm.html#input_argument_d0e311090

"For two-class learning, if you specify a cost matrix, then the software updates the prior probabilities by incorporating the penalties described in the cost matrix. Consequently, the cost matrix resets to the default."

That explains the strange behavior of mod.Cost being reset. Indeed you will notice that mod.Prior does change with different values of c.
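A rough sketch of the adjustment the quoted passage describes, assuming the two-class update scales each class's prior by the cost of misclassifying that class and then renormalizes. That formula is an assumption about what the software does, not something stated in this thread, and the names empPrior, missCost, and adjPrior are illustrative. Comparing the result against mod.Prior is one way to check it.

```matlab
labels = [1 1 1 1 -1 1];
classOrder = [-1 1];                 % assumed class order (sorted labels)
c = [0 2.2; 1 0];                    % c(i,j): true class i predicted as j

% Empirical priors: fraction of observations in each class
empPrior = [mean(labels == classOrder(1)), mean(labels == classOrder(2))];

% Cost of misclassifying each class (off-diagonal entry of its row)
missCost = [c(1,2), c(2,1)];

% Scale priors by misclassification cost and renormalize to sum to 1
adjPrior = empPrior .* missCost;
adjPrior = adjPrior / sum(adjPrior);

fprintf('empirical prior: [%.4f %.4f]\n', empPrior);
fprintf('adjusted prior:  [%.4f %.4f]\n', adjPrior);
```

In this sketch the minority class's share rises from 1/6 to 2.2/7.2, which is the sense in which a cost matrix can be absorbed into the priors while Cost itself goes back to [0 1;1 0].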
