low weighted cross entropy values

I am building a network for semantic segmentation with a weighted cross-entropy loss. It seems possible to add weights for my 8 classes (inverse-frequency, normalized weights for each class) with the crossentropy() function. My issue is that the loss values calculated during training are lower than I would expect: they fall between 0 and 1, whereas I would have expected them to be between 2 and 3.
My class weights vector is
norm_weights = [0.0011 0.4426 0.0023 0.0037 0.0212 0.0022 0.0065 1.0000];
And this is how I implement my loss function:
lossFcn = @(Y,T) crossentropy(Y,T,norm_weights,WeightsFormat="UC",...
    NormalizationFactor="all-elements",ClassificationMode="multilabel")*n_class_labels;
[netTrained2, info] = trainnet(augmented_ds,net2,lossFcn,options);
If anyone would have a clue about the issue, that would be helpful!

3 Comments

values are between 0 and 1 but I would have expected them to be between 2-3.
Why?
I am reproducing a network from a research paper. My network architecture and training options are the same, and my data comes from the same database. In their loss graphs, the initial training loss values are between 2 and 3, so I assumed the same should hold for my network. When I use the crossentropy function without weights, such as:
[netTrained1, info] = trainnet(augmented_ds,net1,'crossentropy',options);
I do get higher loss values than when I personalize my cross-entropy loss function so that it has weights.
I reproduced the methodology from this research article as closely as I could, including how they format their network input. I am questioning whether there is a problem with my loss function, because the loss values I obtain are actually very small. I said they were between 0 and 1, but I should have specified that they currently gravitate around 0.026502. I know the goal is for the loss to tend towards zero, but my network isn't trained (I reproduced a SegNet architecture) and my training accuracy is around 20%, so the loss values seem very low to me.
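For context, a hedged sanity check (my addition, not from the thread): with 8 classes, a network that predicts roughly uniform probabilities everywhere yields an unweighted per-pixel cross entropy of about -log(1/8), which lands squarely in the 2-3 range the paper reports for an untrained network:

```matlab
% Sanity check: an untrained 8-class classifier that outputs near-uniform
% probabilities gives an unweighted cross entropy of about -log(1/8) per pixel.
n_class = 8;
initial_loss = -log(1/n_class)   % ~2.0794, consistent with the paper's 2-3 range
```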


 Accepted Answer

Matt J
Matt J on 19 Aug 2025
There are a few possible reasons for the discrepancy that I can think of:
(1) Your norm_weights do not add up to 1.
(2) You have selected NormalizationFactor="all-elements" in crossentropy(). According to the doc, though, trainnet does not normalize over all elements; it divides by the number of non-channel elements of the network output, i.e. it ignores the channel dimension.
(3) Other hidden normalization factors may be buried in the black box that is trainnet(). I don't know if it is possible or worthwhile to try to dig them out.
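To see how large the normalization effect in (2) can be, here is a minimal hedged sketch with toy sizes (my addition; assumes Deep Learning Toolbox, and `Y`/`T` are stand-ins for the real network output and targets):

```matlab
% Toy comparison of NormalizationFactor choices (hypothetical sizes).
Y = softmax(dlarray(randn(8,100,'single'),"CB"));     % 8 classes, 100 "pixels"
T = onehotencode(categorical(randi(8,1,100),1:8),1);  % one-hot targets, 8x100
T = dlarray(single(T),"CB");
w = [0.0011 0.4426 0.0023 0.0037 0.0212 0.0022 0.0065 1.0000];

% "all-elements" divides by numel(T) = 800; "batch-size" divides by 100.
lossAll   = crossentropy(Y,T,w,WeightsFormat="UC",NormalizationFactor="all-elements");
lossBatch = crossentropy(Y,T,w,WeightsFormat="UC",NormalizationFactor="batch-size");
% lossAll should come out about 8x smaller than lossBatch (the channel count),
% so the choice of normalization alone changes the scale of the reported loss.
```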

7 Comments

Ève
Ève on 19 Aug 2025
Edited: Ève on 19 Aug 2025
For (1), I didn't think it was necessary to make sure they add up to 1. From my understanding, in various examples (e.g. https://www.mathworks.com/help/vision/ug/semantic-segmentation-using-deep-learning.html), they don't.
For (2), I get what you are saying, but I think that applies only if we specify 'crossentropy' directly as the loss in trainnet. The way I did it, calling crossentropy() myself, there seem to be multiple normalization options. I tried training without specifying NormalizationFactor and my loss values are now in the hundreds, which seems odd once again. I'm trying to familiarize myself with the equations (algorithms) provided by the doc.
I also removed ClassificationMode="multilabel" because, even though I thought semantic segmentation was multilabel classification, this input argument isn't specified in any of the semantic segmentation MATLAB examples I see.
Matt J
Matt J on 20 Aug 2025
Edited: Matt J on 20 Aug 2025
For (1), I didn't think it was necessary to make sure they add up to 1.
I don't say that it is necessary, but it does affect the scale of the loss function.
For (2), I get what you are saying, but I think that's is if we only specify 'crossentropy' as the function in trainnet.
What do you mean "only"? What other usage of trainnet() are we comparing with?
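The weight-scale effect from point (1) can be checked directly; a hedged sketch (my addition, using the weights posted in the question):

```matlab
% The posted class weights do not sum to 1, and most are tiny.
w = [0.0011 0.4426 0.0023 0.0037 0.0212 0.0022 0.0065 1.0000];
sum(w)    % ~1.4796, not 1
mean(w)   % ~0.185: most classes contribute almost nothing to the weighted loss
w_unit = w / sum(w);   % rescaled so the weights sum to 1, if that scale is wanted
```

With an average weight of roughly 0.185, a weighted cross entropy will naturally sit several times below an unweighted one on the same data, independent of any normalization choice.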
Ève
Ève on 20 Aug 2025
Edited: Ève on 20 Aug 2025
That's a good point (for (1))! I'll check that. Thanks for clarifying, the weights are indeed a multiplying factor in the loss.
For (2), what I meant was that from my understanding, if you only specify your loss function this way,
[netTrained1, info] = trainnet(augmented_ds,net1,'crossentropy',options);
the normalization will be "normalized by dividing by the number of non-channel elements of the network output", as the doc says. But if you instead implement it using additional options of the crossentropy function, such as
lossFcn = @(Y,T) crossentropy(Y,T,norm_weights,WeightsFormat="UC",...
    NormalizationFactor="all-elements");
[netTrained2, info] = trainnet(augmented_ds,net2,lossFcn,options);
you have to choose between four NormalizationFactor options ("batch-size", "all-elements", "mask-included", "none"), which are different ways of normalizing the loss. Am I getting this right?
you have to choose between four NormalizationFactor options ("batch-size", "all-elements", "mask-included", "none"), which are different ways of normalizing the loss. Am I getting this right?
Yes, but that was my entire point in (2). You cannot expect agreement between trainnet's built-in loss and your personalized loss function, because their normalization strategies do not coincide.
Matt J
Matt J on 20 Aug 2025
Edited: Matt J on 20 Aug 2025
If trainnet is successfully training your network with lossFcn='crossentropy', but with larger (by a factor of approximately K) loss values, then with your personalized loss function, you could try increasing your learning rates and decreasing your regularization weights by K. Or, just scale your custom lossFcn by K.
The point is, if the loss computation in each case differs only by a global scale factor, it shouldn't impact training much, as long as the learning rates and regularization weights are kept in the same general ratio to the loss values.
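As a hedged sketch of that rescaling idea (my addition; K here is a hypothetical ratio you would measure from your own runs, not a value from the thread):

```matlab
% Hypothetical: suppose the built-in 'crossentropy' loss runs ~K times larger
% than the custom weighted loss. Scale the custom loss back up by K.
K = 100;   % placeholder ratio, to be measured from actual training logs
lossFcnScaled = @(Y,T) K * crossentropy(Y,T,norm_weights, ...
    WeightsFormat="UC",NormalizationFactor="all-elements");
[netTrained2, info] = trainnet(augmented_ds,net2,lossFcnScaled,options);
% Equivalently, leave the loss unscaled and multiply the initial learning
% rate by K (and divide any L2 regularization factor by K) in the options.
```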
I understand, I'll try your suggestions. Thanks a lot for the feedback, I really appreciate it.
I'll accept your answer: as you suggested, my loss values are low simply because they reflect the scale of my weights, most of which are very small. I may revise the way I calculate them. I'll also add, for anyone reading this, that I was wrong about the ClassificationMode in my lossFcn; for my type of classification problem it should be set to "single-label" (the default). I left the rest of the function the same.


More Answers (0)


Asked on 19 Aug 2025 · Last commented on 21 Aug 2025
