Difference between the best validation point and the final point in the training progress plot

Question

Yongwon Jang on 31 Aug 2023

Commented: Yongwon Jang on 8 Sep 2023

I trained a network with the trainNetwork function as shown below.

[net, info] = trainNetwork(Xtrain, Ytrain, lgraph, options);

The options were as follows.

options = trainingOptions('adam', ...
    'InitialLearnRate', 5e-06, ...
    'MaxEpochs', 30, ...
    'MiniBatchSize', 128, ...
    'ExecutionEnvironment', 'multi-gpu', ...
    'ValidationData', {Xvalid, Yvalid}, ...
    'ValidationFrequency', 10, ...
    'ValidationPatience', inf, ...
    'Shuffle', 'every-epoch', ...
    'OutputNetwork', 'best-validation', ...
    'Plots', 'training-progress')

In this case, a point labeled 'final' is displayed on the plot after training completes. However, this final point is very different from the validation accuracy and validation loss shown on the plot. See the figure below.

I don't understand this. The validation curve on the plot clearly sits at an accuracy between 85% and 90%, yet the point marked final is below 80%. The validation accuracy reported in the upper right corner of the figure is 76.8%.

Is this caused by one of the option settings?

I would appreciate help from experts in finding out why this happens.

Please help.

2 Comments

James on 4 Sep 2023

It is common for test accuracy to be slightly lower than validation accuracy, because the best-performing model is selected from among multiple validation results, and that selection does not guarantee the same performance on the test dataset. Even so, the results above seem out of the ordinary.

There are a few possible explanations.

If the model is overfitting:

https://kr.mathworks.com/matlabcentral/answers/871713-how-to-resolve-if-validation-and-testing-accuracy-are-widely-different

If batch normalization is used:

https://kr.mathworks.com/matlabcentral/answers/581573-why-is-my-final-validation-accuracy-much-lower-than-the-validation-accuracy-during-training
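
If the network does contain batch normalization layers (this is not confirmed in the thread), one thing that might be worth trying is the 'BatchNormalizationStatistics' training option, available in recent releases. The sketch below reuses the options from the question and only adds that one setting; it is a suggestion under that assumption, not a confirmed fix for this case.

```matlab
% Sketch: same options as in the question, plus 'BatchNormalizationStatistics'.
% With 'population' (the default), the statistics are finalized after training
% by one more pass over the training data, so the final validation result can
% differ from the values plotted during training.
% With 'moving', running estimates are updated during training and the latest
% values are used afterwards.
options = trainingOptions('adam', ...
    'InitialLearnRate', 5e-06, ...
    'MaxEpochs', 30, ...
    'MiniBatchSize', 128, ...
    'ExecutionEnvironment', 'multi-gpu', ...
    'ValidationData', {Xvalid, Yvalid}, ...
    'ValidationFrequency', 10, ...
    'ValidationPatience', inf, ...
    'Shuffle', 'every-epoch', ...
    'BatchNormalizationStatistics', 'moving', ...
    'Plots', 'training-progress');
```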

It would also help if you could share how the train/validation/test datasets are prepared and how the model ('lgraph') is designed. For example, if K-fold is used for validation, the disparity with test accuracy can be quite large.

Yongwon Jang on 8 Sep 2023

Thank you very much for your reply.

I am doing 6-fold cross-validation: five folds are used for training and one fold for validation.

The result in the question comes from using data groups 1 to 5 for training and data group 6 for validation. (To complete the 6-fold procedure, I have to repeat this five more times, changing the validation group each time.)

The six groups will clearly differ, since they are experimental data from different environments. However, I don't understand why the validation accuracy is so different from the final accuracy.
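
For reference, the 6-fold procedure described above could be sketched roughly as follows. Xfold and Yfold are hypothetical 1-by-6 cell arrays holding the six data groups; the sketch assumes image-like predictors stacked along the fourth dimension and categorical column-vector labels, which may not match the actual data.

```matlab
% Hypothetical inputs: Xfold{k} is H-by-W-by-C-by-N_k, Yfold{k} is an N_k-by-1 categorical.
numFolds = 6;
finalAcc = zeros(numFolds, 1);

for k = 1:numFolds
    trainIdx = setdiff(1:numFolds, k);        % the five groups used for training

    Xtrain = cat(4, Xfold{trainIdx});         % stack observations along dimension 4
    Ytrain = vertcat(Yfold{trainIdx});
    Xvalid = Xfold{k};                        % the held-out group
    Yvalid = Yfold{k};

    options = trainingOptions('adam', ...
        'InitialLearnRate', 5e-06, ...
        'MaxEpochs', 30, ...
        'MiniBatchSize', 128, ...
        'ValidationData', {Xvalid, Yvalid}, ...
        'ValidationFrequency', 10, ...
        'Shuffle', 'every-epoch');

    [~, info] = trainNetwork(Xtrain, Ytrain, lgraph, options);
    finalAcc(k) = info.FinalValidationAccuracy;   % accuracy reported for the returned network
end

fprintf('Mean final validation accuracy over %d folds: %.1f%%\n', numFolds, mean(finalAcc));
```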

There is something I don't quite understand, so I would like to ask the following questions.

First, here is what I understand so far.

1. Training accuracy is computed at each iteration on one mini-batch of the training data.

2. At every 10th iteration, the entire validation set is passed through the network to measure validation accuracy. (Training itself still uses one mini-batch per iteration.)

3. After the specified number of epochs completes, the final point is the result of a forward pass of the validation data through the network selected as best-validation.

If so, shouldn't the point marked final be in the same place as the validation accuracy? If the program uses the same validation data as for the black dots computed every 10th iteration, the final point should land in the same place, right? Or am I misunderstanding something?

I would like to know which data is used to compute the final accuracy shown at the end of training.
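
One way to check this directly is to re-evaluate the returned network on the validation data and compare the result with the values recorded in info. The sketch below assumes a classification network, that net, info, Xvalid, and Yvalid are the variables from the code in the question, and that Yvalid is a categorical vector with the same orientation as the predictions.

```matlab
% Re-run the validation data through the returned network.
YPred = classify(net, Xvalid, 'MiniBatchSize', 128);
manualAcc = 100 * mean(YPred == Yvalid);              % accuracy in percent

% Values recorded by trainNetwork. ValidationAccuracy is NaN at iterations
% where no validation was run; max ignores NaN by default.
bestDuringTraining = max(info.ValidationAccuracy);
finalReported      = info.FinalValidationAccuracy;

fprintf('Manual re-evaluation:        %.1f%%\n', manualAcc);
fprintf('Best value during training:  %.1f%%\n', bestDuringTraining);
fprintf('Final value reported:        %.1f%%\n', finalReported);
```

If the manual value agrees with the final value rather than with the curve, the same validation data is being used in both cases and the difference comes from the state of the network at evaluation time, not from the data.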

I also ran an additional test and confirmed that as the batch size increases, the gap between the point marked final and the validation accuracy point decreases. Is this related to batch size?
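
One possible reading of this batch-size effect, assuming the network contains batch normalization layers (not confirmed in this thread), is that statistics estimated from small mini-batches fluctuate more around the whole-dataset statistics than those from large mini-batches. The toy script below uses synthetic numbers, not the network's real activations, only to illustrate how the spread of mini-batch means shrinks as the batch size grows.

```matlab
rng(0);                                      % reproducible synthetic data
activations = 3 + 2 * randn(100000, 1);      % hypothetical activations of one channel

batchSizes = [16 128 1024];
spread = zeros(size(batchSizes));

for i = 1:numel(batchSizes)
    b = batchSizes(i);
    numBatches = floor(numel(activations) / b);
    % One column per mini-batch.
    batches = reshape(activations(1:numBatches * b), b, numBatches);
    % Standard deviation of the per-batch means: how far a single
    % mini-batch mean typically lies from the whole-dataset mean.
    spread(i) = std(mean(batches, 1));
end

disp(table(batchSizes', spread', 'VariableNames', {'BatchSize', 'StdOfBatchMean'}));
```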

If necessary, I will share the process of dividing the data into 6 groups.

Thank you.

Answer 1

Gagan Agarwal on 5 Sep 2023

Hi Yongwon Jang,

The plot shows a decline in validation accuracy at the final training iteration. Because the 'OutputNetwork' training option defaults to 'last-iteration', the 'Validation Accuracy' field is reported as 76.8%.

The 'OutputNetwork' training option is not assigned correctly in the 'options' variable.

To have 'Validation Accuracy' report the result for the network with the best validation loss, set the 'OutputNetwork' option to 'best-validation-loss' rather than 'best-validation'.

For more details on the available training options, refer to the documentation: https://www.mathworks.com/help/deeplearning/ref/trainingoptions.html
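
As a sketch of that change, the options from the question would look like this with the suggested value. Reading info.OutputNetworkIteration afterwards shows which iteration the returned network corresponds to; that field is assumed to be available in the release being used.

```matlab
options = trainingOptions('adam', ...
    'InitialLearnRate', 5e-06, ...
    'MaxEpochs', 30, ...
    'MiniBatchSize', 128, ...
    'ExecutionEnvironment', 'multi-gpu', ...
    'ValidationData', {Xvalid, Yvalid}, ...
    'ValidationFrequency', 10, ...
    'ValidationPatience', inf, ...
    'Shuffle', 'every-epoch', ...
    'OutputNetwork', 'best-validation-loss', ...   % return the network with the lowest validation loss
    'Plots', 'training-progress');

[net, info] = trainNetwork(Xtrain, Ytrain, lgraph, options);

% Iteration at which the returned network was taken.
disp(info.OutputNetworkIteration);
```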

1 Comment

Yongwon Jang on 8 Sep 2023

'best-validation-loss' rather than 'best-validation'.

Thank you for pointing that out. I corrected it and ran the code again, but I still wonder about the difference between the final point and the validation accuracy.

My understanding is that, after the specified number of epochs completes, the final point is the result of a forward pass of the validation data through the network selected as best-validation-loss. If the same selected network is used for both the validation accuracy during training and the final accuracy, the final point should match the validation accuracy point. However, the plot still shows a difference (smaller than before).

Please let me know if I have misunderstood something.

Again, thank you for your help.
