Transfer learning of various deep neural networks: validation accuracy is significantly lower than training accuracy.
So, an overview of what I am trying to achieve: I am transfer learning several of MATLAB's pretrained deep neural networks onto a brain tumour dataset, to see which gives the highest accuracy.
I am using a brain tumour dataset where the training set contains around 1200 images and the test set around 300, with both split equally across the 3 different classes.
The method I have used is the transfer learning code provided by MathWorks (here: https://uk.mathworks.com/help/deeplearning/ug/transfer-learning-using-pretrained-network.html), swapping out "net = googlenet" for the various models that I have chosen.
The issue I am facing is that while the training accuracy is very high, the validation accuracy is extremely low in comparison. Here is the training progress of ResNet-50, which achieved the highest accuracy out of all the models I have transfer learned.
As you can see, the training accuracy is very good while the validation accuracy is significantly lower; the same is true for the loss as well.
I am fairly sure the issue is overfitting, but I do not know how to solve it. I understand that data augmentation or dropout layers could help prevent overfitting, but I have no idea how to implement them in this transfer learning context.
If anyone could help me get the validation accuracy and loss to properly track the training accuracy and loss, I would be extremely grateful.
Thanks in advance, and if any of my actual code is needed to solve this problem, I can provide it, no problem.
2 Comments
Christopher Erickson
on 24 Mar 2023
Using dropout in this context can be very straightforward: simply add dropoutLayers to your layer graph. I am supposing from your question that you are applying transfer learning in the Deep Network Designer, in which case please refer to the augmentation options referenced in the transfer learning page. Before attempting either of these, however, I would suggest seeing what happens if you simply train for a longer period of time. The phenomena of grokking or double descent might occur, so don't assume that a low training loss means your model is done learning.
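For illustration, a minimal sketch of a longer training schedule with validation monitoring (the names trainingData, valData, and lgraph are placeholders for your own augmented datastore, validation datastore, and layer graph):
% Train for more epochs than the doc example and watch whether the
% validation loss eventually comes down alongside the training loss.
opts = trainingOptions('sgdm', ...
    'MaxEpochs',30, ...
    'InitialLearnRate',1e-4, ...
    'ValidationData',valData, ...
    'ValidationFrequency',30, ...
    'Plots','training-progress', ...
    'Verbose',false);
trainedNet = trainNetwork(trainingData, lgraph, opts);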
Answers (1)
Sandeep
on 27 Mar 2023
Hi Ted,
It's great that you're working on transfer learning for a brain tumor dataset. Overfitting is indeed a common issue in deep learning models, and data augmentation and dropout layers can help prevent it.
Data augmentation can help by artificially increasing the size of the dataset and creating more variation in the training images. You can use MATLAB's built-in augmentedImageDatastore function to apply random transformations to the images, such as rotation, scaling, and flipping.
Here is an example of how to use the augmentedImageDatastore function:
% Random rotations of up to ±10 degrees plus horizontal and vertical flips
augmenter = imageDataAugmenter('RandRotation',[-10 10],'RandXReflection',true,'RandYReflection',true);
% imageSize should match the network input (e.g. [224 224] for ResNet-50)
trainingData = augmentedImageDatastore(imageSize, trainData, 'DataAugmentation', augmenter);
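The transformed images are generated on the fly during training rather than stored on disk. Note that augmentation should only be applied to the training set; keep your validation datastore unaugmented so the reported validation accuracy reflects the real images.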
Dropout randomly sets some of the activations in the network to zero during training, which helps prevent the network from relying too heavily on any single input or feature. You can add dropout layers to your network using the dropoutLayer function.
An example using the dropoutLayer function is as follows:
layers = [ ...
    % ... layers carried over from the pretrained network go here ...
    dropoutLayer(0.5)                  % drop 50% of activations during training
    fullyConnectedLayer(numClasses)    % new head sized for your classes
    softmaxLayer
    classificationLayer];
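Since ResNet-50 is a DAG network rather than a plain layer array, the usual way to insert these layers is to edit its layer graph with replaceLayer. Here is a minimal sketch, assuming the default layer names of MATLAB's resnet50 (verify them with analyzeNetwork(net)) and numClasses = 3 for your dataset:
net = resnet50;            % pretrained network being adapted
lgraph = layerGraph(net);
numClasses = 3;            % assumed from your 3-class dataset
% Replace the original head with dropout plus a new fully connected layer.
% 'fc1000' and 'ClassificationLayer_fc1000' are resnet50's default names.
newHead = [
    dropoutLayer(0.5,'Name','new_dropout')
    fullyConnectedLayer(numClasses,'Name','new_fc', ...
        'WeightLearnRateFactor',10,'BiasLearnRateFactor',10)];
lgraph = replaceLayer(lgraph,'fc1000',newHead);
lgraph = replaceLayer(lgraph,'ClassificationLayer_fc1000', ...
    classificationLayer('Name','new_output'));
The modified lgraph can then be trained on the augmented datastore with trainNetwork, as in the training options sketch in the comments above.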