MATLAB Answers

Why does the training curve suddenly fall sharply?

38 views (last 30 days)
Saugata Bose
Saugata Bose on 31 Aug 2019
Commented: YA Az on 5 Sep 2020
I am training a CNN classifier on a balanced binary dataset of 4500 tweets, each labeled with its class. For training I use 300-dimensional GloVe embeddings and the 'adam' solver, running the model for 33 epochs with a sequence length of 31.
I have applied 200 filters in a network that includes convolution2dLayer, batch normalization, ReLU, dropout, and max-pooling layers. The dropout probability is 0.2 and the max-pooling layer has size [1 sequence length].
The training curve progresses smoothly until the very end, where it falls sharply. I have attached the training plot I receive:
Would you please explain why this sudden fall occurs, and how I could get rid of it?
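For reference, the training setup described above could be sketched roughly as follows (the variable names and the layer array are placeholders, not the actual code):

```matlab
% Sketch of the training configuration described above; XTrain/YTrain,
% XVal/YVal and layers are placeholders for the actual data and network.
opts = trainingOptions('adam', ...
    'MaxEpochs', 33, ...
    'ValidationData', {XVal, YVal}, ...
    'Plots', 'training-progress', ...
    'Verbose', false);
net = trainNetwork(XTrain, YTrain, layers, opts);
```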



Accepted Answer

Matt J
Matt J on 31 Aug 2019
The final validation metrics are labeled Final in the plots. If your network contains batch normalization layers, then the final validation metrics are often different from the validation metrics evaluated during training. This is because batch normalization layers in the final network perform different operations than during training.
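A simplified sketch of that difference (illustrative arithmetic only, not the actual layer implementation): during training, batch normalization normalizes with the statistics of the current mini-batch, while the finalized network normalizes with statistics accumulated over the whole training run, so the same input can produce different outputs.

```matlab
% Why training-time and final batch normalization can disagree (illustration).
x = randn(1, 64);                       % one activation channel over a mini-batch
yTrain = (x - mean(x)) ./ std(x);       % training: current mini-batch statistics
muFinal = 0.1;  sigmaFinal = 1.3;       % final: accumulated statistics (assumed values)
yFinal = (x - muFinal) ./ sigmaFinal;   % generally differs from yTrain
```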

  1 Comment

Saugata Bose
Saugata Bose on 31 Aug 2019
Hi Matt, thanks for your response. Yes, removing batch normalization has solved the problem. But that does not mean using batch normalization will always create such an anomaly; I use batch normalization layers in several of my other projects and have never experienced this. Could this anomaly be related to the dataset the model is working on, or to the hyperparameters of the model?


More Answers (1)

Don Mathis
Don Mathis on 3 Sep 2019
If your network has BatchNormalization layers that appear downstream of Dropout layers, then "re-finalizing" the batch normalization layers with dropout turned off sometimes fixes this problem. Apply the following function to your final network (the one that includes the batch normalization layers), then check your validation accuracy.
function net = finalizeNetWithoutDropout(net, varargin)
% Refinalize BatchNormalization layers in a noise-free environment, by
% turning off dropout. Pass your original training data using any of these
% options:
% net = finalizeNetWithoutDropout(net, imds)
% net = finalizeNetWithoutDropout(net, mbds)
% net = finalizeNetWithoutDropout(net, X,Y)
% net = finalizeNetWithoutDropout(net, sequences,Y)
% net = finalizeNetWithoutDropout(net, tbl)
% net = finalizeNetWithoutDropout(net, tbl,responseName)
opts = trainingOptions('sgdm', 'InitialLearnRate',eps, 'MaxEpochs',1, 'Verbose',false);
if isa(net, 'SeriesNetwork')
    layers = net.Layers;
    for i = 1:numel(layers)
        if isa(layers(i), 'nnet.cnn.layer.DropoutLayer')
            % Replace each dropout layer with a zero-probability dropout layer
            name = layers(i).Name;
            layers(i) = dropoutLayer(0, 'Name', name);
        end
    end
    net = trainNetwork(varargin{:}, layers, opts);
else
    % It's a DAGNetwork
    lg = layerGraph(net);
    for i = 1:numel(lg.Layers)
        if isa(lg.Layers(i), 'nnet.cnn.layer.DropoutLayer')
            name = lg.Layers(i).Name;
            lg = replaceLayer(lg, name, dropoutLayer(0, 'Name', name));
        end
    end
    net = trainNetwork(varargin{:}, lg, opts);
end
end
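For example, assuming the network was originally trained on arrays XTrain/YTrain and validated on XVal/YVal (hypothetical variable names), the function could be applied like this:

```matlab
% Hypothetical usage; XTrain, YTrain, XVal, YVal are placeholders for the
% original training and validation data.
net = finalizeNetWithoutDropout(net, XTrain, YTrain);
YPred = classify(net, XVal);
accuracy = mean(YPred == YVal);
```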

  1 Comment

YA Az on 5 Sep 2020
I also encountered the same issue. In my case I had already trained my network and saved the model, right after the validation accuracy dropped at the end. My question is: is there a way for me to still use my saved model for predictions (along with the batch normalization layers)? When I try to predict with it now, I get very bad results, as if my training had never happened (while the validation accuracy during training was ~97%).
P.S. I do not have any dropout layers in my network, just batch normalization layers.

