At least for (2), the issue is that checkpoints do not save the state of the solver (momentum terms, etc.), so the solver cannot truly "start where it left off," although it may do so approximately if you lower the learning rate by an order of magnitude.
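A minimal sketch of that workaround; the learning-rate value and checkpoint filename below are illustrative, not prescribed by the documentation:

```matlab
% Load the latest checkpoint; trainNetwork saves a variable named 'net'.
% (Filename is illustrative -- pick the newest file in your CheckpointPath.)
load('net_checkpoint__1000__2018_04_27__12_00_00.mat', 'net');

% The solver state (momentum, etc.) is not restored, so resume with a
% learning rate roughly an order of magnitude below the original value.
options = trainingOptions('sgdm', ...
    'InitialLearnRate', 1e-4, ...   % e.g. original 1e-3 divided by 10
    'MaxEpochs', 10);

% Resume training from the checkpointed weights (imds defined as before).
netResumed = trainNetwork(imds, net.Layers, options);
```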
R2018a Neural Network Toolbox "Bugs"??
In R2018a Neural Network Toolbox I have noticed two oddities:
(1) When training a CNN with trainNetwork and the 'Plots' option set to 'training-progress', the VERY LAST validation data point on the plot always appears as a discontinuous jump to lower accuracy and higher loss, regardless of whether I click the 'Stop' button manually or training stops after completing all epochs/iterations (I have 'ValidationPatience' set to 'Inf'). I never noticed this occurring in R2017b. Is it a bug, or some new type of 'feature'?
(2) Checkpoints seem to save properly when a 'CheckpointPath' is specified in the 'options' to trainNetwork, but they do not seem to initialize as one would expect. For example, after loading the saved 'net' object from the last checkpoint and running trainNetwork(imds,net.Layers,options) with the same dataset and the same options, one would expect the training accuracy and loss plots to resume, at least approximately, where they left off, and the convolutional filter weights after a couple of iterations to look broadly similar to the weights stored in the checkpoint 'net' object. Instead, the weights look very different, and there is always a discontinuous break in my accuracy/loss plots when I follow the R2018a documentation for picking up where I left off with checkpoints. Is this another bug/logical error, or a 'feature' of some sort?
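For reference, the resume workflow described above can be sketched as follows; paths, datastore names, and option values are illustrative:

```matlab
% Original training run, saving a checkpoint each epoch.
options = trainingOptions('sgdm', ...
    'MaxEpochs', 20, ...
    'ValidationData', imdsVal, ...   % imdsVal: illustrative validation datastore
    'ValidationPatience', Inf, ...
    'Plots', 'training-progress', ...
    'CheckpointPath', 'C:\checkpoints');
net1 = trainNetwork(imds, layers, options);

% Later: load the last checkpoint (filename is illustrative) and resume.
% Only the layer weights carry over; the solver itself restarts from scratch.
load('C:\checkpoints\net_checkpoint__2000__2018_04_27__12_00_00.mat', 'net');
net2 = trainNetwork(imds, net.Layers, options);
```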
Accepted Answer
More Answers (2)
Johannes Bergstrom
on 27 Apr 2018
Regarding (1), please see https://www.mathworks.com/help/nnet/examples/monitor-deep-learning-training-progress.html
If your network contains batch normalization layers, then the final validation metrics are often different from the validation metrics evaluated during training. This is because batch normalization layers in the final network perform different operations than during training.
In particular, batch normalization layers use means and variances evaluated on each mini-batch during training, but use the whole training set for the final network.
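One way to see this effect, sketched here with illustrative variable names (trainedNet, imdsVal): evaluate the finalized network on the validation set and compare the result with the last validation point plotted during training.

```matlab
% After training completes, batch normalization layers in the returned
% network use statistics computed over the whole training set, so this
% accuracy can differ from the last point plotted during training.
YPred = classify(trainedNet, imdsVal);
finalValAccuracy = mean(YPred == imdsVal.Labels);
```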