It would help to have a larger training set in which both the positive and negative categories are represented.
The documentation referenced above states: “First a CNN is pretrained using the CIFAR-10 data set, which has 50,000 training images. Then this pretrained CNN is fine-tuned for stop sign detection using just 41 training images. Without pretraining the CNN, training the stop sign detector would require many more images.”
You could pretrain the network on a large generic dataset such as CIFAR-10 before fine-tuning it for your specific task, as in the documentation example. As with other neural networks, you may also need more training epochs to reach better classification accuracy.
I suggest the following steps:
1) Pretrain the network on a large dataset first, as discussed above.
2) Try tuning training options such as 'MiniBatchSize', 'InitialLearnRate' and 'MaxEpochs'.
3) Increase the training dataset size.
As the network trains, you can monitor the accuracy from epoch to epoch. Repeat the steps above until the accuracy is acceptable.
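As a rough sketch of step 2, the options named above can be set through trainingOptions. The numeric values here are only illustrative starting points, not recommendations, and the variables trainingData and layers in the commented-out call are hypothetical placeholders for your own labeled data and your pretrained network layers.

```matlab
% Illustrative values only; adjust them for your own data and hardware.
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 32, ...        % smaller batches can help with small training sets
    'InitialLearnRate', 1e-3, ...   % try lowering this if training diverges
    'MaxEpochs', 100, ...           % more epochs, as suggested above
    'Verbose', true);               % print training progress to the Command Window

% Hypothetical call: trainingData and layers are placeholders, assuming a
% table of labeled images and the layers of the pretrained CNN.
% rcnn = trainRCNNObjectDetector(trainingData, layers, options);
```

Changing one option at a time between runs makes it easier to see which setting actually affects the accuracy.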
Hope this helps