Semantic Segmentation Using FCN-AlexNet

version (2.27 MB) by Kei Otsuka
How to create, train and evaluate FCN-AlexNet for semantic segmentation


Updated 09 Feb 2018

This demo shows how to create, train and evaluate an AlexNet-based Fully Convolutional Network (FCN) for semantic segmentation. Computer Vision System Toolbox provides the fcnLayers function, which makes it easy to define an FCN, but that network is based on VGG-16. Try FCN-AlexNet if you want a more compact network to lower the computational cost, or if you want an FCN based on a network other than VGG-16 as a baseline for performance comparison.

Comments and Ratings (20)


Could you please give me feedback on an AlexNet FCN-8s that I implemented? I compared my results with those obtained with your AlexNet FCN-32s implementation, but I can't tell whether my results are what they should be.

Kei Otsuka

Yes, you can create an AlexNet-based FCN that produces a 16x or 8x upsampled prediction. As you mentioned, AlexNet has 3 pooling layers: pool1, pool2 and pool5. For example, by combining the predictions from the pool5 layer and the pool2 layer, at stride 16, we can obtain a 16x upsampled prediction.
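
As a rough check of the stride arithmetic behind this answer, the sketch below multiplies the strides of AlexNet's downsampling layers. The stride values (4 for conv1, 2 for each pooling layer) are the standard AlexNet ones and are an assumption here, not something stated in the thread; the variable names are illustrative only.

```matlab
% Downsampling layers of AlexNet and their strides (assumed standard values).
layerNames = {'conv1', 'pool1', 'pool2', 'pool5'};
strides    = [4, 2, 2, 2];

% Cumulative stride: how many input pixels one output pixel of each layer spans.
cumStride = cumprod(strides);

for k = 1:numel(layerNames)
    fprintf('%-6s cumulative stride: %2d\n', layerNames{k}, cumStride(k));
end
```

Under these assumptions pool5 sits at stride 32 and pool2 at stride 16, which matches the 16x figure above; pool1 sits at stride 8, so fusing it as well would presumably give the 8x variant.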


Is it possible to implement AlexNet as FCN-16s or FCN-8s? According to the paper "Fully Convolutional Networks for Semantic Segmentation", the FCN architecture has 5 pooling layers, but the original AlexNet has only 3.

Kei Otsuka

This example uses the FCN-32s architecture, but you can create FCN-16s or FCN-8s by fusing features from layers of different coarseness.


Is the implementation based on FCN-32s, FCN-16s or FCN-8s?

Kei Otsuka

Basically, all images handled by the datastore need to have pixel labels, since the datastore in this example is intended for training or validation. You can include images that do not have pixel labels, but the segmentation accuracy of the trained network might be degraded.

Thanks a lot for this work.
I have a question: do we have to provide pixel labels for all images in the image datastore?

Kei Otsuka

AlexNet can only process RGB images with 3 channels. If you would like to create a network that performs semantic segmentation of a multispectral image with 6 channels, this example might be helpful.

How can I perform semantic segmentation of a multispectral image (6-channel input to the network) using FCN-AlexNet?

Kei Otsuka

CamVid pixel label IDs are provided as RGB color values, and my example lists the CamVid class names alongside each RGB value. You can change these values to match your own label IDs. As for the padding, [100 100] guarantees that the network output can be aligned to the input for any input size in the given datasets. The alignment is handled automatically by the network specification and the crop layer. It is possible, though less convenient, to calculate the exact offsets needed and do away with this amount of padding.
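
To illustrate the first point, here is a minimal sketch of how class names and RGB label IDs for a custom dataset might be laid out. The class names, colors and the labelDir variable are hypothetical and do not come from CamVid or from this submission; the sketch only assumes that pixelLabelDatastore accepts one M-by-3 matrix of RGB rows per class.

```matlab
% Hypothetical class names and RGB label colors for a custom dataset.
% None of these values come from CamVid; replace them with your own.
classes = {'Sky', 'Road', 'Other'};
labelIDs = {
    [0 0 255]               % Sky: a single RGB color
    [128 128 0; 255 255 0]  % Road: two RGB colors merged into one class
    [0 0 0]                 % Other
    };

% Sanity checks: one entry per class, each with 3 columns (R, G, B).
assert(numel(classes) == numel(labelIDs));
assert(all(cellfun(@(ids) size(ids, 2) == 3, labelIDs)));

% With Computer Vision System Toolbox, and labelDir (hypothetical) pointing
% at the folder of label images, the datastore would then be created as:
% pxds = pixelLabelDatastore(labelDir, classes, labelIDs);
```

Listing several RGB rows for one class, as for Road here, is one way to merge fine-grained labels into a coarser class.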

As Has

Thanks for your great effort. I have two questions. First, how can I change the values for sky and other objects (sky=[128 128..]) in the camvidPixelLabelIDs function to suit my own images? Second, if I want to change the input image size from [360 480], how did you compute the [100 100] padding?

Kei Otsuka

It looks like your code includes unnecessary spaces.
'score / ref' should be 'score/ref'.
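
A minimal sketch of the corrected call, using a toy three-layer graph rather than the actual FCN-AlexNet from this submission. The layer sizes are arbitrary and only the names 'data' and 'score' follow the thread; it assumes Neural Network Toolbox and Computer Vision System Toolbox are installed.

```matlab
% Toy graph: input -> padded convolution -> center crop.
layers = [
    imageInputLayer([360 480 3], 'Name', 'data')
    convolution2dLayer(3, 8, 'Padding', 100, 'Name', 'conv')
    crop2dLayer('centercrop', 'Name', 'score')
    ];

% layerGraph connects the layers in sequence, so 'conv' feeds 'score/in'.
lgraph = layerGraph(layers);

% The crop layer's second input is named 'ref'. The destination must be
% written as 'score/ref' with no spaces around the slash.
lgraph = connectLayers(lgraph, 'data', 'score/ref');

disp(lgraph.Connections);
```

The destination string is parsed as layer name, slash, input name, which is why the stray spaces led to the "Layer 'score ' does not exist" message.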

As Has

When I write this line, I get an error:
lgraph = connectLayers (lgraph, 'data' , 'score / ref' );

error :-
Layer 'score ' does not exist.

Error in nnet.cnn.LayerGraph>iGetDestinationInformation (line 557)
iValidateLayerName( endLayerName, layers );

Kei Otsuka


A network created with fcnLayers is preinitialized using layers and weights from the VGG-16 network. If you want to create an FCN that is preinitialized using layers and weights from another network such as AlexNet, you need to create the network manually. This example shows how to do that.

As for SegNet, you can use the segnetLayers function to create a SegNet that uses layers and weights from the VGG-16 or VGG-19 network. This should help you understand the SegNet architecture, and you can then replace layers to create an AlexNet-based SegNet.

As Has

Thanks so much, I really need it. If you don't mind, I have two questions:
If I want to implement U-Net and SegNet based on AlexNet, what do I have to do?
Why don't you use MATLAB's fcnLayers function?

Thanks a lot for your help. My mail :

Kei Otsuka

I have a pretrained FCN-AlexNet that I can share with you, but the file size is more than 200 MB and I cannot upload it to File Exchange because of the size limit. May I have your email address? I may be able to share the pretrained network through another file transfer system.

Could you provide a link to pretrained weights for FCN-AlexNet, as there is for segnetVGG16CamVid:
pretrainedURL = '';

Kei Otsuka

Thanks for the note. The error message indicates that you are using the CPU for training. I will try to find out where the issue is, but can you use a GPU for training? Since training takes a few hours even on a GPU, training a deep neural network on a CPU is impractical in many cases.
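
For reference, a minimal sketch of requesting GPU training explicitly through trainingOptions. The epoch count and mini-batch size are hypothetical placeholders, not the settings used in this submission, and the 'gpu' setting assumes Parallel Computing Toolbox and a supported GPU are available.

```matlab
% Request GPU training explicitly. With the default 'auto' setting the
% software falls back to the CPU when no supported GPU is found.
options = trainingOptions('sgdm', ...
    'ExecutionEnvironment', 'gpu', ...
    'MaxEpochs', 50, ...        % hypothetical value
    'MiniBatchSize', 4);        % hypothetical value

disp(options.ExecutionEnvironment);
```

The first line of the training log ("Training on single CPU." in the report below) shows which environment was actually selected.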

Thank you for your great effort.
An error message appears when network training starts, as follows:
Training on single CPU.
Initializing image normalization.
| Epoch | Iteration | Time Elapsed | Mini-batch | Mini-batch | Base Learning |
| | | (hh:mm:ss) | Accuracy | Loss | Rate |
Error using trainNetwork (line 154)
Padding exceeds array bounds.

Error in fcnAlexNetExample (line 273)
[net, info] =

Caused by:
Error using builtin
Padding exceeds array bounds.

Thanks in advance


update html files

MATLAB Release Compatibility
Created with R2017b
Compatible with any release
Platform Compatibility
Windows macOS Linux