Code Generation for Semantic Segmentation Network

This example demonstrates code generation for an image segmentation application that uses deep learning. It uses the codegen command to generate a MEX function that runs prediction on a DAG Network object for SegNet [1], a popular deep learning network for image segmentation.

Prerequisites

  • CUDA® enabled NVIDIA® GPU with compute capability 3.2 or higher.

  • NVIDIA CUDA toolkit and driver.

  • NVIDIA cuDNN library (v5 and above).

  • Deep Learning Toolbox™ to use a DAG Network object.

  • Image Processing Toolbox™ for reading and displaying images.

  • Computer Vision System Toolbox™ for the labeloverlay function used in this example.

  • GPU Coder™ for generating CUDA code.

  • GPU Coder Interface for Deep Learning Libraries support package. To install this support package, use the Add-On Explorer.

  • Environment variables for the compilers and libraries. For information on the supported versions of the compilers and libraries, see Third-party Products. For setting up the environment variables, see Setting Up the Prerequisite Products.

Verify the GPU Environment

Use the coder.checkGpuInstall function to verify that the compilers and libraries needed for running this example are set up correctly.

coder.checkGpuInstall('gpu','codegen','cudnn','quiet');

About the Segmentation Network

SegNet [1] is a convolutional neural network (CNN) designed for semantic image segmentation. It is a deep encoder-decoder network that performs multi-class, pixel-wise segmentation. The network used in this example was trained on the CamVid [2] dataset and imported into MATLAB® for inference. It assigns each pixel to one of 11 classes: Sky, Building, Pole, Road, Pavement, Tree, SignSymbol, Fence, Car, Pedestrian, and Bicyclist.

For information regarding training a semantic segmentation network in MATLAB using the CamVid [2] dataset, see Semantic Segmentation Using Deep Learning.

About the 'segnet_predict' Function

The segnet_predict.m function takes an input image and runs prediction on it using the deep learning network saved in the SegNet.mat file. On the first call, the function loads the network object from SegNet.mat into a persistent variable mynet. On subsequent calls, the persistent object is reused for prediction.

type('segnet_predict.m')
% Copyright 2018 The MathWorks, Inc.

function out = segnet_predict(in)
%#codegen

% A persistent object mynet is used to load the DAG network object.
% At the first call to this function, the persistent object is constructed and
% set up. When the function is called subsequent times, the same object is reused
% to call predict on inputs, thus avoiding reconstructing and reloading the
% network object.

persistent mynet;

if isempty(mynet)
    mynet = coder.loadDeepLearningNetwork('SegNet.mat');
end

% pass in input
out = predict(mynet,in);


Get the Pretrained SegNet DAG Network Object

net = getSegNet();
Downloading pretrained SegNet (107 MB)...

The DAG network contains 91 layers, including convolution, batch normalization, pooling, unpooling, and pixel classification output layers.

net.Layers
ans = 

  91x1 Layer array with layers:

     1   'inputImage'        Image Input                  360x480x3 images with 'zerocenter' normalization
     2   'conv1_1'           Convolution                  64 3x3x3 convolutions with stride [1  1] and padding [1  1  1  1]
     3   'bn_conv1_1'        Batch Normalization          Batch normalization with 64 channels
     4   'relu1_1'           ReLU                         ReLU
     5   'conv1_2'           Convolution                  64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
     6   'bn_conv1_2'        Batch Normalization          Batch normalization with 64 channels
     7   'relu1_2'           ReLU                         ReLU
     8   'pool1'             Max Pooling                  2x2 max pooling with stride [2  2] and padding [0  0  0  0]
     9   'conv2_1'           Convolution                  128 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    10   'bn_conv2_1'        Batch Normalization          Batch normalization with 128 channels
    11   'relu2_1'           ReLU                         ReLU
    12   'conv2_2'           Convolution                  128 3x3x128 convolutions with stride [1  1] and padding [1  1  1  1]
    13   'bn_conv2_2'        Batch Normalization          Batch normalization with 128 channels
    14   'relu2_2'           ReLU                         ReLU
    15   'pool2'             Max Pooling                  2x2 max pooling with stride [2  2] and padding [0  0  0  0]
    16   'conv3_1'           Convolution                  256 3x3x128 convolutions with stride [1  1] and padding [1  1  1  1]
    17   'bn_conv3_1'        Batch Normalization          Batch normalization with 256 channels
    18   'relu3_1'           ReLU                         ReLU
    19   'conv3_2'           Convolution                  256 3x3x256 convolutions with stride [1  1] and padding [1  1  1  1]
    20   'bn_conv3_2'        Batch Normalization          Batch normalization with 256 channels
    21   'relu3_2'           ReLU                         ReLU
    22   'conv3_3'           Convolution                  256 3x3x256 convolutions with stride [1  1] and padding [1  1  1  1]
    23   'bn_conv3_3'        Batch Normalization          Batch normalization with 256 channels
    24   'relu3_3'           ReLU                         ReLU
    25   'pool3'             Max Pooling                  2x2 max pooling with stride [2  2] and padding [0  0  0  0]
    26   'conv4_1'           Convolution                  512 3x3x256 convolutions with stride [1  1] and padding [1  1  1  1]
    27   'bn_conv4_1'        Batch Normalization          Batch normalization with 512 channels
    28   'relu4_1'           ReLU                         ReLU
    29   'conv4_2'           Convolution                  512 3x3x512 convolutions with stride [1  1] and padding [1  1  1  1]
    30   'bn_conv4_2'        Batch Normalization          Batch normalization with 512 channels
    31   'relu4_2'           ReLU                         ReLU
    32   'conv4_3'           Convolution                  512 3x3x512 convolutions with stride [1  1] and padding [1  1  1  1]
    33   'bn_conv4_3'        Batch Normalization          Batch normalization with 512 channels
    34   'relu4_3'           ReLU                         ReLU
    35   'pool4'             Max Pooling                  2x2 max pooling with stride [2  2] and padding [0  0  0  0]
    36   'conv5_1'           Convolution                  512 3x3x512 convolutions with stride [1  1] and padding [1  1  1  1]
    37   'bn_conv5_1'        Batch Normalization          Batch normalization with 512 channels
    38   'relu5_1'           ReLU                         ReLU
    39   'conv5_2'           Convolution                  512 3x3x512 convolutions with stride [1  1] and padding [1  1  1  1]
    40   'bn_conv5_2'        Batch Normalization          Batch normalization with 512 channels
    41   'relu5_2'           ReLU                         ReLU
    42   'conv5_3'           Convolution                  512 3x3x512 convolutions with stride [1  1] and padding [1  1  1  1]
    43   'bn_conv5_3'        Batch Normalization          Batch normalization with 512 channels
    44   'relu5_3'           ReLU                         ReLU
    45   'pool5'             Max Pooling                  2x2 max pooling with stride [2  2] and padding [0  0  0  0]
    46   'decoder5_unpool'   Max Unpooling                Max Unpooling
    47   'decoder5_conv3'    Convolution                  512 3x3x512 convolutions with stride [1  1] and padding [1  1  1  1]
    48   'decoder5_bn_3'     Batch Normalization          Batch normalization with 512 channels
    49   'decoder5_relu_3'   ReLU                         ReLU
    50   'decoder5_conv2'    Convolution                  512 3x3x512 convolutions with stride [1  1] and padding [1  1  1  1]
    51   'decoder5_bn_2'     Batch Normalization          Batch normalization with 512 channels
    52   'decoder5_relu_2'   ReLU                         ReLU
    53   'decoder5_conv1'    Convolution                  512 3x3x512 convolutions with stride [1  1] and padding [1  1  1  1]
    54   'decoder5_bn_1'     Batch Normalization          Batch normalization with 512 channels
    55   'decoder5_relu_1'   ReLU                         ReLU
    56   'decoder4_unpool'   Max Unpooling                Max Unpooling
    57   'decoder4_conv3'    Convolution                  512 3x3x512 convolutions with stride [1  1] and padding [1  1  1  1]
    58   'decoder4_bn_3'     Batch Normalization          Batch normalization with 512 channels
    59   'decoder4_relu_3'   ReLU                         ReLU
    60   'decoder4_conv2'    Convolution                  512 3x3x512 convolutions with stride [1  1] and padding [1  1  1  1]
    61   'decoder4_bn_2'     Batch Normalization          Batch normalization with 512 channels
    62   'decoder4_relu_2'   ReLU                         ReLU
    63   'decoder4_conv1'    Convolution                  256 3x3x512 convolutions with stride [1  1] and padding [1  1  1  1]
    64   'decoder4_bn_1'     Batch Normalization          Batch normalization with 256 channels
    65   'decoder4_relu_1'   ReLU                         ReLU
    66   'decoder3_unpool'   Max Unpooling                Max Unpooling
    67   'decoder3_conv3'    Convolution                  256 3x3x256 convolutions with stride [1  1] and padding [1  1  1  1]
    68   'decoder3_bn_3'     Batch Normalization          Batch normalization with 256 channels
    69   'decoder3_relu_3'   ReLU                         ReLU
    70   'decoder3_conv2'    Convolution                  256 3x3x256 convolutions with stride [1  1] and padding [1  1  1  1]
    71   'decoder3_bn_2'     Batch Normalization          Batch normalization with 256 channels
    72   'decoder3_relu_2'   ReLU                         ReLU
    73   'decoder3_conv1'    Convolution                  128 3x3x256 convolutions with stride [1  1] and padding [1  1  1  1]
    74   'decoder3_bn_1'     Batch Normalization          Batch normalization with 128 channels
    75   'decoder3_relu_1'   ReLU                         ReLU
    76   'decoder2_unpool'   Max Unpooling                Max Unpooling
    77   'decoder2_conv2'    Convolution                  128 3x3x128 convolutions with stride [1  1] and padding [1  1  1  1]
    78   'decoder2_bn_2'     Batch Normalization          Batch normalization with 128 channels
    79   'decoder2_relu_2'   ReLU                         ReLU
    80   'decoder2_conv1'    Convolution                  64 3x3x128 convolutions with stride [1  1] and padding [1  1  1  1]
    81   'decoder2_bn_1'     Batch Normalization          Batch normalization with 64 channels
    82   'decoder2_relu_1'   ReLU                         ReLU
    83   'decoder1_unpool'   Max Unpooling                Max Unpooling
    84   'decoder1_conv2'    Convolution                  64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    85   'decoder1_bn_2'     Batch Normalization          Batch normalization with 64 channels
    86   'decoder1_relu_2'   ReLU                         ReLU
    87   'decoder1_conv1'    Convolution                  11 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    88   'decoder1_bn_1'     Batch Normalization          Batch normalization with 11 channels
    89   'decoder1_relu_1'   ReLU                         ReLU
    90   'softmax'           Softmax                      softmax
    91   'labels'            Pixel Classification Layer   Class weighted cross-entropy loss with 'Sky', 'Building', and 9 other classes
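
The listing shows five 2x2 max pooling layers with stride [2 2] and no padding. As a rough illustration of how the encoder shrinks the 360x480 input, the following sketch applies the common floor-based output-size rule to each pooling stage. The rule and the variable names (sz, poolSize, stride, sizes) are assumptions made for this illustration and are not taken from the example files.

```matlab
% Spatial size after each of the five 2x2, stride-2 max pooling layers,
% assuming the floor-based output-size rule with zero padding.
sz = [360 480];          % height and width from the 'inputImage' layer
poolSize = 2;
stride = 2;
sizes = zeros(5,2);      % one row per pooling layer (pool1 to pool5)
for k = 1:5
    sz = floor((sz - poolSize)/stride) + 1;
    sizes(k,:) = sz;
end
disp(sizes)
```

Under this assumption, the height of 45 after pool3 is floored to 22 after pool4, so the pooled size alone does not determine the original size. This is one reason the decoder's max unpooling layers rely on information saved by the matching pooling layers.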

Run MEX Code Generation for 'segnet_predict' Function

To generate CUDA code from the design file segnet_predict.m, create a GPU code configuration object for a MEX target and set the target language to C++. Use the coder.DeepLearningConfig function to create a cuDNN deep learning configuration object and assign it to the DeepLearningConfig property of the GPU code configuration object. Run the codegen command, specifying an input of size [360,480,3]. This value corresponds to the input layer size of SegNet.

cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
codegen -config cfg segnet_predict -args {ones(360,480,3,'uint8')} -report
Code generation successful: To view the report, open('codegen/mex/segnet_predict/html/report.mldatx').

Run the Generated MEX

Load and display an input image.

im = imread('gpucoder_segnet_image.png');
imshow(im);
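
The MEX function was generated for a fixed-size 360x480x3 uint8 input, so an image with a different size or class is rejected at call time. A minimal sketch of a check to run before the call is shown here. It uses an all-zero array as a stand-in for a real image, and the names expectedSize and isValidInput are hypothetical.

```matlab
% Stand-in for a loaded image; in practice this would come from imread.
im = zeros(360,480,3,'uint8');

% Size and class passed to codegen through the -args option.
expectedSize = [360 480 3];

% The input must match both the class and the size used at code generation.
isValidInput = isa(im,'uint8') && isequal(size(im),expectedSize);
if ~isValidInput
    error('Input must be a 360x480x3 uint8 array.');
end
```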

Call segnet_predict on the input image.

predict_scores = segnet_predict_mex(im);

The predict_scores variable is a three-dimensional array with 11 channels, one per class, containing the pixel-wise prediction scores. To get pixel-wise labels, find the index of the channel with the maximum prediction score at each pixel.

[~,argmax] = max(predict_scores,[],3);
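
To see what the max call along the third dimension returns, consider a toy score array for a 2x2 image with 3 classes. The values and the names scores, maxScore, and labels are made up for illustration.

```matlab
% Toy scores: 2x2 image, 3 classes stacked along the third dimension.
scores = zeros(2,2,3);
scores(:,:,1) = [0.7 0.1; 0.2 0.3];
scores(:,:,2) = [0.2 0.8; 0.3 0.3];
scores(:,:,3) = [0.1 0.1; 0.5 0.4];

% The second output is the index of the winning channel at each pixel.
[maxScore,labels] = max(scores,[],3);
disp(labels)
```

Here labels is [1 2; 3 3]: each entry is the class index with the highest score at that pixel, which is the form that labeloverlay expects for a numeric label matrix.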

Overlay the segmented labels onto the input image and display the segmented regions.

classes = [
    "Sky"
    "Building"
    "Pole"
    "Road"
    "Pavement"
    "Tree"
    "SignSymbol"
    "Fence"
    "Car"
    "Pedestrian"
    "Bicyclist"
    ];

cmap = camvidColorMap();
SegmentedImage = labeloverlay(im,argmax,'ColorMap',cmap);
figure
imshow(SegmentedImage);
pixelLabelColorbar(cmap,classes);
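
Beyond the visual overlay, a label matrix can also be summarized numerically. The following sketch counts the pixels assigned to each class and prints the share of the image each class covers. It uses a small made-up label map and a shortened class list; labelMap, classNames, counts, and fractions are hypothetical names, not part of the example files.

```matlab
% Hypothetical label map; each value indexes into classNames.
labelMap = [1 1 2; 1 3 2];
classNames = {'Sky','Building','Pole'};   % shortened list for illustration
numClasses = numel(classNames);

% Count the pixels per class, then convert the counts to fractions.
counts = accumarray(labelMap(:),1,[numClasses 1]);
fractions = counts / numel(labelMap);

for k = 1:numClasses
    fprintf('%-10s %5.1f%%\n',classNames{k},100*fractions(k));
end
```

The same pattern should carry over to the full result by substituting the label matrix computed from predict_scores and the 11 class names.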

References

[1] Badrinarayanan, Vijay, Alex Kendall, and Roberto Cipolla. "SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation." arXiv preprint arXiv:1511.00561, 2015.

[2] Brostow, Gabriel J., Julien Fauqueur, and Roberto Cipolla. "Semantic object classes in video: A high-definition ground truth database." Pattern Recognition Letters, Vol. 30, Issue 2, 2009, pp. 88-97.