squeezesegv2Layers

Create SqueezeSegV2 segmentation network for organized lidar point cloud

Description

lgraph = squeezesegv2Layers(inputSize,numClasses) returns a SqueezeSegV2 layer graph lgraph for organized point clouds of size inputSize with numClasses segmentation classes.

SqueezeSegV2 is a convolutional neural network that predicts pointwise labels for an organized lidar point cloud.

Use the squeezesegv2Layers function to create the network architecture for SqueezeSegV2. This function requires Deep Learning Toolbox™.

lgraph = squeezesegv2Layers(___,Name,Value) specifies options using one or more name-value pair arguments in addition to the input arguments in the previous syntax. For example, 'NumEncoderModules',4 sets the number of encoders used to create the network to four.

Examples

Set the network input parameters.

inputSize = [64 512 5];
numClasses = 4;

Create a SqueezeSegV2 layer graph.

lgraph = squeezesegv2Layers(inputSize,numClasses)
lgraph = 
  LayerGraph with properties:

         Layers: [168x1 nnet.cnn.layer.Layer]
    Connections: [186x2 table]
     InputNames: {'input'}
    OutputNames: {'focalloss'}

Display the network.

analyzeNetwork(lgraph)

Set the network input parameters.

inputSize = [64 512 6];
numClasses = 2;

Create a custom SqueezeSegV2 layer graph.

lgraph = squeezesegv2Layers(inputSize,numClasses,...
'NumEncoderModules',4,'NumContextAggregationModules',2)
lgraph = 
  LayerGraph with properties:

         Layers: [232x1 nnet.cnn.layer.Layer]
    Connections: [257x2 table]
     InputNames: {'input'}
    OutputNames: {'focalloss'}

Display the network.

analyzeNetwork(lgraph)

Input Arguments

inputSize

Size of the network input, specified as one of these options:

  • Two-element vector of the form [height width].

  • Three-element vector of the form [height width channels], where channels specifies the number of input channels. Set channels to 3 for RGB images, to 1 for grayscale images, or to the number of channels for multispectral and hyperspectral images.

numClasses

Number of semantic segmentation classes, specified as an integer greater than 1.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'NumEncoderModules',4 sets the number of encoders used to create the network to four.

NumEncoderModules

Number of encoder modules used to create the network, specified as the comma-separated pair consisting of 'NumEncoderModules' and a nonnegative integer. Each encoder module consists of two fire modules and one max-pooling layer connected sequentially. If you specify 0, then the function returns a network with a default encoder that consists of convolution and max-pooling layers with no fire modules. Use this name-value pair to customize the number of fire modules in the network.

NumContextAggregationModules

Number of context aggregation modules (CAMs), specified as the comma-separated pair consisting of 'NumContextAggregationModules' and an integer in the range [0,3]. If you specify 0, then the function creates a network without a CAM.

Output Arguments

lgraph

Layers that represent the SqueezeSegV2 network architecture, returned as a layerGraph (Deep Learning Toolbox) object.

More About

SqueezeSegV2 Network

  • A SqueezeSegV2 network consists of encoder modules, CAMs, intermediate fixed fire modules [1] for feature extraction, and decoder modules. The function automatically configures the number of decoder modules based on the specified number of encoder modules.

  • The function uses the narrow-normal weight initialization method to initialize the weights of each convolution layer within the encoder and decoder subnetworks.

  • The function initializes all bias terms to zero.

  • The function adds padding to all convolution and max-pooling layers so that, when the stride equals 1, the output has the same size as the input.

  • The height of the input tensor is significantly lower than the width in organized lidar point cloud data. To address this, the network downsamples the width dimension of the input data in convolution and max-pooling layers. The width of the input data must be a multiple of 2^(D + 2), where D is the number of encoder modules used to create the network.

  • This function does not provide a recurrent conditional random field (CRF) layer.
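
The width constraint in the notes above can be checked before creating the network. The following is a minimal sketch, reading the required multiple as 2 raised to the power D + 2. The variable names (numEncoderModules, requiredMultiple, paddedWidth) are illustrative and are not part of squeezesegv2Layers, and rounding the width up is only one possible policy.

```matlab
% Illustrative values; these variable names are assumptions for this sketch.
numEncoderModules = 4;          % D, the value intended for 'NumEncoderModules'
inputSize = [64 500 5];         % [height width channels]

% The width must be a multiple of 2^(D + 2).
requiredMultiple = 2^(numEncoderModules + 2);

% Check whether the current width satisfies the constraint.
isValidWidth = mod(inputSize(2), requiredMultiple) == 0;

if ~isValidWidth
    % Round the width up to the next valid multiple. Rounding down with
    % floor (cropping) would be an equally reasonable choice.
    paddedWidth = requiredMultiple * ceil(inputSize(2) / requiredMultiple);
    fprintf('Width %d is not a multiple of %d; next valid width is %d.\n', ...
        inputSize(2), requiredMultiple, paddedWidth);
    inputSize(2) = paddedWidth;
end
```

With four encoder modules the required multiple is 64, so a width of 500 is adjusted to 512, which matches the width used in the custom network example. The point cloud data itself would need to be padded or resampled to the same width before training or inference.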

References

[1] Wu, Bichen, Xuanyu Zhou, Sicheng Zhao, Xiangyu Yue, and Kurt Keutzer. “SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a LiDAR Point Cloud.” In 2019 International Conference on Robotics and Automation (ICRA), 4376–82. Montreal, QC, Canada: IEEE, 2019. https://doi.org/10.1109/ICRA.2019.8793495.

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.

Introduced in R2020b