Invalid training data. The output size ([128 128 2]) of the last layer does not match the response size ([1 1 2]).

Good morning. I'm trying to create a U-Net network to perform semantic segmentation. I have already created a dataset of training images and their labels. These TIFF images are [128 128 3], so far so good, but when I try to start training I get this error: "Invalid training data. The output size ([128 128 2]) of the last layer does not match the response size ([1 1 2])."
This is the code for my layers:
lgraph = layerGraph(); % initialize the layer graph before adding layers
tempLayers = [
imageInputLayer([128 128 3],"Name","ImageInputTile")
convolution2dLayer([3 3],16,"Name","Encoder-Stage-1-Conv-1","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Encoder_Stage_1_Conv_1.Bias,"Weights",trainingSetup.Encoder_Stage_1_Conv_1.Weights)
reluLayer("Name","Encoder-Stage-1-ReLU-1")
convolution2dLayer([3 3],16,"Name","Encoder-Stage-1-Conv-2","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Encoder_Stage_1_Conv_2.Bias,"Weights",trainingSetup.Encoder_Stage_1_Conv_2.Weights)
reluLayer("Name","Encoder-Stage-1-ReLU-2")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
maxPooling2dLayer([2 2],"Name","Encoder-Stage-1-MaxPool","Stride",[2 2])
convolution2dLayer([3 3],32,"Name","Encoder-Stage-2-Conv-1","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Encoder_Stage_2_Conv_1.Bias,"Weights",trainingSetup.Encoder_Stage_2_Conv_1.Weights)
reluLayer("Name","Encoder-Stage-2-ReLU-1")
convolution2dLayer([3 3],32,"Name","Encoder-Stage-2-Conv-2","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Encoder_Stage_2_Conv_2.Bias,"Weights",trainingSetup.Encoder_Stage_2_Conv_2.Weights)
reluLayer("Name","Encoder-Stage-2-ReLU-2")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
maxPooling2dLayer([2 2],"Name","Encoder-Stage-2-MaxPool","Stride",[2 2])
convolution2dLayer([3 3],64,"Name","Encoder-Stage-3-Conv-1","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Encoder_Stage_3_Conv_1.Bias,"Weights",trainingSetup.Encoder_Stage_3_Conv_1.Weights)
reluLayer("Name","Encoder-Stage-3-ReLU-1")
convolution2dLayer([3 3],64,"Name","Encoder-Stage-3-Conv-2","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Encoder_Stage_3_Conv_2.Bias,"Weights",trainingSetup.Encoder_Stage_3_Conv_2.Weights)
reluLayer("Name","Encoder-Stage-3-ReLU-2")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
maxPooling2dLayer([2 2],"Name","Encoder-Stage-3-MaxPool","Stride",[2 2])
convolution2dLayer([3 3],128,"Name","Encoder-Stage-4-Conv-1","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Encoder_Stage_4_Conv_1.Bias,"Weights",trainingSetup.Encoder_Stage_4_Conv_1.Weights)
reluLayer("Name","Encoder-Stage-4-ReLU-1")
convolution2dLayer([3 3],128,"Name","Encoder-Stage-4-Conv-2","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Encoder_Stage_4_Conv_2.Bias,"Weights",trainingSetup.Encoder_Stage_4_Conv_2.Weights)
reluLayer("Name","Encoder-Stage-4-ReLU-2")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
dropoutLayer(0.5,"Name","Encoder-Stage-4-DropOut")
maxPooling2dLayer([2 2],"Name","Encoder-Stage-4-MaxPool","Stride",[2 2])
convolution2dLayer([3 3],256,"Name","Bridge-Conv-1","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Bridge_Conv_1.Bias,"Weights",trainingSetup.Bridge_Conv_1.Weights)
reluLayer("Name","Bridge-ReLU-1")
convolution2dLayer([3 3],256,"Name","Bridge-Conv-2","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Bridge_Conv_2.Bias,"Weights",trainingSetup.Bridge_Conv_2.Weights)
reluLayer("Name","Bridge-ReLU-2")
dropoutLayer(0.5,"Name","Bridge-DropOut")
transposedConv2dLayer([2 2],128,"Name","Decoder-Stage-1-UpConv","BiasLearnRateFactor",2,"Stride",[2 2],"Bias",trainingSetup.Decoder_Stage_1_UpConv.Bias,"Weights",trainingSetup.Decoder_Stage_1_UpConv.Weights)
reluLayer("Name","Decoder-Stage-1-UpReLU")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
depthConcatenationLayer(2,"Name","Decoder-Stage-1-DepthConcatenation")
convolution2dLayer([3 3],128,"Name","Decoder-Stage-1-Conv-1","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Decoder_Stage_1_Conv_1.Bias,"Weights",trainingSetup.Decoder_Stage_1_Conv_1.Weights)
reluLayer("Name","Decoder-Stage-1-ReLU-1")
convolution2dLayer([3 3],128,"Name","Decoder-Stage-1-Conv-2","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Decoder_Stage_1_Conv_2.Bias,"Weights",trainingSetup.Decoder_Stage_1_Conv_2.Weights)
reluLayer("Name","Decoder-Stage-1-ReLU-2")
transposedConv2dLayer([2 2],64,"Name","Decoder-Stage-2-UpConv","BiasLearnRateFactor",2,"Stride",[2 2],"Bias",trainingSetup.Decoder_Stage_2_UpConv.Bias,"Weights",trainingSetup.Decoder_Stage_2_UpConv.Weights)
reluLayer("Name","Decoder-Stage-2-UpReLU")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
depthConcatenationLayer(2,"Name","Decoder-Stage-2-DepthConcatenation")
convolution2dLayer([3 3],64,"Name","Decoder-Stage-2-Conv-1","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Decoder_Stage_2_Conv_1.Bias,"Weights",trainingSetup.Decoder_Stage_2_Conv_1.Weights)
reluLayer("Name","Decoder-Stage-2-ReLU-1")
convolution2dLayer([3 3],64,"Name","Decoder-Stage-2-Conv-2","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Decoder_Stage_2_Conv_2.Bias,"Weights",trainingSetup.Decoder_Stage_2_Conv_2.Weights)
reluLayer("Name","Decoder-Stage-2-ReLU-2")
transposedConv2dLayer([2 2],32,"Name","Decoder-Stage-3-UpConv","BiasLearnRateFactor",2,"Stride",[2 2],"Bias",trainingSetup.Decoder_Stage_3_UpConv.Bias,"Weights",trainingSetup.Decoder_Stage_3_UpConv.Weights)
reluLayer("Name","Decoder-Stage-3-UpReLU")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
depthConcatenationLayer(2,"Name","Decoder-Stage-3-DepthConcatenation")
convolution2dLayer([3 3],32,"Name","Decoder-Stage-3-Conv-1","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Decoder_Stage_3_Conv_1.Bias,"Weights",trainingSetup.Decoder_Stage_3_Conv_1.Weights)
reluLayer("Name","Decoder-Stage-3-ReLU-1")
convolution2dLayer([3 3],32,"Name","Decoder-Stage-3-Conv-2","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Decoder_Stage_3_Conv_2.Bias,"Weights",trainingSetup.Decoder_Stage_3_Conv_2.Weights)
reluLayer("Name","Decoder-Stage-3-ReLU-2")
transposedConv2dLayer([2 2],16,"Name","Decoder-Stage-4-UpConv","BiasLearnRateFactor",2,"Stride",[2 2],"Bias",trainingSetup.Decoder_Stage_4_UpConv.Bias,"Weights",trainingSetup.Decoder_Stage_4_UpConv.Weights)
reluLayer("Name","Decoder-Stage-4-UpReLU")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
depthConcatenationLayer(2,"Name","Decoder-Stage-4-DepthConcatenation")
convolution2dLayer([3 3],16,"Name","Decoder-Stage-4-Conv-1","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Decoder_Stage_4_Conv_1.Bias,"Weights",trainingSetup.Decoder_Stage_4_Conv_1.Weights)
reluLayer("Name","Decoder-Stage-4-ReLU-1")
convolution2dLayer([3 3],16,"Name","Decoder-Stage-4-Conv-2","BiasLearnRateFactor",2,"Padding","same","Bias",trainingSetup.Decoder_Stage_4_Conv_2.Bias,"Weights",trainingSetup.Decoder_Stage_4_Conv_2.Weights)
reluLayer("Name","Decoder-Stage-4-ReLU-2")
convolution2dLayer([1 1],2,"Name","Final-ConvolutionLayer","Bias",trainingSetup.Final_ConvolutionLayer.Bias,"Weights",trainingSetup.Final_ConvolutionLayer.Weights)
softmaxLayer("Name","Softmax-Layer")
pixelClassificationLayer("Name","Segmentation-Layer")];
lgraph = addLayers(lgraph,tempLayers);
% connect the encoder/decoder chain and the skip connections
% (standard U-Net wiring as exported by Deep Network Designer)
lgraph = connectLayers(lgraph,"Encoder-Stage-1-ReLU-2","Encoder-Stage-1-MaxPool");
lgraph = connectLayers(lgraph,"Encoder-Stage-1-ReLU-2","Decoder-Stage-4-DepthConcatenation/in2");
lgraph = connectLayers(lgraph,"Encoder-Stage-2-ReLU-2","Encoder-Stage-2-MaxPool");
lgraph = connectLayers(lgraph,"Encoder-Stage-2-ReLU-2","Decoder-Stage-3-DepthConcatenation/in2");
lgraph = connectLayers(lgraph,"Encoder-Stage-3-ReLU-2","Encoder-Stage-3-MaxPool");
lgraph = connectLayers(lgraph,"Encoder-Stage-3-ReLU-2","Decoder-Stage-2-DepthConcatenation/in2");
lgraph = connectLayers(lgraph,"Encoder-Stage-4-ReLU-2","Encoder-Stage-4-DropOut");
lgraph = connectLayers(lgraph,"Encoder-Stage-4-ReLU-2","Decoder-Stage-1-DepthConcatenation/in2");
lgraph = connectLayers(lgraph,"Decoder-Stage-1-UpReLU","Decoder-Stage-1-DepthConcatenation/in1");
lgraph = connectLayers(lgraph,"Decoder-Stage-2-UpReLU","Decoder-Stage-2-DepthConcatenation/in1");
lgraph = connectLayers(lgraph,"Decoder-Stage-3-UpReLU","Decoder-Stage-3-DepthConcatenation/in1");
lgraph = connectLayers(lgraph,"Decoder-Stage-4-UpReLU","Decoder-Stage-4-DepthConcatenation/in1");
% clean up helper variable
clear tempLayers;
I hope you can help me find a solution. Have a nice day!

Answers (1)

Abhishek on 15 May 2023
Edited: Abhishek on 15 May 2023
Hi Merzouk,
I understand that you are trying to perform image segmentation using a U-Net network.
Based on the error message, the network itself looks correct: the final convolution, softmax, and pixel classification layers produce an output of size [128 128 2], i.e. a two-class score map at the full image resolution, which is exactly what semantic segmentation requires. The mismatch is on the response side: trainNetwork is receiving responses of size [1 1 2], meaning each training image is paired with a single categorical label (one of two classes per image) rather than with a 128x128 pixel label image.
This typically happens when the labels are supplied in image classification form, for example an imageDatastore with per-image labels or a table of scalar categories, instead of as per-pixel label images. For semantic segmentation, each response must have the same spatial size as the input image, with one categorical label per pixel and one channel per class.
To fix this, load your label masks with 'pixelLabelDatastore' and pair them with your image datastore (for example with 'combine') before calling trainNetwork, so that each response is a 128x128 categorical label image.
For more information, refer to the documentation pages on 'pixelLabelDatastore' and 'Getting Started with Semantic Segmentation Using Deep Learning' (mathworks.com).
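As a minimal sketch of that data setup (the folder names imageDir and labelDir, the class names, the label IDs, and the training options are all assumptions — adjust them to your dataset):

```matlab
% Assumed class names and the pixel values used for them in the mask images
classNames = ["foreground","background"];
labelIDs   = [1 0];

imds = imageDatastore(imageDir);                           % 128x128x3 training images
pxds = pixelLabelDatastore(labelDir,classNames,labelIDs);  % 128x128 label masks

% Pair each image with its pixel label image so that trainNetwork
% receives 128x128 categorical responses instead of per-image labels
ds = combine(imds,pxds);

options = trainingOptions("adam","MaxEpochs",20,"MiniBatchSize",8);
net = trainNetwork(ds,lgraph,options);
```

You can also run analyzeNetwork(lgraph) beforehand to confirm that the last layer reports a 128x128x2 activation size.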
