Train Convolutional Neural Network for Regression

This example shows how to train a convolutional neural network to predict the angles of rotation of handwritten digits.

Regression tasks involve predicting continuous numerical values instead of discrete class labels. This example constructs a convolutional neural network architecture for regression, trains the network, and then uses the trained network to predict the angles of rotated handwritten digits.

This diagram illustrates the flow of image data through a regression neural network.

Diagram showing the flow of data through the neural network. The input to the network is a collection of images of digits. The outputs of the network are numeric scalars.

Load Data

The data set contains synthetic images of handwritten digits together with the corresponding angles (in degrees) by which each image is rotated.

Load the training and test data from the MAT files DigitsDataTrain.mat and DigitsDataTest.mat, respectively. The variables anglesTrain and anglesTest are the rotation angles in degrees. The training and test data sets each contain 5000 images.

load DigitsDataTrain
load DigitsDataTest

Display some of the training images.

numObservations = size(XTrain,4);
idx = randperm(numObservations,49);
I = imtile(XTrain(:,:,:,idx));
figure
imshow(I);


Partition XTrain and anglesTrain into training and validation partitions using the trainingPartitions function, attached to this example as a supporting file. To access this function, open the example as a live script. Set aside 15% of the training data for validation.

[idxTrain,idxValidation] = trainingPartitions(numObservations,[0.85 0.15]);

XValidation = XTrain(:,:,:,idxValidation);
anglesValidation = anglesTrain(idxValidation);

XTrain = XTrain(:,:,:,idxTrain);
anglesTrain = anglesTrain(idxTrain);
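
The trainingPartitions supporting file is not reproduced in this text. As a rough sketch of the idea only, assuming the function performs a random split with no overlap between partitions, the following base MATLAB code builds an 85/15 split of 5000 indices. The names numObs, idxShuffled, numTrainSketch, idxTrainSketch, and idxValSketch are hypothetical and chosen so that they do not overwrite the variables of the example.

```matlab
% Sketch of a random 85/15 index split (assumed behavior, not the
% actual code of the trainingPartitions supporting file).
numObs = 5000;                          % number of observations to split
idxShuffled = randperm(numObs);         % random ordering of 1..numObs
numTrainSketch = round(0.85*numObs);    % 4250 observations for training

idxTrainSketch = idxShuffled(1:numTrainSketch);
idxValSketch = idxShuffled(numTrainSketch+1:end);

% The two index sets are disjoint and together cover every observation.
assert(isempty(intersect(idxTrainSketch,idxValSketch)))
assert(numel(idxTrainSketch) + numel(idxValSketch) == numObs)
```

Splitting indices rather than the data itself lets the same index vectors select matching images and angles, as the example does with XTrain and anglesTrain.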

Define Neural Network Architecture

Define the neural network architecture.

For image input, specify an image input layer.
Specify four convolution-batchnorm-ReLU blocks with 8, 16, 32, and 32 filters, respectively.
After each of the first two blocks, specify an average pooling layer with pooling regions and stride of size 2.
For regression, include a fully connected layer with an output size that matches the number of responses.
In this example, the training process automatically normalizes the training targets using the NormalizeTargets training option (introduced in R2026a). Using normalized targets helps stabilize training and results in training predictions that closely match the normalized targets. To make the neural network output predictions in the space of unnormalized values at prediction time only, include an inverse normalization layer (introduced in R2026a). Before R2026a: To stabilize training, normalize the targets manually before you train the neural network.
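
To picture what normalizing the targets and inverting that normalization means, here is a minimal sketch in base MATLAB. It assumes z-score normalization (subtract the mean, divide by the standard deviation). The statistic that the NormalizeTargets option actually uses is not stated here, and the angle values and variable names are made up for illustration.

```matlab
% Sketch of target normalization and its inverse, assuming z-score
% normalization (an assumption, not the documented behavior of the option).
anglesSketch = [-40; -10; 0; 15; 35];            % made-up angles in degrees

mu = mean(anglesSketch);                         % center of the targets
sigma = std(anglesSketch);                       % scale of the targets

anglesNormalized = (anglesSketch - mu)/sigma;    % targets as seen during training
anglesRecovered = anglesNormalized*sigma + mu;   % inverse step at prediction time

% The inverse mapping returns the original values up to rounding error.
assert(max(abs(anglesRecovered - anglesSketch)) < 1e-10)
```

Under this assumption, the network learns to output values of roughly unit scale, and the inverse step converts those outputs back to degrees.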

numResponses = size(anglesTrain,2);

layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3,8,Padding="same")
    batchNormalizationLayer
    reluLayer
    averagePooling2dLayer(2,Stride=2)
    convolution2dLayer(3,16,Padding="same")
    batchNormalizationLayer
    reluLayer
    averagePooling2dLayer(2,Stride=2)
    convolution2dLayer(3,32,Padding="same")
    batchNormalizationLayer
    reluLayer
    convolution2dLayer(3,32,Padding="same")
    batchNormalizationLayer
    reluLayer
    fullyConnectedLayer(numResponses)
    inverseNormalizationLayer];
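
As a quick check of the architecture, the spatial size of the activations can be traced by hand. The sketch below uses only arithmetic and assumes the standard output-size formula for unpadded pooling. The variable names are illustrative and are not part of the example.

```matlab
% Trace the activation size through the layer array above.
% Convolutions with Padding="same" (stride 1) keep the spatial size;
% each 2-by-2 average pooling layer with stride 2 halves it.
inputSize = 28;
poolSize = 2;
poolStride = 2;

sizeAfterPool1 = floor((inputSize - poolSize)/poolStride) + 1;       % 14
sizeAfterPool2 = floor((sizeAfterPool1 - poolSize)/poolStride) + 1;  % 7

numFiltersLast = 32;                                % filters in the final convolution layer
numFeatures = sizeAfterPool2^2 * numFiltersLast;    % inputs to the fully connected layer

fprintf("Final feature map: %d-by-%d-by-%d (%d values)\n", ...
    sizeAfterPool2,sizeAfterPool2,numFiltersLast,numFeatures)
```

The fully connected layer therefore maps 1568 values to numResponses outputs, which is a single angle in this example.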

Specify Training Options

Specify the training options. Choosing among the options requires empirical analysis. To explore different training option configurations by running experiments, you can use the Experiment Manager app.

Train with a mini-batch size of 128.
Automatically normalize the training targets using the NormalizeTargets argument (introduced in R2026a). Before R2026a: To stabilize training, normalize the targets manually before you train the neural network.
Use an initial learning rate of 0.001 and a piecewise learning rate schedule that multiplies the learning rate by a factor of 0.1 every 20 epochs.
Validate the neural network using the validation data every epoch.
Display the training progress in a plot.
Disable the verbose output.

miniBatchSize = 128;

schedule = piecewiseLearnRate( ...
    DropFactor=0.1, ...
    Period=20);

numIterationsPerEpoch = floor(numel(anglesTrain)/miniBatchSize);

options = trainingOptions("sgdm", ...
    NormalizeTargets=true, ...
    MiniBatchSize=miniBatchSize, ...
    InitialLearnRate=1e-3, ...
    LearnRateSchedule=schedule, ... 
    Shuffle="every-epoch", ...
    ValidationData={XValidation,anglesValidation}, ...
    ValidationFrequency=numIterationsPerEpoch, ...
    Plots="training-progress", ...
    Verbose=false);
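
To make the schedule and the validation frequency concrete, the following sketch computes the learning rate at a few epochs and the number of iterations per epoch. It assumes that the rate is multiplied by the drop factor after each full period of 20 epochs and that the training partition holds 4250 images (85% of 5000). The epoch values are chosen for illustration only and do not reflect the number of epochs that the example trains for.

```matlab
% Sketch of the learning rate per epoch under a piecewise schedule,
% assuming the rate is multiplied by the drop factor after every full period.
initialLearnRate = 1e-3;
dropFactor = 0.1;
period = 20;
epochs = [1 20 21 40 41];    % epochs chosen for illustration only

learnRate = initialLearnRate * dropFactor.^floor((epochs - 1)/period);
disp(learnRate)

% Iterations per epoch for 4250 training images with a mini-batch size of 128.
% The floor drops the final partial mini-batch from the count.
itersPerEpoch = floor(4250/128);
disp(itersPerEpoch)
```

With these numbers, setting ValidationFrequency to the iterations per epoch triggers validation once every 33 iterations, that is, once per epoch.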

Train Neural Network

Train the neural network using the trainnet function. For regression, use mean squared error loss. By default, the trainnet function uses a GPU if one is available and uses the CPU otherwise. Using a GPU requires a Parallel Computing Toolbox™ license and a supported GPU device. For information on supported devices, see GPU Computing Requirements (Parallel Computing Toolbox). To specify the execution environment, use the ExecutionEnvironment training option.

net = trainnet(XTrain,anglesTrain,layers,"mse",options);

Test Network

Test the neural network using the testnet function. For regression, evaluate the root mean squared error (RMSE). By default, the testnet function uses a GPU if one is available. To select the execution environment manually, use the ExecutionEnvironment argument of the testnet function.

rmse = testnet(net,XTest,anglesTest,"rmse")

rmse = 
7.6861
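
For reference, the RMSE of a scalar response can be computed by hand as the square root of the mean squared difference between predictions and targets. The sketch below applies this standard definition to made-up values and assumes that it matches what the "rmse" metric reports for this network. The variable names are hypothetical.

```matlab
% Root mean squared error computed by hand on made-up values.
predSketch = [10; -5; 30];      % hypothetical predictions in degrees
targetSketch = [12; -9; 30];    % hypothetical targets in degrees

errSketch = predSketch - targetSketch;    % prediction errors: [-2; 4; 0]
rmseSketch = sqrt(mean(errSketch.^2));    % sqrt((4 + 16 + 0)/3)
```

Because the RMSE has the same units as the targets, the value reported above reads as a typical error of about 7.7 degrees.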

Visualize the accuracy in a plot by making predictions with the test data and comparing the predictions with the targets. Make predictions using the minibatchpredict function. By default, the minibatchpredict function uses a GPU if one is available.

YTest = minibatchpredict(net,XTest);

Plot the predicted values against the targets.

figure
scatter(YTest,anglesTest,"+")
xlabel("Prediction")
ylabel("Target")

hold on
plot([-60 60], [-60 60],"r--")

Figure: scatter plot with a reference line. The x-axis shows Prediction and the y-axis shows Target.

Make Predictions with New Data

Use the neural network to make a prediction with the first test image. To make a prediction with a single image, use the predict function. To use a GPU, first convert the data to gpuArray.

X = XTest(:,:,:,1);
if canUseGPU
    X = gpuArray(X);
end
Y = predict(net,X)

Y = single

33.0647

figure
imshow(X)
title("Angle: " + gather(Y))

Figure: the test image with the title "Angle: 33.0647".