trainNetwork function features dimensions problem

I'm working on the implementation of an LSTM classification model.
As input I have several time series, and as output some categorical values (labels).
This is my code:
close all
clearvars -except ts data labels
clc
for ii = 1 : numel(ts)
    timeVec = datenum(ts(ii).Timetable.Time);
    data{ii} = [timeVec, data{ii}];
end
numChannels = size(data{1}, 2);
numHiddenUnits = 120;
numClasses = 2;
layers = [
    sequenceInputLayer(numChannels)
    lstmLayer(numHiddenUnits,'OutputMode','last')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];
maxEpochs = 200;
miniBatchSize = 27;
options = trainingOptions('adam', ...
    'ExecutionEnvironment','cpu', ...
    'MaxEpochs', maxEpochs, ...
    'MiniBatchSize', miniBatchSize, ...
    'GradientThreshold', 1, ...
    'Verbose', false, ...
    'Plots', 'training-progress');
idxTrain = [1 2 3 9 10 11];
idxTest = [4 7 8];
Xtrain = cell(numel(idxTrain), 1);
Ytrain = categorical(labels(idxTrain))';
for ii = 1 : numel(idxTrain)
    Xtrain{ii} = data{idxTrain(ii)};
    Ytrain(ii) = categorical(labels(idxTrain(ii)));
end
Xtest = cell(numel(idxTest), 1);
Ytest = categorical(labels(idxTest))';
for ii = 1 : numel(idxTest)
    Xtest{ii} = data{idxTest(ii)};
    Ytest(ii) = categorical(labels(idxTest(ii)));
end
net = trainNetwork(Xtrain', Ytrain, layers, options);
When I run it, I get this error:
Error using trainNetwork
Invalid training data. Predictors must be a N-by-1 cell array of sequences, where N is the
number of sequences. All sequences must have the same feature dimension and at least one time
step.
Error in main (line 244)
net = trainNetwork(Xtrain', Ytrain, layers, options);
I think it is related to the different sizes of the matrices inside each of the Xtrain cells.
Here is a screenshot with the dimensions of the different cells of Xtrain.
Is there any way to train the model using inputs with different dimensions?
  5 Comments
Ganesh on 20 Jun 2024
@Marco, how would you differentiate each of the "time" frames? Or say you complete the training and now want to make a prediction. You give the model a set of 12 columns as input, and it would not know which of the 6 time frames you are referring to.
Marco on 20 Jun 2024
I can't combine them; I must keep them separate.
I'll try training the model on one cell at a time and saving the parameters.
The time vector is always the first column of the matrix.
Once the training is done, I'll give the model another matrix and it should classify it into one of the two classes.

Answers (1)

Ayush Aniket on 20 Jun 2024
Hi Marco,
For a dataset of sequences, the trainNetwork function expects an N-by-1 cell array in which each element is a c-by-s matrix, where c is the number of features of the sequence and s is the sequence length. Refer to the following documentation link to read about the various input formats: https://www.mathworks.com/help/deeplearning/ref/trainnetwork.html#mw_36a68d96-8505-4b8d-b338-44e1efa9cc5e
From the attached screenshot, it seems that the matrices in Xtrain are stored as s-by-c. You should transpose them into the required c-by-s format.
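As a rough sketch of that conversion, the snippet below uses made-up stand-in data (three random sequences with 12 columns and different lengths; `XtrainRows` is a hypothetical name, not from the question). It transposes each matrix to c-by-s and forces the cell array itself to be N-by-1. Note that the call `trainNetwork(Xtrain', ...)` in the question transposes the cell array into 1-by-N, which may contribute to the same error.

```matlab
% Hypothetical stand-in data: three sequences stored as s-by-c
% (time steps in rows, 12 feature columns), held in a 1-by-3 cell array.
XtrainRows = {rand(50, 12), rand(80, 12), rand(65, 12)};

% Transpose every sequence to c-by-s (features in rows, time steps in columns).
Xtrain = cellfun(@(x) x.', XtrainRows, 'UniformOutput', false);

% Make the cell array itself N-by-1, as the error message requires.
Xtrain = Xtrain(:);

% Check that all sequences share the same feature dimension.
numFeatures = cellfun(@(x) size(x, 1), Xtrain);
assert(all(numFeatures == numFeatures(1)), ...
    'Sequences have different feature dimensions.');
```

With the data in this layout, the value passed to sequenceInputLayer should equal the number of rows of each matrix (12 in this made-up example), and Xtrain would be passed to the training function without the trailing transpose.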
Also, you do not need all the sequences to have the same length. The software pads them internally. You can read about it here: https://www.mathworks.com/help/deeplearning/ug/classify-sequence-data-using-lstm-networks.html#ClassifySequenceDataUsingLSTMNetworksExample-2
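Since padding is applied per mini-batch, sorting the sequences by length before training, as the linked example does, can reduce how much padding is added. A minimal sketch, assuming made-up c-by-s sequences and placeholder labels 'a' and 'b' (neither comes from the question):

```matlab
% Hypothetical c-by-s sequences of different lengths, with placeholder labels.
Xtrain = {rand(12, 50); rand(12, 80); rand(12, 65)};
Ytrain = categorical({'a'; 'b'; 'a'});

% Sequence length is the number of columns of each c-by-s matrix.
sequenceLengths = cellfun(@(x) size(x, 2), Xtrain);

% Sort sequences and labels together, shortest first.
[~, idx] = sort(sequenceLengths);
Xtrain = Xtrain(idx);
Ytrain = Ytrain(idx);
```

To keep this order during training you would also pass 'Shuffle','never' to trainingOptions. The 'SequenceLength' option controls how padding is applied; with 'longest', each mini-batch is padded to its longest sequence.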
Note: the trainNetwork function is no longer recommended. You can use the trainnet function instead. To read about the input formats for sequence datasets in the trainnet function, refer to the following link: