Invalid training data for LSTM network.

tyler seudath on 2 Dec 2021
Answered: Avadhoot on 10 Apr 2024 at 8:58
Hi all,
I am creating an LSTM network and I get the error 'Invalid training data. Predictors and responses must have the same number of observations.' I converted the input training data to cell arrays and the output to a categorical array, yet the error persists.
Here is a sample of the code:
DataParts = zeros(size(Train1_inputX1,1), size(Train1_inputX1,2),1,2); %(4500,400,1,2)
DataParts(:,:,:,1) = real(cell2mat(Train1_inputX1));
DataParts(:,:,:,2) = imag(cell2mat(Train1_inputX1)) ;
XTrain=num2cell(reshape(DataParts, [400,1,2,4050])); %Train data
DataParts1 = zeros(size(testX1_input,1), size(testX1_input,2),1, 2);
DataParts1(:,:,:,1) = real(cell2mat(testX1_input));
DataParts1(:,:,:,2) = imag(cell2mat(testX1_input)) ;
Ttrain=num2cell(reshape(DataParts1,[400,1,2,500])); %Test data
DataParts2 = zeros(size(ValX1_input,1), size(ValX1_input,2),1, 2);
DataParts2(:,:,:,1) = real(cell2mat(ValX1_input));
DataParts2(:,:,:,2) = imag(cell2mat(ValX1_input));
Vtrain =num2cell(reshape(DataParts2,[400,1,2,450])); %450 is the number of segments %400 is the number of samples
Valoutfinal= categorical(ValX1_output); %450 values
testoutfinal = categorical(testX1_output); %500 values
Trainoutfinal= categorical(Train1_outputX1);%4050 values
%% NETWORK ARCHITECTURE
inputSize = [400 1 2];
numHiddenUnits = 800;
numClasses = 4;
layers = [ ...
    sequenceInputLayer(inputSize,'Name','input')
    flattenLayer('Name','flatten')
    bilstmLayer(numHiddenUnits,'OutputMode','last','Name','lstm')
    fullyConnectedLayer(numClasses,'Name','fc')
    softmaxLayer('Name','softmax')
    classificationLayer('Name','classification')];
% Specify training options.
maxEpochs = 100;
miniBatchSize = 27;
options = trainingOptions('sgdm', ...
    'ExecutionEnvironment','cpu', ...
    'GradientThreshold',1, ...
    'MaxEpochs',maxEpochs, ...
    'MiniBatchSize',miniBatchSize, ...
    'SequenceLength','longest', ...
    'Shuffle','never', ...
    'Verbose',0, ...
    'Plots','training-progress');
%% Train network
net = trainNetwork(Ttrain,Trainoutfinal,layers,options);
Any help is greatly appreciated.
Thanks a mil.

Answers (1)

Avadhoot on 10 Apr 2024 at 8:58
The "Invalid training data" error means that the number of observations in the predictors does not match the number of observations in the responses. After going through your code, I have found the following:
1) The reshaping of "DataParts" into "XTrain", "Ttrain" and "Vtrain" is incorrect. Calling num2cell on the reshaped 4-D array wraps every individual element in its own cell, so the result is a 400-by-1-by-2-by-N cell array of scalars. trainNetwork expects an N-by-1 cell array in which each cell holds one complete sequence. The same correction applies to your validation and test sets.
The code below shows one way to do this:
% Assuming Train1_inputX1 is a cell array where each cell contains one sequence
numSequences = numel(Train1_inputX1);
XTrain = cell(numSequences, 1); % trainNetwork expects an N-by-1 cell array
for i = 1:numSequences
    tempData = Train1_inputX1{i}; % Extract the sequence
    % Assuming each sequence is a 2D matrix where rows are time steps
    tempData = [real(tempData), imag(tempData)]; % Real and imaginary parts side by side as columns
    XTrain{i} = tempData.'; % Transpose to make it [Features, TimeSteps]
end
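2) Ensure that the number of sequences in your predictors matches the number of labels in your responses. Note that your call to trainNetwork passes "Ttrain", which is built from the test inputs (500 segments), together with "Trainoutfinal", which holds the 4050 training labels. Pass the training predictors "XTrain" with "Trainoutfinal" instead.
Once the data preparation gives the predictors and responses the same number of observations, the error should go away.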
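If cell2mat(Train1_inputX1) gives an N-by-400 complex matrix, as your zeros(...) preallocation suggests, the two points above could be sketched as follows. This is only a minimal sketch under that assumption: "complexData", "labels", "numObs" and "YTrain" are hypothetical stand-ins for your data, and random values are used so the snippet runs on its own.

```matlab
% Hypothetical stand-in for cell2mat(Train1_inputX1): N complex segments of 400 samples each
numObs = 6;
numSamples = 400;
rng(0);
complexData = complex(randn(numObs, numSamples), randn(numObs, numSamples));
labels = categorical(randi(4, numObs, 1)); % hypothetical stand-in for Train1_outputX1

% One cell per observation; each cell is 2-by-400 (features-by-time steps)
XTrain = cell(numObs, 1);
for k = 1:numObs
    segment = complexData(k, :);                % 1-by-400 complex row
    XTrain{k} = [real(segment); imag(segment)]; % row 1 = real part, row 2 = imaginary part
end
YTrain = labels(:); % N-by-1 categorical vector

% Fail early if predictors and responses disagree on the number of observations
assert(numel(XTrain) == numel(YTrain), ...
    'Got %d sequences but %d labels.', numel(XTrain), numel(YTrain));
```

With this layout each sequence has 2 features and 400 time steps, so the input layer would presumably become sequenceInputLayer(2) and the flattenLayer would no longer be needed. The call would then be trainNetwork(XTrain,YTrain,layers,options). Treat this as one possible arrangement, not the only valid one.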
I hope it helps.

Release

R2021a
