USING LSTM TO CLASSIFY DATA
6 views (last 30 days)
Show older comments
Ernest Modise - Kgamane
on 7 Jun 2024
Please see my code below
% Step 1: Load the data from the Excel file
data = readmatrix('LSTMdataIn.xlsx');
% Step 2: Create labels
labels = [ones(200, 1); 2*ones(200, 1); 3*ones(200, 1); 4*ones(200, 1); 5*ones(200, 1)];
% Step 3: Reshape the data
numTimeSteps = 100;
numFeatures = 1;
reshapedData = reshape(data', numFeatures, numTimeSteps, []);
% Step 4: Split the data into training and testing sets
cv = cvpartition(labels, 'HoldOut', 0.2);
trainIdx = training(cv);
testIdx = test(cv);
XTrain = reshapedData(:, :, trainIdx);
YTrain = labels(trainIdx);
XTest = reshapedData(:, :, testIdx);
YTest = labels(testIdx);
% Step 5: Create and train the LSTM network
numHiddenUnits = 100;
layers = [ ...
sequenceInputLayer(100)
lstmLayer(numHiddenUnits)
fullyConnectedLayer(5)
softmaxLayer
classificationLayer];
options = trainingOptions('adam', 'MaxEpochs', 10, 'MiniBatchSize', 32);
net = trainNetwork(XTrain, categorical(YTrain), layers, options);
% Step 6: Evaluate the trained network
YTestPred = classify(net, XTest);
accuracy = sum(YTestPred == categorical(YTest)) / numel(YTest);
I get the following error
The training sequences are of feature dimension 1 100 but the input layer expects sequences of feature dimension 100.
Accepted Answer
Ayush Aniket
on 10 Jun 2024
Hi ernest,
Based on your data, I am assuming that you want to perform a 'sequence-to-label' classification task i.e. given a sequence the network should return the label.
There are several errors in your code that need modification:
- The error you observe is due to the fact that the sequenceInputLayer expects as argument the number of features in the sequence dataset which is '1' for your data and not '100'. Refer to the following link: https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.sequenceinputlayer.html#mw_9b13d129-0114-47c8-9ea3-e2977dd4fed1
- The trainNetwork function expects training data in the format of 'Nx1' cell array, where 'N' represents the number of observations and each cell should have the data in the format of 'c-by-s' matrix, where 'c' is the number of features of the sequence and 's' is the sequence length. For your data, that would be '1x100'. Here's how you can restructure your data:
% Assuming reshapedData is of size [1, numTimeSteps, numSamples]
% Convert reshapedData into a cell array where each cell contains a single sequence
XTrainCell = num2cell(reshapedData(:, :, trainIdx), [1,2]);
XTestCell = num2cell(reshapedData(:, :, testIdx), [1,2]);
% Each cell now contains a 1xnumTimeSteps matrix, which is a single sequence for the LSTM
% Adjust the training and testing data preparation
XTrain = squeeze(XTrainCell);
XTest = squeeze(XTestCell
- For 'sequence-to-label' classification task, you have to provide an additional 'Name-Value' argument 'OutputMode' in the lstmLayer of your network as shown below. Refer to the following documentation link to read the default value for the argument: https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.lstmlayer.html#mw_9f7c5f93-4bf2-4ddb-b922-b1c122668b9a_sep_mw_4fd4123f-c49f-4a38-95d1-85f9e542d641
layers = [ ...
sequenceInputLayer(numFeatures)
lstmLayer(numHiddenUnits,OutputMode="last")
fullyConnectedLayer(5)
softmaxLayer
classificationLayer];
Note: Starting in MATLAB R2024a, the classificationLayer and the trainNetwork functions are not recommended to be used. Instead you can use trainnet function to train your LSTM network:
More Answers (0)
See Also
Categories
Find more on Classification Learner App in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!