How to train an autoencoder on dlarray data for feature extraction?

I have a high-dimensional time-series dataset with 625 features and around 50000 observations for each feature. I have multiple batches of this dataset arranged in dlarray format, which results in a 4D array. How do I train an autoencoder to reduce the dimensionality of this dataset from the original 625 features to a smaller number of variables?

Accepted Answer

Yash Sharma on 26 Jun 2024
To train an autoencoder for dimensionality reduction on your high-dimensional time-series dataset, you can follow these steps in MATLAB. The dlarray format is useful for handling multi-dimensional arrays, and MATLAB's Deep Learning Toolbox provides tools to work with such data.
Here’s a step-by-step guide:
Step 1: Prepare the Data
Make sure your data is in the correct format. This example assumes your observations can be arranged as a 4D numeric array of size [features, time, channels, observations]; trainNetwork works directly with numeric arrays, while dlarray is only needed for custom training loops with dlnetwork. If your data currently lives in a dlarray, you can convert and rearrange it as sketched below.
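A minimal sketch of that conversion, assuming the dlarray X holds its dimensions in the order [features, time, batch, channels] described in the question (check dims(X) on your own data and adjust the permutation accordingly):
% Recover a plain numeric array from the dlarray (dimension order follows the dlarray's labels)
Xnumeric = extractdata(X);
% Assumed current order: [features, time, batch, channels]
% Required order:        [features, time, channels, observations] = h-by-w-by-c-by-N
data = permute(Xnumeric, [1 2 4 3]);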
Step 2: Define the Autoencoder Architecture
Define the architecture of your autoencoder. The encoder part will compress the input data to a lower-dimensional representation, and the decoder part will reconstruct the input data from this lower-dimensional representation.
Step 3: Train the Autoencoder
Use the trainNetwork function to train the autoencoder with the specified architecture and training options. For an autoencoder, the inputs are the observations themselves and the regression targets are flattened copies of the same observations.
Here’s an example code snippet to illustrate these steps:
% Assuming your observations are arranged as a 4D numeric array
% data: [features, time, channels, observations] (h-by-w-by-c-by-N)
% Example data dimensions
numFeatures = 625;
numTimeSteps = 50000;
numBatches = 10; % Example number of observations
numChannels = 1; % Example number of channels
% Load your data (replace this with your actual data loading code)
data = randn(numFeatures, numTimeSteps, numChannels, numBatches, 'single');
% Note: trainNetwork expects numeric arrays; dlarray is only needed for
% custom training loops with dlnetwork (see the comment below)
% Define the autoencoder architecture
inputSize = [numFeatures, numTimeSteps, numChannels];
% Encoder
encoderLayers = [
    imageInputLayer(inputSize, 'Name', 'input', 'Normalization', 'none')
    convolution2dLayer([3, 3], 16, 'Padding', 'same', 'Name', 'conv1')
    reluLayer('Name', 'relu1')
    maxPooling2dLayer([2, 2], 'Stride', [2, 2], 'Name', 'maxpool1')
    convolution2dLayer([3, 3], 8, 'Padding', 'same', 'Name', 'conv2')
    reluLayer('Name', 'relu2')
    fullyConnectedLayer(50, 'Name', 'fc1') % Reduce to 50 features (example)
    ];
% Decoder: a single fully connected layer maps the 50-dimensional code back to
% a flattened copy of the input. (A transposed-convolution decoder would also
% need a reshaping step between the fully connected output and the first
% transposed convolution, so the simpler linear decoder is used here.)
decoderLayers = [
    fullyConnectedLayer(prod(inputSize), 'Name', 'fc2')
    regressionLayer('Name', 'output')
    ];
% Combine encoder and decoder
layers = [
    encoderLayers
    decoderLayers
    ];
% Specify training options
options = trainingOptions('adam', ...
    'MaxEpochs', 50, ...
    'InitialLearnRate', 1e-3, ...
    'MiniBatchSize', 2, ... % keep small: each observation is a full 625-by-50000 recording
    'Shuffle', 'every-epoch', ...
    'Plots', 'training-progress', ...
    'Verbose', false);
% Regression targets: one flattened observation per row (N-by-prod(inputSize))
targets = reshape(permute(data, [4 1 2 3]), numBatches, []);
% Train the autoencoder (reconstruct each observation from its 50-dimensional code)
net = trainNetwork(data, targets, layers, options);
% Extract the 50-dimensional features from the bottleneck layer
codes = activations(net, data, 'fc1', 'OutputAs', 'rows'); % numBatches-by-50 matrix
% Save the trained network
save('autoencoderNet.mat', 'net');
Notes:
  • Adjust the architecture to your specific needs; the example uses a convolutional encoder with a simple linear (fully connected) decoder.
  • Modify the number of layers, the filter sizes, and the number of units in the bottleneck fully connected layer to fit your dataset and the desired dimensionality.
  • With inputs as large as 625-by-50000, the fully connected layers become very large; in practice, train on shorter windows of the time series or downsample further before the bottleneck.
  • Ensure your data is normalized appropriately before feeding it into the network, for example by z-scoring each feature (see the sketch below).
This approach should help you train an autoencoder to reduce the dimensionality of your high-dimensional time-series dataset.
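As a small illustration of the normalization note above, here is a minimal sketch that z-scores each feature across time, channels, and observations before training (variable names follow the example code; adjust to your own pipeline):
% Per-feature mean and standard deviation across time, channels, and observations
mu = mean(data, [2 3 4]);
sigma = std(data, 0, [2 3 4]);
sigma(sigma == 0) = 1; % guard against constant features
dataNormalized = (data - mu) ./ sigma;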
  1 Comment
Shubham on 27 Jun 2024
Hi Yash,
Thanks for the detailed answer! Since I am dealing with sequential data, the imageInputLayer is not suitable. Do you think the following implementation is more suitable for time-series data?
function [val_loss,encoderNet,netBest] = designLSTMEncoder(net_params,batch_size,X_train,X_val,numEpochs,minEpochs,validationFrequency,isOptimizeArchitecture,isReturningBestValLossModel)
numFeatures = size(X_train,1);
% Encoder
encoderLayers = [
    sequenceInputLayer(numFeatures)
    layerNormalizationLayer
    lstmLayer(125)
    dropoutLayer(0.1)
    reluLayer
    lstmLayer(25)
    dropoutLayer(0.1)
    reluLayer
    fullyConnectedLayer(net_params.latent_space_dimension, 'Name', 'fc1') % Reduce to 5 features (example)
    ];
% Decoder
decoderLayers = [
    fullyConnectedLayer(net_params.latent_space_dimension)
    reluLayer
    lstmLayer(25)
    dropoutLayer(0.1)
    reluLayer
    lstmLayer(125)
    dropoutLayer(0.1)
    reluLayer
    fullyConnectedLayer(numFeatures)
    ];
% Combine encoder and decoder
layers_opt = [
    encoderLayers
    decoderLayers
    ];
net = dlnetwork(layers_opt);
useGPU = canUseGPU();
% Define training options
iteration = 0;
epoch = 0;
batchSize = batch_size;
% Adam parameters
averageGrad = [];
averageSqGrad = [];
% Initialize training progress monitor
monitor = trainingProgressMonitor(Info="Epoch", XLabel="Iteration");
monitor.Info = ["LearningRate","Epoch","Iteration","ExecutionEnvironment"];
monitor.Metrics = ["Loss","Val_loss","Avg_Val_loss"];
groupSubPlot(monitor, "Training loss", "Loss");
groupSubPlot(monitor, "Validation loss", "Val_loss")
groupSubPlot(monitor, "Average Validation loss", "Avg_Val_loss")
% Initialize variables for tracking consecutive unchanged validation losses
noImprovementCount = 0;
previous_val_loss = inf;
numValidations = 0;
% monitor.Stop = false();
% Network training
% Shuffle once
idx = randperm(size(X_train,3));
X_epoch = X_train(:,:,idx);
% Do not shuffle
if isOptimizeArchitecture
    X_epoch = X_train(:,:,idx);
end
while epoch < numEpochs && ~monitor.Stop && noImprovementCount < 20
    epoch = epoch + 1;
    % Shuffle once each epoch
    % idx = randperm(size(X_train,3));
    % X_epoch = X_train(:,:,idx);
    for i = 1:batchSize:size(X_train,3)
        % Prepare mini-batch data
        startIndex = i;
        endIndex = min(i + batchSize - 1, size(X_train,3));
        numIterationsPerEpoch = ceil(size(X_train,3)/batchSize);
        numIterations = numEpochs*numIterationsPerEpoch;
        % Assuming all sequences have the same length and passing the data as a tensor
        X_miniBatch = dlarray(X_epoch(:,:,startIndex:endIndex),'CTB');
        Xepoch_val = dlarray(X_val,'CTB');
        if useGPU
            X_miniBatch = gpuArray(X_miniBatch);
            Xepoch_val = gpuArray(Xepoch_val);
        end
        X = X_miniBatch;
        % Evaluate the model loss and gradients using dlfeval and the modelLoss function
        [loss,gradients] = dlfeval(@mseModelLoss_LSTM,net,X,X,0.0001);
        % Update the network parameters using the Adam optimizer
        iteration = iteration + 1;
        [net,averageGrad,averageSqGrad] = adamupdate(net,gradients,averageGrad,averageSqGrad,iteration); % adamupdate with the default learning rate works much better here and does not give a bouncing optimization path
        if iteration == 1 || mod(iteration,validationFrequency) == 0
            numValidations = numValidations + 1;
            % Compute validation loss
            [val_loss] = dlfeval(@mseModelLoss_LSTM,net,Xepoch_val,Xepoch_val,0.0001);
            % Check if the validation loss has not changed
            if epoch > minEpochs
                if abs(val_loss - previous_val_loss) < (0.05*val_loss)
                    noImprovementCount = noImprovementCount + 1;
                else
                    noImprovementCount = 0;
                end
                previous_val_loss = val_loss;
            end
            if isReturningBestValLossModel
                if val_loss < previous_val_loss
                    netBest = net;
                end
            else
                netBest = net;
            end
            recordMetrics(monitor, iteration, Val_loss=extractdata(val_loss));
        end
        % Update the training progress monitor
        recordMetrics(monitor, iteration, Loss=extractdata(loss));
        updateInfo(monitor, Epoch=string(epoch) + " of " + string(numEpochs), ...
            Iteration=string(iteration) + " of " + string(numIterations));
        monitor.Progress = 100 * epoch/numEpochs;
    end
end
% Evaluate the performance of the network on the validation set
X_val_opt_pred = predict(net, dlarray(X_val,'CTB'));
X_val_opt_true = dlarray(X_val,'CTB');
val_loss = mse(X_val_opt_pred,X_val_opt_true);
% Extract the encoder part of the network
encoderNet = layerGraph(netBest.Layers(1:numel(encoderLayers)));
end
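(mseModelLoss_LSTM above is a user-defined helper that is not shown in the comment. A minimal sketch of such a loss function, assuming the last argument is an L2 weight-decay factor, which is only a guess:)
function [loss,gradients] = mseModelLoss_LSTM(net,X,T,l2Factor)
    % Forward pass through the autoencoder
    Y = forward(net,X);
    % Reconstruction error (as computed by the mse function)
    loss = mse(Y,T);
    % Optional L2 penalty on the learnable parameters (assumed meaning of l2Factor)
    if l2Factor > 0
        l2Penalty = 0;
        for k = 1:height(net.Learnables)
            l2Penalty = l2Penalty + sum(net.Learnables.Value{k}.^2,'all');
        end
        loss = loss + l2Factor*l2Penalty;
    end
    % Gradients of the loss with respect to the learnable parameters
    gradients = dlgradient(loss,net.Learnables);
end
The encoder returned as a layerGraph can then be converted with dlnetwork(encoderNet) and used with predict to obtain the latent features for new sequences.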


