LSTM padding and masking

I am solving a sequence-to-sequence classification problem with an LSTM in MATLAB 2020b. The sequences have variable length, so padding within each minibatch is needed. However, I am not sure whether MATLAB automatically masks the padded time steps when calculating the cross-entropy loss and the training/validation accuracy. The accuracy reported in the training plot (around 70%) is much lower than the accuracy I calculate manually from checkpoints (around 90%). I suspect that although MATLAB 2020b supports sequence padding and validation data for LSTMs, it does not offer a masking option to reduce the influence of padding. Any insights?
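To illustrate the effect I suspect, here is a toy sketch in plain Python (not MATLAB code, and not a claim about how MATLAB computes its metrics internally). The `accuracy` helper, the `PAD` label, and the example sequences are all made up for illustration. It assumes the padded time steps are scored like real ones unless the true sequence lengths are supplied.

```python
def accuracy(pred, true, lengths=None):
    """Per-time-step accuracy over a padded minibatch.

    pred, true: lists of label sequences, already padded to equal length.
    lengths:    true length of each sequence. If None, every time step is
                counted, including the padded ones (no masking).
    """
    correct = 0
    total = 0
    for i, (p_seq, t_seq) in enumerate(zip(pred, true)):
        n = len(t_seq) if lengths is None else lengths[i]
        # Only the first n time steps of this sequence are scored.
        for p, t in zip(p_seq[:n], t_seq[:n]):
            correct += int(p == t)
            total += 1
    return correct / total


PAD = 0  # hypothetical padding label, never predicted by the model

# Two sequences padded to length 6; the second has only 2 real time steps.
true = [[1, 2, 1, 2, 1, 2], [1, 2, PAD, PAD, PAD, PAD]]
pred = [[1, 2, 1, 2, 1, 1], [1, 2, 1, 1, 2, 2]]
lengths = [6, 2]

unmasked = accuracy(pred, true)          # padded steps count as errors
masked = accuracy(pred, true, lengths)   # padded steps are ignored

print(f"unmasked: {unmasked:.3f}, masked: {masked:.3f}")
```

In this toy case the same predictions score noticeably lower without masking, which is the same direction as the gap I see between the training plot and my manual calculation.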

Answers (2)

Aditya Patil on 22 Dec 2020

2 votes

Currently, masking is not supported in MATLAB. I have forwarded the feature request to the relevant team.
As a workaround, you can sort the inputs by sequence length so that the amount of padding required is minimized. You may also set the minibatch size to 1, so that no padding is required.
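
The effect of both workarounds on the amount of padding can be sketched with a small plain-Python calculation (illustrative only, not MATLAB code). The `padding_needed` helper and the sequence lengths are hypothetical, and the sketch assumes each minibatch is padded to the length of its longest sequence.

```python
def padding_needed(lengths, batch_size):
    """Total number of padded time steps when consecutive sequences are
    grouped into minibatches and each minibatch is padded to its longest
    sequence."""
    total = 0
    for start in range(0, len(lengths), batch_size):
        batch = lengths[start:start + batch_size]
        longest = max(batch)
        total += sum(longest - n for n in batch)
    return total


# Hypothetical sequence lengths in their original (unsorted) order.
lengths = [3, 50, 4, 48, 5, 47]

unsorted_pad = padding_needed(lengths, batch_size=3)
sorted_pad = padding_needed(sorted(lengths), batch_size=3)
single_pad = padding_needed(lengths, batch_size=1)

print(unsorted_pad, sorted_pad, single_pad)
```

Sorting groups sequences of similar length into the same minibatch, so far fewer padded time steps are needed, and a minibatch size of 1 needs none. For the sorted order to survive training, shuffling would also need to be turned off (the `'Shuffle'` option of `trainingOptions`).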

1 Comment

Thank you! I was really curious about this as well, since it can be done in Python. I really hope they add this feature.


Haijun Ruan on 21 Jul 2021
I am wondering whether masking is supported in MATLAB now.

Asked: 10 Dec 2020

Answered: 21 Jul 2021
