Code Generation for a Sequence-to-Sequence LSTM Network

This example demonstrates how to generate CUDA® code for a long short-term memory (LSTM) network. The example generates a MEX application that makes predictions at each step of an input time series. Two methods are demonstrated: one using a standard LSTM network, and one leveraging the stateful behavior of the same LSTM network. This example uses accelerometer sensor data from a smartphone carried on the body and predicts the activity of the wearer. User movements are classified into one of five categories: dancing, running, sitting, standing, and walking. The example uses a pretrained LSTM network. For more information on training, see the Sequence-to-Sequence Classification Using Deep Learning (Deep Learning Toolbox) example from Deep Learning Toolbox™.

Third-Party Prerequisites


This example generates CUDA MEX and has the following third-party requirements.

  • CUDA-enabled NVIDIA® GPU and compatible driver.


For non-MEX builds such as static libraries, dynamic libraries, or executables, this example has the following additional requirements.

  • NVIDIA CUDA toolkit.
  • NVIDIA cuDNN library.

Verify GPU Environment

Use the coder.checkGpuInstall function to verify that the compilers and libraries necessary for running this example are set up correctly.

envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'cudnn';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);

The lstmnet_predict Entry-Point Function

A sequence-to-sequence LSTM network enables you to make different predictions for each individual time step of a data sequence. The lstmnet_predict.m entry-point function takes an input sequence and passes it to a trained LSTM network for prediction. Specifically, the function uses the LSTM network trained in the Sequence-to-Sequence Classification Using Deep Learning example. The function loads the network object from the lstmnet.mat file into a persistent variable and reuses the persistent object on subsequent prediction calls.

To display an interactive visualization of the network architecture and information about the network layers, use the analyzeNetwork (Deep Learning Toolbox) function.
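For instance, you can load the pretrained network outside of code generation and inspect it (a quick sketch; it assumes lstmnet.mat is on the MATLAB path, as the entry-point function does):

```matlab
% Load the trained network object from the MAT-file and open the
% interactive network analyzer to review layers, output sizes, and
% learnable parameters.
net = coder.loadDeepLearningNetwork('lstmnet.mat');
analyzeNetwork(net)
```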

function out = lstmnet_predict(in) %#codegen

% Copyright 2019-2021 The MathWorks, Inc. 

persistent mynet;

if isempty(mynet)
    mynet = coder.loadDeepLearningNetwork('lstmnet.mat');
end

% pass the input to the network for prediction
out = predict(mynet,in);

Generate CUDA MEX

To generate CUDA MEX for the lstmnet_predict.m entry-point function, create a GPU configuration object and specify the target to be MEX. Set the target language to C++. Create a deep learning configuration object that specifies the target library as cuDNN. Attach this deep learning configuration object to the GPU configuration object.

cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');

At compile time, GPU Coder™ must know the data types of all the inputs to the entry-point function. Specify the type and size of the input argument to the codegen command by using the coder.typeof function. For this example, the input is of double data type with a feature dimension value of three and a variable sequence length. Specifying the sequence length as variable-size enables you to perform prediction on an input sequence of any length.

matrixInput = coder.typeof(double(0),[3 Inf],[false true]);

Run the codegen command.

codegen -config cfg lstmnet_predict -args {matrixInput} -report
Code generation successful: View report

Run Generated MEX on Test Data

Load the HumanActivityValidate MAT-file. This MAT-file stores the variable XValidate, which contains sample time series of sensor readings on which you can test the generated code. Call lstmnet_predict_mex on the first observation.

load HumanActivityValidate
YPred1 = lstmnet_predict_mex(XValidate{1});

YPred1 is a 5-by-53888 numeric matrix containing the probabilities of the five classes for each of the 53888 time steps. For each time step, find the predicted class by calculating the index of the maximum probability.

[~, maxIndex] = max(YPred1, [], 1);

Associate the indices of maximum probability with the corresponding labels. Display the first ten labels. From the results, you can see that the network predicted the human to be sitting for the first ten time steps.

labels = categorical({'Dancing', 'Running', 'Sitting', 'Standing', 'Walking'});
predictedLabels1 = labels(maxIndex);
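As a quick check, you can measure the fraction of time steps the network predicted correctly (a sketch; it assumes the HumanActivityValidate MAT-file also contains the categorical ground-truth labels in a variable named YValidate):

```matlab
% Fraction of time steps predicted correctly for the first observation
% (YValidate is assumed to hold categorical ground-truth label sequences).
accuracy = mean(predictedLabels1 == YValidate{1})
```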

Compare Predictions with Test Data

Use a plot to compare the MEX output data with the test data.

figure
plot(predictedLabels1,'.-');
hold on
plot(YValidate{1});
hold off

xlabel("Time Step")
ylabel("Activity")
title("Predicted Activities")
legend(["Predicted" "Test Data"])

Call Generated MEX on an Observation with a Different Sequence Length

Call lstmnet_predict_mex on the second observation, which has a different sequence length. In this example, XValidate{2} has a sequence length of 64480, whereas XValidate{1} had a sequence length of 53888. The generated code handles prediction correctly because you specified the sequence length dimension to be variable-size.

YPred2 = lstmnet_predict_mex(XValidate{2});
[~, maxIndex] = max(YPred2, [], 1);
predictedLabels2 = labels(maxIndex);

Generate MEX That Takes in Multiple Observations

If you want to perform prediction on many observations at once, you can group the observations together in a cell array and pass the cell array for prediction. The cell array must be a column cell array, and each cell must contain one observation. Each observation must have the same feature dimension, but the sequence lengths may vary. In this example, XValidate contains five observations. To generate a MEX that can take XValidate as input, specify the input type to be a 5-by-1 cell array. Further, specify that each cell be of the same type as matrixInput, the type you specified for the single observation in the previous codegen command.

matrixInput = coder.typeof(double(0),[3 Inf],[false true]);
cellInput = coder.typeof({matrixInput}, [5 1]);

codegen -config cfg lstmnet_predict -args {cellInput} -report
Code generation successful: View report
YPred3 = lstmnet_predict_mex(XValidate)

The output is a 5-by-1 cell array of predictions for the five observations passed in.

YPred3 = 5×1 cell array
    {5×53888 single}
    {5×64480 single}
    {5×53696 single}
    {5×56416 single}
    {5×50688 single}
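To turn the cell array of class probabilities into labels, you can apply the same maximum-probability step to each observation (a sketch reusing the labels vector defined earlier):

```matlab
% For each observation, take the class with the highest probability at
% every time step and map the indices to category names.
predictedAllLabels = cell(numel(YPred3),1);
for k = 1:numel(YPred3)
    [~, idx] = max(YPred3{k}, [], 1);
    predictedAllLabels{k} = labels(idx);
end
```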

Generate MEX with Stateful LSTM

Instead of passing the entire time series to predict in one step, you can run prediction on an input by streaming in one time step at a time, making use of the predictAndUpdateState (Deep Learning Toolbox) function. This function takes in an input, produces an output prediction, and updates the internal state of the network so that future predictions take this initial input into account.

The entry-point function lstmnet_predict_and_update.m takes in a single time step of input and processes it using the predictAndUpdateState (Deep Learning Toolbox) function. predictAndUpdateState outputs a prediction for the input time step and updates the network so that subsequent inputs are treated as subsequent time steps of the same sample. After passing in all time steps one at a time, the resulting output is the same as if all time steps were passed in as a single input.

function out = lstmnet_predict_and_update(in) %#codegen

% Copyright 2019-2021 The MathWorks, Inc. 

persistent mynet;

if isempty(mynet)
    mynet = coder.loadDeepLearningNetwork('lstmnet.mat');
end

% pass the input to the network and update its state
[mynet, out] = predictAndUpdateState(mynet,in);

Run codegen on this new design file. Because the function takes a single time step in each call, specify matrixInput to have a fixed sequence dimension of 1 instead of a variable sequence length.

matrixInput = coder.typeof(double(0),[3 1]);
codegen -config cfg lstmnet_predict_and_update -args {matrixInput} -report
Code generation successful: View report

Run the generated MEX on the first time step of the first validation sample.

firstSample = XValidate{1};
firstTimestep = firstSample(:,1);
YPredStateful = lstmnet_predict_and_update_mex(firstTimestep);
[~, maxIndex] = max(YPredStateful, [], 1);
predictedLabelsStateful1 = labels(maxIndex)
predictedLabelsStateful1 = categorical
     Sitting
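To process a full sequence with the stateful MEX, you can stream the time steps through it in a loop (a sketch; it issues one MEX call per time step, so it runs more slowly than the single-shot prediction):

```matlab
% Clear the MEX function to reset the persistent network state before
% streaming a sequence from its first time step.
clear lstmnet_predict_and_update_mex

% Stream each time step of the first observation through the stateful
% network and collect the per-step class probabilities.
sequence = XValidate{1};
numSteps = size(sequence,2);
YPredStreaming = zeros(5,numSteps);
for t = 1:numSteps
    YPredStreaming(:,t) = lstmnet_predict_and_update_mex(sequence(:,t));
end
```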

Compare the output label with the ground truth.

YValidate{1}(1)
ans = categorical
     Sitting
