neuralODELayer

Neural ODE layer

Since R2023b

    Description

    A neural ODE layer outputs the solution of an ODE.

    Creation

    Description

    layer = neuralODELayer(net,tspan) creates a neural ODE layer and sets the Network and TimeInterval properties.

    layer = neuralODELayer(net,tspan,Name=Value) specifies additional properties using one or more name-value arguments.

    Properties

    Network

    Neural network characterizing the neural ODE function, specified as a dlnetwork object.

    If Network has one input, then predict(net,Y) defines the ODE system, where net is the network. If Network has two inputs, then predict(net,T,Y) defines the ODE system, where T is a time step repeated over the batch dimension.

    The sizes and formats of the network inputs and outputs must match.

    When GradientMode is "adjoint", the network State property must be empty. To use a network with a nonempty State property, set GradientMode to "direct".
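
    As a minimal sketch of the two-input form (layer names and the input ordering here are assumptions, not taken from this page), the time input can be given its own branch and concatenated with the state before the main network body:

    ```matlab
    % Two-input ODE network: predict(net,T,Y) defines dy/dt = f(t,y).
    % The time t enters as a 1-feature input and is concatenated with
    % the 4-feature state y along the channel dimension.
    layers = [
        featureInputLayer(1,Name="t")        % time input
        concatenationLayer(1,2,Name="cat")   % concatenate [t; y]
        fullyConnectedLayer(4)];             % output size must match the state

    lgraph = layerGraph(layers);
    lgraph = addLayers(lgraph,featureInputLayer(4,Name="y"));
    lgraph = connectLayers(lgraph,"y","cat/in2");

    netODE = dlnetwork(lgraph);   % inputs "t" and "y", used as predict(net,T,Y)
    ```

    The key constraint is the last comment in the block: the network output must have the same size and format as the state input Y.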

    TimeInterval

    Interval of integration, specified as a numeric vector with two or more elements. The elements in TimeInterval must be all increasing or all decreasing.

    The solver imposes the initial conditions given by Y0 at the initial time TimeInterval(1), then integrates the ODE function from TimeInterval(1) to TimeInterval(end).

    • If TimeInterval has two elements, [t0 tf], then the solver returns the solution evaluated at point tf.

    • If TimeInterval has more than two elements, [t0 t1 ... tf], then the solver returns the solution evaluated at the given points [t1 ... tf]. The solver does not step precisely to each point specified in TimeInterval. Instead, the solver uses its own internal steps to compute the solution, then evaluates the solution at the points specified in TimeInterval. The solutions produced at the specified points are of the same order of accuracy as the solutions computed at each internal step.

      Specifying several intermediate points has little effect on the efficiency of computation, but for large systems it can negatively affect memory management.
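
    For instance (a minimal sketch, assuming a small feature network), this creates a layer that returns the solution at the ten points after the initial time:

    ```matlab
    % Simple ODE network acting on 4 features.
    netODE = dlnetwork([
        featureInputLayer(4)
        fullyConnectedLayer(4)
        tanhLayer]);

    % t0 = 0 sets the initial condition; the layer evaluates the
    % solution at t = 0.1, 0.2, ..., 1.
    tspan = 0:0.1:1;
    layer = neuralODELayer(netODE,tspan);
    ```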

    GradientMode

    Method to compute gradients with respect to the initial conditions and parameters when using the dlgradient function, specified as one of these values:

    • "direct" — Compute gradients by backpropagating through the operations undertaken by the numerical solver. This option best suits large mini-batch sizes or when TimeInterval contains many values.

    • "adjoint" — Compute gradients by solving the associated adjoint ODE system. This option best suits small mini-batch sizes or when TimeInterval contains a small number of values.

    When GradientMode is "adjoint", the network State property must be empty. To use a network with a nonempty State property, set GradientMode to "direct".

    The dlaccelerate function does not support accelerating the dlode45 function when the GradientMode option is "direct". To accelerate code that calls the dlode45 function, set the GradientMode option to "adjoint", or accelerate only the parts of your code that do not call dlode45.

    The dlaccelerate function does not support accelerating networks that contain NeuralODELayer objects when the GradientMode option is "direct". To accelerate networks that contain NeuralODELayer objects, set the GradientMode option to "adjoint".

    Warning

    When GradientMode is "adjoint", all layers in the network must support acceleration. Otherwise, the software can return unexpected results.

    When GradientMode is "adjoint", the software traces the ODE function input to determine the computation graph used for automatic differentiation. This tracing process can take some time and can end up recomputing the same trace. By optimizing, caching, and reusing the traces, the software can speed up the gradient computation.

    For more information on deep learning function acceleration, see Deep Learning Function Acceleration for Custom Training Loops.

    The NeuralODELayer object stores this property as a character vector.
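
    As a sketch, selecting the adjoint method at construction time (the network must be stateless, as noted above):

    ```matlab
    % Stateless ODE network (no layers with a nonempty State, such as
    % batch normalization), as required when GradientMode is "adjoint".
    netODE = dlnetwork([
        featureInputLayer(8)
        fullyConnectedLayer(8)
        tanhLayer]);

    layer = neuralODELayer(netODE,[0 1],GradientMode="adjoint");
    ```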

    RelativeTolerance

    Relative error tolerance, specified as a positive scalar. The relative tolerance applies to all components of the solution.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

    AbsoluteTolerance

    Absolute error tolerance, specified as a positive scalar. The absolute tolerance applies to all components of the solution.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
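
    For example, tightening both tolerances at construction using name-value arguments (the tolerance values here are illustrative):

    ```matlab
    netODE = dlnetwork([featureInputLayer(4) fullyConnectedLayer(4)]);

    % Tighter tolerances trade solver speed for solution accuracy.
    layer = neuralODELayer(netODE,[0 1], ...
        RelativeTolerance=1e-5,AbsoluteTolerance=1e-8);
    ```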

    Examples

    Create a neural ODE layer. Specify an ODE network containing a convolution layer followed by a tanh layer. Specify a time interval of [0, 1].

    inputSize = [14 14 8];
    
    layersODE = [
        imageInputLayer(inputSize)
        convolution2dLayer(3,8,Padding="same")
        tanhLayer];
    
    netODE = dlnetwork(layersODE);
    
    tspan = [0 1];
    layer = neuralODELayer(netODE,tspan)
    layer = 
      NeuralODELayer with properties:
    
                     Name: ''
             TimeInterval: [0 1]
             GradientMode: 'direct'
        RelativeTolerance: 1.0000e-03
        AbsoluteTolerance: 1.0000e-06
    
       Learnable Parameters
                  Network: [1x1 dlnetwork]
    
       State Parameters
        No properties.
    
    Use properties method to see a list of all properties.
    
    

    Create a neural network containing a neural ODE layer.

    layers = [
        imageInputLayer([28 28 1])
        convolution2dLayer([3 3],8,Padding="same",Stride=2)
        reluLayer
        neuralODELayer(netODE,tspan)
        fullyConnectedLayer(10)
        softmaxLayer];
    
    net = dlnetwork(layers)
    net = 
      dlnetwork with properties:
    
             Layers: [6x1 nnet.cnn.layer.Layer]
        Connections: [5x2 table]
         Learnables: [6x3 table]
              State: [0x3 table]
         InputNames: {'imageinput'}
        OutputNames: {'softmax'}
        Initialized: 1
    
      View summary with summary.
    
    

    Tips

    • To apply the neural ODE operation in deep learning models defined as functions or in custom layer functions, use dlode45.
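
    A minimal sketch of the dlode45 form (sizes and the parameter shape are assumptions): define the ODE as a function of the time t, the state y, and learnable parameters theta, then solve from the initial condition:

    ```matlab
    % dy/dt = tanh(A*y), with A the learnable parameter.
    theta = dlarray(0.01*randn(4,4,"single"));
    odefun = @(t,y,theta) tanh(theta*y);

    Y0 = dlarray(rand(4,8,"single"));   % 4 features, batch of 8
    % Solve from t = 0 to t = 1; Y is the solution at the final time.
    Y = dlode45(odefun,[0 1],Y0,theta,DataFormat="CB");
    ```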

    Version History

    Introduced in R2023b