# gru

## Syntax

## Description

The gated recurrent unit (GRU) operation allows a network to learn dependencies between time steps in time series and sequence data.

`Y = gru(X,H0,weights,recurrentWeights,bias)` applies a gated recurrent unit (GRU) calculation to input `X` using the initial hidden state `H0`, and parameters `weights`, `recurrentWeights`, and `bias`. The input `X` must be a formatted `dlarray`. The output `Y` is a formatted `dlarray` with the same dimension format as `X`, except for any `"S"` dimensions.
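As a minimal sketch of this syntax, the following call applies the operation to a random formatted input. The sizes and variable names such as `numFeatures` and `numHiddenUnits` are illustrative assumptions, as is the layout of the parameters, which assumes that the three GRU components are stacked along the first dimension (hence the factor of 3).

```matlab
% Illustrative sizes (assumptions, not values from the text above)
numFeatures = 10;
numObservations = 32;
sequenceLength = 64;
numHiddenUnits = 3;

% Random input labeled as channel (C), batch (B), time (T)
X = dlarray(randn(numFeatures,numObservations,sequenceLength),"CBT");

% Initial hidden state and parameters; the factor of 3 assumes the
% three GRU components are stacked along the first dimension
H0 = zeros(numHiddenUnits,1);
weights = dlarray(randn(3*numHiddenUnits,numFeatures));
recurrentWeights = dlarray(randn(3*numHiddenUnits,numHiddenUnits));
bias = dlarray(randn(3*numHiddenUnits,1));

% Apply the GRU operation and inspect the output
Y = gru(X,H0,weights,recurrentWeights,bias);
size(Y)
dims(Y)
```

Under these assumptions, `Y` should keep the `"CBT"` format of `X`, with the channel dimension sized by the number of hidden units rather than the number of input features.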

The `gru` function updates the hidden state using the hyperbolic tangent function (tanh) as the state activation function. The `gru` function uses the sigmoid function given by $$\sigma (x)={(1+{e}^{-x})}^{-1}$$ as the gate activation function.
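To show where these two activation functions enter, the update for one time step can be sketched as follows. This follows the formulation in [1] with bias terms added. The symbols $r_t$ (reset gate), $z_t$ (update gate), $\tilde{h}_t$ (candidate state), and the per-component blocks $W$, $R$, and $b$ of `weights`, `recurrentWeights`, and `bias` are notation assumed here for illustration, and the function may apply the reset gate at a different point in the candidate state calculation.

$$
\begin{aligned}
r_t &= \sigma\left(W_r x_t + R_r h_{t-1} + b_r\right) \\
z_t &= \sigma\left(W_z x_t + R_z h_{t-1} + b_z\right) \\
\tilde{h}_t &= \tanh\left(W_h x_t + R_h\left(r_t \odot h_{t-1}\right) + b_h\right) \\
h_t &= z_t \odot h_{t-1} + \left(1 - z_t\right) \odot \tilde{h}_t
\end{aligned}
$$

Here $\odot$ denotes element-wise multiplication, $x_t$ is the input at time step $t$, and $h_{t-1}$ is the previous hidden state, which equals `H0` at the first time step.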

`[Y,hiddenState] = gru(X,H0,weights,recurrentWeights,bias)` also returns the hidden state after the GRU operation.
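One use of the returned hidden state is to process a long sequence in pieces. The sketch below splits a sequence in half and passes the state from the first call as the initial state of the second. The sizes, variable names, and split point are illustrative assumptions.

```matlab
% Illustrative sizes (assumptions for this sketch)
numFeatures = 10;
numObservations = 32;
sequenceLength = 64;
numHiddenUnits = 3;

X = dlarray(randn(numFeatures,numObservations,sequenceLength),"CBT");
H0 = zeros(numHiddenUnits,1);
weights = dlarray(randn(3*numHiddenUnits,numFeatures));
recurrentWeights = dlarray(randn(3*numHiddenUnits,numHiddenUnits));
bias = dlarray(randn(3*numHiddenUnits,1));

% Process the first half of the sequence and keep the final hidden state
[Y1,hiddenState] = gru(X(:,:,1:32),H0,weights,recurrentWeights,bias);

% Continue from that state on the second half of the sequence
[Y2,hiddenState] = gru(X(:,:,33:end),hiddenState,weights,recurrentWeights,bias);
```

Carrying the state forward in this way should give the same result as a single call on the full sequence, up to floating-point rounding.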

`___ = gru(X,H0,weights,recurrentWeights,bias,DataFormat=FMT)` also specifies the dimension format `FMT` when `X` is not a formatted `dlarray`. The output `Y` is an unformatted `dlarray` with the same dimension order as `X`, except for any `"S"` dimensions.
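For an unformatted input, the call might look like the following sketch, where the dimension labels are supplied through `DataFormat` instead of being stored in `X`. The sizes and variable names are illustrative assumptions.

```matlab
% Illustrative sizes (assumptions for this sketch)
numFeatures = 10;
numObservations = 32;
sequenceLength = 64;
numHiddenUnits = 3;

% Unformatted input: no dimension labels are attached to X
X = dlarray(randn(numFeatures,numObservations,sequenceLength));

H0 = zeros(numHiddenUnits,1);
weights = dlarray(randn(3*numHiddenUnits,numFeatures));
recurrentWeights = dlarray(randn(3*numHiddenUnits,numHiddenUnits));
bias = dlarray(randn(3*numHiddenUnits,1));

% Describe the dimensions of X as channel, batch, time
Y = gru(X,H0,weights,recurrentWeights,bias,DataFormat="CBT");
size(Y)
```

Because `X` is unformatted, `Y` is also unformatted, so any later operation that depends on dimension labels needs the format supplied again.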

`___ = gru(X,H0,weights,recurrentWeights,bias,Name=Value)` specifies additional options using one or more name-value arguments.

## Examples

## Input Arguments

## Output Arguments

## More About

## References

[1] Cho, Kyunghyun, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." *arXiv preprint arXiv:1406.1078* (2014).

## Extended Capabilities

## Version History

**Introduced in R2020a**

## See Also

`dlarray` | `fullyconnect` | `softmax` | `dlgradient` | `dlfeval` | `lstm` | `attention`