cycleGANGenerator

Create CycleGAN generator network for image-to-image translation

Since R2021a

Description

net = cycleGANGenerator(inputSize) creates a CycleGAN generator network for input of size inputSize. For more information about the network architecture, see CycleGAN Generator Network.

This function requires Deep Learning Toolbox™.

net = cycleGANGenerator(inputSize,Name=Value) modifies aspects of the CycleGAN network using name-value arguments.

Examples

Specify the network input size for RGB images of size 256-by-256.

inputSize = [256 256 3];

Create a CycleGAN generator that generates RGB images of the input size.

net = cycleGANGenerator(inputSize)
net = 
  dlnetwork with properties:

         Layers: [72x1 nnet.cnn.layer.Layer]
    Connections: [80x2 table]
     Learnables: [94x3 table]
          State: [0x3 table]
     InputNames: {'inputLayer'}
    OutputNames: {'fActivation'}
    Initialized: 1

  View summary with summary.

Display the network.

analyzeNetwork(net)

Specify the network input size for RGB images of size 128-by-128 pixels.

inputSize = [128 128 3];

Create a CycleGAN generator with six residual blocks. Add the prefix "cycleGAN6_" to all layer names.

net = cycleGANGenerator(inputSize,"NumResidualBlocks",6, ...
    "NamePrefix","cycleGAN6_")
net = 
  dlnetwork with properties:

         Layers: [54x1 nnet.cnn.layer.Layer]
    Connections: [59x2 table]
     Learnables: [70x3 table]
          State: [0x3 table]
     InputNames: {'cycleGAN6_inputLayer'}
    OutputNames: {'cycleGAN6_fActivation'}
    Initialized: 1

  View summary with summary.

Display the network.

analyzeNetwork(net)

Input Arguments

inputSize

Network input size, specified as a 3-element vector of positive integers. inputSize has the form [H W C], where H is the height, W is the width, and C is the number of channels.

Example: [28 28 3] specifies an input size of 28-by-28 pixels for a 3-channel image.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: net = cycleGANGenerator(inputSize,NumFiltersInFirstBlock=32) creates a network with 32 filters in the first convolution layer.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: net = cycleGANGenerator(inputSize,"NumFiltersInFirstBlock",32) creates a network with 32 filters in the first convolution layer.

NumDownsamplingBlocks

Number of downsampling blocks in the network encoder module, specified as a positive integer. In total, the network downsamples the input by a factor of 2^NumDownsamplingBlocks. The decoder module consists of the same number of upsampling blocks.
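
As a worked example of the downsampling factor, this minimal sketch computes the spatial size of the activations at the end of the encoder module. The value of 2 for the number of downsampling blocks is chosen for illustration, and the calculation assumes that the input height and width are divisible by the total downsampling factor.

```matlab
% Worked example: spatial size at the output of the encoder module.
inputSize = [256 256 3];        % [H W C]
numDownsamplingBlocks = 2;      % illustrative value, not read from a network

% Total downsampling factor is 2^NumDownsamplingBlocks.
downsampleFactor = 2^numDownsamplingBlocks;

% Spatial size after the encoder, assuming H and W divide evenly.
encodedSize = inputSize(1:2)/downsampleFactor;

fprintf("Downsampling factor: %d\n",downsampleFactor);
fprintf("Encoded spatial size: %d-by-%d\n",encodedSize(1),encodedSize(2));
```

With these values, a 256-by-256 input is reduced by a factor of 4 to 64-by-64 before the decoder upsamples it again.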

NumFiltersInFirstBlock

Number of filters in the first convolution layer, specified as a positive even integer.

NumOutputChannels

Number of output channels, specified as "auto" or a positive integer. When you specify "auto", the number of output channels is the same as the number of input channels.

FilterSizeInFirstAndLastBlocks

Filter size in the first and last convolution layers, specified as a positive odd integer or 2-element vector of positive odd integers of the form [height width]. When you specify the filter size as a scalar, the filter has identical height and width.

FilterSizeInIntermediateBlocks

Filter size in intermediate convolution layers, specified as a positive odd integer or 2-element vector of positive odd integers of the form [height width]. The intermediate convolution layers are all convolution layers except the first and the last. When you specify the filter size as a scalar, the filter has identical height and width. Typical values are between 3 and 7.

NumResidualBlocks

Number of residual blocks, specified as a positive integer. Typically, this value is set to 6 for images of size 128-by-128 and 9 for images of size 256-by-256 or larger.

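ConvolutionPaddingValue

Style of padding used in the network, specified as one of these values. Each example pads a 3-by-3 matrix by two elements on every side.

PaddingValue: Numeric scalar
Description: Pad with the specified numeric value
Example (pad value of 2):

    3 1 4        2 2 2 2 2 2 2
    1 5 9   ->   2 2 2 2 2 2 2
    2 6 5        2 2 3 1 4 2 2
                 2 2 1 5 9 2 2
                 2 2 2 6 5 2 2
                 2 2 2 2 2 2 2
                 2 2 2 2 2 2 2

PaddingValue: "symmetric-include-edge"
Description: Pad using mirrored values of the input, including the edge values
Example:

    3 1 4        5 1 1 5 9 9 5
    1 5 9   ->   1 3 3 1 4 4 1
    2 6 5        1 3 3 1 4 4 1
                 5 1 1 5 9 9 5
                 6 2 2 6 5 5 6
                 6 2 2 6 5 5 6
                 5 1 1 5 9 9 5

PaddingValue: "symmetric-exclude-edge"
Description: Pad using mirrored values of the input, excluding the edge values
Example:

    3 1 4        5 6 2 6 5 6 2
    1 5 9   ->   9 5 1 5 9 5 1
    2 6 5        4 1 3 1 4 1 3
                 9 5 1 5 9 5 1
                 5 6 2 6 5 6 2
                 9 5 1 5 9 5 1
                 4 1 3 1 4 1 3

PaddingValue: "replicate"
Description: Pad using repeated border elements of the input
Example:

    3 1 4        3 3 3 1 4 4 4
    1 5 9   ->   3 3 3 1 4 4 4
    2 6 5        3 3 3 1 4 4 4
                 1 1 1 5 9 9 9
                 2 2 2 6 5 5 5
                 2 2 2 6 5 5 5
                 2 2 2 6 5 5 5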
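
The padding styles in the table can be reproduced with ordinary index vectors, which makes the difference between including and excluding the edge values easy to see. This is an illustrative sketch in base MATLAB, not the code that the function uses internally. The variable names are hypothetical.

```matlab
% Reproduce the four padding styles for a 3-by-3 matrix padded by 2.
A = [3 1 4; 1 5 9; 2 6 5];
p = 2;                      % padding width on each side
n = size(A,1);              % A is square in this example

% Numeric scalar: fill the border with a constant value.
padValue = 2;
constPad = repmat(padValue,n+2*p,n+2*p);
constPad(p+1:p+n,p+1:p+n) = A;

% Mirror about the border, repeating the edge element.
idxInclude = [p:-1:1, 1:n, n:-1:n-p+1];      % [2 1 1 2 3 3 2]
includeEdge = A(idxInclude,idxInclude);

% Mirror about the border, skipping the edge element.
idxExclude = [p+1:-1:2, 1:n, n-1:-1:n-p];    % [3 2 1 2 3 2 1]
excludeEdge = A(idxExclude,idxExclude);

% Repeat the border element.
idxReplicate = min(max(1-p:n+p,1),n);        % [1 1 1 2 3 3 3]
replicated = A(idxReplicate,idxReplicate);

disp(excludeEdge)
```

The only difference between the two symmetric styles is whether the mirrored index sequence starts at the edge element or at its neighbor.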

UpsampleMethod

Method used to upsample activations, specified as one of these values:

Data Types: char | string

ConvolutionWeightsInitializer

Weight initialization used in convolution layers, specified as "glorot", "he", "narrow-normal", or a function handle. For more information, see Specify Custom Weight Initialization Function (Deep Learning Toolbox).

ActivationLayer

Activation function to use in the network, specified as one of these values. For more information and a list of available layers, see Activation Layers (Deep Learning Toolbox).

  • "relu" — Use a reluLayer (Deep Learning Toolbox)

  • "leakyRelu" — Use a leakyReluLayer (Deep Learning Toolbox) with a scale factor of 0.2

  • "elu" — Use an eluLayer (Deep Learning Toolbox)

  • A layer object

FinalActivationLayer

Activation function after the final convolution layer, specified as one of these values. For more information and a list of available layers, see Activation Layers (Deep Learning Toolbox).

  • "tanh" — Use a tanhLayer (Deep Learning Toolbox)

  • "sigmoid" — Use a sigmoidLayer (Deep Learning Toolbox)

  • "softmax" — Use a softmaxLayer (Deep Learning Toolbox)

  • "none" — Do not use a final activation layer

  • A layer object

NormalizationLayer

Normalization operation to use after each convolution, specified as one of these values. For more information and a list of available layers, see Normalization Layers (Deep Learning Toolbox).

Dropout

Probability of dropout, specified as a number in the range [0, 1]. If you specify a value of 0, then the network does not include dropout layers. If you specify a value greater than 0, then the network includes a dropoutLayer (Deep Learning Toolbox) in each residual block.
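
To see the effect of a nonzero dropout probability, you can count the dropout layers in the resulting network. This is a minimal sketch that assumes Deep Learning Toolbox is installed and that dropout layers in the returned network are of class nnet.cnn.layer.DropoutLayer. Based on the description above, the count is expected to match the number of residual blocks.

```matlab
% Create a generator with six residual blocks and 50% dropout.
inputSize = [128 128 3];
numBlocks = 6;
net = cycleGANGenerator(inputSize,NumResidualBlocks=numBlocks,Dropout=0.5);

% Flag each layer that is a dropout layer, then count them.
isDropout = arrayfun(@(l) isa(l,"nnet.cnn.layer.DropoutLayer"),net.Layers);
numDropout = nnz(isDropout);

fprintf("Residual blocks: %d, dropout layers: %d\n",numBlocks,numDropout);
```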

NamePrefix

Prefix to all layer names in the network, specified as a string or character vector.

Data Types: char | string

Output Arguments

net

CycleGAN generator network, returned as a dlnetwork (Deep Learning Toolbox) object.

More About

CycleGAN Generator Network

A CycleGAN generator network consists of an encoder module followed by a decoder module. The default network follows the architecture proposed by Zhu et al. [1].

The encoder module downsamples the input by a factor of 2^NumDownsamplingBlocks. The encoder module consists of an initial block of layers, NumDownsamplingBlocks downsampling blocks, and NumResidualBlocks residual blocks. The decoder module upsamples the input by a factor of 2^NumDownsamplingBlocks. The decoder module consists of NumDownsamplingBlocks upsampling blocks and a final block.
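
Because the decoder upsamples by the same factor that the encoder downsamples, the output of a default generator is expected to have the same size as its input. This minimal sketch checks that by passing a random array through the network. It assumes Deep Learning Toolbox is installed and relies on the default final tanh layer to bound the output to the range [-1, 1]. The random input stands in for a real image.

```matlab
% Create a default generator for 128-by-128 RGB input.
inputSize = [128 128 3];
net = cycleGANGenerator(inputSize);

% Random stand-in for an image, scaled to [-1, 1] and labeled as
% spatial-spatial-channel-batch data.
X = dlarray(2*rand(inputSize,"single") - 1,"SSCB");

% Run a forward pass in inference mode.
Y = predict(net,X);

% Compare the output size with the input size.
outSize = [size(Y,1) size(Y,2) size(Y,3)];
Ydata = extractdata(Y);
fprintf("Output size: %d-by-%d-by-%d\n",outSize(1),outSize(2),outSize(3));
```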

The table describes the blocks of layers that comprise the encoder and decoder modules.

Block type: Initial block

Layers:

  • An imageInputLayer (Deep Learning Toolbox).

  • A convolution2dLayer (Deep Learning Toolbox) with a stride of [1 1] and a filter size of FilterSizeInFirstAndLastBlocks.

  • An optional normalization layer, specified by the NormalizationLayer name-value argument.

  • An activation layer specified by the ActivationLayer name-value argument.

Diagram of default block: Image input layer, 2-D convolution layer, instance normalization layer, ReLU layer

Block type: Downsampling block

Layers:

  • A convolution2dLayer (Deep Learning Toolbox) with a stride of [2 2] to perform downsampling. The convolution layer has a filter size of FilterSizeInIntermediateBlocks.

  • An optional normalization layer, specified by the NormalizationLayer name-value argument.

  • An activation layer specified by the ActivationLayer name-value argument.

Diagram of default block: 2-D convolution layer, instance normalization layer, ReLU layer

Block type: Residual block

Layers:

  • A convolution2dLayer (Deep Learning Toolbox) with a stride of [1 1] and a filter size of FilterSizeInIntermediateBlocks.

  • An optional normalization layer, specified by the NormalizationLayer name-value argument.

  • An activation layer specified by the ActivationLayer name-value argument.

  • An optional dropoutLayer (Deep Learning Toolbox). By default, residual blocks omit a dropout layer. Include a dropout layer by specifying the Dropout name-value argument as a value in the range (0, 1].

  • A second convolution2dLayer (Deep Learning Toolbox).

  • An optional second normalization layer.

  • An additionLayer (Deep Learning Toolbox) that provides a skip connection between every block.

Diagram of default block: 2-D convolution layer, instance normalization layer, ReLU layer, 2-D convolution layer, instance normalization layer, addition layer

Block type: Upsampling block

Layers:

  • An upsampling layer that upsamples by a factor of 2 according to the UpsampleMethod name-value argument. The convolution layer has a filter size of FilterSizeInIntermediateBlocks.

  • An optional normalization layer, specified by the NormalizationLayer name-value argument.

  • An activation layer specified by the ActivationLayer name-value argument.

Diagram of default block: Transposed 2-D convolution layer, instance normalization layer, ReLU layer

Block type: Final block

Layers:

  • A convolution2dLayer (Deep Learning Toolbox) with a stride of [1 1] and a filter size of FilterSizeInFirstAndLastBlocks.

  • An optional activation layer specified by the FinalActivationLayer name-value argument.

Diagram of default block: 2-D convolution layer, tanh layer
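
The per-block layer lists above can be turned into a quick layer-count check against the networks in the Examples section. This is a worked sketch under stated assumptions: two downsampling blocks (an assumed value that is consistent with the reported layer counts), a normalization layer in every block, no dropout, and a single transposed convolution layer as the upsampling layer. The helper countLayers is hypothetical and exists only for this illustration.

```matlab
% Layers per block for the default block diagrams.
layersInitial     = 4;   % input, convolution, normalization, activation
layersPerDown     = 3;   % convolution, normalization, activation
layersPerResidual = 6;   % conv, norm, activation, conv, norm, addition
layersPerUp       = 3;   % transposed conv, normalization, activation
layersFinal       = 2;   % convolution, tanh

numDownsamplingBlocks = 2;   % assumed value for this check

% Total layer count as a function of the number of residual blocks.
countLayers = @(numResidualBlocks) layersInitial + ...
    numDownsamplingBlocks*(layersPerDown + layersPerUp) + ...
    numResidualBlocks*layersPerResidual + layersFinal;

nineBlocks = countLayers(9);
sixBlocks = countLayers(6);
fprintf("9 residual blocks: %d layers\n",nineBlocks);
fprintf("6 residual blocks: %d layers\n",sixBlocks);
```

The results, 72 and 54 layers, match the Layers property shown for the two example networks. Each additional residual block adds six layers.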

References

[1] Zhu, Jun-Yan, Taesung Park, Phillip Isola, and Alexei A. Efros. "Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks." In 2017 IEEE International Conference on Computer Vision (ICCV), 2242–2251. Venice: IEEE, 2017. https://ieeexplore.ieee.org/document/8237506.

[2] Zhu, Jun-Yan, Taesung Park, and Tongzhou Wang. "CycleGAN and pix2pix in PyTorch." https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix.

Version History

Introduced in R2021a