# crossentropy

Neural network performance

## Syntax

``perf = crossentropy(net,targets,outputs,perfWeights)``
``perf = crossentropy(___,Name,Value)``

## Description

`perf = crossentropy(net,targets,outputs,perfWeights)` calculates a network performance given targets and outputs, with optional performance weights and other parameters. The function returns a result that heavily penalizes outputs that are extremely inaccurate (`y` near `1-t`), with very little penalty for fairly correct classifications (`y` near `t`). Minimizing cross-entropy leads to good classifiers.

The cross-entropy for each pair of output-target elements is calculated as: `ce = -t .* log(y)`.

The aggregate cross-entropy performance is the mean of the individual values: `perf = sum(ce(:))/numel(ce)`.

Special case (N = 1): If an output consists of only one element, then the outputs and targets are interpreted as binary encoding. That is, there are two classes with targets of 0 and 1, whereas in 1-of-N encoding, there are two or more classes. The binary cross-entropy expression is: `ce = -t .* log(y) - (1-t) .* log(1-y)`.
`perf = crossentropy(___,Name,Value)` supports customization according to the specified name-value pair arguments.
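The 1-of-N and binary formulas from the description can be checked by hand with plain array arithmetic. The following is a minimal sketch that uses made-up target and output values and does not call the toolbox function itself; it ignores performance weights, regularization, and normalization.

```matlab
% 1-of-N case: N = 3 classes, Q = 2 samples (values are illustrative only)
t = [1 0; 0 1; 0 0];              % one-hot targets, one column per sample
y = [0.7 0.2; 0.2 0.5; 0.1 0.3];  % outputs, each column sums to 1

ce = -t .* log(y);                % element-wise cross-entropy
perf = sum(ce(:)) / numel(ce);    % mean over all N*Q elements

% Binary case: N = 1, targets are 0 or 1
tb = [1 0 1];
yb = [0.9 0.2 0.6];
ceb = -tb .* log(yb) - (1 - tb) .* log(1 - yb);
perfb = sum(ceb(:)) / numel(ceb);
```

Only the elements where `t` is 1 contribute in the 1-of-N case, so `perf` here reduces to `(-log(0.7) - log(0.5))/6`. This hand calculation does not guard against outputs that are exactly 0 or 1.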

## Examples

This example shows how to design a classification network with cross-entropy and 0.1 regularization, then calculate performance on the whole dataset.

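```
% Load the iris data: x holds inputs, t holds 1-of-N targets
[x,t] = iris_dataset;

% Pattern-recognition network with 10 hidden neurons
net = patternnet(10);
net.performParam.regularization = 0.1;
net = train(net,x,t);

% Evaluate cross-entropy performance on the whole dataset
y = net(x);
perf = crossentropy(net,t,y,{1},'regularization',0.1)
```
```
perf = 0.0278
```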

This example shows how to set up the network to use `crossentropy` as the performance function during training.

```
net = feedforwardnet(10);
net.performFcn = 'crossentropy';
net.performParam.regularization = 0.1;
net.performParam.normalization = 'none';
```

## Input Arguments

`net`: Neural network, specified as a network object.

Example: `net = feedforwardnet(10);`

`targets`: Neural network target values, specified as a matrix or cell array of numeric values. Network target values define the desired outputs, and can be specified as an `N`-by-`Q` matrix of `Q` `N`-element vectors, or an `M`-by-`TS` cell array where each element is an `Ni`-by-`Q` matrix. In each of these cases, `N` or `Ni` indicates a vector length, `Q` the number of samples, `M` the number of signals for neural networks with multiple outputs, and `TS` the number of time steps for time series data. `targets` must have the same dimensions as `outputs`.

The target matrix columns consist of all zeros and a single 1 in the position of the class being represented by that column vector. When N = 1, the software uses cross-entropy for binary encoding; otherwise, it uses cross-entropy for 1-of-N encoding. `NaN` values are allowed to indicate unknown or don't-care output values. The performance of `NaN` target values is ignored.
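As an illustration of the 1-of-N layout described above, the sketch below builds a target matrix from a hypothetical vector of class indices using only base MATLAB indexing, then marks one sample as unknown with `NaN`. The variable names and label values are assumptions made for the example.

```matlab
labels = [1 3 2 3];               % hypothetical class index for each sample
N = 3;                            % number of classes
Q = numel(labels);                % number of samples

targets = zeros(N, Q);
targets(sub2ind([N Q], labels, 1:Q)) = 1;   % a single 1 per column

targets(:, 4) = NaN;              % mark sample 4 as unknown / don't-care
```

The toolbox function `ind2vec` produces the same encoding as a sparse matrix, which may be more convenient when the number of classes is large.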

Data Types: `double` | `cell`

`outputs`: Neural network output values, specified as a matrix or cell array of numeric values. Network output values can be specified as an `N`-by-`Q` matrix of `Q` `N`-element vectors, or an `M`-by-`TS` cell array where each element is an `Ni`-by-`Q` matrix. In each of these cases, `N` or `Ni` indicates a vector length, `Q` the number of samples, `M` the number of signals for neural networks with multiple outputs, and `TS` the number of time steps for time series data. `outputs` must have the same dimensions as `targets`.

Outputs can include `NaN` to indicate unknown output values, presumably produced as a result of `NaN` input values (also representing unknown or don't-care values). The performance of `NaN` output values is ignored.

General case (N>=2): The columns of the output matrix represent estimates of class membership, and should sum to 1. You can use the `softmax` transfer function to produce such output values. Use `patternnet` to create networks that are already set up to use cross-entropy performance with a softmax output layer.
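To show what "columns that sum to 1" looks like in practice, here is a minimal hand-written softmax over hypothetical raw scores. It is a sketch of the standard softmax formula in base MATLAB (implicit expansion, so R2016b or later is assumed), not the toolbox `softmax` implementation.

```matlab
z = [2.0 0.5; 1.0 0.5; 0.1 3.0];      % hypothetical raw scores, N = 3, Q = 2

z = z - max(z, [], 1);                % subtract column max for numerical stability
y = exp(z) ./ sum(exp(z), 1);         % normalize so each column sums to 1

colSums = sum(y, 1);                  % should be 1 for every column
```

Every element of `y` is strictly positive, so `log(y)` in the cross-entropy formula stays finite for outputs produced this way.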

Data Types: `double` | `cell`

`perfWeights`: Performance weights, specified as a vector or cell array of numeric values. Performance weights are an optional argument defining the importance of each performance value, associated with each target value, using values between 0 and 1. A weight of 0 indicates a target to ignore, and a weight of 1 indicates a target to be treated with normal importance. Values between 0 and 1 allow targets to be treated with relative importance.

Performance weights have many uses. They are helpful for classification problems, to indicate which classifications (or misclassifications) have relatively greater benefits (or costs). They can be useful in time series problems where obtaining a correct output on some time steps, such as the last time step, is more important than others. Performance weights can also be used to encourage a neural network to best fit samples whose targets are known most accurately, while giving less importance to targets which are known to be less accurate.

`perfWeights` can have the same dimensions as `targets` and `outputs`. Alternatively, each dimension of the performance weights can either match the corresponding dimension of `targets` and `outputs`, or be 1. For instance, if `targets` is an `N`-by-`Q` matrix defining `Q` samples of `N`-element vectors, the performance weights can be `N`-by-`Q`, indicating a different importance for each target value; `N`-by-`1`, defining a different importance for each row of the targets; `1`-by-`Q`, indicating a different importance for each sample; or the scalar 1 (that is, 1-by-1), indicating the same importance for all target values.

Similarly, if `outputs` and `targets` are cell arrays of matrices, the `perfWeights` can be a cell array of the same size, a row cell array (indicating the relative importance of each time step), a column cell array (indicating the relative importance of each neural network output), or a cell array of a single matrix or just the matrix (both cases indicating that all matrices have the same importance values).

For any problem, a `perfWeights` value of `{1}` (the default) or the scalar 1 indicates all performances have equal importance.
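The dimension-matching rule for matrix weights can be pictured by expanding each allowed shape against an `N`-by-`Q` array. The sketch below only illustrates which target values each weight covers; the weight values are hypothetical, and it makes no claim about how `crossentropy` applies the weights internally.

```matlab
N = 3; Q = 4;                         % illustrative sizes

wFull   = ones(N, Q);                 % N-by-Q: one weight per target value
wRow    = [1; 0.5; 0.25];             % N-by-1: one weight per target row
wSample = [1 1 0 0.5];                % 1-by-Q: one weight per sample
wScalar = 1;                          % 1-by-1: equal importance everywhere

% Expand each shape to N-by-Q to see which target values it covers
base = ones(N, Q);
expandedRow    = wRow    .* base;     % row i is constant at wRow(i)
expandedSample = wSample .* base;     % column j is constant at wSample(j)
expandedScalar = wScalar .* base;     % every entry equals wScalar
```

In this example, the zero in `wSample` marks every target value of the third sample as one to ignore.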

Data Types: `double` | `cell`

### Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Example: `'normalization','standard'` specifies that the outputs and targets are normalized to the range (-1,+1).

`'regularization'`: Proportion of performance attributed to weight/bias values, specified as a double between 0 (the default) and 1. A larger value penalizes large weights more heavily, which makes the network function more likely to avoid overfitting.

Example: `'regularization',0`

Data Types: `single` | `double`
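One way to read "proportion of performance attributed to weight/bias values" is as a weighted blend of the error term and a penalty on the weights. The sketch below assumes a blend of the form `(1-r)*cePerf + r*wbPerf`, with the penalty taken as the mean squared weight/bias value. This form is an assumption and is not confirmed by this page; all values are hypothetical.

```matlab
r = 0.1;                              % 'regularization' value
cePerf = 0.03;                        % hypothetical cross-entropy part
wb = [0.5; -1.2; 0.8; 0.1];           % hypothetical weight/bias vector

wbPerf = mean(wb .^ 2);               % assumed penalty: mean squared weight/bias
perf = (1 - r) * cePerf + r * wbPerf; % assumed blend, not confirmed by this page
```

Under this assumption, `r = 0` leaves the cross-entropy term unchanged, and `r = 1` would measure only the size of the weights and biases.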

`'normalization'`: Normalization mode for outputs, targets, and errors, specified as `'none'`, `'standard'`, or `'percent'`. `'none'` performs no normalization. `'standard'` results in outputs and targets being normalized to (-1, +1), and therefore errors in the range (-2, +2). `'percent'` normalizes outputs and targets to (-0.5, 0.5) and errors to (-1, 1).

Example: `'normalization','standard'`

Data Types: `char`

## Output Arguments

`perf`: Network performance, returned as a double in the range (0,1).