# processpca

Process columns of matrix with principal component analysis

## Syntax

```
[Y,PS] = processpca(X,maxfrac)
[Y,PS] = processpca(X,FP)
Y = processpca('apply',X,PS)
X = processpca('reverse',Y,PS)
name = processpca('name')
fp = processpca('pdefaults')
names = processpca('pdesc')
processpca('pcheck',fp);
```

## Description

`processpca` processes matrices using principal component analysis so that each row is uncorrelated, the rows are ordered by the amount they contribute to total variation, and rows whose contribution to total variation is less than `maxfrac` are removed.

`[Y,PS] = processpca(X,maxfrac)` takes `X` and an optional parameter,

- `X` — `N`-by-`Q` matrix
- `maxfrac` — Maximum fraction of variance for removed rows (default is 0)

and returns

- `Y` — `M`-by-`Q` matrix with `N-M` rows deleted
- `PS` — Process settings that allow consistent processing of values

`[Y,PS] = processpca(X,FP)` takes parameters as a struct: `FP.maxfrac`.
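The scalar and struct calling forms are interchangeable; a minimal sketch (the data `X` here is illustrative):

```
X = rand(4,20);                 % 4-by-20 example data
FP.maxfrac = 0.01;              % remove components under 1% of total variance
[Y1,PS1] = processpca(X,0.01);  % scalar form
[Y2,PS2] = processpca(X,FP);    % equivalent struct form
```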

`Y = processpca('apply',X,PS)` returns `Y`, given `X` and settings `PS`.

`X = processpca('reverse',Y,PS)` returns `X`, given `Y` and settings `PS`.

`name = processpca('name')` returns the name of this process method.

`fp = processpca('pdefaults')` returns default process parameter structure.

`names = processpca('pdesc')` returns the process parameter descriptions.

`processpca('pcheck',fp);` throws an error if any parameter is illegal.
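The informational calling forms above can be exercised together as a quick sketch (return values vary by release and are not shown):

```
name = processpca('name');     % name of this process method
fp = processpca('pdefaults');  % default parameter struct, e.g. fp.maxfrac
names = processpca('pdesc');   % descriptions of each parameter
processpca('pcheck',fp);       % errors only if a parameter is illegal
```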

## Examples

Here is how to format a matrix with an independent row, a correlated row, and a completely redundant row so that its rows are uncorrelated and the redundant row is dropped.

```
x1_independent = rand(1,5)
x1_correlated = rand(1,5) + x1_independent;
x1_redundant = x1_independent + x1_correlated
x1 = [x1_independent; x1_correlated; x1_redundant]
[y1,ps] = processpca(x1)
```

Next, apply the same processing settings to new values.

```
x2_independent = rand(1,5)
x2_correlated = rand(1,5) + x2_independent;
x2_redundant = x2_independent + x2_correlated
x2 = [x2_independent; x2_correlated; x2_redundant];
y2 = processpca('apply',x2,ps)
```

Reverse the processing of `y1` to get `x1` again.

```
x1_again = processpca('reverse',y1,ps)
```


### Reduce Input Dimensionality Using `processpca`

In some situations, the dimension of the input vector is large, but the components of the vectors are highly correlated (redundant). It is useful in this situation to reduce the dimension of the input vectors. An effective procedure for performing this operation is principal component analysis. This technique has three effects: it orthogonalizes the components of the input vectors (so that they are uncorrelated with each other), it orders the resulting orthogonal components (principal components) so that those with the largest variation come first, and it eliminates those components that contribute the least to the variation in the data set. The following code illustrates the use of `processpca`, which performs a principal component analysis using a `maxfrac` setting of 0.02.

```
[pn,ps1] = mapstd(p);
[ptrans,ps2] = processpca(pn,0.02);
```

The input vectors are first normalized, using `mapstd`, so that they have zero mean and unity variance. This is a standard procedure when using principal components. In this example, the second argument passed to `processpca` is 0.02. This means that `processpca` eliminates those principal components that contribute less than 2% to the total variation in the data set. The matrix `ptrans` contains the transformed input vectors. The settings structure `ps2` contains the principal component transformation matrix. After the network has been trained, these settings should be used to transform any future inputs that are applied to the network. The transformation effectively becomes a part of the network, just like the network weights and biases. If you multiply the normalized input vectors `pn` by the transformation matrix stored in `ps2`, you obtain the transformed input vectors `ptrans`.
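Assuming the transformation matrix is stored in the settings field `ps2.transform` (a field name not documented on this page), the relationship can be checked as a sketch:

```
% Apply the stored transformation matrix directly to the normalized inputs.
% ps2.transform is an assumption about the settings structure's field name.
ptrans2 = ps2.transform * pn;
max(max(abs(ptrans - ptrans2)))  % difference should be near zero
```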

If `processpca` is used to preprocess the training set data, then whenever the trained network is used with new inputs, you should preprocess them with the transformation matrix that was computed for the training set, using `ps2`. The following code applies a new set of inputs to a network already trained.

```
pnewn = mapstd('apply',pnew,ps1);
pnewtrans = processpca('apply',pnewn,ps2);
a = sim(net,pnewtrans);
```

Principal component analysis is not reliably reversible. Therefore, it is recommended only for input processing. Outputs require reversible processing functions.
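A sketch of why reversal is only approximate once components have been removed (the data here is illustrative):

```
x = [rand(1,50); rand(1,50)];
x = [x; x(1,:) + 0.01*rand(1,50)];      % nearly redundant third row
[y,ps] = processpca(x,0.02);            % drops low-variance components
x_approx = processpca('reverse',y,ps);  % reconstruct from fewer rows
max(max(abs(x - x_approx)))             % small but nonzero reconstruction error
```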

Principal component analysis is not part of the default processing for `feedforwardnet`. You can add this with the following command:

```
net.inputs{1}.processFcns{end+1} = 'processpca';
```

## Algorithms

`processpca` computes the principal component transformation from the singular value decomposition of the input data. The rows of the resulting transformation matrix are the eigenvectors of the input covariance matrix, ordered by decreasing variance, and rows whose contribution to the total variance is less than `maxfrac` are removed. Values are then transformed with

```
y = transform * x;
```

where `transform` is the truncated transformation matrix stored in the process settings `PS`.

## Version History

Introduced in R2006a