# ecmnmle

Mean and covariance of incomplete multivariate normal data

## Syntax

```[Mean,Covariance] = ecmnmle(Data,InitMethod,MaxIterations,Tolerance,Mean0,Covar0)
```

## Arguments

 `Data` `NUMSAMPLES`-by-`NUMSERIES` matrix with `NUMSAMPLES` samples of a `NUMSERIES`-dimensional random vector. Missing values are indicated by `NaN`s. A sample is also called an observation or a record. `InitMethod` (Optional) Character vector that identifies one of three defined initialization methods to compute initial estimates for the mean and covariance of the data. If `InitMethod` = `[]` or `''`, the default method `nanskip` is used. The initialization methods are: `nanskip` — (Default) Skip all records with `NaN`s. `twostage` — Estimate mean. Fill `NaN`s with mean. Then estimate covariance.`diagonal` — Form a diagonal covariance. NoteIf you supply `Mean0` and `Covar0`, `InitMethod` is not executed. `MaxIterations` (Optional) Maximum number of iterations for the expectation conditional maximization (ECM) algorithm. Default = `50`. `Tolerance` (Optional) Convergence tolerance for the ECM algorithm (Default = `1.0e-8`.) If `Tolerance` ≤ `0`, perform maximum iterations specified by `MaxIterations` and do not evaluate the objective function at each step unless in display mode, as described below. `Mean0` (Optional) Initial `NUMSERIES`-by-`1` column vector estimate for the mean. If you leave `Mean0` unspecified (`[]`), the method specified by `InitMethod` is used. If you specify `Mean0`, you must also specify `Covar0`. `Covar0` (Optional) Initial `NUMSERIES`-by-`NUMSERIES` matrix estimate for the covariance, where the input matrix must be positive-definite. If you leave `Covar0` unspecified (`[]`), the method specified by `InitMethod` is used. If you specify `Covar0`, you must also specify `Mean0`.

## Description

`[Mean,Covariance] = ecmnmle(Data,InitMethod,MaxIterations,Tolerance,Mean0,Covar0)` estimates the mean and covariance of a data set. If the data set has missing values, this routine implements the ECM algorithm of Meng and Rubin [2] with enhancements by Sexton and Swensen [3]. ECM stands for expectation conditional maximization, a conditional maximization form of the EM algorithm of Dempster, Laird, and Rubin [4].

This routine has two operational modes.

### Display Mode

With no output arguments, this mode displays the convergence of the ECM algorithm. It estimates and plots objective function values for each iteration of the ECM algorithm until termination, as shown in the following plot.

Display mode can determine `MaxIter` and `Tolerance` values or serve as a diagnostic tool. The objective function is the negative log-likelihood function of the observed data and convergence to a maximum likelihood estimate corresponds with minimization of the objective.

### Estimation Mode

With output arguments, this mode estimates the mean and covariance via the ECM algorithm.

## Examples

To see an example of how to use `ecmnmle`, run the program `ecmguidemo`.

## Algorithms

collapse all

### Model

The general model is

`$Z\sim N\left(Mean,\text{\hspace{0.17em}}Covariance\right),$`

where each row of `Data` is an observation of Z.

Each observation of Z is assumed to be iid (independent, identically distributed) multivariate normal, and missing values are assumed to be missing at random (MAR). See Little and Rubin [1] for a precise definition of MAR.

This routine estimates the mean and covariance from given data. If data values are missing, the routine implements the ECM algorithm of Meng and Rubin [2] with enhancements by Sexton and Swensen [3].

If a record is empty (every value in a sample is `NaN`), this routine ignores the record because it contributes no information. If such records exist in the data, the number of nonempty samples used in the estimation is ≤ `NumSamples`.

The estimate for the covariance is a biased maximum likelihood estimate (MLE). To convert to an unbiased estimate, multiply the covariance by `Count`/(`Count` – 1), where `Count` is the number of nonempty samples used in the estimation.

### Requirements

This routine requires consistent values for `NUMSAMPLES` and `NUMSERIES` with `NUMSAMPLES` > `NUMSERIES`. It must have enough nonmissing values to converge. Finally, it must have a positive-definite covariance matrix. Although the references provide some necessary and sufficient conditions, general conditions for existence and uniqueness of solutions in the missing-data case, do not exist. The main failure mode is an ill-conditioned covariance matrix estimate. Nonetheless, this routine works for most cases that have less than 15% missing data (a typical upper bound for financial data).

### Initialization Methods

This routine has three initialization methods that cover most cases, each with its advantages and disadvantages. The ECM algorithm always converges to a minimum of the observed negative log-likelihood function. If you override the initialization methods, you must ensure that the initial estimate for the covariance matrix is positive-definite.

The following is a guide to the supported initialization methods.

### nanskip

The `nanskip` method works well with small problems (fewer than 10 series or with monotone missing data patterns). It skips over any records with `NaN`s and estimates initial values from complete-data records only. This initialization method tends to yield fastest convergence of the ECM algorithm. This routine switches to the `twostage` method if it determines that significant numbers of records contain `NaN`.

### twostage

The `twostage` method is the best choice for large problems (more than 10 series). It estimates the mean for each series using all available data for each series. It then estimates the covariance matrix with missing values treated as equal to the mean rather than as `NaN`s. This initialization method is robust but tends to result in slower convergence of the ECM algorithm.

### diagonal

The `diagonal` method is a worst-case approach that deals with problematic data, such as disjoint series and excessive missing data (more than 33% of data missing). Of the three initialization methods, this method causes the slowest convergence of the ECM algorithm. If problems occur with this method, use display mode to examine convergence and modify either `MaxIterations` or `Tolerance`, or try alternative initial estimates with `Mean0` and `Covar0`. If all else fails, try

```Mean0 = zeros(NumSeries); Covar0 = eye(NumSeries,NumSeries); ```

Given estimates for mean and covariance from this routine, you can estimate standard errors with the companion routine `ecmnstd`.

### Convergence

The ECM algorithm does not work for all patterns of missing values. Although it works in most cases, it can fail to converge if the covariance becomes singular. If this occurs, plots of the log-likelihood function tend to have a constant upward slope over many iterations as the log of the negative determinant of the covariance goes to zero. In some cases, the objective fails to converge due to machine precision errors. No general theory of missing data patterns exists to determine these cases. An example of a known failure occurs when two time series are proportional wherever both series contain nonmissing values.

## References

[1] Little, Roderick J. A. and Donald B. Rubin. Statistical Analysis with Missing Data. 2nd Edition. John Wiley & Sons, Inc., 2002.

[2] Meng, Xiao-Li and Donald B. Rubin. “Maximum Likelihood Estimation via the ECM Algorithm.” Biometrika. Vol. 80, No. 2, 1993, pp. 267–278.

[3] Sexton, Joe and Anders Rygh Swensen. “ECM Algorithms that Converge at the Rate of EM.” Biometrika. Vol. 87, No. 3, 2000, pp. 651–662.

[4] Dempster, A. P., N. M. Laird, and Donald B. Rubin. “Maximum Likelihood from Incomplete Data via the EM Algorithm.” Journal of the Royal Statistical Society. Series B, Vol. 39, No. 1, 1977, pp. 1–37.