normfit

Normal parameter estimates

Description

[muHat,sigmaHat] = normfit(x) returns estimates of normal distribution parameters (the mean muHat and standard deviation sigmaHat), given the sample data in x. muHat is the sample mean, and sigmaHat is the square root of the unbiased estimator of the variance.

[muHat,sigmaHat,muCI,sigmaCI] = normfit(x) also returns 95% confidence intervals for the mean and standard deviation parameter estimates in the arrays muCI and sigmaCI, respectively.

[muHat,sigmaHat,muCI,sigmaCI] = normfit(x,alpha) specifies the confidence level for the confidence intervals to be 100(1–alpha)%.

[___] = normfit(x,alpha,censoring) specifies whether each value in x is right-censored or not. Use the logical vector censoring in which 1 indicates observations that are right-censored and 0 indicates observations that are fully observed. With censoring, muHat and sigmaHat are the maximum likelihood estimates (MLEs).

[___] = normfit(x,alpha,censoring,freq) specifies the frequency or weights of observations.

[___] = normfit(x,alpha,censoring,freq,options) specifies optimization options for the iterative algorithm that normfit uses to compute MLEs with censoring. Create options by using the function statset.

You can pass in [] for alpha, censoring, and freq to use their default values.
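As a minimal sketch of the placeholder behavior, the following calls should return the same confidence interval, because [] stands in for the default alpha of 0.05. The data values are made up for illustration.

```matlab
% Hypothetical sample data (illustrative values only)
x = [2.3; 3.1; 2.8; 3.6; 2.9];

% Three equivalent ways to request the default 95% confidence interval
[~,~,muCI_default] = normfit(x);        % alpha omitted
[~,~,muCI_empty]   = normfit(x,[]);     % [] placeholder for alpha
[~,~,muCI_0p05]    = normfit(x,0.05);   % default value given explicitly
```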

Examples

Generate 1000 random numbers from the normal distribution with mean 3 and standard deviation 5.

rng('default') % For reproducibility
x = normrnd(3,5,[1000,1]);

Find the parameter estimates and the 99% confidence intervals.

[muHat,sigmaHat,muCI,sigmaCI] = normfit(x,0.01)
muHat = 2.8368
sigmaHat = 4.9948
muCI = 2×1

    2.4292
    3.2445

sigmaCI = 2×1

    4.7218
    5.2989

muHat is the sample mean, and sigmaHat is the square root of the unbiased estimator of the variance. muCI and sigmaCI contain the 99% confidence intervals of the mean and standard deviation parameters, respectively. The first row is the lower bound, and the second row is the upper bound.

Find the MLEs of a data set with censoring by using normfit. Use statset to specify the iterative algorithm options that normfit uses to compute MLEs for censored data, and then find the MLEs again.

Load the sample data.

load lightbulb

The first column of the data contains the lifetime (in hours) of two types of bulbs. The second column contains the binary variable indicating whether the bulb is fluorescent or incandescent. 0 indicates that the bulb is fluorescent, and 1 indicates that the bulb is incandescent. The third column contains the censorship information, where 0 indicates that the bulb is observed until failure, and 1 indicates that the bulb is censored.

Find the indices for fluorescent bulbs.

idx = find(lightbulb(:,2) == 0);

Assume that the lifetime follows the normal distribution, and find the MLEs of the normal distribution parameters. The second input argument of normfit specifies the significance level alpha for the confidence intervals. Pass in [] to use its default value 0.05, which corresponds to 95% confidence intervals. The third input argument specifies the censorship information.

censoring = lightbulb(idx,3) == 1;
[muHat1,sigmaHat1] = normfit(lightbulb(idx,1),[],censoring)
muHat1 = 9.4966e+03
sigmaHat1 = 3.0640e+03

Display the default algorithm parameters that normfit uses to estimate the normal distribution parameters.

statset('normfit')
ans = struct with fields:
          Display: 'off'
      MaxFunEvals: 200
          MaxIter: 100
           TolBnd: 1.0000e-06
           TolFun: 1.0000e-08
       TolTypeFun: []
             TolX: 1.0000e-08
         TolTypeX: []
          GradObj: []
         Jacobian: []
        DerivStep: []
      FunValCheck: []
           Robust: []
     RobustWgtFun: []
           WgtFun: []
             Tune: []
      UseParallel: []
    UseSubstreams: []
          Streams: {}
        OutputFcn: []

Save the options using a different name. Change how the results are displayed (Display) and the termination tolerance for the objective function (TolFun).

options = statset('normfit');
options.Display = 'final';
options.TolFun = 1e-10;

Alternatively, you can specify algorithm parameters by using the name-value pair arguments of the function statset.

options = statset('Display','final','TolFun',1e-10);

Find the MLEs with the new algorithm parameters.

[muHat2,sigmaHat2] = normfit(lightbulb(idx,1),[],censoring,[],options)
Successful convergence: Norm of gradient less than OPTIONS.TolFun
muHat2 = 9.4966e+03
sigmaHat2 = 3.0640e+03

normfit displays a report on the final iteration.

The function normfit finds the sample mean and the square root of the unbiased estimator of the variance with no censoring. The sample mean is equal to the MLE of the mean parameter, but the square root of the unbiased estimator of the variance is not equal to the MLE of the standard deviation parameter.

Find the normal distribution parameters by using normfit, convert them into MLEs, and then compare the negative log likelihoods of the estimates by using normlike.

Generate 100 random numbers from the standard normal distribution.

rng('default') % For reproducibility
n = 100;
x = normrnd(0,1,[n,1]);

Find the sample mean and the square root of the unbiased estimator of the variance.

[muHat,sigmaHat] = normfit(x)
muHat = 0.1231
sigmaHat = 1.1624

Convert the square root of the unbiased estimator of the variance into the MLE of the standard deviation parameter.

sigmaHat_MLE = sqrt((n-1)/n)*sigmaHat
sigmaHat_MLE = 1.1566

The difference between sigmaHat and sigmaHat_MLE is negligible for large n.
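The conversion factor follows from the two variance estimators, which differ only in their normalization. Written out, with s denoting the value that normfit returns as sigmaHat for uncensored data (the symbols here are introduced for illustration):

```latex
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2,
\qquad
\hat{\sigma}^2_{\mathrm{MLE}} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2
 = \frac{n-1}{n}\,s^2
\quad\Longrightarrow\quad
\hat{\sigma}_{\mathrm{MLE}} = \sqrt{\frac{n-1}{n}}\;s .
```

For n = 100, the factor is sqrt(0.99), which is approximately 0.995, so the two estimates differ by about 0.5%.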

Alternatively, you can find the MLEs by using the function mle.

phat = mle(x)
phat = 1×2

    0.1231    1.1566

phat(1) and phat(2) are the MLEs of the mean and the standard deviation parameter, respectively.

Confirm that the log likelihood of the MLEs (muHat and sigmaHat_MLE) is greater than the log likelihood of the estimates that normfit returns (muHat and sigmaHat) by using the normlike function.

logL = -normlike([muHat,sigmaHat],x)
logL = -156.4424
logL_MLE = -normlike([muHat,sigmaHat_MLE],x)
logL_MLE = -156.4399

Input Arguments

x: Sample data, specified as a vector.

Data Types: single | double

alpha: Significance level for the confidence intervals, specified as a scalar in the range (0,1). The confidence level is 100(1–alpha)%, where alpha is the probability that the confidence intervals do not contain the true value.

Example: 0.01

Data Types: single | double
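The effect of alpha can be illustrated with a short sketch: a larger alpha means a lower confidence level and therefore a narrower interval. The sample below is simulated only for illustration.

```matlab
rng('default')                    % For reproducibility
x = normrnd(10,2,[50,1]);         % Simulated sample (illustrative)

[~,~,muCI95] = normfit(x);        % Default alpha = 0.05 (95% interval)
[~,~,muCI90] = normfit(x,0.10);   % alpha = 0.10 (90% interval)

width95 = diff(muCI95);           % Upper bound minus lower bound
width90 = diff(muCI90);           % Narrower than width95
```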

censoring: Indicator for the censoring of each value in x, specified as a logical vector of the same size as x. Use 1 for observations that are right-censored and 0 for observations that are fully observed.

The default is an array of 0s, meaning that all observations are fully observed.

Data Types: logical
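The following sketch shows one way to build the censoring vector. It assumes a hypothetical study that stopped at 1000 hours, so any recorded value of 1000 is a right-censored lifetime. Ignoring the censoring treats those values as failure times, which typically understates the mean.

```matlab
% Hypothetical lifetimes in hours; the study ended at 1000 hours
x = [620; 710; 850; 930; 1000; 1000; 1000];

% Logical vector: 1 marks right-censored observations
censoring = (x == 1000);

% Naive fit that treats the censored values as observed failures
[muNaive,sigmaNaive] = normfit(x);

% MLEs that account for right-censoring
[muHat,sigmaHat] = normfit(x,[],censoring);
```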

freq: Frequency or weights of observations, specified as a nonnegative vector that is the same size as x. The freq input argument typically contains nonnegative integer counts for the corresponding elements in x, but can contain any nonnegative values.

To obtain the weighted MLEs for a data set with censoring, specify weights of observations, normalized to the number of observations in x.

The default is an array of 1s, meaning one observation per element of x.

Data Types: single | double
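As a sketch of the integer-count case, fitting the unique values with their counts is expected to give the same estimates as fitting the fully expanded sample. The values and counts below are hypothetical, and the equivalence is an assumption based on the description of freq as observation counts.

```matlab
% Hypothetical distinct values and how often each was observed
xUnique = [1; 2; 3; 4];
freq    = [2; 5; 3; 1];

% Fit using counts; [] keeps the defaults for alpha and censoring
[muW,sigmaW] = normfit(xUnique,[],[],freq);

% Fit the expanded sample, with each value repeated freq times
xFull = repelem(xUnique,freq);
[muF,sigmaF] = normfit(xFull);
```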

options: Optimization options, specified as a structure. options determines the control parameters for the iterative algorithm that normfit uses to compute MLEs for censored data.

Create options by using the function statset or by creating a structure array containing the fields and values described in this table.

Field Name | Value | Default Value
Display | Amount of information displayed by the algorithm: 'off' displays no information; 'final' displays the final output. | 'off'
MaxFunEvals | Maximum number of objective function evaluations allowed, specified as a positive integer. | 200
MaxIter | Maximum number of iterations allowed, specified as a positive integer. | 100
TolBnd | Lower bound of the standard deviation parameter estimate, specified as a positive scalar. The bounds for the mean and standard deviation parameter estimates are [–Inf,Inf] and [TolBnd,Inf], respectively. | 1e-6
TolFun | Termination tolerance for the objective function value, specified as a positive scalar. | 1e-8
TolX | Termination tolerance for the parameters, specified as a positive scalar. | 1e-8

You can also enter statset('normfit') in the Command Window to see the names and default values of the fields that normfit accepts in the options structure.

Example: statset('Display','final','MaxIter',1000) displays the final results of the iterative algorithm and changes the maximum number of iterations allowed to 1000.

Data Types: struct

Output Arguments

muHat: Estimate of the mean parameter of the normal distribution, returned as a scalar.

  • With no censoring, muHat is the sample mean.

  • With censoring, muHat is the MLE. To compute the weighted MLE, specify the weights of observations by using freq.

sigmaHat: Estimate of the standard deviation parameter of the normal distribution, returned as a scalar.

  • With no censoring, sigmaHat is the square root of the unbiased estimator of the variance. To compute the MLE with no censoring, use the mle function.

  • With censoring, sigmaHat is the MLE. To compute the weighted MLE, specify the weights of observations by using freq.
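For uncensored data, the two estimates can be cross-checked against the base MATLAB functions mean and std. This is a small sketch with made-up data. std normalizes by n-1 by default, and std(x,1) normalizes by n, which gives the MLE.

```matlab
% Hypothetical uncensored sample (illustrative values only)
x = [4.1; 5.3; 3.8; 6.0; 5.2; 4.7];

[muHat,sigmaHat] = normfit(x);

muCheck    = mean(x);    % Sample mean
sigmaCheck = std(x);     % Square root of the unbiased variance estimator
sigmaMLE   = std(x,1);   % Normalized by n: the MLE of sigma
```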

muCI: Confidence interval for the mean parameter of the normal distribution, returned as a 2-by-1 column vector containing the lower and upper bounds of the 100(1–alpha)% confidence interval.

The first and second rows correspond to the lower and upper bounds of the confidence intervals, respectively.

sigmaCI: Confidence interval for the standard deviation parameter of the normal distribution, returned as a 2-by-1 column vector containing the lower and upper bounds of the 100(1–alpha)% confidence interval.

The first and second rows correspond to the lower and upper bounds of the confidence intervals, respectively.

Algorithms

To compute the confidence intervals, normfit uses the exact method for uncensored data and the Wald method for censored data. The exact method provides exact coverage for uncensored samples based on t and chi-square distributions.
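For uncensored data, the exact intervals can be reproduced by hand with tinv and chi2inv. The following is a sketch of the standard t-based and chi-square-based formulas, not a copy of the normfit source; the variable names ending in Check are introduced here for illustration.

```matlab
rng('default')                       % For reproducibility
x = normrnd(3,5,[1000,1]);
alpha = 0.01;                        % 99% confidence intervals
n = numel(x);

[muHat,sigmaHat,muCI,sigmaCI] = normfit(x,alpha);

% Mean: t distribution with n-1 degrees of freedom
tCrit = tinv(1 - alpha/2, n-1);
muCICheck = muHat + [-1; 1]*tCrit*sigmaHat/sqrt(n);

% Standard deviation: chi-square distribution with n-1 degrees of freedom.
% The upper quantile gives the lower bound, and vice versa.
chi2Crit = chi2inv([1 - alpha/2; alpha/2], n-1);
sigmaCICheck = sigmaHat*sqrt((n-1)./chi2Crit);
```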

Alternative Functionality

normfit is a function specific to the normal distribution. Statistics and Machine Learning Toolbox™ also offers the generic functions mle, fitdist, and paramci and the Distribution Fitter app, which support various probability distributions.

  • mle returns MLEs and the confidence intervals of MLEs for the parameters of various probability distributions. You can specify the probability distribution name or a custom probability density function.

  • Create a NormalDistribution probability distribution object by fitting the distribution to data using the fitdist function or the Distribution Fitter app. The object properties mu and sigma store the parameter estimates. To obtain the confidence intervals for the parameter estimates, pass the object to paramci.
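A brief sketch of the object-based route, using simulated data: for an uncensored sample, the mu and sigma properties of the fitted object should agree with the normfit estimates, and paramci returns the confidence bounds with one column per parameter.

```matlab
rng('default')                     % For reproducibility
x = normrnd(3,5,[200,1]);          % Simulated sample (illustrative)

% Function-based fit
[muHat,sigmaHat] = normfit(x);

% Object-based fit
pd = fitdist(x,'Normal');          % NormalDistribution object
ci = paramci(pd);                  % Rows: lower and upper bounds; columns: mu and sigma
```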

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Introduced before R2006a