gpfit

Generalized Pareto parameter estimates

Description

pHat = gpfit(x) returns the maximum likelihood estimates (MLEs) of the generalized Pareto (GP) distribution parameters (shape and scale), given the sample data in x.

Other functions for the generalized Pareto distribution, such as gpcdf, allow a threshold (location) parameter theta. However, gpfit does not estimate theta, and assumes its value to be zero. To fit data with a known value of theta, subtract theta from x before calling gpfit.

[pHat,pCI] = gpfit(x,alpha) also returns the confidence intervals pCI for the parameter estimates and specifies the confidence level of those intervals as 100(1 – alpha)%.

[___] = gpfit(x,alpha,options) specifies optimization options for the iterative algorithm gpfit uses to compute the MLEs. Create options by using the function statset.

You can specify [] for alpha to use its default value of 0.05.

Examples

Generate 100 random numbers from the generalized Pareto distribution with the shape parameter k=0.5, scale parameter sigma=2, and location parameter theta=1.

rng(0,"twister") % For reproducibility
k = 0.5;
sigma = 2;
theta = 1;
x = gprnd(k,sigma,theta,100,1);

Find the maximum likelihood estimates and the 99% confidence intervals for the distribution parameters. The gpfit function does not estimate the location parameter theta, and assumes its value is zero. To fit the shape and scale parameters, subtract theta from the x values.

[pHat,pCI] = gpfit(x-theta,0.01)
pHat = 1×2

    0.4991    1.7605

pCI = 2×2

    0.0782    1.0932
    0.9201    2.8350

pHat(1) and pHat(2) are the estimates of the shape and scale parameters, respectively. pCI contains the 99% confidence intervals of each parameter. The values in the first row are the lower bounds, and the values in the second row are the upper bounds.
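As a follow-up, the fitted parameters can be passed to the other generalized Pareto functions, with the known threshold added back in. The sketch below regenerates the same simulated data and uses gpinv to estimate the 99th percentile of the fitted distribution. The variable names q99 and fracBelow are illustrative only, and the choice of the 99th percentile is an assumption made for the example.

```matlab
rng(0,"twister")                    % For reproducibility
theta = 1;                          % Known threshold (not estimated by gpfit)
x = gprnd(0.5,2,theta,100,1);       % Same simulated data as in the example

pHat = gpfit(x - theta);            % pHat(1) = shape, pHat(2) = scale

% 99th percentile of the fitted distribution, with theta supplied to gpinv
q99 = gpinv(0.99,pHat(1),pHat(2),theta);

% Fraction of the sample at or below the fitted 99th percentile
fracBelow = mean(x <= q99);
```

Because theta is subtracted before fitting, it has to be supplied again to gpinv (or gpcdf, gppdf) so that results are on the scale of the original data.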

Input Arguments

x

Sample data, specified as a numeric vector.

Data Types: single | double

alpha

Significance level for the confidence intervals, specified as a scalar in the range (0,1). The confidence level is 100(1 – alpha)%, where alpha is the probability that the confidence intervals do not contain the true value. You can specify [] for alpha to use its default value of 0.05.

Data Types: single | double
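To illustrate how alpha maps to the confidence level, the sketch below fits the same simulated sample at two significance levels. The data and the variable names (ci90, ci99, width90, width99) are assumptions made for illustration. A smaller alpha corresponds to a higher confidence level, so the 99% intervals should be at least as wide as the 90% intervals.

```matlab
rng(0,"twister")                    % For reproducibility
x = gprnd(0.5,2,0,100,1);           % theta = 0, so no shift is needed

[~,ci90] = gpfit(x,0.10);           % 90% confidence intervals
[~,ci99] = gpfit(x,0.01);           % 99% confidence intervals

% diff subtracts row 1 (lower bounds) from row 2 (upper bounds),
% which gives one interval width per parameter: [shape scale]
width90 = diff(ci90);
width99 = diff(ci99);
```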

options

Optimization options, specified as a structure. options determines the control parameters of the iterative algorithm that gpfit uses to compute the MLEs.

Create options by using the function statset or by creating a structure array containing the fields and values described in this table.

Field Name: Display
Value: Amount of information displayed by the algorithm

  • "off": Displays no information

  • "final": Displays the final output

  • "notify": Displays output only if the function does not converge

Default Value: "off"

Field Name: MaxFunEvals
Value: Maximum number of objective function evaluations allowed, specified as a positive integer
Default Value: 400

Field Name: MaxIter
Value: Maximum number of iterations allowed, specified as a positive integer
Default Value: 200

Field Name: TolBnd
Value: Parameter bound tolerance, specified as a positive scalar
Default Value: 1e-6

Field Name: TolFun
Value: Termination tolerance for the objective function value, specified as a positive scalar
Default Value: 1e-6

Field Name: TolX
Value: Termination tolerance for the parameters, specified as a positive scalar
Default Value: 1e-6

Field Name: OutputFcn
Value: One or more user-defined functions that the optimization function calls at each iteration, specified as a function handle or a cell array of function handles. For more information, see Optimization Solver Output Functions.
Default Value: []

You can also enter statset("gpfit") in the Command Window to see the names and default values of the fields included in the options structure.

Example: statset(Display="final",MaxIter=1000) displays the final output of the iterative algorithm and sets the maximum number of iterations allowed to 1000.

Data Types: struct
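A minimal sketch of passing options to gpfit follows. It starts from the defaults returned by statset("gpfit") and overrides a few fields. The specific values (1000 iterations, 2000 function evaluations) and the simulated data are arbitrary choices for illustration, not recommendations.

```matlab
rng(0,"twister")                    % For reproducibility
x = gprnd(0.5,2,0,100,1);           % Simulated data with theta = 0

opts = statset("gpfit");            % Start from the gpfit defaults

% Override selected fields and keep the remaining defaults
opts = statset(opts,"Display","final", ...
    "MaxIter",1000,"MaxFunEvals",2000);

% [] keeps alpha at its default value of 0.05
[pHat,pCI] = gpfit(x,[],opts);
```

Starting from statset("gpfit") rather than an empty structure keeps every field that is not explicitly overridden at its gpfit default.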

Output Arguments

pHat

Estimates of the parameters k (shape) and sigma (scale) of the GP distribution, returned as a 1-by-2 numeric row vector. pHat(1) is the shape estimate and pHat(2) is the scale estimate.

pCI

Confidence intervals for the parameters, returned as a 2-by-2 numeric matrix containing the lower and upper bounds of the 100(1 – alpha)% confidence intervals.

The first and second rows contain the lower and upper bounds, respectively. The first and second columns correspond to the parameters k (shape) and sigma (scale), respectively.

Alternative Functionality

gpfit is a function specific to the generalized Pareto distribution. Statistics and Machine Learning Toolbox™ also offers the generic functions mle, fitdist, and paramci and the Distribution Fitter app, which support various probability distributions.

  • mle returns MLEs and the confidence intervals of MLEs for the parameters of various probability distributions. You can specify the probability distribution name or a custom probability density function.

  • Create a GeneralizedParetoDistribution probability distribution object by fitting the distribution to data using the fitdist function or the Distribution Fitter app. The object properties k and sigma store the parameter estimates. To obtain the confidence intervals for the parameter estimates, pass the object to paramci.
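The object-based route described above can be sketched as follows, alongside a direct gpfit call for comparison. The simulated data and variable names are assumptions for illustration. The sketch does not assume that the two routes return identical numbers, only that both describe the same fitted distribution family.

```matlab
rng(0,"twister")                    % For reproducibility
x = gprnd(0.5,2,0,100,1);           % Simulated data with theta = 0

% Function-specific route
[pHat,pCI] = gpfit(x);

% Object-based route: fit a GeneralizedParetoDistribution object
pd = fitdist(x,"GeneralizedPareto");
kHat = pd.k;                        % Shape estimate stored in the object
sigmaHat = pd.sigma;                % Scale estimate stored in the object

% Confidence intervals for the object parameters
% (row 1 = lower bounds, row 2 = upper bounds)
ci = paramci(pd);
```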

Version History

Introduced before R2006a