LOG PEARSON TYPE III distribution

31 views (last 30 days)
Raj Arora
Raj Arora on 28 Jun 2021
Commented: Raj Arora on 3 Jul 2021
Could someone kindly tell me how to calculate the inverse of the CDF of a log pearson type III distribution?
In fact, I need to use Monte Carlo sampling to generate samples for certain log pearson type III distributed data. I should be able to take the inverse of the CDF of a log pearson type III distribution for this.

Answers (1)

David Goodmanson
David Goodmanson on 29 Jun 2021
Edited: David Goodmanson on 2 Jul 2021
Hi Raj,
MODIFIED
Since the pearson type iii is a gamma distribution with a shifted and scaled variable, that means you can use the Matlab gamrnd function to produce random draws and proceed accordingly.
The pearson type iii distribution depends on three parameters. This answer assumes that the task is to create random draws from a distribution with three known parameters. For parameters a,b,p and for the log distribution, the pdf is 'pdfa' below. If the task is fitting the parameters from a known data set, then this answer does not apply.
Instead of gamrnd, you can use the inverse cdf method as you mentioned, which is shown in fig. 2 below. As is almost always the case with that method, there are some problems producing exteme outliers when the cdf is close to 1. But it works reasonably well.
% pearson type iii in log variable:
% pdf = (1/b)*logxs.^(p-1).*exp(-logxs)/gamma(p)
% where
% logxs = (log(x)-a)/b is a shifted and scaled log(x),
% with log(x) >= a
n = 1e5; % number of random draws
p = 3;
a = 4.7;
b = 2.4;
% constuct pdfa, the analytic pdf
logx = a:.01:50;
logxs = (logx-a)/b; % shifted and scaled log(x) [1]
pdfa = (1/b)*gampdf(logxs,p,1);
% random draws using gamrnd
randlogxs = gamrnd(p,1,1,n); % shifted and scaled log(x)
randlogx = randlogxs*b +a; % undo [1] for pearson type iii
figure(1)
histogram(randlogx,'Normalization','pdf'); grid on
hold on
plot(logx,pdfa)
hold off
% random draws using inverse of cdf
cdfa = cumtrapz(logx,pdfa);
r = rand(1,n);
r(r>max(cdfa)) = [];
randlogx = spline(cdfa,logx,r); % or randlogx = interp1(cdfa,logx,r,'spline');
figure(2)
histogram(randlogx,'Normalization','pdf'); grid on
hold on
plot(logx,pdfa)
hold off
  5 Comments
David Goodmanson
David Goodmanson on 2 Jul 2021
Edited: David Goodmanson on 2 Jul 2021
Hello Raj,
I see what you mean. It looks like of the four parameters in Wikipedia, one of them is redundant. The much clearer Wolfram Mathworld definition is (adjusting for the variable being log)
pdf = (1/b)*logxs.^(p-1).*exp(-logxs)/gamma(p)
where logx denotes log(x) and
logxs = (logx-a)/b is a shifted and scaled log(x) [1]
The answer has been modified.
Raj Arora
Raj Arora on 3 Jul 2021
Yes david your last comment is clear to me. I have some data with me with that data I already know all three paramter(given below a,b,p) and clearly I can take scaled value(log values) of all these. The question now revolves around the data provided as well as the three parameters that I have. I need to generate as many samples as I like using the same method, so let's say 1e7 (number of sample want to generate).
I simply needed those samples using the inverse CDF of the log pearson type III distribution, which will obviously be in the range of 841-19200.
Can you consider the below data and help me with this
Data : 841, 1200, 2190, 4960, 6530, 9110, 11500, 14500, 19200
b =-0.3161 ; p =12.4640 ; a = 12.4647

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!