How can I simulate data from two different distributions using copulas?

Hi!
I have been trying to create a Monte Carlo simulation model based on two sets of data. Data X seem to fit a gamma distribution, and data Y seem to fit either a Gumbel or a lognormal distribution (let's assume lognormal for the sake of argument).
So I tried to follow the guide "Simulating dependent random variables using copulas", but I can't seem to make it work. That is probably because I don't fully understand the concept behind it, but I know it should work for what I want. Right now I just seem to generate random values from 0 to 1, but I need the simulated values to reflect the correlation in my data.
This is my code:
n=1000;
X=xlsread('XXXXXX.xlsx');
Y=xlsread('YYYYYY.xlsx');
rho=corr(X,Y)
Z = mvnrnd([0 0],[1 rho;rho 1], n);
U = normcdf(Z);
Q = [gaminv(U(:,1),2,1) exp(U(:,2))];
So what is left is to bring my data into the simulation, and I assume I would also have to fit my data using the gamfit and lognfit functions.

 Accepted Answer

OK, then I think you are almost there. You have the data X, Y, and rhoXY. Start by estimating the parameters for your gamma and lognormal distributions, like this:
Xparms = gamfit(X);
Yparms = lognfit(Y);
Next you need to adjust a parameter that I'll call rhoZ. Run the following section repeatedly, changing the value of rhoZ each time until the final GivesCorr value matches (closely enough) the observed rhoXY in your real data.
BIGN = 1000000;
rhoZ = 0.35; % Change the 0.35 until GivesCorr (below) matches your observed rhoXY
Z = mvnrnd([0 0],[1 rhoZ;rhoZ 1], BIGN);
U = normcdf(Z);
Xrnd = gaminv(U(:,1),Xparms(1),Xparms(2));
Yrnd = logninv(U(:,2),Yparms(1),Yparms(2));
GivesCorr = corr(Xrnd,Yrnd)
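The manual trial and error above can also be automated. Here is a minimal sketch, assuming fzero is an acceptable root finder for this and using made-up stand-in data for X and Y (replace them with your own). The names E, Xsim, and simCorr are only illustrative. Holding the underlying normal draws fixed makes the simulated correlation a smooth function of rhoZ, which is what a root finder needs:

```matlab
% Hypothetical stand-in data; replace with your own X and Y columns.
rng(1);                                    % fixed seed so the run is repeatable
Zobs = mvnrnd([0 0],[1 0.5;0.5 1], 500);
X = gaminv(normcdf(Zobs(:,1)), 2, 1);
Y = logninv(normcdf(Zobs(:,2)), 0, 0.5);

rhoXY  = corr(X,Y);                        % target (Pearson) correlation
Xparms = gamfit(X);                        % [shape scale]
Yparms = lognfit(Y);                       % [mu sigma]

% Draw the independent normals once and reuse them for every trial rhoZ.
BIGN = 200000;
E = randn(BIGN,2);
Xsim = gaminv(normcdf(E(:,1)), Xparms(1), Xparms(2));

% Z2 = rhoZ*E1 + sqrt(1-rhoZ^2)*E2 has correlation rhoZ with Z1 = E1.
simCorr = @(rhoZ) corr(Xsim, ...
    logninv(normcdf(rhoZ*E(:,1) + sqrt(1-rhoZ^2)*E(:,2)), Yparms(1), Yparms(2)));

% Find the rhoZ whose simulated correlation equals the observed rhoXY.
rhoZ = fzero(@(r) simCorr(r) - rhoXY, [-0.99 0.99]);
GivesCorr = simCorr(rhoZ)
```

If rhoXY lies outside the range of correlations that these two marginals can reach, fzero will stop with an error about the interval endpoints instead of returning a value.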
Now, using the rhoZ that you identified above, you can generate your X,Y random samples with the n you really want:
n = 1000;
Z = mvnrnd([0 0],[1 rhoZ;rhoZ 1], n);
U = normcdf(Z);
Xrnd = gaminv(U(:,1),Xparms(1),Xparms(2));
Yrnd = logninv(U(:,2),Yparms(1),Yparms(2));
I think the Xrnd and Yrnd that will result from this process will be what you want.
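One way to check that is to compare the generated samples against the target marginals. Below is a rough sketch with placeholder values for Xparms, Yparms, and rhoZ (they are assumptions for illustration, not values from your data):

```matlab
% Hypothetical values standing in for the fitted parameters and the tuned rhoZ.
Xparms = [2 1];      % gamma shape and scale, in the order gamfit returns them
Yparms = [0 0.5];    % lognormal mu and sigma, in the order lognfit returns them
rhoZ   = 0.35;

rng(2);              % fixed seed so the run is repeatable
n = 1000;
Z = mvnrnd([0 0],[1 rhoZ;rhoZ 1], n);
U = normcdf(Z);
Xrnd = gaminv(U(:,1),Xparms(1),Xparms(2));
Yrnd = logninv(U(:,2),Yparms(1),Yparms(2));

% Theoretical means of the target marginals.
mX = gamstat(Xparms(1),Xparms(2));
mY = lognstat(Yparms(1),Yparms(2));
fprintf('X mean: sample %.3f, target %.3f\n', mean(Xrnd), mX);
fprintf('Y mean: sample %.3f, target %.3f\n', mean(Yrnd), mY);

% One-sample Kolmogorov-Smirnov tests against each target CDF
% (h = 0 means the sample is consistent with that distribution at the 5% level).
hX = kstest(Xrnd, 'CDF', [Xrnd gamcdf(Xrnd,Xparms(1),Xparms(2))])
hY = kstest(Yrnd, 'CDF', [Yrnd logncdf(Yrnd,Yparms(1),Yparms(2))])

% Correlation actually achieved in this sample.
SampleCorr = corr(Xrnd,Yrnd)
```

With n = 1000 the sample correlation will still wobble a bit from run to run, so expect it to land near the target rather than exactly on it.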

More Answers (1)

The question is a little confusing because the U values are from 0 to 1 but the Q values aren't.
Q(:,1) looks like a gamma, but Q(:,2) doesn't look like either a Gumbel or a lognormal, because exp(U(:,2)) can only take values between 1 and e.
Maybe you want:
Q = [gaminv(U(:,1),2,1) exp(Z(:,2))]; % to get variable 2 as a lognormal
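To see the difference, here is a small sketch (the 0.35 correlation and the variable names Ya, Yb, Yc are placeholders). It shows that exp(Z(:,2)) agrees with the inverse-CDF route logninv(U(:,2),0,1), while exp(U(:,2)) stays confined between 1 and e:

```matlab
rng(3);                                   % fixed seed so the run is repeatable
Z = mvnrnd([0 0],[1 0.35;0.35 1], 1000);  % 0.35 is only a placeholder correlation
U = normcdf(Z);

Ya = exp(Z(:,2));             % lognormal with mu = 0, sigma = 1
Yb = logninv(U(:,2), 0, 1);   % same values via the inverse lognormal CDF
Yc = exp(U(:,2));             % not lognormal: U is in (0,1), so Yc is in (1,e)

maxRelDiff = max(abs(Ya - Yb) ./ Ya)      % should be at rounding-error level
rangeYc    = [min(Yc) max(Yc)]
```

The logninv form is the more general one, since it lets you plug in any mu and sigma instead of the implicit 0 and 1.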

5 Comments

n=1000;
X=xlsread('XXXXXX.xlsx');
Y=xlsread('YYYYYY.xlsx');
rho=corr(X,Y)
Z = mvnrnd([0 0],[1 rho;rho 1], n);
Q = [gaminv(Z(:,1),2,1) exp(Z(:,2))];
I mean, technically, I assume I could also just remove normcdf entirely?
But what if I added my X data, which is load (kN/m2), and my Y data (days) to my Q row?
Would I do something like:
d=gamfit(X)
c=lognfit(Y)
and multiply them with the Q row? What would that look like?
PS: The whole idea is based on this example:
With the difference being that they just use random numbers, while I want to combine it with my own input data
Not sure I understand what you mean by "combine it with my own data." Is this a fair summary of your situation?
  1. You have observed values of X & Y pairs, where X seems to be a gumbel and Y seems to be a lognormal.
  2. The correlation of the observed values is rhoXY.
  3. For simulation purposes, you want to generate new X/Y pairs from the same 2 distributions that will give approximately the same correlation.
In this summary, your own data are only used to estimate distribution parameters and correlation, but maybe you mean something more by "combine with".
Side note: the rhoXY that you compute from your non-normal data is probably not the value that you want to pass to mvnrnd. Due to the nonlinear transformation, the generated XY data will have a different correlation than the mvn rho.
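If matching a rank correlation is acceptable instead of the Pearson correlation, the mismatch can be avoided, because rank correlations are unchanged by the monotone gaminv and logninv transformations. For a Gaussian copula, Spearman's rho and the mvn rho are related by rho = 2*sin(pi*rhoS/6). Here is a minimal sketch with made-up stand-in data (replace X and Y with the observed values); note that it matches Spearman's rho, not the Pearson rhoXY:

```matlab
% Hypothetical stand-in data; replace with the observed X and Y columns.
rng(4);                                   % fixed seed so the run is repeatable
Zobs = mvnrnd([0 0],[1 0.5;0.5 1], 2000);
X = gaminv(normcdf(Zobs(:,1)), 2, 1);
Y = logninv(normcdf(Zobs(:,2)), 0, 0.5);

% Observed rank correlation and the mvn rho it implies for a Gaussian copula.
rhoS = corr(X, Y, 'Type', 'Spearman');
rhoZ = 2*sin(pi*rhoS/6);

% Fit the marginals and simulate with that rhoZ (no trial and error needed).
Xparms = gamfit(X);
Yparms = lognfit(Y);
U = normcdf(mvnrnd([0 0],[1 rhoZ;rhoZ 1], 100000));
Xrnd = gaminv(U(:,1),Xparms(1),Xparms(2));
Yrnd = logninv(U(:,2),Yparms(1),Yparms(2));

% The simulated rank correlation should be close to the observed one.
simRhoS = corr(Xrnd, Yrnd, 'Type', 'Spearman')
```

The Pearson correlation of Xrnd and Yrnd will generally still differ somewhat from rhoXY, so this is only a substitute if the rank correlation is what you care about.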
Sorry for being unclear. The purpose is to write a program for a Monte Carlo simulation while also considering the correlation between data X and Y.
So with this program I want to compute results from my data against random inputs I generate with mvnrnd.
From the example I referred to, I gathered that I have to account for the correlation by including the rho value, as that's how they did it. I am probably misunderstanding something essential in the example I am referring to, which makes it more confusing.
Writing a program for Monte Carlo simulation could mean 100 different things... you are not specifying any of the details, and these matter.
Let me try again: see my question about (1) - (3) above. Is this a fair summary of your situation? Please just answer yes if it is, and explain why not if it isn't.
1. Yes, although I mistakenly wrote Gumbel but meant gamma. Sorry about that!
2. Yes
3. Yes

Asked: 5 Dec 2019

Commented: 6 Dec 2019
