Asked by Michael Ziedalski
on 21 Feb 2018

Hello, all. I am still gaining experience with MATLAB and am currently trying to generate pairs of random numbers, x and y, such that their sum is <= 1. The naive way I immediately thought to do this would be by 1) generating x within the range [0, .5], and 2) regenerating y until it is < x, which would guarantee my condition but introduce some significant statistical bias.

Is there some standard, statistically robust way to do this, guys? I would be very grateful for any of your input on this matter.

Answer by Roger Stafford
on 22 Feb 2018

Edited by Roger Stafford
on 22 Feb 2018

Accepted Answer

(Corrected) Assuming you restrict x and y to non-negative values, the set of x and y values for which x+y<=1 would be a triangular area in the xy plane. You can obtain an area-wise uniform distribution of x and y with the following.

For generating a single pair:

x = 1-sqrt(rand);

y = (1-x)*rand;

To get row vectors with n elements each:

x = 1-sqrt(rand(1,n));

y = (1-x).*rand(1,n);
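As a rough sanity check on the vectorized version, the sketch below compares it against plain rejection sampling, which is a different technique from the one in this answer: draw points uniformly in the unit square and keep only those with x+y<=1. The seed, the sample size, and the names `c`, `keep`, `xr`, `yr` are illustrative assumptions. Because the centroid of the triangle is at (1/3, 1/3), both methods should report sample means close to 1/3.

```matlab
rng(0);                          % fixed seed (assumption) so the run is repeatable
n = 1e5;                         % illustrative sample size

% Direct method from the answer above
x = 1 - sqrt(rand(1, n));
y = (1 - x) .* rand(1, n);

% Rejection sampling: uniform points in the unit square, keep those with x+y <= 1
c = rand(2, 4*n);                % candidate points, one per column
keep = sum(c, 1) <= 1;           % roughly half of the candidates survive
xr = c(1, keep);
yr = c(2, keep);

fprintf('direct:    mean(x) = %.3f, mean(y) = %.3f\n', mean(x), mean(y));
fprintf('rejection: mean(x) = %.3f, mean(y) = %.3f\n', mean(xr), mean(yr));
```

Rejection sampling discards about half of its candidates, whereas the direct method uses every draw.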

Answer by Jeff Miller
on 21 Feb 2018

It isn't entirely clear what joint distribution you want for (x,y), but here is one possibility:

x = rand; % uniform 0 to 1

y = (1-x)*rand; % uniform 0 to 1-x
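To see which joint distribution this gives, here is a small sketch that compares sample means for this method and for the square-root variant x = 1-sqrt(rand). The seed, the sample size, and the names `xa`, `ya`, `xb`, `yb` are illustrative assumptions. With x uniform on [0,1], the expected values are E[x] = 1/2 and E[y] = 1/4, so the points are not spread evenly over the triangle; the square-root variant gives 1/3 for both.

```matlab
rng(2);                          % fixed seed (assumption) so the run is repeatable
n = 1e5;                         % illustrative sample size

% Method in this answer: x uniform on [0,1], y uniform on [0,1-x]
xa = rand(1, n);
ya = (1 - xa) .* rand(1, n);

% Square-root variant: area-wise uniform over the triangle
xb = 1 - sqrt(rand(1, n));
yb = (1 - xb) .* rand(1, n);

fprintf('x uniform:  mean(x) = %.3f, mean(y) = %.3f\n', mean(xa), mean(ya));
fprintf('sqrt form:  mean(x) = %.3f, mean(y) = %.3f\n', mean(xb), mean(yb));
```

Both versions satisfy x+y<=1; they differ only in how densely the points fall near the x = 1 corner.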

Michael Ziedalski
on 22 Feb 2018

Thank you for going the extra mile and providing these nice graphs! What happened in the first graph was what I suspected would happen if one used some naive way of ensuring the condition. Would you care to elaborate on the math that led you to take the square root of the random number? Or on how you knew mathematically that the area would be a triangle characterized by those points?

I can intuitively see why a triangle describes the area here, but in higher dimensions it would be difficult, and I would very much appreciate a pointer in the right direction on the math involved.

Roger Stafford
on 22 Feb 2018

Michael: I will attempt to answer your question. Draw a vertical line at a value x and consider the area of the triangle to the right of the line. It is proportional to (1-x)^2, so the probability of choosing an x to the right should be proportional to (1-x)^2:

p = k*(1-x)^2

and setting x = 0, where the whole triangle lies to the right of the line and so p = 1, it is clear that k equals 1. Hence

1-x = sqrt(p)

We replace p by Matlab's 'rand', which then plays the role of p, to get:

1-x = sqrt(rand)

x = 1-sqrt(rand)

That is, let r1 < r2 be two possible values of rand. Then the probability of rand lying between them is r2-r1. The corresponding values of x are x1 = 1-sqrt(r1) and x2 = 1-sqrt(r2) and we have r1 = (1-x1)^2 and r2 = (1-x2)^2. Hence the probability of lying between x1 and x2 is

r2-r1 = (1-x2)^2-(1-x1)^2

which is what we wish to achieve since that is proportional to the area between x1 and x2 under the line x+y=1.
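The relation p = (1-x)^2 can also be checked numerically. The sketch below, with an assumed seed, sample size, and a hand-picked set of thresholds `t`, compares the fraction of samples with x > t against (1-t)^2.

```matlab
rng(1);                          % fixed seed (assumption) so the run is repeatable
n = 1e5;                         % illustrative sample size
x = 1 - sqrt(rand(1, n));

t = [0.1 0.25 0.5 0.75];         % thresholds chosen for illustration
empirical = arrayfun(@(tk) mean(x > tk), t);   % fraction of samples right of each t
predicted = (1 - t).^2;                        % area fraction right of each t

% Rows: threshold, empirical fraction, predicted fraction
disp([t; empirical; predicted]);
```

The second and third rows of the displayed matrix should agree to about two decimal places at this sample size.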

[For higher n-dimensional "triangles", otherwise known as 'simplexes', the answer will of course be different and involve n-th roots.]
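One way the n-th root idea might be carried out is sketched below; it is an extrapolation of the argument above, not code from this thread. It assumes the same reasoning applies one coordinate at a time: with m coordinates left to fill, the fraction of the remaining budget taken by the next coordinate is 1 - rand^(1/m). The names `d`, `N`, `P`, `remaining`, and `frac` are illustrative. For d = 2 this reduces to x = 1-sqrt(rand), y = (1-x)*(1-rand), which has the same distribution as the two-dimensional recipe.

```matlab
rng(0);                          % fixed seed (assumption) so the run is repeatable
d = 3;                           % dimension of the simplex (illustrative)
N = 1e5;                         % number of sample points
P = zeros(N, d);                 % each row is one point with sum(P(i,:)) <= 1
remaining = ones(N, 1);          % part of the total "1" not yet used up

for k = 1:d
    m = d - k + 1;                       % coordinates still to fill, including this one
    frac = 1 - rand(N, 1).^(1/m);        % m-th root step, generalizing 1-sqrt(rand)
    P(:, k) = remaining .* frac;         % scale into what is left
    remaining = remaining - P(:, k);
end

fprintf('coordinate means: %s\n', mat2str(mean(P), 3));
```

If the points are uniform over the simplex, each coordinate mean should be close to 1/(d+1), that is 0.25 for d = 3.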

César
on 26 Mar 2019

Dear @Roger Stafford,

Could you please also explain how y = (1-x).*rand(N,1) gives the correct distribution?
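A short derivation that may help here, using only the relation p = (1-x)^2 from the explanation above. The symbols f_X, f_{Y|X}, and f_{X,Y} are notation introduced for this sketch. For a fixed x, the vertical slice of the triangle is a segment of length 1-x, and a point spread evenly over the triangle must be spread evenly along that segment, which is what (1-x)*rand produces.

```latex
\begin{align*}
P(X > x) &= (1-x)^2
  &&\Rightarrow\quad f_X(x) = -\frac{d}{dx}(1-x)^2 = 2(1-x), \quad 0 \le x \le 1,\\
Y \mid X = x &\sim \text{Uniform}(0,\, 1-x)
  &&\Rightarrow\quad f_{Y\mid X}(y \mid x) = \frac{1}{1-x}, \quad 0 \le y \le 1-x,\\
f_{X,Y}(x,y) &= f_X(x)\, f_{Y\mid X}(y \mid x) = 2(1-x)\cdot\frac{1}{1-x} = 2
  &&\text{on } \{x \ge 0,\ y \ge 0,\ x+y \le 1\}.
\end{align*}
```

The joint density is the constant 2, which is 1 divided by the triangle's area of 1/2, so the pair (x, y) is uniform over the triangle.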
