How to plot bivariate cumulative probability centred to a point?

Hi,
I am trying to create a probability contour plot from a scatter plot, centred on the area with the highest density.
Basically going from this: [image: scatter plot of the data]
To this: [image: hand-drawn nested contours around the densest area]
Where:
Probability for a point to be in the red area is 68%
Probability for a point to be in the yellow area is 95%
Probability for a point to be in the green area is 99.7%
Probability for a point to be in the blue area is 99.99%
(I am not sure about my colour choices; I should probably have drawn it with blue being 68% and red being 99.99%.)
I have tried to use histcounts2 using the following code:
x = randn(1000,1) + randn(1000,1);
y = randn(1000,1) + randn(1000,1);
X = [x,y];
N=histcounts2(x,y);
N1 = histcounts2(x,y,'Normalization','countdensity');
N2 = histcounts2(x,y,'Normalization','cumcount');
N3 = histcounts2(x,y,'Normalization','probability');
N4 = histcounts2(x,y,'Normalization','pdf');
N5 = histcounts2(x,y,'Normalization','cdf');
figure
subplot(3,3,1);
scatter(x,y, '.');
title('Scatter plot of the data');
subplot(3,3,2); %to visualize the density for each bin
h = histogram2(x,y,'DisplayStyle','tile','ShowEmptyBins','on');
title('Density plot from the data - using histogram2');
subplot(3,3,3);
contour(N);
colorbar
title('contour plot - using histcounts2 - default normalization');
subplot(3,3,4);
contour(N1);
colorbar
title('contour plot - using histcounts2 - countdensity normalization');
subplot(3,3,5);
contour(N2);
colorbar
title('contour plot - using histcounts2 - cumcount normalization');
subplot(3,3,6);
contour(N3);
colorbar
title('contour plot - using histcounts2 - probability normalization');
subplot(3,3,7);
contour(N4);
colorbar
title('contour plot - using histcounts2 - pdf normalization');
subplot(3,3,8);
contour(N5);
colorbar
title('contour plot - using histcounts2 - cdf normalization');
I get the following outcome from running this code (note that outcomes might vary due to the random nature of the first two lines of code):
N =
0 0 0 0 0 1 0 0 0 0
0 0 0 1 0 1 0 0 0 0
0 0 1 7 2 3 1 1 0 0
0 1 1 15 15 14 12 0 0 0
2 3 7 31 43 40 32 10 1 0
0 2 22 40 65 64 46 22 4 1
0 3 22 42 71 61 38 8 5 1
0 3 7 21 46 51 25 11 1 1
0 0 3 6 15 11 10 6 0 0
0 0 2 2 5 1 4 2 0 0
0 0 0 1 2 1 0 1 0 0
The result is as expected, since 'cumcount' and 'cdf' start their cumulative count from the origin of the matrix (cf. MATLAB documentation: histcounts2 -> Output arguments -> N).
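The corner-anchored behaviour described above can be checked with a short sketch. It assumes the documented meaning of 'cumcount' (each bin holds the count of itself plus all previous bins in both dimensions); the fixed seed and the variable names Ncum, matchesCumsum and lastBinIsTotal are only illustrative.

```matlab
rng(0);                                  % fixed seed for a repeatable run (assumption)
x = randn(1000,1) + randn(1000,1);
y = randn(1000,1) + randn(1000,1);

N  = histcounts2(x, y);                                 % raw counts per bin
N2 = histcounts2(x, y, 'Normalization', 'cumcount');    % cumulative counts

% Accumulate the raw counts down the rows (x bins), then across the
% columns (y bins), always starting from element (1,1).
Ncum = cumsum(cumsum(N, 1), 2);

matchesCumsum  = isequal(N2, Ncum);          % compares both matrices exactly
lastBinIsTotal = (N2(end, end) == numel(x)); % far corner should hold every point
```

If both flags come out true, the accumulation is anchored at the corner of the matrix, not at the densest bin, which is why the contours of N2 and N5 do not close around the peak.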
Is there any way to make 'cumcount' or 'cdf' accumulate outwards from the densest area, so that I get an output similar to the hand-drawn one? Basically, I would like to centre it on (5,7) for N (row 7, column 5 of the matrix), as that bin holds its largest value (71).
Thank you in advance,
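One possible way to accumulate from the densest area outwards, offered as a minimal sketch and not as a built-in histcounts2 option: rank the bins from densest to sparsest, take the running sum of their probability mass, and contour that running sum. This swaps the corner-anchored cumulative count for a highest-density-region style ranking. The seed, the names P, C, xc, yc and the chosen levels are assumptions, and the sketch relies on the default bins being of equal size, so that ranking by probability is the same as ranking by density.

```matlab
rng(0);                                  % fixed seed for a repeatable run (assumption)
x = randn(1000,1) + randn(1000,1);
y = randn(1000,1) + randn(1000,1);

% Probability mass per bin (sums to 1) and the bin edges.
[P, xEdges, yEdges] = histcounts2(x, y, 'Normalization', 'probability');

% Rank bins from densest to sparsest and accumulate their mass.
[pSorted, order] = sort(P(:), 'descend');
cumMass = cumsum(pSorted);

% C(i,j) = total mass of all bins ranked at or above bin (i,j).
C = zeros(size(P));
C(order) = cumMass;

% Bin centres for plotting.
xc = (xEdges(1:end-1) + xEdges(2:end)) / 2;
yc = (yEdges(1:end-1) + yEdges(2:end)) / 2;

levels = [0.68 0.95 0.997];              % illustrative levels from the question
figure
scatter(x, y, '.');
hold on
contour(xc, yc, C.', levels);            % transpose: rows of P index the x bins
hold off
colorbar
title('Cumulative probability accumulated from the densest bin outwards');
```

With only 1000 points and coarse bins the contours will be ragged, and bins with tied counts are ranked in an arbitrary order. Smoothing the density first (for example with a kernel density estimate) would likely give cleaner regions, but that is a separate design choice.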
Benoit Espinola on 7 Aug 2018
Would it make sense if I did this:
x = randn(1000,1) + randn(1000,1);
y = randn(1000,1) + randn(1000,1);
X = [x,y];
N = 1-histcounts2(x,y,'Normalization','pdf');
figure();
contour(N);
colorbar
?
I get this: [image: contour plot of 1 - pdf]
Benoit Espinola on 7 Aug 2018
Running the code from my previous comment several times, I get this figure: [image: contour plot with two separate areas at the 0.93 level]
This implies I have two areas, each with 93% probability. I would expect the two areas combined to have a 93% probability of containing a point of the scatter plot, not each area individually.
Am I wrong?
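A short sketch may help separate the two readings. With 'pdf' normalization the bin values are densities (count divided by total count and bin area), so 1 - pdf is one minus a density, not a cumulative probability. The sketch below computes how much of the sample actually falls in the bins on the high-density side of a given level, summed over all such bins whether or not they form one connected area. The seed, the example level of 0.93, and the names D, P, binArea, inside and massInside are assumptions for illustration.

```matlab
rng(0);                                  % fixed seed for a repeatable run (assumption)
x = randn(1000,1) + randn(1000,1);
y = randn(1000,1) + randn(1000,1);

% Density per bin, then probability mass on the same bin edges.
[D, xEdges, yEdges] = histcounts2(x, y, 'Normalization', 'pdf');
P = histcounts2(x, y, xEdges, yEdges, 'Normalization', 'probability');

% Area of every bin; density times area recovers the mass of each bin.
binArea   = diff(xEdges(:)) * diff(yEdges(:)).';
totalMass = sum(D(:) .* binArea(:));     % should integrate to 1

level  = 0.93;                           % example level read off the 1 - pdf plot
inside = (1 - D) <= level;               % bins whose density is at least 1 - level
massInside = sum(P(inside));             % fraction of points in those bins combined
```

Comparing massInside with the level shows whether a label such as 0.93 on the 1 - pdf plot can be read as a probability; in this construction it is simply the contour where the density equals 0.07.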
