Normalization of probability distribution function

Question

0 votes

I'm trying to obtain a probability distribution curve with an area equal to 1. I have a dataset of about 3 million values ranging from 0 to 3.5, but this range changes depending on other input parameters that are irrelevant to my question. I'm basically trying to assign probabilities from 0 to 1 based on experimental data so that I can later apply a Monte Carlo method, but I can't figure out why the area under my distribution curve is much larger than 1, and I can't seem to find a way to normalize it. Here is the code pertaining to the issue and a snippet of the figure I obtain. Thank you so much in advance to anyone who helps.

EDIT: The value for the variable "area" is actually 1, I just can't seem to visualize how the probabilities amount to 1 in the y-axis.

% Compute the kernel density estimation for electron energy distribution
% from main ion
[f,xi] = ksdensity(Electron_info(:,2));

% Compute the area under the curve
area = trapz(xi,f);

% Normalize the kernel density estimate by the area
f_norm = f/area;

% plot the KDE curve
figure(1)
plot(xi, f_norm,'r')
hold on

% Set plot properties
legend('Ion excitation')
xlabel('Electron Energy (eV)')
ylabel('Probability Density')
title('Electron Energy Distribution')
xlim([0 3.5])
set(gcf, 'Color','w')


Answer 1

the cyclist on 25 Apr 2023

1 vote

It might help if you uploaded your data, so that we can run your code.

To me, eyeballing your curve does look like it has area 1. It's a little bit of guesswork to try to figure out why you don't perceive that. Is it because you have a peak near 0.8? Remember, that peak is sharp and narrow -- running over a range of x from about 2.6 to 3 (and not all the y values there are as high as 0.8). That peak contributes perhaps about 0.25 to the area.

The broad, flattish region contributes about 0.3*(2.5-0.5) = 0.6.

The left peak contributes about 0.3*0.5 = 0.15.

Those three pieces add up to about 0.25 + 0.6 + 0.15 = 1, so I see no problem.
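The eyeball estimate above can be checked numerically by integrating the density over each region separately. The sketch below is only illustrative: it uses synthetic data as a stand-in for Electron_info(:,2), which was not posted, and the region boundaries 0.5 and 2.5 are assumptions read off the description above, not values taken from the real curve.

```matlab
% Synthetic stand-in for Electron_info(:,2); the real data were not posted.
rng(0);                                   % reproducible example
samples = [0.25 + 0.08*randn(1500,1);     % narrow left peak
           0.5  + 2.0*rand(6000,1);       % broad, flat region
           2.8  + 0.08*randn(2500,1)];    % narrow right peak

[f, xi] = ksdensity(samples);             % density estimate on the default grid

% Cumulative area from the left edge of xi up to each grid point.
cumArea = cumtrapz(xi, f);

% Area of each region = difference of the cumulative area at its edges.
edges = [xi(1) 0.5 2.5 xi(end)];          % assumed region boundaries
areaAtEdges = interp1(xi, cumArea, edges);
regionArea = diff(areaAtEdges);

fprintf('left: %.3f  middle: %.3f  right: %.3f  total: %.3f\n', ...
        regionArea, sum(regionArea));
```

The total should come out close to 1 even though the curve's peak values are nowhere near summing to 1, because each region's contribution is height times width.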

2 Comments

Jorge Fernandez on 25 Apr 2023

First off, thank you for the answer, and second, you are correct. What I was trying to do was obtain a function where the values of y correspond to the probability of finding a value of x; i.e., if I look at x = 5 and see where this value intersects the function, the corresponding y would be the probability.

the cyclist on 25 Apr 2023

Edited: the cyclist on 25 Apr 2023


For continuous distributions, the probability of getting any exact, individual point (e.g. x=5) is zero. This can be a tricky point to grasp at first. It might help to realize that there are infinitely many x values, so if each had a finite, nonzero probability, then the total probability would be infinite.

Instead, you use the probability density function (which is what you have) and estimate the probability of a range of points by using the area under the probability density over that range.
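Reading a probability off as an area can be sketched as follows. This is a minimal example under assumptions: the data are synthetic (uniform on 0 to 3.5) because the original dataset was not shared, and the interval limits a and b are arbitrary illustrative values.

```matlab
% Synthetic stand-in for the measured energies (hypothetical data).
rng(1);
samples = 3.5 * rand(20000, 1);

[f, xi] = ksdensity(samples);             % density estimate on the default grid

% Probability that a value falls between a and b: area under f on [a, b].
a = 1.0;
b = 1.5;
xq = linspace(a, b, 200);                 % fine grid inside the interval
fq = interp1(xi, f, xq);                  % density interpolated onto that grid
pInterval = trapz(xq, fq);

% Cross-check against the raw data: fraction of samples inside [a, b].
pEmpirical = mean(samples >= a & samples <= b);

fprintf('P(%.1f <= x <= %.1f): KDE %.3f, empirical %.3f\n', ...
        a, b, pInterval, pEmpirical);
```

Shrinking the interval toward a single point drives both numbers toward zero, which is the behaviour described above.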

If you have a discrete distribution, then you could plot the probability itself, such as

x = [1 2 3];
p = [0.2 0.5 0.3];
bar(x,p)
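For continuous data, one way to get y-values that are actual probabilities is to bin the data first, so that each bar gives the probability of landing in that bin. The sketch below is an illustration only: the data are synthetic, and the bin width of 0.25 is an arbitrary assumption.

```matlab
rng(2);
samples = 3.5 * rand(20000, 1);           % hypothetical stand-in data

edges = 0:0.25:3.5;                       % bin edges (width chosen arbitrarily)
p = histcounts(samples, edges, 'Normalization', 'probability');

centers = edges(1:end-1) + diff(edges)/2; % bin centers for plotting
bar(centers, p, 1)                        % bar height = probability of that bin
xlabel('Electron Energy (eV)')
ylabel('Probability per bin')

fprintf('Sum of bin probabilities: %.4f\n', sum(p));
```

Note that these probabilities depend on the bin width: halving the width roughly halves each bar, whereas the probability density does not change.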


Answer 2

Torsten on 25 Apr 2023

Edited: Torsten on 25 Apr 2023


2 votes

I think the kernel density is already normalized ...

From the documentation:

[f,xi] = ksdensity(x) returns a probability density estimate, f, for the sample data in the vector or two-column matrix x. The estimate is based on a normal kernel function, and is evaluated at equally-spaced points, xi, that cover the range of the data in x. ksdensity estimates the density at 100 points for univariate data, or 900 points for bivariate data.

If you want to see the cumulative area under the curve, use

[f,xi] = ksdensity(Electron_info(:,2),'Function','cdf');
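Since the stated goal is a later Monte Carlo step, the estimated CDF can also be used to draw random energies by inverse-transform sampling. The following is a sketch under assumptions, not the poster's method: the data are synthetic, and interpolating the gridded CDF with interp1 is only one of several ways to invert it.

```matlab
rng(3);
samples = [0.3 + 0.1*randn(5000,1); ...
           2.8 + 0.1*randn(5000,1)];      % hypothetical two-peak data

% Estimated cumulative distribution at ksdensity's default grid points.
[F, xi] = ksdensity(samples, 'Function', 'cdf');

% interp1 needs distinct sample points, so drop repeated F values.
[Fu, idx] = unique(F);
xu = xi(idx);

% Draw uniform numbers within the range covered by the CDF, then invert it.
nDraw = 10000;
u = Fu(1) + (Fu(end) - Fu(1)) * rand(nDraw, 1);
draws = interp1(Fu, xu, u);

fprintf('Last CDF value (total area): %.4f\n', F(end));
fprintf('Mean of data: %.3f, mean of draws: %.3f\n', mean(samples), mean(draws));
```

The last CDF value being close to 1 is another way of seeing that the density is already normalized.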

2 Comments

Jorge Fernandez on 25 Apr 2023

First off, thanks for the answer. You are indeed correct; however, what I'm trying to obtain (if possible) is a function where the values of y correspond to the probability of finding a value of x; i.e., if I look at x = 5 and see where this value intersects the function, the corresponding y would be the probability.

Torsten on 25 Apr 2023

The probability density function gives information about the probability for an interval of x-values. For a continuous distribution, the probability of getting any single x-value is always 0.
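In symbols, writing f for the probability density of a continuous random variable X (the notation here is introduced only for illustration), the statements above read:

```latex
\begin{align*}
  P(a \le X \le b) &= \int_a^b f(x)\,dx, \\
  \int_{-\infty}^{\infty} f(x)\,dx &= 1, \\
  P(X = x_0) &= \int_{x_0}^{x_0} f(x)\,dx = 0, \\
  f(x_0) &\approx \frac{P(x_0 \le X \le x_0 + \Delta x)}{\Delta x}
  \quad \text{for small } \Delta x .
\end{align*}
```

The last line is why the y-axis is labelled as a density: its values are probabilities per unit of x, so they are not bounded by 1 and do not need to sum to 1.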
