Finding entropy from a probability distribution
Dear all,
I am trying to find the distribution of a random variable using the "hist" command. I get the distribution, but I want to calculate the entropy from that histogram. Can anyone please help me solve this? Any help would be greatly appreciated. Thanks in advance.
Accepted Answer
the cyclist
on 27 Jan 2012
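x = randn(100000,1); % Sample data: draws from a standard normal
[counts,binCenters] = hist(x,100); % Counts and centers for 100 equal-width bins
binWidth = diff(binCenters); % Spacing between adjacent bin centers (99 values)
binWidth = [binWidth(end),binWidth]; % Replicate last bin width for first, which is indeterminate.
nz = counts>0; % Index to non-zero bins
frequency = counts(nz)/sum(counts(nz)); % Fraction of samples in each non-empty bin
H = -sum(frequency.*log(frequency./binWidth(nz))) % Entropy estimate in nats (natural log)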
It seems that the most common references (e.g., Wikipedia!) assume a discrete random variate (with a specified probability mass function), rather than a discrete approximation to a continuous variate. In that case, the "bin width" is effectively 1. Here is a reference that discusses the case of non-unit bin width and has the formula that I used as the basis for the above calculation: http://www2.warwick.ac.uk/fac/soc/economics/staff/academic/wallis/publications/entropy.pdf
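For anyone wondering where the division by bin width comes from, here is a short sketch of the reasoning as I read it from the code above; the symbols p_i (for frequency), w_i (for binWidth) and f_i are my own labels, not notation taken from the linked paper. The histogram is treated as a piecewise-constant density estimate, and the integral for differential entropy is replaced by a sum over bins:

```latex
% p_i = fraction of samples in bin i (frequency in the code)
% w_i = width of bin i (binWidth in the code)
% f_i = p_i / w_i = estimated density inside bin i
\begin{align*}
h(X) &= -\int f(x)\,\ln f(x)\,dx \\
     &\approx -\sum_i f_i \ln(f_i)\, w_i \\
     &= -\sum_i p_i \ln\frac{p_i}{w_i} \\
     &= -\sum_i p_i \ln p_i \;+\; \sum_i p_i \ln w_i
\end{align*}
```

When every w_i is 1, the second sum vanishes and this reduces to the usual discrete formula. As a rough sanity check, the differential entropy of a standard normal is (1/2) ln(2*pi*e), about 1.4189 nats, so the H computed above should land near that value, give or take sampling and binning error.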
6 Comments
the cyclist
on 28 Jan 2012
I frankly cannot explain the entire concept of entropy to you here. I think that a careful reading of this Wikipedia page is a good start: http://en.wikipedia.org/wiki/Entropy_(information_theory)
the cyclist
on 28 Jan 2012
Rather than my guessing at what you may have done incorrectly with histc(), maybe you could post a new question specifically about this? If you have not done so, I suggest you carefully read the documentation for hist() and histc(). The help files are very precise about how the calculations are done. For example, the help file for histc() is very specific about how it treats cases that land exactly on a bin edge.
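As a small illustration of the bin-edge point, here is a minimal sketch using made-up data (the values of x and edges are hypothetical, chosen only so that several points land exactly on an edge). It relies on the rule documented for histc(): a value is counted in bin k when edges(k) <= x(i) < edges(k+1), and the last bin counts only values equal to edges(end).

```matlab
% Hypothetical data: several values sit exactly on a bin edge.
x     = [0 1 1 2 3];
edges = [0 1 2 3];

% histc uses left-closed, right-open bins [edges(k), edges(k+1)),
% plus a final bin that counts values exactly equal to edges(end).
n = histc(x, edges);

% Show each edge next to the count assigned to its bin.
disp([edges; n])
```

Here the two 1s are counted in the bin that starts at 1, not the bin that ends at 1, and the 3 gets a bin of its own. If counts look shifted by one bin relative to hist(), this convention is a likely place to check.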