Histogram gives different BinCounts while appending BinEdges

B Yin on 29 Apr 2018
Dear all,
I am trying to shrink the range of BinEdges in a histogram plot while keeping the remaining bin edges the same as before. However, histogram returns different BinCounts for the bins whose edges are unchanged. This should not happen.
Best,
Binglun

Huina Mao on 29 Apr 2018
If you use the older function ‘hist’, you get the same results in every case. The newer function histogram appears to behave inconsistently here. MathWorks should double-check this.

Star Strider on 29 Apr 2018
You have different numbers of bins in each subplot. In subplot(3,1,1), you define 61 bin edges (60 bins), in subplot(3,1,2), 41 edges, and in subplot(3,1,3), 31 edges. Different numbers of bins are going to produce different bin counts.
You can verify this easily enough by creating the edge vectors and then looking at the length of each one:
E1 = (13.83: 0.0005 :13.86);
E2 = (13.83: 0.0005 :13.85);
E3 = (13.83: 0.0005 :13.845);
N = [numel(E1) numel(E2) numel(E3)]   % number of elements in each edge vector

B Yin on 30 Apr 2018
Thanks for your comment. As you can see, the bin edges 13.83: 0.0005 :13.845 are included in all three plots. The bin counts should be the same in this region, since these bin edges are the same.
In fact, the first two plots give the same bin counts. Only the third one is different. Why?
Best,
Binglun
Star Strider on 30 Apr 2018
Because of the way the colon operator calculates the intervals, the values are not exactly the same:
E1 = (13.83: 0.0005 :13.86);
E1a = (13.83: 0.0005 :13.86);
E2 = (13.83: 0.0005 :13.85);
E3 = (13.83: 0.0005 :13.845);
L1 = ismember(E1,E1a);
M1 = nnz(L1);
L2 = ismember(E1,E2);
M2 = nnz(L2);
L3 = ismember(E1,E3);
M3 = nnz(L3);
L4 = ismember(E2,E3);
M4 = nnz(L4);
For example, ‘E3’ has a length of 31 elements, although only 24 of them are the same as those in ‘E2’.
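One way to quantify the mismatch, as a minimal sketch based on the edge vectors above: compare the leading elements of ‘E2’ with ‘E3’ exactly, measure the largest difference, and then compare again within a tolerance. The names nExact, maxDiff and nTol, and the tolerance of 1e-12, are illustrative choices and not part of the original code.

```matlab
E2 = (13.83: 0.0005 :13.85);
E3 = (13.83: 0.0005 :13.845);
n  = numel(E3);                            % E3 spans the first n edges of E2

nExact  = nnz(E2(1:n) == E3);              % edges that match bit for bit
maxDiff = max(abs(E2(1:n) - E3));          % largest mismatch (rounding-level, if any)
nTol    = nnz(ismembertol(E3, E2, 1e-12)); % edges that match within a relative tolerance
```

If nTol equals numel(E3) while nExact is smaller, the edges differ only by floating-point rounding, which is still enough to move a data value that sits exactly on an edge into the neighbouring bin.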
B Yin on 2 May 2018
The use of
>> e1 = 13.83 + (0:60)*0.0005;
>> e2 = 13.83 + (0:40)*0.0005;
>> e3 = 13.83 + (0:30)*0.0005;
seems to work.
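A short check of why this construction helps, offered as a sketch: every edge is computed from the same expression, 13.83 + k*0.0005, so its value depends only on its index k and not on where the vector ends. The shared leading edges should therefore be identical bit for bit. The name sharedEqual is introduced here only for illustration.

```matlab
e1 = 13.83 + (0:60)*0.0005;
e2 = 13.83 + (0:40)*0.0005;
e3 = 13.83 + (0:30)*0.0005;

% Each element depends only on its index, not on the vector length,
% so the first 31 edges of e1 and e2 coincide exactly with e3.
sharedEqual = isequal(e1(1:31), e3) && isequal(e2(1:31), e3);
```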
Then why do hist and histogram give different plots for the same interval? Which one gives the correct bin counts? See attachments.
Best,
Binglun

Philip Borghesani on 2 May 2018
The differences in the histograms are due to the slightly different algorithms used in the two functions. hist uses the input points as centers for the bins, and histogram uses them as edges. This is documented in each function's documentation.
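To illustrate the centers-versus-edges difference, here is a small sketch with made-up data. The vector x and the bin values are hypothetical, chosen so that no value sits on a bin boundary, and histcounts is used as the non-plotting counterpart of histogram. The same vector is passed to hist as centers and to histcounts as edges, and then matching edges are rebuilt halfway between the centers.

```matlab
x       = [1.1 1.9 2.2 2.4 3.7];   % hypothetical sample data
centers = 1:4;

nHist  = hist(x, centers);         % centers: bins split at 1.5, 2.5, 3.5 -> [1 3 0 1]
nEdges = histcounts(x, centers);   % edges: bins [1,2), [2,3), [3,4]      -> [2 2 1]

% Edges placed halfway between the centers reproduce the hist result.
edges  = 0.5:1:4.5;
nMatch = histcounts(x, edges);     % -> [1 3 0 1]
```

Neither result is wrong. The two functions simply interpret the same input vector differently, so the vector has to be converted before the plots can be compared.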
Your code forces the histogram routine to compare floating-point values for exact equality; this will eventually burn you in every situation.
A much better way to produce a histogram for your data is to offset the bin edges by half of your minimum quantization step.
h3=histogram( x, (13.83: 0.0005 :13.845) +0.00005);
In addition, your instrument appears to have some bias toward certain input values. Examine:
h3=histogram( x, (13.835: 0.0001 :13.845) +0.00005);
This is magnifying the effect of the comparison errors in the output histograms.
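A sketch of why the half-step offset is robust, using simulated data rather than the original measurements: x is assumed to lie on a 0.0001 grid, so after the 0.00005 offset every edge falls halfway between two possible data values, and rounding in the edges can no longer move a sample across a bin boundary. The simulated x, the seed, and the names edgesColon, edgesMult, cColon, cMult and sameCounts are assumptions made for illustration.

```matlab
rng(0);                                          % reproducible simulated data
x = 13.83 + randi([0 150], 1, 1000)*0.0001;      % hypothetical readings on a 0.0001 grid

edgesColon = (13.83: 0.0005 :13.845) + 0.00005;  % edges built with the colon operator
edgesMult  = 13.83 + (0:30)*0.0005 + 0.00005;    % edges built by multiplication

% The two edge vectors differ at most by rounding error, far less than the
% 0.00005 gap to the nearest data value, so the counts should agree.
cColon = histcounts(x, edgesColon);
cMult  = histcounts(x, edgesMult);
sameCounts = isequal(cColon, cMult);
```

With the offset in place, the way the edge vector is generated stops mattering, because no data value lies close enough to an edge for the rounding to change its bin.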