Using chi2gof to test two distributions
I want to use chi2gof to test whether two distributions come from a common distribution (null hypothesis) or not (alternative hypothesis). I have binned observational data (x), binned model data (y), and the bin edges (bins). Both the observational and the model data are counts per bin.
x= [41 22 11 10 9 5 2 3 2]
y= [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665]
bins=[0:9:81]
Because the data are already binned and because I'm testing x against y, I used the following code:
[h,p,stat]=chi2gof(x,'Edges',bins,'Expected',y)
Manual calculation of the chi2 test statistic gives 4.6861 with a probability of p=.7905. The above function, however, produces a very different result: the resulting stats show different bin edges than the ones I specified, the observed counts per bin do not match x, the chi2 test statistic is ~87, and p<0.001. Could someone please explain why I'm getting such dramatically different results?
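For reference, the manual calculation mentioned above can be sketched as follows. This assumes the usual Pearson statistic, sum((O-E).^2./E), and 8 degrees of freedom (9 bins minus 1, no fitted parameters). The post does not state the degrees of freedom, but this choice reproduces the quoted values.

```matlab
% Observed and expected counts per bin, copied from the question.
x = [41 22 11 10 9 5 2 3 2];
y = [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];

% Pearson chi-square statistic: squared deviations scaled by expected counts.
chi2stat = sum((x - y).^2 ./ y);

% Degrees of freedom: number of bins minus 1 (assumes no parameters
% were estimated from the data).
df = numel(x) - 1;

% Upper-tail probability of the chi-square distribution.
p = chi2cdf(chi2stat, df, 'upper');

fprintf('chi2 = %.4f, df = %d, p = %.4f\n', chi2stat, df, p);
```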
Accepted Answer
Jeff Miller
on 7 Feb 2019
Sorry, the x's really do have to be the data values. Try this:
bins=[0:9:81]
xvals = bins(1:end-1)+4.5; % Here are some fake data values that belong in each bin.
xcounts= [41 22 11 10 9 5 2 3 2] % These are the counts of the data values in each bin.
y= [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];
[h,p,stat]=chi2gof(xvals,'Edges',bins,'Expected',y,'Frequency',xcounts,'EMin',1)
This will give you your 4.68. By default, chi2gof pools neighboring bins whose expected count is less than 5. Setting 'EMin' to 1 lowers that threshold to 1, and since every expected count in y is above 1, no bins are merged.
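One way to check that no pooling happened is to compare the fields of the returned stats structure with the inputs. The sketch below repeats the call and inspects stat.edges, stat.O, and stat.df; the variable names binsKept and countsKept are just for illustration.

```matlab
bins    = 0:9:81;
xvals   = bins(1:end-1) + 4.5;   % one placeholder value at each bin center
xcounts = [41 22 11 10 9 5 2 3 2];
y       = [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];

[h, p, stat] = chi2gof(xvals, 'Edges', bins, 'Expected', y, ...
                       'Frequency', xcounts, 'EMin', 1);

% If no bins were merged, the edges and observed counts should come
% back unchanged and df should be numel(xcounts) - 1.
binsKept   = isequal(stat.edges(:), bins(:));
countsKept = isequal(stat.O(:), xcounts(:));

fprintf('edges kept: %d, counts kept: %d, df = %d, chi2 = %.4f\n', ...
        binsKept, countsKept, stat.df, stat.chi2stat);
```

Dropping the 'EMin',1 pair from the call and rerunning should show fewer bins in stat.O, because the bins with expected counts below 5 get pooled.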
More Answers (2)
Jeff Miller
on 6 Feb 2019
It looks like chi2gof expects the values in x to be the actual, original scores, not the bin counts. Try adding 'Frequency',x to the parameter list.
Sim
on 14 Aug 2024
Edited: Sim
on 14 Aug 2024
The CHI2TEST2 function (version 1.0.0.0) by James Meldrum performs the two-sample chi-square test.
The chi-squared test needs binned data. However, as far as I understand, you need to pass the raw data, not the binned data, as inputs to CHI2TEST2.
Indeed, CHI2TEST2 places the raw data into bins:
bins = unique([x1(:,1); x2(:,1)]); % create a bin for each unique value
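To see what that binning rule does, here is a small sketch on made-up samples. The data and the counting step with arrayfun are assumptions for illustration only and are not taken from CHI2TEST2.

```matlab
% Hypothetical raw samples (not from this thread).
x1 = [1; 2; 2; 3; 3; 3];
x2 = [2; 3; 3; 4; 4; 5];

% Same rule as the quoted line: one bin per unique value across both samples.
bins = unique([x1(:,1); x2(:,1)]);

% Count how many raw values of each sample fall in each bin
% (illustrative counting step, not CHI2TEST2's own code).
n1 = arrayfun(@(b) sum(x1 == b), bins);
n2 = arrayfun(@(b) sum(x2 == b), bins);

disp([bins n1 n2]);
```

With already-binned counts such as x and y from the question, this rule would treat each count as if it were a raw observation, which is why the raw values are needed.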