Using chi2gof to test two distributions

Question

Allie on 6 Feb 2019

https://ch.mathworks.com/matlabcentral/answers/443557-using-chi2gof-to-test-two-distributions

Commented: Allie on 7 Feb 2019

I want to use chi2gof to test whether two distributions come from a common distribution (null hypothesis) or not (alternative hypothesis). I have binned observational data (x), binned model data (y), and the bin edges (bins). Both the observational and model data are counts per bin.

x= [41 22 11 10 9 5 2 3 2]
y= [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665]
bins=[0:9:81]

Because the data are already binned and because I'm testing x against y, I used the following code:

[h,p,stat]=chi2gof(x,'Edges',bins,'Expected',y)

Manual calculation of the chi2 test statistic results in 4.6861 with a probability of p=.7905. The above function, however, produces a very different result. The resulting stat output shows different bin edges than the ones I specified, the observed counts per bin do not match x, the chi2 test statistic is ~87, and p<0.001. Could someone please explain why I'm getting such dramatically different results?
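For reference, the manual calculation mentioned above can be sketched as follows. This is an assumption about how the 4.6861 was obtained: it uses the Pearson statistic with degrees of freedom equal to the number of bins minus 1 (no parameters estimated from the data), which is consistent with the quoted values. chi2cdf requires the Statistics and Machine Learning Toolbox.

```matlab
% Observed and expected counts per bin, as given in the question
x = [41 22 11 10 9 5 2 3 2];
y = [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];

% Pearson chi-square statistic: sum over bins of (O - E)^2 / E
chi2stat = sum((x - y).^2 ./ y);

% Degrees of freedom: bins minus 1 (assumes no fitted parameters)
df = numel(x) - 1;

% Upper-tail probability of the chi-square distribution
p = chi2cdf(chi2stat, df, 'upper');

fprintf('chi2 = %.4f, df = %d, p = %.4f\n', chi2stat, df, p);
```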


Answer 1

Jeff Miller on 7 Feb 2019

https://ch.mathworks.com/matlabcentral/answers/443557-using-chi2gof-to-test-two-distributions#answer_360025


Sorry, the x's really do have to be the data values. Try this:

bins=[0:9:81]
xvals = bins(1:end-1)+4.5;   % Here are some fake data values that belong in each bin.
xcounts= [41 22 11 10 9 5 2 3 2]  % These are the counts of the data values in each bin.
y= [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];
[h,p,stat]=chi2gof(xvals,'Edges',bins,'Expected',y,'Frequency',xcounts,'EMin',1)

This will give you your 4.68. By default, chi2gof pools bins whose expected count is below 5; setting 'EMin' to 1 lowers that threshold to 1, so none of your bins are pooled (the smallest expected count here is 1.6275).
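One way to confirm that chi2gof used the bins and counts as intended is to inspect the returned stat structure. The following is a minimal sketch, assuming the same inputs as in the code above; the variable names edgesKept, countsKept, and chi2check are introduced here for illustration only.

```matlab
bins    = 0:9:81;
xvals   = bins(1:end-1) + 4.5;     % one representative value (midpoint) per bin
xcounts = [41 22 11 10 9 5 2 3 2]; % observed counts per bin
y       = [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];

[h, p, stat] = chi2gof(xvals, 'Edges', bins, 'Expected', y, ...
                       'Frequency', xcounts, 'EMin', 1);

% If no bins were pooled, the reported edges and observed counts
% should be the ones that were passed in
edgesKept  = isequal(stat.edges(:)', bins);
countsKept = isequal(stat.O(:)', xcounts);

% Recompute the statistic from the reported O and E as a cross-check
chi2check = sum((stat.O - stat.E).^2 ./ stat.E);

fprintf('edges kept: %d, counts kept: %d\n', edgesKept, countsKept);
fprintf('chi2stat = %.4f, recomputed = %.4f\n', stat.chi2stat, chi2check);
```

If edgesKept or countsKept comes back false, chi2gof has regrouped the bins, and the statistic is no longer comparable to a manual calculation on the original bins.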


Allie on 7 Feb 2019

This worked! Thank you


Answer 2

Jeff Miller on 6 Feb 2019

https://ch.mathworks.com/matlabcentral/answers/443557-using-chi2gof-to-test-two-distributions#answer_359843

It looks like chi2gof expects the values in x to be the actual, original scores, not the bin counts. Try adding 'Frequency',x to the parameter list.


Allie on 7 Feb 2019

Edited: Allie on 7 Feb 2019


This did not work. The stat output is below. As you can see, it changed the edges and expected values from what I originally input and the chi2stat became even bigger.

stat = 
    chi2stat: 234.4383
          df: 5
       edges: [0 9 18 27 36 45 81]
           O: [12 30 22 0 41 0]
           E: [38.0520 24.2655 15.4665 9.8595 6.2895 11.0670]
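The numbers in this output can be reproduced by hand, which shows what went wrong: with 'Frequency',x added, the bin counts in x were treated both as data values and as the weights of those values. The sketch below uses only base MATLAB functions. The pooling of bins 6 to 9 is hard-coded from the edges reported above and is not a reimplementation of chi2gof's internal pooling rule.

```matlab
x    = [41 22 11 10 9 5 2 3 2];  % bin counts, here misread as data values
y    = [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];
bins = 0:9:81;

% Bin index that each "data value" in x falls into
idx = discretize(x, bins);

% Weight each value by its frequency (also x) and total the weights per bin
O = accumarray(idx(:), x(:), [numel(bins)-1, 1])';

% Pool bins 6-9, mirroring the reported edges [0 9 18 27 36 45 81]
Opooled = [O(1:5), sum(O(6:end))];
Epooled = [y(1:5), sum(y(6:end))];

% Pearson statistic on the pooled bins
chi2stat = sum((Opooled - Epooled).^2 ./ Epooled);

disp(Opooled);
disp(Epooled);
fprintf('chi2stat = %.4f\n', chi2stat);
```

For example, the values 5, 2, 3, and 2 all fall in the first bin [0, 9), and their frequencies 5 + 2 + 3 + 2 add up to the 12 reported as the first observed count.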
