Reduce the inputs for the chi2gof function (follow-up question of: Using chi2gof to test two distributions)
@Allie had the following inputs, i.e. the binned observational data (x), the binned model data (y), and the bin edges (bins):
x= [41 22 11 10 9 5 2 3 2];
y= [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];
bins=[0:9:81];
and asked how to use "chi2gof to test if two distributions come from a common distribution (null hypothesis) or if they do not come from a common distribution (alternative hypothesis)". The solution proposed there was:
bins=[0:9:81];
xvals = bins(1:end-1)+4.5; % Here are some fake data values that belong in each bin.
xcounts= [41 22 11 10 9 5 2 3 2]; % These are the counts of the data values in each bin.
y= [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];
[h,p,stat]=chi2gof(xvals,'Edges',bins,'Expected',y,'Frequency',xcounts,'EMin',1)
However, by re-doing all the calculations, i.e.
% inputs
alpha = 0.05;
x= [41 22 11 10 9 5 2 3 2]; % <-- observed binned data
y= [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665]; % <-- expected binned data
% Chi-square goodness-of-fit test
chi2 = sum(((x-y).^2)./y); % <-- chi-square value
df = length(x)-1; % <-- degrees of freedom
p = chi2cdf(chi2,df,'upper'); % <-- p-value calculated from the chi-square distribution with "df" degrees of freedom
if p > alpha % <-- reject or fail to reject the null hypothesis, based on the p-value
    h = 0; % <-- disp('we fail to reject the null hypothesis')
else
    h = 1; % <-- disp('we reject the null hypothesis')
end
table([chi2; df; p; h],'RowNames',{'chi-square','df','p-value','h'},'VariableNames',{'Stats'})
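For reference, the manual calculation above is the usual Pearson chi-square statistic. It is written here with O_i for the observed counts x, E_i for the expected counts y, and k for the number of bins (this notation is assumed for illustration and does not appear in the post). The numeric values are rounded results for the data given above:

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \approx 4.69,
\qquad
\mathrm{df} = k - 1 = 9 - 1 = 8,
\qquad
p = P\left(\chi^2_{8} \ge 4.69\right) \approx 0.79
```

Since p is larger than alpha = 0.05, h = 0, i.e. the null hypothesis is not rejected for these data.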
I realized that "xvals" and "bins" are not necessary for the chi-square goodness-of-fit test.
Is it then possible to still use the chi2gof function with just "x" and "y" (and "alpha") as the only inputs?
Accepted Answer
Arnav
on 25 Jul 2024
You are correct that xvals and bins are not needed for the chi-square test if you already have the observed bin counts x, the expected bin counts y, and the significance level alpha.
According to the chi2gof documentation, however, the first argument, x, is mandatory. We therefore still have to pass some xvals, even though the function's own binning is not needed. Any dummy vector whose length equals the number of bins will work as xvals.
You can find the documentation here: https://www.mathworks.com/help/stats/chi2gof.html#btv1j1v-3
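The following code snippet gets the results of the chi-square test with chi2gof using only x, y and alpha. It gives the same result as the custom code you provided. Note that 'EMin' is set to 0 so that chi2gof does not pool bins with small expected counts, which keeps the degrees of freedom at length(x)-1.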
% inputs
alpha = 0.05;
x = [41 22 11 10 9 5 2 3 2];
y = [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];
% Chi-square goodness-of-fit test
[h, p_value, stats] = chi2gof(1:length(x), 'Expected', y, 'Frequency', x, 'EMin', 0, 'Alpha', alpha)
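If you prefer not to rely on the default binning, a possible variant is to pass the dummy values as bin centers through the documented 'Ctrs' option, so that each dummy value lands in its own bin, and then compare the result with the manual statistic. This is only a sketch under that assumption. The variable names ctrs and chi2_manual are made up for illustration, and it requires the Statistics and Machine Learning Toolbox:

```matlab
% inputs
alpha = 0.05;
x = [41 22 11 10 9 5 2 3 2];                                              % observed counts per bin
y = [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];     % expected counts per bin

% dummy values, one per bin, also used as the bin centers (illustrative name)
ctrs = 1:numel(x);

% chi-square goodness-of-fit test with explicit bin centers and no pooling
[h, p_value, stats] = chi2gof(ctrs, 'Ctrs', ctrs, 'Frequency', x, ...
    'Expected', y, 'EMin', 0, 'Alpha', alpha);

% cross-check against the manual statistic from the question
chi2_manual = sum((x - y).^2 ./ y);
fprintf('chi2gof: %.4f, manual: %.4f, df: %d\n', stats.chi2stat, chi2_manual, stats.df);
```

The two statistics should agree, and stats.O and stats.E should reproduce x and y if no bins were pooled.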