Reduce the inputs for the chi2gof function (follow-up question of: Using chi2gof to test two distributions)

This is a follow-up to the question "Using chi2gof to test two distributions".
@Allie had the following inputs, i.e. the binned observational data (x), the binned model data (y), and the bin edges (bins):
x= [41 22 11 10 9 5 2 3 2];
y= [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];
bins=[0:9:81];
and asked how to use "chi2gof to test if two distributions come from a common distribution (null hypothesis) or if they do not come from a common distribution (alternative hypothesis)".
The accepted answer comes from @Jeff Miller:
bins=[0:9:81];
xvals = bins(1:end-1)+4.5; % Here are some fake data values that belong in each bin.
xcounts= [41 22 11 10 9 5 2 3 2]; % These are the counts of the data values in each bin.
y= [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];
[h,p,stat]=chi2gof(xvals,'Edges',bins,'Expected',y,'Frequency',xcounts,'EMin',1)
h = 0
p = 0.7905
stat = struct with fields:
    chi2stat: 4.6864
          df: 8
       edges: [0 9 18 27 36 45 54 63 72 81]
           O: [41 22 11 10 9 5 2 3 2]
           E: [38.0520 24.2655 15.4665 9.8595 6.2895 4.0110 2.5620 1.6275 2.8665]
However, by re-doing all the calculations, i.e.
% inputs
alpha = 0.05;
x= [41 22 11 10 9 5 2 3 2]; % <-- observed binned data
y= [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665]; % <-- expected binned data
% Chi-square goodness-of-fit test
chi2 = sum(((x-y).^2)./y); % <-- chi-square value
df = length(x)-1; % <-- degrees of freedom
p = chi2cdf(chi2,df,'upper'); % <-- p-value calculated from the chi-square distribution with "df" degrees of freedom
if p>alpha % <-- accept or reject the null hypothesis, based on the p-value
h = 0; % <-- disp('we fail to reject the null hypothesis')
else
h = 1; % <-- disp('we reject the null hypothesis')
end
table([chi2; df; p; h],'RowNames',{'chi-square','df','p-value','h'},'VariableNames',{'Stats'})
ans = 4x1 table
                   Stats
                  _______
    chi-square     4.6864
    df                  8
    p-value       0.79051
    h                   0
I have realized that "xvals" and "bins" are not needed for the chi-square goodness-of-fit test itself.
Is it then possible to still use the chi2gof function, but with "x" and "y" (and "alpha") as the only inputs?
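For reference, the manual calculation in the question corresponds to the standard Pearson chi-square statistic. The symbols below are introduced here for illustration and do not appear in the original post: O_i are the observed counts (x), E_i the expected counts (y), and k the number of bins (here k = 9, so df = 8).

```latex
\chi^2_{\mathrm{obs}} = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad
\mathrm{df} = k - 1,
\qquad
p = \Pr\!\left( \chi^2_{\mathrm{df}} \ge \chi^2_{\mathrm{obs}} \right)
```

Neither the bin edges nor the raw data values enter these expressions, which is why only the two count vectors are needed.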

Accepted Answer

Arnav on 25 Jul 2024
Hi @Sim,
You are correct that xvals and bins are not necessary for the chi-square test if you have the observed bin counts x, the expected bin counts y, and the significance level alpha.
According to the chi2gof documentation, the first argument, x, is mandatory, so we need to provide xvals even if we do not rely on the function's binning. A dummy vector with one value per bin, such as 1:length(x), works as xvals: each value lands in its own bin, as the edges in the output below show.
The following snippet runs the chi-square test with chi2gof using only x, y and alpha, and gives the same result as your custom code. Setting 'EMin' to 0 should keep chi2gof from pooling bins with small expected counts.
% inputs
alpha = 0.05;
x = [41 22 11 10 9 5 2 3 2];
y = [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];
% Chi-square goodness-of-fit test
[h, p_value, stats]=chi2gof(1:length(x), 'Expected', y, 'Frequency', x, 'EMin', 0, 'Alpha', alpha)
h = 0
p_value = 0.7905
stats = struct with fields:
    chi2stat: 4.6864
          df: 8
       edges: [1.0000 1.8889 2.7778 3.6667 4.5556 5.4444 6.3333 7.2222 8.1111 9.0000]
           O: [41 22 11 10 9 5 2 3 2]
           E: [38.0520 24.2655 15.4665 9.8595 6.2895 4.0110 2.5620 1.6275 2.8665]
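As a quick sanity check, the chi2gof call above can be compared numerically against the manual calculation from the question. This is only a sketch: it assumes the Statistics and Machine Learning Toolbox is available, and the names chi2_manual, p_manual, tol and the same_* flags are introduced here for illustration.

```matlab
% Inputs taken from the question
alpha = 0.05;
x = [41 22 11 10 9 5 2 3 2];   % observed counts per bin
y = [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];   % expected counts per bin

% chi2gof with dummy values 1:numel(x), one value per bin
[h, p_value, stats] = chi2gof(1:numel(x), 'Expected', y, ...
    'Frequency', x, 'EMin', 0, 'Alpha', alpha);

% Manual calculation, as in the question
chi2_manual = sum((x - y).^2 ./ y);
p_manual = chi2cdf(chi2_manual, numel(x) - 1, 'upper');

% Compare the two routes (tol is an illustrative tolerance, not a recommended value)
tol = 1e-8;
same_counts = isequal(stats.O, x);                     % bins were not pooled or reordered
same_stat = abs(stats.chi2stat - chi2_manual) < tol;   % same chi-square value
same_p = abs(p_value - p_manual) < tol;                % same p-value
fprintf('counts match: %d, statistic matches: %d, p-value matches: %d\n', ...
    same_counts, same_stat, same_p);
```

If stats.O differs from x, the dummy values did not end up one per bin (or bins were pooled), and the comparison with the manual calculation no longer applies.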

More Answers (0)
