How does fsrftest calculate the p-value?
I am trying to understand how fsrftest works in MATLAB. From the documentation, I understand that it uses an F-test of a null hypothesis against an alternative hypothesis, and that the resulting p-value is then used to determine the importance of each feature. As far as I can tell, the p-value is not compared with a significance level, so the function does not actually reject or fail to reject either hypothesis; it just uses the p-value to rank features.
My question is how the p-value is calculated. Is the process the same as ANOVA?
Accepted Answer
Ive J
on 8 Jan 2024
Edited: Ive J
on 8 Jan 2024
At the end of the documentation page you can see that it uses -log(p) to rank features, so no significance level is involved. And yes, it is the same as ANOVA (to be precise, it's a GLM). Note that the NumBins argument is used to bin continuous features.
n = 100; % sample size
data = table;
data.BMI = randi([18, 50], n, 1);
% bin BMI into two categories: 1 if above the median, 0 otherwise
med_bmi = median(data.BMI);
idx = data.BMI > med_bmi;
data.BMI(idx) = 1;
data.BMI(~idx) = 0;
data.Sex = randi([0, 1], n, 1);
data.Target = randn(n, 1);
% one linear model per feature (fitlm takes the last table variable as the response)
mdl_bmi = fitlm(data(:, ["BMI", "Target"]))
mdl_sex = fitlm(data(:, ["Sex", "Target"]))
% fsrftest scores are -log(p), so exp(-sc) recovers the p-values
[~, sc] = fsrftest(data, "Target", "NumBins", 2);
p = exp(-sc)
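To make the ANOVA connection concrete, here is a minimal sketch that computes a one-way ANOVA F-test p-value by hand for a single two-level predictor and checks it against anova1. The variable names (x, y, p_manual, and so on) and the fixed seed are illustrative assumptions, not part of fsrftest. The sketch shows the standard F-test, which is what the documentation describes; it does not reproduce the internals of fsrftest.

```matlab
rng(0);                        % fixed seed, only so the sketch is repeatable
n = 100;                       % sample size
x = randi([0, 1], n, 1);       % hypothetical predictor with two levels
y = randn(n, 1);               % response, unrelated to x by construction

groups = unique(x);            % distinct predictor levels
k = numel(groups);             % number of groups
grand_mean = mean(y);

ss_between = 0;                % between-group sum of squares
ss_within  = 0;                % within-group sum of squares
for g = groups'                % iterate over the levels
    yg = y(x == g);
    ss_between = ss_between + numel(yg) * (mean(yg) - grand_mean)^2;
    ss_within  = ss_within  + sum((yg - mean(yg)).^2);
end

% F statistic: between-group mean square over within-group mean square
F = (ss_between / (k - 1)) / (ss_within / (n - k));

% p-value: upper tail of the F distribution with (k-1, n-k) degrees of freedom
p_manual = fcdf(F, k - 1, n - k, 'upper');

% same test via the built-in one-way ANOVA (display suppressed)
p_anova = anova1(y, x, 'off');

% the kind of score used for ranking: larger means a smaller p-value
score = -log(p_manual);
```

p_manual and p_anova should agree up to floating-point error. With only two groups, this F-test is also equivalent to the t-test on the slope that fitlm reports, which is why the single-feature linear models are a reasonable cross-check.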