What is the Interpretation of the p-Value from runstest()?
For the function runstest the null hypothesis is "that the values in the data vector x come in random order."
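Running 100 runs tests on the output of rand yields the following test decisions and p-values:
rng(100);                        % fix the seed for reproducibility
nTests = 100;
h = zeros(1,nTests);             % test decisions (1 = null hypothesis rejected)
p = zeros(1,nTests);             % p-values
for ii = 1:nTests
    [h(ii),p(ii)] = runstest(rand(1e4,1));
end
plot(1:nTests,h,'-o',1:nTests,p,'-o')
legend('h','p')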
In 97 cases, we can't reject the null hypothesis (is it surprising that the null hypothesis was rejected in even three of the cases?).
My real question is about the variability in the p-value. The doc states that: "Small values of p cast doubt on the validity of the null hypothesis."
How small is "small" in this context? What does "cast doubt" mean?
Regardless, is it expected that the p-value would have such variability when running the same test on data sets that, I would hope, are (in some sense) the same with respect to their "random ordering"?
2 Comments
dpb
on 18 Sep 2024
I am not familiar with the specific test and the MATLAB doc doesn't provide any details of the specific test statistic being calculated, so would have to dig into the bowels some to comment much more in depth.
The use of "cast doubt" is simply editorial; it has no technical meaning other than smaller values are less likely (under the test statistic) to have come from "random" sequences.
Random, for this purpose, is judged by how many consecutive values fall above or below the mean of the sample. With a uniform distribution, any given value is equally likely (50:50) to be above or below the mean, and whether the next value falls on the same side as the previous one is (theoretically) independent of the prior value as well. Given that, it doesn't seem at all surprising to me that the number of runs of 2, 3, ..., N consecutive values above or below the mean would be quite variable from one sample to another.
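To make the counting described above concrete, here is a minimal sketch of a textbook Wald-Wolfowitz runs test above/below the sample mean, using the large-sample normal approximation. It is an assumption that this resembles what runstest computes; the toolbox function may differ in details (for example a continuity correction or exact small-sample p-values). The variable names are made up for illustration.

```matlab
% Sketch only: textbook runs test above/below the mean, normal approximation.
% Not taken from the runstest source; details there may differ.
rng(100);
x = rand(1e4,1);
s = x > mean(x);                          % true where a value is above the mean
R = 1 + sum(s(2:end) ~= s(1:end-1));      % number of runs (side changes + 1)
n1 = sum(s);                              % count above the mean
n2 = sum(~s);                             % count below (or equal to) the mean
N  = n1 + n2;
mu     = 2*n1*n2/N + 1;                           % expected number of runs
sigma2 = 2*n1*n2*(2*n1*n2 - N)/(N^2*(N - 1));     % variance of the number of runs
z = (R - mu)/sqrt(sigma2);                % standardized run count
pApprox = erfc(abs(z)/sqrt(2));           % two-sided normal p-value
```

Changing the seed changes R, and therefore z and pApprox, even though every x comes from the same generator.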
Accepted Answer
Jeff Miller
on 19 Sep 2024
I'm not sure how much detail you want about hypothesis testing, but the fact is that the p values you get from repeated tests of a null hypothesis are uniformly distributed between 0 and 1 when the null hypothesis is true (at least for continuous test statistics, but the approximation to uniform is also quite good for most discrete ones). This is an inherent property of hypothesis tests because of the way they are constructed.
In your runs test example, the null hypothesis is true because the rng has no serial dependence, so across your 100 tests the p values are approximately uniform. (Run 10,000 and the approximation will be better.) You would get the same thing with repeated t-tests of a true null hypothesis, repeated p values for a sample correlation when the true population correlation is zero, etc.
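The same behavior can be checked without any toolbox. The sketch below swaps in a two-sided z-test of a zero mean with known unit variance (a simpler stand-in for the t-test mentioned above, chosen only because its p-value can be written with erfc). The null hypothesis is true by construction, so each of the ten bins of width 0.1 should hold roughly 10% of the p-values. The sample size and number of tests are arbitrary choices.

```matlab
% Sketch: p-values of a true null hypothesis are (approximately) uniform on [0,1].
% Uses a z-test with known variance as a stand-in; not the runs test itself.
rng(1);
nTests = 1e4;                     % number of repeated tests
n = 50;                           % sample size per test
X = randn(n,nTests);              % each column is one sample; true mean is 0
z = sqrt(n)*mean(X,1);            % z statistic for H0: mean = 0, sigma = 1
p = erfc(abs(z)/sqrt(2));         % two-sided p-values
counts = histcounts(p,0:0.1:1);   % how many p-values land in each tenth
fracRejected = mean(p < 0.05);    % should be close to 0.05
disp(counts/nTests)
disp(fracRejected)
```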
I doubt if I can make it much more intuitive, but here is a try: different random realizations of the same process do give different results, which may conform more or less exactly to what is expected under the null model. One way to pick up deviations from the null model is to scale those differences so that all of the difference sizes are equally likely when the null model is true.
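This also speaks to the three rejections in the question. If the p-values are uniform and each test rejects when p < 0.05 (assumed here to be the significance level runstest applied), then the number of rejections in 100 independent tests is roughly Binomial(100, 0.05). A quick sketch of that calculation, with gammaln used only to avoid large binomial coefficients:

```matlab
% Sketch: how many rejections to expect from 100 tests of a true null,
% assuming each test rejects with probability alpha = 0.05.
alpha  = 0.05;
nTests = 100;
k = 0:nTests;
logPmf = gammaln(nTests+1) - gammaln(k+1) - gammaln(nTests-k+1) ...
         + k*log(alpha) + (nTests-k)*log(1-alpha);
pmf = exp(logPmf);                    % binomial probabilities for k rejections
expectedRejections = nTests*alpha;    % 5 on average
probAtMost3 = sum(pmf(k <= 3));       % about 0.26
```

So seeing three rejections is unremarkable; about five would be expected on average.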