# Combining 3-way-ANOVA of stratified bootstrapped datasets

Jonas on 15 Apr 2021
Commented: Jeff Miller on 21 Apr 2021
I have the following situation:
I have a dataset that includes a head motion parameter for 16 different groups, classified according to 4 different age groups and 4 different stimulus angles. Each head motion parameter belongs to one trial, and the 16 subgroups do not have equal numbers of trials (ranging from 21 to about 200). Each trial also comes with an onset error measurement.
I used stratified bootstrapping to generate 5000 'new' datasets, each having the same number of group members as in the original data set.
Now I want to run a 3-way ANOVA on each bootstrap data set. So here are 2 questions:
1. I was not sure whether I need to rebalance the data set because the groups have different numbers of trials. At the moment I rebalance the design of each bootstrap data set by upsampling smaller groups and downsampling bigger groups to a size equal to the median of the group sizes. I know that the function anovan() accepts unbalanced input data, but does it compensate for the imbalance?
2. My intuition says that I have to run an ANOVA on each bootstrap data set and then combine the ANOVA results. How can this be done? Is it valid to take, e.g., the mean of the p values?
I have attached a mat file which contains 3 variables. One contains the stimulus angle in the first column and the age group number in the second column. The other two variables are 16x1 cells which contain 5000 bootstrapped sets of the head motion parameter and the factor onset error. The design is already rebalanced to 123 trials per subgroup.
best regards
Jonas
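For readers who want to see the resampling scheme from the question spelled out, here is a minimal sketch in Python (standard library only, not the MATLAB code used by the poster). The toy data, the angle values, and the helper names `stratified_bootstrap` and `rebalance` are all hypothetical. Drawing with replacement in the rebalancing step is also an assumption, since the post does not say how the downsampling was done.

```python
import random
from statistics import median


def stratified_bootstrap(groups, rng):
    """Resample each subgroup with replacement, keeping its original size."""
    return {key: rng.choices(values, k=len(values)) for key, values in groups.items()}


def rebalance(groups, rng):
    """Resample every subgroup (with replacement) to the median subgroup size.

    Assumption: the question does not state whether downsampling was done
    with or without replacement; this sketch uses replacement throughout.
    """
    target = int(median(len(values) for values in groups.values()))
    return {key: rng.choices(values, k=target) for key, values in groups.items()}


rng = random.Random(0)

# Hypothetical toy data: 4 age groups x 4 stimulus angles with unequal
# trial counts between 21 and 200, as described in the question.
groups = {
    (age, angle): [rng.gauss(0.0, 1.0) for _ in range(rng.randint(21, 200))]
    for age in range(1, 5)
    for angle in (0, 30, 60, 90)  # angle values are made up
}

boot = stratified_bootstrap(groups, rng)  # same subgroup sizes as the original
balanced = rebalance(boot, rng)           # every subgroup at the median size

print(sorted(len(values) for values in boot.values()))
print({len(values) for values in balanced.values()})
```

Note that the rebalancing step discards trials from large subgroups and duplicates trials in small ones, so the rebalanced sets carry no more information than the original unbalanced data.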
Jeff Miller on 21 Apr 2021
I'm really not convinced that the bootstrapping is adding much info here, since the results with the bootstrapping will necessarily largely match those of the original sample being bootstrapped.
Because of that (bootstrap sample p's must be similar to those of the original sample), the Bonferroni correction would be wildly conservative. Bonferroni is meant to be used when all the p values are independent of one another, but here they are not: the bootstrap p's will tend to be relatively close to that of the original sample.
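The dependence described above can be illustrated with a small simulation. This is only a sketch in Python (standard library only) under simplifying assumptions: the data are simulated, there are just two groups, and a large-sample z test stands in for the ANOVA F test. The function name `two_sample_p` and all numbers are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance


def two_sample_p(a, b):
    """Two-sided p-value from a large-sample z test for a difference in means."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


rng = random.Random(1)

# Simulated "original" sample: two groups whose true means differ.
group_a = [rng.gauss(0.0, 1.0) for _ in range(40)]
group_b = [rng.gauss(0.8, 1.0) for _ in range(40)]
p_original = two_sample_p(group_a, group_b)

# Stratified bootstrap: resample within each group, keeping group sizes.
boot_p = []
for _ in range(1000):
    a = rng.choices(group_a, k=len(group_a))
    b = rng.choices(group_b, k=len(group_b))
    boot_p.append(two_sample_p(a, b))

share_significant = sum(p < 0.05 for p in boot_p) / len(boot_p)
print(f"original p = {p_original:.4g}")
print(f"share of bootstrap p < 0.05 = {share_significant:.2f}")
```

Every bootstrap sample is drawn from the same original data, so the bootstrap p values all inherit the effect observed there rather than behaving like independent tests. That is why treating them as 5000 separate tests, or averaging them as if each were a fresh experiment, is hard to justify.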