Power calculations for 1-way ANOVA with possibly unequal group sizes

Published:

I was recently asked to do sample size and power calculations in the setting of a 1-way ANOVA with unequal group sizes. Solutions are readily available when the group sizes are equal, but less so when they are unequal.

For the proposed power calculations, we were comparing an intervention arm of size 200 to two control arms, each of size 20. It was reasonable to assume the following:

  • All outcomes were normally distributed with standard deviation $\sigma = 1$.
  • The first and largest group had mean $\delta$, while the other two groups had mean 0.

I first considered using simulation: generate data under the above assumptions, fit a 1-way ANOVA assuming equal variances, repeat this $R$ times, and calculate the proportion of replicates in which the null hypothesis is rejected at the 0.05 level, which estimates the power. In this setting, however, power can also be computed directly from the underlying F-test and its non-centrality parameter. Here I frame things in terms of Cohen’s effect size.
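The simulation approach described above could be sketched as follows. This is a minimal illustration in Python, not the code used for the original calculations. It assumes NumPy and SciPy are available, and the function name `simulate_power`, the default number of replicates, and the seed are hypothetical choices made for this example.

```python
import numpy as np
from scipy import stats


def simulate_power(delta, sizes=(200, 20, 20), sigma=1.0,
                   alpha=0.05, n_sims=2000, seed=0):
    """Estimate 1-way ANOVA power by simulation.

    The first group has mean `delta`; all other groups have mean 0.
    All groups share the same standard deviation `sigma`.
    """
    rng = np.random.default_rng(seed)
    means = [delta] + [0.0] * (len(sizes) - 1)
    rejections = 0
    for _ in range(n_sims):
        # Draw one data set under the assumed data-generating mechanism.
        groups = [rng.normal(mu, sigma, size=n) for mu, n in zip(means, sizes)]
        # Classical 1-way ANOVA F-test (equal variances assumed).
        _, p_value = stats.f_oneway(*groups)
        rejections += int(p_value < alpha)
    # Proportion of replicates in which the null hypothesis was rejected.
    return rejections / n_sims


if __name__ == "__main__":
    print(simulate_power(delta=0.5))
```

With `delta=0`, the returned proportion estimates the type I error rate rather than power, which is a quick sanity check on the set-up.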

1-way ANOVA set-up and test statistic

Consider a 1-way ANOVA with $G$ groups of sizes $n_1, n_2, \dots, n_G$ (not necessarily equal), group means $\mu_1, \dots, \mu_G$, and common variance $\sigma^2$. Let $N$ denote the total sample size, i.e., $N=n_1+\dots+n_G$, and let $\mu_W = \sum_{i=1}^G n_i\mu_i/N$ denote the weighted mean of all groups. Additionally, define $\sigma_m^2$ as follows:

\[\sigma_m^2 = \displaystyle\sum_{i=1}^G \displaystyle\frac{n_i}{N}\left(\mu_i - \mu_W \right)^2.\]

Cohen (1988) defined the effect size $f$ as:

\[f = \sqrt{\displaystyle\frac{\sigma_m^2}{\sigma^2}}.\]
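The definition above translates directly into a few lines of code. The sketch below uses only the Python standard library; the function name `cohens_f` and its argument names are assumptions made for illustration.

```python
from math import sqrt


def cohens_f(means, sizes, sigma):
    """Cohen's effect size f for a 1-way ANOVA with group sizes `sizes`."""
    N = sum(sizes)
    # Weighted mean of the group means, with weights n_i / N.
    mu_w = sum(n * m for n, m in zip(sizes, means)) / N
    # Weighted variance of the group means around mu_w.
    sigma_m_sq = sum(n / N * (m - mu_w) ** 2 for n, m in zip(sizes, means))
    return sqrt(sigma_m_sq) / sigma


if __name__ == "__main__":
    # One group of 200 with mean 0.5, two groups of 20 with mean 0, sigma = 1.
    print(cohens_f([0.5, 0.0, 0.0], [200, 20, 20], 1.0))
```

Because the weights are $n_i/N$, the same function handles equal and unequal group sizes.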

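Under the alternative hypothesis, the 1-way ANOVA test statistic follows a non-central F-distribution with $G-1$ and $N-G$ degrees of freedom and non-centrality parameter $\lambda = Nf^2$. The null hypothesis is rejected when the statistic exceeds $F_{G-1, N-G, \alpha}$, the upper-$\alpha$ critical value of the central F-distribution with the same degrees of freedom, where $\alpha$ is the type I error rate; power is the probability of this event under the non-central distribution.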
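Given $f$, the group sizes, and $\alpha$, power is the probability that a non-central F variate exceeds the central F critical value. A minimal sketch, assuming SciPy is available and using its `scipy.stats.f` and `scipy.stats.ncf` distributions, is shown below; the function name `anova_power` is hypothetical.

```python
from scipy import stats


def anova_power(f, sizes, alpha=0.05):
    """Power of the 1-way ANOVA F-test for Cohen's effect size `f`."""
    N = sum(sizes)
    G = len(sizes)
    dfn, dfd = G - 1, N - G           # numerator and denominator df
    lam = N * f ** 2                  # non-centrality parameter
    # Critical value from the central F-distribution at level alpha.
    crit = stats.f.ppf(1 - alpha, dfn, dfd)
    # Probability that the non-central F statistic exceeds the critical value.
    return stats.ncf.sf(crit, dfn, dfd, lam)


if __name__ == "__main__":
    for f in (0.10, 0.25, 0.40):
        print(f, anova_power(f, sizes=(200, 20, 20)))
```

The group sizes enter only through $N$, $G$, and $f$ itself, so unequal sizes need no special handling here.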

Analytic approach under our assumptions

Under our assumptions above, we can obtain simplified expressions for $\mu_W$ and $\sigma_m^2$ as follows:

\[\mu_W=\displaystyle\frac{n_1}{N}\delta,\]

which can be used to calculate

\[\sigma_m^2 = \displaystyle\frac{n_1}{N}\left(\delta - \mu_W\right)^2 +\displaystyle\frac{n_2}{N}\mu_W^2 + \displaystyle\frac{n_3}{N}\mu_W^2 = \delta^2\left[\displaystyle\frac{n_1(N-n_1)}{N^2}\right].\]
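
As a sketch of the intermediate algebra for the three-group setting, where $n_2 + n_3 = N - n_1$: substituting $\mu_W = n_1\delta/N$ gives $\delta - \mu_W = \delta(N-n_1)/N$, so that

\[\sigma_m^2 = \displaystyle\frac{n_1}{N}\cdot\displaystyle\frac{\delta^2(N-n_1)^2}{N^2} + \displaystyle\frac{N-n_1}{N}\cdot\displaystyle\frac{n_1^2\delta^2}{N^2} = \displaystyle\frac{\delta^2 n_1(N-n_1)}{N^3}\left[(N-n_1)+n_1\right] = \delta^2\left[\displaystyle\frac{n_1(N-n_1)}{N^2}\right].\]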

Because $\sigma = 1$, the effect size is simply $f = \sigma_m = \delta\sqrt{n_1(N-n_1)}/N$.

Conversely, for a given effect size $f$ in the setting described above, the corresponding $\delta$ is given by the following, which may help with understanding the assumed data-generating mechanism:

\[\delta = f \displaystyle\frac{N}{\sqrt{n_1(N-n_1)}}.\]
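
To make the mapping between $f$ and $\delta$ concrete, the sketch below converts in both directions for the setting above ($\sigma = 1$, first group with mean $\delta$, remaining groups with mean 0). The function names are hypothetical, and the values 0.10, 0.25, and 0.40 are Cohen's conventional small, medium, and large effect sizes, used here only as illustrative inputs.

```python
from math import sqrt


def f_from_delta(delta, n1, N):
    """Effect size f implied by a mean shift `delta` in the first group (sigma = 1)."""
    return delta * sqrt(n1 * (N - n1)) / N


def delta_from_f(f, n1, N):
    """Mean shift in the first group implied by effect size `f` (sigma = 1)."""
    return f * N / sqrt(n1 * (N - n1))


if __name__ == "__main__":
    n1, N = 200, 240  # intervention arm of 200, two control arms of 20 each
    for f in (0.10, 0.25, 0.40):
        print(f, delta_from_f(f, n1, N))
```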