One-Way ANOVA
What Is One-Way ANOVA?
ANOVA stands for Analysis of Variance. The one-way ANOVA tests whether the means of three or more independent groups differ significantly. It is the extension of the independent samples t-test to more than two groups.
Despite its name, ANOVA actually compares means by analyzing variance — specifically, it compares the variability between groups to the variability within groups. If the between-group variability is substantially larger than the within-group variability, there is evidence that at least one group mean differs from the others.
The test produces an F-ratio:

$$F = \frac{\text{between-group variability}}{\text{within-group variability}} = \frac{MS_{between}}{MS_{within}}$$

If the null hypothesis is true (all group means are equal), the F-ratio should be close to 1.0. The further $F$ is above 1, the stronger the evidence against $H_0$.
When to Use It
Use a one-way ANOVA when:
- You have one continuous dependent variable (e.g., test scores, reaction time, weight loss)
- You have one categorical independent variable (the factor) with three or more levels (groups)
- The groups are independent (different participants in each group)
Examples:
- Comparing patient satisfaction scores across four hospital departments
- Comparing crop yield for three fertilizer types
- Comparing exam performance across five study methods
Why not just run multiple t-tests? Running all pairwise t-tests inflates the Type I error rate. With 3 groups, you would need 3 comparisons; with 5 groups, 10 comparisons. At $\alpha = .05$, the probability of at least one false positive with $m$ comparisons is:

$$P(\text{at least one false positive}) = 1 - (1 - \alpha)^m$$

For 3 comparisons: $1 - (0.95)^3 = .143$ (14.3% false positive rate instead of 5%). ANOVA controls this by testing all groups simultaneously with a single omnibus test.
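The inflation is easy to check numerically; a minimal Python sketch of the formula above:

```python
# Family-wise Type I error rate for m independent tests at significance
# level alpha: P(at least one false positive) = 1 - (1 - alpha)^m
def familywise_error(m, alpha=0.05):
    return 1 - (1 - alpha) ** m

print(round(familywise_error(3), 3))   # 0.143 (3 groups: 3 comparisons)
print(round(familywise_error(10), 3))  # 0.401 (5 groups: 10 comparisons)
```

With 5 groups, roughly a 40% chance of at least one spurious "significant" pairwise difference.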
Assumptions
- Independence of observations. Participants are randomly and independently assigned to groups. Each person appears in only one group.
- Normality. The dependent variable is approximately normally distributed within each group. ANOVA is robust to moderate violations when sample sizes are roughly equal and each group has $n \geq 15$.
- Homogeneity of variances. The population variances are equal across all groups. Test this with Levene's test. If violated, use Welch's ANOVA or the Brown-Forsythe test as alternatives.
Rule of thumb: If the largest group standard deviation is no more than twice the smallest, the assumption is reasonably satisfied.
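Both the rule of thumb and Levene's test take only a few lines; a sketch using the three-group data from the worked example later in this article:

```python
from scipy import stats
import statistics

# The three groups from the worked example (n = 8 each)
humor         = [7, 6, 8, 7, 5, 8, 6, 7]
emotional     = [8, 9, 7, 8, 9, 8, 7, 6]
informational = [5, 4, 6, 5, 3, 6, 4, 5]

# Rule of thumb: largest SD no more than twice the smallest
sds = [statistics.stdev(g) for g in (humor, emotional, informational)]
print(max(sds) / min(sds) <= 2)  # True: assumption looks reasonable

# Formal check: Levene's test (H0: population variances are equal)
stat, p = stats.levene(humor, emotional, informational)
print(p > 0.05)  # True: no evidence of unequal variances
```

Here the three groups happen to have identical spread, so both checks pass comfortably.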
Formula
Decomposing Variance
The total variability in the data is partitioned into two sources:
Between-groups sum of squares (variability due to group differences):

$$SS_{between} = \sum_{j=1}^{k} n_j(\bar{x}_j - \bar{x})^2$$

Within-groups sum of squares (variability due to individual differences within groups):

$$SS_{within} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2$$

Together, $SS_{total} = SS_{between} + SS_{within}$.
Mean Squares
$$MS_{between} = \frac{SS_{between}}{k - 1} \qquad MS_{within} = \frac{SS_{within}}{N - k}$$

Where $k$ is the number of groups and $N$ is the total sample size.
F-Ratio
$$F = \frac{MS_{between}}{MS_{within}}$$

The F-statistic follows an $F$-distribution with $df_1 = k - 1$ and $df_2 = N - k$.
Effect Size: Eta-Squared
$$\eta^2 = \frac{SS_{between}}{SS_{total}}$$

This tells you the proportion of total variance explained by the group variable.
| $\eta^2$ | Interpretation |
|---|---|
| .01 | Small |
| .06 | Medium |
| .14 | Large |
Partial eta-squared ($\eta_p^2$) is often reported in factorial designs and equals $\eta^2$ in a one-way ANOVA. Be aware that SPSS reports $\eta_p^2$ by default.
Omega-squared ($\omega^2$) is a less biased alternative:

$$\omega^2 = \frac{SS_{between} - (k - 1)MS_{within}}{SS_{total} + MS_{within}}$$
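To see how the two effect sizes compare, a sketch computing both from the summary quantities of the worked example later in this article:

```python
# Eta-squared and omega-squared from ANOVA summary quantities
# (values taken from this article's worked example)
ss_between, ss_within, k, N = 37.3333, 22.5, 3, 24

ss_total = ss_between + ss_within   # total sum of squares
ms_within = ss_within / (N - k)     # mean square within

eta_sq = ss_between / ss_total
omega_sq = (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)

print(round(eta_sq, 2), round(omega_sq, 2))  # 0.62 0.58
```

As expected, $\omega^2$ (.58) is slightly smaller than $\eta^2$ (.62); the gap shrinks as sample size grows.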
Worked Example
Scenario: A marketing researcher tests three advertising strategies (humor, emotional, and informational) to see which produces the highest purchase intention ratings (scale of 1-10). Each group has 8 participants.
| Humor | Emotional | Informational |
|---|---|---|
| 7 | 8 | 5 |
| 6 | 9 | 4 |
| 8 | 7 | 6 |
| 7 | 8 | 5 |
| 5 | 9 | 3 |
| 8 | 8 | 6 |
| 6 | 7 | 4 |
| 7 | 6 | 5 |
Step 1: Calculate group means and grand mean.
- $\bar{x}_1 = 6.75$ (Humor)
- $\bar{x}_2 = 7.75$ (Emotional)
- $\bar{x}_3 = 4.75$ (Informational)
- Grand mean: $\bar{x} = 154/24 = 6.42$
Step 2: Calculate $SS_{between}$.

$$SS_{between} = 8(6.75 - 6.42)^2 + 8(7.75 - 6.42)^2 + 8(4.75 - 6.42)^2 \approx 37.33$$
Step 3: Calculate $SS_{within}$.

For each group, sum the squared deviations from the group mean:

- Humor: $\sum(x_i - 6.75)^2 = 7.50$
- Emotional: $\sum(x_i - 7.75)^2 = 7.50$
- Informational: $\sum(x_i - 4.75)^2 = 7.50$

$$SS_{within} = 7.50 + 7.50 + 7.50 = 22.50$$
Step 4: Calculate mean squares.

$$MS_{between} = \frac{37.33}{3 - 1} = 18.67 \qquad MS_{within} = \frac{22.50}{24 - 3} = 1.07$$
Step 5: Calculate the F-ratio.

$$F = \frac{18.67}{1.07} = 17.42$$
Step 6: Determine the p-value.
With $df_1 = 2$ and $df_2 = 21$, the critical value of $F$ at $\alpha = .05$ is approximately 3.47. Our $F = 17.42$ far exceeds this, so $p < .001$.
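Both lookups can be reproduced with scipy's F distribution; a quick sketch:

```python
from scipy.stats import f

df1, df2 = 2, 21  # k - 1 and N - k for the worked example

# Critical value of F at alpha = .05
f_crit = f.ppf(0.95, df1, df2)
print(round(f_crit, 2))  # 3.47

# Upper-tail p-value for the observed F = 17.42
p = f.sf(17.42, df1, df2)
print(p < 0.001)  # True
```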
Step 7: Calculate effect size.

$$\eta^2 = \frac{37.33}{37.33 + 22.50} = \frac{37.33}{59.83} = .62$$

This is a very large effect: advertising strategy explains 62% of the variance in purchase intention.
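The hand calculation can be verified end to end; a sketch using scipy and the example data:

```python
from scipy import stats

humor         = [7, 6, 8, 7, 5, 8, 6, 7]
emotional     = [8, 9, 7, 8, 9, 8, 7, 6]
informational = [5, 4, 6, 5, 3, 6, 4, 5]
groups = [humor, emotional, informational]

# Omnibus one-way ANOVA
f_stat, p_value = stats.f_oneway(*groups)
print(round(f_stat, 2), p_value < 0.001)  # 17.42 True

# Eta-squared from the sums of squares
scores = [x for g in groups for x in g]
grand_mean = sum(scores) / len(scores)
ss_total = sum((x - grand_mean) ** 2 for x in scores)
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
eta_sq = (ss_total - ss_within) / ss_total
print(round(eta_sq, 2))  # 0.62
```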
Post-Hoc Tests
A significant F-test tells you that at least one group mean differs, but not which groups differ. Post-hoc tests identify specific pairwise differences while controlling the family-wise error rate.
Tukey's HSD (Honestly Significant Difference)
The most commonly used post-hoc test. Compares all possible pairs of means while maintaining the overall $\alpha$ at .05.

$$HSD = q\sqrt{\frac{MS_{within}}{n}}$$

Where $q$ is the studentized range statistic and $n$ is the per-group sample size.
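As an illustration, the HSD for the worked example can be computed with scipy's `studentized_range` distribution (available in scipy 1.7+); a sketch:

```python
import math
from scipy.stats import studentized_range

# Worked-example quantities: MS_within = 22.5/21, n = 8 per group,
# k = 3 groups, df_within = 21
ms_within, n, k, df_w = 22.5 / 21, 8, 3, 21

# Critical studentized range value q at alpha = .05
q_crit = studentized_range.ppf(0.95, k, df_w)

# Any pair of means differing by more than HSD is significant
hsd = q_crit * math.sqrt(ms_within / n)
print(round(hsd, 2))
```

With the example's means (6.75, 7.75, 4.75), the two differences involving the Informational group (2.0 and 3.0 points) exceed the HSD of roughly 1.3, while the 1.0-point Humor vs. Emotional gap does not.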
Bonferroni Correction
Divides $\alpha$ by the number of comparisons. More conservative than Tukey for many comparisons but works with unequal sample sizes.

$$\alpha_{adjusted} = \frac{\alpha}{m}$$

Where $m$ is the number of pairwise comparisons: $m = \frac{k(k-1)}{2}$. For 3 groups, $m = 3$ and $\alpha_{adjusted} = .05/3 \approx .0167$.
Which Post-Hoc Test to Use?
| Test | Best When |
|---|---|
| Tukey HSD | Equal sample sizes, all pairwise comparisons needed |
| Bonferroni | Unequal sample sizes, or only a few planned comparisons |
| Games-Howell | Unequal variances and/or unequal sample sizes |
| Dunnett | Comparing each group to a single control group |
Interpretation
For our example, $F(2, 21) = 17.42$, $p < .001$, $\eta^2 = .62$.
This means:
- The omnibus test is significant. At least one advertising strategy produces different purchase intention ratings than the others.
- The effect is large. Advertising strategy explains 62% of the variance in purchase intention.
- Post-hoc tests are needed to determine which specific groups differ.
Tukey HSD post-hoc tests would likely reveal:
- Emotional > Informational (mean difference of 3.0 points)
- Humor > Informational (mean difference of 2.0 points)
- Emotional vs. Humor (a 1.0-point difference that may or may not reach significance)
Common Mistakes
- Running multiple t-tests instead of ANOVA. This inflates the family-wise Type I error rate. Use ANOVA first, then post-hoc tests if significant.
- Skipping post-hoc tests after a significant F. The ANOVA F-test only tells you something differs; you need post-hoc tests to learn what differs.
- Running post-hoc tests after a non-significant F. If the omnibus F is not significant, do not go fishing for pairwise differences.
- Ignoring unequal variances. If Levene's test is significant, use Welch's ANOVA with Games-Howell post-hoc tests instead of the standard ANOVA with Tukey.
- Reporting $\eta^2$ as partial $\eta_p^2$ (or vice versa). In a one-way ANOVA these are identical, but in factorial designs they differ. Be explicit about which you report.
- Confusing Cohen's $d$ and Cohen's $f$. For ANOVA, use Cohen's $f$ for power analysis:

$$f = \sqrt{\frac{\eta^2}{1 - \eta^2}}$$

Where $f = .10$ (small), $f = .25$ (medium), $f = .40$ (large).
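The conversion from $\eta^2$ to $f$ is a one-liner; a sketch (the helper name `cohens_f` is ours):

```python
import math

# Convert eta-squared to Cohen's f for power analysis
def cohens_f(eta_sq):
    return math.sqrt(eta_sq / (1 - eta_sq))

# The small/medium/large eta-squared benchmarks map onto
# the f = .10 / .25 / .40 benchmarks
print(round(cohens_f(0.01), 2))  # 0.1
print(round(cohens_f(0.06), 2))  # 0.25
print(round(cohens_f(0.14), 2))  # 0.4

# The worked example's eta-squared of .62 is a very large f
print(round(cohens_f(0.62), 2))  # 1.28
```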
How to Run It

In R, fit the model and get the effect size (eta-squared), assuming a data frame `df` with columns `score` and `group`:

```r
library(effectsize)
model <- aov(score ~ group, data = df)  # fit the one-way ANOVA
summary(model)                          # omnibus F-test
eta_squared(model)
```

Post-hoc pairwise comparisons (Tukey HSD):

```r
TukeyHSD(model)
```
```python
from scipy import stats
import pingouin as pg
# Using scipy
f_stat, p_value = stats.f_oneway(group1, group2, group3)
# Using pingouin (includes effect size and post-hoc)
aov = pg.anova(dv='score', between='group', data=df)
print(aov)
# Post-hoc tests
posthoc = pg.pairwise_tukey(dv='score', between='group', data=df)
print(posthoc)
```
In SPSS:

1. Go to Analyze > Compare Means > One-Way ANOVA
2. Move your dependent variable into the Dependent List
3. Move your grouping variable into the Factor box
4. Click Post Hoc and select Tukey (or Bonferroni for unequal sample sizes)
5. Click Options and check Descriptive and Homogeneity of variance test
6. Click OK

If Levene's test is significant, check Welch under Analyze > Compare Means > One-Way ANOVA > Options and use Games-Howell post-hoc tests instead of Tukey.
In Excel, use the Data Analysis ToolPak (enable via File > Options > Add-ins):

1. Go to Data > Data Analysis > Anova: Single Factor
2. Select your input range (each group in a separate column)
3. Set Grouped By: Columns
4. Set alpha to 0.05
5. Click OK
Excel produces the ANOVA summary table with F-statistic, p-value, and F-critical. It does not produce post-hoc tests or effect sizes — calculate eta-squared manually as SSbetween / SStotal.
Ready to calculate?
Now that you understand the concept, use the free Effect Size Calculator on Subthesis to run your own analysis.
Related Concepts
Independent Samples t-Test
Learn how to conduct and interpret an independent samples t-test, including assumptions, formulas, worked examples, and APA reporting guidelines.
Effect Size
Learn what effect size is, why it matters more than p-values alone, and how to calculate and interpret Cohen's d, Hedges' g, and eta-squared for your research.
Statistical Power & Power Analysis
Learn what statistical power is, why 80% is the standard threshold, and how to conduct a power analysis to determine if your study can detect real effects.
Kruskal-Wallis H Test
Learn how to conduct and interpret a Kruskal-Wallis H test, the non-parametric alternative to one-way ANOVA, with formulas, a worked example, and APA reporting guidelines.
Two-Way (Factorial) ANOVA
Learn how to conduct and interpret a two-way ANOVA, including main effects, interaction effects, formulas, a worked example, and APA reporting guidelines.
Repeated Measures ANOVA
Learn how to conduct and interpret a repeated measures ANOVA: compare means across three or more time points or conditions from the same participants, test sphericity, and apply corrections.
Sample Size Determination
Learn how to calculate the right sample size for your research study using power analysis, effect size estimates, and practical planning considerations.