Two-Way (Factorial) ANOVA
What Is a Two-Way ANOVA?
A two-way ANOVA (also called a factorial ANOVA) tests the effects of two categorical independent variables (called factors) on a continuous dependent variable. It simultaneously answers three questions:
- Main effect of Factor A: Does the dependent variable differ across the levels of Factor A, averaging over Factor B?
- Main effect of Factor B: Does the dependent variable differ across the levels of Factor B, averaging over Factor A?
- Interaction effect (A x B): Does the effect of Factor A depend on the level of Factor B (and vice versa)?
The interaction effect is what makes the two-way ANOVA uniquely powerful. It allows you to detect situations where the combination of two factors produces an outcome that neither factor alone would predict.
Example: Suppose a new drug lowers blood pressure in young adults but has no effect in older adults. Neither age alone nor drug alone tells the whole story — you need the interaction to understand the pattern.
When to Use It
Use a two-way ANOVA when:
- You have one continuous dependent variable (e.g., exam scores, reaction time, sales revenue)
- You have two categorical independent variables (e.g., teaching method and class size, drug type and dosage level)
- The groups are independent (different participants in each cell of the design)
Examples:
- Teaching method (lecture vs. active learning) x class size (small vs. large) on exam scores
- Gender (male vs. female) x treatment (drug vs. placebo) on symptom reduction
- Advertising type (humor vs. emotional vs. informational) x medium (TV vs. social media) on brand recall
Design notation: A 2 x 3 ANOVA has two levels of Factor A and three levels of Factor B, producing 6 cells. A 2 x 2 design has four cells and is the simplest factorial design.
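As a small illustration of the notation, the cells of a factorial design are every combination of the factor levels. The sketch below enumerates them for the 3 x 2 advertising example; the variable names are illustrative only.

```python
from itertools import product

# Factor levels taken from the advertising example above
ad_type = ["humor", "emotional", "informational"]   # Factor A: 3 levels
medium = ["TV", "social media"]                     # Factor B: 2 levels

# Each combination of one level from each factor is one cell of the design
cells = list(product(ad_type, medium))

print(f"{len(ad_type)} x {len(medium)} design -> {len(cells)} cells")
for a_level, b_level in cells:
    print(f"  {a_level} / {b_level}")
```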
Understanding Interaction Effects
The interaction is often the most important finding in a factorial ANOVA. An interaction means the effect of one factor changes depending on the level of the other factor.
No interaction (additive effects): The lines in an interaction plot are roughly parallel. Active learning is better than lecture regardless of class size, and the advantage is about the same in both small and large classes.
Interaction present: The lines in an interaction plot cross or diverge. Active learning might outperform lecture in small classes but show little advantage in large classes. The effect of teaching method depends on class size.
When the interaction is significant, interpret it first. The main effects can be misleading because averaging across levels of the other factor hides the conditional pattern. Report simple effects — the effect of Factor A at each level of Factor B — rather than relying on marginal means alone.
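One way to make this concrete for a 2 x 2 design is the "difference of differences": compute the simple effect of Factor A at each level of Factor B, then subtract one from the other. A result near zero corresponds to roughly parallel lines. The sketch below uses made-up cell means (illustrative assumptions, not data from a study), and `simple_effects` is a hypothetical helper name.

```python
def simple_effects(cell_means, a_levels, b_levels):
    """Effect of Factor A (second level minus first) at each level of Factor B."""
    a_low, a_high = a_levels
    return {b: cell_means[(a_high, b)] - cell_means[(a_low, b)] for b in b_levels}

# Hypothetical cell means, keyed by (method, class_size); illustrative only
means = {
    ("lecture", "small"): 70, ("lecture", "large"): 68,
    ("active", "small"): 82, ("active", "large"): 71,
}

effects = simple_effects(means, ("lecture", "active"), ("small", "large"))
# Interaction contrast: how much the effect of A changes across levels of B
contrast = effects["small"] - effects["large"]

print(effects)    # effect of active vs. lecture within each class size
print(contrast)   # 0 would mean perfectly parallel lines (no interaction)
```

This is purely descriptive: whether the contrast is larger than chance would produce is what the interaction F-test decides.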
Assumptions
- Independence of observations. Each participant appears in only one cell of the design.
- Normality. The dependent variable should be approximately normally distributed within each cell. The two-way ANOVA is robust to moderate violations when cell sizes are equal and at least 15-20 per cell.
- Homogeneity of variances. The variances should be roughly equal across all cells of the design. Test with Levene's test. If violated, consider a more robust method or transform the data.
- No significant outliers. Extreme values within cells can distort the F-tests. Check boxplots within each cell.
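The homogeneity check can be sketched by hand. Levene's test is essentially a one-way ANOVA on the absolute deviations of each score from its cell center. The function below computes the W statistic using cell medians as the center (the Brown-Forsythe variant, which is an assumption here since the text does not say which center to use). The name `levene_w` and the sample scores are hypothetical. In practice a library routine such as `scipy.stats.levene` also returns the p-value.

```python
from statistics import mean, median

def levene_w(groups):
    """Levene's W statistic (median-centered) for a list of score lists, one per cell."""
    # Absolute deviation of every score from its own cell's median
    z = [[abs(y - median(g)) for y in g] for g in groups]
    k = len(groups)                          # number of cells
    n_total = sum(len(g) for g in groups)    # total sample size
    z_cell = [mean(zi) for zi in z]
    z_grand = mean([v for zi in z for v in zi])
    between = sum(len(zi) * (zc - z_grand) ** 2 for zi, zc in zip(z, z_cell))
    within = sum((v - zc) ** 2 for zi, zc in zip(z, z_cell) for v in zi)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical scores for the four cells of a 2 x 2 design
cells = [
    [70, 72, 74, 76, 78],
    [60, 66, 72, 78, 84],   # visibly more spread out than the others
    [80, 82, 84, 86, 88],
    [71, 73, 75, 77, 79],
]
w = levene_w(cells)
df_between = len(cells) - 1
df_within = sum(len(g) for g in cells) - len(cells)
print(f"Levene W = {w:.2f} on ({df_between}, {df_within}) df")
```

W is compared against an F distribution with the printed degrees of freedom; a large value suggests the cell variances differ.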
Formula
Decomposing Variance
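The total variability (total sum of squares) is partitioned into four components:

$$SS_{\text{total}} = SS_A + SS_B + SS_{AB} + SS_{\text{within}}$$

Main effect of Factor A:

$$SS_A = nb \sum_{i=1}^{a} (\bar{X}_{i..} - \bar{X}_{...})^2$$

Main effect of Factor B:

$$SS_B = na \sum_{j=1}^{b} (\bar{X}_{.j.} - \bar{X}_{...})^2$$

Interaction effect:

$$SS_{AB} = n \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{X}_{ij.} - \bar{X}_{i..} - \bar{X}_{.j.} + \bar{X}_{...})^2$$

Within-groups (error):

$$SS_{\text{within}} = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (X_{ijk} - \bar{X}_{ij.})^2$$

Where $a$ = number of levels of Factor A, $b$ = number of levels of Factor B, $n$ = number of participants per cell, $\bar{X}_{ij.}$ = a cell mean, $\bar{X}_{i..}$ and $\bar{X}_{.j.}$ = the marginal means of Factors A and B, and $\bar{X}_{...}$ = the grand mean.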
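A minimal sketch of the four-way partition described above, for a balanced design and starting from raw scores. The function name `two_way_ss` and the tiny dataset are hypothetical; the last line checks that the four components add up to the total sum of squares.

```python
from statistics import mean

def two_way_ss(data):
    """Sums of squares for a balanced two-way design.

    data maps (a_level, b_level) -> list of scores, with equal n in every cell.
    """
    a_levels = sorted({a for a, _ in data})
    b_levels = sorted({b for _, b in data})
    n = len(next(iter(data.values())))             # participants per cell
    grand = mean([y for v in data.values() for y in v])
    cell = {k: mean(v) for k, v in data.items()}
    a_mean = {a: mean([cell[(a, b)] for b in b_levels]) for a in a_levels}
    b_mean = {b: mean([cell[(a, b)] for a in a_levels]) for b in b_levels}

    ss_a = n * len(b_levels) * sum((m - grand) ** 2 for m in a_mean.values())
    ss_b = n * len(a_levels) * sum((m - grand) ** 2 for m in b_mean.values())
    ss_ab = n * sum((cell[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                    for a in a_levels for b in b_levels)
    ss_within = sum((y - cell[k]) ** 2 for k, v in data.items() for y in v)
    ss_total = sum((y - grand) ** 2 for v in data.values() for y in v)
    return {"A": ss_a, "B": ss_b, "AxB": ss_ab, "within": ss_within, "total": ss_total}

# Hypothetical balanced 2 x 2 dataset with n = 3 per cell
scores = {
    ("a1", "b1"): [4, 5, 6], ("a1", "b2"): [6, 7, 8],
    ("a2", "b1"): [7, 8, 9], ("a2", "b2"): [13, 14, 15],
}
ss = two_way_ss(scores)
print(ss)
# In a balanced design the four components sum to the total
print(abs(ss["A"] + ss["B"] + ss["AxB"] + ss["within"] - ss["total"]) < 1e-9)
```

The additivity check holds only for balanced designs; with unequal cell sizes the components overlap and the choice of sums-of-squares type matters.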
Mean Squares and F-Ratios
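Each sum of squares is divided by its degrees of freedom to give a mean square:

$$MS_A = \frac{SS_A}{a - 1}, \quad MS_B = \frac{SS_B}{b - 1}, \quad MS_{AB} = \frac{SS_{AB}}{(a - 1)(b - 1)}, \quad MS_{\text{within}} = \frac{SS_{\text{within}}}{ab(n - 1)}$$

Each effect has its own F-test, with $MS_{\text{within}}$ as the error term:

$$F_A = \frac{MS_A}{MS_{\text{within}}}, \quad F_B = \frac{MS_B}{MS_{\text{within}}}, \quad F_{AB} = \frac{MS_{AB}}{MS_{\text{within}}}$$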
Effect Size: Partial Eta-Squared
For each effect:

$$\eta_p^2 = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{within}}}$$

| $\eta_p^2$ | Interpretation |
|---|---|
| .01 | Small |
| .06 | Medium |
| .14 | Large |
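As a quick sketch, partial eta-squared and the benchmarks in the table above can be wrapped in two small functions. The helper names and the "negligible" label for values below .01 are assumptions added for illustration, and the sums of squares in the demo are made up.

```python
def partial_eta_squared(ss_effect, ss_within):
    """Partial eta-squared: SS_effect / (SS_effect + SS_within)."""
    return ss_effect / (ss_effect + ss_within)

def benchmark_label(eta_p2):
    """Map a value onto the conventional small / medium / large benchmarks."""
    if eta_p2 >= 0.14:
        return "large"
    if eta_p2 >= 0.06:
        return "medium"
    if eta_p2 >= 0.01:
        return "small"
    return "negligible"   # label assumed here; the table does not name this range

# Hypothetical sums of squares for one effect
eta = partial_eta_squared(ss_effect=30.0, ss_within=270.0)
print(f"partial eta-squared = {eta:.3f} ({benchmark_label(eta)})")
```

The cutoffs are conventions rather than strict rules, so a value just under a threshold is best described in context.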
Worked Example
Scenario: An educational researcher investigates the effects of teaching method (Lecture vs. Active Learning) and class size (Small: 20 students vs. Large: 80 students) on final exam scores. This is a 2 x 2 between-subjects factorial design with 8 students per cell ($n = 8$, $N = 32$).
Cell means and standard deviations:
| | Small Class | Large Class | Marginal Mean |
|---|---|---|---|
| Lecture | $M$ = 72, $SD$ = 6 | $M$ = 70, $SD$ = 7 | 71.0 |
| Active Learning | $M$ = 85, $SD$ = 5 | $M$ = 74, $SD$ = 6 | 79.5 |
| Marginal Mean | 78.5 | 72.0 | Grand mean = 75.25 |
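The marginal and grand means in the table can be reproduced from the four cell means. The sketch below relies on the design being balanced, so each marginal mean is a plain average of cell means; the variable names are illustrative.

```python
from statistics import mean

# Cell means from the table above, keyed by (method, class_size)
cell_means = {
    ("Lecture", "Small"): 72, ("Lecture", "Large"): 70,
    ("Active Learning", "Small"): 85, ("Active Learning", "Large"): 74,
}
methods = ["Lecture", "Active Learning"]
sizes = ["Small", "Large"]

# With equal n per cell, a marginal mean is the plain average of its cell means
method_means = {m: mean([cell_means[(m, s)] for s in sizes]) for m in methods}
size_means = {s: mean([cell_means[(m, s)] for m in methods]) for s in sizes}
grand_mean = mean(list(cell_means.values()))

print(method_means)
print(size_means)
print(grand_mean)
```

With unequal cell sizes the unweighted and weighted marginal means differ, so this shortcut would no longer match the observed row and column averages.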
Step 1: State the hypotheses.
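- Main effect of Method: $H_0$: $\mu_{\text{Lecture}} = \mu_{\text{Active}}$
- Main effect of Class Size: $H_0$: $\mu_{\text{Small}} = \mu_{\text{Large}}$
- Interaction: $H_0$: The effect of teaching method does not depend on class size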
Step 2: Calculate sums of squares.
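Each cell has $n = 8$. Factor A (Method) has $a = 2$ levels, Factor B (Class Size) has $b = 2$ levels.

$SS_A$ (Method):

$$SS_A = nb \sum_i (\bar{X}_{i..} - \bar{X}_{...})^2 = 8(2)\left[(71.0 - 75.25)^2 + (79.5 - 75.25)^2\right] = 16(18.0625 + 18.0625) = 578.0$$

$SS_B$ (Class Size):

$$SS_B = na \sum_j (\bar{X}_{.j.} - \bar{X}_{...})^2 = 8(2)\left[(78.5 - 75.25)^2 + (72.0 - 75.25)^2\right] = 16(10.5625 + 10.5625) = 338.0$$

$SS_{AB}$ (Interaction):

Compute the residual $\bar{X}_{ij.} - \bar{X}_{i..} - \bar{X}_{.j.} + \bar{X}_{...}$ for each cell:
- Lecture, Small: $72 - 71.0 - 78.5 + 75.25 = -2.25$
- Lecture, Large: $70 - 71.0 - 72.0 + 75.25 = 2.25$
- Active, Small: $85 - 79.5 - 78.5 + 75.25 = 2.25$
- Active, Large: $74 - 79.5 - 72.0 + 75.25 = -2.25$

$$SS_{AB} = n \sum_i \sum_j (\text{residual}_{ij})^2 = 8\left[4(2.25)^2\right] = 8(20.25) = 162.0$$

$SS_{\text{within}}$ (Error):

Using $SS_{\text{cell}} = (n - 1)\,SD^2$ for each cell:

$$SS_{\text{within}} = 7(6^2) + 7(7^2) + 7(5^2) + 7(6^2) = 7(146) = 1022.0$$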
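The arithmetic in this step can be checked with a short script that works from the summary statistics alone (cell means, standard deviations, and the per-cell sample size). It is a sketch for balanced designs only, and the variable names are illustrative.

```python
n = 8  # students per cell
# (mean, SD) for each cell, keyed by (method, class_size)
cells = {
    ("Lecture", "Small"): (72, 6), ("Lecture", "Large"): (70, 7),
    ("Active", "Small"): (85, 5), ("Active", "Large"): (74, 6),
}
methods, sizes = ["Lecture", "Active"], ["Small", "Large"]
a, b = len(methods), len(sizes)

# Marginal and grand means (simple averages because n is equal in every cell)
m_method = {m: sum(cells[(m, s)][0] for s in sizes) / b for m in methods}
m_size = {s: sum(cells[(m, s)][0] for m in methods) / a for s in sizes}
grand = sum(cell_mean for cell_mean, _ in cells.values()) / (a * b)

ss_method = n * b * sum((v - grand) ** 2 for v in m_method.values())
ss_size = n * a * sum((v - grand) ** 2 for v in m_size.values())
# Interaction residual: cell mean minus both marginal means plus the grand mean
ss_inter = n * sum((cells[(m, s)][0] - m_method[m] - m_size[s] + grand) ** 2
                   for m in methods for s in sizes)
# Within-cell SS recovered from each SD: (n - 1) * SD^2
ss_within = sum((n - 1) * sd ** 2 for _, sd in cells.values())

print(ss_method, ss_size, ss_inter, ss_within)
```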
Step 3: Calculate mean squares and F-ratios.
| Source | SS | df | MS | F |
|---|---|---|---|---|
| Method (A) | 578.0 | 1 | 578.0 | 15.84 |
| Class Size (B) | 338.0 | 1 | 338.0 | 9.26 |
| A x B | 162.0 | 1 | 162.0 | 4.44 |
| Within (Error) | 1022.0 | 28 | 36.5 | |
| Total | 2100.0 | 31 | | |
Step 4: Determine p-values.
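Using the F-distribution with $df_1 = 1$ and $df_2 = 28$:
- Method: $F(1, 28) = 15.84$, $p < .001$
- Class Size: $F(1, 28) = 9.26$, $p = .005$
- Interaction: $F(1, 28) = 4.44$, $p = .044$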
All three effects are statistically significant.
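The F-ratios and p-values can be reproduced without a statistics library. The sketch below is limited to effects with one numerator degree of freedom, where $F = t^2$, so the upper tail of $F(1, df)$ equals the two-sided tail of a $t$ distribution; the tail area is approximated with Simpson's rule. The helper names are hypothetical. In practice a call such as `scipy.stats.f.sf(F, df1, df2)` does the same job for any degrees of freedom.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def p_value_f1(f_stat, df_error, steps=2000):
    """Upper-tail p for F(1, df_error), using F = t^2 and Simpson's rule."""
    t = math.sqrt(f_stat)
    h = t / steps   # steps must be even for Simpson's rule
    total = t_pdf(0, df_error) + t_pdf(t, df_error)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_error)
    area_0_to_t = total * h / 3
    return 1 - 2 * area_0_to_t   # two-sided t tail equals the upper F tail

df_error = 28
ms_within = 1022.0 / df_error
for name, ss in [("Method", 578.0), ("Class Size", 338.0), ("Interaction", 162.0)]:
    ms_effect = ss / 1   # each effect has 1 df in a 2 x 2 design
    f_stat = ms_effect / ms_within
    p = p_value_f1(f_stat, df_error)
    print(f"{name}: F(1, {df_error}) = {f_stat:.2f}, p = {p:.4f}")
```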
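Step 5: Calculate effect sizes (partial eta-squared).
- Method: $\eta_p^2 = 578.0 / (578.0 + 1022.0) = .361$
- Class Size: $\eta_p^2 = 338.0 / (338.0 + 1022.0) = .249$
- Interaction: $\eta_p^2 = 162.0 / (162.0 + 1022.0) = .137$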
Step 6: Interpret the interaction.
Because the interaction is significant, we examine the simple effects:
- In small classes: Active learning ($M = 85$) outperformed lecture ($M = 72$) by 13 points
- In large classes: Active learning ($M = 74$) outperformed lecture ($M = 70$) by only 4 points
The advantage of active learning shrinks considerably in large classes. This is the interaction — teaching method and class size jointly influence exam scores in a way that cannot be explained by either factor alone.
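One common way to follow up is to test each simple effect against the pooled error term from the full model. The sketch below does this for the two class sizes. It assumes the balanced design and the within-cell mean square of 36.5 from the ANOVA table, and `simple_effect_f` is a hypothetical helper. For two groups of size $n$, the simple-effect sum of squares reduces to $n(M_1 - M_2)^2 / 2$.

```python
def simple_effect_f(mean_1, mean_2, n, ms_within):
    """F ratio (1 numerator df) for a two-group simple effect, pooled error term."""
    ss_simple = n * (mean_1 - mean_2) ** 2 / 2
    return ss_simple / ms_within

n, ms_within = 8, 36.5   # from the worked example (error df = 28)

f_small = simple_effect_f(85, 72, n, ms_within)   # method effect in small classes
f_large = simple_effect_f(74, 70, n, ms_within)   # method effect in large classes

print(f"Small classes: F(1, 28) = {f_small:.2f}")
print(f"Large classes: F(1, 28) = {f_large:.2f}")
```

The two simple-effect sums of squares (676 and 64) add up to 740, the sum of the Method and interaction sums of squares. Only the small-class F exceeds the .05 critical value of about 4.20 for (1, 28) degrees of freedom, so under this approach the 4-point advantage in large classes is not statistically significant on its own.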
Interpretation
The two-way ANOVA revealed significant main effects and a significant interaction:
- Main effect of teaching method: Active learning ($M = 79.5$) produced significantly higher exam scores than lecture ($M = 71.0$), $F(1, 28) = 15.84$, $p < .001$, $\eta_p^2 = .361$.
- Main effect of class size: Small classes ($M = 78.5$) produced significantly higher scores than large classes ($M = 72.0$), $F(1, 28) = 9.26$, $p = .005$, $\eta_p^2 = .249$.
- Interaction: The advantage of active learning over lecture was significantly larger in small classes (13 points) than in large classes (4 points), $F(1, 28) = 4.44$, $p = .044$, $\eta_p^2 = .137$.
The interaction suggests that active learning is most effective in small class environments. In large classes, the benefit of active learning is reduced — possibly because large group sizes limit the student engagement and participation that active learning relies on.
Common Mistakes
- Interpreting main effects when the interaction is significant. Main effects represent average differences across levels of the other factor. When an interaction is present, these averages can be misleading. Focus on simple effects instead.
- Ignoring unbalanced designs. When cell sizes are unequal, the sums of squares for different effects are no longer independent. Use Type III sums of squares to partition the variance. Type III is the default in SPSS, but not everywhere: R's `aov()` and statsmodels' `anova_lm` report Type I (sequential) sums of squares unless told otherwise.
- Not plotting the interaction. An interaction plot (with one factor on the x-axis, the DV on the y-axis, and separate lines for the other factor) is essential for understanding the pattern. Non-parallel lines indicate an interaction.
- Confusing ordinal and disordinal interactions. In an ordinal interaction, one group is always higher but the gap changes in size (the lines do not cross). In a disordinal (crossover) interaction, the direction of the effect reverses (the lines cross). Disordinal interactions are typically more theoretically interesting.
- Running separate one-way ANOVAs instead. Analyzing each factor separately ignores the interaction and wastes statistical power. The two-way ANOVA tests both factors and their interaction simultaneously.
- Not checking assumptions within cells. Normality and homogeneity of variances must hold within each cell (combination of factor levels), not just within each level of a single factor.
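The ordinal versus disordinal distinction can be read directly off the simple effects in a 2 x 2 design. The sketch below labels the pattern from the sign of the effect of Factor A at each level of Factor B; `interaction_shape` is a hypothetical helper, and it describes sample means only, without testing significance.

```python
def interaction_shape(effect_at_b1, effect_at_b2):
    """Describe a 2 x 2 pattern from the simple effects of Factor A at each level of B."""
    if effect_at_b1 == effect_at_b2:
        return "no interaction (parallel lines)"
    if effect_at_b1 * effect_at_b2 < 0:
        return "disordinal (effect of A reverses direction)"
    # Same sign (or one effect exactly zero): the lines for A do not cross
    return "ordinal (same direction, different size)"

# Worked example: active learning minus lecture is 13 (small) and 4 (large)
print(interaction_shape(13, 4))
# Hypothetical crossover: A helps at one level of B and hurts at the other
print(interaction_shape(5, -3))
```

An interaction can look ordinal when plotted with one factor on the x-axis and disordinal with the other, so it is worth checking both orientations.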
How to Run It
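```r
# Fit the two-way model; `model` is used by the calls below
# (assumes a data frame `mydata` with columns score, method, class_size)
model <- aov(score ~ method * class_size, data = mydata)
summary(model)

# Effect sizes (partial eta-squared)
library(effectsize)
eta_squared(model, partial = TRUE)

# Interaction plot
interaction.plot(mydata$class_size, mydata$method, mydata$score,
                 xlab = "Class Size", ylab = "Exam Score", trace.label = "Method")

# Simple effects (if interaction is significant)
library(emmeans)
emm <- emmeans(model, ~ method | class_size)
pairs(emm)
```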
```python
import matplotlib.pyplot as plt
import pandas as pd
import pingouin as pg
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# df: a DataFrame with one row per participant and the columns
# 'score', 'method', and 'class_size'

# Using statsmodels: anova_lm defaults to Type I SS, so request Type II explicitly
model = ols('score ~ C(method) * C(class_size)', data=df).fit()
table = anova_lm(model, typ=2)
print(table)

# Using pingouin
aov = pg.anova(dv='score', between=['method', 'class_size'], data=df)
print(aov)

# Interaction plot: mean score per cell, one line per teaching method
pd.pivot_table(df, values='score', index='class_size',
               columns='method').plot(marker='o')
plt.ylabel('Exam Score')
plt.show()
```
In SPSS:
1. Go to Analyze > General Linear Model > Univariate
2. Move your dependent variable (e.g., Exam Score) into the Dependent Variable box
3. Move both independent variables (e.g., Method and Class Size) into the Fixed Factor(s) box
4. Click Options: check Descriptive statistics, Estimates of effect size, and Homogeneity tests
5. Click Plots: put one factor on the Horizontal Axis and the other in Separate Lines, then click Add
6. Click Post Hoc if any factor has 3+ levels and select Tukey
7. Click OK
SPSS reports the Tests of Between-Subjects Effects table with F-statistics, p-values, and partial eta-squared for each main effect and the interaction. If the interaction is significant, examine the Estimated Marginal Means for simple effects.
In Excel, use the Data Analysis ToolPak (enable via File > Options > Add-ins):
1. Go to Data > Data Analysis > Anova: Two-Factor With Replication
2. Select the input range (data arranged with one factor in rows and the other in columns, with replicates stacked within each cell)
3. Enter the number of Rows per sample (replicates per cell)
4. Set alpha to 0.05
5. Click OK
Excel produces an ANOVA summary table with SS, df, MS, F, p-value, and F-critical for both main effects and the interaction. Calculate partial eta-squared manually as SSeffect / (SSeffect + SSwithin). Excel does not produce interaction plots or post-hoc tests.