Repeated Measures ANOVA
What Is Repeated Measures ANOVA?
Repeated measures ANOVA is an extension of the paired-samples t-test to three or more related conditions. It tests whether the mean of a continuous outcome differs across multiple time points, treatments, or conditions when the same participants are measured in all conditions.
Like the one-way ANOVA, it produces an F-ratio. However, whereas the one-way ANOVA compares independent groups (between-subjects), repeated measures ANOVA compares related measurements (within-subjects). Because each participant serves as their own control, this design is typically more powerful: stable individual differences are removed from the error term.
The F-ratio for a repeated measures design is:

$$F = \frac{MS_{\text{conditions}}}{MS_{\text{error}}}$$

Where $MS_{\text{error}}$ reflects variability in how participants respond differently across conditions (the condition × subjects interaction), rather than overall within-group variability.
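To make the power advantage concrete, the sketch below computes two F-ratios on the same scores: one that (incorrectly) treats the conditions as independent groups, and one that also removes subject variability from the error term. The scores are the six-participant data from the worked example later in this article; the variable names are illustrative and do not come from any library.

```python
from statistics import mean

# Rows = participants, columns = conditions (Pre, Mid, Post).
scores = [
    [38, 32, 25],
    [42, 35, 28],
    [35, 30, 22],
    [40, 34, 26],
    [45, 38, 30],
    [36, 31, 24],
]
n, k = len(scores), len(scores[0])

grand = mean(x for row in scores for x in row)
cond_means = [mean(row[j] for row in scores) for j in range(k)]
subj_means = [mean(row) for row in scores]

ss_total = sum((x - grand) ** 2 for row in scores for x in row)
ss_conditions = n * sum((m - grand) ** 2 for m in cond_means)
ss_subjects = k * sum((m - grand) ** 2 for m in subj_means)

# Between-subjects error: everything that is not a condition effect,
# with N - k = k(n - 1) degrees of freedom.
ms_error_between = (ss_total - ss_conditions) / (k * (n - 1))

# Within-subjects error: subject variability is removed as well,
# leaving (k - 1)(n - 1) degrees of freedom.
ms_error_within = (ss_total - ss_conditions - ss_subjects) / ((k - 1) * (n - 1))

ms_conditions = ss_conditions / (k - 1)
f_between = ms_conditions / ms_error_between
f_within = ms_conditions / ms_error_within

print(f"F treating conditions as independent groups: {f_between:.2f}")
print(f"F with subject variability removed:          {f_within:.2f}")
```

The numerator is identical in both ratios; only the error term shrinks once consistent differences between participants are taken out, which is why the within-subjects F is so much larger here.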
When to Use It
Use repeated measures ANOVA when:
- The same participants are measured under three or more conditions or at three or more time points.
- Your dependent variable is continuous (interval or ratio scale).
- You want to determine whether there is a statistically significant change across the conditions.
Examples:
- Measuring anxiety scores before, during, and after an intervention
- Comparing reaction times under easy, medium, and hard task conditions
- Testing pain levels at 0, 2, 4, and 8 weeks post-surgery
If you have only two time points, use a paired-samples t-test. If different participants are in each group, use a one-way ANOVA.
Assumptions
- Continuous dependent variable. The outcome must be measured at the interval or ratio level.
- Related groups. The same participants provide data for all levels of the within-subjects factor (or the groups are matched).
- No significant outliers. Extreme values in any condition can distort results. Check boxplots for each condition.
- Normality. The dependent variable should be approximately normally distributed at each time point. With sample sizes above 20-30, repeated measures ANOVA is fairly robust to this.
- Sphericity. This is the critical assumption unique to repeated measures designs. Sphericity requires that the variances of the differences between all pairs of conditions are equal. For example, with three conditions (A, B, C), the variance of (A - B) should equal the variance of (A - C) and the variance of (B - C).
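As an informal check on what sphericity means, the following sketch computes the sample variance of the difference scores for every pair of conditions. It uses the scores from this article's worked example; the dictionary layout and names are assumptions made for illustration, and eyeballing these variances is not a substitute for a formal test.

```python
from itertools import combinations
from statistics import variance

# One list of scores per condition, in the same participant order.
conditions = {
    "Pre":  [38, 42, 35, 40, 45, 36],
    "Mid":  [32, 35, 30, 34, 38, 31],
    "Post": [25, 28, 22, 26, 30, 24],
}

diff_variances = {}
for a, b in combinations(conditions, 2):
    # Difference score for each participant between conditions a and b.
    diffs = [x - y for x, y in zip(conditions[a], conditions[b])]
    diff_variances[(a, b)] = variance(diffs)  # sample variance (n - 1 denominator)

for (a, b), var in diff_variances.items():
    print(f"Var({a} - {b}) = {var:.2f}")
```

Under perfect sphericity the three printed variances would be equal in the population; with only six participants, some spread between them is expected from sampling error alone.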
Testing Sphericity: Mauchly's Test
Mauchly's test evaluates whether the sphericity assumption holds:
- $H_0$: Sphericity is met (variances of differences are equal)
- $H_1$: Sphericity is violated
If Mauchly's test is significant ($p < .05$), sphericity is violated and corrections are needed:
- Greenhouse-Geisser correction ($\hat{\varepsilon}_{GG}$): Reduces the degrees of freedom to correct for the violation. More conservative. Use when $\hat{\varepsilon} < .75$.
- Huynh-Feldt correction ($\hat{\varepsilon}_{HF}$): Less conservative than Greenhouse-Geisser. Use when $\hat{\varepsilon} > .75$.
The epsilon ($\varepsilon$) value ranges from $1/(k - 1)$ (maximum violation) to 1.0 (perfect sphericity), where $k$ is the number of conditions.
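For readers who want to see where an epsilon estimate comes from, here is a minimal sketch of the Greenhouse-Geisser estimate computed from the sample covariance matrix of the conditions. It relies on the standard identity that the estimate equals the squared trace of the double-centred covariance matrix divided by $(k - 1)$ times the sum of its squared entries. The function name `greenhouse_geisser_epsilon` and the participants-by-conditions list layout are assumptions for this example, and statistical packages may apply further adjustments.

```python
from statistics import mean


def greenhouse_geisser_epsilon(scores):
    """Greenhouse-Geisser epsilon for a participants x conditions table."""
    n, k = len(scores), len(scores[0])
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Sample covariance matrix of the k conditions (n - 1 denominator).
    cov = [
        [
            sum((row[i] - col_means[i]) * (row[j] - col_means[j]) for row in scores)
            / (n - 1)
            for j in range(k)
        ]
        for i in range(k)
    ]

    # Double-centre: subtract row and column means, add back the grand mean.
    row_means = [mean(r) for r in cov]
    grand = mean(row_means)
    centred = [
        [cov[i][j] - row_means[i] - row_means[j] + grand for j in range(k)]
        for i in range(k)
    ]

    trace = sum(centred[i][i] for i in range(k))
    sum_of_squares = sum(v * v for r in centred for v in r)
    return trace ** 2 / ((k - 1) * sum_of_squares)


# Scores from this article's worked example (rows = participants).
scores = [
    [38, 32, 25],
    [42, 35, 28],
    [35, 30, 22],
    [40, 34, 26],
    [45, 38, 30],
    [36, 31, 24],
]
epsilon = greenhouse_geisser_epsilon(scores)
print(f"Greenhouse-Geisser epsilon: {epsilon:.3f}")
```

The returned value always falls between $1/(k - 1)$ and 1, matching the range described above. With only two conditions the bounds coincide at 1, which is why sphericity is not a concern for a paired-samples t-test.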
Formula
Partitioning Variance
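In a repeated measures design, total variability is decomposed as:

$$SS_{\text{total}} = SS_{\text{between-subjects}} + SS_{\text{within-subjects}}$$

The within-subjects variability is further decomposed:

$$SS_{\text{within-subjects}} = SS_{\text{conditions}} + SS_{\text{error}}$$

Where:
- $SS_{\text{conditions}}$ reflects differences among the condition means
- $SS_{\text{error}}$ reflects the condition × subjects interaction (how inconsistently participants respond across conditions)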
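The decomposition can be sketched in a few lines of plain Python. The helper name `partition_ss` and the participants-by-conditions list layout are assumptions for this illustration; the scores are those of the worked example in this article.

```python
from statistics import mean


def partition_ss(scores):
    """Split total variability for a participants x conditions table."""
    n, k = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    subj_means = [mean(row) for row in scores]
    cond_means = [mean(row[j] for row in scores) for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    # Between-subjects: how far each participant's mean is from the grand mean.
    ss_between = k * sum((m - grand) ** 2 for m in subj_means)
    # Within-subjects: each score's distance from that participant's own mean.
    ss_within = sum(
        (x - m) ** 2 for row, m in zip(scores, subj_means) for x in row
    )
    # Conditions: how far each condition mean is from the grand mean.
    ss_conditions = n * sum((m - grand) ** 2 for m in cond_means)
    # Error: what remains of the within-subjects variability.
    ss_error = ss_within - ss_conditions

    return {
        "total": ss_total,
        "between_subjects": ss_between,
        "within_subjects": ss_within,
        "conditions": ss_conditions,
        "error": ss_error,
    }


scores = [
    [38, 32, 25],
    [42, 35, 28],
    [35, 30, 22],
    [40, 34, 26],
    [45, 38, 30],
    [36, 31, 24],
]
ss = partition_ss(scores)
for name, value in ss.items():
    print(f"SS_{name}: {value:.2f}")
```

Computing the between-subjects and within-subjects terms independently, rather than by subtraction from the total, gives a built-in check: the two should add up to the total sum of squares.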
Degrees of Freedom
- $df_{\text{conditions}} = k - 1$ (where $k$ = number of conditions)
- $df_{\text{error}} = (k - 1)(n - 1)$ (where $n$ = number of participants)
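Mean Squares and F-Ratio

$$MS_{\text{conditions}} = \frac{SS_{\text{conditions}}}{df_{\text{conditions}}}, \qquad MS_{\text{error}} = \frac{SS_{\text{error}}}{df_{\text{error}}}, \qquad F = \frac{MS_{\text{conditions}}}{MS_{\text{error}}}$$

Effect Size: Partial Eta-Squared

$$\eta_p^2 = \frac{SS_{\text{conditions}}}{SS_{\text{conditions}} + SS_{\text{error}}}$$

| $\eta_p^2$ | Interpretation |
|---|---|
| .01 | Small |
| .06 | Medium |
| .14 | Large |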
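The formulas in this section chain together naturally, as the sketch below shows. The function names `rm_anova_summary` and `label_effect` are hypothetical, and the sums of squares passed in (549 and 11/3) are the values that this article's six-participant example data produce; treat them as illustrative inputs.

```python
def rm_anova_summary(ss_conditions, ss_error, k, n):
    """Degrees of freedom, mean squares, F and partial eta-squared."""
    df_conditions = k - 1
    df_error = (k - 1) * (n - 1)
    ms_conditions = ss_conditions / df_conditions
    ms_error = ss_error / df_error
    return {
        "df": (df_conditions, df_error),
        "MS_conditions": ms_conditions,
        "MS_error": ms_error,
        "F": ms_conditions / ms_error,
        "partial_eta_sq": ss_conditions / (ss_conditions + ss_error),
    }


def label_effect(partial_eta_sq):
    """Apply the benchmark table above (.01 small, .06 medium, .14 large)."""
    if partial_eta_sq >= 0.14:
        return "large"
    if partial_eta_sq >= 0.06:
        return "medium"
    if partial_eta_sq >= 0.01:
        return "small"
    return "negligible"


summary = rm_anova_summary(ss_conditions=549.0, ss_error=11 / 3, k=3, n=6)
print(f"F{summary['df']} = {summary['F']:.2f}")
print(f"partial eta-squared = {summary['partial_eta_sq']:.3f} "
      f"({label_effect(summary['partial_eta_sq'])})")
```

The "negligible" label for values below .01 is an addition for completeness and is not part of the benchmark table.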
Greenhouse-Geisser Corrected Degrees of Freedom
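When sphericity is violated, multiply the degrees of freedom by $\hat{\varepsilon}$:

$$df'_{\text{conditions}} = \hat{\varepsilon}\,(k - 1), \qquad df'_{\text{error}} = \hat{\varepsilon}\,(k - 1)(n - 1)$$

The F-value itself does not change; only the degrees of freedom (and hence the p-value) change.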
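A small sketch of the correction follows. The function name `corrected_df` is hypothetical and the epsilon of 0.75 is an arbitrary illustrative value, not an estimate from any dataset in this article.

```python
def corrected_df(k, n, epsilon):
    """Scale both degrees of freedom by an epsilon estimate."""
    lower_bound = 1 / (k - 1)
    if not lower_bound <= epsilon <= 1:
        raise ValueError(
            f"epsilon must lie between {lower_bound:.3f} and 1, got {epsilon}"
        )
    df_conditions = epsilon * (k - 1)
    df_error = epsilon * (k - 1) * (n - 1)
    return df_conditions, df_error


# Hypothetical epsilon, chosen only to illustrate the arithmetic.
df1, df2 = corrected_df(k=3, n=6, epsilon=0.75)
print(f"Corrected df: ({df1:.2f}, {df2:.2f})")

# Worst case: epsilon at its lower bound of 1 / (k - 1).
print(corrected_df(k=3, n=6, epsilon=0.5))
```

The corrected p-value would then be read from an F distribution with these fractional degrees of freedom, for example with `scipy.stats.f.sf(F, df1, df2)` if SciPy is available.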
Worked Example
Scenario: A clinical psychologist measures test anxiety scores (0-50 scale) in students at three time points: before an intervention (Pre), at the midpoint (Mid), and after the intervention (Post).
| Participant | Pre | Mid | Post |
|---|---|---|---|
| 1 | 38 | 32 | 25 |
| 2 | 42 | 35 | 28 |
| 3 | 35 | 30 | 22 |
| 4 | 40 | 34 | 26 |
| 5 | 45 | 38 | 30 |
| 6 | 36 | 31 | 24 |
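Step 1: Calculate the condition means and grand mean.

$$\bar{X}_{\text{Pre}} = \frac{236}{6} = 39.33, \qquad \bar{X}_{\text{Mid}} = \frac{200}{6} = 33.33, \qquad \bar{X}_{\text{Post}} = \frac{155}{6} = 25.83$$

$$\bar{X}_{\text{grand}} = \frac{591}{18} = 32.83$$

Step 2: Calculate $SS_{\text{conditions}}$.

$$SS_{\text{conditions}} = n \sum_{j} (\bar{X}_j - \bar{X}_{\text{grand}})^2 = 6\left[(6.50)^2 + (0.50)^2 + (-7.00)^2\right] = 6(91.50) = 549.00$$

Step 3: Calculate $SS_{\text{error}}$.
$SS_{\text{error}}$ is computed as the residual variability after removing both subject effects and condition effects. For each cell, the residual is:

$$e_{ij} = X_{ij} - \bar{X}_j - \bar{P}_i + \bar{X}_{\text{grand}}$$

Where $\bar{P}_i$ is the mean for participant $i$ across all conditions.
Participant means: $\bar{P}_1 = 31.67$, $\bar{P}_2 = 35.00$, $\bar{P}_3 = 29.00$, $\bar{P}_4 = 33.33$, $\bar{P}_5 = 37.67$, $\bar{P}_6 = 30.33$.
Computing the squared residuals and summing yields:

$$SS_{\text{error}} = \sum_{i} \sum_{j} e_{ij}^2 = 3.67$$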
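The residual calculation in Step 3 is tedious by hand, so a short script is a useful cross-check. This is a plain-Python sketch using the table above; the variable names are illustrative.

```python
from statistics import mean

# Rows = participants 1-6, columns = Pre, Mid, Post.
scores = [
    [38, 32, 25],
    [42, 35, 28],
    [35, 30, 22],
    [40, 34, 26],
    [45, 38, 30],
    [36, 31, 24],
]
k = len(scores[0])

grand = mean(x for row in scores for x in row)
cond_means = [mean(row[j] for row in scores) for j in range(k)]
subj_means = [mean(row) for row in scores]

# Residual for each cell: score minus its condition mean and its
# participant mean, plus the grand mean.
residuals = [
    [x - cond_means[j] - subj_means[i] + grand for j, x in enumerate(row)]
    for i, row in enumerate(scores)
]
ss_error = sum(e ** 2 for row in residuals for e in row)

for i, row in enumerate(residuals, start=1):
    print(f"Participant {i}: " + ", ".join(f"{e:+.2f}" for e in row))
print(f"SS_error = {ss_error:.2f}")
```

Each row and each column of the residual table sums to zero, which is a quick way to confirm that both subject and condition effects have been removed.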
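Step 4: Calculate mean squares and the F-ratio.

$$MS_{\text{conditions}} = \frac{549.00}{2} = 274.50, \qquad MS_{\text{error}} = \frac{3.667}{10} = 0.3667$$

$$F = \frac{274.50}{0.3667} = 748.64 \quad \text{(using unrounded values)}$$

Step 5: Determine degrees of freedom and p-value.
With $df_{\text{conditions}} = 2$ and $df_{\text{error}} = 10$, this enormous F-value yields $p < .001$.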
Step 6: Check sphericity.
Suppose Mauchly's test is not significant ($p > .05$). There is then no evidence that sphericity is violated, and no correction is needed.
Step 7: Calculate effect size.

$$\eta_p^2 = \frac{549.00}{549.00 + 3.67} = .993$$

This is an extremely large effect. Test anxiety declined dramatically across the three time points.
Step 8: Post-hoc pairwise comparisons.
With a significant omnibus F, conduct Bonferroni-corrected pairwise comparisons:
- Pre vs. Mid: $t(5) = 16.43$, $p < .001$
- Mid vs. Post: $t(5) = 33.54$, $p < .001$
- Pre vs. Post: $t(5) = 31.53$, $p < .001$
All pairwise comparisons are significant: anxiety decreased significantly at each stage of the intervention.
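The paired t statistics behind such comparisons are simple to compute by hand: the mean difference divided by its standard error. The sketch below does this for each pair of time points in the table above, using only the standard library. The helper name `paired_t` is hypothetical, and p-values are deliberately left out because they require a t-distribution function from a statistics package.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev

conditions = {
    "Pre":  [38, 42, 35, 40, 45, 36],
    "Mid":  [32, 35, 30, 34, 38, 31],
    "Post": [25, 28, 22, 26, 30, 24],
}


def paired_t(a, b):
    """Paired-samples t statistic: mean difference over its standard error."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


pairs = list(combinations(conditions, 2))
t_stats = {
    (a, b): paired_t(conditions[a], conditions[b]) for a, b in pairs
}

# Bonferroni: each test is judged against alpha divided by the number of tests
# (equivalently, each p-value is multiplied by the number of tests).
alpha_per_test = 0.05 / len(pairs)
df = len(conditions["Pre"]) - 1

for (a, b), t in t_stats.items():
    print(f"{a} vs. {b}: t({df}) = {t:.2f}")
print(f"Bonferroni-adjusted alpha per comparison: {alpha_per_test:.4f}")
```

With three comparisons, each unadjusted p-value must fall below about .0167 to be declared significant at an overall level of .05.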
Interpretation
The repeated measures ANOVA revealed a significant effect of time on test anxiety, $F(2, 10) = 748.64$, $p < .001$, $\eta_p^2 = .99$. Anxiety scores decreased from pre-intervention ($M = 39.33$) to mid-intervention ($M = 33.33$) to post-intervention ($M = 25.83$), and every pairwise comparison was statistically significant. The intervention appears to have produced a large and consistent reduction in test anxiety.
What If Sphericity Is Violated?
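If Mauchly's test had been significant, you would report the corrected results. For example, with a Greenhouse-Geisser estimate $\hat{\varepsilon}$:
- Corrected $df_{\text{conditions}} = \hat{\varepsilon} \times 2$
- Corrected $df_{\text{error}} = \hat{\varepsilon} \times 10$
- Report: the unchanged $F$ with the corrected degrees of freedom, and the $p$-value obtained from those corrected degrees of freedom (Greenhouse-Geisser corrected)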
Common Mistakes
- Ignoring sphericity. Always check Mauchly's test and apply a correction (Greenhouse-Geisser or Huynh-Feldt) when it is significant. Failing to correct inflates the Type I error rate.
- Using a one-way ANOVA instead. If the same participants appear in every condition, a between-subjects ANOVA is incorrect because it treats the repeated measurements as independent, violating the independence assumption and wasting statistical power.
- Not conducting post-hoc comparisons. A significant F-test tells you that at least one time point differs, but not which ones. Use Bonferroni-corrected pairwise comparisons or polynomial contrasts to identify specific differences.
- Ignoring missing data. Standard repeated measures ANOVA uses listwise deletion: a participant missing one time point is dropped entirely. Consider mixed-effects models for data with missing observations.
- Over-interpreting a time effect as a treatment effect. If there is no control group, changes over time could reflect maturation, practice effects, or regression to the mean rather than the intervention. A mixed ANOVA (between-within design) with a control group is stronger.
- Reporting partial $\eta^2$ as $\eta^2$. These are different in repeated measures designs. Clearly label which effect size you report.
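To see how far apart the two effect sizes in the last point can be, the sketch below computes both on the worked-example scores. It assumes the usual definitions: $\eta^2$ divides the condition sum of squares by the total sum of squares, while partial $\eta^2$ excludes between-subjects variability from the denominator. Variable names are illustrative.

```python
from statistics import mean

# Rows = participants, columns = Pre, Mid, Post.
scores = [
    [38, 32, 25],
    [42, 35, 28],
    [35, 30, 22],
    [40, 34, 26],
    [45, 38, 30],
    [36, 31, 24],
]
n, k = len(scores), len(scores[0])

grand = mean(x for row in scores for x in row)
cond_means = [mean(row[j] for row in scores) for j in range(k)]
subj_means = [mean(row) for row in scores]

ss_total = sum((x - grand) ** 2 for row in scores for x in row)
ss_conditions = n * sum((m - grand) ** 2 for m in cond_means)
ss_subjects = k * sum((m - grand) ** 2 for m in subj_means)
ss_error = ss_total - ss_conditions - ss_subjects

# Eta-squared: share of ALL variability attributable to conditions.
eta_sq = ss_conditions / ss_total
# Partial eta-squared: subject variability is left out of the denominator.
partial_eta_sq = ss_conditions / (ss_conditions + ss_error)

print(f"eta-squared:         {eta_sq:.3f}")
print(f"partial eta-squared: {partial_eta_sq:.3f}")
```

Partial $\eta^2$ can never be smaller than $\eta^2$ in this design, and the gap widens as participants differ more from one another, so the label matters when readers compare effects across studies.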
How to Run It
R
```r
library(ez)

# Data must be in long format with columns:
# participant, time (factor), score
result <- ezANOVA(
  data = mydata_long,
  dv = .(score),
  wid = .(participant),
  within = .(time),
  detailed = TRUE
)
print(result)
# Includes Mauchly's test and GG/HF corrections

# Post-hoc pairwise comparisons (Bonferroni).
# Rows must be sorted by participant within each time level so that pairs line up.
pairwise.t.test(mydata_long$score, mydata_long$time,
                paired = TRUE, p.adjust.method = "bonferroni")
```
Python
```python
import pingouin as pg

# df_long is assumed to be a pandas DataFrame in long format with columns:
# participant, time, score
aov = pg.rm_anova(
    data=df_long,
    dv='score',
    within='time',
    subject='participant',
    correction=True,  # also report the Greenhouse-Geisser corrected p-value
)
print(aov)

# Sphericity test (Mauchly)
spher = pg.sphericity(
    data=df_long, dv='score',
    within='time', subject='participant',
)
print(spher)

# Post-hoc pairwise comparisons (Bonferroni)
posthoc = pg.pairwise_tests(
    data=df_long, dv='score',
    within='time', subject='participant',
    padjust='bonf',
)
print(posthoc)
```
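Both code examples above expect long-format data, while scores are often recorded one row per participant. The following sketch reshapes wide records into long ones using only the standard library. The record layout and the names `wide`, `time_levels`, and `long_rows` are assumptions for illustration, filled with the worked-example scores.

```python
# Wide format: one record per participant, one field per time point.
wide = [
    {"participant": 1, "Pre": 38, "Mid": 32, "Post": 25},
    {"participant": 2, "Pre": 42, "Mid": 35, "Post": 28},
    {"participant": 3, "Pre": 35, "Mid": 30, "Post": 22},
    {"participant": 4, "Pre": 40, "Mid": 34, "Post": 26},
    {"participant": 5, "Pre": 45, "Mid": 38, "Post": 30},
    {"participant": 6, "Pre": 36, "Mid": 31, "Post": 24},
]
time_levels = ["Pre", "Mid", "Post"]

# Long format: one record per participant-by-time observation.
long_rows = [
    {"participant": row["participant"], "time": level, "score": row[level]}
    for row in wide
    for level in time_levels
]

for record in long_rows[:3]:
    print(record)
print(f"{len(long_rows)} rows in total")
```

Passing `long_rows` to `pandas.DataFrame` should yield a frame with the `participant`, `time`, and `score` columns used in the Python example; with data already in a wide DataFrame, `pandas.melt` performs the same reshaping.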
SPSS
1. Go to Analyze > General Linear Model > Repeated Measures
2. In the dialog, define your within-subjects factor (e.g., name it "Time" with 3 levels) and click Add, then Define
3. Move the three measurement variables (Pre, Mid, Post) into the within-subjects slots
4. Click Options: check Descriptive statistics, Estimates of effect size, and Observed power
5. Check Compare main effects and select Bonferroni as the confidence interval adjustment
6. Click Plots: move your within-subjects factor to the horizontal axis and click Add
7. Click OK
SPSS outputs Mauchly's Test of Sphericity, the Tests of Within-Subjects Effects table (with Sphericity Assumed, Greenhouse-Geisser, and Huynh-Feldt rows), Pairwise Comparisons with Bonferroni adjustment, and partial eta-squared as the effect size.
Excel
Excel does not have a built-in repeated measures ANOVA tool. The Data Analysis ToolPak offers "Anova: Two-Factor Without Replication," which can approximate a repeated measures design:
1. Arrange data so each row is a participant and each column is a condition (Pre, Mid, Post)
2. Go to Data > Data Analysis > Anova: Two-Factor Without Replication
3. Select the data range (including headers) and check Labels
4. Set alpha to 0.05 and click OK
The "Rows" factor represents participants and the "Columns" factor represents your repeated measure. The F and p-value for the Columns factor test the within-subjects effect. Note that this method does not provide Mauchly's test, epsilon corrections, or post-hoc comparisons. For proper repeated measures analysis with sphericity tests, use R, Python, or SPSS.