Stats for Scholars

Repeated Measures ANOVA

Purpose
Tests whether the mean of a continuous variable differs significantly across three or more related conditions or time points measured on the same participants.
When to Use
When the same participants are measured on a continuous outcome under three or more conditions or at three or more time points.
Data Type
One continuous dependent variable measured at 3+ levels of a within-subjects factor
Key Assumptions
Normality of differences, no significant outliers, sphericity (equal variances of differences between all pairs of conditions), tested with Mauchly's test.
Tools
Effect Size Calculator on Subthesis →

What Is Repeated Measures ANOVA?

Repeated measures ANOVA is an extension of the paired-samples t-test to three or more related conditions. It tests whether the mean of a continuous outcome differs across multiple time points, treatments, or conditions when the same participants are measured in all conditions.

Like the one-way ANOVA, it produces an F-ratio. However, whereas the one-way ANOVA compares independent groups (between-subjects), repeated measures ANOVA compares related measurements (within-subjects). Because each participant serves as their own control, this design is more powerful — individual differences are removed from the error term.

The F-ratio for a repeated measures design is:

$$F = \frac{MS_{condition}}{MS_{error}}$$

Where $MS_{error}$ reflects variability in how participants respond differently across conditions (the condition $\times$ subjects interaction), rather than overall within-group variability.

When to Use It

Use repeated measures ANOVA when:

  • The same participants are measured under three or more conditions or at three or more time points.
  • Your dependent variable is continuous (interval or ratio scale).
  • You want to determine whether there is a statistically significant change across the conditions.

Examples:

  • Measuring anxiety scores before, during, and after an intervention
  • Comparing reaction times under easy, medium, and hard task conditions
  • Testing pain levels at 0, 2, 4, and 8 weeks post-surgery

If you have only two time points, use a paired-samples t-test. If different participants are in each group, use a one-way ANOVA.

Assumptions

  1. Continuous dependent variable. The outcome must be measured at the interval or ratio level.

  2. Related groups. The same participants provide data for all levels of the within-subjects factor (or the groups are matched).

  3. No significant outliers. Extreme values in any condition can distort results. Check boxplots for each condition.

  4. Normality. The dependent variable should be approximately normally distributed at each time point. With sample sizes above 20-30, repeated measures ANOVA is fairly robust to this.

  5. Sphericity. This is the critical assumption unique to repeated measures designs. Sphericity requires that the variances of the differences between all pairs of conditions are equal. For example, with three conditions (A, B, C), the variance of (A - B) should equal the variance of (A - C) and the variance of (B - C).
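Sphericity can be checked informally by computing the variance of each pairwise difference directly. A minimal NumPy sketch, using the six-participant anxiety scores from the worked example below:

```python
import numpy as np

# Scores for 6 participants under three conditions (Pre, Mid, Post)
scores = np.array([
    [38, 32, 25],
    [42, 35, 28],
    [35, 30, 22],
    [40, 34, 26],
    [45, 38, 30],
    [36, 31, 24],
])
a, b, c = scores.T

# Sphericity holds when these three variances are (roughly) equal
var_ab = np.var(a - b, ddof=1)  # variance of (Pre - Mid)
var_ac = np.var(a - c, ddof=1)  # variance of (Pre - Post)
var_bc = np.var(b - c, ddof=1)  # variance of (Mid - Post)
print(var_ab, var_ac, var_bc)
```

If the three variances diverge noticeably, expect Mauchly's test to flag a violation.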

Testing Sphericity: Mauchly's Test

Mauchly's test evaluates whether the sphericity assumption holds:

  • $H_0$: Sphericity is met (variances of differences are equal)
  • $H_1$: Sphericity is violated

If Mauchly's test is significant ($p < .05$), sphericity is violated and corrections are needed:

  • Greenhouse-Geisser correction ($\epsilon_{GG}$): Reduces the degrees of freedom to correct for the violation. More conservative. Use when $\epsilon < .75$.
  • Huynh-Feldt correction ($\epsilon_{HF}$): Less conservative than Greenhouse-Geisser. Use when $\epsilon \geq .75$.

The epsilon ($\epsilon$) value ranges from $\frac{1}{k-1}$ (maximum violation) to 1.0 (perfect sphericity), where $k$ is the number of conditions.
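The Greenhouse-Geisser epsilon can be estimated from the sample covariance matrix of the conditions. A sketch of the standard estimator (double-center the covariance matrix, then compare the squared trace to the sum of squared entries), shown here on the worked-example data:

```python
import numpy as np

def gg_epsilon(scores):
    """Greenhouse-Geisser epsilon for an n-by-k array of repeated measures."""
    k = scores.shape[1]
    S = np.cov(scores, rowvar=False)      # k x k sample covariance
    C = np.eye(k) - np.ones((k, k)) / k   # centering matrix
    Sc = C @ S @ C                        # double-centered covariance
    return np.trace(Sc) ** 2 / ((k - 1) * np.sum(Sc * Sc))

# Epsilon is bounded by 1/(k-1) (worst violation) and 1.0 (perfect sphericity)
scores = np.array([
    [38, 32, 25], [42, 35, 28], [35, 30, 22],
    [40, 34, 26], [45, 38, 30], [36, 31, 24],
])
eps = gg_epsilon(scores)
print(eps)
```

Statistical packages (SPSS, `ez`, pingouin) report this value automatically; the sketch is only to show where it comes from.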

Formula

Partitioning Variance

In a repeated measures design, total variability is decomposed as:

$$SS_{total} = SS_{between\text{-}subjects} + SS_{within\text{-}subjects}$$

The within-subjects variability is further decomposed:

$$SS_{within\text{-}subjects} = SS_{condition} + SS_{error}$$

Where:

  • $SS_{condition}$ reflects differences among the condition means
  • $SS_{error}$ reflects the condition $\times$ subjects interaction (how inconsistently participants respond across conditions)

Degrees of Freedom

  • $df_{condition} = k - 1$ (where $k$ = number of conditions)
  • $df_{error} = (k - 1)(n - 1)$ (where $n$ = number of participants)

Mean Squares and F-Ratio

$$MS_{condition} = \frac{SS_{condition}}{k - 1}$$

$$MS_{error} = \frac{SS_{error}}{(k-1)(n-1)}$$

$$F = \frac{MS_{condition}}{MS_{error}}$$

Effect Size: Partial Eta-Squared

$$\eta_p^2 = \frac{SS_{condition}}{SS_{condition} + SS_{error}}$$

| $\eta_p^2$ | Interpretation |
|------------|----------------|
| .01        | Small          |
| .06        | Medium         |
| .14        | Large          |

Greenhouse-Geisser Corrected Degrees of Freedom

When sphericity is violated, multiply the degrees of freedom by $\epsilon_{GG}$:

$$df_{condition}^* = \epsilon_{GG} \times (k - 1)$$

$$df_{error}^* = \epsilon_{GG} \times (k - 1)(n - 1)$$

The F-value itself does not change — only the degrees of freedom (and hence the p-value) change.
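To see that only the degrees of freedom (and hence the p-value) change, evaluate the same F-value against both the uncorrected and corrected F distributions. A sketch with hypothetical numbers ($k = 3$, $n = 10$, $F = 12.5$, and $\epsilon_{GG} = 0.68$ are all made up for illustration):

```python
from scipy.stats import f

k, n, F_val = 3, 10, 12.5   # hypothetical design and F-value
eps = 0.68                  # hypothetical Greenhouse-Geisser epsilon

df1, df2 = k - 1, (k - 1) * (n - 1)
p_uncorrected = f.sf(F_val, df1, df2)
p_corrected = f.sf(F_val, eps * df1, eps * df2)  # same F, fewer df

print(p_uncorrected, p_corrected)  # corrected p is larger (more conservative)
```

Shrinking both degrees of freedom makes the reference distribution heavier-tailed, so the corrected p-value is always the more conservative of the two.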

Worked Example

Scenario: A clinical psychologist measures test anxiety scores (0-50 scale) in $n = 6$ students at three time points: before an intervention (Pre), at the midpoint (Mid), and after the intervention (Post).

| Participant | Pre ($T_1$) | Mid ($T_2$) | Post ($T_3$) |
|-------------|-------------|-------------|--------------|
| 1           | 38          | 32          | 25           |
| 2           | 42          | 35          | 28           |
| 3           | 35          | 30          | 22           |
| 4           | 40          | 34          | 26           |
| 5           | 45          | 38          | 30           |
| 6           | 36          | 31          | 24           |

Step 1: Calculate the condition means and grand mean.

  • $\bar{X}_{Pre} = \frac{38+42+35+40+45+36}{6} = 39.33$
  • $\bar{X}_{Mid} = \frac{32+35+30+34+38+31}{6} = 33.33$
  • $\bar{X}_{Post} = \frac{25+28+22+26+30+24}{6} = 25.83$
  • $\bar{X}_{grand} = \frac{39.33 + 33.33 + 25.83}{3} = 32.83$

Step 2: Calculate $SS_{condition}$.

$$SS_{condition} = n \sum_{j=1}^{k} (\bar{X}_j - \bar{X}_{grand})^2$$

$$= 6[(39.33-32.83)^2 + (33.33-32.83)^2 + (25.83-32.83)^2]$$

$$= 6[42.25 + 0.25 + 49.00] = 6 \times 91.50 = 549.00$$
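The Step 2 arithmetic can be checked in a few lines of NumPy:

```python
import numpy as np

scores = np.array([
    [38, 32, 25], [42, 35, 28], [35, 30, 22],
    [40, 34, 26], [45, 38, 30], [36, 31, 24],
], dtype=float)
n, k = scores.shape

cond_means = scores.mean(axis=0)   # per-condition means
grand_mean = scores.mean()         # grand mean of all 18 scores
ss_condition = n * np.sum((cond_means - grand_mean) ** 2)
print(round(ss_condition, 2))  # 549.0
```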

Step 3: Calculate $SS_{error}$.

$SS_{error}$ is computed as the residual variability after removing both subject effects and condition effects. For each cell, the residual is:

$$e_{ij} = X_{ij} - \bar{X}_j - \bar{P}_i + \bar{X}_{grand}$$

Where $\bar{P}_i$ is the mean for participant $i$ across all conditions.

Participant means: $\bar{P}_1 = 31.67$, $\bar{P}_2 = 35.00$, $\bar{P}_3 = 29.00$, $\bar{P}_4 = 33.33$, $\bar{P}_5 = 37.67$, $\bar{P}_6 = 30.33$.

Computing the squared residuals and summing yields:

$$SS_{error} = 3.67$$
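Step 3 can be verified numerically by double-centering the data matrix (subtract the condition means and participant means, add back the grand mean) and summing the squared residuals:

```python
import numpy as np

scores = np.array([
    [38, 32, 25], [42, 35, 28], [35, 30, 22],
    [40, 34, 26], [45, 38, 30], [36, 31, 24],
], dtype=float)

cond_means = scores.mean(axis=0)                  # condition means
subj_means = scores.mean(axis=1, keepdims=True)   # participant means
grand_mean = scores.mean()

resid = scores - cond_means - subj_means + grand_mean
ss_error = np.sum(resid ** 2)
print(round(ss_error, 2))  # 3.67
```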

Step 4: Calculate mean squares and the F-ratio.

$$MS_{condition} = \frac{549.00}{3-1} = \frac{549.00}{2} = 274.50$$

$$MS_{error} = \frac{3.67}{(3-1)(6-1)} = \frac{3.67}{10} = 0.367$$

$$F = \frac{274.50}{0.367} \approx 748.64$$

Step 5: Determine degrees of freedom and p-value.

With $df_1 = 2$ and $df_2 = 10$, this enormous F-value yields $p < .001$.
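Steps 4-5 in code, using `scipy.stats.f` for the p-value (sums of squares taken from Steps 2-3):

```python
from scipy.stats import f

ss_condition, ss_error = 549.0, 11 / 3   # SS values from Steps 2-3
k, n = 3, 6

ms_condition = ss_condition / (k - 1)
ms_error = ss_error / ((k - 1) * (n - 1))
F_val = ms_condition / ms_error
p = f.sf(F_val, k - 1, (k - 1) * (n - 1))  # upper-tail probability

print(round(F_val, 2), p)
```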

Step 6: Check sphericity.

Suppose Mauchly's test gives $W = 0.89$, $p = .42$. Since $p > .05$, sphericity is not violated and no correction is needed.

Step 7: Calculate effect size.

$$\eta_p^2 = \frac{549.00}{549.00 + 3.67} = \frac{549.00}{552.67} = .993$$
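As a small reusable helper, with the worked-example values plugged in:

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta-squared: share of effect-plus-error variance due to the effect."""
    return ss_effect / (ss_effect + ss_error)

# Worked-example values; .01 / .06 / .14 are the conventional
# small / medium / large benchmarks
eta = partial_eta_squared(549.0, 11 / 3)
print(round(eta, 3))  # 0.993
```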

This is an extremely large effect. Test anxiety declined dramatically across the three time points.

Step 8: Post-hoc pairwise comparisons.

With a significant omnibus F, conduct Bonferroni-corrected pairwise comparisons:

  • Pre vs. Mid: $\bar{X}_{Pre} - \bar{X}_{Mid} = 6.00$, $p < .001$
  • Mid vs. Post: $\bar{X}_{Mid} - \bar{X}_{Post} = 7.50$, $p < .001$
  • Pre vs. Post: $\bar{X}_{Pre} - \bar{X}_{Post} = 13.50$, $p < .001$

All pairwise comparisons are significant — anxiety decreased significantly at each stage of the intervention.
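The Bonferroni-corrected comparisons can be reproduced with `scipy.stats.ttest_rel`, multiplying each p-value by the number of comparisons:

```python
import numpy as np
from scipy.stats import ttest_rel

scores = np.array([
    [38, 32, 25], [42, 35, 28], [35, 30, 22],
    [40, 34, 26], [45, 38, 30], [36, 31, 24],
])
pre, mid, post = scores.T

pairs = [("Pre vs Mid", pre, mid),
         ("Mid vs Post", mid, post),
         ("Pre vs Post", pre, post)]
m = len(pairs)  # Bonferroni: multiply each p by the number of comparisons

results = {}
for name, x, y in pairs:
    t, p = ttest_rel(x, y)
    results[name] = min(p * m, 1.0)  # cap corrected p at 1.0
    print(f"{name}: mean diff = {np.mean(x - y):.2f}, "
          f"corrected p = {results[name]:.5f}")
```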

Interpretation

The repeated measures ANOVA revealed a significant effect of time on test anxiety, $F(2, 10) = 748.64$, $p < .001$, $\eta_p^2 = .99$. Anxiety scores decreased from pre-intervention ($M = 39.33$) to mid-intervention ($M = 33.33$) to post-intervention ($M = 25.83$), and every pairwise comparison was statistically significant. The intervention appears to have produced a large and consistent reduction in test anxiety.

What If Sphericity Is Violated?

If Mauchly's test had been significant, you would report the corrected results. For example, with $\epsilon_{GG} = 0.68$:

  • Corrected $df_1 = 0.68 \times 2 = 1.36$
  • Corrected $df_2 = 0.68 \times 10 = 6.80$
  • Report: $F(1.36, 6.80) = 748.64$, $p < .001$ (Greenhouse-Geisser corrected)

Common Mistakes

  1. Ignoring sphericity. Always check Mauchly's test and apply a correction (Greenhouse-Geisser or Huynh-Feldt) when it is significant. Failing to correct inflates the Type I error rate.

  2. Using a one-way ANOVA instead. If the same participants appear in every condition, a between-subjects ANOVA is incorrect because it treats the repeated measurements as independent, violating the independence assumption and wasting statistical power.

  3. Not conducting post-hoc comparisons. A significant F-test tells you that at least one time point differs, but not which ones. Use Bonferroni-corrected pairwise comparisons or polynomial contrasts to identify specific differences.

  4. Ignoring missing data. Standard repeated measures ANOVA uses listwise deletion — a participant missing one time point is dropped entirely. Consider mixed-effects models for data with missing observations.

  5. Over-interpreting a time effect as a treatment effect. If there is no control group, changes over time could reflect maturation, practice effects, or regression to the mean rather than the intervention. A mixed ANOVA (between-within design) with a control group is stronger.

  6. Reporting partial $\eta^2$ as $\eta^2$. These are different quantities in repeated measures designs. Clearly label which effect size you report.

How to Run It

In R:

```r
# Repeated measures ANOVA in R using the ez package
library(ez)

# Data must be in long format with columns:
# participant, time (factor), score
result <- ezANOVA(
  data = mydata_long,
  dv = .(score),
  wid = .(participant),
  within = .(time),
  detailed = TRUE
)
print(result)  # includes Mauchly's test and GG/HF corrections

# Post-hoc pairwise comparisons (Bonferroni)
pairwise.t.test(mydata_long$score, mydata_long$time,
                paired = TRUE, p.adjust.method = "bonferroni")
```

In Python:

```python
import pingouin as pg

# Data must be in long format with columns:
# participant, time, score
aov = pg.rm_anova(
    data=df_long, dv='score', within='time', subject='participant',
    correction=True  # applies GG correction if needed
)
print(aov)

# Sphericity test
spher = pg.sphericity(data=df_long, dv='score', within='time',
                      subject='participant')
print(spher)

# Post-hoc pairwise comparisons
posthoc = pg.pairwise_tests(data=df_long, dv='score', within='time',
                            subject='participant', padjust='bonf')
print(posthoc)
```
In SPSS:

  1. Go to Analyze > General Linear Model > Repeated Measures
  2. In the dialog, define your within-subjects factor (e.g., name it "Time" with 3 levels) and click Add, then Define
  3. Move the three measurement variables (Pre, Mid, Post) into the within-subjects slots
  4. Click Options: check Descriptive statistics, Estimates of effect size, and Observed power
  5. Click Compare main effects and select Bonferroni as the confidence interval adjustment
  6. Click Plots: move your within-subjects factor to the horizontal axis and click Add
  7. Click OK

SPSS outputs Mauchly's Test of Sphericity, the Tests of Within-Subjects Effects table (with Sphericity Assumed, Greenhouse-Geisser, and Huynh-Feldt rows), Pairwise Comparisons with Bonferroni adjustment, and partial eta-squared as the effect size.

Excel does not have a built-in repeated measures ANOVA tool. The Data Analysis ToolPak offers "Anova: Two-Factor Without Replication," which can approximate a repeated measures design:

  1. Arrange data so each row is a participant and each column is a condition (Pre, Mid, Post)
  2. Go to Data > Data Analysis > Anova: Two-Factor Without Replication
  3. Select the data range (including headers)
  4. Set alpha to 0.05 and click OK

The "Rows" factor represents participants and the "Columns" factor represents your repeated measure. The F and p-value for the Columns factor test the within-subjects effect. Note that this method does not provide Mauchly's test, epsilon corrections, or post-hoc comparisons. For proper repeated measures analysis with sphericity tests, use R, Python, or SPSS.
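The R and Python examples above expect long-format data (one row per participant per time point). If your data are wide (one column per time point), a quick pandas reshape gets you there; `mydata_wide` and its column names here are hypothetical:

```python
import pandas as pd

# Hypothetical wide-format data: one column per time point
mydata_wide = pd.DataFrame({
    "participant": [1, 2, 3],
    "Pre":  [38, 42, 35],
    "Mid":  [32, 35, 30],
    "Post": [25, 28, 22],
})

# Melt to long format: one row per participant per time point
mydata_long = mydata_wide.melt(id_vars="participant",
                               var_name="time", value_name="score")
print(mydata_long)
```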

## How to Report in APA Format

> A one-way repeated measures ANOVA was conducted to compare test anxiety scores across three time points (pre-intervention, mid-intervention, and post-intervention). Mauchly's test indicated that the assumption of sphericity was met, $W = 0.89$, $p = .42$. There was a statistically significant effect of time on test anxiety, $F(2, 10) = 748.64$, $p < .001$, $\eta_p^2 = .99$. Bonferroni-corrected post-hoc comparisons revealed significant decreases from pre- to mid-intervention ($M_{diff} = 6.00$, $p < .001$), from mid- to post-intervention ($M_{diff} = 7.50$, $p < .001$), and from pre- to post-intervention ($M_{diff} = 13.50$, $p < .001$).

If sphericity were violated, report the corrected values:

> Mauchly's test indicated that the assumption of sphericity was violated, $\chi^2(2) = 7.34$, $p = .026$. Therefore, degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity ($\epsilon = .68$). The results showed a significant effect of time, $F(1.36, 6.80) = 748.64$, $p < .001$, $\eta_p^2 = .99$.

Key elements to include:

- Mauchly's test result (and correction used if sphericity is violated)
- The F-statistic with corrected degrees of freedom if applicable
- Partial eta-squared ($\eta_p^2$) as effect size
- Condition means and standard deviations
- Post-hoc pairwise comparisons with correction method

Ready to calculate?

Now that you understand the concept, use the free Effect Size Calculator on Subthesis to run your own analysis.

Calculate Effect Size for Your ANOVA on Subthesis

Related Concepts

Paired Samples t-Test

Learn how to conduct a paired samples t-test for pre/post designs and repeated measures, with formulas, worked examples, and APA reporting format.

One-Way ANOVA

Learn how to conduct a one-way ANOVA to compare three or more group means, including F-ratio formulas, post-hoc tests, and effect size with eta-squared.

Effect Size

Learn what effect size is, why it matters more than p-values alone, and how to calculate and interpret Cohen's d, Hedges' g, and eta-squared for your research.


© 2026 Angel Reyes / Subthesis. All rights reserved.
