Paired Samples t-Test
What Is the Paired Samples t-Test?
The paired samples t-test (also called the dependent samples t-test or repeated-measures t-test) compares two means from the same group of participants. Instead of comparing two separate groups, you compare each participant's scores under two different conditions or at two different time points.
The key insight is that this test works with difference scores. For each participant, you calculate the difference between their two measurements, and then you test whether the average of those differences is significantly different from zero.
If the treatment has no effect, the average difference score should be close to zero.
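A minimal sketch of the difference-score idea in Python, using only the standard library. The two lists are hypothetical scores for five participants (they are not data from this article), and the variable names are illustrative.

```python
from statistics import mean

# Hypothetical scores for five participants under two conditions
condition_a = [10, 12, 9, 14, 11]
condition_b = [12, 15, 9, 15, 13]

# One difference score per participant (B minus A)
differences = [b - a for a, b in zip(condition_a, condition_b)]
mean_difference = mean(differences)

print(differences)
print(mean_difference)
```

The test then asks whether `mean_difference` is further from zero than chance alone would plausibly produce.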
When to Use It
Use a paired samples t-test when:
- The same participants are measured at two time points (pre-test/post-test design)
- The same participants are tested under two conditions (crossover or within-subjects design)
- Matched pairs are used — each participant in one condition is matched with a similar participant in the other condition
Examples:
- Blood pressure before and after taking a medication
- Student test scores at the beginning and end of a semester
- Reaction time under caffeine vs. placebo in the same participants
- Anxiety ratings before and after a therapy session
Why not just use an independent t-test? Because the paired t-test accounts for individual differences. Each person serves as their own control, so stable differences between people drop out of the difference scores. When the two measurements are positively correlated, as they usually are, this shrinks the error term and increases statistical power. This is one of the biggest advantages of within-subjects designs.
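To illustrate the power argument, the sketch below computes both a paired t-statistic and a pooled-variance independent t-statistic on the same numbers. The data are hypothetical and chosen so that people differ a lot from each other while each person changes only slightly; equal group sizes are assumed for the pooled variance.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical data: large differences between people, small consistent change within each person
before = [50, 60, 70, 80, 90, 100]
after = [52, 63, 71, 83, 92, 103]
n = len(before)

# Paired t: based on the per-person differences
diffs = [a - b for a, b in zip(after, before)]
t_paired = mean(diffs) / (stdev(diffs) / sqrt(n))

# Independent (pooled-variance) t: treats the two columns as unrelated groups
pooled_var = (variance(before) + variance(after)) / 2  # valid because group sizes are equal
t_independent = (mean(after) - mean(before)) / sqrt(pooled_var * (2 / n))

print(round(t_paired, 2), round(t_independent, 2))
```

With these numbers the paired statistic is about 7.0 while the independent one is about 0.21, because the independent test counts the large between-person spread as noise.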
Assumptions
- Continuous dependent variable. The outcome must be measured on an interval or ratio scale.
- Related samples. The two measurements come from the same participants (or matched pairs). Each observation in one condition has a corresponding observation in the other.
- Normality of difference scores. The differences between paired observations ($D_i$) should be approximately normally distributed. This is less strict than it sounds: with roughly $n \geq 30$ pairs, the test is robust to moderate non-normality. For smaller samples, check with a Shapiro-Wilk test or Q-Q plot of the differences.
- No significant outliers in the difference scores. Extreme outliers in the differences can distort results. Inspect a boxplot of the difference scores.
Note: The paired t-test does not require homogeneity of variances (unlike the independent t-test), because it operates on a single set of difference scores.
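The outlier check can also be done numerically. The sketch below applies the common boxplot convention of flagging values more than 1.5 interquartile ranges beyond the quartiles; that threshold is an assumption here, not something this article specifies, and the helper name `flag_outliers` and the data are hypothetical.

```python
from statistics import quantiles

def flag_outliers(differences, k=1.5):
    """Return the difference scores lying outside the k * IQR fences."""
    q1, _, q3 = quantiles(differences, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [d for d in differences if d < lower or d > upper]

# Hypothetical difference scores with one extreme change
diffs = [-2, -1, 0, -3, -2, -1, -15]
print(flag_outliers(diffs))
```

Note that the check runs on the differences, not on the raw scores at each time point.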
Formula
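$$t = \frac{\bar{D}}{s_D / \sqrt{n}}$$
Where:
- $\bar{D}$ = mean of the difference scores
- $s_D$ = standard deviation of the difference scores
- $n$ = number of pairs
The denominator $s_D / \sqrt{n}$ is the standard error of the mean difference.
Degrees of freedom:
$$df = n - 1$$
The mean difference and its standard deviation are calculated as:
$$\bar{D} = \frac{\sum D_i}{n}, \qquad s_D = \sqrt{\frac{\sum (D_i - \bar{D})^2}{n - 1}}$$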
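As a rough sketch, the formula translates into a few lines of Python. The function name `paired_t` and the four-pair input are hypothetical; differences are taken as second minus first, and the standard deviation uses $n - 1$ in the denominator.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t, df) for paired measurements, with differences taken as second - first."""
    if len(first) != len(second):
        raise ValueError("Paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = mean(diffs)   # mean of the difference scores
    s_d = stdev(diffs)    # sample SD (n - 1 in the denominator)
    se = s_d / sqrt(n)    # standard error of the mean difference
    return d_bar / se, n - 1

# Hypothetical example with four pairs
t, df = paired_t([5, 7, 6, 9], [6, 9, 9, 11])
print(t, df)
```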
Worked Example
Scenario: A health researcher measures resting heart rate (BPM) in 12 participants before and after an 8-week mindfulness meditation program.
| Participant | Pre (BPM) | Post (BPM) | Difference ($D$ = Post - Pre) |
|---|---|---|---|
| 1 | 78 | 72 | -6 |
| 2 | 85 | 80 | -5 |
| 3 | 72 | 70 | -2 |
| 4 | 90 | 82 | -8 |
| 5 | 68 | 68 | 0 |
| 6 | 82 | 76 | -6 |
| 7 | 76 | 74 | -2 |
| 8 | 88 | 79 | -9 |
| 9 | 74 | 72 | -2 |
| 10 | 80 | 75 | -5 |
| 11 | 70 | 69 | -1 |
| 12 | 84 | 78 | -6 |
Step 1: State the hypotheses.
- $H_0$: $\mu_D = 0$ (no change in heart rate after the program)
- $H_1$: $\mu_D \neq 0$ (heart rate changed after the program)
Step 2: Calculate the mean and standard deviation of differences.
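Sum of differences: $\sum D_i = -52$, so $\bar{D} = -52 / 12 = -4.333$.
Sum of squared deviations from the mean: $\sum (D_i - \bar{D})^2 = 90.67$, so $s_D = \sqrt{90.67 / 11} = \sqrt{8.242} = 2.871$.
Step 3: Calculate the t-statistic.
$$t = \frac{\bar{D}}{s_D / \sqrt{n}} = \frac{-4.333}{2.871 / \sqrt{12}} = \frac{-4.333}{0.8288} = -5.23$$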
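The arithmetic in Steps 2 and 3 can be checked with a short script. This is only a verification sketch: the two lists are copied from the table above, and the variable names are illustrative.

```python
from math import sqrt

pre = [78, 85, 72, 90, 68, 82, 76, 88, 74, 80, 70, 84]
post = [72, 80, 70, 82, 68, 76, 74, 79, 72, 75, 69, 78]

d = [b - a for a, b in zip(pre, post)]    # Post - Pre for each participant
n = len(d)
sum_d = sum(d)
mean_d = sum_d / n
ss = sum((x - mean_d) ** 2 for x in d)    # sum of squared deviations from the mean
sd_d = sqrt(ss / (n - 1))                 # standard deviation of the differences
se = sd_d / sqrt(n)                       # standard error of the mean difference
t_stat = mean_d / se

print(sum_d, round(mean_d, 2), round(ss, 2), round(sd_d, 2), round(t_stat, 2))
```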
Step 4: Determine degrees of freedom and find the p-value.
$$df = n - 1 = 12 - 1 = 11$$
For $t = -5.23$ with $df = 11$ (two-tailed): $p < .001$.
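For readers who want to see where the p-value comes from, the sketch below integrates the Student's t density numerically with Simpson's rule, using only the standard library. The function names are hypothetical, and in practice a library routine such as `scipy.stats.t.sf` would normally be used instead of hand-rolled integration.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on the density from 0 to |t|."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # area between -|t| and +|t|
    return 1 - central_area

p = two_tailed_p(-5.23, 11)
print(p)
```

The result falls below .001, in line with the value reported for this step.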
Step 5: Calculate the effect size (Cohen's $d$).
For a paired design, the effect size is:
$$d = \frac{|\bar{D}|}{s_D} = \frac{4.333}{2.871} = 1.51$$
Interpretation
Since $p < .001$, we reject the null hypothesis. There is a statistically significant decrease in resting heart rate after the 8-week mindfulness program.
The mean reduction was 4.33 BPM ($SD = 2.87$), and the effect size is very large ($d = 1.51$). This means the average reduction was about 1.5 standard deviations of the difference scores, a strong and consistent effect across participants.
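The 95% confidence interval for the mean difference:
$$\bar{D} \pm t_{crit} \times \frac{s_D}{\sqrt{n}} = -4.333 \pm 2.201 \times 0.8288 = -4.333 \pm 1.824 = [-6.16, -2.51]$$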
We are 95% confident that the true mean reduction in heart rate is between 2.51 and 6.16 BPM.
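The effect size and confidence interval can be reproduced from the difference scores in the table. In this sketch the critical value 2.201 (two-tailed, $df = 11$, $\alpha = .05$) is taken from a standard t-table and hard-coded, which is an assumption of the example rather than something the code derives.

```python
from math import sqrt
from statistics import mean, stdev

# Difference scores (Post - Pre) from the worked example
d = [-6, -5, -2, -8, 0, -6, -2, -9, -2, -5, -1, -6]
n = len(d)
mean_d = mean(d)
sd_d = stdev(d)

cohens_d = abs(mean_d) / sd_d  # effect size based on the SD of the differences

t_crit = 2.201                 # assumed: two-tailed critical t for df = 11, alpha = .05
margin = t_crit * sd_d / sqrt(n)
ci_low, ci_high = mean_d - margin, mean_d + margin

print(round(cohens_d, 2), round(ci_low, 2), round(ci_high, 2))
```

The interval is negative because the differences are Post - Pre; stated as a reduction, it corresponds to 2.51 to 6.16 BPM.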
Common Mistakes
- Using an independent t-test when data are paired. This is the most common error. If the same people are measured twice, you must use a paired t-test. An independent t-test ignores the within-person correlation and loses statistical power.
- Checking normality of the raw scores instead of the differences. The assumption is that the difference scores are normally distributed, not the individual measurements at each time point.
- Ignoring the direction of subtraction. Be consistent. If you calculate Post - Pre, a negative difference means a decrease. Flipping the direction mid-analysis leads to sign errors.
- Assuming causation in pre/post designs without a control group. A significant pre-post difference could be due to the intervention, practice effects, regression to the mean, or maturation. A control group strengthens causal claims.
- Not reporting the effect size. A significant paired t-test without Cohen's $d$ leaves readers unable to judge practical significance. Always include it.
- Ignoring outliers in the difference scores. One participant with an extreme change can dominate the results. Check for outliers with a boxplot of the $D$ values.
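The point about the direction of subtraction can be made concrete: reversing the order flips the sign of the t-statistic but leaves its magnitude unchanged. The scores below are hypothetical and the helper name is illustrative.

```python
from math import sqrt
from statistics import mean, stdev

def t_from_differences(diffs):
    """t-statistic for a list of difference scores."""
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical pre/post scores for five participants
pre = [20, 18, 25, 22, 19]
post = [17, 17, 21, 20, 18]

t_post_minus_pre = t_from_differences([b - a for a, b in zip(pre, post)])
t_pre_minus_post = t_from_differences([a - b for a, b in zip(pre, post)])

print(t_post_minus_pre, t_pre_minus_post)
```

Either direction is valid as long as the same one is used for the mean difference, the t-statistic, and the confidence interval.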
How to Run It
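R
```r
# Paired t-test (assumes a data frame `mydata` with columns `pre` and `post`)
t.test(mydata$post, mydata$pre, paired = TRUE)
# Effect size (Cohen's d for paired data)
library(effsize)
cohen.d(mydata$post, mydata$pre, paired = TRUE)
```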
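Python
```python
from scipy import stats
import pingouin as pg

# Worked-example data (resting heart rate in BPM)
pre_scores = [78, 85, 72, 90, 68, 82, 76, 88, 74, 80, 70, 84]
post_scores = [72, 80, 70, 82, 68, 76, 74, 79, 72, 75, 69, 78]

# Using scipy: passing post first gives differences as Post - Pre
t_stat, p_value = stats.ttest_rel(post_scores, pre_scores)
print(t_stat, p_value)

# Using pingouin (the output table also includes an effect size, which may be
# standardized differently from the mean(D) / SD(D) used in the worked example)
result = pg.ttest(post_scores, pre_scores, paired=True)
print(result)
```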
SPSS
- Go to Analyze > Compare Means > Paired-Samples T Test
- Select your two measurement variables (e.g., Pre and Post) and move them into the Paired Variables box as a pair
- Click OK
SPSS outputs the means of each measurement, the correlation between them, the mean difference, the t-statistic, degrees of freedom, and the p-value (two-tailed).
Excel
Use the T.TEST function:
=T.TEST(array1, array2, tails, type)
- array1: range of pre-test scores
- array2: range of post-test scores
- tails: 2 (two-tailed)
- type: 1 (paired)
Example: =T.TEST(A2:A13, B2:B13, 2, 1)
T.TEST returns only the p-value. For the full output, use Data > Data Analysis > t-Test: Paired Two Sample for Means (this requires the Analysis ToolPak add-in).