Independent Samples t-Test
What Is the Independent Samples t-Test?
The independent samples t-test (also called the two-sample t-test or Student's t-test) compares the means of two separate groups to determine whether the difference between them is statistically significant. The groups must be independent — meaning the participants in one group are completely different people from those in the other group.
This is one of the most commonly used statistical tests in the social and behavioral sciences. Any time you randomly assign participants to two conditions (treatment vs. control, method A vs. method B), the independent t-test is your go-to analysis.
When to Use It
Use an independent samples t-test when:
- You have one continuous dependent variable (e.g., test scores, reaction time, blood pressure)
- You have one categorical independent variable with exactly two groups (e.g., treatment vs. control)
- The two groups contain different participants (not the same people measured twice — that's a paired t-test)
Examples:
- Comparing exam scores between students who used a study app vs. those who didn't
- Comparing anxiety levels between a therapy group and a waitlist control group
- Comparing salary between two departments
If you have three or more groups, use a one-way ANOVA instead.
Assumptions
The independent samples t-test requires four assumptions:
- **Independence of observations.** Each participant's score is unrelated to every other participant's score. Violated if participants are clustered (e.g., students within classrooms).
- **Continuous dependent variable.** The outcome must be measured on an interval or ratio scale.
- **Approximate normality.** The dependent variable should be approximately normally distributed within each group. With sample sizes above 30 per group, the t-test is robust to moderate violations due to the Central Limit Theorem.
- **Homogeneity of variances.** The variability in each group should be roughly equal. Test this with Levene's test. If variances are unequal, use Welch's t-test instead (most software offers this as an option).
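The variance check and the fallback to Welch's test can be scripted in a few lines. This is a minimal sketch using SciPy with simulated data (the group names, seed, and distribution parameters are illustrative, not from the example above):

```python
import numpy as np
from scipy import stats

# Simulated scores for two independent groups (illustrative values)
rng = np.random.default_rng(42)
group_a = rng.normal(loc=74.5, scale=8.2, size=20)
group_b = rng.normal(loc=68.0, scale=9.1, size=20)

# Levene's test: null hypothesis is that the variances are equal
levene_stat, levene_p = stats.levene(group_a, group_b)

# If Levene's test is significant, switch to Welch's t-test
if levene_p < 0.05:
    result = stats.ttest_ind(group_a, group_b, equal_var=False)  # Welch
else:
    result = stats.ttest_ind(group_a, group_b, equal_var=True)   # Student
```

The `equal_var` flag is how SciPy's `ttest_ind` toggles between Student's and Welch's versions of the test.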
Formula
Student's t-test (equal variances assumed)

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

Where the pooled variance is:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Degrees of freedom:

$$df = n_1 + n_2 - 2$$

Welch's t-test (equal variances NOT assumed)

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Degrees of freedom (Welch-Satterthwaite approximation):

$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$
Tip: Many statisticians recommend always using Welch's t-test. It performs as well as Student's t when variances are equal, and much better when they are not.
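Both formulas translate directly from summary statistics. A sketch implementing them side by side (function names are mine, not from any library):

```python
import math

def student_t(m1, s1, n1, m2, s2, n2):
    """Student's t from summary statistics (equal variances assumed)."""
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    t = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    df = n1 + n2 - 2
    return t, df

def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's t with Welch-Satterthwaite degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

print(student_t(74.5, 8.2, 20, 68.0, 9.1, 20))  # t ~ 2.37, df = 38
```

With equal group sizes and similar variances, the two versions return nearly identical t-values; they diverge when group sizes and variances are both unequal.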
Worked Example
Scenario: A sports psychologist investigates whether visualization training improves free-throw accuracy in basketball players. She randomly assigns 20 players to a visualization group and 20 to a control group. After 4 weeks, she measures their free-throw accuracy (percentage made out of 50 attempts).
| | Visualization Group | Control Group |
|---|---|---|
| *n* | 20 | 20 |
| *M* | 74.5 | 68.0 |
| *SD* | 8.2 | 9.1 |
Step 1: State the hypotheses.
- $H_0$: $\mu_1 = \mu_2$ (no difference in free-throw accuracy)
- $H_1$: $\mu_1 \neq \mu_2$ (there is a difference)
Step 2: Calculate the pooled variance.

$$s_p^2 = \frac{(20 - 1)(8.2)^2 + (20 - 1)(9.1)^2}{20 + 20 - 2} = \frac{19(67.24) + 19(82.81)}{38} = 75.03$$

$$s_p = \sqrt{75.03} = 8.66$$
Step 3: Calculate the t-statistic.

$$t = \frac{74.5 - 68.0}{8.66\sqrt{\dfrac{1}{20} + \dfrac{1}{20}}} = \frac{6.5}{2.74} = 2.37$$
Step 4: Determine degrees of freedom and find the p-value.

$$df = 20 + 20 - 2 = 38$$

Looking up $t = 2.37$ with $df = 38$ in a t-table (two-tailed): $p = .023$.
Step 5: Calculate the effect size (Cohen's d).

$$d = \frac{74.5 - 68.0}{8.66} = 0.75$$
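Steps 2 through 5 can be reproduced in SciPy directly from the summary statistics, which is a convenient sanity check on the hand calculation:

```python
import math
from scipy import stats

# t-test from summary statistics alone (no raw data needed)
res = stats.ttest_ind_from_stats(
    mean1=74.5, std1=8.2, nobs1=20,
    mean2=68.0, std2=9.1, nobs2=20,
    equal_var=True,   # Student's t; set False for Welch's
)

# Cohen's d = mean difference / pooled standard deviation
sp = math.sqrt((19 * 8.2**2 + 19 * 9.1**2) / 38)
d = (74.5 - 68.0) / sp

print(res.statistic, res.pvalue, d)  # t ~ 2.37, p ~ .023, d ~ 0.75
```

`ttest_ind_from_stats` exists precisely for this situation: replicating a published result when only means, SDs, and sample sizes are reported.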
Interpretation
Since *p* = .023 < .05, we reject the null hypothesis. There is a statistically significant difference in free-throw accuracy between the visualization group (*M* = 74.5, *SD* = 8.2) and the control group (*M* = 68.0, *SD* = 9.1).
The effect size of *d* = 0.75 indicates a medium-to-large effect. Players who practiced visualization scored about three-quarters of a standard deviation higher in free-throw accuracy than those who did not.
What to consider:
- A significant result means the difference is unlikely due to chance alone — it does not tell you the difference is large or practically important. That's what the effect size is for.
- Always examine the confidence interval for the mean difference to understand the plausible range: 95% CI [0.97, 12.03].
- The 95% CI does not include zero, consistent with the significant p-value.
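The confidence interval follows from the same quantities computed above. A sketch of the calculation; the endpoints land within a few hundredths of the interval reported in the text, with small differences due to rounding at intermediate steps:

```python
import math
from scipy import stats

n1 = n2 = 20
mean_diff = 74.5 - 68.0
sp2 = (19 * 8.2**2 + 19 * 9.1**2) / 38     # pooled variance from Step 2
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))    # standard error of the difference
t_crit = stats.t.ppf(0.975, df=38)         # two-tailed 95% critical value

lower = mean_diff - t_crit * se
upper = mean_diff + t_crit * se
print(lower, upper)  # roughly 0.95 to 12.05
```

Because zero lies outside this interval, the conclusion matches the significant p-value, as noted above.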
Common Mistakes
- **Using an independent t-test when the data are paired.** If the same participants are measured before and after an intervention, use a paired t-test. Using the wrong test ignores the correlation between paired scores, which inflates the standard error and reduces power.
- **Ignoring the equal variances assumption.** If Levene's test is significant (*p* < .05), the variances are unequal. Report Welch's t-test instead of Student's t-test.
- **Not checking normality.** Examine histograms or Q-Q plots for each group. With fewer than about 30 participants per group, severe skewness or outliers can invalidate results. Consider the Mann-Whitney U test as a non-parametric alternative.
- **Reporting only the p-value.** Always report the means, standard deviations, t-statistic, degrees of freedom, p-value, and effect size. A complete result tells the whole story.
- **Conducting multiple t-tests instead of ANOVA.** If you have three or more groups, do not run all pairwise t-tests — this inflates the family-wise error rate. Use a one-way ANOVA with post-hoc tests.
- **Confusing statistical significance with practical significance.** With a large sample, even tiny differences can be statistically significant. Always evaluate the effect size and confidence interval.
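When normality is in doubt, the Mann-Whitney U test mentioned above is a one-line swap in SciPy. A sketch with simulated skewed data (the distributions and seed are illustrative only):

```python
import numpy as np
from scipy import stats

# Hypothetical right-skewed scores where a t-test would be questionable
rng = np.random.default_rng(7)
group_a = rng.exponential(scale=10, size=18) + 60
group_b = rng.exponential(scale=10, size=18) + 55

# Mann-Whitney U: rank-based, no normality assumption
u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
```

Note that Mann-Whitney compares distributions via ranks rather than means, so its conclusion is about stochastic ordering, not the mean difference itself.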
How to Report in APA Format
An independent samples t-test was conducted to compare free-throw accuracy between the visualization training group and the control group. There was a significant difference in scores for the visualization group (*M* = 74.5, *SD* = 8.2) and the control group (*M* = 68.0, *SD* = 9.1), *t*(38) = 2.37, *p* = .023, *d* = 0.75, 95% CI [0.97, 12.03]. These results suggest that visualization training has a meaningful positive effect on free-throw performance.
If Welch's correction was used:
A Welch's independent samples t-test indicated that the visualization group (*M* = 74.5, *SD* = 8.2) scored significantly higher than the control group (*M* = 68.0, *SD* = 9.1), *t*(37.46) = 2.37, *p* = .023, *d* = 0.75.
Ready to calculate?
Now that you understand the concept, use the free Effect Size Calculator on Subthesis to run your own analysis.
Related Concepts
Paired Samples t-Test
Learn how to conduct a paired samples t-test for pre/post designs and repeated measures, with formulas, worked examples, and APA reporting format.
Effect Size
Learn what effect size is, why it matters more than p-values alone, and how to calculate and interpret Cohen's d, Hedges' g, and eta-squared for your research.
One-Way ANOVA
Learn how to conduct a one-way ANOVA to compare three or more group means, including F-ratio formulas, post-hoc tests, and effect size with eta-squared.
Statistical Power & Power Analysis
Learn what statistical power is, why 80% is the standard threshold, and how to conduct a power analysis to determine if your study can detect real effects.