Statistical Power & Power Analysis
What Is Statistical Power?
Statistical power is the probability that your study will correctly reject the null hypothesis when it is actually false. In other words, power is the likelihood of detecting a real effect if one truly exists.
Formally, power equals $1 - \beta$, where $\beta$ is the probability of a Type II error — failing to detect a real effect. If your study has power of 0.80, there is an 80% chance you will find a statistically significant result when the effect is real, and a 20% chance you will miss it.
A study with low power is like a metal detector with weak batteries — the gold might be there, but you won't find it.
When to Use It
Power analysis is used in two contexts:
- A priori (before data collection). This is the most important use. You estimate the required sample size so your study has adequate power. Most IRBs, funding agencies, and dissertation committees require this.
- Post hoc (after data collection). Sometimes used to interpret a non-significant result — but this is controversial. Many statisticians argue that observed power adds little beyond the p-value itself.
Always conduct an a priori power analysis when designing a study. It prevents two costly mistakes:
- Too few participants: You waste time and resources on a study that cannot detect the effect.
- Too many participants: You recruit more participants than needed, wasting resources and potentially exposing more people to experimental conditions unnecessarily.
The Four Components of Power
Statistical power is governed by four interconnected quantities. If you know any three, you can solve for the fourth:
| Component | Symbol | Description |
|---|---|---|
| Effect size | $d$, $f$, $r$, etc. | The magnitude of the effect you expect to find |
| Sample size | $n$ or $N$ | The number of participants in your study |
| Significance level | $\alpha$ | The probability of a Type I error (typically .05) |
| Power | $1 - \beta$ | The probability of detecting a true effect |
How they relate:
- Larger effect size → higher power. Big effects are easier to detect.
- Larger sample size → higher power. More data means more precision.
- Higher alpha → higher power. A more lenient threshold makes it easier to reject (but increases Type I error risk).
- Lower variability → higher power. Less noise in your data makes the signal clearer.
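These relationships are easy to verify numerically. A minimal sketch using the `statsmodels` power module (the baseline values are arbitrary illustrations): each change below should raise power relative to the baseline.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Baseline scenario: d = 0.5, n = 50 per group, alpha = .05
base = analysis.power(effect_size=0.5, nobs1=50, alpha=0.05)

# Vary one component at a time
bigger_effect = analysis.power(effect_size=0.8, nobs1=50, alpha=0.05)
bigger_sample = analysis.power(effect_size=0.5, nobs1=100, alpha=0.05)
looser_alpha = analysis.power(effect_size=0.5, nobs1=50, alpha=0.10)

print(f"baseline:      {base:.2f}")
print(f"larger effect: {bigger_effect:.2f}")
print(f"larger n:      {bigger_sample:.2f}")
print(f"looser alpha:  {looser_alpha:.2f}")
```

Each of the three printed variations exceeds the baseline, matching the bullet points above.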
Why 0.80 Is the Standard
Cohen (1988) recommended a minimum power of 0.80 as a reasonable convention, meaning you accept a 20% chance of missing a real effect. This balances the risk of Type II errors against practical constraints like cost and feasibility.
Some fields now push for 0.90 or higher, especially for:
- Confirmatory or pre-registered studies
- Clinical trials where missing an effect has serious consequences
- Replication studies
Formula
For a two-sample t-test, the relationship between power and sample size can be expressed through the non-centrality parameter $\delta$:

$$\delta = d\sqrt{\frac{n}{2}}$$

where $d$ is Cohen's d and $n$ is the sample size per group. The test statistic under the alternative hypothesis follows a non-central $t$-distribution with non-centrality parameter $\delta$ and $df = 2n - 2$.

Power is then:

$$\text{Power} = 1 - \beta = P\left(|t| > t_{1-\alpha/2,\,df} \mid \delta\right)$$
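The non-central $t$ power calculation described above can be sketched with `scipy` (the function name here is illustrative):

```python
import numpy as np
from scipy.stats import nct, t


def t_test_power(d, n, alpha=0.05):
    """Power of a two-sided, two-sample t-test with n per group,
    computed from the non-central t-distribution."""
    df = 2 * n - 2                      # degrees of freedom
    delta = d * np.sqrt(n / 2)          # non-centrality parameter
    t_crit = t.ppf(1 - alpha / 2, df)   # two-tailed critical value
    # Mass of the non-central t beyond +/- t_crit = P(reject H0 | H1 true)
    return nct.sf(t_crit, df, delta) + nct.cdf(-t_crit, df, delta)


print(f"Power at d = 0.5, n = 64 per group: {t_test_power(0.5, 64):.3f}")
```

Power increases with both $d$ and $n$, consistent with the relationships listed earlier.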
For a simplified sample size estimate (two-group comparison, equal groups, two-tailed test at $\alpha = .05$ and power = .80):

$$n = \frac{2(z_{1-\alpha/2} + z_{1-\beta})^2}{d^2}$$

Substituting the standard $z$-values ($z_{1-\alpha/2} = 1.96$, $z_{1-\beta} = 0.84$):

$$n = \frac{2(1.96 + 0.84)^2}{d^2} = \frac{15.68}{d^2}$$

This gives the required sample size per group.
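The simplified formula translates directly into code. A sketch using `scipy` for the normal quantiles:

```python
import math

from scipy.stats import norm


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-tailed, two-sample t-test,
    using the normal-approximation formula above."""
    z_alpha = norm.ppf(1 - alpha / 2)   # 1.96 for alpha = .05
    z_beta = norm.ppf(power)            # 0.84 for power = .80
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


print(n_per_group(0.5))  # → 63 (medium effect)
```

Note that this rounds up, since a fractional participant is not possible and rounding down would leave the study slightly underpowered.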
Worked Example
Scenario: A clinical psychologist wants to test whether a new cognitive-behavioral therapy (CBT) protocol reduces anxiety scores more than standard treatment. Prior research suggests a medium effect size of $d = 0.5$. She wants 80% power with $\alpha = .05$ (two-tailed).
Step 1: Identify the parameters.
- Effect size: $d = 0.5$
- Power: $1 - \beta = .80$, so $z_{1-\beta} = 0.84$
- Significance level: $\alpha = .05$ (two-tailed), so $z_{1-\alpha/2} = 1.96$
Step 2: Apply the sample size formula.

$$n = \frac{2(1.96 + 0.84)^2}{0.5^2} = \frac{2(7.84)}{0.25} = \frac{15.68}{0.25} = 62.72$$

Step 3: Round up.

$$n = 63 \text{ per group}$$
Interpretation: The psychologist needs at least 63 participants per group (126 total) to have an 80% chance of detecting a medium effect () at the .05 significance level. If she expects attrition, she should recruit more (e.g., add 10-20%).
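The hand calculation can be cross-checked in software. One caveat worth knowing: tools that use the exact non-central $t$-distribution (such as `statsmodels` or G*Power) give a slightly larger answer than the $z$-approximation above, which is why G*Power reports 64 per group for this scenario.

```python
import math

from statsmodels.stats.power import TTestIndPower

# Exact (non-central t) sample size for d = 0.5, alpha = .05, power = .80
n = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"Exact n per group: {math.ceil(n)}")  # 64, vs. 63 from the z-approximation
```

The one-participant difference is inconsequential in practice, but it explains why software output may not match the hand calculation exactly.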
Interpretation
When reporting power analysis results, state clearly:
- What test you plan to use (e.g., independent samples t-test)
- What effect size you assumed and where the estimate came from
- What power level you targeted
- What alpha level you used
- The resulting sample size needed
If a post hoc power analysis reveals low power (e.g., 0.40), this means your study had only a 40% chance of detecting the effect. A non-significant result in an underpowered study is inconclusive — it does not mean the effect doesn't exist.
Common Mistakes
- Using post hoc power to "explain" non-significant results. Observed power is a direct function of the p-value — it adds no new information. If $p > .05$, observed power will always be below .50. Instead, report confidence intervals around your effect size estimate.
- Using unrealistic effect size estimates. Researchers often assume medium effects () by default. Base your estimate on pilot data, meta-analyses, or the smallest effect size of practical interest (SESOI).
- Forgetting to account for attrition. If you expect 15% dropout, multiply your required $n$ by $1/(1 - 0.15) \approx 1.18$.
- Ignoring the study design. Power calculations differ for independent t-tests, paired t-tests, ANOVA, regression, etc. Use the formula or software that matches your actual design.
- Treating Cohen's benchmarks as universal. A "small" effect in one field may be practically important. Always consider what effect size matters in your specific research context.
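The attrition adjustment from the list above amounts to a one-line helper (the function name is illustrative):

```python
import math


def inflate_for_attrition(n_required, dropout_rate):
    """Number to recruit so that n_required participants remain
    after the expected dropout rate."""
    return math.ceil(n_required / (1 - dropout_rate))


print(inflate_for_attrition(63, 0.15))  # recruit 75 to retain 63 after 15% dropout
```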
How to Run It
In R, the `pwr` package covers the common designs:

```r
library(pwr)

# For an independent t-test
pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.80, type = "two.sample")

# For a one-way ANOVA (3 groups)
pwr.anova.test(k = 3, f = 0.25, sig.level = 0.05, power = 0.80)

# For a correlation
pwr.r.test(r = 0.3, sig.level = 0.05, power = 0.80)
```
In Python, `statsmodels` provides equivalent functions:

```python
from statsmodels.stats.power import TTestIndPower, FTestAnovaPower
# Independent t-test power analysis
analysis = TTestIndPower()
n = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"Required n per group: {n:.0f}")
# One-way ANOVA (note: FTestAnovaPower solves for the *total* N across groups)
anova_power = FTestAnovaPower()
n = anova_power.solve_power(effect_size=0.25, alpha=0.05, power=0.80, k_groups=3)
print(f"Required total N: {n:.0f}")
```
SPSS historically lacked a built-in power analysis module (basic power procedures were added in version 27). Most researchers use one of these alternatives:
G*Power (free desktop software) — the most widely used tool for a priori power analysis. Download from HHU Düsseldorf.
Subthesis Power Calculator — free browser-based alternative at subthesis.com.
Excel does not have built-in power analysis functions. For quick calculations:
Use the free Subthesis Power Calculator in your browser
Or download G*Power (free) for comprehensive power analyses across all test types
These are the standard tools cited in dissertation method sections.
Ready to calculate?
Now that you understand the concept, use the free Sample Size & Power Analysis Calculator on Subthesis to run your own analysis.
Related Concepts
Effect Size
Learn what effect size is, why it matters more than p-values alone, and how to calculate and interpret Cohen's d, Hedges' g, and eta-squared for your research.
Sample Size Determination
Learn how to calculate the right sample size for your research study using power analysis, effect size estimates, and practical planning considerations.
Independent Samples t-Test
Learn how to conduct and interpret an independent samples t-test, including assumptions, formulas, worked examples, and APA reporting guidelines.
One-Way ANOVA
Learn how to conduct a one-way ANOVA to compare three or more group means, including F-ratio formulas, post-hoc tests, and effect size with eta-squared.