Statistical Power & Power Analysis
What Is Statistical Power?
Statistical power is the probability that your study will correctly reject the null hypothesis when it is actually false. In other words, power is the likelihood of detecting a real effect if one truly exists.
Formally, power equals $1 - \beta$, where $\beta$ is the probability of a Type II error — failing to detect a real effect. If your study has power of 0.80, there is an 80% chance you will find a statistically significant result when the effect is real, and a 20% chance you will miss it.
A study with low power is like a metal detector with weak batteries — the gold might be there, but you won't find it.
When to Use It
Power analysis is used in two contexts:
- A priori (before data collection). This is the most important use. You estimate the required sample size so your study has adequate power. Most IRBs, funding agencies, and dissertation committees require this.
- Post hoc (after data collection). Sometimes used to interpret a non-significant result — but this is controversial. Many statisticians argue that observed power adds little beyond the p-value itself.
Always conduct an a priori power analysis when designing a study. It prevents two costly mistakes:
- Too few participants: You waste time and resources on a study that cannot detect the effect.
- Too many participants: You recruit more participants than needed, wasting resources and potentially exposing more people to experimental conditions unnecessarily.
The Four Components of Power
Statistical power is governed by four interconnected quantities. If you know any three, you can solve for the fourth:
| Component | Symbol | Description |
|---|---|---|
| Effect size | $d$, $f$, $r$, etc. | The magnitude of the effect you expect to find |
| Sample size | $n$ or $N$ | The number of participants in your study |
| Significance level | $\alpha$ | The probability of a Type I error (typically .05) |
| Power | $1 - \beta$ | The probability of detecting a true effect |
How they relate:
- Larger effect size → higher power. Big effects are easier to detect.
- Larger sample size → higher power. More data means more precision.
- Higher alpha → higher power. A more lenient threshold makes it easier to reject (but increases Type I error risk).
- Lower variability → higher power. Less noise in your data makes the signal clearer.
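These relationships are easy to verify numerically. A minimal sketch using the `statsmodels` power module (the baseline values are arbitrary illustrations): each change below should raise power relative to the baseline.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Baseline scenario: d = 0.5, n = 50 per group, alpha = .05
base = analysis.power(effect_size=0.5, nobs1=50, alpha=0.05)

# Vary one component at a time
bigger_effect = analysis.power(effect_size=0.8, nobs1=50, alpha=0.05)
bigger_sample = analysis.power(effect_size=0.5, nobs1=100, alpha=0.05)
looser_alpha = analysis.power(effect_size=0.5, nobs1=50, alpha=0.10)

print(f"baseline:      {base:.2f}")
print(f"larger effect: {bigger_effect:.2f}")
print(f"larger n:      {bigger_sample:.2f}")
print(f"looser alpha:  {looser_alpha:.2f}")
```

Each of the three printed variations exceeds the baseline, matching the bullet points above.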
Why 0.80 Is the Standard
Cohen (1988) recommended a minimum power of 0.80 as a reasonable convention, meaning you accept a 20% chance of missing a real effect. This balances the risk of Type II errors against practical constraints like cost and feasibility.
Some fields now push for 0.90 or higher, especially for:
- Confirmatory or pre-registered studies
- Clinical trials where missing an effect has serious consequences
- Replication studies
Formula
For a two-sample t-test, the relationship between power and sample size can be expressed through the non-centrality parameter $\delta$:

$$\delta = d\sqrt{\frac{n}{2}}$$

where $d$ is Cohen's d and $n$ is the sample size per group. The test statistic under the alternative hypothesis follows a non-central $t$-distribution with non-centrality parameter $\delta$ and $df = 2n - 2$.

Power is then:

$$\text{Power} = 1 - \beta = P\left(|t| > t_{1-\alpha/2,\,df} \mid \delta\right)$$
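The non-central $t$ power calculation described above can be sketched with `scipy` (the function name here is illustrative):

```python
import numpy as np
from scipy.stats import nct, t


def t_test_power(d, n, alpha=0.05):
    """Power of a two-sided, two-sample t-test with n per group,
    computed from the non-central t-distribution."""
    df = 2 * n - 2                      # degrees of freedom
    delta = d * np.sqrt(n / 2)          # non-centrality parameter
    t_crit = t.ppf(1 - alpha / 2, df)   # two-tailed critical value
    # Mass of the non-central t beyond +/- t_crit = P(reject H0 | H1 true)
    return nct.sf(t_crit, df, delta) + nct.cdf(-t_crit, df, delta)


print(f"Power at d = 0.5, n = 64 per group: {t_test_power(0.5, 64):.3f}")
```

Power increases with both $d$ and $n$, consistent with the relationships listed earlier.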
For a simplified sample size estimate (two-group comparison, equal groups, two-tailed test at $\alpha = .05$ and power = .80):

$$n = \frac{2(z_{1-\alpha/2} + z_{1-\beta})^2}{d^2}$$

Substituting the standard $z$-values ($z_{1-\alpha/2} = 1.96$, $z_{1-\beta} = 0.84$):

$$n = \frac{2(1.96 + 0.84)^2}{d^2} = \frac{15.68}{d^2}$$

This gives the required sample size per group.
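The simplified formula translates directly into code. A sketch using `scipy` for the normal quantiles:

```python
import math

from scipy.stats import norm


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-tailed, two-sample t-test,
    using the normal-approximation formula above."""
    z_alpha = norm.ppf(1 - alpha / 2)   # 1.96 for alpha = .05
    z_beta = norm.ppf(power)            # 0.84 for power = .80
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


print(n_per_group(0.5))  # → 63 (medium effect)
```

Note that this rounds up, since a fractional participant is not possible and rounding down would leave the study slightly underpowered.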
Worked Example
Scenario: A clinical psychologist wants to test whether a new cognitive-behavioral therapy (CBT) protocol reduces anxiety scores more than standard treatment. Prior research suggests a medium effect size of $d = 0.5$. She wants 80% power with $\alpha = .05$ (two-tailed).
Step 1: Identify the parameters.
- Effect size: $d = 0.5$
- Power: $1 - \beta = .80$, so $z_{1-\beta} = 0.84$
- Significance level: $\alpha = .05$ (two-tailed), so $z_{1-\alpha/2} = 1.96$
Step 2: Apply the sample size formula.

$$n = \frac{2(1.96 + 0.84)^2}{0.5^2} = \frac{2(7.84)}{0.25} = \frac{15.68}{0.25} = 62.72$$

Step 3: Round up.

$$n = 63 \text{ per group}$$
Interpretation: The psychologist needs at least 63 participants per group (126 total) to have an 80% chance of detecting a medium effect () at the .05 significance level. If she expects attrition, she should recruit more (e.g., add 10-20%).
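The hand calculation can be cross-checked in software. One caveat worth knowing: tools that use the exact non-central $t$-distribution (such as `statsmodels` or G*Power) give a slightly larger answer than the $z$-approximation above, which is why G*Power reports 64 per group for this scenario.

```python
import math

from statsmodels.stats.power import TTestIndPower

# Exact (non-central t) sample size for d = 0.5, alpha = .05, power = .80
n = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"Exact n per group: {math.ceil(n)}")  # 64, vs. 63 from the z-approximation
```

The one-participant difference is inconsequential in practice, but it explains why software output may not match the hand calculation exactly.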
Interpretation
When reporting power analysis results, state clearly:
- What test you plan to use (e.g., independent samples t-test)
- What effect size you assumed and where the estimate came from
- What power level you targeted
- What alpha level you used
- The resulting sample size needed
If a post hoc power analysis reveals low power (e.g., 0.40), this means your study had only a 40% chance of detecting the effect. A non-significant result in an underpowered study is inconclusive — it does not mean the effect doesn't exist.
Common Mistakes
- Using post hoc power to "explain" non-significant results. Observed power is a direct function of the p-value — it adds no new information. If $p > .05$, observed power will always be below .50. Instead, report confidence intervals around your effect size estimate.
- Using unrealistic effect size estimates. Researchers often assume medium effects () by default. Base your estimate on pilot data, meta-analyses, or the smallest effect size of practical interest (SESOI).
- Forgetting to account for attrition. If you expect 15% dropout, multiply your required $n$ by $1/(1 - 0.15) \approx 1.18$.
- Ignoring the study design. Power calculations differ for independent t-tests, paired t-tests, ANOVA, regression, etc. Use the formula or software that matches your actual design.
- Treating Cohen's benchmarks as universal. A "small" effect in one field may be practically important. Always consider what effect size matters in your specific research context.
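The attrition adjustment from the list above amounts to a one-line helper (the function name is illustrative):

```python
import math


def inflate_for_attrition(n_required, dropout_rate):
    """Number to recruit so that n_required participants remain
    after the expected dropout rate."""
    return math.ceil(n_required / (1 - dropout_rate))


print(inflate_for_attrition(63, 0.15))  # recruit 75 to retain 63 after 15% dropout
```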
How to Run It
In R, the `pwr` package covers the common designs:

```r
library(pwr)

# For an independent t-test
pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.80, type = "two.sample")

# For a one-way ANOVA (3 groups)
pwr.anova.test(k = 3, f = 0.25, sig.level = 0.05, power = 0.80)

# For a correlation
pwr.r.test(r = 0.3, sig.level = 0.05, power = 0.80)
```
In Python, `statsmodels` provides equivalent functions:

```python
from statsmodels.stats.power import TTestIndPower, FTestAnovaPower
# Independent t-test power analysis
analysis = TTestIndPower()
n = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"Required n per group: {n:.0f}")
# One-way ANOVA (note: FTestAnovaPower solves for the *total* N across groups)
anova_power = FTestAnovaPower()
n = anova_power.solve_power(effect_size=0.25, alpha=0.05, power=0.80, k_groups=3)
print(f"Required total N: {n:.0f}")
```
SPSS historically lacked a built-in power analysis module (basic power procedures were added in version 27). Most researchers use one of these alternatives:
G*Power (free desktop software) — the most widely used tool for a priori power analysis. Download from HHU Düsseldorf.
Subthesis Power Calculator — free browser-based alternative at subthesis.com.
Excel does not have built-in power analysis functions. For quick calculations:
Use the free Subthesis Power Calculator in your browser
Or download G*Power (free) for comprehensive power analyses across all test types
These are the standard tools cited in dissertation method sections.
Ready to calculate?
Now that you understand the concept, use the free Sample Size & Power Analysis Calculator on Subthesis to run your own analysis.
Related Concepts
Effect Size
Learn what effect size is, why it matters more than p-values alone, and how to calculate and interpret Cohen's d, Hedges' g, and eta-squared for your research.
Sample Size Determination
Learn how to calculate the right sample size for your research study using power analysis, effect size estimates, and practical planning considerations.
Independent Samples t-Test
Learn how to conduct and interpret an independent samples t-test, including assumptions, formulas, worked examples, and APA reporting guidelines.
One-Way ANOVA
Learn how to conduct a one-way ANOVA to compare three or more group means, including F-ratio formulas, post-hoc tests, and effect size with eta-squared.