Correlation Does Not Imply Causation: What It Actually Means
You've probably heard "correlation does not imply causation" so many times it's lost all meaning. But understanding why is essential for writing a credible dissertation — and for surviving your defense when a committee member probes your causal language.
What Correlation Actually Tells You
A correlation coefficient, most commonly Pearson's r, measures the strength and direction of a linear relationship between two variables on a scale from -1 to +1. When two variables are correlated, they tend to move together: as one increases, the other tends to increase (positive correlation) or decrease (negative correlation).
That's all it tells you. It does not tell you:
- Whether Variable A causes Variable B
- Whether Variable B causes Variable A
- Whether some other variable causes both
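One way to see why a correlation cannot tell you direction is that the coefficient is symmetric: swapping the two variables gives the same number. The sketch below computes Pearson's r by hand on a small, made-up dataset (the study hours and test scores are hypothetical values chosen for illustration, not real data).

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: weekly study hours and test scores for six students.
study_hours = [2, 4, 5, 7, 8, 10]
test_scores = [55, 60, 66, 70, 78, 85]

r_xy = pearson_r(study_hours, test_scores)
r_yx = pearson_r(test_scores, study_hours)

# The two values are identical: r carries no information about direction.
print(round(r_xy, 3), round(r_yx, 3))
```

Because the formula treats both variables the same way, nothing in the number itself can distinguish "A causes B" from "B causes A".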
The Three Possible Explanations
When you find a significant correlation between X and Y, there are always at least three explanations:
1. X Causes Y
Maybe increased study time really does cause higher test scores. This is often what researchers hope to show.
2. Y Causes X
Maybe higher test scores cause increased study time — students who do well become more motivated and study more. The direction of causation can be the reverse of what you assumed.
3. A Third Variable Causes Both
Maybe socioeconomic status influences both study time and test scores. Students from wealthier families may have more time to study and attend better-resourced schools. The correlation between study time and test scores could be entirely driven by this confounding variable.
This third explanation, confounding, is one of the most common reasons correlations mislead us.
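A small simulation can make confounding concrete. In the sketch below, a simulated socioeconomic status (SES) variable drives both study time and test scores, and study time has no direct effect on scores at all. The data-generating model, the sample size, and the effect sizes are all assumptions made up for illustration. The two variables still correlate, and the correlation largely disappears once SES is statistically removed from both (a partial correlation).

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def residuals(ys, xs):
    """Residuals of y after a simple least-squares regression on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 5000

# Assumed model: SES influences both variables; neither influences the other.
ses = [rng.gauss(0, 1) for _ in range(n)]
study_time = [s + rng.gauss(0, 1) for s in ses]
test_score = [s + rng.gauss(0, 1) for s in ses]

raw_r = pearson_r(study_time, test_score)

# Partial correlation: correlate what is left after removing SES from each.
partial_r = pearson_r(residuals(study_time, ses), residuals(test_score, ses))

print(f"raw r = {raw_r:.2f}, r controlling for SES = {partial_r:.2f}")
```

In a real study you can only control for confounders you measured, which is why statistical adjustment alone does not settle the causal question.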
Spurious Correlations: The Entertaining Evidence
The website Spurious Correlations famously shows that U.S. per capita cheese consumption is highly correlated with the number of people who die by becoming tangled in their bedsheets. Obviously, cheese doesn't cause bedsheet fatalities. Compare enough pairs of variables, especially short series with only a handful of data points, and some will correlate strongly by chance alone.
This isn't just a joke — it illustrates why you need theory, not just data, to make causal claims.
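The same effect is easy to reproduce with pure noise. The sketch below generates 100 independent random series of 10 points each (both numbers are arbitrary choices for illustration) and searches every pair for the strongest correlation. It is a simplified stand-in: real examples like the cheese data also involve shared time trends, which this simulation does not model.

```python
import math
import random
from itertools import combinations

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(7)  # fixed seed so the run is repeatable
N_SERIES = 100   # number of unrelated "variables"
N_POINTS = 10    # observations per variable, e.g. ten yearly values

# Every series is independent random noise: no real relationships exist.
series = [[rng.gauss(0, 1) for _ in range(N_POINTS)] for _ in range(N_SERIES)]

# Search all pairs for the strongest correlation, positive or negative.
max_abs_r = max(abs(pearson_r(a, b)) for a, b in combinations(series, 2))
n_pairs = N_SERIES * (N_SERIES - 1) // 2

print(f"{n_pairs} pairs checked, largest |r| = {max_abs_r:.2f}")
```

With thousands of pairs to choose from, a very strong correlation turns up even though every series is unrelated to every other. Searching first and theorizing afterward is what makes such findings untrustworthy.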
What You Need for Causal Claims
To move from correlation to causation, researchers generally need:
- Temporal precedence — the cause must come before the effect
- Covariation — the variables must be related (this is what correlation shows)
- Elimination of alternative explanations — you must rule out confounding variables
Experimental designs with random assignment address all three. Participants are randomly placed in conditions (which balances confounds, measured or unmeasured, across groups on average), the treatment is applied before the outcome is measured (temporal precedence), and outcomes are compared across conditions (covariation).
Correlational designs, by definition, lack random assignment. That's why a correlation in survey data cannot support a causal claim on its own, no matter how strong it is.
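To see what random assignment buys you, the following sketch compares two ways of forming groups in a simulated study. Everything here is assumed for illustration: the `motivation` confounder, the true treatment effect of 5 points, and the outcome model. When motivated participants select themselves into treatment, the group difference badly overstates the effect. When a coin flip decides, the difference lands close to the true value.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10000
TRUE_EFFECT = 5.0  # assumed true effect of the treatment on the outcome

# Hypothetical confounder that also raises the outcome.
motivation = [rng.gauss(0, 1) for _ in range(n)]

def outcome(m, treated):
    """Assumed outcome model: baseline + motivation + treatment + noise."""
    return 50 + 10 * m + (TRUE_EFFECT if treated else 0.0) + rng.gauss(0, 5)

def mean_difference(treated_flags):
    """Treatment is fixed first, then outcomes are generated and compared."""
    outcomes = [outcome(m, t) for m, t in zip(motivation, treated_flags)]
    treated = [y for y, t in zip(outcomes, treated_flags) if t]
    control = [y for y, t in zip(outcomes, treated_flags) if not t]
    return statistics.mean(treated) - statistics.mean(control)

# Self-selection: the more motivated half opts into the treatment.
self_selected = [m > 0 for m in motivation]

# Random assignment: a coin flip that ignores motivation entirely.
randomized = [rng.random() < 0.5 for _ in motivation]

biased_estimate = mean_difference(self_selected)
randomized_estimate = mean_difference(randomized)

print(f"self-selected groups: {biased_estimate:.1f}")
print(f"randomized groups:    {randomized_estimate:.1f} (true effect {TRUE_EFFECT})")
```

The randomized comparison works without ever measuring motivation, which is the key advantage over statistical control: randomization balances confounders you did not think of.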
How to Write About Correlations in Your Dissertation
Language to Avoid
- "X caused an increase in Y"
- "X led to improved Y"
- "X resulted in higher Y"
- "X had an impact on Y" (this implies causation)
Language to Use
- "X was positively associated with Y"
- "X was significantly correlated with Y"
- "Higher levels of X were related to higher levels of Y"
- "There was a significant relationship between X and Y"
In Your Discussion Chapter
When interpreting correlational results, acknowledge the limitation explicitly:
"Because this study employed a correlational design, causal inferences cannot be drawn. The observed relationship between teacher self-efficacy and student achievement may be influenced by unmeasured variables such as school resources or administrative support."
When Committees Push Back
If your study is correlational and a committee member asks about causation, the confident answer is: "This study was not designed to establish causation. The findings suggest a relationship that warrants further investigation using experimental or quasi-experimental designs."
That's not a weakness — it's scientific honesty. Every study has limitations, and being upfront about yours demonstrates methodological maturity. For help choosing the right design and analysis, see our guide on choosing the right statistical test.