Understanding p-Values Without Losing Your Mind
The p-value might be the most misunderstood number in all of research. It shows up in every results chapter, every journal article, and every committee meeting — yet most graduate students (and plenty of professors) can't define it correctly.
Let's fix that.
What a p-Value Actually Is
A p-value is the probability of obtaining results at least as extreme as the ones you observed, assuming the null hypothesis is true.
Read that again slowly. The key phrase is "assuming the null hypothesis is true." The p-value doesn't tell you the probability that your hypothesis is correct. It tells you how surprising your data would be in a world where nothing is really going on.
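The definition becomes concrete with a toy example. Suppose you flip a coin 20 times and get 16 heads; the null hypothesis is that the coin is fair. The one-sided p-value is the probability of getting 16 or more heads from a fair coin — a sketch using only the standard library (the numbers here are made up for illustration):

```python
from math import comb

def binomial_p_value(n, k, p_null=0.5):
    """One-sided p-value: the probability of observing k or more
    successes in n trials, assuming the null success rate p_null."""
    return sum(comb(n, i) * p_null**i * (1 - p_null)**(n - i)
               for i in range(k, n + 1))

# Observed: 16 heads in 20 flips. Under the null (fair coin),
# how surprising is a result at least this extreme?
p = binomial_p_value(20, 16)  # ≈ .0059
```

Note what the number means: in a world where the coin really is fair, results this lopsided happen only about 6 times in 1,000. It says nothing about the probability that the coin is fair.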
What a p-Value Is NOT
This is where most people get tripped up. A p-value is not:
- The probability that your results are due to chance
- The probability that the null hypothesis is true
- The probability that you'll get the same results if you repeat the study
- A measure of how important or large your effect is
That last point is crucial. A p-value of 0.001 does not indicate a larger effect than a p-value of 0.04; it indicates a result that is more surprising under the null hypothesis — often simply because the sample was larger. For effect magnitude, you need effect sizes.
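The sample-size point is easy to demonstrate. The sketch below runs a one-sample z-test on the same hypothetical effect (mean difference of 2, standard deviation of 10, i.e., Cohen's d = 0.2) at two sample sizes; the effect is identical, but the p-values differ enormously:

```python
from math import sqrt
from statistics import NormalDist

def z_test_p(mean_diff, sd, n):
    """Two-sided p-value for a one-sample z-test of mean_diff
    against zero, given the sample standard deviation and size."""
    z = mean_diff / (sd / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Identical effect size (d = 0.2), different sample sizes:
small_n = z_test_p(mean_diff=2, sd=10, n=50)    # ≈ .157, "not significant"
large_n = z_test_p(mean_diff=2, sd=10, n=1000)  # far below .001
```

Nothing about the effect changed between the two calls — only the sample size did. That is why a p-value cannot stand in for an effect size.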
The Magic Number: 0.05
By convention, most fields use α = 0.05 as the threshold. If p < 0.05, we reject the null hypothesis and call the result "statistically significant." If p ≥ 0.05, we fail to reject the null.
This threshold is a convention, not a law of nature. Ronald Fisher suggested 0.05 as a convenient benchmark in the 1920s, and it stuck. There's nothing magical about it, and a result with p = 0.049 is not fundamentally different from one with p = 0.051.
How to Report p-Values in APA Style
APA 7th edition has specific formatting rules:
- Report exact p-values to two or three decimal places: p = .032
- For very small values: p < .001 (never write p = .000)
- Don't report a threshold like "p < .05" in place of a result — report the exact value your software gives you
- Always pair p-values with effect sizes and confidence intervals
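If you report many p-values, a small helper keeps the formatting consistent. This is a sketch of the rules above (the function name is mine, not part of any APA tooling):

```python
def format_p(p):
    """Format a p-value per the APA rules above: no leading zero,
    three decimal places, and 'p < .001' for very small values."""
    if p < 0.001:
        return "p < .001"
    return f"p = {format(p, '.3f').lstrip('0')}"

# format_p(0.0321)  -> "p = .032"
# format_p(0.0004)  -> "p < .001"
```

Leading zeros are dropped because a p-value can never exceed 1, which is the APA rationale for the ".032" style.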
A Helpful Analogy
Think of a p-value like a fire alarm. When it goes off (p < .05), it's telling you something unusual might be happening. But the alarm doesn't tell you how big the fire is, whether it's a real fire or burnt toast, or what you should do about it. You need more information — and that's what effect sizes, confidence intervals, and your own judgment provide.
Common p-Value Mistakes to Avoid
- Don't say "the results were not significant, so there's no effect." A non-significant result means you didn't find sufficient evidence, not that no effect exists. This is especially true with small sample sizes.
- Don't treat 0.05 as a cliff edge. Results at p = 0.06 aren't meaningless. Discuss them honestly.
- Don't run multiple tests and only report the significant ones. This is called p-hacking, and it's a serious research ethics issue.
- Don't confuse statistical significance with practical significance. They're different questions entirely.
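The multiple-testing problem in that list is worth quantifying. If every null hypothesis is true, the chance of at least one false positive across m independent tests at α = .05 is 1 − 0.95^m, which grows quickly:

```python
def family_wise_error(m, alpha=0.05):
    """Probability of at least one p < alpha among m independent
    tests when every null hypothesis is true."""
    return 1 - (1 - alpha) ** m

one_test = family_wise_error(1)       # .05 — the nominal rate
twenty_tests = family_wise_error(20)  # ≈ .64 — more likely than not
```

Run twenty tests and report only the one that "worked," and you have roughly a coin-flip's chance of publishing pure noise — which is exactly why p-hacking is an ethics issue, not just a technical one.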
The Bigger Picture
The p-value is one piece of a larger puzzle. Alongside confidence intervals, effect sizes, and your own domain knowledge, it helps you evaluate evidence. But it was never designed to be the final word. Treat it as a useful tool, not a verdict.