Chi-Square Test of Independence
What Is the Chi-Square Test of Independence?
The chi-square ($\chi^2$) test of independence determines whether there is a statistically significant association between two categorical variables. It compares the frequencies you actually observed in your data to the frequencies you would expect if the two variables were completely unrelated.
The core idea is simple: if two variables are independent (unrelated), knowing someone's category on variable A should tell you nothing about their likely category on variable B. The chi-square test measures how far the observed data deviate from this expectation of independence.
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where $O$ = observed frequency and $E$ = expected frequency in each cell of the contingency table. The larger the $\chi^2$ value, the stronger the evidence of an association.
When to Use It
Use the chi-square test of independence when:
- Both variables are categorical (nominal or ordinal). Examples: gender, treatment group, pass/fail, education level, political party.
- You want to test whether there is a relationship or association between them.
- You have frequency counts (how many people fall into each combination of categories).
Examples:
- Is there an association between gender (male/female) and voting preference (Democrat/Republican/Independent)?
- Is treatment type (drug A/drug B/placebo) related to recovery outcome (recovered/not recovered)?
- Is education level (high school/bachelor's/graduate) associated with smartphone brand preference (Apple/Samsung/Other)?
Do NOT use chi-square when:
- Your dependent variable is continuous (use a t-test or ANOVA)
- You have paired/repeated categorical data (use McNemar's test)
- Expected cell counts are too small (use Fisher's exact test)
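The requirement for frequency counts often means tallying raw records first. The sketch below shows one way to do that with the standard library; the records and the `cross_tabulate` helper are hypothetical, and the dataset is far too small for an actual test.

```python
from collections import Counter

# Hypothetical raw records: one (treatment, outcome) pair per participant.
records = [
    ("drug A", "recovered"), ("drug A", "recovered"), ("drug A", "not recovered"),
    ("drug B", "recovered"), ("drug B", "not recovered"),
    ("placebo", "not recovered"), ("placebo", "not recovered"), ("placebo", "recovered"),
]

def cross_tabulate(pairs):
    """Return (row_labels, col_labels, table) of counts for (row, col) pairs."""
    counts = Counter(pairs)
    row_labels = sorted({r for r, _ in pairs})
    col_labels = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur.
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table

rows, cols, table = cross_tabulate(records)
for label, row in zip(rows, table):
    print(label, dict(zip(cols, row)))
```

Each participant appears in exactly one pair, so every person lands in exactly one cell of the resulting table.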
Assumptions
- Independence of observations. Each participant contributes to only one cell of the contingency table. No person is counted twice.
- Adequate expected frequencies. All expected cell counts should be 5 or greater. If more than 20% of cells have expected counts below 5, the chi-square approximation is unreliable. Options:
  - Collapse categories to increase cell counts
  - Use Fisher's exact test (especially for 2x2 tables)
- Mutually exclusive categories. Each observation falls into one and only one cell of the table.
- Sufficiently large sample. As a rule of thumb, the total $N$ should be at least 5 times the number of cells in the table.
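The expected-frequency and sample-size rules of thumb can be checked mechanically before running the test. This is a minimal sketch, assuming expected counts are computed as row total × column total / N; the function names and the example table are hypothetical.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def check_expected_counts(observed, minimum=5, max_share_below=0.20):
    """Summarize how a table fares against the rules of thumb."""
    cells = [e for row in expected_counts(observed) for e in row]
    n = sum(sum(row) for row in observed)
    share_below = sum(e < minimum for e in cells) / len(cells)
    return {
        "smallest_expected": min(cells),
        "share_below_minimum": share_below,
        "all_at_least_minimum": share_below == 0,
        "within_20_percent_rule": share_below <= max_share_below,
        "n_at_least_5_per_cell": n >= 5 * len(cells),
    }

# Hypothetical 2 x 3 table with one sparse column.
small_table = [[12, 9, 2], [10, 8, 1]]
report = check_expected_counts(small_table)
for name, value in report.items():
    print(name, value)
```

In this made-up table the sparse third column pushes two of the six expected counts below 5, so collapsing categories or an exact test would be worth considering even though the total sample passes the 5-per-cell rule.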
Formula
Step-by-Step Computation
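1. Set up the contingency table with observed frequencies ($O$).
2. Calculate expected frequencies for each cell:

$$E_{ij} = \frac{R_i \times C_j}{N}$$

Where:
- $R_i$ = row total for row $i$
- $C_j$ = column total for column $j$
- $N$ = grand total
3. Compute the chi-square statistic:

$$\chi^2 = \sum_{i} \sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

4. Degrees of freedom:

$$df = (r - 1)(c - 1)$$

Where $r$ = number of rows and $c$ = number of columns.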
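The four steps above can be sketched as a single function. This is an illustrative implementation in plain Python, not a replacement for a statistics package; the function name and the 2 x 2 input are hypothetical.

```python
def chi_square_independence(observed):
    """Return (chi2, df, expected) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]          # R_i
    col_totals = [sum(col) for col in zip(*observed)]    # C_j
    n = sum(row_totals)                                  # N

    # Step 2: E_ij = R_i * C_j / N
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Step 3: sum (O - E)^2 / E over every cell
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Step 4: df = (rows - 1) * (columns - 1)
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df, expected

# Hypothetical 2 x 2 table, used only to exercise the function.
chi2, df, expected = chi_square_independence([[10, 20], [30, 40]])
print(expected, round(chi2, 3), df)
```

The function returns the expected table alongside the statistic so the expected-frequency assumption can be inspected in the same pass.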
Effect Size: Cramer's V
$$V = \sqrt{\frac{\chi^2}{N(k - 1)}}$$

Where $k = \min(r, c)$, the smaller of the number of rows or columns.
| Cramer's V | Interpretation ($df^* = 1$) | Interpretation ($df^* = 2$) |
|---|---|---|
| .10 | Small | Small |
| .30 | Medium | Medium (at .21) |
| .50 | Large | Large (at .35) |

($df^* = \min(r, c) - 1$, the smaller dimension minus 1)
For a 2x2 table, Cramer's V equals the phi coefficient ($\phi$), and the interpretation benchmarks are the same as for a correlation coefficient.
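As a quick sketch, Cramer's V can be computed as the square root of chi-square divided by N times (k − 1), where k is the smaller table dimension. The function name and the input values are illustrative assumptions, not results from a real dataset.

```python
import math

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi2 / (N * (k - 1))), with k = min(rows, cols)."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (k - 1)))

# Illustrative values only: chi2 = 6.35 with N = 200.
v_2x2 = cramers_v(6.35, 200, 2, 2)   # k - 1 = 1, so this is also phi
v_3x4 = cramers_v(6.35, 200, 3, 4)   # k - 1 = 2 shrinks V for the same chi2
print(round(v_2x2, 2), round(v_3x4, 2))
```

The second call shows why the benchmarks depend on table size: the same chi-square value yields a smaller V once the smaller dimension exceeds 2.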
Worked Example
Scenario: A university counseling center wants to know if the type of stress management workshop attended (Yoga, Meditation, or Exercise) is associated with whether students report reduced stress at follow-up (Yes or No). Data from 150 students:
Observed frequencies:
| | Reduced Stress: Yes | Reduced Stress: No | Row Total |
|---|---|---|---|
| Yoga | 35 | 15 | 50 |
| Meditation | 30 | 20 | 50 |
| Exercise | 20 | 30 | 50 |
| Column Total | 85 | 65 | 150 |
Step 1: State the hypotheses.
- $H_0$: Workshop type and stress reduction are independent (no association).
- $H_1$: Workshop type and stress reduction are associated.
Step 2: Calculate expected frequencies.
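| | Reduced Stress: Yes | Reduced Stress: No |
|---|---|---|
| Yoga | (50 × 85) / 150 = 28.33 | (50 × 65) / 150 = 21.67 |
| Meditation | (50 × 85) / 150 = 28.33 | (50 × 65) / 150 = 21.67 |
| Exercise | (50 × 85) / 150 = 28.33 | (50 × 65) / 150 = 21.67 |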
All expected counts are well above 5, so the assumption is met.
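Step 3: Calculate $\chi^2$.

$$\chi^2 = \frac{(35 - 28.33)^2}{28.33} + \frac{(15 - 21.67)^2}{21.67} + \frac{(30 - 28.33)^2}{28.33} + \frac{(20 - 21.67)^2}{21.67} + \frac{(20 - 28.33)^2}{28.33} + \frac{(30 - 21.67)^2}{21.67}$$

$$\chi^2 = 1.569 + 2.051 + 0.098 + 0.128 + 2.451 + 3.205 \approx 9.50$$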
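The cell-by-cell arithmetic for this step can be double-checked with a short script. It uses the observed counts from the table above; the variable names are illustrative.

```python
# Observed counts from the worked example: (Reduced Stress: Yes, No) per workshop.
observed = {
    "Yoga": (35, 15),
    "Meditation": (30, 20),
    "Exercise": (20, 30),
}
col_totals = [sum(col) for col in zip(*observed.values())]   # Yes and No totals
n = sum(col_totals)                                          # grand total

contributions = {}
for workshop, counts in observed.items():
    row_total = sum(counts)
    for outcome, o, c in zip(("Yes", "No"), counts, col_totals):
        e = row_total * c / n                 # expected count for this cell
        contributions[(workshop, outcome)] = (o - e) ** 2 / e

chi2 = sum(contributions.values())
for cell, value in contributions.items():
    print(cell, round(value, 3))
print("chi-square:", round(chi2, 2))
```

Keeping the per-cell contributions makes it easy to see that the Exercise row accounts for more than half of the total statistic.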
Step 4: Degrees of freedom.

$$df = (3 - 1)(2 - 1) = 2$$
Step 5: Find the p-value.
The critical value for $\chi^2$ with $df = 2$ at $\alpha = .05$ is 5.991. Our $\chi^2 = 9.50$ exceeds this, so $p < .05$ (the exact value is $p = .009$).
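For the special case of 2 degrees of freedom, the chi-square distribution has a closed-form upper tail, exp(−χ²/2), so both the critical value and the p-value can be checked without a table. The sketch below relies on that shortcut and does not generalize to other degrees of freedom; the function names are hypothetical.

```python
import math

def p_value_df2(chi2):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-chi2 / 2)

def critical_value_df2(alpha):
    """Value that a chi-square(2) variable exceeds with probability alpha."""
    return -2 * math.log(alpha)

critical = critical_value_df2(0.05)   # compare with the tabled 5.991
p = p_value_df2(9.50)                 # chi-square from the worked example
print(round(critical, 3), round(p, 3))
```

For tables with other degrees of freedom, a statistics library's chi-square distribution function would be the usual route.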
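Step 6: Calculate Cramer's V.

$$V = \sqrt{\frac{\chi^2}{N(k - 1)}} = \sqrt{\frac{9.50}{150 \times (2 - 1)}} = \sqrt{0.0633} \approx .25$$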
Interpretation
Since $\chi^2(2) = 9.50$, $p = .009$, we reject the null hypothesis. There is a statistically significant association between workshop type and stress reduction.
Cramer's $V = .25$ indicates a small-to-medium effect size. Workshop type is meaningfully related to stress outcomes, though other factors also play a role.
Looking at the observed vs. expected frequencies:
- Yoga had more stress reduction than expected (35 observed vs. 28.33 expected)
- Exercise had less stress reduction than expected (20 observed vs. 28.33 expected)
- Meditation was close to what independence would predict
This suggests yoga was the most effective workshop for stress reduction, while exercise was the least effective in this sample. However, the chi-square test does not tell you which specific cells drive the significant result. For that, examine standardized residuals: cells with standardized residuals larger than about 2 in absolute value are the primary contributors.
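One way to carry out the residual check is sketched below. It assumes the adjusted standardized residual, (O − E) / sqrt(E(1 − R/N)(1 − C/N)); the text does not say which variant is intended, and the simpler Pearson residual, (O − E) / sqrt(E), gives smaller values.

```python
import math

# Observed counts from the worked example; columns are Reduced Stress: Yes, No.
observed = [[35, 15], [30, 20], [20, 30]]
labels = ["Yoga", "Meditation", "Exercise"]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

adjusted = []
for row, r_total in zip(observed, row_totals):
    out = []
    for o, c_total in zip(row, col_totals):
        e = r_total * c_total / n
        # Adjusted residual: (O - E) / sqrt(E * (1 - R/N) * (1 - C/N))
        se = math.sqrt(e * (1 - r_total / n) * (1 - c_total / n))
        out.append((o - e) / se)
    adjusted.append(out)

for label, res in zip(labels, adjusted):
    print(label, [round(x, 2) for x in res])
```

Under this definition the Yoga and Exercise cells fall beyond ±2 while the Meditation cells stay well inside, which matches the reading of observed versus expected counts above.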
Common Mistakes
- Using percentages instead of raw counts. The chi-square formula requires frequency counts. If your data are in percentages, convert them back to raw numbers first.
- Violating the expected frequency assumption. If any expected cell count is below 5, the approximation is poor. Merge categories or use Fisher's exact test.
- Testing dependent (paired) data. If the same participants are measured at two time points (e.g., preference before vs. after), use McNemar's test for 2x2 tables or the Cochran Q test for larger tables. The standard chi-square test requires independent observations.
- Interpreting association as causation. A significant chi-square tells you the variables are related, not that one causes the other. The relationship could be driven by a confounding variable.
- Forgetting to report the effect size. A significant chi-square with a very large $N$ can reflect a trivially small association. Always report Cramer's V (or phi for 2x2 tables) so readers can judge practical importance.
- Applying the test to continuous variables. Do not artificially categorize continuous data (e.g., splitting age into "young" and "old") just to use a chi-square test. This loses information. Use a t-test or correlation instead.
- Confusion with chi-square goodness of fit. The test of independence uses a two-way contingency table. The goodness-of-fit test examines whether a single variable's distribution matches an expected pattern. They are different tests.
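The effect-size point in the list above is easy to demonstrate numerically. The sketch below uses a made-up 2 x 2 table and the same table with every count multiplied by 100: the chi-square statistic grows with the sample size, while Cramer's V does not move.

```python
import math

def chi2_and_v(observed):
    """Chi-square statistic and Cramer's V for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for row, r_total in zip(observed, row_totals):
        for o, c_total in zip(row, col_totals):
            e = r_total * c_total / n
            chi2 += (o - e) ** 2 / e
    k = min(len(observed), len(observed[0]))
    return chi2, math.sqrt(chi2 / (n * (k - 1)))

# Hypothetical table with a weak association (52% vs. 48% in each row).
base = [[52, 48], [48, 52]]
# The same proportions with 100 times as many observations.
scaled = [[cell * 100 for cell in row] for row in base]

chi2_small, v_small = chi2_and_v(base)
chi2_large, v_large = chi2_and_v(scaled)
print(round(chi2_small, 2), round(v_small, 2))
print(round(chi2_large, 2), round(v_large, 2))
```

With 1 degree of freedom the .05 critical value is 3.841, so the small table is nowhere near significance and the large one is far past it, yet the strength of association is identical.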
How to Report in APA Format
A chi-square test of independence was performed to examine the association between workshop type and stress reduction. The relation between these variables was significant, $\chi^2$(2, $N$ = 150) = 9.50, $p$ = .009, $V$ = .25. Students who attended yoga workshops were more likely to report reduced stress (70%) compared to those who attended meditation (60%) or exercise workshops (40%).
For a 2x2 table, report phi instead of Cramer's V:
A chi-square test of independence showed a significant association between gender and voting preference, $\chi^2$(1, $N$ = 200) = 6.35, $p$ = .012, $\phi$ = .18.
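If results are assembled programmatically, a small helper can keep the statistics string consistent. This is a hypothetical sketch of the pattern used in the two examples above (two decimals for the statistic, no leading zero on values that cannot exceed 1, and "p < .001" for very small p-values); the function names are assumptions.

```python
def apa_number(value, decimals):
    """Format a value that cannot exceed 1 without a leading zero, e.g. .25."""
    text = f"{value:.{decimals}f}"
    return text[1:] if text.startswith("0.") else text

def apa_chi_square(chi2, df, n, p, effect, effect_symbol="V"):
    """Build a results string such as 'χ²(2, N = 150) = 9.50, p = .009, V = .25'."""
    p_text = "p < .001" if p < 0.001 else f"p = {apa_number(p, 3)}"
    return (
        f"χ²({df}, N = {n}) = {chi2:.2f}, {p_text}, "
        f"{effect_symbol} = {apa_number(effect, 2)}"
    )

print(apa_chi_square(9.50, 2, 150, 0.009, 0.25))
print(apa_chi_square(6.35, 1, 200, 0.012, 0.18, effect_symbol="φ"))
```

Italics for the statistical symbols still need to be applied in the word processor, since plain strings cannot carry that formatting.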