Chi-Square Test of Independence

Purpose

Tests whether there is a statistically significant association (relationship) between two categorical variables.

When to Use

When both the independent and dependent variables are categorical (nominal or ordinal) and you want to test if they are related.

Data Type

Categorical (nominal or ordinal) for both variables. Data must be in the form of frequency counts, not percentages or proportions.

Key Assumptions

Independence of observations, adequate expected cell frequencies (all expected counts >= 5), and mutually exclusive categories.

What Is the Chi-Square Test of Independence?

The chi-square ( $\chi^2$ ) test of independence determines whether there is a statistically significant association between two categorical variables. It compares the frequencies you actually observed in your data to the frequencies you would expect if the two variables were completely unrelated.

The core idea is simple: if two variables are independent (unrelated), knowing someone's category on variable A should tell you nothing about their likely category on variable B. The chi-square test measures how far the observed data deviate from this expectation of independence.

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where $O$ = observed frequency and $E$ = expected frequency in each cell of the contingency table. For a given number of degrees of freedom, the larger the $\chi^2$ value, the stronger the evidence of an association.

When to Use It

Use the chi-square test of independence when:

Both variables are categorical (nominal or ordinal). Examples: gender, treatment group, pass/fail, education level, political party.
You want to test whether there is a relationship or association between them.
You have frequency counts (how many people fall into each combination of categories).

Examples:

Is there an association between gender (male/female) and voting preference (Democrat/Republican/Independent)?
Is treatment type (drug A/drug B/placebo) related to recovery outcome (recovered/not recovered)?
Is education level (high school/bachelor's/graduate) associated with smartphone brand preference (Apple/Samsung/Other)?

Do NOT use chi-square when:

Your dependent variable is continuous (use a t-test or ANOVA)
You have paired/repeated categorical data (use McNemar's test)
Expected cell counts are too small (use Fisher's exact test)
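
As a rough illustration of the Fisher's exact alternative for a 2x2 table, the sketch below computes a two-sided p-value from the hypergeometric distribution using only the Python standard library. The function name `fisher_exact_2x2` and the example counts are hypothetical, and the two-sided rule used here (summing every table that is no more likely than the observed one) is one common convention, not the only one.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    # Feasible values for the top-left cell once all margins are fixed.
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Unnormalized hypergeometric weights, kept as exact integers.
    weights = {x: comb(row1, x) * comb(row2, col1 - x) for x in range(lo, hi + 1)}
    total = sum(weights.values())
    observed = weights[a]
    # Add up every table that is no more likely than the observed one.
    extreme = sum(w for w in weights.values() if w <= observed)
    return extreme / total

# Hypothetical small-sample table: expected counts are too low for chi-square.
print(round(fisher_exact_2x2(3, 1, 1, 3), 4))  # 0.4857
```

Because the weights are compared as integers, ties between equally likely tables are handled exactly rather than through floating-point tolerance.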

Assumptions

Independence of observations. Each participant contributes to only one cell of the contingency table. No person is counted twice.
Adequate expected frequencies. Ideally, all expected cell counts are 5 or greater. A widely used relaxation (Cochran's rule) tolerates up to 20% of cells with expected counts below 5, provided none falls below 1. Beyond that, the chi-square approximation is unreliable. Options:
- Collapse categories to increase cell counts
- Use Fisher's exact test (especially for 2x2 tables)
Mutually exclusive categories. Each observation falls into one and only one cell of the table.
Sufficiently large sample. As a rule of thumb, total $N$ should be at least 5 times the number of cells in the table.
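
The expected-frequency and sample-size checks above can be automated before running the test. The following is a minimal sketch, assuming the contingency table is a list of rows of raw counts; the helper names `expected_counts` and `check_expected_counts` and the example table are hypothetical.

```python
def expected_counts(table):
    """Expected frequency of each cell under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def check_expected_counts(table, threshold=5):
    """Summarize how well a table meets the expected-frequency rules of thumb."""
    cells = [e for row in expected_counts(table) for e in row]
    n = sum(sum(row) for row in table)
    small = sum(1 for e in cells if e < threshold)
    return {
        "min_expected": min(cells),
        "share_below_threshold": small / len(cells),
        "n_at_least_5_per_cell": n >= 5 * len(cells),
    }

# Hypothetical 2x2 table with a small sample.
print(check_expected_counts([[8, 2], [1, 9]]))
```

For this hypothetical table, half of the cells have expected counts of 4.5, so collapsing categories or switching to Fisher's exact test would be the safer choice.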

Formula

Step-by-Step Computation

1. Set up the contingency table with observed frequencies ( $O$ ).

2. Calculate expected frequencies for each cell:

$$E_{ij} = \frac{R_i \times C_j}{N}$$

Where:

$R_i$ = row total for row $i$
$C_j$ = column total for column $j$
$N$ = grand total

3. Compute the chi-square statistic:

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

4. Degrees of freedom:

$$df = (r - 1)(c - 1)$$

Where $r$ = number of rows and $c$ = number of columns.
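
The four computation steps translate almost line for line into code. This is a minimal standard-library sketch, assuming the observed counts arrive as a list of rows; the function name `chi_square_independence` and the example table are hypothetical.

```python
def chi_square_independence(observed):
    """Return (chi2, df, expected) for a contingency table of raw counts."""
    row_totals = [sum(row) for row in observed]          # R_i
    col_totals = [sum(col) for col in zip(*observed)]    # C_j
    n = sum(row_totals)                                  # N
    # Step 2: E_ij = R_i * C_j / N
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Step 3: sum of (O - E)^2 / E over every cell
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # Step 4: df = (r - 1)(c - 1)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

# Hypothetical 2x2 table of counts.
chi2, df, expected = chi_square_independence([[10, 20], [30, 40]])
print(round(chi2, 3), df)  # 0.794 1
```

If SciPy is available, `scipy.stats.chi2_contingency` performs the same calculation and also returns a p-value. Note that it applies a continuity correction to 2x2 tables by default, so its statistic can differ from this uncorrected one.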

Effect Size: Cramer's V

$$V = \sqrt{\frac{\chi^2}{N \times (k - 1)}}$$

Where $k = \min(r, c)$, the smaller of the number of rows or columns.

Effect size	Cramer's V ( $df^* = 1$ )	Cramer's V ( $df^* = 2$ )
Small	.10	.07
Medium	.30	.21
Large	.50	.35

( $df^*$ = $k - 1$ , the smaller dimension minus 1)

For a 2x2 table, Cramer's V equals the phi coefficient ( $\phi$ ), and the interpretation benchmarks are the same as for a correlation coefficient.
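
The effect-size formula is short enough to wrap in a single helper. The sketch below assumes you already have the chi-square statistic, the total sample size, and the table dimensions; the function name `cramers_v` is hypothetical.

```python
from math import sqrt

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi2 / (N * (k - 1))), with k = min(rows, columns)."""
    k = min(n_rows, n_cols)
    if k < 2:
        raise ValueError("The table needs at least two rows and two columns.")
    return sqrt(chi2 / (n * (k - 1)))

# For a 2x2 table, k - 1 = 1, so V reduces to sqrt(chi2 / N), the magnitude of phi.
print(round(cramers_v(6.35, 200, 2, 2), 2))  # 0.18
```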

Worked Example

Scenario: A university counseling center wants to know if the type of stress management workshop attended (Yoga, Meditation, or Exercise) is associated with whether students report reduced stress at follow-up (Yes or No). Data from 150 students:

Observed frequencies:

	Reduced Stress: Yes	Reduced Stress: No	Row Total
Yoga	35	15	50
Meditation	30	20	50
Exercise	20	30	50
Column Total	85	65	150

Step 1: State the hypotheses.

$H_0$ : Workshop type and stress reduction are independent (no association).
$H_1$ : Workshop type and stress reduction are associated.

Step 2: Calculate expected frequencies.

$$E = \frac{\text{Row Total} \times \text{Column Total}}{N}$$

	Reduced Stress: Yes	Reduced Stress: No
Yoga	$\frac{50 \times 85}{150} = 28.33$	$\frac{50 \times 65}{150} = 21.67$
Meditation	$\frac{50 \times 85}{150} = 28.33$	$\frac{50 \times 65}{150} = 21.67$
Exercise	$\frac{50 \times 85}{150} = 28.33$	$\frac{50 \times 65}{150} = 21.67$

All expected counts are well above 5, so the assumption is met.

Step 3: Calculate $\chi^2$ .

$$
\begin{aligned}
\chi^2 &= \frac{(35 - 28.33)^2}{28.33} + \frac{(15 - 21.67)^2}{21.67} + \frac{(30 - 28.33)^2}{28.33} + \frac{(20 - 21.67)^2}{21.67} + \frac{(20 - 28.33)^2}{28.33} + \frac{(30 - 21.67)^2}{21.67} \\
&= \frac{44.49}{28.33} + \frac{44.49}{21.67} + \frac{2.79}{28.33} + \frac{2.79}{21.67} + \frac{69.39}{28.33} + \frac{69.39}{21.67} \\
&= 1.57 + 2.05 + 0.10 + 0.13 + 2.45 + 3.20 = 9.50
\end{aligned}
$$

Step 4: Degrees of freedom.

$$df = (3 - 1)(2 - 1) = 2$$

Step 5: Find the p-value.

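The critical value for $\chi^2$ with $df = 2$ at $\alpha = .05$ is 5.991. Our $\chi^2 = 9.50$ exceeds this, so $p < .05$. The exact p-value, obtained from the chi-square distribution with 2 degrees of freedom, is $p \approx .009$ .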
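
The p-value is the upper-tail area of the chi-square distribution beyond the observed statistic. As a sketch that avoids external libraries, the hypothetical helper `chi2_sf` below uses two closed forms for that area ($e^{-x/2}$ for $df = 2$ and $\operatorname{erfc}(\sqrt{x/2})$ for $df = 1$) together with the standard recurrence for the regularized incomplete gamma function to reach larger integer $df$. In practice a statistics package would normally be used instead.

```python
from math import erfc, exp, gamma, sqrt

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df >= 1."""
    if df < 1 or x < 0:
        raise ValueError("df must be a positive integer and x must be non-negative.")
    h = x / 2.0
    # Closed-form starting points: Q(1, h) = exp(-h) and Q(1/2, h) = erfc(sqrt(h)).
    if df % 2 == 0:
        a, q = 1.0, exp(-h)
    else:
        a, q = 0.5, erfc(sqrt(h))
    # Recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1), up to a = df / 2.
    while a < df / 2.0:
        q += h ** a * exp(-h) / gamma(a + 1.0)
        a += 1.0
    return q

print(round(chi2_sf(9.50, 2), 3))  # 0.009
```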

Step 6: Calculate Cramer's V.

$$V = \sqrt{\frac{9.50}{150 \times (2 - 1)}} = \sqrt{\frac{9.50}{150}} = \sqrt{0.063} = 0.25$$

Interpretation

Since $\chi^2(2) = 9.50$ , $p = .009$ , we reject the null hypothesis. There is a statistically significant association between workshop type and stress reduction.

Cramer's $V = 0.25$ indicates a small-to-medium effect size. Workshop type is meaningfully related to stress outcomes, though other factors also play a role.

Looking at the observed vs. expected frequencies:

Yoga had more stress reduction than expected (35 observed vs. 28.33 expected)
Exercise had less stress reduction than expected (20 observed vs. 28.33 expected)
Meditation was close to what independence would predict

In this sample, students in the yoga workshop reported reduced stress most often, and students in the exercise workshop least often. However, the overall chi-square test does not tell you which specific cells drive the significant result. For that, examine the standardized residuals: cells whose standardized residual exceeds 2.0 in absolute value are the primary contributors.
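
Software packages differ in what they call a "standardized residual", so the sketch below computes two common variants for the workshop data: the Pearson residual, $(O - E)/\sqrt{E}$, and the adjusted residual, which additionally divides by $\sqrt{(1 - R_i/N)(1 - C_j/N)}$. The function name `residuals` is hypothetical, and which variant a given guideline has in mind is an assumption you should check against your software's documentation.

```python
from math import sqrt

def residuals(observed):
    """Return (pearson, adjusted) residual tables for a contingency table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    pearson, adjusted = [], []
    for obs_row, r in zip(observed, row_totals):
        p_row, a_row = [], []
        for o, c in zip(obs_row, col_totals):
            e = r * c / n                      # expected count under independence
            p_row.append((o - e) / sqrt(e))    # Pearson residual
            a_row.append((o - e) / sqrt(e * (1 - r / n) * (1 - c / n)))  # adjusted
        pearson.append(p_row)
        adjusted.append(a_row)
    return pearson, adjusted

# Workshop data: rows are Yoga, Meditation, Exercise; columns are Yes, No.
pearson, adjusted = residuals([[35, 15], [30, 20], [20, 30]])
for name, row in zip(["Yoga", "Meditation", "Exercise"], adjusted):
    print(name, [round(v, 2) for v in row])
```

For these data, the adjusted residuals for the yoga and exercise rows are about 2.33 and 2.91 in absolute value, while every Pearson residual stays below 2.0. The choice of variant therefore changes which cells get flagged.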

Common Mistakes

Using percentages instead of raw counts. The chi-square formula requires frequency counts. If your data are in percentages, convert them back to raw numbers first.
Violating the expected frequency assumption. If any expected cell count is below 5, the $\chi^2$ approximation is poor. Merge categories or use Fisher's exact test.
Testing dependent (paired) data. If the same participants are measured at two time points (e.g., preference before vs. after), use McNemar's test for 2x2 tables. If a binary outcome is measured on three or more occasions, use Cochran's Q test. The standard chi-square test requires independent observations.
Interpreting association as causation. A significant chi-square tells you the variables are related, not that one causes the other. The relationship could be driven by a confounding variable.
Forgetting to report the effect size. A significant chi-square with a very large $N$ can reflect a trivially small association. Always report Cramer's V (or phi for 2x2 tables) so readers can judge practical importance.
Applying the test to continuous variables. Do not artificially categorize continuous data (e.g., splitting age into "young" and "old") just to use a chi-square test. This loses information. Use a t-test or correlation instead.
Confusion with chi-square goodness of fit. The test of independence uses a two-way contingency table. The goodness-of-fit test examines whether a single variable's distribution matches an expected pattern. They are different tests.

How to Report in APA Format

A chi-square test of independence was performed to examine the association between workshop type and stress reduction. The relation between these variables was significant, $\chi^2$ (2, $N$ = 150) = 9.50, $p$ = .009, $V$ = .25. Students who attended yoga workshops were more likely to report reduced stress (70%) compared to those who attended meditation (60%) or exercise workshops (40%).

For a 2x2 table, report phi instead of Cramer's V:

A chi-square test of independence showed a significant association between gender and voting preference, $\chi^2$ (1, $N$ = 200) = 6.35, $p$ = .012, $\phi$ = .18.
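
If you report many tests, a small formatting helper can keep the statistics string consistent. The sketch below is one possible approach, not an official APA tool: the function names `apa_number` and `apa_chi_square` are hypothetical, and reporting very small values as `p < .001` is an assumption about the preferred convention.

```python
def apa_number(value, decimals):
    """Format a value bounded by 1 (such as p or V) without its leading zero."""
    return f"{value:.{decimals}f}".lstrip("0")

def apa_chi_square(chi2, df, n, p, effect, effect_label="V"):
    """Build a results string such as 'χ²(2, N = 150) = 9.50, p = .009, V = .25'."""
    p_text = "p < .001" if p < 0.001 else f"p = {apa_number(p, 3)}"
    return (
        f"χ²({df}, N = {n}) = {chi2:.2f}, {p_text}, "
        f"{effect_label} = {apa_number(effect, 2)}"
    )

print(apa_chi_square(9.50, 2, 150, 0.00865, 0.2517))
print(apa_chi_square(6.35, 1, 200, 0.0117, 0.178, effect_label="φ"))
```

The statistical symbols still need to be italicized by hand in the final manuscript, since plain strings carry no formatting.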
