# Introduction to the chi-square test for homogeneity

We investigate a scenario that includes applying the chi-squared statistic for a test of homogeneity, setting up null and alternative hypotheses, calculating expected values, verifying conditions for conducting a chi-squared test, and determining degrees of freedom based on the structure of a contingency table.
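As a rough illustration of the steps listed above, here is a minimal Python sketch. The counts in `observed` are hypothetical, made up for illustration rather than taken from the lesson, and the function name `chi_square_homogeneity` is also an assumption. The sketch computes expected counts as (row total × column total) / grand total, sums (observed − expected)² / expected over every cell, and takes the degrees of freedom as (rows − 1) × (columns − 1). The large-counts check uses the commonly taught threshold of at least 5 per expected cell.

```python
def chi_square_homogeneity(observed):
    """Return (statistic, degrees of freedom, expected counts) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count for a cell = row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # One statistic for the whole table: sum over every cell.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom = (number of rows - 1) * (number of columns - 1).
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Hypothetical counts (NOT from the lesson).
# Rows: STEM, humanities, no preference. Columns: right-handed, left-handed.
observed = [
    [30, 10],
    [15, 25],
    [15, 5],
]

statistic, df, expected = chi_square_homogeneity(observed)

# Large-counts condition: every expected count should be at least 5.
large_counts_ok = all(e >= 5 for row in expected for e in row)

print("expected counts:", expected)
print("chi-square statistic:", statistic)
print("degrees of freedom:", df)
print("large counts condition met:", large_counts_ok)
```

The statistic and degrees of freedom would then be compared against a chi-square distribution to get a p-value, for example with a table or a statistics library.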

## Want to join the conversation?

• Hi, could someone explain how the chi-square test for homogeneity is conceptually DIFFERENT from the two-sample inference for the difference between groups? I am having a hard time wrapping my mind around how this test is different from the one that we learned before. Thank you!
• The first difference is that chi-square tests are used for CATEGORICAL variables, whereas two-sample t procedures are used for QUANTITATIVE variables. (A two-sample z test for proportions also works with categorical data, but only for two groups and a two-outcome variable.) Another difference is that the chi-square test for homogeneity compares the distribution of a categorical variable across two or more groups, and its statistic, the sum of (observed-expected)^2/expected, is based on CELL COUNTS, not means. On the other hand, a 2-sample t test is used to see whether the means of 2 separate groups are equal, or whether one is greater or smaller than the other.
• Does someone have a resource that talks in detail about degrees of freedom? I understand what it is, but I don't exactly get why it's applicable in many situations.
• Why do we calculate just one `χ²` that includes both the data for the left-handed and right-handed people? Coming from the previous videos, I would think we would have to compute two `χ²`'s, one for the right-handed data and one for the left-handed, and then compare those by taking the difference between them.
• Is there a video/playlist explaining at length the reason/s for the large expected counts and 10% sample requirements?
• Is this 'special' chi-square test considered a non-parametric test? Why or why not?
• When finding the expected counts, why do we use the column total and not the row total?
I mean, instead of saying that 40 of the 100 people sampled prefer STEM, so 40% of the 60 right-handed people are expected to prefer STEM, we could say that 60 of the 100 people are right-handed, so 60% of the 40 people who prefer STEM are expected to be right-handed.
• If I'm understanding this correctly, the "expected" result essentially creates a baseline (or a baseline for each population). The test then calculates the distance each population has from that baseline.

If the distance is small enough to not be significant (i.e. the p-value is greater than the significance level), then the samples can be deemed to be homogeneous, i.e. the populations being tested don't have an impact on the variables being looked at.

Is this correct?
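
On the earlier question about using column totals versus row totals for expected counts, a short check may help. The sketch below uses only the figures quoted in that comment (100 people, 40 who prefer STEM, 60 who are right-handed); the variable names are assumptions chosen for illustration. It suggests that both routes reduce to the same formula, row total × column total / grand total, so either one gives the same expected count.

```python
from fractions import Fraction

# Figures quoted in the comment above.
grand_total = 100
stem_total = 40          # people who prefer STEM
right_handed_total = 60  # people who are right-handed

# Route 1: 40% of everyone prefers STEM, so expect 40% of the right-handed group to.
via_stem_share = Fraction(stem_total, grand_total) * right_handed_total

# Route 2: 60% of everyone is right-handed, so expect 60% of the STEM group to be.
via_right_handed_share = Fraction(right_handed_total, grand_total) * stem_total

# Both are the same product: stem_total * right_handed_total / grand_total.
direct = Fraction(stem_total * right_handed_total, grand_total)

print(via_stem_share, via_right_handed_share, direct)
```

Exact fractions are used here only to avoid floating-point rounding in the comparison.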